Splunk Search

How to extract specific field based on another field in a nested json array?

ashish_boss
Explorer

I have below json data: 

{
"source": "Mule",
"sourcetype": "_json",
"index": "metrics",
"event": [
{
"date": "2022-11-19T13:57:01.427+05:30",
"businessGroup": "testgroup",
"businessGroupId": "1234",
"coreServicesMetrics": {
"users": {
"total": 38
},
"environments": {
"total": 3
}
},
"designCenterMetrics": {
"total": 5
},
"exchangeMetrics": {
"total": 1,
"reuse": {
"fragments": 0,
"implementedApis": 0
}
},
"apiManagerMetrics": {
"clients": 0,
"apis": {
"production": {
"total": 0,
"contracts": 0,
"policiesUsed": [

],
"policiesUsedTotal": 0,
"automatedPoliciesUsed": [
],
"automatedPoliciesUsedTotal": 0,
"transactions": 0,
"details": [
{
"environmentName": "PROD",
"environmentType": "Production",
"contracts": 0,
"policiesUsed": [
],
"policiesUsedTotal": 0,
"automatedPoliciesUsed": [
],
"automatedPoliciesUsedTotal": 0,
"transactions": 0
}
]
},
"sandbox": {
"total": 0,
"contracts": 0,
"policiesUsed": [
],
"policiesUsedTotal": 0,
"automatedPoliciesUsed": [
],
"automatedPoliciesUsedTotal": 0,
"transactions": 0,
"details": [
{
"environmentName": "DEV",
"environmentType": "Sandbox",
"policiesUsed": [
],
"policiesUsedTotal": 0,
"automatedPoliciesUsed": [
],
"automatedPoliciesUsedTotal": 0,
"transactions": 0
},
{
"environmentName": "TEST",
"environmentType": "Sandbox",
"contracts": 0,
"policiesUsed": [

],
"policiesUsedTotal": 0,
"automatedPoliciesUsed": [

],
"automatedPoliciesUsedTotal": 0,
"transactions": 0
}
]
}
}
},
"runtimeManagerMetrics": {
"cloudhub": {
"networking": {
"vpcsTotal": 9,
},
"applications": {
"production": {
"vcoresTotal": 31,
"runtimesUsed": [

],
"runtimesUsedTotal": 0,
"details": [
{
"environmentName": "X",
"environmentType": "X",
"runtimesUsed": [

],
"runtimesUsedTotal": 0
}
]
},
"sandbox": {
"vcoresTotal": 14,
"runtimesUsed": [

],
"runtimesUsedTotal": 0,
"details": [
{
"environmentName": "DEV",
"environmentType": "Sandbox",
"vcoresUsed": 0,
"runtimesUsed": [

],
"runtimesUsedTotal": 0
},
{
"environmentName": "TEST",
"environmentType": "Sandbox",
"runtimesUsed": [

],
"runtimesUsedTotal": 0
}
]
}
}
}
}

},

{
"date": "2022-11-19T13:57:01.427+05:30",
"businessGroup": "MainGroup",
"businessGroupId": "5678",
"coreServicesMetrics": {
"users": {
"total": 16
},
"environments": {
"total": 3
}
},
"designCenterMetrics": {
"total": 11
},
"exchangeMetrics": {
"total": 11,
"reuse": {
"fragments": 10,

}
},
"apiManagerMetrics": {
"clients": 0,
"apis": {
"production": {
"total": 0
"contracts": 0,
"policiesUsed": [

],
"policiesUsedTotal": 0,
"automatedPoliciesUsed": [

],
"automatedPoliciesUsedTotal": 0,
"transactions": 0,
"details": [
{
"environmentName": "X",
"environmentType": "X",
"policiesUsed": [

],
"policiesUsedTotal": 0,
"automatedPoliciesUsed": [

],
"automatedPoliciesUsedTotal": 0,
"transactions": 0
}
]
},
"sandbox": {
"total": 20,
"contracts": 24,
"policiesUsed": [
"client-id-enforcement"
],
"policiesUsedTotal": 1,
"automatedPoliciesUsed": [

],
"automatedPoliciesUsedTotal": 0,
"transactions": 1499,
"details": [
{
"environmentName": "DEV",
"environmentType": "Sandbox",
"contracts": 11,
"policiesUsed": [
"client-id-enforcement"
],
"policiesUsedTotal": 1,
"automatedPoliciesUsed": [

],
"automatedPoliciesUsedTotal": 0,
"transactions": 585
},
{
"environmentName": "TEST",
"environmentType": "Sandbox",
"contracts": 13,
"policiesUsed": [
"client-id-enforcement"
],
"policiesUsedTotal": 1,
"automatedPoliciesUsed": [

],
"automatedPoliciesUsedTotal": 0,
"transactions": 914
}
]
}
}
}
}
]
}

 

I want to extract below data :

event{where businessGroup="MainGroup"}.apiManagerMetrics.apis.sandbox.details{where environmentName="DEV"}.transactions using splunk query. Can you please help me with that

 

Labels (2)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

As a side note, please make sure your sample data syntax is correct.  The sample you posted contains so many errors that it took more than an hour for me to reconstruct its proper structure.  It is best to use a pretty print tool to format the raw data, then anonymize on such format. (A simple tool that is ubiquitous these days is python3 -m json.tool.)

For reference, the correct sample is

{
    "source": "Mule",
    "sourcetype": "_json",
    "index": "metrics",
    "event": [
        {
            "date": "2022-11-19T13:57:01.427+05:30",
            "businessGroup": "testgroup",
            "businessGroupId": "1234",
            "coreServicesMetrics": {
                "users": {
                    "total": 38
                },
                "environments": {
                    "total": 3
                }
            },
            "designCenterMetrics": {
                "total": 5
            },
            "exchangeMetrics": {
                "total": 1,
                "reuse": {
                    "fragments": 0,
                    "implementedApis": 0
                }
            },
            "apiManagerMetrics": {
                "clients": 0,
                "apis": {
                    "production": {
                        "total": 0,
                        "contracts": 0,
                        "policiesUsed": [],
                        "policiesUsedTotal": 0,
                        "automatedPoliciesUsed": [],
                        "automatedPoliciesUsedTotal": 0,
                        "transactions": 0,
                        "details": [
                            {
                                "environmentName": "PROD",
                                "environmentType": "Production",
                                "contracts": 0,
                                "policiesUsed": [],
                                "policiesUsedTotal": 0,
                                "automatedPoliciesUsed": [],
                                "automatedPoliciesUsedTotal": 0,
                                "transactions": 0
                            }
                        ]
                    },
                    "sandbox": {
                        "total": 0,
                        "contracts": 0,
                        "policiesUsed": [],
                        "policiesUsedTotal": 0,
                        "automatedPoliciesUsed": [],
                        "automatedPoliciesUsedTotal": 0,
                        "transactions": 0,
                        "details": [
                            {
                                "environmentName": "DEV",
                                "environmentType": "Sandbox",
                                "policiesUsed": [],
                                "policiesUsedTotal": 0,
                                "automatedPoliciesUsed": [],
                                "automatedPoliciesUsedTotal": 0,
                                "transactions": 0
                            },
                            {
                                "environmentName": "TEST",
                                "environmentType": "Sandbox",
                                "contracts": 0,
                                "policiesUsed": [],
                                "policiesUsedTotal": 0,
                                "automatedPoliciesUsed": [],
                                "automatedPoliciesUsedTotal": 0,
                                "transactions": 0
                            }
                        ]
                    }
                }
            },
            "runtimeManagerMetrics": {
                "cloudhub": {
                    "networking": {
                        "vpcsTotal": 9
                    },
                    "applications": {
                        "production": {
                            "vcoresTotal": 31,
                            "runtimesUsed": [],
                            "runtimesUsedTotal": 0,
                            "details": [
                                {
                                    "environmentName": "X",
                                    "environmentType": "X",
                                    "runtimesUsed": [],
                                    "runtimesUsedTotal": 0
                                }
                            ]
                        },
                        "sandbox": {
                            "vcoresTotal": 14,
                            "runtimesUsed": [],
                            "runtimesUsedTotal": 0,
                            "details": [
                                {
                                    "environmentName": "DEV",
                                    "environmentType": "Sandbox",
                                    "vcoresUsed": 0,
                                    "runtimesUsed": [],
                                    "runtimesUsedTotal": 0
                                },
                                {
                                    "environmentName": "TEST",
                                    "environmentType": "Sandbox",
                                    "runtimesUsed": [],
                                    "runtimesUsedTotal": 0
                                }
                            ]
                        }
                    }
                }
            }
        },
        {
            "date": "2022-11-19T13:57:01.427+05:30",
            "businessGroup": "MainGroup",
            "businessGroupId": "5678",
            "coreServicesMetrics": {
                "users": {
                    "total": 16
                },
                "environments": {
                    "total": 3
                }
            },
            "designCenterMetrics": {
                "total": 11
            },
            "exchangeMetrics": {
                "total": 11,
                "reuse": {
                    "fragments": 10
                }
            },
            "apiManagerMetrics": {
                "clients": 0,
                "apis": {
                    "production": {
                        "total": 0,
                        "contracts": 0,
                        "policiesUsed": [],
                        "policiesUsedTotal": 0,
                        "automatedPoliciesUsed": [],
                        "automatedPoliciesUsedTotal": 0,
                        "transactions": 0,
                        "details": [
                            {
                                "environmentName": "X",
                                "environmentType": "X",
                                "policiesUsed": [],
                                "policiesUsedTotal": 0,
                                "automatedPoliciesUsed": [],
                                "automatedPoliciesUsedTotal": 0,
                                "transactions": 0
                            }
                        ]
                    },
                    "sandbox": {
                        "total": 20,
                        "contracts": 24,
                        "policiesUsed": [
                            "client-id-enforcement"
                        ],
                        "policiesUsedTotal": 1,
                        "automatedPoliciesUsed": [],
                        "automatedPoliciesUsedTotal": 0,
                        "transactions": 1499,
                        "details": [
                            {
                                "environmentName": "DEV",
                                "environmentType": "Sandbox",
                                "contracts": 11,
                                "policiesUsed": [
                                    "client-id-enforcement"
                                ],
                                "policiesUsedTotal": 1,
                                "automatedPoliciesUsed": [],
                                "automatedPoliciesUsedTotal": 0,
                                "transactions": 585
                            },
                            {
                                "environmentName": "TEST",
                                "environmentType": "Sandbox",
                                "contracts": 13,
                                "policiesUsed": [
                                    "client-id-enforcement"
                                ],
                                "policiesUsedTotal": 1,
                                "automatedPoliciesUsed": [],
                                "automatedPoliciesUsedTotal": 0,
                                "transactions": 914
                            }
                        ]
                    }
                }
            }
        }
    ]
}

(Pretty-print is not necessary, but makes syntax check easier.)

ashish_boss
Explorer

I apologise for the syntax. I will make sure next time. Thanks

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Nested multivalue fields of interest needs to be enumerated separately before you do definitive selection.  Otherwise, if any value in the field matches, the entire entry will be selected.  In your case, they include event{}, and event{}.apiManagerMetrics.apis.sandbox.details{}.  The tool is mvexpand; you need to use spath to extract them first.

| spath path=event{}
| mvexpand event{}
| spath input=event{}
| where businessGroup=="MainGroup"
| spath input=event{} path=apiManagerMetrics.apis.sandbox.details{}
| mvexpand apiManagerMetrics.apis.sandbox.details{}
| spath input=apiManagerMetrics.apis.sandbox.details{}
| where environmentName=="DEV"

In the above, I inserted selection command where (you can use search if you like) in the earliest possible sequence in order to maximize performance.  If you like maximum flexibility in stats, you can do

| spath path=event{}
| mvexpand event{}
| spath input=event{}
| spath input=event{} path=apiManagerMetrics.apis.sandbox.details{}
| mvexpand apiManagerMetrics.apis.sandbox.details{}
| spath input=apiManagerMetrics.apis.sandbox.details{}
| where businessGroup=="MainGroup" AND environmentName=="DEV"
Tags (2)

ashish_boss
Explorer

Thanks for the solution.

Is there any way I can manipulate where clause. Actually, I want to get number of transactions based on businessGroup and environmentName.

Soif user selects "SHI Japan" as businessGroup and "DEV" as environmentName then it should give a sum of transactions satisfying the above two conditions. If user does not select anything, then it will be sum of all transactions and so on.

 

I am able to get the individual numbers but I am unable to adjust this query in a way it gets all transactions if no value is selected:
index=$platformMetricsIndex$ | eventstats max(_time) as maxTimestamp | where _time=maxTimestamp | spath | rename "apiManagerMetrics.apis.sandbox.details{}.*" as * | fillnull | eval deets=mvzip(businessGroup, mvzip(environmentName,transactions)) | mvexpand deets | eval deets=split(deets,",") | eval bg=mvindex(deets,0), env=mvindex(deets,1), trans=mvindex(deets,2) | table bg, env, trans | where bg in ("SHI Japan") AND env in ("DEV") | stats sum(trans) as "APIRequests"

0 Karma

yuanliu
SplunkTrust
SplunkTrust

After reading your (longish) sample code, I couldn't find what counts as a "transaction".  Can you explain in terms of data?  If each event is one transaction, wouldn't it be as simple as this?

| spath path=event{}
| mvexpand event{}
| spath input=event{}
| spath input=event{} path=apiManagerMetrics.apis.sandbox.details{}
| mvexpand apiManagerMetrics.apis.sandbox.details{}
| spath input=apiManagerMetrics.apis.sandbox.details{}
| stats count by businessGroup environmentName

Your next sentence centers on what user "selects".  This suggests that you are designing some sort of dashboard interaction.  If that is the case, user input would be in some input token.  Is that correct?  Assume the tokens are $bizgroup_tok$ and $env_tok$, it can be as simple as

| spath path=event{}
| mvexpand event{}
| spath input=event{}
| spath input=event{} path=apiManagerMetrics.apis.sandbox.details{}
| mvexpand apiManagerMetrics.apis.sandbox.details{}
| where businessGroup=="$bizgroup_tok$"
| spath input=apiManagerMetrics.apis.sandbox.details{}
| where environmentName=="$env_tok$"
| stats count by businessGroup environmentName

 

0 Karma

ashish_boss
Explorer

Thanks for your reply. Yes, you are correct. I am creating a dashboard where there will be two inputs in multiselect field:

businessGroup: $bizgroup_tok$ (Multiselect -- values (ALL, Maingroup, testgroup))

environmentName: $env_tok$ (Multiselect -- values (ALL, DEV, TEST))

 

"transactions" is a field in the json sample code:

Please notice: 

"transactions": 914
{
                                "environmentName": "TEST",
                                "environmentType": "Sandbox",
                                "contracts": 13,
                                "policiesUsed": [
                                    "Client-id-enforcement"
                                ],
                                "policiesUsedTotal": 1,
                                "automatedPoliciesUsed": [],
                                "automatedPoliciesUsedTotal": 0,
                                "transactions": 914
                            }

 

I am able to select specific transactions based on particular businessGroup and environmentName now. The issue here is that when we select "ALL" in businessGroup multiselect input, it should give a total of all transactions in all business groups and if user selects "ALL" in environmentName multiselect input then all transactions of environments in that business group should be given and if "ALL" in both then a total of all transactions should be there and so on.

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Let's try to cut down confusion.  The numerous eventstats, mvzip, split, then join just create distraction for other people who want to help.  Can you describe in plain words, maybe add a mockup of the intended results?

After going round and round, it seems to me that all you needed is a sum of all transactions, with no breakdown of group or environment.  If this is the case, the following will do without any of those transforms.

 

| spath path=event{}
| mvexpand event{}
| spath input=event{}
| spath input=event{} path=apiManagerMetrics.apis.sandbox.details{}
| mvexpand apiManagerMetrics.apis.sandbox.details{}
| spath input=apiManagerMetrics.apis.sandbox.details{}
| eval deets=mvzip(businessGroup, mvzip(environmentName,transactions)) ``` this is immaterial to any calculation ```
| stats sum(transactions) as APIRequests values(deets)

 

In the above, I calculated what you named as deets which in my opinion doesn't contribute anything to your end result; but I included its values so you can verify if the outcome is expected.

APIRequests
values(deets)
1499
MainGroup,DEV,585
MainGroup,TEST,914
testgroup,DEV,0
testgroup,TEST,0

Disregard deets, is APIRequests shown above what you wanted when the user selects "all"?

Second, when you design dashboard interaction, what is important is not how you call each selection, but what is the value of each selection.  In other words, you need to design input to make search successful.  You didn't explain which value the "all" option gets so others cannot know why the selection "doesn't work".  Additionally, until this last post, you didn't explain that the tokens would be multiselect. (Multiselect is not the first choice for most applications.  When a selection of "all" exists, user experience is often quite poor.)

I happen to use multiselect quite a bit, so I'll give you the design that will work, and leave input coding as your homework.

This is how each selection should evaluate into:

$bizgroup_tok$

ALL*
testgrouptestgroup
MainGroupMainGroup

$env_tok$

ALL*
DEVDEV
TESTTEST

In both inputs, if more than one group is selected, the value should be a comma delimited list of all selections. (There are other ways to design multiselect.  This is the one used by the code below.)

With this design, you can accomplish what you needed with

 

| spath path=event{}
| mvexpand event{}
| spath input=event{}
| spath input=event{} path=apiManagerMetrics.apis.sandbox.details{}
| mvexpand apiManagerMetrics.apis.sandbox.details{}
| search businessGroup IN ($group_tok$)
| spath input=apiManagerMetrics.apis.sandbox.details{}
| search environmentName IN ($env_tok$)
| eval deets=mvzip(businessGroup, mvzip(environmentName,transactions)) ``` this is immaterial to any calculation ```
| stats sum(transactions) as APIRequests values(deets)​

 

As a test, consider these selections

$group_tok$MainGroup,testgroup
$env_tok$DEV

The rendered code will be

 

| spath path=event{}
| mvexpand event{}
| spath input=event{}
| spath input=event{} path=apiManagerMetrics.apis.sandbox.details{}
| mvexpand apiManagerMetrics.apis.sandbox.details{}
| search businessGroup IN (MainGroup,testgroup)
| spath input=apiManagerMetrics.apis.sandbox.details{}
| search environmentName IN (DEV)
| eval deets=mvzip(businessGroup, mvzip(environmentName,transactions)) ``` this is immaterial to any calculation ```
| stats sum(transactions) as APIRequests values(deets)

 

Output will be

APIRequests
values(deets)
585
MainGroup,DEV,585
testgroup,DEV,0
0 Karma

ashish_boss
Explorer

Actually, I want to get the values dynamically like based on businessGroup and environmentName, which means if I change the businessGroup or Envrinment name then it should give me the correct transactions and if more than one business group or environmentName is selected then it tshould give the sum of transactions

0 Karma

richgalloway
SplunkTrust
SplunkTrust

What have you tried so far?  What results did you get?

---
If this reply helps you, Karma would be appreciated.
0 Karma

ashish_boss
Explorer

I have used below query:

index=metrics | eventstats max(_time) as maxTimestamp | where _time=maxTimestamp | search businessGroup IN ("Maingroup") | stats values(apiManagerMetrics.apis.sandbox.details{}.environmentName) as env, values(apiManagerMetrics.apis.sandbox.details{}.transactions) as trans by businessGroup | eval keyvalue = mvzip(env, trans) | mvexpand keyvalue | rex field=keyvalue "^(?<key>[^,]+),(?<value>.+)$" | search key=TEST | stats sum(value) as "API Requests"

 

which gives me correct result as:

API Requests

914

 

but If I add value to testgroup data in Sandbox, Test Env as below:

"details": [
{
"environmentName": "DEV",
"environmentType": "Sandbox",
"policiesUsed": [
],
"policiesUsedTotal": 0,
"automatedPoliciesUsed": [
],
"automatedPoliciesUsedTotal": 0,
"transactions": 25
 
then this query fails and says that this is TEST data instead of DEV as below:
 
index=metrics | eventstats max(_time) as maxTimestamp | where _time=maxTimestamp | search businessGroup IN ("testgroup") | stats values(apiManagerMetrics.apis.sandbox.details{}.environmentName) as env, values(apiManagerMetrics.apis.sandbox.details{}.transactions) as trans by businessGroup | eval keyvalue = mvzip(env, trans) | mvexpand keyvalue | rex field=keyvalue "^(?<key>[^,]+),(?<value>.+)$" | search key=DEV| stats sum(value) as "API Requests"
 
API Requests

0

==> Notice 0 result but it contains value 25


index=metrics | eventstats max(_time) as maxTimestamp | where _time=maxTimestamp | search businessGroup IN ("testgroup") | stats values(apiManagerMetrics.apis.sandbox.details{}.environmentName) as env, values(apiManagerMetrics.apis.sandbox.details{}.transactions) as trans by businessGroup | eval keyvalue = mvzip(env, trans) | mvexpand keyvalue | rex field=keyvalue "^(?<key>[^,]+),(?<value>.+)$" | search key=TEST | stats sum(value) as "API Requests"
 
API Requests
25

==> notice 25 but it has value 0

0 Karma
Get Updates on the Splunk Community!

Harnessing Splunk’s Federated Search for Amazon S3

Managing your data effectively often means balancing performance, costs, and compliance. Splunk’s Federated ...

Infographic provides the TL;DR for the 2024 Splunk Career Impact Report

We’ve been buzzing with excitement about the recent validation of Splunk Education! The 2024 Splunk Career ...

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...