Splunk Search

Obtain key with the largest number of objects from array in JSON

spy_jr
Engager

Good day everyone

Someone here will have had experience obtaining values from a JSON.. Currently I have _raws in JSON format from which I try to obtain a table that shows in a single row the values of the object that has the array with the most data.

I better explain myself with the following example:

This is the JSON code that comes in each event:

 

 

 

{
   "investigationStatus":"New",
   "status":1,
   "priorityScore":38,
   "workbenchName":"PSEXEC Execution By Process",
   "workbenchId":"WB-18286-20231106-00005",
   "severity":"low",
   "caseId":null,
   "detail":{
      "schemaVersion":"1.14",
      "alertProvider":"SAE",
      "description":"PSEXEC execution to start remote process",
      "impactScope":[
         {
            "entityValue":{
               "name":"SERVER01",
               },
            "relatedIndicators":[
               2
            ]
         },
         {
            "entityValue":{
               "name":"SERVER02",
              },
            "relatedIndicators":[
               2,
			   3
            ]
		 },
         {
            "entityValue":{
               "name":"SERVER03",
            },
            "relatedIndicators":[
               1,
               2,
			   3,
			   4
            ]
         },
		 {
            "entityValue":{
               "name":"SERVER04",
               },
            "relatedIndicators":[
               1
            ]
         }
      ]
   }
}

 

 

 

And this is the table I'm trying to get:

workbenchIdworkbenchName severity name_host
"WB-18286-20231106-00005""PSEXEC Execution By Process""low""SERVER03"

 

If you can see, the values of the 1st level of the JSON are found, and then there is the host_name SERVER03, since this has the largest number of values in the "relatedIndicators" array (from 1 to 4), the rest of the servers do not because they have smaller amount in the array.

Maybe any idea how I could achieve it? I tried with json_extract but didn't succeed

 

Labels (1)
Tags (1)
0 Karma
1 Solution

tscroggins
Influencer

Hi @spy_jr,

Splunk may fail to process your _raw JSON as-is. There is an extra comma after the name string value of each entityValue object, e.g. after "name": "SERVER01":

{
"entityValue": {
"name": "SERVER01",
},
"relatedIndicators": [
2
]
},

After correcting that, you can extract the values you need using JSON and multivalue eval functions:

| spath
| eval impactScope_mv=json_array_to_mv(json_extract(_raw, "detail.impactScope{}"), false())
| eval relatedIndicators_mvcount=mvmap(impactScope_mv, mvcount(json_array_to_mv(json(json_extract(impactScope_mv, "relatedIndicators")), false())))
| eval relatedIndicators_mvindex=mvfind(relatedIndicators_mvcount, max(relatedIndicators_mvcount))
| eval impactScope_name=json_extract(mvindex(impactScope_mv, relatedIndicators_mvindex), "entityValue.name")
| table workbenchId workbenchName severity impactScope_name

Note that ties--two or more arrays of equal length--will return the first entry.

In a nutshell, we:

1. Convert the detail.impactScope{} array in a multivalued field.

2. For each entry of the impactScope array, convert the relatedIndicators array to a multivalued field and store the count of entries in a new multivalued field.

3. Find the index of the largest value (the longest array).

4. Extract the entityValue.name value from the impactScope_mv multivalued field using the index identified in step 3.

5. Display a table using the fields extracted by spath and the field created in step 4.

View solution in original post

0 Karma

tscroggins
Influencer

Hi @spy_jr,

Splunk may fail to process your _raw JSON as-is. There is an extra comma after the name string value of each entityValue object, e.g. after "name": "SERVER01":

{
"entityValue": {
"name": "SERVER01",
},
"relatedIndicators": [
2
]
},

After correcting that, you can extract the values you need using JSON and multivalue eval functions:

| spath
| eval impactScope_mv=json_array_to_mv(json_extract(_raw, "detail.impactScope{}"), false())
| eval relatedIndicators_mvcount=mvmap(impactScope_mv, mvcount(json_array_to_mv(json(json_extract(impactScope_mv, "relatedIndicators")), false())))
| eval relatedIndicators_mvindex=mvfind(relatedIndicators_mvcount, max(relatedIndicators_mvcount))
| eval impactScope_name=json_extract(mvindex(impactScope_mv, relatedIndicators_mvindex), "entityValue.name")
| table workbenchId workbenchName severity impactScope_name

Note that ties--two or more arrays of equal length--will return the first entry.

In a nutshell, we:

1. Convert the detail.impactScope{} array in a multivalued field.

2. For each entry of the impactScope array, convert the relatedIndicators array to a multivalued field and store the count of entries in a new multivalued field.

3. Find the index of the largest value (the longest array).

4. Extract the entityValue.name value from the impactScope_mv multivalued field using the index identified in step 3.

5. Display a table using the fields extracted by spath and the field created in step 4.

0 Karma

spy_jr
Engager

Great @tscroggins 
With that it worked perfectly for me, I didn't know the functions for processing json, with this I will be able to learn them

Thank you so much!

Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...