Solved: Obtain key with the largest number of objects from...

spy_jr · ‎11-11-2023

Good day everyone

Someone here will have had experience obtaining values from a JSON.. Currently I have _raws in JSON format from which I try to obtain a table that shows in a single row the values of the object that has the array with the most data.

I better explain myself with the following example:

This is the JSON code that comes in each event:

{
   "investigationStatus":"New",
   "status":1,
   "priorityScore":38,
   "workbenchName":"PSEXEC Execution By Process",
   "workbenchId":"WB-18286-20231106-00005",
   "severity":"low",
   "caseId":null,
   "detail":{
      "schemaVersion":"1.14",
      "alertProvider":"SAE",
      "description":"PSEXEC execution to start remote process",
      "impactScope":[
         {
            "entityValue":{
               "name":"SERVER01",
               },
            "relatedIndicators":[
               2
            ]
         },
         {
            "entityValue":{
               "name":"SERVER02",
              },
            "relatedIndicators":[
               2,
			   3
            ]
		 },
         {
            "entityValue":{
               "name":"SERVER03",
            },
            "relatedIndicators":[
               1,
               2,
			   3,
			   4
            ]
         },
		 {
            "entityValue":{
               "name":"SERVER04",
               },
            "relatedIndicators":[
               1
            ]
         }
      ]
   }
}

And this is the table I'm trying to get:

workbenchId	workbenchName	severity	name_host
"WB-18286-20231106-00005"	"PSEXEC Execution By Process"	"low"	"SERVER03"

If you can see, the values of the 1st level of the JSON are found, and then there is the host_name SERVER03, since this has the largest number of values in the "relatedIndicators" array (from 1 to 4), the rest of the servers do not because they have smaller amount in the array.

Maybe any idea how I could achieve it? I tried with json_extract but didn't succeed

tscroggins · ‎11-11-2023

Hi @spy_jr,

Splunk may fail to process your _raw JSON as-is. There is an extra comma after the name string value of each entityValue object, e.g. after "name": "SERVER01":

{
  "entityValue": {
    "name": "SERVER01",
  },
  "relatedIndicators": [
    2
  ]
},

After correcting that, you can extract the values you need using JSON and multivalue eval functions:

| spath
| eval impactScope_mv=json_array_to_mv(json_extract(_raw, "detail.impactScope{}"), false())
| eval relatedIndicators_mvcount=mvmap(impactScope_mv, mvcount(json_array_to_mv(json(json_extract(impactScope_mv, "relatedIndicators")), false())))
| eval relatedIndicators_mvindex=mvfind(relatedIndicators_mvcount, max(relatedIndicators_mvcount))
| eval impactScope_name=json_extract(mvindex(impactScope_mv, relatedIndicators_mvindex), "entityValue.name")
| table workbenchId workbenchName severity impactScope_name

Note that ties--two or more arrays of equal length--will return the first entry.

In a nutshell, we:

1. Convert the detail.impactScope{} array in a multivalued field.

2. For each entry of the impactScope array, convert the relatedIndicators array to a multivalued field and store the count of entries in a new multivalued field.

3. Find the index of the largest value (the longest array).

4. Extract the entityValue.name value from the impactScope_mv multivalued field using the index identified in step 3.

5. Display a table using the fields extracted by spath and the field created in step 4.

View solution in original post

tscroggins · ‎11-11-2023

Hi @spy_jr,

Splunk may fail to process your _raw JSON as-is. There is an extra comma after the name string value of each entityValue object, e.g. after "name": "SERVER01":

{
  "entityValue": {
    "name": "SERVER01",
  },
  "relatedIndicators": [
    2
  ]
},

After correcting that, you can extract the values you need using JSON and multivalue eval functions:

| spath
| eval impactScope_mv=json_array_to_mv(json_extract(_raw, "detail.impactScope{}"), false())
| eval relatedIndicators_mvcount=mvmap(impactScope_mv, mvcount(json_array_to_mv(json(json_extract(impactScope_mv, "relatedIndicators")), false())))
| eval relatedIndicators_mvindex=mvfind(relatedIndicators_mvcount, max(relatedIndicators_mvcount))
| eval impactScope_name=json_extract(mvindex(impactScope_mv, relatedIndicators_mvindex), "entityValue.name")
| table workbenchId workbenchName severity impactScope_name

Note that ties--two or more arrays of equal length--will return the first entry.

In a nutshell, we:

1. Convert the detail.impactScope{} array in a multivalued field.

2. For each entry of the impactScope array, convert the relatedIndicators array to a multivalued field and store the count of entries in a new multivalued field.

3. Find the index of the largest value (the longest array).

4. Extract the entityValue.name value from the impactScope_mv multivalued field using the index identified in step 3.

5. Display a table using the fields extracted by spath and the field created in step 4.

spy_jr · ‎11-12-2023

Great @tscroggins
With that it worked perfectly for me, I didn't know the functions for processing json, with this I will be able to learn them

Thank you so much!

Obtain key with the largest number of objects from array in JSON

field extraction

Tech Talk Recap | Mastering Threat Hunting

Observability for AI Applications: Troubleshooting Latency

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

Are you a member of the Splunk Community?

Obtain key with the largest number of objects from array in JSON

field extraction

Tech Talk Recap | Mastering Threat Hunting

Observability for AI Applications: Troubleshooting Latency

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?