Getting Data In

How to parse nested JSON so I can extract parent process information with its attributes lined up together?


I am having some issues extracting what I want from the JSON that Tanium sends to Splunk for signal alerts. I am trying to extract the relevant parent process information and have it line up with the correct PID, arguments, user, hash, etc. nested under each parent, parent's parent, and so on. Below is the raw JSON that goes into Splunk.

{ "Event": [{"Event Id":"608f2b04-b782-4557-96cd-f2a53b84dc9b","Event Name":"detect.match","Timestamp":"2019-04-02T12:49:40.000Z","Priority":"high","Severity":"info","Computer Name":"SomeComputerName","Computer IP":"10.0.0106","User Id":"","User Name":"","User Domain":"","Other Parameters":"payload={\"config_id\":40,\"config_rev_id\":2,\"intel_id\":82,\"match\":{\"hash\":753634242,\"properties\":{\"args\":\"wmic  process call create winword.exe\",\"cwd\":null,\"file\":{\"fullpath\":\"C:\\\\Windows\\\\System32\\\\wbem\\\\WMIC.exe\",\"md5\":\"05FE5A6B01FE80083C6E5FE1BC62BA6E\"},\"name\":\"WMIC.exe\",\"parent\":{\"args\":\"\\\"C:\\\\Windows\\\\system32\\\\cmd.exe\\\" \",\"cwd\":null,\"file\":{\"fullpath\":\"C:\\\\Windows\\\\System32\\\\cmd.exe\",\"md5\":\"E08FE2DE3DDD22123247D49A11B4F53D\"},\"name\":\"cmd.exe\",\"parent\":{\"args\":\"C:\\\\Windows\\\\Explorer.EXE\",\"cwd\":null,\"file\":{\"fullpath\":\"C:\\\\Windows\\\\explorer.exe\",\"md5\":\"5CDE14540712838961E3B63930CE8C5D\"},\"name\":\"explorer.exe\",\"parent\":{\"args\":\"C:\\\\Windows\\\\system32\\\\userinit.exe\",\"cwd\":null,\"file\":{\"fullpath\":\"C:\\\\Windows\\\\System32\\\\userinit.exe\",\"md5\":\"755ED4FDBD7D6C3980610E26E527E2F5\"},\"name\":\"userinit.exe\",\"parent\":{\"args\":\"\\\\Device\\\\HarddiskVolume4\\\\Windows\\\\System32\\\\winlogon.exe\",\"cwd\":null,\"file\":{\"fullpath\":\"C:\\\\Windows\\\\System32\\\\winlogon.exe\",\"md5\":null},\"name\":\"winlogon.exe\",\"parent\":null,\"pid\":424,\"ppid\":0,\"start_time\":\"2019-04-02T11:26:53Z\",\"trace_process_table_id\":1765,\"user\":\"NT 
AUTHORITY\\\\SYSTEM\"},\"pid\":8108,\"ppid\":424,\"start_time\":\"2019-04-02T11:28:10Z\",\"trace_process_table_id\":1973,\"user\":\"SomeName"},\"pid\":8120,\"ppid\":8108,\"start_time\":\"2019-04-02T11:28:10Z\",\"trace_process_table_id\":1975,\"user\":\"SomeName"},\"pid\":19376,\"ppid\":8120,\"start_time\":\"2019-04-02T11:47:26Z\",\"trace_process_table_id\":2338,\"user\":\"SomeName"},\"pid\":12432,\"ppid\":19376,\"start_time\":\"2019-04-02T12:49:27Z\",\"trace_process_table_id\":2465,\"user\":\"SomeName"},\"source\":\"signals\",\"type\":\"process\",\"version\":1},\"service_id\":\"0131f75a-3885-493b-b345-d3463b039630\"}"}] }

I basically want to pull out the nested parent process information and keep it together, so I can build a multivalue field in which each entry carries the matching PID, parent process name, hash, arguments, etc. The goal is to group any number of parent.parent.parent levels that may show up with their associated information, and then display that in a user-friendly way when I create a correlation rule in ES down the road.
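To make the target shape concrete outside of SPL, here is a minimal Python sketch of the grouping described above: it walks the nested `parent` objects and emits one row per ancestor, so name, PID, user, hash, and arguments stay together. The function name `parent_chain` and the trimmed sample are hypothetical and only mirror the structure of `match.properties` in the payload; this is an illustration of the desired result, not a Splunk solution.

```python
import json


def parent_chain(process):
    """Return one aligned row per ancestor, nearest parent first."""
    rows = []
    node = process.get("parent")
    depth = 1
    while node is not None:
        # "file" may be missing or null, so fall back to an empty dict
        file_info = node.get("file") or {}
        rows.append({
            "depth": depth,
            "name": node.get("name"),
            "pid": node.get("pid"),
            "ppid": node.get("ppid"),
            "user": node.get("user"),
            "md5": file_info.get("md5"),
            "args": node.get("args"),
        })
        node = node.get("parent")
        depth += 1
    return rows


# Trimmed, hypothetical sample shaped like match.properties in the payload
sample = json.loads("""
{"name": "WMIC.exe", "pid": 12432, "ppid": 19376, "user": "SomeName",
 "args": "wmic process call create winword.exe",
 "file": {"md5": "05FE5A6B01FE80083C6E5FE1BC62BA6E"},
 "parent": {"name": "cmd.exe", "pid": 19376, "ppid": 8120, "user": "SomeName",
            "args": "cmd.exe",
            "file": {"md5": "E08FE2DE3DDD22123247D49A11B4F53D"},
            "parent": {"name": "explorer.exe", "pid": 8120, "ppid": 8108,
                       "user": "SomeName", "args": "explorer.exe",
                       "file": {"md5": null},
                       "parent": null}}}
""")

for row in parent_chain(sample):
    print(row["depth"], row["name"], row["pid"], row["user"], row["md5"])
```

Because every row is built from a single nested object, the attributes cannot drift out of alignment the way independently collected lists can.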

Here is the search I was working up to do what I wanted, but the order of the users and the order of the PIDs come out in reverse of where I want them to be. I also do not know if this is the most efficient way of doing this.

Splunk Search:

index=main sourcetype=tanium "\"Event\":" 
| spath 
| spath path="Event{}" output="Event" 
| mvexpand "Event" 
| eval _raw='Event' 
| kv 
| rex field="Other Parameters" "payload=(?P<payload>.*)" 
| rename "Event Id" AS alert_id "Computer Name" AS ComputerName "Computer IP" AS src_ip "Event Name" AS intel_name 
| eval _raw='payload' 
| kv 
| rename match.source AS intel_source 
| fields alert_id, ComputerName, src_ip, intel_name, intel_source, match.*, config_id 
| where intel_source="signals"
| eval parent_processes=""
| eval parent_pids=""
| eval parent_users=""
| eval parent_md5s=""
| eval parent_args=""
| foreach *".parent.name" [eval parent_processes=parent_processes + tostring('<<FIELD>>') + ";" ]
| foreach *".parent.args" [eval parent_args=parent_args + tostring('<<FIELD>>') + ";" ]
| foreach *".parent.user" [eval parent_users=parent_users + tostring('<<FIELD>>') + ";" ]
| foreach *".parent.pid" [eval parent_pids=parent_pids + tostring('<<FIELD>>') + ";" ]
| foreach *".parent.file.md5"  [eval parent_md5s=parent_md5s + tostring('<<FIELD>>') + ";" ]

The results of the foreach look something like the output below. The PIDs and users are not in the proper order as far as aligning with the processes, arguments, and hashes. Is there a better way to group all of this properly, preferably together? Thanks!

"C:\Windows\system32\cmd.exe" ;C:\Windows\Explorer.EXE;C:\Windows\system32\userinit.exe;\Device\HarddiskVolume4\Windows\System32\winlogon.exe;




NT AUTHORITY\SYSTEM\SomeName\SomeName\SomeName;
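One possible explanation for the reversed order, offered as an assumption rather than confirmed Splunk behavior: in the raw payload each object lists its keys alphabetically, so `args`, `file`, and `name` come before the nested `parent`, while `pid` and `user` come after it. Any extraction that follows text order would therefore see names from child to ancestor but PIDs from ancestor to child. The minimal Python sketch below reproduces that effect on a hypothetical three-level object with made-up names and PIDs.

```python
import json
import re

# Hypothetical nesting with the same key order as the payload:
# "name" precedes "parent", "pid" follows it.
nested = {
    "name": "child.exe",
    "parent": {
        "name": "mid.exe",
        "parent": {"name": "root.exe", "parent": None, "pid": 1},
        "pid": 2,
    },
    "pid": 3,
}

text = json.dumps(nested)  # dict insertion order is preserved

# Collect values in the order they appear in the serialized text
names = re.findall(r'"name": "([^"]+)"', text)
pids = [int(p) for p in re.findall(r'"pid": (\d+)', text)]

print(names)  # child first, root last
print(pids)   # root first, child last

# Reversing one list realigns it with the other
aligned = list(zip(names, reversed(pids)))
print(aligned)
```

If this is what is happening in the search, reversing the PID and user lists would realign them, although walking the structure level by level avoids relying on key order at all.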



Get anywhere with this? I'm currently working the exact same use case.
