Splunk Search

Eval JSON_EXTRACT Issues

indigo42
Explorer

All,

I'm working on extracting some key info out of an Ansible HEC collector.  I'm hoping to use json_extract stuff like run time, machine etc.

  • The data shows up in Search 
  • The data is formatted in proper json "tree" view and color coding in Search.
  • Ansible app uses the _json source type. 

When I tried to use .

...| eval foo = json_extract(<objectname>) | table foo

I can only get it show values for the first object in the list. 

After many hours of fiddling around I decided to see if I could get json_extract to work in a simpler scenario. I decided to try out the "cities" example from the Splunk online Dovs

https://docs.splunk.com/Documentation/SCS/current/SearchReference/JSONFunctions 

 

I ingested the example below as a file. I did NOT use _json source type so no index field extractions we should just have the raw JSON below.

{
  "cities": [
    {
      "name": "London",
      "Bridges": [
        { "name": "Tower Bridge", "length": 801 },
        { "name": "Millennium Bridge", "length": 1066 }
      ]
    },
    {
      "name": "Venice",
      "Bridges": [
        { "name": "Rialto Bridge", "length": 157 },
        { "name": "Bridge of Sighs", "length": 36 },
        { "name": "Ponte della Paglia" }
      ]
    },
    {
      "name": "San Francisco",
      "Bridges": [
        { "name": "Golden Gate Bridge", "length": 8981 },
        { "name": "Bay Bridge", "length": 23556 }
      ]
    }
  ]
}

I then try the following statement from the Splunk Doc

...| eval extract_cities = json_extract(cities) | table extract_cities

I get nothing.

The example says I should get this below.

JSON functions - Splunk Documentation 2020-10-15 10-44-48.png

I'm on Splunk 8.0.6. Is this a bug?  This is the first time I've had to work with JSON on this box.

Many thanks in advance for the help.

Labels (2)
Tags (1)
1 Solution

to4kawa
Ultra Champion
index=_internal | head 1 | fields _raw _time | eval _raw="{
  \"cities\": [
    {
      \"name\": \"London\",
      \"Bridges\": [
        { \"name\": \"Tower Bridge\", \"length\": 801 },
        { \"name\": \"Millennium Bridge\", \"length\": 1066 }
      ]
    },
    {
      \"name\": \"Venice\",
      \"Bridges\": [
        { \"name\": \"Rialto Bridge\", \"length\": 157 },
        { \"name\": \"Bridge of Sighs\", \"length\": 36 },
        { \"name\": \"Ponte della Paglia\" }
      ]
    },
    {
      \"name\": \"San Francisco\",
      \"Bridges\": [
        { \"name\": \"Golden Gate Bridge\", \"length\": 8981 },
        { \"name\": \"Bay Bridge\", \"length\": 23556 }
      ]
    }
  ]
}"
| rename COMMENT as "the logic"
| spath cities{} output=cities
| stats count by cities
| spath input=cities Bridges{} output=Bridges
| mvexpand Bridges
| spath input=cities name output=city
| spath input=Bridges
| table city name length

View solution in original post

waechtler_amaso
Explorer

I just stumbled over the same issue.

The reason why it fails is because the eval command json_extract expects a FIELD as the first argument.

So when you run

...| eval extract_cities = json_extract(cities) | table extract_cities

it evaluates the content of a field named "cities", which does not exist. Hence no results.

Just try

...| eval extract_cities = json_extract(_raw) | table extract_cities

and it will work.

0 Karma

to4kawa
Ultra Champion
0 Karma

to4kawa
Ultra Champion

your log is not valid json and _raw.

 

 

index=_internal |head 1| fields _raw | eval _raw="{\"ansible_check_mode\":false,\"ansible_host\":\"XXXXXXXXX\",\"ansible_playbook\":\"update.yml\",\"ansible_result\":{\"_ansible_no_log\":false,\"changed\":true,\"failed_update_count\":0,\"filtered_updates\":{},\"found_update_count\":2,\"installed_update_count\":2,\"reboot_required\":true,\"updates\":{\"a33fefeb-b078-4cde-b9b9-f3473007bef2\":{\"categories\":[\"Security Updates\",\"Windows Server 2016\"],\"id\":\"a33fefeb-b078-4cde-b9b9-f3473007bef2\",\"installed\":true,\"kb\":[4580346],\"title\":\"2020-10 Cumulative Update for Windows Server 2016 for x64-based Systems (KB4580346)\"},\"b0a743d6-025a-4ca4-a998-b019a5b176f3\":{}}}}"
| spath

 

 

Accurate path is ansible_result.updates.a33fefeb-b078-4cde-b9b9-f3473007bef2.kb{}

| rename ansible_result.updates.a33fefeb-b078-4cde-b9b9-f3473007bef2.* as *
| rename *{} as *

how about this?



0 Karma

indigo42
Explorer

I copied a section out of the whole JSON, I may have missed something. 

The problem is I can't pre populate the session ID (the  b0a743d6-025a-4ca4-a998-b019a5b176f3 part is dynamically generated.  So if I were to psuedo code it, I would need something like this...

ansible_result.updates.*.*

But alas, it doesn't seem to want to let me do that. 

 Since this is a json object type I was able to find an index extracted field to wild card, so I tried this...

index=* ansible_task.action="win_updates"  ansible_host="XXXX"
| foreach ansible_result.filtered_updates*.kb* [eval flt_kb=flt_kb.'<<FIELD>>']
| table flt_kb ansible_result.filtered_updates*.kb*

 It produced all of the values but it did not concatenate them in the end. 

0 Karma

indigo42
Explorer

I got something that works.

| eval upd_kb=""
| foreach ansible_result.filtered_updates.*.kb{} [eval upd_kb=upd_kb.if(upd_kb=="","",if(isnull('<<FIELD>>'),"",",")).if(isnull('<<FIELD>>'),"",'<<FIELD>>') ] 
| table upd_kb

 I modified the issue from this post  here. Their problem was just like mine except my data is all in one event, so I could simplify some things. I loved the isnull feature, so I stole it. 

Thanks so much all for helping me on this. 

I will give props to you for stearing me away from json_extract though! I was convinced that was the way to go. 

 

0 Karma

to4kawa
Ultra Champion
index=_internal | head 1 | fields _raw _time | eval _raw="{
  \"cities\": [
    {
      \"name\": \"London\",
      \"Bridges\": [
        { \"name\": \"Tower Bridge\", \"length\": 801 },
        { \"name\": \"Millennium Bridge\", \"length\": 1066 }
      ]
    },
    {
      \"name\": \"Venice\",
      \"Bridges\": [
        { \"name\": \"Rialto Bridge\", \"length\": 157 },
        { \"name\": \"Bridge of Sighs\", \"length\": 36 },
        { \"name\": \"Ponte della Paglia\" }
      ]
    },
    {
      \"name\": \"San Francisco\",
      \"Bridges\": [
        { \"name\": \"Golden Gate Bridge\", \"length\": 8981 },
        { \"name\": \"Bay Bridge\", \"length\": 23556 }
      ]
    }
  ]
}"
| rename COMMENT as "the logic"
| spath cities{} output=cities
| stats count by cities
| spath input=cities Bridges{} output=Bridges
| mvexpand Bridges
| spath input=cities name output=city
| spath input=Bridges
| table city name length

indigo42
Explorer

Ok This almost works but....

index=* ansible_task.action="win_updates" session="f725550b-c8f2-4d01-89e4-6a07527060d3" 
| eval win_upd_failed=spath(_raw, "ansible_result.failed_update_count") 
| eval win_upd_found=spath(_raw, "ansible_result.found_update_count")
| eval win_upd_installed=spath(_raw, "ansible_result.installed_update_count")
| eval win_upd_reboot=spath(_raw, "ansible_result.reboot_required")
| foreach ansible_result.updates*.categories{} [eval upd_cat='<<FIELD>>']
| foreach ansible_result.updates*.installed [eval upd_inst='<<FIELD>>']
| foreach ansible_result.updates*.kb{} [eval upd_kb='<<FIELD>>']
| foreach ansible_result.updates*.title [eval upd_title='<<FIELD>>']
| foreach ansible_result.filtered_updates*.filtered_reason [eval flt_rsn='<<FIELD>>']
| foreach ansible_result.filtered_updates*.categories{} [eval flt_upd_cat='<<FIELD>>']
| foreach ansible_result.filtered_updates*.kb{} [eval flt_kb='<<FIELD>>']
| foreach ansible_result.filtered_updates*.title [eval flt_title='<<FIELD>>']
| table ansible_host win_upd_failed win_upd_found win_upd_installed runtime win_upd_reboot ansible_result.found_update_count upd_inst upd_kb upd_cat upd_title flt_upd_cat flt_rsn flt_kb flt_title

 

But...it only show the last value, instead of multiple values for the following

| foreach ansible_result.updates*.categories{} [eval upd_cat='<<FIELD>>']
| foreach ansible_result.updates*.kb{} [eval upd_kb='<<FIELD>>']
| foreach ansible_result.filtered_updates*.categories{} [eval flt_upd_cat='<<FIELD>>']
| foreach ansible_result.filtered_updates*.kb{} [eval flt_kb='<<FIELD>>']

The rest of them show multiple values as they should. 

0 Karma

indigo42
Explorer

Thanks to4kawa for the redirect to spath!

I'm trying extract fields from the json output from ansible win_update. Which as fairly deep nesting.

 

 

index=* ansible_task.action="win_updates" 
| eval win_upd_failed=spath(_raw, "ansible_result.failed_update_count") 
| eval win_upd_found=spath(_raw, "ansible_result.found_update_count")
| eval win_upd_installed=spath(_raw, "ansible_result.installed_update_count")
| eval win_upd_reboot=spath(_raw, "ansible_result.reboot_required")
| eval win_upd_reboot=spath(_raw, "ansible_result.filtered_updates.categories")
| table ansible_host win_upd_failed  win_upd_found win_upd_installed win_upd_reboot runtime

 

 

This part works well... 

Here's the next issue... I need to pull the KB:  numbers and categories: which are buried in there like this. The problem is they exist in an object that gets dynamically named. I don't need to know the object name ever. It's under updates { in the json below...

 

 

   ansible_check_mode: false
   ansible_host: XXXXXXXXX
   ansible_playbook: update.yml
   ansible_result: { [-]
     _ansible_no_log: false
     changed: true
     failed_update_count: 0
     filtered_updates: { [-]
     }
     found_update_count: 2
     installed_update_count: 2
     reboot_required: true
     updates: { [-]
       a33fefeb-b078-4cde-b9b9-f3473007bef2: { [-] <--I can't pre-populate this in path.
         categories: [ [-]
           Security Updates
           Windows Server 2016
         ]
         id: a33fefeb-b078-4cde-b9b9-f3473007bef2
         installed: true
         kb: [ [-]
           4580346
         ]
         title: 2020-10 Cumulative Update for Windows Server 2016 for x64-based Systems (KB4580346)
       }
       b0a743d6-025a-4ca4-a998-b019a5b176f3: { [+]
       }
     }
   }

 

 

 

I was trying to get something like this to work with no luck

 

 

spath path=ansible_result.filtered_updates.updates{}.categories{} output=foo

 

 

 

I can only get the categories like this by using the darn ID GUID thing. Which I can't pre populate.

 

 

index=* ansible_task.action="win_updates" 
| spath ansible_result.filtered_updates output=filtered_updates
| spath input=filtered_updates path="25ca9da5-4eba-4cc5-94b5-d25f1a70e216{}.categories{}" output=foo
| table foo

 

 

 

Thanks for looking into this!!!!

0 Karma
Get Updates on the Splunk Community!

Splunk Enterprise Security 8.0.2 Availability: On cloud and On-premise!

A few months ago, we released Splunk Enterprise Security 8.0 for our cloud customers. Today, we are excited to ...

Logs to Metrics

Logs and Metrics Logs are generally unstructured text or structured events emitted by applications and written ...

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...