Splunk Search

Eval JSON_EXTRACT Issues

indigo42
Explorer

All,

I'm working on extracting some key info out of an Ansible HEC collector.  I'm hoping to use json_extract stuff like run time, machine etc.

  • The data shows up in Search 
  • The data is formatted in proper json "tree" view and color coding in Search.
  • Ansible app uses the _json source type. 

When I tried to use .

...| eval foo = json_extract(<objectname>) | table foo

I can only get it show values for the first object in the list. 

After many hours of fiddling around I decided to see if I could get json_extract to work in a simpler scenario. I decided to try out the "cities" example from the Splunk online Dovs

https://docs.splunk.com/Documentation/SCS/current/SearchReference/JSONFunctions 

 

I ingested the example below as a file. I did NOT use _json source type so no index field extractions we should just have the raw JSON below.

{
  "cities": [
    {
      "name": "London",
      "Bridges": [
        { "name": "Tower Bridge", "length": 801 },
        { "name": "Millennium Bridge", "length": 1066 }
      ]
    },
    {
      "name": "Venice",
      "Bridges": [
        { "name": "Rialto Bridge", "length": 157 },
        { "name": "Bridge of Sighs", "length": 36 },
        { "name": "Ponte della Paglia" }
      ]
    },
    {
      "name": "San Francisco",
      "Bridges": [
        { "name": "Golden Gate Bridge", "length": 8981 },
        { "name": "Bay Bridge", "length": 23556 }
      ]
    }
  ]
}

I then try the following statement from the Splunk Doc

...| eval extract_cities = json_extract(cities) | table extract_cities

I get nothing.

The example says I should get this below.

JSON functions - Splunk Documentation 2020-10-15 10-44-48.png

I'm on Splunk 8.0.6. Is this a bug?  This is the first time I've had to work with JSON on this box.

Many thanks in advance for the help.

Labels (2)
Tags (1)
1 Solution

to4kawa
Ultra Champion
index=_internal | head 1 | fields _raw _time | eval _raw="{
  \"cities\": [
    {
      \"name\": \"London\",
      \"Bridges\": [
        { \"name\": \"Tower Bridge\", \"length\": 801 },
        { \"name\": \"Millennium Bridge\", \"length\": 1066 }
      ]
    },
    {
      \"name\": \"Venice\",
      \"Bridges\": [
        { \"name\": \"Rialto Bridge\", \"length\": 157 },
        { \"name\": \"Bridge of Sighs\", \"length\": 36 },
        { \"name\": \"Ponte della Paglia\" }
      ]
    },
    {
      \"name\": \"San Francisco\",
      \"Bridges\": [
        { \"name\": \"Golden Gate Bridge\", \"length\": 8981 },
        { \"name\": \"Bay Bridge\", \"length\": 23556 }
      ]
    }
  ]
}"
| rename COMMENT as "the logic"
| spath cities{} output=cities
| stats count by cities
| spath input=cities Bridges{} output=Bridges
| mvexpand Bridges
| spath input=cities name output=city
| spath input=Bridges
| table city name length

View solution in original post

waechtler_amaso
Explorer

I just stumbled over the same issue.

The reason why it fails is because the eval command json_extract expects a FIELD as the first argument.

So when you run

...| eval extract_cities = json_extract(cities) | table extract_cities

it evaluates the content of a field named "cities", which does not exist. Hence no results.

Just try

...| eval extract_cities = json_extract(_raw) | table extract_cities

and it will work.

0 Karma

to4kawa
Ultra Champion
0 Karma

to4kawa
Ultra Champion

your log is not valid json and _raw.

 

 

index=_internal |head 1| fields _raw | eval _raw="{\"ansible_check_mode\":false,\"ansible_host\":\"XXXXXXXXX\",\"ansible_playbook\":\"update.yml\",\"ansible_result\":{\"_ansible_no_log\":false,\"changed\":true,\"failed_update_count\":0,\"filtered_updates\":{},\"found_update_count\":2,\"installed_update_count\":2,\"reboot_required\":true,\"updates\":{\"a33fefeb-b078-4cde-b9b9-f3473007bef2\":{\"categories\":[\"Security Updates\",\"Windows Server 2016\"],\"id\":\"a33fefeb-b078-4cde-b9b9-f3473007bef2\",\"installed\":true,\"kb\":[4580346],\"title\":\"2020-10 Cumulative Update for Windows Server 2016 for x64-based Systems (KB4580346)\"},\"b0a743d6-025a-4ca4-a998-b019a5b176f3\":{}}}}"
| spath

 

 

Accurate path is ansible_result.updates.a33fefeb-b078-4cde-b9b9-f3473007bef2.kb{}

| rename ansible_result.updates.a33fefeb-b078-4cde-b9b9-f3473007bef2.* as *
| rename *{} as *

how about this?



0 Karma

indigo42
Explorer

I copied a section out of the whole JSON, I may have missed something. 

The problem is I can't pre populate the session ID (the  b0a743d6-025a-4ca4-a998-b019a5b176f3 part is dynamically generated.  So if I were to psuedo code it, I would need something like this...

ansible_result.updates.*.*

But alas, it doesn't seem to want to let me do that. 

 Since this is a json object type I was able to find an index extracted field to wild card, so I tried this...

index=* ansible_task.action="win_updates"  ansible_host="XXXX"
| foreach ansible_result.filtered_updates*.kb* [eval flt_kb=flt_kb.'<<FIELD>>']
| table flt_kb ansible_result.filtered_updates*.kb*

 It produced all of the values but it did not concatenate them in the end. 

0 Karma

indigo42
Explorer

I got something that works.

| eval upd_kb=""
| foreach ansible_result.filtered_updates.*.kb{} [eval upd_kb=upd_kb.if(upd_kb=="","",if(isnull('<<FIELD>>'),"",",")).if(isnull('<<FIELD>>'),"",'<<FIELD>>') ] 
| table upd_kb

 I modified the issue from this post  here. Their problem was just like mine except my data is all in one event, so I could simplify some things. I loved the isnull feature, so I stole it. 

Thanks so much all for helping me on this. 

I will give props to you for stearing me away from json_extract though! I was convinced that was the way to go. 

 

0 Karma

to4kawa
Ultra Champion
index=_internal | head 1 | fields _raw _time | eval _raw="{
  \"cities\": [
    {
      \"name\": \"London\",
      \"Bridges\": [
        { \"name\": \"Tower Bridge\", \"length\": 801 },
        { \"name\": \"Millennium Bridge\", \"length\": 1066 }
      ]
    },
    {
      \"name\": \"Venice\",
      \"Bridges\": [
        { \"name\": \"Rialto Bridge\", \"length\": 157 },
        { \"name\": \"Bridge of Sighs\", \"length\": 36 },
        { \"name\": \"Ponte della Paglia\" }
      ]
    },
    {
      \"name\": \"San Francisco\",
      \"Bridges\": [
        { \"name\": \"Golden Gate Bridge\", \"length\": 8981 },
        { \"name\": \"Bay Bridge\", \"length\": 23556 }
      ]
    }
  ]
}"
| rename COMMENT as "the logic"
| spath cities{} output=cities
| stats count by cities
| spath input=cities Bridges{} output=Bridges
| mvexpand Bridges
| spath input=cities name output=city
| spath input=Bridges
| table city name length

indigo42
Explorer

Ok This almost works but....

index=* ansible_task.action="win_updates" session="f725550b-c8f2-4d01-89e4-6a07527060d3" 
| eval win_upd_failed=spath(_raw, "ansible_result.failed_update_count") 
| eval win_upd_found=spath(_raw, "ansible_result.found_update_count")
| eval win_upd_installed=spath(_raw, "ansible_result.installed_update_count")
| eval win_upd_reboot=spath(_raw, "ansible_result.reboot_required")
| foreach ansible_result.updates*.categories{} [eval upd_cat='<<FIELD>>']
| foreach ansible_result.updates*.installed [eval upd_inst='<<FIELD>>']
| foreach ansible_result.updates*.kb{} [eval upd_kb='<<FIELD>>']
| foreach ansible_result.updates*.title [eval upd_title='<<FIELD>>']
| foreach ansible_result.filtered_updates*.filtered_reason [eval flt_rsn='<<FIELD>>']
| foreach ansible_result.filtered_updates*.categories{} [eval flt_upd_cat='<<FIELD>>']
| foreach ansible_result.filtered_updates*.kb{} [eval flt_kb='<<FIELD>>']
| foreach ansible_result.filtered_updates*.title [eval flt_title='<<FIELD>>']
| table ansible_host win_upd_failed win_upd_found win_upd_installed runtime win_upd_reboot ansible_result.found_update_count upd_inst upd_kb upd_cat upd_title flt_upd_cat flt_rsn flt_kb flt_title

 

But...it only show the last value, instead of multiple values for the following

| foreach ansible_result.updates*.categories{} [eval upd_cat='<<FIELD>>']
| foreach ansible_result.updates*.kb{} [eval upd_kb='<<FIELD>>']
| foreach ansible_result.filtered_updates*.categories{} [eval flt_upd_cat='<<FIELD>>']
| foreach ansible_result.filtered_updates*.kb{} [eval flt_kb='<<FIELD>>']

The rest of them show multiple values as they should. 

0 Karma

indigo42
Explorer

Thanks to4kawa for the redirect to spath!

I'm trying extract fields from the json output from ansible win_update. Which as fairly deep nesting.

 

 

index=* ansible_task.action="win_updates" 
| eval win_upd_failed=spath(_raw, "ansible_result.failed_update_count") 
| eval win_upd_found=spath(_raw, "ansible_result.found_update_count")
| eval win_upd_installed=spath(_raw, "ansible_result.installed_update_count")
| eval win_upd_reboot=spath(_raw, "ansible_result.reboot_required")
| eval win_upd_reboot=spath(_raw, "ansible_result.filtered_updates.categories")
| table ansible_host win_upd_failed  win_upd_found win_upd_installed win_upd_reboot runtime

 

 

This part works well... 

Here's the next issue... I need to pull the KB:  numbers and categories: which are buried in there like this. The problem is they exist in an object that gets dynamically named. I don't need to know the object name ever. It's under updates { in the json below...

 

 

   ansible_check_mode: false
   ansible_host: XXXXXXXXX
   ansible_playbook: update.yml
   ansible_result: { [-]
     _ansible_no_log: false
     changed: true
     failed_update_count: 0
     filtered_updates: { [-]
     }
     found_update_count: 2
     installed_update_count: 2
     reboot_required: true
     updates: { [-]
       a33fefeb-b078-4cde-b9b9-f3473007bef2: { [-] <--I can't pre-populate this in path.
         categories: [ [-]
           Security Updates
           Windows Server 2016
         ]
         id: a33fefeb-b078-4cde-b9b9-f3473007bef2
         installed: true
         kb: [ [-]
           4580346
         ]
         title: 2020-10 Cumulative Update for Windows Server 2016 for x64-based Systems (KB4580346)
       }
       b0a743d6-025a-4ca4-a998-b019a5b176f3: { [+]
       }
     }
   }

 

 

 

I was trying to get something like this to work with no luck

 

 

spath path=ansible_result.filtered_updates.updates{}.categories{} output=foo

 

 

 

I can only get the categories like this by using the darn ID GUID thing. Which I can't pre populate.

 

 

index=* ansible_task.action="win_updates" 
| spath ansible_result.filtered_updates output=filtered_updates
| spath input=filtered_updates path="25ca9da5-4eba-4cc5-94b5-d25f1a70e216{}.categories{}" output=foo
| table foo

 

 

 

Thanks for looking into this!!!!

0 Karma
Get Updates on the Splunk Community!

Splunk Edge Processor | Popular Use Cases to Get Started with Edge Processor

Splunk Edge Processor offers more efficient, flexible data transformation – helping you reduce noise, control ...

Introducing New Splunkbase Governance!

Splunk apps are essential for maximizing the value of your Splunk Experience. Whether you’re using the default ...

3 Ways to Make OpenTelemetry Even Better

My role as an Observability Specialist at Splunk provides me with the opportunity to work with customers of ...