Solved: Re: Expanding nested JSON events into multiple eve...

himynamesdave · ‎02-21-2017

I have nested json events indexed in Splunk. Here's an example of 2 (note confidence value differs):

Event 1:

 {  [-] 
         email:  hidden@hidden.com  
         filter:     confidence >= 60   
         id:     2087   
         integrations:  [   [-] 
            {   [-] 
             name:   nitro  
             product:    nitro  
             product_version:    9.3    
            }   
            {   [-] 
             name:   paloaltonetworks   
             product:    paloaltonetworks   
             product_version:    3020   
            }   
        ]   
         last_intelligence:  2017-02-21T11:54:39.260329+00:00   
         title:  hidden 
         user_id:    8721   
         username:   hidden@hidden.com  
    }

Raw E1:

{"username": " hidden@hidden.com", "user_id": 8721, "title": "hidden", "integrations": [{"product": "nitro", "product_version": "9.3", "name": "nitro"}, {"product": "paloaltonetworks", "product_version": "3020", "name": "paloaltonetworks"}], "email": " hidden@hidden.com", "filter": "confidence >= 60", "id": 2087, "last_intelligence": "2017-02-21T11:54:39.260329+00:00"}

Event 2:

{   [-] 
         email:  hidden@hidden.com  
         filter:     confidence >= 50   
         id:     2087   
         integrations:  [   [-] 
            {   [-] 
             name:   nitro  
             product:    nitro  
             product_version:    9.3    
            }   
            {   [-] 
             name:   paloaltonetworks   
             product:    paloaltonetworks   
             product_version:    3020   
            }   
        ]   
         last_intelligence:  2017-02-21T11:54:39.260329+00:00   
         title:  hidden 
         user_id:    8721   
         username:   hidden@hidden.com  
    }

Raw E2

{"username": " hidden@hidden.com", "user_id": 8721, "title": "hidden", "integrations": [{"product": "nitro", "product_version": "9.3", "name": "nitro"}, {"product": "paloaltonetworks", "product_version": "3020", "name": "paloaltonetworks"}], "email": " hidden@hidden.com", "filter": "confidence >= 50", "id": 2087, "last_intelligence": "2017-02-21T11:54:39.260329+00:00"}

Fields are extracted into fields integration{}.name, integration{}.product, integration{}.product_version. i.e integration{}.product_version=9.3, integration{}.product_version=3020.

I want to have each nested value for each represent a single event for each "integration{}.*". If we imagine this as events:

Event 1A:

{   [-] 
     email:  hidden@hidden.com  
     filter:     confidence >= 60   
     id:     2087   
     integrations:  [   [-] 
        {   [-] 
         name:   nitro  
         product:    nitro  
         product_version:    9.3    
        }
    ]   
     last_intelligence:  2017-02-21T11:54:39.260329+00:00   
     title:  hidden 
     user_id:    8721   
     username:   hidden@hidden.com  
}

Event 1B:

{   [-] 
     email:  hidden@hidden.com  
     filter:     confidence >= 60   
     id:     2087   
     integrations:  [   [-] 
        {   [-] 
         name:   paloaltonetworks   
         product:    paloaltonetworks   
         product_version:    3020   
        }   
    ]   
     last_intelligence:  2017-02-21T11:54:39.260329+00:00   
     title:  hidden 
     user_id:    8721   
     username:   hidden@hidden.com  
}

Event 2A:

{   [-] 
     email:  hidden@hidden.com  
     filter:     confidence >= 50   
     id:     2087   
     integrations:  [   [-] 
        {   [-] 
         name:   nitro  
         product:    nitro  
         product_version:    9.3    
        }
    ]   
     last_intelligence:  2017-02-21T11:54:39.260329+00:00   
     title:  hidden 
     user_id:    8721   
     username:   hidden@hidden.com  
}

Event 2B:

{   [-] 
     email:  hidden@hidden.com  
     filter:     confidence >= 50   
     id:     2087   
     integrations:  [   [-] 
        {   [-] 
         name:   paloaltonetworks   
         product:    paloaltonetworks   
         product_version:    3020   
        }   
    ]   
     last_intelligence:  2017-02-21T11:54:39.260329+00:00   
     title:  hidden 
     user_id:    8721   
     username:   hidden@hidden.com  
}

I am experimenting with spath and mvexpand searches but I am getting some odd results and behaviour using examples from previous answer threads (lots of duplicated events, mvfields, etc).

Ultimately I want to graph these events as tables like:

username, user_id, id, email,title,name,product,product_version,last_intelligence,filter
hidden@hidden.com, 8721, 2087, hidden@hidden.com, hidden, nitro, nitro, 9.3, 2017-02-21T11:54:39.260329+00:00, confidence >= 60
hidden@hidden.com, 8721, 2087, hidden@hidden.com, hidden, paloaltonetworks, paloaltonetworks, 3020, 2017-02-21T11:54:39.260329+00:00, confidence >= 60
hidden@hidden.com, 8721, 2087, hidden@hidden.com, hidden, nitro, nitro, 9.3, 2017-02-21T11:54:39.260329+00:00, confidence >= 50
hidden@hidden.com, 8721, 2087, hidden@hidden.com, hidden, paloaltonetworks, paloaltonetworks, 3020, 2017-02-21T11:54:39.260329+00:00, confidence >= 50

One note which might be pertinent: all my events have the same timestamp (using DATETIME_CONFIG=CURRENT)

Can anyone give me any pointers? Thanks!

martin_mueller · ‎02-21-2017

Here you go:

| makeresults | eval foo = "{\"username\": \" hidden@hidden.com\", \"user_id\": 8721, \"title\": \"hidden\", \"integrations\": [{\"product\": \"nitro\", \"product_version\": \"9.3\", \"name\": \"nitro\"}, {\"product\": \"paloaltonetworks\", \"product_version\": \"3020\", \"name\": \"paloaltonetworks\"}], \"email\": \" hidden@hidden.com\", \"filter\": \"confidence >= 60\", \"id\": 2087, \"last_intelligence\": \"2017-02-21T11:54:39.260329+00:00\"}
{\"username\": \" hidden@hidden.com\", \"user_id\": 8721, \"title\": \"hidden\", \"integrations\": [{\"product\": \"nitro\", \"product_version\": \"9.3\", \"name\": \"nitro\"}, {\"product\": \"paloaltonetworks\", \"product_version\": \"3020\", \"name\": \"paloaltonetworks\"}], \"email\": \" hidden@hidden.com\", \"filter\": \"confidence >= 50\", \"id\": 2087, \"last_intelligence\": \"2017-02-21T11:54:39.260329+00:00\"}" | makemv foo delim="
" | mvexpand foo | rename foo as _raw | spath
 | spath integrations{} | mvexpand integrations{}
 | spath input=integrations{}
 | table username, user_id, id, email,title,name,product,product_version,last_intelligence,filter

Depending on how your sourcetype is configured, you can probably leave off the first no-parameter spath, the key is spath integrations{} | mvexpand integrations{} | spath input=integrations{} - extract the objects inside the array into a multivalue field, expand the field, extract each object's content.

View solution in original post

martin_mueller · ‎02-21-2017

Here you go:

| makeresults | eval foo = "{\"username\": \" hidden@hidden.com\", \"user_id\": 8721, \"title\": \"hidden\", \"integrations\": [{\"product\": \"nitro\", \"product_version\": \"9.3\", \"name\": \"nitro\"}, {\"product\": \"paloaltonetworks\", \"product_version\": \"3020\", \"name\": \"paloaltonetworks\"}], \"email\": \" hidden@hidden.com\", \"filter\": \"confidence >= 60\", \"id\": 2087, \"last_intelligence\": \"2017-02-21T11:54:39.260329+00:00\"}
{\"username\": \" hidden@hidden.com\", \"user_id\": 8721, \"title\": \"hidden\", \"integrations\": [{\"product\": \"nitro\", \"product_version\": \"9.3\", \"name\": \"nitro\"}, {\"product\": \"paloaltonetworks\", \"product_version\": \"3020\", \"name\": \"paloaltonetworks\"}], \"email\": \" hidden@hidden.com\", \"filter\": \"confidence >= 50\", \"id\": 2087, \"last_intelligence\": \"2017-02-21T11:54:39.260329+00:00\"}" | makemv foo delim="
" | mvexpand foo | rename foo as _raw | spath
 | spath integrations{} | mvexpand integrations{}
 | spath input=integrations{}
 | table username, user_id, id, email,title,name,product,product_version,last_intelligence,filter

Depending on how your sourcetype is configured, you can probably leave off the first no-parameter spath, the key is spath integrations{} | mvexpand integrations{} | spath input=integrations{} - extract the objects inside the array into a multivalue field, expand the field, extract each object's content.

martin_mueller · ‎02-21-2017

If your events already are configured to do the basic JSON extractions then leave off the first |spath. Only use lines 4-6 from my answer, the rest is preparation of the dummy data.

himynamesdave · ‎02-22-2017

Thanks, Martin. Works perfectly.

himynamesdave · ‎02-21-2017

Hi Martin - thanks for taking the time to answer my question. I think I have an issue with field extraction which might be causing a problem, one event (e.g event 1 above) reports a count of 2 for each integration.* value.

When I run SPATH it duplicates events:

Which results in 3 values in each row

And now reports a count of 6 (vs 2)

Now I'm really confused!

JDukeSplunk · ‎02-21-2017

Oddly enough, I have had some success with | xmlkv with supposedly json rest inputs.

It's worth throwing it in a quick search and seeing if some fields generate, and are more consistent.

index=yourindex sourcetype=yoursourcetype source=yourfilepath | xmlkv

You may also want to look at the raw data, and see if Splunk is inserting line breakers in the wrong places (most likely at the embedded timestamp), and only giving you partial events, or lumping multiple events together. If so, you will need to put a transforms.conf in place for the input, and wrestle with the regex that determines a newline.

martin_mueller · ‎02-21-2017

It'd help if you added a pasteable or indexable example.

himynamesdave · ‎02-21-2017

Of course, question now updated

Expanding nested JSON events into multiple events for search

Modern way of developing distributed application using OTel

Enterprise Security Content Update (ESCU) | New Releases

Archived Metrics Now Available for APAC and EMEA realms