I have nested json events indexed in Splunk. Here's an example of 2 (note confidence value differs):
Event 1:
{ [-]
email: hidden@hidden.com
filter: confidence >= 60
id: 2087
integrations: [ [-]
{ [-]
name: nitro
product: nitro
product_version: 9.3
}
{ [-]
name: paloaltonetworks
product: paloaltonetworks
product_version: 3020
}
]
last_intelligence: 2017-02-21T11:54:39.260329+00:00
title: hidden
user_id: 8721
username: hidden@hidden.com
}
Raw E1:
{"username": " hidden@hidden.com", "user_id": 8721, "title": "hidden", "integrations": [{"product": "nitro", "product_version": "9.3", "name": "nitro"}, {"product": "paloaltonetworks", "product_version": "3020", "name": "paloaltonetworks"}], "email": " hidden@hidden.com", "filter": "confidence >= 60", "id": 2087, "last_intelligence": "2017-02-21T11:54:39.260329+00:00"}
Event 2:
{ [-]
email: hidden@hidden.com
filter: confidence >= 50
id: 2087
integrations: [ [-]
{ [-]
name: nitro
product: nitro
product_version: 9.3
}
{ [-]
name: paloaltonetworks
product: paloaltonetworks
product_version: 3020
}
]
last_intelligence: 2017-02-21T11:54:39.260329+00:00
title: hidden
user_id: 8721
username: hidden@hidden.com
}
Raw E2
{"username": " hidden@hidden.com", "user_id": 8721, "title": "hidden", "integrations": [{"product": "nitro", "product_version": "9.3", "name": "nitro"}, {"product": "paloaltonetworks", "product_version": "3020", "name": "paloaltonetworks"}], "email": " hidden@hidden.com", "filter": "confidence >= 50", "id": 2087, "last_intelligence": "2017-02-21T11:54:39.260329+00:00"}
Fields are extracted into fields integration{}.name
, integration{}.product
, integration{}.product_version
. i.e integration{}.product_version=9.3
, integration{}.product_version=3020
.
I want to have each nested value for each represent a single event for each "integration{}.*". If we imagine this as events:
Event 1A:
{ [-]
email: hidden@hidden.com
filter: confidence >= 60
id: 2087
integrations: [ [-]
{ [-]
name: nitro
product: nitro
product_version: 9.3
}
]
last_intelligence: 2017-02-21T11:54:39.260329+00:00
title: hidden
user_id: 8721
username: hidden@hidden.com
}
Event 1B:
{ [-]
email: hidden@hidden.com
filter: confidence >= 60
id: 2087
integrations: [ [-]
{ [-]
name: paloaltonetworks
product: paloaltonetworks
product_version: 3020
}
]
last_intelligence: 2017-02-21T11:54:39.260329+00:00
title: hidden
user_id: 8721
username: hidden@hidden.com
}
Event 2A:
{ [-]
email: hidden@hidden.com
filter: confidence >= 50
id: 2087
integrations: [ [-]
{ [-]
name: nitro
product: nitro
product_version: 9.3
}
]
last_intelligence: 2017-02-21T11:54:39.260329+00:00
title: hidden
user_id: 8721
username: hidden@hidden.com
}
Event 2B:
{ [-]
email: hidden@hidden.com
filter: confidence >= 50
id: 2087
integrations: [ [-]
{ [-]
name: paloaltonetworks
product: paloaltonetworks
product_version: 3020
}
]
last_intelligence: 2017-02-21T11:54:39.260329+00:00
title: hidden
user_id: 8721
username: hidden@hidden.com
}
I am experimenting with spath
and mvexpand
searches but I am getting some odd results and behaviour using examples from previous answer threads (lots of duplicated events, mvfields, etc).
Ultimately I want to graph these events as tables like:
username, user_id, id, email,title,name,product,product_version,last_intelligence,filter
hidden@hidden.com, 8721, 2087, hidden@hidden.com, hidden, nitro, nitro, 9.3, 2017-02-21T11:54:39.260329+00:00, confidence >= 60
hidden@hidden.com, 8721, 2087, hidden@hidden.com, hidden, paloaltonetworks, paloaltonetworks, 3020, 2017-02-21T11:54:39.260329+00:00, confidence >= 60
hidden@hidden.com, 8721, 2087, hidden@hidden.com, hidden, nitro, nitro, 9.3, 2017-02-21T11:54:39.260329+00:00, confidence >= 50
hidden@hidden.com, 8721, 2087, hidden@hidden.com, hidden, paloaltonetworks, paloaltonetworks, 3020, 2017-02-21T11:54:39.260329+00:00, confidence >= 50
One note which might be pertinent: all my events have the same timestamp (using DATETIME_CONFIG=CURRENT)
Can anyone give me any pointers? Thanks!
Here you go:
| makeresults | eval foo = "{\"username\": \" hidden@hidden.com\", \"user_id\": 8721, \"title\": \"hidden\", \"integrations\": [{\"product\": \"nitro\", \"product_version\": \"9.3\", \"name\": \"nitro\"}, {\"product\": \"paloaltonetworks\", \"product_version\": \"3020\", \"name\": \"paloaltonetworks\"}], \"email\": \" hidden@hidden.com\", \"filter\": \"confidence >= 60\", \"id\": 2087, \"last_intelligence\": \"2017-02-21T11:54:39.260329+00:00\"}
{\"username\": \" hidden@hidden.com\", \"user_id\": 8721, \"title\": \"hidden\", \"integrations\": [{\"product\": \"nitro\", \"product_version\": \"9.3\", \"name\": \"nitro\"}, {\"product\": \"paloaltonetworks\", \"product_version\": \"3020\", \"name\": \"paloaltonetworks\"}], \"email\": \" hidden@hidden.com\", \"filter\": \"confidence >= 50\", \"id\": 2087, \"last_intelligence\": \"2017-02-21T11:54:39.260329+00:00\"}" | makemv foo delim="
" | mvexpand foo | rename foo as _raw | spath
| spath integrations{} | mvexpand integrations{}
| spath input=integrations{}
| table username, user_id, id, email,title,name,product,product_version,last_intelligence,filter
Depending on how your sourcetype is configured, you can probably leave off the first no-parameter spath
, the key is spath integrations{} | mvexpand integrations{} | spath input=integrations{}
- extract the objects inside the array into a multivalue field, expand the field, extract each object's content.
Here you go:
| makeresults | eval foo = "{\"username\": \" hidden@hidden.com\", \"user_id\": 8721, \"title\": \"hidden\", \"integrations\": [{\"product\": \"nitro\", \"product_version\": \"9.3\", \"name\": \"nitro\"}, {\"product\": \"paloaltonetworks\", \"product_version\": \"3020\", \"name\": \"paloaltonetworks\"}], \"email\": \" hidden@hidden.com\", \"filter\": \"confidence >= 60\", \"id\": 2087, \"last_intelligence\": \"2017-02-21T11:54:39.260329+00:00\"}
{\"username\": \" hidden@hidden.com\", \"user_id\": 8721, \"title\": \"hidden\", \"integrations\": [{\"product\": \"nitro\", \"product_version\": \"9.3\", \"name\": \"nitro\"}, {\"product\": \"paloaltonetworks\", \"product_version\": \"3020\", \"name\": \"paloaltonetworks\"}], \"email\": \" hidden@hidden.com\", \"filter\": \"confidence >= 50\", \"id\": 2087, \"last_intelligence\": \"2017-02-21T11:54:39.260329+00:00\"}" | makemv foo delim="
" | mvexpand foo | rename foo as _raw | spath
| spath integrations{} | mvexpand integrations{}
| spath input=integrations{}
| table username, user_id, id, email,title,name,product,product_version,last_intelligence,filter
Depending on how your sourcetype is configured, you can probably leave off the first no-parameter spath
, the key is spath integrations{} | mvexpand integrations{} | spath input=integrations{}
- extract the objects inside the array into a multivalue field, expand the field, extract each object's content.
If your events already are configured to do the basic JSON extractions then leave off the first |spath
. Only use lines 4-6 from my answer, the rest is preparation of the dummy data.
Thanks, Martin. Works perfectly.
Hi Martin - thanks for taking the time to answer my question. I think I have an issue with field extraction which might be causing a problem, one event (e.g event 1 above) reports a count of 2 for each integration.* value.
When I run SPATH it duplicates events:
Which results in 3 values in each row
And now reports a count of 6 (vs 2)
Now I'm really confused!
Oddly enough, I have had some success with | xmlkv
with supposedly json rest inputs.
It's worth throwing it in a quick search and seeing if some fields generate, and are more consistent.
index=yourindex sourcetype=yoursourcetype source=yourfilepath | xmlkv
You may also want to look at the raw data, and see if Splunk is inserting line breakers in the wrong places (most likely at the embedded timestamp), and only giving you partial events, or lumping multiple events together. If so, you will need to put a transforms.conf in place for the input, and wrestle with the regex that determines a newline.
It'd help if you added a pasteable or indexable example.
Of course, question now updated