- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Avoid multiple spath for a better performant query
I have a json with the following structure:
{
"version":"v0.2",
"prints":{
"urls":[
{
"response_time":256,
"uri":{
"bool":false,
"name":"abc"
},
"Time":{
"total":52,
"db":11
}
},
{
"response_time":578,
"uri":{
"bool":false,
"name":"xyz"
},
"Time":{
"total":78,
"db":13
}
}
]
}
}
I've to create a table with columns : _time, rv, av, wm, an, et, uri_name, response_time, db_time, total_time
here is my query that I'm trying to optimize by removing multiple spaths
basic search rv=*, av=*, wm=*, an=*, et=*
| spath input=data path=prints.urls{} output=urls
| spath input=urls path=response_time output=response_time
| spath input=urls path=uri.name output=uri_name
| spath input=urls path=Time.db output=db_time
| spath input=urls path=Time.total output=total_time
| table _time, rv, av, wm, an, et, uri_name, response_time, db_time, total_time
I can't help but think there would be a more optimized way to get the table.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content


You can try this also.
YOUR_SEARCH | spath prints.urls{} output=urls | stats count latest(_time) as Time by urls | rename urls as _raw,Time as _time | extract | rename Time.db as db_time, Time.total as total_time, uri.name as uri_name | table _time uri_name, response_time, db_time, total_time
Sample Search:
| makeresults
| eval _raw="{\"version\":\"v0.2\",\"prints\":{\"urls\":[{\"response_time\":256,\"uri\":{\"bool\":false,\"name\":\"abc\"},\"Time\":{\"total\":52,\"db\":11}},{\"response_time\":578,\"uri\":{\"bool\":false,\"name\":\"xyz\"},\"Time\":{\"total\":78,\"db\":13}}]}}"
| spath prints.urls{} output=urls | stats count latest(_time) as Time by urls | rename urls as _raw,Time as _time | extract | rename Time.db as db_time, Time.total as total_time, uri.name as uri_name | table _time uri_name, response_time, db_time, total_time
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
basic search rv=*, av=*, wm=*, an=*, et=*
| spath input=data path=prints.urls{} output=urls
| mvexpand urls
| spath input=urls
| rename uri.name as uri_name, Time.db as time_db, Time.total as total_time
| table _time, rv, av, wm, an, et, uri_name, response_time, db_time, total_time
try rename
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@to4kawa renaming didn't work 😞
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
| makeresults
| eval _raw="{\"version\":\"v0.2\",\"prints\":{\"urls\":[{\"response_time\":256,\"uri\":{\"bool\":false,\"name\":\"abc\"},\"Time\":{\"total\":52,\"db\":11}},{\"response_time\":578,\"uri\":{\"bool\":false,\"name\":\"xyz\"},\"Time\":{\"total\":78,\"db\":13}}]}}"
| spath prints.urls{} output=urls
| mvexpand urls
| spath input=urls
| rename Time.db as db_time, Time.total as total_time, uri.name as uri_name
@monika0511
this query works well.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Beware that mvexpand can really chew through memory on your search head if you have a large amount of events that it's being applied to. This can cause it to terminate early if your admins are enforcing memory limits on searches or can cause the PID to get killed by OOM Killer on a nix box. If you are okay with a multivalued list type view in your table for each event it's unnecessary to mvexpand. If you do have a requirement that each row be one url array, then just be careful on the amount of data it is being applied to.
Also even though the same amount of spaths are in my original solution, the performance is still better than invoking the spath search command the same amount of times because it is on a single pipe eval which means it's processed all at once. Having multiple pipes with spath will cause it to stop and start at each pipe which will add an admittedly negligible performance reduction.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It appears the board has ate my answer I wrote earlier to prove that multiple eval spaths is faster than using even 1 spath command. I'll try again.
Using your original query with your sample here is your search performance on my local machine's splunk install:
| makeresults
| eval data="{
\"version\":\"v0.2\",
\"prints\":{
\"urls\":[
{
\"response_time\":256,
\"uri\":{
\"bool\":false,
\"name\":\"abc\"
},
\"Time\":{
\"total\":52,
\"db\":11
}
},
{
\"response_time\":578,
\"uri\":{
\"bool\":false,
\"name\":\"xyz\"
},
\"Time\":{
\"total\":78,
\"db\":13
}
}
]
}
}"
| spath input=data path=prints.urls{} output=urls
| spath input=urls path=response_time output=response_time
| spath input=urls path=uri.name output=uri_name
| spath input=urls path=Time.db output=db_time
| spath input=urls path=Time.total output=total_time
| table _time, rv, av, wm, an, et, uri_name, response_time, db_time, total_time
This search has completed and has returned 1 results by scanning 0 events in 0.119 seconds
The query I originally proposed:
| makeresults
| eval data="{
\"version\":\"v0.2\",
\"prints\":{
\"urls\":[
{
\"response_time\":256,
\"uri\":{
\"bool\":false,
\"name\":\"abc\"
},
\"Time\":{
\"total\":52,
\"db\":11
}
},
{
\"response_time\":578,
\"uri\":{
\"bool\":false,
\"name\":\"xyz\"
},
\"Time\":{
\"total\":78,
\"db\":13
}
}
]
}
}"
| eval response_time=spath(data, "prints.urls{}.response_time"),
uri_name=spath(data, "prints.urls{}.uri.name"),
db_time=spath(data, "prints.urls{}.Time.db"),
total_time=spath(data, "prints.urls{}.Time.total")
| table _time, rv, av, wm, an, et, uri_name, response_time, db_time, total_time
This search has completed and has returned 1 results by scanning 0 events in 0.099 seconds
The search proposed by @to4kawa
| makeresults
| eval _raw="{\"version\":\"v0.2\",\"prints\":{\"urls\":[{\"response_time\":256,\"uri\":{\"bool\":false,\"name\":\"abc\"},\"Time\":{\"total\":52,\"db\":11}},{\"response_time\":578,\"uri\":{\"bool\":false,\"name\":\"xyz\"},\"Time\":{\"total\":78,\"db\":13}}]}}"
| spath prints.urls{} output=urls
| mvexpand urls
| spath input=urls
| rename Time.db as db_time, Time.total as total_time, uri.name as uri_name
This search has completed and has returned 2 results by scanning 0 events in 0.244 seconds
And if you have a requirement that each url in the url array appears on it's own row in your table here's my modified version with the caveat I posed above about mvexpand being problematic on large data sets:
| makeresults
| eval data="{
\"version\":\"v0.2\",
\"prints\":{
\"urls\":[
{
\"response_time\":256,
\"uri\":{
\"bool\":false,
\"name\":\"abc\"
},
\"Time\":{
\"total\":52,
\"db\":11
}
},
{
\"response_time\":578,
\"uri\":{
\"bool\":false,
\"name\":\"xyz\"
},
\"Time\":{
\"total\":78,
\"db\":13
}
}
]
}
}"
| eval urls=spath(data,"prints.urls{}")
| mvexpand urls
| eval response_time=spath(urls, "response_time"),
uri_name=spath(urls, "uri.name"),
db_time=spath(urls, "Time.db"),
total_time=spath(urls, "Time.total")
| table _time, rv, av, wm, an, et, uri_name, response_time, db_time, total_time
This search has completed and has returned 2 results by scanning 0 events in 0.113 seconds
Ultimately you can see that using a single pipe eval with the spath command on each field you want will produce a more performant query by about 17% to your original query and at 48% improvement compared to the one by @to4kawa.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
refer to below response as this was a duplicate of that.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You can use spath in an eval command and you can chain all of the fields into a single eval with a comma separating each field. This will make it more performant and it removes the need to do multiple spath commands:
basic search rv=*, av=*, wm=*, an=*, et=*
| eval response_time=spath(data, "prints.urls{}.response_time"),
uri_name=spath(data, "prints.urls{}.uri.name"),
db_time=spath(data, "prints.urls{}.Time.db"),
total_time=spath(data, "prints.urls{}.Time.total")
| table _time, rv, av, wm, an, et, uri_name, response_time, db_time, total_time
Here's a run anywhere example using your sample data to demonstrate that it will work. Keep in mind it won' t have the rv, av, wm, an, and et fields due to that not being present in the sample example:
| makeresults
| eval data="{
\"version\":\"v0.2\",
\"prints\":{
\"urls\":[
{
\"response_time\":256,
\"uri\":{
\"bool\":false,
\"name\":\"abc\"
},
\"Time\":{
\"total\":52,
\"db\":11
}
},
{
\"response_time\":578,
\"uri\":{
\"bool\":false,
\"name\":\"xyz\"
},
\"Time\":{
\"total\":78,
\"db\":13
}
}
]
}
}"
| eval response_time=spath(data, "prints.urls{}.response_time"),
uri_name=spath(data, "prints.urls{}.uri.name"),
db_time=spath(data, "prints.urls{}.Time.db"),
total_time=spath(data, "prints.urls{}.Time.total")
| table _time, rv, av, wm, an, et, uri_name, response_time, db_time, total_time
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@dmarling it will perform the same as it's using spath equal number of times as the original query I posted. but I agree the
spath(X,Y)
syntax is better. Thanks!
