Re: Using tstats to extract first element in multi...

BarnesLeo · ‎08-16-2018

I have a field that looks something like this in the event viewer:

project_sources: [
    {
        scmEvent: {
            message: message1
            sha: xxxxxx
            ......
    }
    {
        scmEvent: {
            message: message2
            sha: yyyyyyy
            ......
    }
]

My end goal is to extract scmEvent.sha for the first element in the sources array as efficiently as possible. Can I do that somehow with tstats?

These two queries give me the same results:
1. index=myIndex | stats count by project_sources{}.scmEvent.message, sources{}.scmEvent.sha
2. | tstats count where index=myIndex by project_sources{}.scmEvent.message, sources{}.scmEvent.sha

For these cases I will get a table with something like this:
message1 xxxxx 30
message1 yyyyy 30
message2 xxxxx 30
message2 yyyyy 30

So it seems like the stats commands are counting all combinations of fields in element 0 and element 1 of sources.

Can I extract only one of the elements using tstats?
(This may be a trivial thing to do, but I still haven't fully grasped how fields that are dictionaries/arrays work.)

kamlesh_vaghela · ‎08-17-2018

Hi @BarnesLeo,

Can you please try the following search?

index=YOUR_INDEX | rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

My Sample Search:

| makeresults 
| eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message2\",\"sha\": \"yyyyyyy\"} ] }" 
| append 
    [| makeresults 
    | eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message3\",\"sha\": \"zzzzzz\"} ] }"] 
| kv 
| rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

THanks

BarnesLeo · ‎08-17-2018

Thanks for the answer, but this misses the point of the question. I know I can get the query to do what I want with mvzip or spath before using stats. The question is how to do it with tstats since I want maximum performance.

Using tstats to extract first element in multivalue field

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Announcing Modern Navigation: A New Era of Splunk User Experience

Modernize your Splunk Apps – Introducing Python 3.13 in Splunk

Step into “Hunt the Insider: An Splunk ES Premier Mystery” to catch a cybercriminal ...

Join the Conversation