Using tstats to extract first element in multivalu...

BarnesLeo · ‎08-16-2018

I have a field that looks something like this in the event viewer:

project_sources: [
    {
        scmEvent: {
            message: message1
            sha: xxxxxx
            ......
    }
    {
        scmEvent: {
            message: message2
            sha: yyyyyyy
            ......
    }
]

My end goal is to extract scmEvent.sha for the first element in the sources array as efficiently as possible. Can I do that somehow with tstats?

These two queries give me the same results:
1. index=myIndex | stats count by project_sources{}.scmEvent.message, sources{}.scmEvent.sha
2. | tstats count where index=myIndex by project_sources{}.scmEvent.message, sources{}.scmEvent.sha

For these cases I will get a table with something like this:
message1 xxxxx 30
message1 yyyyy 30
message2 xxxxx 30
message2 yyyyy 30

So it seems like the stats commands are counting all combinations of fields in element 0 and element 1 of sources.

Can I extract only one of the elements using tstats?
(This may be a trivial thing to do, but I still haven't fully grasped how fields that are dictionaries/arrays work.)

kamlesh_vaghela · ‎08-17-2018

Hi @BarnesLeo,

Can you please try the following search?

index=YOUR_INDEX | rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

My Sample Search:

| makeresults 
| eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message2\",\"sha\": \"yyyyyyy\"} ] }" 
| append 
    [| makeresults 
    | eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message3\",\"sha\": \"zzzzzz\"} ] }"] 
| kv 
| rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

THanks

BarnesLeo · ‎08-17-2018

Thanks for the answer, but this misses the point of the question. I know I can get the query to do what I want with mvzip or spath before using stats. The question is how to do it with tstats since I want maximum performance.

Using tstats to extract first element in multivalue field

Splunk Observability for AI

Splunk Enterprise Security 8.x: The Essential Upgrade for Threat Detection, ...

Splunk Observability as Code: From Zero to Dashboard

Are you a member of the Splunk Community?

Using tstats to extract first element in multivalue field

Splunk Observability for AI

Splunk Enterprise Security 8.x: The Essential Upgrade for Threat Detection, ...

Splunk Observability as Code: From Zero to Dashboard