Splunk Search

Using tstats to extract first element in multivalue field

BarnesLeo
Engager

I have a field that looks something like this in the event viewer:

project_sources: [
    {
        scmEvent: {
            message: message1
            sha: xxxxxx
            ......
    }
    {
        scmEvent: {
            message: message2
            sha: yyyyyyy
            ......
    }
]

My end goal is to extract scmEvent.sha for the first element in the sources array as efficiently as possible. Can I do that somehow with tstats?

These two queries give me the same results:
1. index=myIndex | stats count by project_sources{}.scmEvent.message, sources{}.scmEvent.sha
2. | tstats count where index=myIndex by project_sources{}.scmEvent.message, sources{}.scmEvent.sha

For these cases I will get a table with something like this:
message1 xxxxx 30
message1 yyyyy 30
message2 xxxxx 30
message2 yyyyy 30

So it seems like the stats commands are counting all combinations of fields in element 0 and element 1 of sources.

Can I extract only one of the elements using tstats?
(This may be a trivial thing to do, but I still haven't fully grasped how fields that are dictionaries/arrays work.)

Tags (2)
0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

Hi @BarnesLeo,

Can you please try the following search?

index=YOUR_INDEX | rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

My Sample Search:

| makeresults 
| eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message2\",\"sha\": \"yyyyyyy\"} ] }" 
| append 
    [| makeresults 
    | eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message3\",\"sha\": \"zzzzzz\"} ] }"] 
| kv 
| rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

THanks

0 Karma

BarnesLeo
Engager

Thanks for the answer, but this misses the point of the question. I know I can get the query to do what I want with mvzip or spath before using stats. The question is how to do it with tstats since I want maximum performance.

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...