Splunk Search

Using tstats to extract first element in multivalue field

BarnesLeo
Engager

I have a field that looks something like this in the event viewer:

project_sources: [
    {
        scmEvent: {
            message: message1
            sha: xxxxxx
            ......
    }
    {
        scmEvent: {
            message: message2
            sha: yyyyyyy
            ......
    }
]

My end goal is to extract scmEvent.sha for the first element in the sources array as efficiently as possible. Can I do that somehow with tstats?

These two queries give me the same results:
1. index=myIndex | stats count by project_sources{}.scmEvent.message, sources{}.scmEvent.sha
2. | tstats count where index=myIndex by project_sources{}.scmEvent.message, sources{}.scmEvent.sha

For these cases I will get a table with something like this:
message1 xxxxx 30
message1 yyyyy 30
message2 xxxxx 30
message2 yyyyy 30

So it seems like the stats commands are counting all combinations of fields in element 0 and element 1 of sources.

Can I extract only one of the elements using tstats?
(This may be a trivial thing to do, but I still haven't fully grasped how fields that are dictionaries/arrays work.)

Tags (2)
0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

Hi @BarnesLeo,

Can you please try the following search?

index=YOUR_INDEX | rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

My Sample Search:

| makeresults 
| eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message2\",\"sha\": \"yyyyyyy\"} ] }" 
| append 
    [| makeresults 
    | eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message3\",\"sha\": \"zzzzzz\"} ] }"] 
| kv 
| rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

THanks

0 Karma

BarnesLeo
Engager

Thanks for the answer, but this misses the point of the question. I know I can get the query to do what I want with mvzip or spath before using stats. The question is how to do it with tstats since I want maximum performance.

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas     Cisco Live 2026 is almost here, and this ...

What Is the Name of the USB Key Inserted by Bob Smith? (BOTS Hint, Not the Answer)

Hello Splunkers,   So you searched, “what is the name of the usb key inserted by bob smith?”  Not gonna lie… ...

Automating Threat Operations and Threat Hunting with Recorded Future

    Automating Threat Operations and Threat Hunting with Recorded Future June 29, 2026 | Register   Is your ...