Splunk Search

Joining data from multiple events with stats

mike_nau
Engager

Hoping someone can help me to join data in the same index across multiple events. Here is the event data

indexevent_typejob_name
item_namequeue_time
jenkins_statisticsqueuenullxxx/job/320
jenkins_statisticsqueuenullxxx/job/330
jenkins_statisticsqueuenullxxx/job0.03
jenkins_statisticsjobxxx/job/3null0.03
jenkins_statisticsqueuenullxxx/job/222
jenkins_statisticsqueuenullxxx/job0.01
jenkins_statisticsjobxxx/job/2null0.01
jenkins_statisticsqueuenullxxx/job/125
jenkins_statisticsqueuenullxxx/job/115
jenkins_statisticsqueuenullxxx/job0.19
jenkins_statisticsjobxxx/job/1null0.19

 

The result I am looking for is

indexjob_name
count(queue_time)avg(queue_time)
jenkins_statisticsxxx/job/3225
jenkins_statisticsxxx/job/2122
jenkins_statisticsxxx/job/1220

 

I want to grab each of the events associated with event_type=job and join the job_name field with the item_name field for the events with event_type=queue and get the associated queue_time values for these event_type=queue events and calculate the average per job_name while dropping all other event_type=queue events.

I have been trying to get stats to work for this, but have not been able to figure it out. Everything I try includes the queue_time values for the event_type=queue events. I have not been able to effectively join the events across the event_type=job and event_type=queue events 

Labels (2)
1 Solution

danspav
SplunkTrust
SplunkTrust

Hi @mike_nau,

Try the following search:

 

|makeresults |eval raw="event_type=queue,item_name=\"xxx/job/3\",queue_time=20 ###
event_type=queue,item_name=\"xxx/job/3\",queue_time=30 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.03 ###
event_type=job,job_name=\"xxx/job/3\",item_name=\"\",,queue_time=0.03 ###
event_type=queue,item_name=\"xxx/job/2\",queue_time=22 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.01 ###
event_type=job,job_name=\"xxx/job/2\",item_name=\"\",queue_time=0.01 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=25 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=15 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.19 ###
event_type=job,job_name=\"xxx/job/1\",item_name=\"\",queue_time=0.19 "
| makemv delim="###" raw
| mvexpand raw 
| rename raw  as  _raw 
| extract
| eval index="jenkins_statistics"


| eventstats values(job_name) as valid_jobs
| eval job_name=if(item_name=valid_jobs,item_name, NULL)
| stats values(index) as index, count(queue_time), avg(queue_time) by job_name
| table index, job_name, "count(queue_time)", "avg(queue_time)"
| sort - "avg(queue_time)"

 

 

The top bit is just generating the data from your table.

The real stuff happens with the eventstats command. This will run a |stats type command over the whole data set and add the results to each row. In our case we get a field called "valid_jobs" that has job1,job2,job3.

Next we set the job_name fields only if the item_name is the same as one of our "valid_jobs".

Then we run stats to get the count and average by job_name

Finally, we table out and sort by the average queue time.

The result is a table similar to yours:

danspav_0-1603437578102.png

Does that do what you were after?

View solution in original post

0 Karma

mike_nau
Engager

Both answers work, crediting the second one for the additional context and explanation which helped me to understand how it actually solved the problem 🙂 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust
| eval job_name=if(event_type="job",job_name,if(event_type="queue",item_name,null()))
| eventstats values(eval(if(event_type="job",event_type,null()))) as job_id by job_name
| where event_type="queue" AND job_id="job"
| stats count(queue_time) avg(queue_time) by index job_name
0 Karma

danspav
SplunkTrust
SplunkTrust

Hi @mike_nau,

Try the following search:

 

|makeresults |eval raw="event_type=queue,item_name=\"xxx/job/3\",queue_time=20 ###
event_type=queue,item_name=\"xxx/job/3\",queue_time=30 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.03 ###
event_type=job,job_name=\"xxx/job/3\",item_name=\"\",,queue_time=0.03 ###
event_type=queue,item_name=\"xxx/job/2\",queue_time=22 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.01 ###
event_type=job,job_name=\"xxx/job/2\",item_name=\"\",queue_time=0.01 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=25 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=15 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.19 ###
event_type=job,job_name=\"xxx/job/1\",item_name=\"\",queue_time=0.19 "
| makemv delim="###" raw
| mvexpand raw 
| rename raw  as  _raw 
| extract
| eval index="jenkins_statistics"


| eventstats values(job_name) as valid_jobs
| eval job_name=if(item_name=valid_jobs,item_name, NULL)
| stats values(index) as index, count(queue_time), avg(queue_time) by job_name
| table index, job_name, "count(queue_time)", "avg(queue_time)"
| sort - "avg(queue_time)"

 

 

The top bit is just generating the data from your table.

The real stuff happens with the eventstats command. This will run a |stats type command over the whole data set and add the results to each row. In our case we get a field called "valid_jobs" that has job1,job2,job3.

Next we set the job_name fields only if the item_name is the same as one of our "valid_jobs".

Then we run stats to get the count and average by job_name

Finally, we table out and sort by the average queue time.

The result is a table similar to yours:

danspav_0-1603437578102.png

Does that do what you were after?

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...