Splunk Search

Joining data from multiple events with stats

mike_nau
Engager

Hoping someone can help me to join data in the same index across multiple events. Here is the event data

indexevent_typejob_name
item_namequeue_time
jenkins_statisticsqueuenullxxx/job/320
jenkins_statisticsqueuenullxxx/job/330
jenkins_statisticsqueuenullxxx/job0.03
jenkins_statisticsjobxxx/job/3null0.03
jenkins_statisticsqueuenullxxx/job/222
jenkins_statisticsqueuenullxxx/job0.01
jenkins_statisticsjobxxx/job/2null0.01
jenkins_statisticsqueuenullxxx/job/125
jenkins_statisticsqueuenullxxx/job/115
jenkins_statisticsqueuenullxxx/job0.19
jenkins_statisticsjobxxx/job/1null0.19

 

The result I am looking for is

indexjob_name
count(queue_time)avg(queue_time)
jenkins_statisticsxxx/job/3225
jenkins_statisticsxxx/job/2122
jenkins_statisticsxxx/job/1220

 

I want to grab each of the events associated with event_type=job and join the job_name field with the item_name field for the events with event_type=queue and get the associated queue_time values for these event_type=queue events and calculate the average per job_name while dropping all other event_type=queue events.

I have been trying to get stats to work for this, but have not been able to figure it out. Everything I try includes the queue_time values for the event_type=queue events. I have not been able to effectively join the events across the event_type=job and event_type=queue events 

Labels (2)
1 Solution

danspav
SplunkTrust
SplunkTrust

Hi @mike_nau,

Try the following search:

 

|makeresults |eval raw="event_type=queue,item_name=\"xxx/job/3\",queue_time=20 ###
event_type=queue,item_name=\"xxx/job/3\",queue_time=30 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.03 ###
event_type=job,job_name=\"xxx/job/3\",item_name=\"\",,queue_time=0.03 ###
event_type=queue,item_name=\"xxx/job/2\",queue_time=22 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.01 ###
event_type=job,job_name=\"xxx/job/2\",item_name=\"\",queue_time=0.01 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=25 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=15 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.19 ###
event_type=job,job_name=\"xxx/job/1\",item_name=\"\",queue_time=0.19 "
| makemv delim="###" raw
| mvexpand raw 
| rename raw  as  _raw 
| extract
| eval index="jenkins_statistics"


| eventstats values(job_name) as valid_jobs
| eval job_name=if(item_name=valid_jobs,item_name, NULL)
| stats values(index) as index, count(queue_time), avg(queue_time) by job_name
| table index, job_name, "count(queue_time)", "avg(queue_time)"
| sort - "avg(queue_time)"

 

 

The top bit is just generating the data from your table.

The real stuff happens with the eventstats command. This will run a |stats type command over the whole data set and add the results to each row. In our case we get a field called "valid_jobs" that has job1,job2,job3.

Next we set the job_name fields only if the item_name is the same as one of our "valid_jobs".

Then we run stats to get the count and average by job_name

Finally, we table out and sort by the average queue time.

The result is a table similar to yours:

danspav_0-1603437578102.png

Does that do what you were after?

View solution in original post

0 Karma

mike_nau
Engager

Both answers work, crediting the second one for the additional context and explanation which helped me to understand how it actually solved the problem 🙂 

0 Karma

ITWhisperer
Legend
| eval job_name=if(event_type="job",job_name,if(event_type="queue",item_name,null()))
| eventstats values(eval(if(event_type="job",event_type,null()))) as job_id by job_name
| where event_type="queue" AND job_id="job"
| stats count(queue_time) avg(queue_time) by index job_name
0 Karma

danspav
SplunkTrust
SplunkTrust

Hi @mike_nau,

Try the following search:

 

|makeresults |eval raw="event_type=queue,item_name=\"xxx/job/3\",queue_time=20 ###
event_type=queue,item_name=\"xxx/job/3\",queue_time=30 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.03 ###
event_type=job,job_name=\"xxx/job/3\",item_name=\"\",,queue_time=0.03 ###
event_type=queue,item_name=\"xxx/job/2\",queue_time=22 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.01 ###
event_type=job,job_name=\"xxx/job/2\",item_name=\"\",queue_time=0.01 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=25 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=15 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.19 ###
event_type=job,job_name=\"xxx/job/1\",item_name=\"\",queue_time=0.19 "
| makemv delim="###" raw
| mvexpand raw 
| rename raw  as  _raw 
| extract
| eval index="jenkins_statistics"


| eventstats values(job_name) as valid_jobs
| eval job_name=if(item_name=valid_jobs,item_name, NULL)
| stats values(index) as index, count(queue_time), avg(queue_time) by job_name
| table index, job_name, "count(queue_time)", "avg(queue_time)"
| sort - "avg(queue_time)"

 

 

The top bit is just generating the data from your table.

The real stuff happens with the eventstats command. This will run a |stats type command over the whole data set and add the results to each row. In our case we get a field called "valid_jobs" that has job1,job2,job3.

Next we set the job_name fields only if the item_name is the same as one of our "valid_jobs".

Then we run stats to get the count and average by job_name

Finally, we table out and sort by the average queue time.

The result is a table similar to yours:

danspav_0-1603437578102.png

Does that do what you were after?

View solution in original post

0 Karma
Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.