Solved: Joining data from multiple events with stats

mike_nau · ‎10-22-2020

Hoping someone can help me to join data in the same index across multiple events. Here is the event data

index	event_type	job_name	item_name	queue_time
jenkins_statistics	queue	null	xxx/job/3	20
jenkins_statistics	queue	null	xxx/job/3	30
jenkins_statistics	queue	null	xxx/job	0.03
jenkins_statistics	job	xxx/job/3	null	0.03
jenkins_statistics	queue	null	xxx/job/2	22
jenkins_statistics	queue	null	xxx/job	0.01
jenkins_statistics	job	xxx/job/2	null	0.01
jenkins_statistics	queue	null	xxx/job/1	25
jenkins_statistics	queue	null	xxx/job/1	15
jenkins_statistics	queue	null	xxx/job	0.19
jenkins_statistics	job	xxx/job/1	null	0.19

The result I am looking for is

index	job_name	count(queue_time)	avg(queue_time)
jenkins_statistics	xxx/job/3	2	25
jenkins_statistics	xxx/job/2	1	22
jenkins_statistics	xxx/job/1	2	20

I want to grab each of the events associated with event_type=job and join the job_name field with the item_name field for the events with event_type=queue and get the associated queue_time values for these event_type=queue events and calculate the average per job_name while dropping all other event_type=queue events.

I have been trying to get stats to work for this, but have not been able to figure it out. Everything I try includes the queue_time values for the event_type=queue events. I have not been able to effectively join the events across the event_type=job and event_type=queue events

danspav · ‎10-23-2020

Hi @mike_nau,

Try the following search:

|makeresults |eval raw="event_type=queue,item_name=\"xxx/job/3\",queue_time=20 ###
event_type=queue,item_name=\"xxx/job/3\",queue_time=30 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.03 ###
event_type=job,job_name=\"xxx/job/3\",item_name=\"\",,queue_time=0.03 ###
event_type=queue,item_name=\"xxx/job/2\",queue_time=22 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.01 ###
event_type=job,job_name=\"xxx/job/2\",item_name=\"\",queue_time=0.01 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=25 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=15 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.19 ###
event_type=job,job_name=\"xxx/job/1\",item_name=\"\",queue_time=0.19 "
| makemv delim="###" raw
| mvexpand raw 
| rename raw  as  _raw 
| extract
| eval index="jenkins_statistics"


| eventstats values(job_name) as valid_jobs
| eval job_name=if(item_name=valid_jobs,item_name, NULL)
| stats values(index) as index, count(queue_time), avg(queue_time) by job_name
| table index, job_name, "count(queue_time)", "avg(queue_time)"
| sort - "avg(queue_time)"

The top bit is just generating the data from your table.

The real stuff happens with the eventstats command. This will run a |stats type command over the whole data set and add the results to each row. In our case we get a field called "valid_jobs" that has job1,job2,job3.

Next we set the job_name fields only if the item_name is the same as one of our "valid_jobs".

Then we run stats to get the count and average by job_name

Finally, we table out and sort by the average queue time.

The result is a table similar to yours:

Does that do what you were after?

View solution in original post

mike_nau · ‎10-26-2020

Both answers work, crediting the second one for the additional context and explanation which helped me to understand how it actually solved the problem 🙂

ITWhisperer · ‎10-23-2020

| eval job_name=if(event_type="job",job_name,if(event_type="queue",item_name,null()))
| eventstats values(eval(if(event_type="job",event_type,null()))) as job_id by job_name
| where event_type="queue" AND job_id="job"
| stats count(queue_time) avg(queue_time) by index job_name

danspav · ‎10-23-2020

Hi @mike_nau,

Try the following search:

|makeresults |eval raw="event_type=queue,item_name=\"xxx/job/3\",queue_time=20 ###
event_type=queue,item_name=\"xxx/job/3\",queue_time=30 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.03 ###
event_type=job,job_name=\"xxx/job/3\",item_name=\"\",,queue_time=0.03 ###
event_type=queue,item_name=\"xxx/job/2\",queue_time=22 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.01 ###
event_type=job,job_name=\"xxx/job/2\",item_name=\"\",queue_time=0.01 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=25 ###
event_type=queue,item_name=\"xxx/job/1\",queue_time=15 ###
event_type=queue,item_name=\"xxx/job\",queue_time=0.19 ###
event_type=job,job_name=\"xxx/job/1\",item_name=\"\",queue_time=0.19 "
| makemv delim="###" raw
| mvexpand raw 
| rename raw  as  _raw 
| extract
| eval index="jenkins_statistics"


| eventstats values(job_name) as valid_jobs
| eval job_name=if(item_name=valid_jobs,item_name, NULL)
| stats values(index) as index, count(queue_time), avg(queue_time) by job_name
| table index, job_name, "count(queue_time)", "avg(queue_time)"
| sort - "avg(queue_time)"

The top bit is just generating the data from your table.

The real stuff happens with the eventstats command. This will run a |stats type command over the whole data set and add the results to each row. In our case we get a field called "valid_jobs" that has job1,job2,job3.

Next we set the job_name fields only if the item_name is the same as one of our "valid_jobs".

Then we run stats to get the count and average by job_name

Finally, we table out and sort by the average queue time.

The result is a table similar to yours:

Does that do what you were after?

Joining data from multiple events with stats

join

stats

How to send events & findings from AWS to Splunk using Amazon EventBridge

Exciting News: The AppDynamics Community Joins Splunk!

The All New Performance Insights for Splunk

Are you a member of the Splunk Community?