I have an application to analyse phone call data from multiple locations.
I want to generate a report that provides a time history of concurrent calls at each location.
Using this query I can get exactly what I want for a single location:
index=calls locat=8374 | eval start_epoch=strptime(start_time,"%Y-%m-%d %H:%M:%S") | concurrency duration=call_duration start=start_epoch | timechart max(concurrency)
However, when I use all locations (as below) I get incorrect data:
index=calls locat=* | eval start_epoch=strptime(start_time,"%Y-%m-%d %H:%M:%S") | concurrency duration=call_duration start=start_epoch | timechart max(concurrency)
I've established that the concurrency command determines concurrency across all data passed into it; even when the output is split by 'locat', the concurrency for each location is calculated from the data for all locations rather than from the data for each location separately.
My suggestion would be to add a 'group by xxx' clause to the concurrency command that calculates the concurrency separately for the data associated with each value of field xxx.
Alternatively, does anyone know of a way to achieve the result I am looking for within the current functionality of Splunk?
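One possible workaround within current functionality is to avoid the concurrency command entirely: expand each call into a +1 marker at its start and a -1 marker at its end, then keep a running sum per location with streamstats. This is only a sketch against your field names (locat, start_time, call_duration, with call_duration assumed to be in seconds) and is untested; the span on the final timechart would need tuning for real data:

```
index=calls locat=*
| eval start_epoch=strptime(start_time,"%Y-%m-%d %H:%M:%S")
| eval end_epoch=start_epoch+call_duration
| eval marker=mvappend(start_epoch.",1", end_epoch.",-1")
| mvexpand marker
| eval _time=tonumber(mvindex(split(marker,","),0)), delta=tonumber(mvindex(split(marker,","),1))
| sort 0 _time
| streamstats sum(delta) AS concurrency by locat
| timechart span=1m max(concurrency) by locat
```

Because the running sum is computed with "by locat", each location's concurrency is derived only from that location's own start/end markers, which is the per-group behaviour the question asks for.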
I also have a similar situation, and created the following search to accomplish what I need:
| stats values(job) AS job, max(_time) AS maxTime, min(_time) AS minTime by jobid | streamstats current=f window=1 global=f last(maxTime) as last_maxTime by job | eval _time=minTime | table _time, job, jobid, minTime, maxTime, last_maxTime | eval overlapped=if(last_maxTime>=minTime,1,0) | where overlapped=1
In this case I use stats, instead of transaction, to get the start and end time of each event (based on jobid, which is a unique id for each event). Then I use streamstats to get the previous event's end time and compare it to the current start time: if the end time of the previous event is >= the current start time, I set a flag called overlapped and filter to those events.
This does not get a count of the number of events in that event time range, but works for my purposes.
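For anyone who also needs the count, one possible extension of this search (a sketch reusing the job, jobid, minTime and maxTime fields from the search above; untested) is to expand each job into start/end markers and keep a running sum per job:

```
| stats max(_time) AS maxTime, min(_time) AS minTime by job, jobid
| eval marker=mvappend(minTime.",1", maxTime.",-1")
| mvexpand marker
| eval _time=tonumber(mvindex(split(marker,","),0)), delta=tonumber(mvindex(split(marker,","),1))
| sort 0 _time
| streamstats sum(delta) AS concurrent_jobs by job
```

Here concurrent_jobs at any point in time is the number of jobs of that job type whose [minTime, maxTime] interval covers that moment, rather than just a yes/no overlap flag.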
To be honest, I don't fully understand your request... but let me show you a run-everywhere example in which I 'group by xxx' and use concurrency after that. First I run this:
index=_internal series=* | eventstats count by series | delta _time AS timeDelta p=1 | eval timeDelta=abs(timeDelta) | concurrency duration=timeDelta | timechart span=1h max(concurrency) by series
This searches index=_internal for all series; the eventstats groups the data for you and keeps all other fields as well. What happens next is some SplunkFu magic to set up a dummy duration field for concurrency, and in the last step I use timechart by series to display the max values of concurrency for each series over a _time span of one hour.
Hope this helps ...
This is how the problem has been expressed previously if this helps clarify.
'We have a bunch of applications whose execution we need to monitor for licensing reasons. I'm taking the Windows Event Log output, which captures process startup and termination events, and combining these into transactions for these applications.
However, I need to be able to work out the maximum concurrent usage of each application. I could do this by running a new query for each application, but it would be far more efficient to run one query that can deduce the concurrency split by process executable.'
Thanks for your answer, but it does not address the fundamental problem with the concurrency command: the concurrency is calculated over all data passed into it rather than being calculated separately for sub-sets of the data based on a grouping field (in my example, 'locat').
Also, I already have a duration field, 'call_duration', so I do not need to calculate one, and the timeDelta calculation does not match how phone calls work. The duration of a call is not related to the event time of the previous one, so a time delta does not represent the duration of a call.