Reporting

Counts and average of events over fixed periods of time using stats

cindygibbs_08
Communicator

Hello my gorgeous people from Splunk I hope everyone is keeping their sanity during this hard times!

I was wondering how to properly code on Splunk how to obtain the distinct count of events per interval of x  units of time passed  and also the average of those intervals during the working week.

Let's start first will the first scenario:
I work for a Hotel chain corp. due to "rona" our prices for rooms has dropped, and we are receiving a lot of calls and if any of these calls are left unattended for more than 2 minutes I have to mark them as "UNHAPPY_CX" I calculate this manually like so:

 

search index="calls" 
| eval arrival_time=time_in_queue_call
| eval picked_up_time=call_taken
| eval call_code=call_id
| eval ID=cx_info
| time_now=now()
| eval time_unattended=if(isnull(call_taken-time_in_queue_call),
       time_now-time_in_queue_call,call_taken-time_in_queue_call)
| eval class=if(time_unattended>120,"UNHAPPY_CX","HAPPY_CX")
| stats values(call_code) as code  values(class) as class by ID  

 

 but I would like to create intervals of x units of time such as intervales of one hour of two hours or perhaps three that will start from lets say 8:00 am EST to 9:00 am EST and distinct count the numbers of class "UNHAPPY_CX" by call_code ("code") and have this on a table -- like this: Let's say that this is the raw data

codeIDclassdate_time
DYUDJ1UNHAPPY_CX8:01
CBY2UNHAPPY_CX8:06
XCGH3UNHAPPY_CX9:01
OJ643UNHAPPY_CX10:41
5677H4UNHAPPY_CX10:45
567F5UNHAPPY_CX11:05

 

I want to be able to have something like this 

dc(code)Interval
28:00-9:00
19:00-10:00
311:00-12:00

 

I was thinking that I was going to account for a field with the time of each call and perhaps arrange them in chronological order and maybe create a new field filtering by day and hour but I dont know if splunk has a faster more efficient way of doing this... The closest answer I came across is a time chart but I want something in the form of a table...  also I want to be able to calculate in the future the mean of dc(code) for the past x days... So I am truly in debt and so thankful to anyone that can shed a bit of light on this problem, thank you so much guys!

Labels (1)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

Hi @cindygibbs_08 ,

Here is a run-anywhere search which I hope produces some events which represent the sort of events you are getting

| gentimes start=-4 increment=1h
| rename starttime as _time 
| fields _time
| streamstats count as row 
| eval cx_info=floor((row-1)/4)
| eval eventtype=(row-1)%4
| eval _time=relative_time(now(),"-3h@h")
| eval _time=_time+(random()%100)+(eventtype*100)+(3600*(cx_info%4))
| eval time_in_queue_call=if(eventtype=0,_time,null)
| eval call_taken=if(eventtype=1 AND random()%3 > 0,_time,null)
| eval call_id=if(eventtype=2,mvindex(split("DYUDJ,CBY,XCGH,OJ64,5677H,567F",","),cx_info%6),null)
| eval call_end=if(eventtype=3,_time,null)
| fields - _time eventtype row
| stats values(*) as * by cx_info
| eval call_id=if(isnull(call_taken),null,call_id)
| eval call_end=if(isnull(call_taken) AND random()%2 > 0,null,call_end)
| fieldformat time_in_queue_call=strftime(time_in_queue_call,"%Y-%m-%d %H:%M:%S")
| fieldformat call_taken=strftime(call_taken,"%Y-%m-%d %H:%M:%S")
| fieldformat call_end=strftime(call_end,"%Y-%m-%d %H:%M:%S")
| stats values(*) as * by cx_info

I have assumed that you get an event when the call ends, and that if the call isn't taken, you don't get the call_id, and you may not have call_end events for some calls. I have also gathered all the events related to call (cx_info) so that you end up with one event per cx_info in the pipeline.

Based on this, I have added to following to get the statistics I think you want

| eval time_now=now()
| eval time_unattended=if(isnull(call_taken), if(isnull(call_end),time_now-time_in_queue_call,call_end-time_in_queue_call),call_taken-time_in_queue_call)
| eval class=if(time_unattended>120,"UNHAPPY_CX","HAPPY_CX")
| eval _time=time_in_queue_call
| bin span=1h _time
| where class="UNHAPPY_CX"
| stats dc(call_id) as unhappy_customers by _time
| eval interval=strftime(_time,"%H:%M")."-".strftime(_time+3600,"%H:%M")
| table unhappy_customers interval

I have changed time_unattended to use call_end time (assuming it is available)

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust

Hi Cindy,

I noticed that you want to group the results by date_time interval.

Is date_time a field already in the events of the "calls" index? Is it the same as _time (the timestamp assigned when the event was indexed)?

Does the event always contain the call_taken field? If not, can you tell which customer (call_id) made the call?

Is cx_info unique for each call? If so, can there be more than one event per cx_info giving additional information e.g. an event when the call started, another when the call was picked up, another when it ended (whether it was picked up or not, another when the customer (call_id) was added, etc.?

Do you want the grouping based on the time_in_queue_call time, or the call_taken time (if there is one, or call_ended time if known)?

cindygibbs_08
Communicator

Omg @ITWhisperer  Thank you so much your undertanding my issue and asking me this probing questions... I can extract a date_time field from the event in which the calls enter the queue, once the call gets in queue  the event records the ID of the cx and if the cx hangs up and calls again a new call_code is created so no problems there... if the call is picked up I can extract a date_time field from that event, if the call does not get picked up then the event call_taken is not recorded (so there will be no info on that event) and I will rather use time_now to calculate how much time has passed and if that time was greater than 2 mins

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Hi @cindygibbs_08 ,

Here is a run-anywhere search which I hope produces some events which represent the sort of events you are getting

| gentimes start=-4 increment=1h
| rename starttime as _time 
| fields _time
| streamstats count as row 
| eval cx_info=floor((row-1)/4)
| eval eventtype=(row-1)%4
| eval _time=relative_time(now(),"-3h@h")
| eval _time=_time+(random()%100)+(eventtype*100)+(3600*(cx_info%4))
| eval time_in_queue_call=if(eventtype=0,_time,null)
| eval call_taken=if(eventtype=1 AND random()%3 > 0,_time,null)
| eval call_id=if(eventtype=2,mvindex(split("DYUDJ,CBY,XCGH,OJ64,5677H,567F",","),cx_info%6),null)
| eval call_end=if(eventtype=3,_time,null)
| fields - _time eventtype row
| stats values(*) as * by cx_info
| eval call_id=if(isnull(call_taken),null,call_id)
| eval call_end=if(isnull(call_taken) AND random()%2 > 0,null,call_end)
| fieldformat time_in_queue_call=strftime(time_in_queue_call,"%Y-%m-%d %H:%M:%S")
| fieldformat call_taken=strftime(call_taken,"%Y-%m-%d %H:%M:%S")
| fieldformat call_end=strftime(call_end,"%Y-%m-%d %H:%M:%S")
| stats values(*) as * by cx_info

I have assumed that you get an event when the call ends, and that if the call isn't taken, you don't get the call_id, and you may not have call_end events for some calls. I have also gathered all the events related to call (cx_info) so that you end up with one event per cx_info in the pipeline.

Based on this, I have added to following to get the statistics I think you want

| eval time_now=now()
| eval time_unattended=if(isnull(call_taken), if(isnull(call_end),time_now-time_in_queue_call,call_end-time_in_queue_call),call_taken-time_in_queue_call)
| eval class=if(time_unattended>120,"UNHAPPY_CX","HAPPY_CX")
| eval _time=time_in_queue_call
| bin span=1h _time
| where class="UNHAPPY_CX"
| stats dc(call_id) as unhappy_customers by _time
| eval interval=strftime(_time,"%H:%M")."-".strftime(_time+3600,"%H:%M")
| table unhappy_customers interval

I have changed time_unattended to use call_end time (assuming it is available)

cindygibbs_08
Communicator

Thank you so much for your thougthful insight and explaining the work process of your code this is beyond what I wanted thank you so much my dear @ITWhisperer 

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@cindygibbs_08 

Try this.

YOUR_SEARCH
| eval s=mvindex(split(date_time,":"),0),Interval=s.":00-".(tonumber(s)+1).":00"
|stats dc(Code) by Interval

 

Sample:

| makeresults 
| eval _raw="Code	ID	class	date_time
DYUDJ	1	UNHAPPY_CX	8:01
CBY	2	UNHAPPY_CX	8:06
XCGH	3	UNHAPPY_CX	9:01
OJ64	3	UNHAPPY_CX	10:41
5677H	4	UNHAPPY_CX	10:45
567F	5	UNHAPPY_CX	11:05
" 
| multikv forceheader=1 
| eval s=mvindex(split(date_time,":"),0),Interval=s.":00-".(tonumber(s)+1).":00"
|stats dc(Code) by Interval

 

Thanks
Kamlesh Vaghela

If this reply helps you, an upvote would be appreciated. 

cindygibbs_08
Communicator

@kamlesh_vaghela  thank you so much sir this helped me getting to know an aproach to do this under a certain circunstance

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...