Splunk Search

How to calculate the average and standard deviation of a field that has two values?

pavanae
Builder

I have the Splunk searches as below:

search:

 My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user|eval accessTimerange=FirstHour+"--"+LastHour|fields - FirstHour - LastHour

Result:

user     accessTimerange
A123     22--24   
B456     02--08

Now how can we calculate the average for the access time for past 7 days of each user? i.e something like below

 user     accessTimerange    Average_accessTimerange
 A123     22--24             0--23
 B456     02--08             03--07

I know we can calculate the average for the single field value by using the avg command. But this scenario is different where the field values of accessTimerange are 2 hourly fields. Is there any possibility of calculating the averages for them in Splunk?

Also if anyone know how to calculate the stdev for this scenario. Please suggest?

Where the condition for standard_deviation is if user accessTimerange is 3 times standard deviation of average.

1 Solution

somesoni2
Revered Legend

Give this a try

To get average in-out time

My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user | eventstats avg(FirstHour) as FirstHourAvg avg(LastHour) as LastHourAvg |eval accessTimerange=FirstHour+"--"+LastHour | eval Average_accessTimerange=tostring(round(FirstHourAvg))+"--"+tostring(round(LastHourAvg)) |fields - FirstHour* - LastHour*

Including Standard deviations logic

My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user | eventstats avg(*) as *Avg stdev(*) as *Stdev |eval accessTimerange=FirstHour+"--"+LastHour | eval Average_accessTimerange=tostring(round(FirstHourAvg))+"--"+tostring(round(LastHourAvg)) | eval Is_Anomalous=if(abs(FirstHourAvg-FirstHour)>FirstHourStdev OR abs(LastHourAvg-LastHour)>LastHourStdev, "Yes","No") |fields - FirstHour* - LastHour*

View solution in original post

somesoni2
Revered Legend

Give this a try

To get average in-out time

My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user | eventstats avg(FirstHour) as FirstHourAvg avg(LastHour) as LastHourAvg |eval accessTimerange=FirstHour+"--"+LastHour | eval Average_accessTimerange=tostring(round(FirstHourAvg))+"--"+tostring(round(LastHourAvg)) |fields - FirstHour* - LastHour*

Including Standard deviations logic

My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user | eventstats avg(*) as *Avg stdev(*) as *Stdev |eval accessTimerange=FirstHour+"--"+LastHour | eval Average_accessTimerange=tostring(round(FirstHourAvg))+"--"+tostring(round(LastHourAvg)) | eval Is_Anomalous=if(abs(FirstHourAvg-FirstHour)>FirstHourStdev OR abs(LastHourAvg-LastHour)>LastHourStdev, "Yes","No") |fields - FirstHour* - LastHour*

View solution in original post

gokadroid
Motivator

what do you consider the "average" of a user who logged in as follows:

user     accessTimerange
 A123     18--24   
 A123     19--23
 A123     20--22
 A123     21--21  
0 Karma

pavanae
Builder

I am trying to calculate the average for the past 7 days for each user. hmm i haven't thought about the above scenario..strange. Thank you.

0 Karma

niketnilay
Legend

Stats like average and Standard Deviation can only be applied to numeric fields. So following are your two options:

1) Use duration for stats like avg and stdev i.e 22--24 has 2 hours duration and 02--08 has 6 hours duration.
2) Use stats for start time separately and end time separately, however statistically this would make no sense.

Please confirm what kind of stats you are looking for. Also, share mathematical details for your problem, we can assist with corresponding Splunk query.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

pavanae
Builder

Thanks for the responce. option 1 dont work for me because i was trying to find the avarege timings of the user from the past 7 days.

0 Karma

niketnilay
Legend

@pavane by Option 1 I meant the duration will be for each day and then there can be average calculated over 7 days.

Month_Day User Start_End_Time Duration
11/05 A123 20-22 2
11/06 A123 01-08 7
11/07 A123 21-24 3
11/08 A123 18-22 4
11/09 A123 18-24 6
11/10 A123 02-08 6
11/11 A123 22-24 2

Then the avg will be 30/7=4.28 hrs

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

pavanae
Builder

@niketnilay . I understood. But I'm looking to find only the timings an user is working for past 7 days. For example if an user is working from 8 pm to 4 am daily and yesterday he worked in a different timings like 2 am to 7 am. I should notice that deviation in a line graph or something. Similarly. I want to monitor for all the user's deviation on yesterday's report comparing with the last 7 days working time's avg.

0 Karma
Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.