Splunk Search

## How to calculate the average and standard deviation of a field that has two values?

Builder

I have the Splunk searches as below:

search:

`````` My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user|eval accessTimerange=FirstHour+"--"+LastHour|fields - FirstHour - LastHour
``````

Result:

``````user     accessTimerange
A123     22--24
B456     02--08
``````

Now how can we calculate the average for the access time for past 7 days of each user? i.e something like below

`````` user     accessTimerange    Average_accessTimerange
A123     22--24             0--23
B456     02--08             03--07
``````

I know we can calculate the average for the single field value by using the avg command. But this scenario is different where the field values of accessTimerange are 2 hourly fields. Is there any possibility of calculating the averages for them in Splunk?

Also if anyone know how to calculate the stdev for this scenario. Please suggest?

Where the condition for standard_deviation is if user accessTimerange is 3 times standard deviation of average.

Tags (5)
1 Solution
Revered Legend

Give this a try

To get average in-out time

``````My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user | eventstats avg(FirstHour) as FirstHourAvg avg(LastHour) as LastHourAvg |eval accessTimerange=FirstHour+"--"+LastHour | eval Average_accessTimerange=tostring(round(FirstHourAvg))+"--"+tostring(round(LastHourAvg)) |fields - FirstHour* - LastHour*
``````

Including Standard deviations logic

``````My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user | eventstats avg(*) as *Avg stdev(*) as *Stdev |eval accessTimerange=FirstHour+"--"+LastHour | eval Average_accessTimerange=tostring(round(FirstHourAvg))+"--"+tostring(round(LastHourAvg)) | eval Is_Anomalous=if(abs(FirstHourAvg-FirstHour)>FirstHourStdev OR abs(LastHourAvg-LastHour)>LastHourStdev, "Yes","No") |fields - FirstHour* - LastHour*
``````
Revered Legend

Give this a try

To get average in-out time

``````My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user | eventstats avg(FirstHour) as FirstHourAvg avg(LastHour) as LastHourAvg |eval accessTimerange=FirstHour+"--"+LastHour | eval Average_accessTimerange=tostring(round(FirstHourAvg))+"--"+tostring(round(LastHourAvg)) |fields - FirstHour* - LastHour*
``````

Including Standard deviations logic

``````My Search | stats earliest(date_hour) as FirstHour latest(date_hour) as LastHour by user | eventstats avg(*) as *Avg stdev(*) as *Stdev |eval accessTimerange=FirstHour+"--"+LastHour | eval Average_accessTimerange=tostring(round(FirstHourAvg))+"--"+tostring(round(LastHourAvg)) | eval Is_Anomalous=if(abs(FirstHourAvg-FirstHour)>FirstHourStdev OR abs(LastHourAvg-LastHour)>LastHourStdev, "Yes","No") |fields - FirstHour* - LastHour*
``````
Motivator

what do you consider the "average" of a user who logged in as follows:

``````user     accessTimerange
A123     18--24
A123     19--23
A123     20--22
A123     21--21
``````
Builder

I am trying to calculate the average for the past 7 days for each user. hmm i haven't thought about the above scenario..strange. Thank you.

Legend

Stats like average and Standard Deviation can only be applied to numeric fields. So following are your two options:

1) Use duration for stats like avg and stdev i.e 22--24 has 2 hours duration and 02--08 has 6 hours duration.
2) Use stats for start time separately and end time separately, however statistically this would make no sense.

Please confirm what kind of stats you are looking for. Also, share mathematical details for your problem, we can assist with corresponding Splunk query.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
Builder

Thanks for the responce. option 1 dont work for me because i was trying to find the avarege timings of the user from the past 7 days.

Legend

@pavane by Option 1 I meant the duration will be for each day and then there can be average calculated over 7 days.

Month_Day User Start_End_Time Duration
11/05 A123 20-22 2
11/06 A123 01-08 7
11/07 A123 21-24 3
11/08 A123 18-22 4
11/09 A123 18-24 6
11/10 A123 02-08 6
11/11 A123 22-24 2

Then the avg will be 30/7=4.28 hrs

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
Builder

@niketnilay . I understood. But I'm looking to find only the timings an user is working for past 7 days. For example if an user is working from 8 pm to 4 am daily and yesterday he worked in a different timings like 2 am to 7 am. I should notice that deviation in a line graph or something. Similarly. I want to monitor for all the user's deviation on yesterday's report comparing with the last 7 days working time's avg.

Register for .conf21 Now! Go Vegas or Go Virtual!

### How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20. Learn More or Register Now >

Get Updates on the Splunk Community!