Standard Deviation Total Requests Per Day

dickersons
Explorer

I am attempting to calculate the following:

-  Total Number "Requests Per Day"

-  Average/Mean "Requests Per Day"

-  Standard Deviation "Requests Per Day"

I am using the following search:

index=myCoolIndex cluster_name="myCoolCluster" sourcetype=myCoolSourceType label_app=myCoolApp ("\"statusCode\"")
| rex ".*\"traceId\"\s:\s\"?(?<traceId>.*?)\".*"
| dedup traceId
| rex "(?s)\"statusCode\"\s:\s\"?(?<statusCode>[245]\d{2})\"?"
| timechart span=1d count(statusCode) as "Number_Of_Requests"
| where Number_Of_Requests > 0
| eventstats mean(Number_Of_Requests) as "Average Requests Per Day" stdev(Number_Of_Requests) as "Standard Deviation"

I am getting results back, but I am unsure whether they are correct for what I am trying to calculate. For instance, I would have thought "stdev()" would need some eval statement to know what the "Total Requests Per Day" and "Average/Mean Requests Per Day" are. Does the "where Number_Of_Requests > 0" skew the results, since days with zero requests are dropped from the result set? I was hoping someone could take a look at my query and provide a little insight into what I may still need to do to get an accurate standard deviation. Below is the output I am getting from the current query:

Number_Of_Requests   Average Requests Per Day   Standard Deviation
25687                64395                      54741.378572337766
103103               64395                      54741.378572337766

 

Any help is appreciated!

 

ITWhisperer
SplunkTrust

By using "| where Number_Of_Requests > 0", you are potentially skewing the results, although that depends on what you are trying to show. For example, if you had 5 days with counts of 2, 0, 0, 0, 3, your mean would be 1 with the zeroes included and 2.5 without them. The standard deviation is affected by the removal or inclusion of the zeroes in the same way.
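
To make the effect concrete, here is a minimal sketch in Python (not SPL) that works through the five-day example above. It assumes Splunk's stdev() is the sample standard deviation, which is what Python's statistics.stdev computes; the variable names are only for illustration.

```python
from statistics import mean, stdev

# Daily request counts from the example above: five days, three with no requests.
daily_counts = [2, 0, 0, 0, 3]

# Equivalent of "| where Number_Of_Requests > 0": zero-count days are dropped.
non_zero_counts = [c for c in daily_counts if c > 0]

mean_with_zeroes = mean(daily_counts)         # 1
mean_without_zeroes = mean(non_zero_counts)   # 2.5

# Sample standard deviation (divides by n - 1), assumed to match Splunk's stdev().
stdev_with_zeroes = stdev(daily_counts)       # sqrt(2), about 1.414
stdev_without_zeroes = stdev(non_zero_counts) # sqrt(0.5), about 0.707

print(mean_with_zeroes, stdev_with_zeroes)
print(mean_without_zeroes, stdev_without_zeroes)
```

In this example, dropping the zero days raises the mean and halves the standard deviation, so the filtered numbers describe only the days that had traffic.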

dickersons
Explorer

If I understand you correctly, the mean and standard deviation are likely to be more accurate if I include the "0" days, so that every day is included in the calculation and not only the days that have data points?

ITWhisperer
SplunkTrust

Correct - it is usually more meaningful to include the zeroes, but it does depend on what you are trying to show.

dickersons
Explorer

Makes sense. Does the formula itself look legit? That is, assuming the search criteria are correct, should I get the correct standard deviation based on Requests Per Day?

ITWhisperer
SplunkTrust

Yes, you will get the mean and standard deviation of all the daily counts in your time period.
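
As a rough sanity check, the two daily counts in the output posted above can be run through the same calculation outside Splunk. This Python sketch assumes those two rows are the complete result set and that stdev() is the sample standard deviation; under those assumptions it reproduces the reported mean of 64395 and a standard deviation of about 54741.38.

```python
from statistics import mean, pstdev, stdev

# The two Number_Of_Requests values from the output posted in the question.
reported_counts = [25687, 103103]

average = mean(reported_counts)        # matches "Average Requests Per Day"
sample_sd = stdev(reported_counts)     # sample form, divides by n - 1
population_sd = pstdev(reported_counts)  # population form, shown for contrast

print(average)
print(round(sample_sd, 2))
print(population_sd)
```

The sample form agrees with the "Standard Deviation" column, while the population form gives a noticeably smaller value, which suggests eventstats is already doing what the question intended without any extra eval.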
