Splunk Search

Standard Deviation Total Requests Per Day

dickersons
Explorer

I am attempting to calculate the following:

-  Total Number "Requests Per Day"

-  Average/Mean "Requests Per Day"

-  Standard Deviation "Requests Per Day"

I am using the following search:

index=myCoolIndex cluster_name="myCoolCluster" sourcetype=myCoolSourceType label_app=myCoolApp ("\"statusCode\"")
| rex .*\"traceId\"\s:\s\"?(?<traceId>.*?)\".*
| dedup traceId
| rex "(?s)\"statusCode\"\s:\s\"?(?<statusCode>[245]\d{2})\"?"
| timechart span=1d count(statusCode) as "Number_Of_Requests"
| where Number_Of_Requests > 0
| eventstats mean(Number_Of_Requests) as "Average Requests Per Day" stdev(Number_Of_Requests) as "Standard Deviation"

I am getting results back, but I am unsure whether they are correct for what I am trying to calculate. For instance, I would have thought "stdev()" would need some eval statement to know what the "Total Requests Per Day" and "Average/Mean Requests Per Day" are. Does "where Number_Of_Requests > 0" skew the results, since the days with zero requests are dropped from the result set? I was hoping someone could take a look at my query and offer some insight into what I may still need to do to get an accurate standard deviation. Below is the output I am getting from the current query:

Number_Of_Requests   Average Requests Per Day   Standard Deviation
25687                64395                      54741.378572337766
103103               64395                      54741.378572337766

 

Any help is appreciated!

 

ITWhisperer
SplunkTrust

By using "| where Number_Of_Requests > 0", you are potentially skewing the results, although that depends on what you are trying to show. For example, if you had 5 days with counts of 2, 0, 0, 0, 3, your mean would be 1 with the zeroes included and 2.5 without them. The standard deviation is affected in the same way by removing or including the zeroes.
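To make the five-day example concrete, here is a minimal Python sketch (not SPL) that computes the mean and the sample standard deviation with and without the zero days. The variable names are only illustrative, and it assumes the sample (n - 1) form of the standard deviation.

```python
from statistics import mean, stdev

# Daily request counts from the five-day example above.
daily_counts = [2, 0, 0, 0, 3]

# What a "where Number_Of_Requests > 0" filter would leave behind.
nonzero_counts = [c for c in daily_counts if c > 0]

mean_all = mean(daily_counts)
stdev_all = stdev(daily_counts)          # sample standard deviation (n - 1)
mean_nonzero = mean(nonzero_counts)
stdev_nonzero = stdev(nonzero_counts)    # sample standard deviation (n - 1)

print(f"with zeroes:    mean={mean_all}, stdev={stdev_all:.4f}")
print(f"without zeroes: mean={mean_nonzero}, stdev={stdev_nonzero:.4f}")
```

In this example, dropping the zero days raises the mean from 1 to 2.5 and shrinks the sample standard deviation from about 1.41 to about 0.71.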

dickersons
Explorer

If I understand you correctly, the mean and standard deviation are likely to be more representative if I include the "0" days, so that every day is included in the calculation and not only the days that have data points?

ITWhisperer
SplunkTrust

Correct - it is usually more meaningful to include the zeroes, but it does depend on what you are trying to show.

dickersons
Explorer

Makes sense.  Does the formula itself look legit?  That is, assuming the search criteria are correct, should I get the correct standard deviation based on Requests Per Day?

ITWhisperer
SplunkTrust

Yes, you will get the mean and standard deviation of all the daily counts in your time period.
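As a sanity check, the two daily counts from the posted output can be run through Python's statistics module. This is only a sketch outside Splunk. It assumes that those two rows are the complete set of daily counts in the time range.

```python
from statistics import mean, stdev, pstdev

# The two daily counts from the posted output (assumed to be the full set).
daily_counts = [25687, 103103]

average = mean(daily_counts)
sample_sd = stdev(daily_counts)        # divides by n - 1
population_sd = pstdev(daily_counts)   # divides by n

print(f"mean={average}")
print(f"sample stdev={sample_sd:.6f}")
print(f"population stdev={population_sd:.6f}")
```

The mean of 64395 and the sample standard deviation of roughly 54741.38 agree with the posted columns, which is consistent with stdev() in Splunk being the sample standard deviation (stdevp() is the population form). No eval is needed because eventstats computes both statistics directly from the Number_Of_Requests column.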
