Splunk Search

Standard Deviation Total Requests Per Day

dickersons
Explorer

I am attempting to calculate the following:

-  Total Number "Requests Per Day"

-  Average/Mean "Requests Per Day"

-  Standard Deviation "Requests Per Day"

I am using the following search:

index=myCoolIndex cluster_name="myCoolCluster" sourcetype=myCoolSourceType label_app=myCoolApp ("\"statusCode\"")
| rex ".*\"traceId\"\s:\s\"?(?<traceId>.*?)\".*"
| dedup traceId
| rex "(?s)\"statusCode\"\s:\s\"?(?<statusCode>[245]\d{2})\"?"
| timechart span=1d count(statusCode) as "Number_Of_Requests"
| where Number_Of_Requests > 0
| eventstats mean(Number_Of_Requests) as "Average Requests Per Day" stdev(Number_Of_Requests) as "Standard Deviation"

I am getting results back, but I am unsure whether they are correct for what I am trying to calculate. For instance, I would have thought "stdev()" would need some eval statement to know what the "Total Requests Per Day" and "Average/Mean Requests Per Day" are. Does the "where Number_Of_Requests > 0" skew the results, since days with zero requests are dropped from the result set? I was hoping someone could take a look at my query and provide some insight into what I may still need to do to get an accurate standard deviation. Below is the output I am getting from the current query:

Number_Of_Requests   Average Requests Per Day   Standard Deviation
25687                64395                      54741.378572337766
103103               64395                      54741.378572337766

 

Any help is appreciated!

 


ITWhisperer
SplunkTrust

By using "| where Number_Of_Requests > 0", you are potentially "skewing" the results, although that does depend on what you are trying to show. For example, if you had 5 days with counts of 2, 0, 0, 0, 3, your mean would be 1 with the zeroes included and 2.5 without them. The standard deviation is affected in the same way by the removal or inclusion of the zeroes.
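
To make the 2, 0, 0, 0, 3 example concrete, here is a small Python sketch of the same arithmetic. It is only an illustration outside Splunk, and it assumes that SPL's stdev() is the sample standard deviation (with stdevp() as the population version), which is what Python's statistics.stdev computes.

```python
import statistics

# Daily request counts from the example: five days, three of them empty.
with_zeroes = [2, 0, 0, 0, 3]

# Equivalent of "| where Number_Of_Requests > 0": drop the empty days.
without_zeroes = [count for count in with_zeroes if count > 0]

mean_with = statistics.mean(with_zeroes)
mean_without = statistics.mean(without_zeroes)

# Sample standard deviation (divides by n - 1), assumed to match SPL stdev().
sd_with = statistics.stdev(with_zeroes)
sd_without = statistics.stdev(without_zeroes)

print("with zeroes:   ", mean_with, sd_with)
print("without zeroes:", mean_without, sd_without)
```

With the zeroes the mean is 1 and the standard deviation is the square root of 2 (about 1.41); without them the mean is 2.5 and the standard deviation is the square root of 0.5 (about 0.71). Dropping the empty days both raises the mean and shrinks the spread.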


dickersons
Explorer

If I understand you correctly, the mean and standard deviation are likely to be more accurate if I include the zeroes, so that every day is included in the calculation and not only the days on which there are data points?


ITWhisperer
SplunkTrust

Correct - it is usually more meaningful to include the zeroes, but it does depend on what you are trying to show.


dickersons
Explorer

Makes sense. Does the formula itself look legit? That is, assuming the search criteria are correct, should I get the correct standard deviation based on Requests Per Day?


ITWhisperer
SplunkTrust

Yes, you will get the mean and standard deviation of all the daily counts in your time period.
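
As a sanity check, the output posted in the question can be reproduced by hand from the two daily counts alone, which also shows why no extra eval is needed: the mean and standard deviation are computed directly from the Number_Of_Requests values. The Python sketch below is an illustration outside Splunk, again assuming SPL's stdev() is the sample standard deviation.

```python
import statistics

# The two daily counts shown in the question's output table.
daily_counts = [25687, 103103]

# Same inputs eventstats sees: one Number_Of_Requests value per day.
mean_requests = statistics.mean(daily_counts)

# Sample standard deviation (n - 1 denominator), assumed to match SPL stdev().
sample_sd = statistics.stdev(daily_counts)

# Population standard deviation (n denominator), for comparison with stdevp().
population_sd = statistics.pstdev(daily_counts)

print("mean:         ", mean_requests)
print("sample sd:    ", sample_sd)
print("population sd:", population_sd)
```

The mean comes out to 64395 and the sample standard deviation to roughly 54741.38, matching the posted "Average Requests Per Day" and "Standard Deviation" columns. The population figure is 38708, so the posted value is consistent with a sample rather than a population calculation.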
