Plotting Normal Distribution using a column

howardroark · ‎07-06-2017

How do I derive a normal distribution for a column (time_taken, which is all integers) and then plot a bell curve from it?
Can I get some examples?

Thanks

Richfez · ‎07-09-2017

Well, that depends on if it's a normal distribution or not! 🙂

But I think I know what you mean here, so here's an example using LEN from my cheapie firewall at home. LEN is really all over the place, but I know there are LOTS of DNS requests in the 40-75 byte range, so when LEN is under 100 I would expect to see a bell-curve-like chart (or something near enough for demonstration purposes).

index=fw LEN<100
| bin LEN 
| stats count by LEN

Which gives me a chart of count by LEN (screenshot not preserved).

You didn't provide an actual search to go off of, but you could try pretty much what I did, only replacing LEN with time_taken.

<your base search>
| bin bins=100 time_taken
| stats count by time_taken

You'll notice I also bumped up the number of bins it can make - it won't MAKE 100 bins unless it finds a reason to, but I've told it that it COULD make up to 100 bins.
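If a fixed bucket width is easier to reason about than a cap on the number of bins, the span option of bin is an alternative. This is only a sketch: the width of 5 is an arbitrary assumption, and the right value depends on the range of your time_taken values.

```spl
<your base search>
| bin span=5 time_taken
| stats count by time_taken
```

A smaller span gives a finer-grained curve but noisier counts per bucket, while a larger span smooths the shape at the cost of detail.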

Reference for bin in docs.

Happy Splunking!
-Rich

Richfez · ‎07-16-2017

howardroark,

If this answered your question, could you please mark it as accepted?

If it did not answer your question, let us know what else you need!

Thanks,
Rich

niketn · ‎07-07-2017

This is a very broad question, and the answer depends heavily on how your data behaves and on your scenario.

You should start with stdev() and mean() over historical data, i.e. the last 60 weeks or whatever history you have. Then you can calculate the 2nd and 3rd standard deviations (positive and negative) based on the statistical calculation answered in the following post: https://www.wyzant.com/resources/answers/27347/i_need_to_find_one_two_and_three_standards_deviations...
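As a rough sketch of that calculation, assuming the field is time_taken: the names mu, sigma, and the lower_/upper_ fields are made up for illustration. The bands are simply the mean plus or minus two and three standard deviations.

```spl
<your base search>
| stats mean(time_taken) as mu stdev(time_taken) as sigma
| eval lower_2sd = mu - 2 * sigma, upper_2sd = mu + 2 * sigma
| eval lower_3sd = mu - 3 * sigma, upper_3sd = mu + 3 * sigma
```

If the data really is close to normal, roughly 95% of values should fall inside the 2-sigma band and roughly 99.7% inside the 3-sigma band, which gives a quick sanity check on the assumption.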

I don't think there is currently a dedicated visualization for plotting a normal distribution. You can try the timechart command, or you can use an Area Chart (https://answers.splunk.com/answers/526582/response-time-distribution-chart-with-a-long-tail.html). See if that fits the need.
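One possible way to get a bell curve onto an Area Chart is to compute the expected count per value from the mean and standard deviation and chart it next to the observed count. This is a sketch, not a built-in feature. It assumes time_taken holds integers (so each value acts as a bucket of width 1), and the field names mu, sigma, total, and expected are made up for illustration. The expected count is the total number of events times the normal density at that value.

```spl
<your base search>
| eventstats avg(time_taken) as mu stdev(time_taken) as sigma count(time_taken) as total
| stats count first(mu) as mu first(sigma) as sigma first(total) as total by time_taken
| eval expected = total * exp(-pow(time_taken - mu, 2) / (2 * pow(sigma, 2))) / (sigma * sqrt(2 * pi()))
| sort 0 time_taken
| table time_taken count expected
```

Charting count and expected against time_taken shows how closely the real data follows the fitted curve. On large result sets eventstats can be expensive, so it may be worth narrowing the time range first.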

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
