Plotting Normal Distribution using a column
How do I derive a normal distribution for a column (time_taken, which is all integers) and then plot a bell curve from it?
Can I get some examples?
Thanks

Well, that depends on if it's a normal distribution or not! 🙂
But I think I know what you mean here, so here's an example using LEN from my cheapie firewall at home. LEN is really all over the place, but I know there are LOTS of DNS requests in the 40-75 byte range, so when LEN is under 100 I would expect to see a bell-curve-like chart (or something near enough for demonstration purposes).
index=fw LEN<100
| bin LEN
| stats count by LEN
Which gives me a count per LEN bucket to chart.
Now for your needs, you didn't provide an actual search to go off of, but you could try pretty much what I did, only replacing LEN with time_taken.
<your base search>
| bin bins=100 time_taken
| stats count by time_taken
You'll notice I also set the number of bins explicitly with bins=100. That is an upper limit: bin won't MAKE 100 bins unless it finds a reason to, but I've told it that it COULD make up to 100.
Reference for bin in docs.
Happy Splunking!
-Rich

howardroark,
If this answered your question, could you please mark it as accepted?
If it did not answer your question, let us know what else you need!
Thanks,
Rich

This is a very broad question, and the right answer depends entirely on how your data behaves and on your scenario.
You should start with stdev() and mean() over historical data, i.e. the last 60 weeks or whatever history you have. Then you can calculate the 2nd and 3rd standard deviations (positive and negative) based on the statistical calculation answered in the following post: https://www.wyzant.com/resources/answers/27347/i_need_to_find_one_two_and_three_standards_deviations...
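As a rough sketch of that first step, assuming the field is time_taken and that <your base search> stands in for your real search. The output names (mean, sd, lower_2sd and so on) are just labels I picked, and earliest=-60w is only there to match the 60-week example; adjust it to the history you actually have. Note that stdev() is the sample standard deviation; stdevp() would give the population version.

```
<your base search> earliest=-60w
| stats avg(time_taken) AS mean stdev(time_taken) AS sd
| eval lower_2sd = mean - 2 * sd, upper_2sd = mean + 2 * sd
| eval lower_3sd = mean - 3 * sd, upper_3sd = mean + 3 * sd
```

This returns a single row with the mean, the standard deviation, and the 2 and 3 standard deviation bounds on either side of the mean.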
I don't think there is currently a built-in visualization to plot a normal distribution, so you could try the timechart command, or you can use an Area Chart (https://answers.splunk.com/answers/526582/response-time-distribution-chart-with-a-long-tail.html). See if that fits the need.
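If you want an actual bell curve to compare against, one possible approach (a sketch, not a built-in feature) is to compute the count a normal distribution would predict for each value and chart it next to the real counts. This assumes time_taken is an integer, so each value acts as a bucket of width 1. The field names mean, sd, n and expected are my own, and <your base search> is a placeholder.

```
<your base search>
| eventstats avg(time_taken) AS mean stdev(time_taken) AS sd count AS n
| stats count max(mean) AS mean max(sd) AS sd max(n) AS n by time_taken
| eval expected = n * exp(-0.5 * pow((time_taken - mean) / sd, 2)) / (sd * sqrt(2 * pi()))
| sort 0 time_taken
| table time_taken count expected
```

Rendered as an area or column chart with time_taken on the x-axis, count shows the real distribution and expected shows the bell curve with the same mean and standard deviation. If the two shapes differ a lot, the data probably isn't normally distributed. On large data sets the eventstats step can be expensive, so you may want to narrow the time range first.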
| makeresults | eval message= "Happy Splunking!!!"
