Splunk Search

Equivalent Search Command for Excel Histogram (analysis toolpak addin)

kbecker
Communicator

Anybody familiar with a Splunk search command that does the equivalent of the Excel Histogram function that is found in the Analysys Toolpak addin (options --> addin). This Excel function calculates data bins based on the min and max values of a data set, it then shows a count per bin. For example. If you had 1000 data points from 4820 to 71750, this would be the bin breakdown (for 6 bins).

Bins
4820 if <= 4820

18206 if > 4820 & <= 18206

31592 if >18206 & <= 31592

44978 if > 31592 & <= 44978

58364 if > 44978 & <= 58364

71750 if > 58364 & <= 71750

Now, we would need to determine the count of data points for bin, so we add frequency...

Bin Frequency

4820 2 if <= 4820

18206 800 if > 4820 & <= 18206

31592 100 if >18206 & <= 31592

44978 47 if > 31592 & <= 44978

58364 50 if > 44978 & <= 58364

71750 1 if > 58364 & <= 71750

With Splunk I am able to determine the min, max, and bin values, but I am having trouble figuring out the best way to go back through the raw data to count events per bin. Not sure if there is a Splunk search command that will just do this or do I have to use a sub-search?

Thanks in advance for any help.

Glenn
Builder

It looks like you want the buckets to be of even size. If for some reason you wanted to have more control over the bucket sizes/ranges (also useful to be able to lump outliers over a certain value into one "above X" bucket, you can do this, for example:

_your_search_ | fields some_field | eval some_field_group = case(some_field<50, "0-50ms", some_field<100, "50-100ms", some_field<150 , "100-150ms", some_field<200, "150-200ms", some_field<250, "200-250ms", some_field<500, "250-500ms", some_field<750, "500-750ms", some_field<1000, "750-1000ms", 1==1, ">1000ms" ) | chart count by some_field_group

gkanapathy
Splunk Employee
Splunk Employee

... | chart count, max(datavalue), min(datavalue) over datavalue bins=6

or

... | chart count(datavalue), max(datavalue), min(datavalue) over datavalue bins=6

The bins= can be replaced with fixed or logarithmic bucket sizes as well, see this for bucketing options: http://docs.splunk.com/Documentation/Splunk/5.0/SearchReference/Bucket

Johnvey
Contributor

Yes, the bucket command is what you're looking for. Given the following search command (that generates synthetic data):

| windbag

An auto-extracted field is called position, which is a monotonically increasing number from 0 to 99. If you want to create 5 buckets of 20 numbers each:

| windbag | bucket position span=20 | stats count by position

This produces a table of values like:

position    count
0-20        20
20-40       20
40-60       20
60-80       20
60-100      20

The bucket command also handles time natively, so you can specify a relative time for the span argument:

| bucket _time span=5m

You can read more about the bucket command and about relative time identifiers.

Johnvey
Contributor

Yes, the bins generated will always be spaced such that the min and max values of whatever field of interest is covered. Is there some specific method of binning that you're looking for?

0 Karma

kbecker
Communicator

Is there a way for Splunk to calculate the bins based on the min and max values of the position value?

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Announcing Modern Navigation: A New Era of Splunk User Experience

We are excited to introduce the Modern Navigation feature in the Splunk Platform, available to both cloud and ...

Modernize your Splunk Apps – Introducing Python 3.13 in Splunk

We are excited to announce that the upcoming releases of Splunk Enterprise 10.2.x and Splunk Cloud Platform ...

Step into “Hunt the Insider: An Splunk ES Premier Mystery” to catch a cybercriminal ...

After a whole week of being on call, you fell asleep on your keyboard, and you hit a sequence of buttons that ...