Bucketing fields (and events that are floating point values)

Explorer

Say I have a field called Version whose value can range from 2.6.0 to 4.8.1 per event. What would be the way to bucket, categorize, or group events so that Version falls into the ranges 2.0 to 2.5, 2.6 to 2.9, 3.0 to 3.5, 3.6 to 3.9, and finally 4.0 to 4.8? In effect I need a quantize function, something like: quantize {numerical field} min={value} max={value} step={value} number of buckets={value}

Anyone got any ideas?

SplunkTrust

Quick answers

1) you should read up on the bucket command (aka bin)

http://www.splunk.com/base/Documentation/latest/SearchReference/Bucket

<your search> | bucket Version span=0.5 | stats count by Version

2) you should also read up on the timechart and chart commands, because they have pretty powerful bucketing abilities of their own when you use split-by fields.

<your search> | chart avg(someNumericField) over Version span=0.5

<your search> | chart count over someCategoricalField by Version span=0.5

<your search> | timechart count by Version span=0.5

etc...

http://www.splunk.com/base/Documentation/latest/SearchReference/Chart

Hopefully that will give you enough to get going.

UPDATE:

Of course, your values are not decimals but version strings with more than one decimal point in them. Bucket won't know what to make of these directly.

What I would do is use rex to extract another field that contains only the first two segments of the version, then bucket the results by that field.

Do this to see what I'm talking about:

<your search> | rex field=Version "(?<MajorVersion>\d+\.\d+)"
| bucket MajorVersion span=0.5 | stats count by Version, MajorVersion
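If you want to try the extraction without touching real data, here is a minimal sketch that fakes a few events with makeresults. The four version strings are made-up sample values, not anything from your index, and the regex is quoted because rex expects a quoted string:

```spl
| makeresults
| eval Version=split("2.6.0,3.1.4,3.7.2,4.8.1", ",")
| mvexpand Version
| rex field=Version "(?<MajorVersion>\d+\.\d+)"
| bucket MajorVersion span=0.5
| table Version MajorVersion
```

Each row should show the raw Version next to the bin its first two segments landed in. One caveat: a version like 2.10.0 would be read as the number 2.10, i.e. 2.1, so this only behaves well if the minor segment is a single digit.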

So depending on what you want to do, the final version might be:

<your search> | rex field=Version "(?<MajorVersion>\d+\.\d+)"
| bucket MajorVersion span=0.5
| stats sum(bytes) as total_bytes dc(users) as distinct_users by MajorVersion

And if you want to preserve the raw values of Version in there, throw a values(Version) as Versions into the stats command.
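If you need exactly the uneven ranges from the question (2.0 to 2.5, 2.6 to 2.9, and so on) rather than fixed 0.5-wide bins, one possible approach is to label them by hand with eval and case() instead of bucket. This is only a sketch: the field names v and VersionRange are invented for illustration, and it assumes single-digit minor versions so that the first two segments compare sensibly as a number.

```spl
<your search>
| rex field=Version "(?<MajorVersion>\d+\.\d+)"
| eval v=tonumber(MajorVersion)
| eval VersionRange=case(v<2.6, "2.0-2.5",
                         v<3.0, "2.6-2.9",
                         v<3.6, "3.0-3.5",
                         v<4.0, "3.6-3.9",
                         v>=4.0, "4.0-4.8")
| stats count values(Version) as Versions by VersionRange
```

Note that anything below 2.0 ends up in the first range and anything above 4.8 ends up in the last one, so add more case() branches if those values can occur. Events where the rex does not match get no VersionRange and drop out of the stats.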

SplunkTrust

Oh nuts. Sorry - I somehow missed that these weren't floats, but rather version strings with multiple decimal points. I'll update my answer.

Explorer

This is not working, especially on my Version field. The first search example returns all of the individual values and does not place them into bins or buckets.
