Solved: Splunk Newbie: Using buckets/bins

rfernandez2010 · ‎02-05-2018

Hi Splunkers,

I can't seem to find a efficient way to bucket my results where anything greater than 174 days gets tossed to one collective bucket.
The table below is the result I want, but what I'm getting are buckets spanning to 10,000 days with each individual count.

index=[index here] source=[source] 
......
|eval days=round((now()-strptime(OPEN_DATE, "%m%d%Y"))/86400)
|bin days span=29
|stats count  by days

days , count

0-29 , 450

29-58 , 411

58-87 , 471

87-116 , 1389

116-145, 5828

145-174 , 806

174+ 30,,000

lguinn2 · ‎02-05-2018

I think you may need to use more of the options for the bin command, such as setting the number of bins.
You could also calculate this instead of trying to force bin - bin doesn't like unequal bin sizes... but I recommend the case function, which gives you complete control:

...
| eval days=round((now()-strptime(OPEN_DATE, "%m%d%Y"))/86400)
| eval day_category=case(days < 30, "0-29",
                  days < 59,"30-58",
                  days < 88,"59-87",
                  days < 117,"88-116",
                  days < 146,"117-145",
                  days < 175,"146-174",
                  true(),"174+")
| stats count by date_category

Note a couple of things about the case function: it takes the value corresponding to the first match. So if days is 118, then the day_category will be assigned to "117-145". The true() in the last part of the case function defines a default value, which will be used if no other test matches.

The formatting is just to make it easy to read; you can put the whole case function on a single line; Splunk doesn't care.

Finally, the categories that you defined in your desired output are overlapping. I corrected that in my case statement.

HTH!

View solution in original post

mayurr98 · ‎02-05-2018

hey you can try something like this

index=<your_index> source=<your_source>
    ...... 
| eval days=round((now()-strptime(OPEN_DATE, "%m%d%Y"))/86400) 
| eval t=days 
| bin days span=29 
| eval days=case(t>=0 AND t<29,"0-29",t>=29 AND t<58,"29-58",t>=58 AND t<87,"58-87",t>=87 AND t<116,"87-116",t>=116 AND t<145,"116-145",t>=145 AND t<174,"145-174",t>=174,"174+") 
| stats count by days 
| sort days

let me know if this helps!

lguinn2 · ‎02-05-2018

I think you may need to use more of the options for the bin command, such as setting the number of bins.
You could also calculate this instead of trying to force bin - bin doesn't like unequal bin sizes... but I recommend the case function, which gives you complete control:

...
| eval days=round((now()-strptime(OPEN_DATE, "%m%d%Y"))/86400)
| eval day_category=case(days < 30, "0-29",
                  days < 59,"30-58",
                  days < 88,"59-87",
                  days < 117,"88-116",
                  days < 146,"117-145",
                  days < 175,"146-174",
                  true(),"174+")
| stats count by date_category

Note a couple of things about the case function: it takes the value corresponding to the first match. So if days is 118, then the day_category will be assigned to "117-145". The true() in the last part of the case function defines a default value, which will be used if no other test matches.

The formatting is just to make it easy to read; you can put the whole case function on a single line; Splunk doesn't care.

Finally, the categories that you defined in your desired output are overlapping. I corrected that in my case statement.

HTH!

rfernandez2010 · ‎02-06-2018

Awesome, worked like a charm.

Thank you for your help.

Splunk Newbie: Using buckets/bins

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

ATTENTION: We’re Moving! (AGAIN!)

Deep Dive: Optimizing Telemetry Pipelines in Splunk Observability Cloud

Announcing Modern Navigation: A New Era of Splunk User Experience

Join the Conversation