Search command: bucket

Path Finder

Hi

We have a summary indexed search that puts events into buckets for a day. We then use that to get the top 5 values for a given day. This is the search we have:

... | bucket span=1d _time | sitop 5 field1 by _time  

What we notice is that there are two buckets created within a single day. One has a 12:00 AM value and the other has a 5:00 PM value. We just need all of the events to be grouped under one value. A sample result is given below.

_time                    field1   count  percent
5/28/10 12:00:00.000 AM  value1   3406   26.442046
5/28/10 12:00:00.000 AM  value2   2506   19.455011
5/28/10 12:00:00.000 AM  value3   1034   8.027327
5/28/10 12:00:00.000 AM  value4   617    4.790001
5/28/10 12:00:00.000 AM  value5   609    4.727894
5/28/10 5:00:00.000 PM   value6   61     21.478873
5/28/10 5:00:00.000 PM   value7   39     13.732394
5/28/10 5:00:00.000 PM   value8   33     11.619718
5/28/10 5:00:00.000 PM   value9   25     8.802817
5/28/10 5:00:00.000 PM   value10  21     7.394366

Are we missing something? Thanks for your help.

Ranga

Re: Search command: bucket

SplunkTrust

Interesting. The only thing I can think of is that you're using distributed search and not all of the servers are in the same timezone. Since the bucketing logic will (I think) be applied on the remote peers, each peer would calculate the day boundaries in its own timezone. The central search head would then get back all the buckets, apply its own day boundaries, and you might end up with odd results like this.
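To make the day-boundary idea concrete, here is a minimal Python sketch. It is not how Splunk implements bucketing. The two zones (a peer on UTC and a search head on UTC-7) and the helper names `day_bucket` and `displayed_buckets` are assumptions chosen purely for illustration; the thread does not say which timezones the servers use.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zones, chosen only for illustration (not taken from the thread).
PEER_TZ = timezone.utc                    # assumed zone of a remote peer
HEAD_TZ = timezone(timedelta(hours=-7))   # assumed zone of the search head


def day_bucket(event_time, tz):
    """Snap an aware datetime to local midnight in tz, like a 1-day bucket."""
    local = event_time.astimezone(tz)
    return local.replace(hour=0, minute=0, second=0, microsecond=0)


def displayed_buckets(event_time):
    """Return both bucket starts, expressed in the search head's zone."""
    on_head = day_bucket(event_time, HEAD_TZ)
    on_peer = day_bucket(event_time, PEER_TZ).astimezone(HEAD_TZ)
    return on_head, on_peer


# An event at 8:30 PM on 5/28/10, search-head local time.
event = datetime(2010, 5, 28, 20, 30, tzinfo=HEAD_TZ)
on_head, on_peer = displayed_buckets(event)

print(on_head.strftime("%m/%d/%y %I:%M %p"))  # 05/28/10 12:00 AM
print(on_peer.strftime("%m/%d/%y %I:%M %p"))  # 05/28/10 05:00 PM
```

Under these assumed zones, the same event lands in a 12:00 AM bucket when the day is computed on the search head and in a 5:00 PM bucket when it is computed on the peer, which is the same shape as the two groups in the sample output.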

Re: Search command: bucket

Path Finder

Thanks. Is there any way to check whether distributed search is being used, or should I contact the admin to get this information? If distributed search is in use, can we prevent it through a configuration setting or a command?

Re: Search command: bucket

Super Champion

You can use localop to "prevent subsequent commands from being executed on remote peers." Not sure if that's your issue or not, but I guess this could help you find out.

Re: Search command: bucket

SplunkTrust

Well, if you haven't set up Splunk on any other machines and configured searches to distribute across them, then there's no distributed search, in which case my idea is a red herring. To check, go to "Manager > Distributed Search > Search Peers" and see whether any peered servers are listed.

Re: Search command: bucket

Path Finder

Thanks. We do have a distributed search setup and there is a timezone mismatch between the two servers. This is most likely causing the issue we are seeing.

Re: Search command: bucket

SplunkTrust

Oh nice. Yeah, that's probably it then. Interestingly, this isn't supposed to be an issue: the distributed search code sends serialized timezone info over the wire specifically so that the bucketing is performed consistently. Sounds like it's worth filing a bug and a support case.

Re: Search command: bucket

Super Champion

Side note: It doesn't look like your limit (5) is honored when you are using sitop instead of top.

Re: Search command: bucket

Splunk Employee

This is a bug. The best workaround right now would be to insert the localop command before the bucket, as lowell suggested.
