We have a summary indexed search that puts events into buckets for a day. We then use that to get the top 5 values for a given day. This is the search we have:
... | bucket span=1d _time | sitop 5 field1 by _time
What we notice is that there are two buckets created within a single day. One has a 12:00 AM value and the other has a 5:00 PM value. We just need all of the events to be grouped under one value. A sample result is given below.
_time                      field1    count   percent
5/28/10 12:00:00.000 AM    value1    3406    26.442046
5/28/10 12:00:00.000 AM    value2    2506    19.455011
5/28/10 12:00:00.000 AM    value3    1034    8.027327
5/28/10 12:00:00.000 AM    value4    617     4.790001
5/28/10 12:00:00.000 AM    value5    609     4.727894
5/28/10 5:00:00.000 PM     value6    61      21.478873
5/28/10 5:00:00.000 PM     value7    39      13.732394
5/28/10 5:00:00.000 PM     value8    33      11.619718
5/28/10 5:00:00.000 PM     value9    25      8.802817
5/28/10 5:00:00.000 PM     value10   21      7.394366
Are we missing something? Thanks for your help.
Interesting. The only thing I can think of is that you're using distributed search and maybe not all servers are in the same timezone. Since the bucketing logic will (I think) be applied on the remote peer, that peer would calculate the day boundaries differently. The central search head would then get back all the buckets and apply its own day boundaries, and you might end up with weird results like this.
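To make the idea concrete, here is a minimal Python sketch (not Splunk code) of what happens when the same event is snapped to a day boundary in two different timezones. The offsets are assumptions picked only for illustration: a search head displaying times in UTC-7 and a peer computing day boundaries in UTC. The thread does not say which timezones the servers actually use, and `day_bucket` is a hypothetical helper, not anything Splunk exposes.

```python
from datetime import datetime, timedelta, timezone

# Assumed offsets, for illustration only: the search head displays
# times in UTC-7, while one peer computes day boundaries in UTC.
HEAD_TZ = timezone(timedelta(hours=-7))
PEER_TZ = timezone.utc


def day_bucket(ts, tz):
    """Snap a timestamp to the start of its day in the given timezone."""
    local = ts.astimezone(tz)
    return local.replace(hour=0, minute=0, second=0, microsecond=0)


# One event at 6:30 PM on 5/28/10, as seen from the search head.
event = datetime(2010, 5, 28, 18, 30, tzinfo=HEAD_TZ)

head_bucket = day_bucket(event, HEAD_TZ)  # day start per the search head
peer_bucket = day_bucket(event, PEER_TZ)  # day start per the peer

# Render both bucket labels in the search head's timezone.
for label, bucket in (("head", head_bucket), ("peer", peer_bucket)):
    shown = bucket.astimezone(HEAD_TZ)
    print(label, shown.strftime("%m/%d/%y %H:%M"))
```

Under these assumed offsets, the head's bucket is labelled midnight on 5/28 while the peer's bucket, viewed from the head, is labelled 17:00 on the same date. That would give two distinct _time values inside what the search head considers one day, which is the shape of the result in the question.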
Thanks. Is there any way to check whether distributed search is being used, or should I contact the admin to get this information? If distributed search is being used, can we prevent it through a configuration or command?
You can use localop to "prevent subsequent commands from being executed on remote peers." Not sure if that's your issue or not, but I guess this could help you find out.
Well, if you haven't set up Splunk on any other machine and set up searches to distribute between the N machines, then there's no distributed search, in which case my idea will be a red herring. But to check, you would go into "Manager > Distributed Search > Search Peers" and see if there are any peered servers.
Thanks. We do have a distributed search setup and there is a timezone mismatch between the two servers. This is most likely causing the issue we are seeing.
Oh nice. Yeah, that's probably it then. Interestingly enough, this isn't supposed to be an issue: the distributed search code actually sends serialized timezone info over the wire specifically so that the bucketing is performed consistently. Sounds like it's worth a bug + support case.