Why are bucket times expanding?

ericl42
Path Finder

I've read a few other forum posts with similar issues but never found a true solution. Overall, I'm trying to mock up some correlation rules within Enterprise Security where my time frame is going to be -5h@h to -1h@h. I want to split this into two buckets so I can compare two-hour time frames against one another.

I continually get 3 buckets even though there should only be two.

In my most recent test I did this search:

(index=os_windows* OR index=os_unix*) (source=WinEventLog:Security OR sourcetype=linux_secure OR tag=authentication) action=failure NOT Result_Code=0x17 NOT Account_Name="*$"  earliest=-5h@h latest=-1h@h
| bucket _time span=2h 
| stats values(user) AS affected_users, values(ComputerName) as dc, dc(user) AS num_users count BY src_ip _time 
| where num_users > 3

When I look at the results, I see multiple src_ip's that have a 9:00, 11:00, and 13:00 row. It's currently 2:30 pm so the breakdown is:

2:30 = current time
1:30 = -1h@h
12:30 = -2h@h
11:30 = -3h@h
10:30 = -4h@h
09:30 = -5h@h

So I should have a 9:00 - 11:00 bucket and an 11:00 - 1:00 bucket. I have no idea why it's also showing me a 13:00 bucket in my search results. This is throwing off my math, since the count in that bucket is quite a bit different; I assume it doesn't cover the full time frame, or the time isn't snapping correctly or something.

1 Solution

woodcock
Esteemed Legend

OK, I finally took the time to try this and I know what is happening. Splunk is "too smart" for its own good here. It knows that there is an even number of hours in a day, so when you tell it to bin _time span=2h, the buckets that it automatically creates fall on even-hour boundaries. In your case, when you are displaying a range that starts on an odd hour, such as 1PM-5PM, it creates 3 even-hour-based buckets (12-2, 2-4, 4-6) instead of 2 odd-hour-based buckets (1-3, 3-5). I believe that @rich7177 has experimented and pontificated at length about this, and perhaps he will post a comment or a link. In any case, now that I understand the fundamental nature of the problem, I believe that it can be addressed like this:

((index="os_windows*" AND source=WinEventLog:Security) OR (index="os_unix*" AND sourcetype=linux_secure) OR tag=authentication) AND action=failure AND NOT (Result_Code="0x17" OR Account_Name="*$") earliest=-5h@h latest=-1h@h
| bucket _time span=2h aligntime=-1h@h
| stats values(user) AS affected_users, values(ComputerName) AS dc, dc(user) AS num_users count BY src_ip _time 
| where num_users > 3
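
The boundary effect described above can be modeled outside Splunk with a few lines of Python. This is only a sketch of the arithmetic: it assumes bucket boundaries sit at whole multiples of the span counted from local midnight by default, or from an alignment time when one is given. It ignores time zones and is not Splunk's actual implementation. The names bucket_start and align are hypothetical, and the date is purely illustrative.

```python
from datetime import datetime, timedelta

SPAN = timedelta(hours=2)

def bucket_start(ts, span=SPAN, align=None):
    """Floor ts to a bucket boundary; boundaries sit at origin + k * span."""
    # Assumption: without an alignment time, boundaries are counted from midnight.
    origin = align if align is not None else datetime(ts.year, ts.month, ts.day)
    k = (ts - origin) // span  # timedelta // timedelta gives a floored int
    return origin + k * span

day = datetime(2019, 11, 22)
# One event every 30 minutes from 13:00 up to (not including) 17:00.
events = [day + timedelta(hours=13, minutes=m) for m in range(0, 240, 30)]

# Default alignment: three even-hour buckets cover a 1PM-5PM range.
default_buckets = sorted({bucket_start(e).hour for e in events})

# Aligned to 13:00: the same events fall into exactly two buckets.
aligned_buckets = sorted(
    {bucket_start(e, align=day + timedelta(hours=13)).hour for e in events}
)

print(default_buckets)  # [12, 14, 16]
print(aligned_buckets)  # [13, 15]
```

In this model, the first and last default buckets each hold only one hour of events, which is why their counts look different from the middle bucket.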

Alternatively, you might consider starting over and using a sliding window (streamstats time_window=2h) instead of a discretized window.
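
For contrast with fixed buckets, the sliding-window idea can be sketched in Python as below. This is a rough model under stated assumptions, not a description of how streamstats works internally: events are assumed to arrive sorted by time, the window is taken as the two hours ending at each event, and the function name, field layout, and sample data are all hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta

def sliding_distinct_users(events, window=timedelta(hours=2)):
    """events: (time, src_ip, user) tuples sorted by time ascending.

    Yields (time, src_ip, num_users), where num_users is the number of
    distinct users seen from that src_ip in the window ending at the event.
    """
    recent = {}  # src_ip -> deque of (time, user) still inside the window
    for t, ip, user in events:
        q = recent.setdefault(ip, deque())
        q.append((t, user))
        # Drop entries that are a full window (or more) older than this event.
        while q[0][0] <= t - window:
            q.popleft()
        yield t, ip, len({u for _, u in q})

day = datetime(2019, 11, 22)
sample = [
    (day + timedelta(hours=9), "10.0.0.1", "alice"),
    (day + timedelta(hours=9, minutes=30), "10.0.0.1", "bob"),
    (day + timedelta(hours=10), "10.0.0.1", "carol"),
    (day + timedelta(hours=10, minutes=30), "10.0.0.1", "dave"),
    (day + timedelta(hours=12, minutes=45), "10.0.0.1", "erin"),
]

counts = [n for _, _, n in sliding_distinct_users(sample)]
print(counts)  # [1, 2, 3, 4, 1]
```

A sliding window has no bucket boundaries to align, but it produces one result per event rather than two rows to compare.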

ericl42
Path Finder

Thank you very much. Yesterday, before this post, I accidentally started testing with aligntime and it seemed to fix the issue, but I wasn't 100% sure why. I don't think I can use sliding windows because I'm mocking all of these rules up for ES correlation searches.

to4kawa
Ultra Champion
 (index=os_windows* OR index=os_unix*) (source=WinEventLog:Security OR sourcetype=linux_secure OR tag=authentication) action=failure NOT Result_Code=0x17 NOT Account_Name="*$"  earliest=-5h@h latest=-1h@h
| timechart span=2h values(user) AS affected_users, values(ComputerName) as dc, dc(user) AS num_users count BY src_ip
| where num_users > 3

Hi, how about simply using timechart?

ericl42
Path Finder

Timechart may work for this one scenario, but I have others where I count by multiple fields and timechart only allows me to do one.

to4kawa
Ultra Champion

Like makeresults, bin seems to create a bucket for the last time as well when it generates its time values.

If you really need exactly two buckets:

 (index=os_windows* OR index=os_unix*) (source=WinEventLog:Security OR sourcetype=linux_secure OR tag=authentication) action=failure NOT Result_Code=0x17 NOT Account_Name="*$"  earliest=-5h@h latest=-1h@h
 | addinfo
 | eval sessionId = if(_time < relative_time(info_min_time,"+2h"),1,2)
 | stats earliest(_time) as _time values(user) AS affected_users, values(ComputerName) as dc, dc(user) AS num_users count BY src_ip sessionId
 | where num_users > 3

With this query, the search period is divided into the first 2 hours and the rest, and the results are displayed.
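
The split described above (first two hours versus the rest, then a distinct-user count per source and half) can be sketched in Python as follows. It is a simplified model, assuming window_start plays the role of info_min_time; the function name, tuple layout, and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def compare_halves(events, window_start, threshold=3, half=timedelta(hours=2)):
    """Group (time, src_ip, user) events into session 1 (first two hours)
    or session 2 (the rest) and keep groups with more than `threshold`
    distinct users."""
    users = defaultdict(set)
    for t, ip, user in events:
        session_id = 1 if t < window_start + half else 2
        users[(ip, session_id)].add(user)
    return {key: len(names) for key, names in users.items() if len(names) > threshold}

start = datetime(2019, 11, 22, 9, 0)
sample = [
    (start + timedelta(minutes=10), "10.0.0.1", "alice"),
    (start + timedelta(minutes=20), "10.0.0.1", "bob"),
    (start + timedelta(minutes=30), "10.0.0.1", "carol"),
    (start + timedelta(minutes=40), "10.0.0.1", "dave"),
    (start + timedelta(hours=2, minutes=30), "10.0.0.1", "alice"),
    (start + timedelta(hours=3), "10.0.0.1", "bob"),
]

result = compare_halves(sample, start)
print(result)  # {('10.0.0.1', 1): 4}
```

Because the split point is computed from the start of the search window, there are never more than two groups per source, whatever hour the window starts on.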

ericl42
Path Finder

Thanks for the response. After digging around a little, I think I may have fixed the issue by adding the aligntime portion. However, I'll take a look at your new query as well.

| bucket _time span=2h@h aligntime=-1h@h

to4kawa
Ultra Champion
2:30 = current time
1:00 = -1h@h
12:00 = -2h@h
11:00 = -3h@h
10:00 = -4h@h
09:00 = -5h@h

Hi, @h snaps to the beginning of the hour, so this breakdown is the correct one.
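
The snapping shown in the list above can be reproduced with a small Python sketch. It assumes a modifier such as -5h@h means "go back five hours, then round down to the top of the hour"; the helper name snap_hours_back is hypothetical and the date is illustrative.

```python
from datetime import datetime, timedelta

def snap_hours_back(now, hours_back):
    """Model of -Nh@h: subtract N hours, then snap down to the hour."""
    shifted = now - timedelta(hours=hours_back)
    return shifted.replace(minute=0, second=0, microsecond=0)

now = datetime(2019, 11, 22, 14, 30)
earliest = snap_hours_back(now, 5)
latest = snap_hours_back(now, 1)

print(earliest.strftime("%H:%M"))  # 09:00
print(latest.strftime("%H:%M"))    # 13:00
```

So at 2:30 PM the search window runs from 09:00 to 13:00, exactly four hours.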

woodcock
Esteemed Legend

Try this instead:

 ... | bucket _time bins=2

Also, BE SURE TO SET YOUR PERSONAL Time zone setting: Your Name Here -> Preferences -> Time zone.
This looks like a bug and I would open a support ticket for sure.
You can add this to the end:

... | where _time >= relative_time(now(), "-5h@h") AND _time <= relative_time(now(), "-1h@h")

ericl42
Path Finder

Thanks for the quick response. I like the concept of bins so I always know it's two items I'm comparing against vs. potentially three. I tried this on my query, and the time now just says 2019-11-22 with no hour or delimiter. So basically, even though I said 2 bins, I'm only seeing one row per user ID.
