Splunk Search

How to write a search to only return data from a set time range (8:00 to 16:00) each day?

New Member

Using

| bucket span=1d _time
| stats count by _time 

and set custom time @d+8h to get TODAY'S data from 8AM onwards until @d+16h whenever I run the report.

However, I want to create a list of the results for EACH of the last 7 days that only includes data from the day 08:00 up until 16:00.

I tried to do via the custom time:
Start
-7d@d+8h
Latest
@d+16h

but this ends up including all the data from 24 hours between those two date / time ranges

I just want 08:00 to 16:00 each day

Tags (3)
0 Karma
1 Solution

Splunk Employee
Splunk Employee

Add

| eval date_hour = strftime(_time, "%H")
| search date_hour>8 date_hour<16

to your existing search and you should be good to go.

View solution in original post

Contributor

date_hour is an indexed field, so you can do this in your original query.

originial search date_hour>8 date_hour<16

The goal is to pull fewer events off the disk, and use the index wherever possible. If you can do it in the base query, then you can also use it as the query for an accelerated data model, which I'm guessing you'll want for your use case. Incidentally, I assumed that spelling each hour out with a bunch OR statements would have been more efficient, but it doesn't seem to make any difference. I think that's an improvement in recent years.

Is the reason for the timeframe to find "business hours"? The date_hour is UTC, so is not going to honor Daylight savings. Pad accordingly.

This would be a great use for an eventtype, then you can simply add to any query eventtype=businesshours. An eventtype does introduce some added load, since it is calculated any time you run a query against the events that the eventtype is configured to apply to, so for minimal load, use a macro. An eventtype is handy in other ways, of course, but it does introduce load.

0 Karma

Splunk Employee
Splunk Employee

the problem with using date_hour is that they don't reflect timezone changes - see this answer here for a detailed response.

0 Karma

Contributor

True, but you should always strive to pull less off the disk. The best answer is probably the combination -- use date_hour to get the hours that are possible, given DST, then use the search filter to throw away the one hour that isn't useful.

If using a data model, you could set a couple of fields to subsequently filter on.

dst_hour=strftime(_time, "%H")
business_hours=if(dst_hour>8 and dst_hour<16, 1, 0)

0 Karma

Contributor

Or I suppose define an eval field, and use that in the eventtype or datamodel query? I haven't used eval fields anywhere...

0 Karma

Contributor

Oh, now I read the post you linked to. I didn't realize how limited those fields actually are. Okay, so they should be used with a bucket full of caveats. Too bad.

So, the only supportable way to make this particularly efficient, if this data is going to be used a lot, is to make a data model, make an eval field that determines if the date is in scope, and then accelerate that data model. That's your only hope to make this efficient.

Splunk Employee
Splunk Employee

Add

| eval date_hour = strftime(_time, "%H")
| search date_hour>8 date_hour<16

to your existing search and you should be good to go.

View solution in original post

Influencer

@aljohnson Since you work for Splunk, I'm going to ask you if this approach is more efficient than running a bunch of appends with the desired time ranges, and if so, why?

0 Karma

Splunk Employee
Splunk Employee

vbmgarner nailed it. basically, subsearches have limitations. I'm sure you've seen people asking regularly how to avoid subsearches for that exact reason. If possible, its (generally) better to avoid them if possible. In terms of whether or not its actually more performant, I'll let someone else do a test. The search job inspector will let you know if the gain is worth it or not.

Check out the docs here.

Contributor

In all but the most extreme cases, running a single query with more arguments is going to be faster than running multiple queries and using append, just because of the time to set up each query. Append also has a time limit, as all sub-searches do.