I'm using timechart to show the number of connections we have over a collection of servers. When these servers go through a scheduled reboot, they report as if all of their connections are used, but the reboots are scheduled to happen when the server connections . What is the best way to tell timechart to ignore a specific window of time? I'd rather ignore by time than by high connection count values (I'm specifically not using the word "anomalous" because I know enough to know that it has a specific meaning in Splunk, but I've not yet learned how to leverage it).
In other words, if I'm looking at the last 24 hours, I want to ignore all values from 2:00am - 3:00am, because that's when we're cycling through a reboot of servers.
I doubt this is the best way to do it, but you could use cron to schedule the search to run every hour except from 2:00am-3:00am and put the results in a summary index. Once the data is in the summary index, you can search through the past 24 hours and you will not see the values from 2:00am-3:00am.
A little roundabout but I believe this gives you your desired result.
1) The easiest way is definitely to just whack away the events from that hour with a search like this:
<your search terms> NOT date_hour=2 | timechart sum(foo) avg(bar)
Which means timechart will just not show data for that hour.
NOTE: One more detail that you'll hit: if you're using line charts or area charts, it won't show any data points in the 2am-3am period at all, and that is probably what you want.
In technical terms this is the charting key
charting.chart.nullValueMode, and the default value is 'gaps'. If, on the other hand, you want the line(s) to drop down to zero at 2am when there's missing data, set that same key to 'zero'. And for completeness, 'connect' would draw a line from the 1am point(s) to the 3am point(s) (which you almost certainly do not want).
2) If you need something more flexible than just filtering out data on hour boundaries and the like, you can do boolean expressions with date_hour and date_minute, but it can get messy. In such a case it might be easier or nicer to use the where command combined with the relative_time function.
To filter out from 2am to 3am, it would look like this:
<your search terms> | where _time<relative_time(now(), "@d+2h") OR _time>relative_time(now(), "@d+3h") | timechart sum(foo) avg(bar)
But now you can just as easily filter from 2:15 to 2:45:
<your search terms> | where _time<relative_time(now(), "@d+135m") OR _time>relative_time(now(), "@d+165m") | timechart sum(foo) avg(bar)
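For reference, here is a minimal sketch of what those "@d+..." modifiers resolve to. It assumes a Splunk version that has the makeresults command, and window_start and window_end are just illustrative field names:

```
| makeresults
| eval window_start = strftime(relative_time(now(), "@d+135m"), "%Y-%m-%d %H:%M")
| eval window_end = strftime(relative_time(now(), "@d+165m"), "%Y-%m-%d %H:%M")
| table window_start window_end
```

The "@d" part snaps to midnight of the current day and the offset is added from there, so both boundaries always land on today's date.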
Thanks for the help so far. The example for 2:15-2:45 seems to work if my search spans the last 24 hours, but it doesn't seem to exclude that window on every day if the search window is larger (e.g. two days ago from 2:15-2:45, or three days ago from 2:15-2:45). Any recommendations on how to resolve that, or is there something I'm doing wrong?
Yeah, the search term approach will work better if you're doing it over multiple days. That would be: (NOT date_hour=2 OR (date_minute<15 OR date_minute>45))
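If you'd rather keep the where style over multiple days, one possible sketch is to compute each event's minute of the day from _time and filter on that. This is an assumption-laden example rather than a tested recipe: minute_of_day is a made-up field name, foo and bar are the same placeholders as above, and the hour and minute come from _time as rendered in the search user's time zone:

```
<your search terms>
| eval minute_of_day = tonumber(strftime(_time, "%H")) * 60 + tonumber(strftime(_time, "%M"))
| where minute_of_day < 135 OR minute_of_day > 165
| timechart sum(foo) avg(bar)
```

Because the comparison is on time of day only, it drops 2:15-2:45 on every day in the search range. It works at whole-minute granularity, so events anywhere inside the 2:15 and 2:45 minutes are dropped too.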