@rgag : the by clause is your friend. Stay in stats-land, then xyseries/timechart only if appropriate.
Here is a similar example.
I have three hosts, ww1 , ww2 , and ww3 . I'd like to see if any day count is over the 5 day moving average, right? So, instead of using timechart, which turns my host values into fields (to make it chartable), I'm going to stay within stats. However, that requires that I bin up _time myself:
sourcetype=access_combined
| bin _time span=1d
| stats count by host _time
This gives us our basic timechart, albeit row-centric and currently not chartable. That is ok. Thats what we want. Next, use streamstats instead of trendline. You were using the sma function in your example, so we'll just do the same with streamstats .
| streamstats window=5 mean(count) as sma5 by host
Crucial bit is the by host at the end. Now we get per-host moving averages. We're nearly done.
Let's make an "outlier" field. We can set the outlier to a value to something more informative than just a boolean "1", e.g., the difference from the threshold, that is, how big of a spike:
| eval outlier = if(count > (3 * sma5), count - (3 * sma5), null())
At this point, we can do a little cleaning up of the _time field, since it won't look pretty when we chart.
| eval _time = strftime(_time, "%b %d")
Lastly, we use xyseries to make the whole thing chartable.
| xyseries _time host count outlier
Here is a screenshot of an example, where, for the sake of demonstration, I made the outlier detection very sensitive (whenever the count exceeds the moving average). I also set the counts to be overlays on their own axis whereas the outliers are the default visualization type, a column.
So to get your alert for a spike over the past 24 hours, you'd also want to search on more than 24 hours so the first data point has a stable moving average, and throwaway the first few points:
index=bg_data earliest=-30h bg="*"
| bin _time span=1h
| stats count(start_time) as num_events by bg
| streamstats window=5 mean(num_events) as moving_avg by bg
| eval spike=if(num_events > (3 * moving_avg), 500, 0)
| where _time > relative_time(now(), "-24h") AND spike > 0
And your second search for the table of counts per bg,
index=bg_data earliest=-30h bg="*"
| bin _time span=1h
| stats count(start_time) as num_events by bg
| streamstats window=5 mean(num_events) as moving_avg by bg
| eval spike=if(num_events > (3 * moving_avg), 1, 0)
| where _time > relative_time(now(), "-24h") AND spike > 0
| timechart sum(spike) as spike_count by bg
^ Those likely have some mistakes in them. but the gist is there, well, at least for what I would do 🙂
... View more