index=symantec (virus OR "security risk" OR "web attack") NOT "Tracking Cookies" earliest=-30d@d latest=now | rex "(?i) name: (?P
What I am trying to do is set up an alert that runs hourly and checks whether the number of viruses Symantec saw in the last hour exceeds the predicted upper 95% bound. The search goes back 30 days in 1-hour buckets to get the most accurate prediction going forward. I do not want to alert on data from 30 days ago, only on the last hour. How can I keep the more accurate prediction from 30 days' worth of data but alert only on the last hour?
If I understand correctly, you just want to alert if the last few entries are outside the prediction.
So I'd modify your search like so:
index=symantec (virus OR "security risk" OR "web attack") NOT "Tracking Cookies" earliest=-30d@d latest=now | rex "(?i) name: (?P<virus_host>[^,]+)" | timechart span=1h count(virus_host) as count | predict count | rename upper95(prediction(count)) as upper95 | search count=* | tail 1 | where count>upper95
This filters down to the most recent hourly bucket that has a count, then keeps it only if that count is above the predicted upper bound.
You may also wish to consider the holdback setting for predict.
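As a rough sketch of how holdback could fit in here: predict accepts holdback=<n> to leave the last n data points out of the model, and future_timespan=<n> to set how many points it forecasts. Assuming you want the most recent hour judged against a forecast that has not already seen that hour, something like the following may work. The values holdback=1 and future_timespan=1 are illustrative assumptions, not tested against this data, and the quoting around the renamed field is just a defensive choice.

```
index=symantec (virus OR "security risk" OR "web attack") NOT "Tracking Cookies" earliest=-30d@d latest=now
| rex "(?i) name: (?P<virus_host>[^,]+)"
| timechart span=1h count(virus_host) as count
| predict count holdback=1 future_timespan=1
| rename "upper95(prediction(count))" as upper95
| search count=*
| tail 1
| where count>upper95
```

With these settings the last bucket is held out of the model and then forecast, so its upper95 value is not influenced by the spike you are trying to detect.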
This was too long to post as a comment on your answer, but it is exactly what I needed. My modified query is the following (I needed head instead of tail because I sort newest first):
index=symantec (virus OR "security risk" OR "web attack") NOT "Tracking Cookies" earliest=-21d@d latest=now | rex "(?i) name: (?P<virus_host>[^,]+)" | bucket span=1h _time | timechart span=1h count(virus_host) as count | predict count | rename upper95(prediction(count)) as upper95 | fieldformat upper95=round(upper95,0) | sort -_time | eval Percent=round(upper95/count*100,0) | eval PercentAbove95thPecentile=round(100-Percent,0) | fields - Percent,lower95(prediction(count)),prediction(count) | fillnull value=0 count PercentAbove95thPecentile | head 10 | where PercentAbove95thPecentile>=1
I am sure this could be cleaned up and made more efficient, especially with the eval, but this is going to do exactly what I need it to do.
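One possible cleanup, offered as an untested sketch rather than a drop-in replacement: the bucket command looks redundant because timechart span=1h already buckets by hour, and the two eval steps can be folded into one. This version assumes latest=@h is acceptable, so the newest bucket is a complete hour rather than a partial one. It also uses tail 1 without the sort, and a correctly spelled field name, PercentAbove95thPercentile, which is introduced here and is not in the original query.

```
index=symantec (virus OR "security risk" OR "web attack") NOT "Tracking Cookies" earliest=-21d@d latest=@h
| rex "(?i) name: (?P<virus_host>[^,]+)"
| timechart span=1h count(virus_host) as count
| predict count
| rename "upper95(prediction(count))" as upper95
| search count=*
| tail 1
| eval PercentAbove95thPercentile=round((count-upper95)/count*100, 0)
| where count>upper95
```

Because it keeps only the last completed hour, the alert condition can simply be "number of results is greater than 0". The final where clause fires on any count above the bound, which is slightly broader than the original >=1 percent threshold.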