Hi, I have a query that is meant to compare the longitudinal count of an event on a given day (e.g. today) with historical longitudinal percentiles.
query 1:
index=interfaces sourcesession="MICHART_SIU_HL7_INBOUND" [`define_relative_week(1)`] date_wday!=saturday date_wday!=sunday
| bin _time span=20m
| stats count as count by _time
| eval timeofday=tonumber(strftime(_time, "%H"))*3600 + tonumber(strftime(_time,"%M"))*60
| stats perc5(count) as perc5 perc95(count) as perc95 by timeofday
| eval timeofday=tostring(timeofday,"duration")
| join type=outer timeofday
    [search index=interfaces sourcesession="MICHART_SIU_HL7_INBOUND"
    | bin _time span=20m
    | eval timeofday=tonumber(strftime(_time, "%H"))*3600 + tonumber(strftime(_time,"%M"))*60
    | eval timeofday=tostring(timeofday,"duration")
    | stats count by timeofday]
| sort timeofday
where define_relative_week is a macro:
stats count | addinfo | eval earliest=(info_min_time-604800*$n$) | eval earliest=strftime(earliest,"%m/%d/%Y:%H:%M:%S") | eval latest=(info_max_time-86400) | eval latest=strftime(latest,"%m/%d/%Y:%H:%M:%S") | return earliest,latest
Basically, this search grabs historical data (the previous week), bins it into 20-minute intervals, and then computes the 5th and 95th percentiles of the count for each time-of-day bin.
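To make the intent concrete, here is a rough Python sketch of the logic the search above is meant to implement. It is only an illustration under assumptions: `historical_bands` and `percentile` are hypothetical helper names, events are plain `datetime` objects, and the nearest-rank percentile used here may not match exactly how Splunk's perc5/perc95 compute their values.

```python
from collections import defaultdict
from datetime import datetime
import math

BIN_MINUTES = 20  # mirrors `bin _time span=20m`


def percentile(values, pct):
    """Nearest-rank percentile (an assumption; Splunk may compute it differently)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def historical_bands(timestamps):
    """Return {timeofday_in_seconds: (perc5, perc95)} for weekday events."""
    # Step 1: count events per 20-minute bin (bin _time | stats count by _time).
    per_bin = defaultdict(int)
    for ts in timestamps:
        if ts.weekday() >= 5:  # skip Saturday (5) and Sunday (6)
            continue
        floored = ts.replace(minute=ts.minute - ts.minute % BIN_MINUTES,
                             second=0, microsecond=0)
        per_bin[floored] += 1

    # Step 2: group the per-bin counts by time of day, across days.
    by_timeofday = defaultdict(list)
    for bin_start, count in per_bin.items():
        timeofday = bin_start.hour * 3600 + bin_start.minute * 60
        by_timeofday[timeofday].append(count)

    # Step 3: percentiles of the counts for each time-of-day bin.
    return {tod: (percentile(counts, 5), percentile(counts, 95))
            for tod, counts in by_timeofday.items()}
```

As in the search, a bin with no events on a given day contributes nothing (not a zero) to that time of day's percentiles.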
Some of the things that I'm grappling with are:
Multiple event types.
How can I generalize this to multiple event types? For example, I have many types of events (e.g. with different sourcesession values), and recomputing the historical percentiles for each event type would be too computationally expensive.
Acceleration.
Suppose for now that there is only one event of interest. How can I further accelerate this search? I need to set up an alarm that compares the current count with the 5th and 95th percentiles for the same time period; the saved alarm/search will run every 15 minutes and inspect the previous 15 minutes to see whether there is any anomaly.
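The check I have in mind for the alarm is roughly the following Python sketch. The names `timeofday_bucket` and `check_count` are hypothetical, and `bands` is assumed to be a precomputed mapping from time of day (in seconds) to a (perc5, perc95) pair. It also assumes the current count covers the same 20-minute bin width as the historical percentiles; a 15-minute count would not be directly comparable.

```python
from datetime import datetime

BIN_SECONDS = 20 * 60  # assumed to match the 20-minute historical bins


def timeofday_bucket(ts):
    """Floor a timestamp to the start of its time-of-day bin, in seconds."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    return seconds - seconds % BIN_SECONDS


def check_count(ts, count, bands):
    """Classify a count against the historical band for the same time of day."""
    band = bands.get(timeofday_bucket(ts))
    if band is None:
        return "no-baseline"  # no historical data for this time of day
    perc5, perc95 = band
    if count < perc5:
        return "low"
    if count > perc95:
        return "high"
    return "ok"
```

For example, with `bands = {32400: (3, 5)}`, a count of 9 observed at 09:07 would be flagged as "high".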
I tried the following query, which filters out irrelevant time ranges, but to my surprise it runs slower than the original query. Any insight into why?
query 2:
| gentimes end=-1 increment=24h [stats count | addinfo | eval start=(info_min_time-604800*2) | eval start=strftime(start,"%m/%d/%y:%H:%M:00")| return start]
| rename starttime as earliest
| eval latest=earliest+20*60
| sort - earliest
| map maxsearches=99999
search="search earliest=$earliest$ latest=$latest$ index=interfaces sourcesession=MICHART_SIU_HL7_INBOUND " | ...