Hi to4kawa,
I am trying to handle a couple of scenarios based on the above query.
I want to alert on the transaction details of failures that fall into the alert=1 category, and I have written the query below to achieve this:
index=dte_fios sourcetype=dte2_Fios FT=*FT Error_Code!=0000 earliest=04/20/2020:11:00:00 latest=04/20/2020:13:00:00
| bin _time span=15m
| stats count as Total, count(eval(Error_Code!="0000")) AS Failure by FT,_time
| eval Failurepercent=round(Failure/Total*100)
| multikv forceheader=1
| table _time,FT,Total,Failure,Failurepercent
| autoregress Failurepercent p=3 as F_45MinsAgo
| autoregress Failurepercent p=2 as F_30MinsAgo
| autoregress Failurepercent p=1 as F_15MinsAgo
| lookup ftthresholdlkp FT
| eval alert=case(Minhits<=Total AND Total<=Maxhits AND Failurepercent > Failure_Threshold1, 1
, (Minhits<=Total AND Total<=Maxhits) AND (Failure_Threshold2 <= Failurepercent AND Failurepercent <= Failure_Threshold3) AND (Failure_Threshold2 <= F_15MinsAgo AND F_15MinsAgo <= Failure_Threshold3) AND (Failure_Threshold2 <= F_30MinsAgo AND F_30MinsAgo <= Failure_Threshold3) AND (Failure_Threshold2 <= F_45MinsAgo AND F_45MinsAgo <= Failure_Threshold3), 1
, (Minhits<=Total AND Total<=Maxhits) AND (Failure_Threshold3 <= Failurepercent AND Failurepercent <= Failure_Threshold4) AND (Failure_Threshold3 <= F_15MinsAgo AND F_15MinsAgo <= Failure_Threshold4) AND (Failure_Threshold3 <= F_30MinsAgo AND F_30MinsAgo <= Failure_Threshold4), 1
, (Minhits<=Total AND Total<=Maxhits) AND (Failure_Threshold4 <= Failurepercent AND Failurepercent <= Failure_Threshold5) AND (Failure_Threshold4 <= F_15MinsAgo AND F_15MinsAgo <= Failure_Threshold5), 1
, true(), 0)
| where alert=1
| map search="search index=dte_fios sourcetype=dte2_Fios FT=$FT$ earliest=04/20/2020:12:45:00 latest=04/20/2020:13:00:00 |table _time,WPID,MGRID,Host,System,DIP_Command,CID,DTE_Command,FT,OSS,Error_Code,Error_Msg"
If you look at line 3 of my query, I run stats by FT,_time. I need this because I want to compare the failures in every 15-minute interval for each domain in my application. For whichever domain has the highest failures, I want to send that domain's transaction details for the last 15 minutes in my alert email, which is what the map query does.
I have two questions here:
1. If multiple domains satisfy the alert condition, how can I keep only the top domain from the where output, so that only that FT is used in my second query?
2. Suppose my alert runs at 1 PM and I have records for 12:45 - 1 PM and for 12:15 - 12:30 PM, but none for 12:30 - 12:45 PM. In that case F_15MinsAgo shows the Failurepercent value of the 12:15 - 12:30 PM interval, because there is no row for the 12:30 - 12:45 PM interval. This is incorrect. How can I handle it?
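For point 1, one possible direction is to sort the rows that pass the where clause and keep only the first one before the map command. This is an untested sketch, and it assumes "top" means the highest Failurepercent, with Failure as the tie-breaker:

```
| where alert=1
| sort 0 -Failurepercent, -Failure
| head 1
```

For point 2, one possible direction is to stop relying on row position and instead shift each row forward in time, then regroup by FT and _time, so that a lag value is filled only when a row for that exact earlier interval exists. This is also an untested sketch. It would take the place of the three autoregress lines, the temporary fields pct and lag are illustrative names that are not part of my query, and 900 is the number of seconds in the 15m span:

```
| eval pct=Failurepercent, lag=mvrange(0,4)
| mvexpand lag
| eval _time=_time+lag*900
| eval Total=if(lag==0,Total,null()), Failure=if(lag==0,Failure,null())
| eval Failurepercent=if(lag==0,pct,null()), F_15MinsAgo=if(lag==1,pct,null()), F_30MinsAgo=if(lag==2,pct,null()), F_45MinsAgo=if(lag==3,pct,null())
| stats max(Total) as Total, max(Failure) as Failure, max(Failurepercent) as Failurepercent, max(F_15MinsAgo) as F_15MinsAgo, max(F_30MinsAgo) as F_30MinsAgo, max(F_45MinsAgo) as F_45MinsAgo by FT, _time
| where isnotnull(Total)
```

With this, the 12:45 row should get an empty F_15MinsAgo when there is no 12:30 row, instead of the 12:15 value. Because the regrouping is by FT, a lag value would also never be taken from a different FT. If an empty interval should count as 0% rather than as missing, a fillnull on the three lag fields could follow.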