Hello,
I am trying to generate an alert based on response times. In a given timeframe, if more than 2% of requests have a response time above 8000 ms (resp_time > 8000), the alert should fire.
So far I am using the query below, and Err comes back as null:
index=AlwaysUseIndex service_name=MyService request_host=something.com "parameter_check" | rex field=route "^(?<route>.*)\?" | eval pTime = response_time | stats count as Volume by route | appendcols [ search pTime > 8000 | stats count as Err by route ] | fields route,Volume,Err
This is only a test. Once I get some output, I want to add the percentage calculation below.
| eval Percentage=round((Volume/Total)*100,2) | table route, Total,Err,Percentage
Desired final output:
Route | Total | Err | Percentage
I tried a few other options as well. Maybe it's a brain fade after working for 12 hours, but I cannot get the Err count divided by the total to produce percentages.
With a different approach I am able to get percentages, but I am not sure how to alert on them.
This is the query I used to get percentages:
index=ALwaysUseIndex service_name=MyService request_host=something.com "parameter_check" | rex field=route "^(?<route>.*)\?" | eval pTime = total_time | eval TimeFrames = case(pTime<=1000, "0-1", pTime>1000 AND pTime<=3000, "1-3", pTime>3000 AND pTime<=5000, "3-5", pTime>5000 AND pTime<=8000, "5-8", pTime>8000, ">8", pTime>20000, "ReallyBad") | stats count as CallVolume by route, TimeFrames | eventstats sum(CallVolume) as Total by route | eval Percentage=round((CallVolume/Total)*100,2) | sort by route, -CallVolume | chart values(Percentage) over route by TimeFrames | sort -TimeFrames
But I am not sure how to alert based only on the pTime > 8000 and pTime > 20000 buckets.
Thanks
@rakeshreddy1230, can you try the query below? You can set the alert to trigger when the number of results is greater than zero.
index=AlwaysUseIndex service_name=MyService request_host=something.com "parameter_check"
| rex field=route "^(?<route>.*)\?"
| eval pTime = total_time
| eval TimeFrames = case(pTime<=1000, "0-1", pTime>1000 AND pTime<=3000, "1-3", pTime>3000 AND pTime<=5000, "3-5", pTime>5000 AND pTime<=8000, "5-8", pTime>8000, ">8", pTime>20000, "ReallyBad")
| stats count as CallVolume by route, TimeFrames
| eventstats sum(CallVolume) as Total by route
| eval Percentage=round((CallVolume/Total)*100,2)
| stats max(Total) as Total max(Percentage) as Percentage by route TimeFrames
| where Percentage > 2 AND (TimeFrames=">8" OR TimeFrames="ReallyBad")
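If you only need the share of slow calls per route, a shorter variant that skips the buckets might also work. This is just a sketch under the same assumptions as the query above (same index, same filters, total_time in milliseconds). The field names Over8s, Over20s, PctOver8s and PctOver20s are made up for illustration. It counts the slow calls with count(eval(...)) in the same stats pass as the total, so no appendcols or eventstats is needed. One thing to keep in mind: case() returns the first branch that matches, so in the bucketed query calls over 20000 ms already match pTime>8000 and end up in ">8", never in "ReallyBad". Counting each threshold separately avoids that.

```
index=AlwaysUseIndex service_name=MyService request_host=something.com "parameter_check"
| rex field=route "^(?<route>.*)\?"
| eval pTime = total_time
| stats count as Total, count(eval(pTime>8000)) as Over8s, count(eval(pTime>20000)) as Over20s by route
| eval PctOver8s = round((Over8s/Total)*100, 2), PctOver20s = round((Over20s/Total)*100, 2)
| where PctOver8s > 2
| table route, Total, Over8s, PctOver8s, Over20s, PctOver20s
```

As with the query above, the alert can trigger when the number of results is greater than zero. Over8s includes the calls counted in Over20s, so PctOver8s is the full "slower than 8000 ms" percentage from the original requirement. If you want a separate alert for the really bad calls, change the where clause to test PctOver20s against whatever threshold suits you.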