Hi
How can I find continuously occurring events?
e.g.
1- I have a field called "response time".
A high "response time" every now and then is not an issue, but if it occurs continuously, that's not normal (over a span of milliseconds, seconds, or minutes).
2- I have an "error code" field.
Error code 404 is normal occasionally, but if it occurs continuously, it means something is wrong (over a span of milliseconds, seconds, or minutes).
FYI: this is a huge log, so consider that performance is an important factor.
Any idea?
Thanks,
It seems that you're trying to do some service monitoring.
There is an app for it and it's called ITSI 😉
But to solve your problem we should first define what you mean by "continuously".
I'd approach it by doing a timechart of the ratio of 404s or long requests against all requests, and checking whether the ratio exceeds some threshold. This might not be exactly what you specified, but I think it's a relatively good way to monitor errors.
Of course you could try to implement your spec literally and check for a continuous stream of long or erroneous requests (for example with transaction), but it will be neither practical nor efficient.
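If you do want to detect literal consecutive runs, a cheaper alternative to transaction is streamstats with reset_on_change. This is only a sketch; the field name code and the threshold of 10 consecutive errors are placeholders you'd adapt to your data:

```
<your search>
| sort 0 _time
| eval is_err=if(code=404, 1, 0)
| streamstats reset_on_change=true count as run_length by is_err
| where is_err=1 AND run_length>=10
```

The sort ensures events are processed in chronological order (searches normally return newest first); reset_on_change restarts the counter whenever is_err flips, so run_length counts the length of the current consecutive run. It still has to stream over every event, though, so on a huge index it's no free lunch either.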
So what is the best way to do this?
As I mentioned, this is a huge log file (it's not realtime), and I'm looking for a logical way to do this.
Any other idea?
Thanks
As I said, I'd do something like

<your search> | timechart span=1m count(eval(code="404")) as errors count as allreqs | eval ratio=errors/allreqs

and see if in any of your periods the ratio exceeds, for example, 20%.
If the data set is huge, you might want to accelerate the report.
Or, even better, if you have the data aligned with the Web CIM model, you can accelerate the datamodel (if you don't have it accelerated already) and search from the datamodel, not from the original raw data.
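As a sketch of the datamodel approach, assuming your data is mapped to the CIM Web datamodel (the field name Web.status comes from the CIM; adjust the datamodel and field names to your environment):

```
| tstats count from datamodel=Web by _time span=1m Web.status
| eval errors=if('Web.status'=404, count, 0)
| timechart span=1m sum(errors) as errors sum(count) as allreqs
| eval ratio=errors/allreqs
| where ratio > 0.2
```

With the datamodel accelerated, tstats reads from the pre-built summaries instead of scanning raw events, which is usually orders of magnitude faster on large indexes.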
As I mentioned, seconds and milliseconds matter here, but your span is set to one minute. And if I change it to one second, it definitely has a huge impact: the search takes a long time to complete and the chart fails to load in the visualization.
As I wrote before - accelerate your report, use accelerated datamodel.
But still, if you want a near-realtime response... well, you chose the wrong tool. If you have huge amounts of data to process, whatever search you write still has to scan all those events one way or another.
BTW, aggregating at the second or millisecond level seems pointless to me, but who am I to judge.