Consolidated Alerting using TOP command

Communicator

Hi Team,

We have an application platform that acts as a gateway for multiple applications (1000+ apps), and we have ingested the platform logs into Splunk.

The platform team wants to set up alerts for each application when the number of HTTP 500 requests breaches a threshold, say 5% of the total requests for that app. The alert schedule is every 5 minutes. Each alert should be sent to the corresponding app team, because with 1000+ apps the platform team cannot handle them all.

I am using the top command with limit=0. This gives percentages relative to the total events across all applications in the last 5 minutes, but I want to treat each application's own count as 100%, then calculate the percentage of 500 requests and check whether the threshold is violated.

My current query:

index=abc | top HttpStatus app limit=0

This is what my current query returns. The percentages for App1 and App2 together add up to 100%:

HttpStatus  app        count    percent
400         App1        30      0.091609
200         App1        15      0.045804
500         App1        6       0.018322
200         App2        3813    11.643459
400         App2        2       0.006107
500         App2        28882   88.194699

This is the output I want, so that the alert can trigger at the app level: the App1 breakdown sums to 100%, and likewise the App2 breakdown sums to 100%. I can then link it to a lookup, check whether the 500 requests have exceeded the threshold percentage, and trigger the alert.

HttpStatus  app        count    percent
400         App1        30      58.82352941
200         App1        15      29.41176471
500         App1        6       11.76470588
200         App2        3813    11.66162033
400         App2        2       0.006116769
500         App2        28882   88.3322629

I want to use a single consolidated alert that triggers on each application's threshold violation (I plan to use a lookup for the thresholds), as we do not want to set up 1000-odd individual, application-specific alerts.

Can you share your thoughts on whether there is a way to achieve this or something similar?

Thanks!


Champion

Try:

index=abc | top HttpStatus app limit=0 | eventstats sum(count) AS app_total BY app | eval app_percent=(count/app_total)*100
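For the threshold check, a rough sketch of how the same search could be extended. It assumes a lookup called example_app_thresholds with fields app and threshold_percent; both the lookup name and the threshold_percent field are made up for illustration and are not from your environment, so adjust them to whatever your lookup actually contains:

```spl
index=abc
| top HttpStatus app limit=0
| eventstats sum(count) AS app_total BY app
| eval app_percent=(count/app_total)*100
| search HttpStatus=500
| lookup example_app_thresholds app OUTPUT threshold_percent
| where app_percent > threshold_percent
```

With your sample data and a 5% threshold for both apps, this should leave one row per app: App1 at about 11.76% and App2 at about 88.33%. Apps with no matching row in the lookup get no threshold_percent and are dropped by the final where. You could then have the alert fire when the number of results is greater than 0, and the "For each result" trigger option may be a good fit for notifying each app team separately.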


Communicator

Thank you micahkemp, it worked like a charm.
