How to search Apache access logs and alert if status code 500 appears for more than 10 minutes?

krishnacasso
Path Finder

I am new to Splunk. We will be using it to monitor our Apache logs, and I need to configure an alert on the Apache access log. If status 500 appears in the access log for more than 10 minutes, I need to get an alert saying the Apache services are down.

xx.xx.xx.xx - - [2/Feb/2016:12:44:05 -0600] 7765 "GET /forms/login.fcc HTTP/1.1" 200 3094 "-" "-"
xx.xx.xx.xx - - [2/Feb/2016:12:44:06 -0600] 8704 "GET /forms/login.fcc HTTP/1.1" 200 3094 "-" "-"
xx.xx.xx.xx - - [2/Feb/2016:12:44:06 -0600] 12730 "GET /forms/login.fcc HTTP/1.1" 200 3094 "-" "-"
xx.xx.xx.xx - - [2/Feb/2016:12:44:07 -0600] 10192 "GET /forms/login.fcc HTTP/1.1" 200 3094 "-" "-"
xx.xx.xx.xx - - [2/Feb/2016:12:44:07 -0600] 8749 "GET /forms/login.fcc HTTP/1.1" 500 3094 "-" "-"
xx.xx.xx.xx - - [2/Feb/2016:12:44:07 -0600] 8222 "GET /forms/login.fcc HTTP/1.1" 500 3094 "-" "-"
xx.xxx.xx.xx - - [2/Feb/2016:12:44:08 -0600] 9021 "GET /forms/login.fcc HTTP/1.1" 500 3094 "-" "-"
xx.xxx.xx.xx - - [2/Feb/2016:12:44:08 -0600] 9306 "GET /forms/login.fcc HTTP/1.1" 500 3094 "-" "-"
xx.xxx.xx.xx - - [2/Feb/2016:12:44:08 -0600] 8598 "GET /forms/login.fcc HTTP/1.1" 500 3094 "-" "-"
xx.xxx.xx.xx - - [2/Feb/2016:12:44:09 -0600] 8482 "GET /forms/login.fcc HTTP/1.1" 200 3094 "-" "-"
xx.xxx.xx.xx - - [2/Feb/2016:12:44:09 -0600] 8647 "GET /forms/login.fcc HTTP/1.1" 200 3094 "-" "-"
xx.xxx.xx.xx - - [2/Feb/2016:12:44:09 -0600] 8750 "GET /forms/login.fcc HTTP/1.1" 200 3094 "-" "-"

I need to get an alert if 500 appears for more than 10 minutes.
Can you help with writing a search for this?

Thanks.

MuS
Legend

Hi krishnacasso,

Welcome to Splunk 🙂

If you already have a field called status that holds the status code, you can run a search like this:

your base search to get the apache logs earliest=-11min@min | bin _time span=1min | stats count(eval(status=="500")) AS count by _time | where count>=10

This counts the events with status="500" per minute and returns every minute in which the count is greater than or equal to 10. You can save this as an alert: http://docs.splunk.com/Documentation/Splunk/6.3.3/Alert/Aboutalerts
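To make the per-minute logic concrete outside of Splunk, here is a rough Python sketch of the same idea: bucket events by minute, count the 500s, and keep the minutes at or above the threshold. The function names and the demo events are made up for illustration. The `sustained` helper is my own assumption about what "500 appears for more than 10 minutes" could mean (at least one 500 in every one of the last N minutes); it is not part of the search above.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_500_per_minute(events):
    """Count status-500 events per one-minute bucket.

    `events` is an iterable of (datetime, status_string) pairs.
    """
    counts = Counter()
    for ts, status in events:
        if status == "500":
            # Truncate the timestamp to the start of its minute.
            counts[ts.replace(second=0, microsecond=0)] += 1
    return counts


def minutes_over_threshold(counts, threshold=10):
    """Return the minutes whose 500 count is >= threshold, oldest first."""
    return sorted(m for m, n in counts.items() if n >= threshold)


def sustained(counts, window_end, minutes=10):
    """Assumed variant: True if each of the `minutes` one-minute buckets
    immediately before `window_end` contains at least one 500."""
    return all(
        counts.get(window_end - timedelta(minutes=i), 0) > 0
        for i in range(1, minutes + 1)
    )


# Hypothetical demo data: 12 errors in 12:44, 3 errors in 12:45, one 200.
base = datetime(2016, 2, 2, 12, 44)
events = [(base + timedelta(seconds=i), "500") for i in range(12)]
events += [(base + timedelta(minutes=1, seconds=i), "500") for i in range(3)]
events += [(base + timedelta(seconds=30), "200")]

counts = count_500_per_minute(events)
flagged = minutes_over_threshold(counts)
```

With this demo data only the 12:44 bucket reaches the threshold of 10, while `sustained` only reports True for windows in which every minute saw at least one 500.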

If the status field is not yet available, run the above search with a regex to extract the field:

your base search to get the apache logs earliest=-11min@min | rex "\"\s(?<status>\d+)\s\d" | bin _time span=1min | stats count(eval(status=="500")) AS count by _time | where count>=10
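If you want to check the extraction pattern before putting it into a search, a quick Python sketch like the one below can help. It assumes the intended pattern is: a closing double quote, whitespace, the status digits, whitespace, then a digit. Note that Python writes named groups as `(?P<status>...)`, whereas rex uses `(?<status>...)`. The helper name `extract_status` is hypothetical; the two sample lines are taken from the question.

```python
import re

# Same idea as the rex pattern: quote, space, status digits, space, digit.
STATUS_RE = re.compile(r'"\s(?P<status>\d+)\s\d')


def extract_status(line):
    """Return the status code as a string, or None if the line does not match."""
    match = STATUS_RE.search(line)
    return match.group("status") if match else None


sample_lines = [
    'xx.xx.xx.xx - - [2/Feb/2016:12:44:05 -0600] 7765 "GET /forms/login.fcc HTTP/1.1" 200 3094 "-" "-"',
    'xx.xx.xx.xx - - [2/Feb/2016:12:44:07 -0600] 8749 "GET /forms/login.fcc HTTP/1.1" 500 3094 "-" "-"',
]

statuses = [extract_status(line) for line in sample_lines]
```

The number right after the timestamp (7765, 8749) is not picked up, because the pattern only matches digits that directly follow a double quote and a space.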

Hope this helps to get you started ...

cheers, MuS

PS: This is untested, because I don't have a Splunk instance handy right now 😉
