We have a scheduled job that writes a log file in the following format:
19.06.2014 04:00:00 STARTED
19.06.2014 04:00:00 Test Log
19.06.2014 04:00:05 blablabal
19.06.2014 04:00:05 **
19.06.2014 04:00:05 Template-***
19.06.2014 04:00:05 ***
19.06.2014 04:00:10 ENDED
How can I monitor this log file so that an alert is triggered if an event has a STARTED line but no ENDED line within a timespan of 5 minutes?
I think I will define "STARTED" as the start of an event and "ENDED" as its end, meaning that if the job hangs, the whole "event" will never finish.
You could achieve this by using the transaction command with a few parameters to match your needs. Try this:
yourbasesearch | transaction startswith="STARTED" endswith="ENDED" maxspan=5m keepevicted=true | search closed_txn=0 | ...
This groups events into transactions that begin with a STARTED message and end with an ENDED message, with a maximum transaction duration of 5 minutes. keepevicted=true keeps the "unclosed" (evicted) transactions in the results, and the final search command filters for only those. Anything this search returns should raise an alert, per your requirements.
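If it helps to see the rule outside of Splunk, here is a minimal Python sketch of the same idea. It is not how transaction is implemented internally; it only mirrors the alert condition (a STARTED with no ENDED within 5 minutes). The function names, the `now` parameter, and the assumption that each line is "date time message" in the format from the question are all mine.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%d.%m.%Y %H:%M:%S"   # matches "19.06.2014 04:00:00"
MAX_SPAN = timedelta(minutes=5)   # same idea as maxspan=5m


def parse_line(line):
    """Split a log line into (timestamp, message). Assumes 'date time message'."""
    parts = line.split(maxsplit=2)
    message = parts[2].strip() if len(parts) > 2 else ""
    return datetime.strptime(f"{parts[0]} {parts[1]}", TS_FORMAT), message


def find_unclosed_starts(lines, now):
    """Return timestamps of STARTED lines that have no ENDED within MAX_SPAN."""
    unclosed = []
    open_start = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        ts, message = parse_line(line)
        if message == "STARTED":
            if open_start is not None:
                # a new run began before the previous one ever ended
                unclosed.append(open_start)
            open_start = ts
        elif message == "ENDED" and open_start is not None:
            if ts - open_start > MAX_SPAN:
                # it ended, but too late to count as closed
                unclosed.append(open_start)
            open_start = None
    # the last run is still open and has exceeded the allowed span
    if open_start is not None and now - open_start > MAX_SPAN:
        unclosed.append(open_start)
    return unclosed
```

Calling find_unclosed_starts(lines, datetime.now()) on the lines of the log file returns an empty list when every run finished in time; any timestamp it returns corresponds to a run that would trigger the alert.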
You could try this:
index=foo sourcetype=bar STARTED OR ENDED | transaction keepevicted=t startswith="STARTED" endswith="ENDED" maxspan=300s | search eventcount=1 STARTED
That should give you STARTED events that don't have an ENDED within 300 seconds. Note that grouping events into transactions like this works miles better if you have a unique transaction ID. If possible, you should consider adding one to your data at the source.