Hello,
I am indexing a log file whose events contain either the word "offline" or the word "online". I want to be alerted when an offline event occurs, so I am currently searching for "offline" in my sourcetype and sending out an email. Then I want to be alerted when the online event occurs. Simple, right? Similar query, similar alert.
The problem is that once the log file records offline, it repeats the word MANY times until online occurs, and I don't want to bombard people with another email alert every 5 minutes. I have been playing around with throttling, but it is not perfect and I can miss a real offline alert.
Can anyone help me come up with a search and alert methodology that is foolproof, so that I always get an alert when offline and an alert when online?
Thanks!
Do you have a host or server associated with the offline and online states?
What you could do is write the state of any offline hosts out to a file (|outputcsv or |outputlookup). If a host already exists in the file, don't raise an alert about it. Remove the host from the file once you see a corresponding online message.
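To make the idea concrete, here is a rough sketch of that bookkeeping in Python rather than SPL. It is only an illustration of the logic, not a Splunk search: the function name transitions, the host name web01, and the (host, state) event shape are all made up for the example, and the in-memory set stands in for the file you would maintain with |outputlookup.

```python
def transitions(events, offline_hosts=None):
    """Return (alerts, offline_hosts) with one alert per state change.

    events: iterable of (host, state) pairs in time order (hypothetical shape).
    offline_hosts: hosts already known to be offline from earlier runs;
    this plays the role of the lookup file in the suggestion above.
    """
    offline_hosts = set(offline_hosts or ())
    alerts = []
    for host, state in events:
        if state == "offline" and host not in offline_hosts:
            # First offline message for this host: record it and alert once.
            offline_hosts.add(host)
            alerts.append((host, "offline"))
        elif state == "online" and host in offline_hosts:
            # Host has recovered: forget it and alert once.
            offline_hosts.remove(host)
            alerts.append((host, "online"))
        # Repeated offline or online messages fall through with no alert.
    return alerts, offline_hosts


# Hypothetical sample: offline is logged three times before online appears.
events = [
    ("web01", "offline"),
    ("web01", "offline"),
    ("web01", "offline"),
    ("web01", "online"),
    ("web01", "online"),
]
alerts, still_offline = transitions(events)
print(alerts)
```

Because the set of offline hosts is passed in and returned, each scheduled run can pick up where the previous one left off, so a host that stays offline across several runs still produces only one email.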
I have the events already indexed in Splunk, so I can already query whether the service is offline or online. The problem is finding a methodology for alerting via email. I could simply run an alert every 5 minutes and send an email, but would you really want to receive an email every 5 minutes telling you the service is down?
That's what I am trying to avoid. I want to figure out a way to make sure we get an email just once when offline and then once when back online.
Thanks!