I am very new to Splunk, so forgive me if this answer is obvious.
I have some freezers which contain some special stuff in them. This week, we had 2 freezers go down, unexpectedly. All their stuff was ruined.
I'd like to set up Splunk to receive the log files at a regular interval. BUT, if the freezer gets unplugged or goes down, and the log does not generate, I'd like to set up an alert to notify us. ("Hey, we haven't heard from Freezer1, please check to see if he's ok")
Is this even possible with Splunk>?
It is possible... You would need to either build or provide a lookup of the freezers you expect to show up, then on a regular interval (slightly larger than your reporting interval) check to see what freezers reported in over the previous interval.
Assuming the freezer identifier is the host (or another index-time extracted field, since tstats only works on indexed fields), and you have a freezerinfo lookup that lists all the expected values of that field, your search could potentially look something like:
| tstats count where index=<yourindex> sourcetype=<freezer_sourcetype> by host | inputlookup append=true freezerinfo | stats count by host | where count=1
but that's a lot of assumptions of course :). Check out the docs for tstats, inputlookup, stats, and where if you're curious about this theory.
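One caveat with count=1: a host that reports in but is missing from the lookup would also come back with a count of 1. A possible variant that only returns expected freezers with no events is sketched below. It assumes the freezerinfo lookup has a column named host, and the index and sourcetype placeholders are the same stand-ins as above:

```
| tstats count where index=<yourindex> sourcetype=<freezer_sourcetype> by host
| inputlookup append=true freezerinfo
| fillnull value=0 count
| stats sum(count) as events by host
| where events=0
```

The rows appended from the lookup have no count, so fillnull gives them 0. After the stats, any freezer whose total is still 0 is listed in the lookup but sent nothing during the search window.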
I don't think you mentioned what version of Splunk you're on, but if it's pre-6.3 then the Deployment Monitor app has this built in:
https://splunkbase.splunk.com/app/1294/#/overview
We use a modified version of the alerting that comes with the app, but it's a nice and quick starting point.
If you're on 6.3+, then the new built-in Distributed Management Console has this:
http://docs.splunk.com/Documentation/Splunk/latest/DMC/Platformalerts
I've never used it, so I'm afraid I can't comment on how comparable it is. I'm sure someone else has experience with it and can comment, though.
Thanks. I checked out your hyperlinks, too.
So I would:
1. Create a lookup table.
2. Create a report that runs at intervals slightly longer than how often the log is received (using a search similar to the one above).
3. Add an alert to the report that will email people when a freezer doesn't transmit its log.
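For reference, steps 2 and 3 could be combined into one scheduled alert. Below is a rough sketch of a savedsearches.conf stanza, not a tested configuration. The stanza name, the 15-minute reporting interval (hence the 20-minute schedule and window), and the email address are all made-up placeholders, and the search is the one suggested above. The same settings can also be entered through the Save As > Alert dialog in Splunk Web.

```
[Freezer missing check-in]
search = | tstats count where index=<yourindex> sourcetype=<freezer_sourcetype> by host | inputlookup append=true freezerinfo | stats count by host | where count=1
enableSched = 1
cron_schedule = */20 * * * *
dispatch.earliest_time = -20m
dispatch.latest_time = now
counttype = number of events
relation = greater than
quantity = 0
action.email = 1
action.email.to = freezer-alerts@example.com
action.email.subject = Freezer has not reported in
```

With these settings the alert fires whenever the search returns at least one row, meaning at least one expected freezer has not been heard from in the last 20 minutes.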
Our network adapter for the freezer is on back order, so I won't be able to start this for a week or so.
Thank you so much. Any additional guidance or suggestions you may have are greatly appreciated!