We've installed Splunk for Nagios v1.1.1 on Splunk v4.2.2, and when running the saved searches I receive no results. Looking closer, I see that some of the field extractions are not working. Specifically, the field nagiosevent (and we assume others) does not appear in the search results.
We have Nagios critical alerts being logged to nagios.log and indexed by Splunk, but the Splunk for Nagios reports/searches do not pick these up. Can you point us in the right direction to get this working?
What version of Nagios are you running? I have tested Nagios v3.2.1 with Splunk for Nagios.
Your log output looks non-standard, hence the field extractions in Splunk for Nagios aren't working for you.
Here is a snippet of a similar event from our environment:
[1308619187] SERVICE ALERT: server33;Load;WARNING;HARD;3;WARNING - load average: 7.78, 6.14, 4.79
Note that the timestamp above is in Unix epoch time, and SERVICE ALERT follows immediately after it.
As your Nagios event data is being routed through syslog, it contains a couple of extra fields between the date and SERVICE ALERT, notably the hostname of the Nagios server and the process name (i.e. nagios).
It is quite simple to change the relevant field extraction to work with your event data; just update the following configuration file:
$SPLUNK_HOME/etc/apps/SplunkForNagios/default/props.conf
Replace the following field extraction:
EXTRACT-nagiosevent = \[\d+] (?P<nagiosevent>[^:]*)(?=)
with this field extraction:
EXTRACT-nagiosevent = \snagios:\s(?P<nagiosevent>[^:]*)(?=)
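As a quick sanity check, you can try both patterns against sample events outside Splunk. A minimal sketch in Python follows (Splunk's EXTRACT uses PCRE, but these particular patterns behave the same way in Python's `re`; the sample events are taken from this thread):

```python
import re

# Original extraction: expects an epoch timestamp in square brackets
original = re.compile(r"\[\d+] (?P<nagiosevent>[^:]*)")

# Updated extraction: expects the syslog-style "nagios:" process prefix
updated = re.compile(r"\snagios:\s(?P<nagiosevent>[^:]*)")

# Standard nagios.log format (epoch time, SERVICE ALERT immediately after)
standard_event = ("[1308619187] SERVICE ALERT: server33;Load;WARNING;"
                  "HARD;3;WARNING - load average: 7.78, 6.14, 4.79")

# Syslog-routed format (human-readable time, hostname, process name)
syslog_event = ("Jun 21 10:53:17 lda nagios: SERVICE ALERT: "
                "ird8.st;TRAP;WARNING;HARD;1;262077 131038 1 0 34 output")

print(original.search(standard_event).group("nagiosevent"))  # SERVICE ALERT
print(updated.search(syslog_event).group("nagiosevent"))     # SERVICE ALERT
print(original.search(syslog_event))                          # None - no match
```

This shows why the original extraction returns nothing for syslog-routed events: there is no `[epoch]` prefix to anchor on, so the updated pattern anchors on the ` nagios: ` process prefix instead.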
All the best,
Luke 🙂
I have updated this answer with the relevant field extraction 🙂
I know where our issue is now.
We send our Nagios log to rsyslog (USER facility), which forwards it to our central log host, where it is indexed into Splunk. The extra prepended fields (a human-readable timestamp in place of epoch time, the hostname, and the process name) are added by rsyslog.
Can you give us some pointers on how to update the field extractions to account for this?
Many Thanks
The Nagios app expects there to be a "nagios" index in Splunk. When we originally set it up, our Nagios data was going to "main", and we had similar issues. It was easier in the long run to create the new index and reindex our Nagios data into it.
Still no luck.
We have all our data being indexed to a nagios index with sourcetype nagios, as per the install instructions.
When I do the search:
index="nagios"
Here is a snippet of the results:
Jun 21 10:53:17 lda nagios: SERVICE ALERT: ird8.st;TRAP;WARNING;HARD;1;262077 131038 1 0 34 output IP:TS Drop Pkts TT1222 major
Jun 21 10:53:07 leda nagios: SERVICE ALERT: sw4.syd.i;FastEthernet0/17 - sv21.sd - eth0 - LB1;OK;SOFT;3;FastEthernet0/17:UP (in=2028.2Kbps/out=736.1Kbps/errors-in=0.0/errors-out=0.0/discard-in=0.0/discard-out=0.0):(1 UP): OK
dwaddle: agreed, I wrote the app and all saved searches and dashboards expect the nagios data to be in an index called nagios 🙂
The original app may have been written for logs of a slightly different format than what you are sending into Splunk from your implementation. Can you post a sample snippet of the logs?