We are looking to take an enterprise-level approach to monitoring critical device logging. We have a list of several hundred critical devices that we need to monitor for the presence of logs. Our critical device list is stored in a lookup, critical_device_sample, and looks like this:
SplunkHost,PairGroup,HoursLag,Index
10.1.2.x,,1,networkdevice
10.1.1.x,,1,networkdevice
switcha-tx,B,1,networkdevice
switchb-tx,B,1,networkdevice
switcha-ma,,1,networkdevice
neversentlogs,,2,proxy
proxy-ma,,2,proxy
proxy-tx,,2,proxy
proxy1-ga,A,1,proxy
proxy2-ga,A,1,proxy
The search:
| inputlookup critical_device_sample
| eval SplunkHost=lower(SplunkHost)
| join type=outer SplunkHost
    [| metadata type=hosts index=proxy index=networkdevice
    | rename totalCount as Count, host as SplunkHost, lastTime as "Last Event"
    | eval actualhourslag=(now()-'Last Event')/60/60
    | eval SplunkHost=lower(SplunkHost)]
| fieldformat "Last Event"=strftime('Last Event', "%c")
| where actualhourslag>HoursLag OR isnull(actualhourslag)
The above query works, showing critical devices that aren't reporting. However, it needs some improvement. Specifically, we want to pair primary and backup devices using the PairGroup field. If the primary is down and the backup has taken over, we don't care, provided we are still receiving logs from at least one device in the pair. Detecting that a failover occurred can be a separate query. If PairGroup is empty, the device is assumed to have no backup; the lookup table can also be modified to use the explicit word "nobackup" if that is easier.
One assumption needs to be made: devices in the same pair group should have the same HoursLag and Index. I know this is a leap, so if they differ, feel free to take the min value of HoursLag and the max(?) value of Index. Finally, there is no need to assume tertiary devices exist. However, if you can handle three devices sharing the same PairGroup (for example, three rows with "C"), fantastic.
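To make the intended pairing rule concrete, here is a small sketch of the logic written in Python rather than SPL, since the SPL version is exactly what I am unsure how to write. The function name, the row layout, and the last_seen mapping are all made up for illustration; last_seen stands in for whatever the metadata search returns as lastTime per host.

```python
import time


def silent_groups(rows, last_seen, now=None):
    """Return the groups that have sent no logs within their allowed lag.

    rows: dicts with the lookup columns SplunkHost, PairGroup, HoursLag.
    last_seen: hypothetical mapping of lowercase host -> epoch seconds of
               its most recent event (absent if the host never sent logs).
    """
    now = time.time() if now is None else now
    groups = {}
    for row in rows:
        host = row["SplunkHost"].lower()
        pair = row.get("PairGroup") or ""
        # An empty PairGroup means "no backup": the host is its own group.
        key = "pair:" + pair if pair else "host:" + host
        lag = float(row["HoursLag"])
        group = groups.setdefault(key, {"hosts": [], "lag": lag, "last": None})
        group["hosts"].append(host)
        # If paired rows disagree, fall back to the smallest HoursLag.
        group["lag"] = min(group["lag"], lag)
        seen = last_seen.get(host)
        if seen is not None and (group["last"] is None or seen > group["last"]):
            group["last"] = seen  # keep the freshest event in the group

    alerts = []
    for key, group in groups.items():
        never_seen = group["last"] is None
        if never_seen or (now - group["last"]) / 3600 > group["lag"]:
            alerts.append((key, sorted(group["hosts"])))
    return sorted(alerts)
```

A group is reported only when every member is stale or has never sent logs, so a healthy backup hides a silent primary. Because the grouping is keyed on PairGroup alone, three devices sharing the same letter would be handled the same way as two.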
We would like to search only the indexes named in the lookup, rather than hardcoding them in the search and having to keep the two in sync. Any ideas on how to approach this problem? I'd be happy to see someone write out the queries, but pointing me in the right direction would help just as much. I'm fairly new to Splunk, so trying to decide between map, subsearches, joins, and appendpipe has already led me down enough of a rabbit hole...
Thank you in advance.