Hi guys,
I've been given 2 tasks regarding our Splunk forwarders.
1) Find out which forwarders are not checking in/do not have a heartbeat but have in the past.
2) Find out which forwarders do have a live heartbeat but have not sent any logs for a specific period of time (probably going to make it 4 hours).
Could anyone give me advice on how I would go about finding this information? I have looked into creating a daily alert or report to inform me, but I don't have a clue where I should be searching.
Any help at all would be appreciated as I don't even know where to start with this.
Thanks!
Hi Robbie1194
You can also run a search like this.
This lists forwarders that, at any point in the search time range, went more than 2 hours (7200 seconds) between connections. You can adjust the threshold in the last line to suit your needs.
index=_internal source=*metrics.log group=tcpin_connections
| eval sourceHost=if(isnull(hostname), sourceHost,hostname)
| table sourceHost _time
| sort 0 sourceHost, -_time
| reverse
| streamstats window=1 current=f global=false last(_time) as previous_time by sourceHost
| eval d=_time-previous_time
| fields - previous_time
| search d>7200
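A gap search like this only flags a forwarder once it connects again, so a forwarder that has gone silent and stayed silent never gets a later event to measure against. For the first task in the question, a minimal sketch could instead take the latest connection time per forwarder and compare it with the current time. The 15-minute threshold and the field names last_seen and minutes_since are assumptions for illustration, not something from the original search.

```
index=_internal source=*metrics.log group=tcpin_connections
| eval sourceHost=if(isnull(hostname), sourceHost, hostname)
| stats max(_time) as last_seen by sourceHost
| eval minutes_since=round((now()-last_seen)/60)
| where minutes_since > 15
| convert ctime(last_seen)
| sort 0 -minutes_since
```

Run it over a time range long enough to cover "has checked in before" (for example the last 7 days). Forwarders that have never connected in that range will not appear at all, so they need a separate list of expected hosts.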
If you can identify all the hosts that need a forwarder, then you are in good shape. It should be the list from the serverclass.conf file.
You can then create a lookup with these hosts and do a left join against the output of the metadata command.
This approach has been working for us very well.
We do something like this:
| inputlookup <lookup>
| fields host, <rest of fields>
| join type=left host
[ | metadata type=hosts index=<the corresponding indexes>
| eval host=lower(host) ]
| eval RECENT=strftime(recentTime,"%a %m/%d/%Y-%T %Z(%z)")
| eval LAST=strftime(lastTime,"%a %m/%d/%Y-%T %Z(%z)")
| sort host
| where <anything...>
| table host, LAST
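For the second task in the question (live heartbeat but no logs for 4 hours), the two ideas in this thread can probably be combined. This is only a rough sketch: it assumes a forwarder counts as "alive" if it shows up in tcpin_connections in the last 15 minutes, that index=* covers the indexes you care about, and that the hostname in metrics.log matches the host field on the indexed events. The field names last_heartbeat and hours_since_data are made up for illustration.

```
index=_internal source=*metrics.log group=tcpin_connections earliest=-15m
| eval host=lower(if(isnull(hostname), sourceHost, hostname))
| stats max(_time) as last_heartbeat by host
| join type=left host
    [ | metadata type=hosts index=*
      | eval host=lower(host)
      | fields host, recentTime ]
| eval hours_since_data=round((now()-recentTime)/3600, 1)
| where isnull(recentTime) OR hours_since_data > 4
| convert ctime(last_heartbeat) ctime(recentTime)
| table host, last_heartbeat, recentTime, hours_since_data
```

index=* does not match the internal indexes, which matters here because a forwarder normally keeps sending its own _internal logs even when it is not sending anything else. The inline earliest=-15m only restricts the heartbeat part, so it is probably safest to set the time picker to something wider (for example the last 24 hours) for the metadata subsearch.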