Is there any way to trigger an alert from Splunk if any one of its Universal Forwarders is not running?
Thanks for your help!
You can certainly monitor index=_internal and check whether you're still receiving logs from each forwarder. Another way is to use the metadata command to show when each host was last indexed, and adjust the threshold to your environment. You can find some information here: http://docs.splunk.com/Documentation/Splunk/6.5.2/Troubleshooting/Cantfinddata#Areyouusingforwarders.3F
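As a rough sketch of the metadata approach, the search below lists hosts whose most recent event in _internal was indexed more than five minutes ago. The 300-second threshold and the output field names last_seen and seconds_since_last_event are assumptions chosen for illustration; run it over a time range wide enough to include the last time each host reported (for example, the last 24 hours).

```
| metadata type=hosts index=_internal
| eval seconds_since_last_event = now() - recentTime
| where seconds_since_last_event > 300
| convert ctime(recentTime) AS last_seen
| table host last_seen seconds_since_last_event
```

Note that this only reports hosts that have indexed something within the searched time range, so a forwarder that has been silent for longer than that range will not appear at all.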
In the DMC (Distributed Management Console) there is a ready-made alert for this.
If you don't like it, you can build your own lookup listing all the forwarders you want to monitor (e.g. Perimeter.csv) and build a search like this:
| metasearch index=_internal | eval host=upper(host) | stats count by host | append [ | inputlookup Perimeter.csv | eval count=0 | eval host=upper(host) | fields host count] | stats sum(count) AS Total by host | where Total=0
and schedule it to run, e.g., every five minutes over the last five minutes.
When this search returns zero results, it means that all forwarders in the lookup sent logs in the last five minutes.
Otherwise, the forwarders listed with Total=0 didn't send any logs in the last five minutes.
Thanks for your response. We are on Splunk 6.2, and I don't see any default alert in the DMC to monitor the forwarders.
I will try the other solution and get back to you.
This query is working fine, but I would like to monitor only 20 hosts out of 30. Is there any way I can do that? Also, if you could give me some idea about the lookup, that would be helpful.
Thanks for your help.
I usually use a lookup in these situations because it lets me manage the list easily and dynamically: it's the only practical way when there are many hundreds or thousands of items, but it's also very useful in situations like yours. Only the hosts listed in the lookup can end up with Total=0, so to monitor 20 hosts out of 30 you simply list those 20 in the lookup.
If you don't want to maintain it by hand, you could also update it automatically using a nightly scheduled search with the outputlookup command.
In addition, you can add extra information about your forwarders to the lookup and use it in a dashboard showing the status of your deployment (Total=0 means Device Down, Total>0 means Device up and running).
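As an illustration, a minimal Perimeter.csv could look like the following. The host names are hypothetical placeholders; the only requirement coming from the search above is a column named host.

```
host
FORWARDER01
FORWARDER02
FORWARDER03
```

A nightly scheduled search along these lines could then keep the lookup up to date. This is only a sketch under assumptions: the 24-hour window is arbitrary, and it adds every host seen in _internal, so it is not suitable as-is if you want to monitor only a subset of your forwarders. Existing rows are read first so that they (and any extra columns they carry) win over newly discovered hosts when dedup runs.

```
| inputlookup Perimeter.csv
| eval host=upper(host)
| append [ | metasearch index=_internal earliest=-24h | eval host=upper(host) | stats count by host | fields host ]
| dedup host
| outputlookup Perimeter.csv
```

Hosts are never removed by this search, so decommissioned forwarders still have to be deleted from the lookup manually.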
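For a dashboard panel, one possible variation of the search replaces the final where clause with a status field. The field name status is an assumption; the labels are the ones mentioned above.

```
| metasearch index=_internal
| eval host=upper(host)
| stats count by host
| append [ | inputlookup Perimeter.csv | eval count=0 | eval host=upper(host) | fields host count ]
| stats sum(count) AS Total by host
| eval status=if(Total=0, "Device Down", "Device up and running")
| table host Total status
```

Without the where clause, hosts that send logs but are not listed in the lookup also appear in the results (always as up and running).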
Thank you! One final question: if I use Total=0, it's not working as expected, because when there are no events for a forwarder, its name doesn't show up in the output.
Can you help me here?
If there are forwarders that didn't send logs in the monitored period, the above search takes their count from the lookup, which is 0 (see | eval count=0 in the subsearch).
If you have a forwarder that doesn't send logs and doesn't have Total=0, this means that this forwarder isn't listed in your lookup.
Be sure that all forwarders are listed in your lookup, because in the lookup you have the perimeter to monitor.
Pay attention to the case of the host names: this is why you have to insert | eval host=upper(host) both in the main search and in the subsearch (append).
To test this scenario, I have just brought down one of the forwarders, but I didn't see that forwarder in my search results. Do I need to add the host names anywhere?
Thanks for your help!
What is the column name in your lookup?
If it's "host" (all lowercase), the search is already correct. If it has another name, you have to insert | rename your_column_name AS host in the subsearch, right after inputlookup, so that it comes before the eval and fields commands.
I have many installations that use this search.
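For example, assuming the lookup column is called forwarder_name (a hypothetical name, substitute your own), the full search with the rename in place might look like this:

```
| metasearch index=_internal
| eval host=upper(host)
| stats count by host
| append [ | inputlookup Perimeter.csv | rename forwarder_name AS host | eval count=0 | eval host=upper(host) | fields host count ]
| stats sum(count) AS Total by host
| where Total=0
```

The rename is placed before eval host=upper(host) so that the upper-casing is applied to the renamed field and matches the host values from the main search.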