Hi Team,
Is there any way to trigger an alert from Splunk if any one of its Universal Forwarders is not running?
Thanks for your help!
Regards,
Abilan
Hi Abilan,
The DMC (Distributed Management Console) has an alert for this.
If it doesn't suit you, you can build your own lookup listing all the forwarders to monitor (e.g. Perimeter.csv) and build a search like this:
| metasearch index=_internal
| eval host=upper(host)
| stats count by host
| append [ | inputlookup Perimeter.csv | eval count=0 | eval host=upper(host) | fields host count]
| stats sum(count) AS Total by host
| where Total=0
and schedule it, e.g. every five minutes over the last five minutes.
When this search returns no results, all forwarders sent logs in the last five minutes.
Forwarders listed with Total=0, on the other hand, didn't send any logs in the last five minutes.
Bye.
Giuseppe
You can use Splunk's Monitoring Console to monitor deployment status.
Internally it uses the following REST command, which you can also run in Splunk search:
| rest splunk_server=local /services/deployment/server/clients
This returns the last phone-home time for each deployment client that contacts the Splunk Deployment Server.
Hi,
Thanks for your response. We are on Splunk 6.2, and I don't see any default alert in the DMC for monitoring forwarders.
I will try the other solution and get back to you.
Hi,
This query is working fine, but I would like to monitor only 20 hosts out of 30. Is there any way I can do that? Also, if you could give me some idea of how the lookup works, that would be helpful.
Thanks for your help.
Hi Abilan,
I usually use a lookup in these situations because it lets me manage the list easily and dynamically. It's the only practical way when there are many hundreds or thousands of items, but it's also very useful in situations like yours.
If you don't want to maintain it by hand, you could also update it automatically with a nightly scheduled search that uses the outputlookup command.
In addition, you can store extra information about your forwarders in the lookup and use it in a dashboard showing the status of your deployment (Total=0 means Device Down, Total>0 means Device up and running).
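As a rough sketch of that dashboard idea, the search above could end with an eval instead of the where filter. The field name status and the final table command are only illustrative choices, not something required by Splunk:

```spl
| metasearch index=_internal
| eval host=upper(host)
| stats count by host
| append [ | inputlookup Perimeter.csv | eval count=0 | eval host=upper(host) | fields host count ]
| stats sum(count) AS Total by host
| eval status=if(Total=0, "Device Down", "Device up and running")
| table host Total status
```

Used in a dashboard panel, this lists every host together with its status, while the alert version keeps the | where Total=0 filter.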
Bye.
Giuseppe
Thank you! One final question: if I use Total=0, it's not working as expected. If there are no events for a forwarder, its name doesn't show up in the output at all.
Can you help me here?
Hi Abilan,
If some forwarders in your lookup didn't send logs in the monitored period, the search above takes their value from the lookup, which is 0 (see | eval count=0 in line 4).
If a forwarder doesn't send logs and doesn't show Total=0, that forwarder isn't listed in your lookup.
Make sure all forwarders are listed in your lookup, because the lookup defines the perimeter to monitor.
Pay attention to host case as well: that is why | eval host=upper(host) appears in both the main search and the subsearch (append).
Bye.
Giuseppe
Hi,
To test this scenario, I brought down one of the forwarders, but I didn't see that forwarder in my search results. Do I need to add the hostnames anywhere?
Thanks for your help!
Regards,
Abilan
Hi Abilan,
What is the column name in your lookup?
If it's "host" (all lowercase), the search is already correct. If it has another name, you have to add | rename your_column_name AS host to the subsearch, before the fields command.
I have many installations that use this search.
Bye.
Giuseppe
Hi,
I didn't change anything in the search you gave me; I am simply using it as is. One of the forwarders is still down, and I am not getting that host name in the results.
If I remove where Total=0 from the search, I can see all the other hosts but not the one that is down.
Thanks again!
Hi Abilan,
Removing Total=0 shows all forwarders, and the fact that you see the other forwarders means the search is running.
The problem is probably in the lookup:
check the column name, and also check whether the missing forwarder is in the lookup.
Bye.
Giuseppe
Hi,
The column names look good. If I search with Total>0, I can see all the hosts but not the one forwarder that is down.
I also ran just the initial part of the query, and even there that host is not in the list:
| metasearch index=_internal | eval host=upper(host) | stats count by host
I think we cannot meet the requirement using this lookup, or else I have to add every host to the lookup.
Regards,
Abilan
Hi Abilan,
The problem isn't in the main search; it is almost certainly in the subsearch. If you run only the subsearch (| inputlookup Perimeter.csv | eval count=0 | eval host=upper(host) | fields host count), is the missing host there?
Probably it isn't.
the result of the previous search should be:
host count
host1 0
host2 0
...
missed_host 0
This is why I asked you to verify the lookup column name (if it differs from "host", it must be renamed) and the name of the missing host.
The point of the search above is to take the forwarders' logs and append the lookup hosts with count=0, so that even if a host has no results in the search, there is still a record with its hostname and count=0.
Bye.
Giuseppe
Hi,
I understand that, but we are searching the forwarder logs to fetch the host list, and in that case the host that is already down won't be in the list.
Please correct me if I am wrong. Thanks for your help!
Hi Abilan,
The Perimeter.csv lookup must contain the list of monitored hosts (your monitoring perimeter), to tell Splunk which hosts to check.
You can manage this list manually (using the Lookup Editor App) or with a scheduled search.
In the second case, you have to schedule a search like this to run every night:
| metasearch index=_internal earliest=-30d latest=now | stats count by host | fields host | outputlookup Perimeter.csv
I prefer to manage this lookup manually, to avoid false positives.
Bye.
Giuseppe
Hi,
I have installed the Lookup Editor app and created a sheet with only 4 hosts.
Now I am getting the hosts that have Total=0. However, if I check for Total>0, I get all of my hosts, not only the ones in the input lookup file.
The csv file has the 2 columns below. I am keeping the Total column blank, and the host column contains the host names.
host Total
Thanks a lot for your help!
Hi Abilan,
Your lookup has to contain all the hosts you want to monitor, both the ones you're receiving logs from and the missing ones. As you said above, you should have around 30 hosts.
If you add a forwarder to your network, you have to add it to the lookup list as well.
This way you're sure to monitor every host in your lookup, and you're sure the missing hosts also appear in your search results.
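If the results should list only the hosts in the lookup (for example 20 monitored hosts out of 30), one possible variation is to tag the lookup rows with a marker field and keep only the tagged hosts. This is a sketch rather than part of the original search, and the field name monitored is made up for illustration:

```spl
| metasearch index=_internal
| eval host=upper(host)
| stats count by host
| append [ | inputlookup Perimeter.csv | eval count=0 | eval monitored=1 | eval host=upper(host) | fields host count monitored ]
| stats sum(count) AS Total max(monitored) AS monitored by host
| where monitored=1
```

Hosts that send logs but are not in Perimeter.csv never get the marker, so the where clause drops them. Adding Total=0 or Total>0 to the filter then splits the monitored hosts into down and up.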
Bye.
Giuseppe
Thanks a lot for your help. I will do that.
Regards,
Abilan
You can certainly monitor index=_internal to determine whether you're receiving logs. Another way is to use the metadata command to show whether something is being indexed, and adapt it to your environment. You can find some information here: http://docs.splunk.com/Documentation/Splunk/6.5.2/Troubleshooting/Cantfinddata#Are_you_using_forward...
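A minimal sketch of the metadata approach, assuming a five-minute threshold (the threshold and the field name minutes_since_last_event are illustrative choices):

```spl
| metadata type=hosts index=_internal
| eval minutes_since_last_event=round((now()-recentTime)/60)
| where minutes_since_last_event>5
| convert ctime(recentTime)
| table host recentTime minutes_since_last_event
```

Run it over a time range long enough to still include the last events from a silent host. Note that it lists every host writing to _internal, including indexers and search heads, so it may need filtering to the forwarders you care about.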