I'm trying to use a lookup table to find servers that are not reporting or have NEVER reported to Splunk.
Since I don't have admin access to the Splunk instance, I write the info from our DB of servers to the event log of a specific server and then import it nightly via an alert with an action of "output results to lookup".
When I run the search below, it always returns "Error in 'inputlookup' command: Invalid argument: 'NOT'" - regardless of whether I use the word NOT.
| inputlookup siteservers NOT [search host=Server Channel=System ProviderName=EventLog EventID=6013]
"Servers" contains the value from the lookup that I want to match to the host field. Event 6013 is just the daily Windows uptime event; if it's not found, then I have a server that's not reporting.
I suspect part of the problem is that the "local=t" parameter is not available with the inputlookup command, and this lookup must be used with "local=t". Our admin tells me that making the lookup table available across the cluster is very difficult.
Any ideas on how to make this work would be much appreciated.
If you are relying on a specific event code, note that any event from a Windows host tells you the system is up and reporting, so you don't need to search for one particular event.
Try this:
| tstats max(_time) as last_event where index = <yourDataIndex> by host
| eval now_time = now()
| eval last_seen_ago_in_seconds = now_time - last_event
| sort - last_seen_ago_in_seconds
Or compare the results against your lookup.
Hope it helps.
My problem is finding servers that have NEVER reported to Splunk. The lookup table contains the list of all active servers, and I'm looking for servers from that list that have not sent an uptime event over the last 24 hours. This would indicate that Splunk has not been installed or has other issues that need to be addressed.
Maybe try this, assuming your all_hosts.csv contains hosts that have never reported to Splunk:
| inputlookup all_hosts.csv | table host
| append [
| tstats max(_time) as last_event where index = * by host
| eval now_time = now()
| eval last_seen_ago_in_seconds = now_time - last_event
| sort -last_seen_ago_in_seconds ]
| stats values(*) as * by host
| eval MISSING = if(isnull(last_seen_ago_in_seconds) OR last_seen_ago_in_seconds>86400,"MISSING","GOOD")
Good idea, but returns servers not in my lookup table – i.e. from other sites. Changing append to “join left” comes a lot closer to what I’m looking for. In fact, if the join was not case sensitive, the job would be done. A few of the host names returned are lower case, and they don’t get matched in my lookup.
Thanks for pointing me in a good direction. Not sure I’ll get closer to solving the problem than this.
| inputlookup site9servers where OSname=*Windows*
| eval host=Server
| table host
| join type=left host [
| tstats max(_time) as last_event by host
| eval now_time = now()
| eval last_seen_ago_in_seconds = now_time - last_event
| sort -last_seen_ago_in_seconds ]
| stats values(*) as * by host
| eval MISSING = if(isnull(last_seen_ago_in_seconds) OR last_seen_ago_in_seconds>86400,"MISSING","GOOD")
Glad it helped.
You can get around the case sensitivity by normalizing the host name on both sides of the join with eval lower() or eval upper():
| inputlookup site9servers where OSname=*Windows*
| eval host=upper(Server)
| table host
| join type=left host [
| tstats max(_time) as last_event by host
| eval now_time = now()
| eval last_seen_ago_in_seconds = now_time - last_event
| sort -last_seen_ago_in_seconds
| eval host = upper(host) ]
| stats values(*) as * by host
| eval MISSING = if(isnull(last_seen_ago_in_seconds) OR last_seen_ago_in_seconds>86400,"MISSING","GOOD")
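If you would rather stay with append than join, one possible variation is to tag the rows that come from the lookup and keep only those after the stats, which should drop the hosts from other sites. This is an untested sketch: in_lookup is just a helper field name made up for illustration, while the lookup name, the Server and OSname fields, and the 86400-second threshold are copied from the searches above.

```spl
| inputlookup site9servers where OSname=*Windows*
| eval host=upper(Server)
| eval in_lookup=1
| fields host in_lookup
| append [
    | tstats max(_time) as last_event by host
    | eval host=upper(host) ]
| stats max(last_event) as last_event max(in_lookup) as in_lookup by host
| where in_lookup=1
| eval last_seen_ago_in_seconds = now() - last_event
| eval MISSING = if(isnull(last_event) OR last_seen_ago_in_seconds>86400,"MISSING","GOOD")
```

Hosts that are in the lookup but have no events at all end up with an empty last_event, so they are flagged as MISSING along with the ones that have been quiet for more than a day.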
You're going to run into a lot of false positives trying to monitor at the host level. Perhaps you should try monitoring at the sourcetype level.
Regardless, you should use MetaWoot for this
MetaWoot looks good, but I'm not sure how it can help find a server that has never reported to Splunk. Not to mention that I'm not an admin on this Splunk instance.