We have a search that runs overnight, updating a summary index for reporting the following day, as follows.
tag::eventtype="failure" tag::eventtype="authentication" tag::eventtype="user" | stats count by host
The daytime search reads:
index=xx-summary earliest=-2w@w1 latest=-1d search_name="overnight-search-name" | eval date=_time | convert timeformat="%d-%b-%Y" ctime(date) | stats sum(count) by _time date | fields - _time
This search provides input to a report which graphs the results over the given period.
Now, since the overnight search looks for authentication events, it naturally produces no results for hosts that have no such events. So if there are no authentication failures overnight, there are no stats for any host, and consequently no entry (rather than a zero entry) for that date in the final report. We are left with a blank patch in the report, which may require investigation to confirm that there really were no authentication failures, rather than, say, a Splunk collector failure.
What I would like to do is to arrange for a day with no authentication failures to be reported as a zero entry for that day, rather than no entry at all. I'd be grateful for any ideas on how to achieve this. Could I somehow cycle through all hosts using the metadata command?
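By that I mean something along these lines, which I believe lists every host that has ever sent data to an index (the index name here is just a placeholder):

| metadata type=hosts index=xx | fields host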
While you can certainly use a pre-processed list of hosts to fill those gaps with zeroes, how does that solve your actual problem - determining if there really were no authentication failures or if there was some kind of logging or machine failure?
You could run this search at the same time to check that all hosts are online, and if not, then how long they've been offline.
It is set to search the past 7 days; that way, if you run it at least once a week and correct offline hosts, it should always list all of the hosts. In your case, make sure it searches a timeframe suited to your needs.
If all hosts are listed as current, then you can be fairly confident that 'no data' means 'no failures'.
index=_internal source=*metrics.log group=tcpin_connections earliest=-7d@d | eval sourceHost=lower(sourceHost) | eval hostname=lower(hostname) | eval sourceHost=coalesce(hostname, sourceHost) | eval age=(now() - _time) | stats first(age) as age, first(_time) as LastTime by sourceHost | convert ctime(LastTime) as "Last Active On" | eval Status=case(age < XXX, "Running", age > XXX, "DOWN")
Thanks, I can see your point, and indeed we have a search that does more or less the same thing. Ideally, though, I'm looking to incorporate the "zero" entries into the final report. The more I think about it, though, the more I fear that doing so would mean cycling through every host on the system to generate a summary record for each one, which would mean running the same subsearch (as it might be) a couple of hundred times! Not sure how far search acceleration would help me there :-(.
I think perhaps I should withdraw this query - though I could hardly object if someone came up with a brilliant idea! Thanks to those who have replied, and to those who may have given it a passing thought.
To incorporate the zeroes into your other search, build a lookup of all your hosts from luke's search, scheduled to run regularly so the list stays up to date.
Then start your other search with an inputlookup that loads the list of hosts, and left outer join the count of events to that list. That way, hosts without a count will still appear.
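A sketch of that approach (the lookup name all_hosts.csv is an assumption - use whatever you call the lookup built from luke's search, e.g. by appending | outputlookup all_hosts.csv to it):

| inputlookup all_hosts.csv | fields host | join type=left host [ search index=xx-summary search_name="overnight-search-name" | stats sum(count) as count by host ] | fillnull value=0 count

Hosts missing from the summary index then show up with a count of 0 instead of dropping out of the report.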