One approach is to use a lookup to hold the state of servers you have seen before. I'm going to build up to the solution, so read along carefully. Let's imagine you can do this:
index=myindex | stats min(_time) as oldest, max(_time) as newest by host
If you run this over, say, a 7 day period, you can get an idea of which hosts are "new" and which are "missing" in that time period. Hosts will be in one of three states (a sample search that classifies them follows these descriptions):
The value of oldest will be at the beginning of the 7 day period, the value of newest will be at the end. This means the host was (in all likelihood) sending events the whole 7 days and is neither new nor old.
The value of oldest is not at the beginning of the 7 day period, and the value of newest is at the end. This means the host started sending data sometime during the 7 day period and has been sending ever since, and is therefore "new".
The value of oldest is at the beginning of the 7 day period, and the value of newest is sometime before the end of the 7 day period. This means the host was sending data at the beginning and then stopped, and is therefore "old" and/or "missing".
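To make those three states concrete, here is one way to classify every host in a single pass. This is just a sketch: addinfo adds the search window boundaries as info_min_time and info_max_time, and the one-hour (3600 second) tolerance is an arbitrary assumption you would tune to your data.
index=myindex | stats min(_time) as oldest, max(_time) as newest by host
| addinfo
| eval state=case(oldest <= info_min_time + 3600 AND newest >= info_max_time - 3600, "steady", oldest > info_min_time + 3600, "new", newest < info_max_time - 3600, "missing")
A host that both appeared late and went quiet early matches "new" first in the case(), which is fine for a rough classification.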
This is good but not optimal. So let's make a slightly different variant:
index=myindex | stats min(_time) as oldest, max(_time) as newest by host
| outputlookup myindex_host_status.csv
This doesn't provide any new logic, but merely persists the data made by the search out to a lookup file. Now, let's use that lookup file to provide context in a slightly more complex search (note the append=true on inputlookup, which merges the lookup's existing rows into the search results instead of replacing them):
index=myindex | stats min(_time) as oldest, max(_time) as newest by host
| inputlookup append=true myindex_host_status.csv
| stats min(oldest) as oldest, max(newest) as newest by host
| outputlookup myindex_host_status.csv
We can now take this search and schedule it to run every day over the past 24 hours, or every hour over the past hour, or whatever. It winds up keeping for us - over an arbitrarily long period of time - the first and last timestamp for a given host, even after the original data ages off. The scheduled maintenance search runs and maintains the lookup holding that state for us.
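If you prefer to define the schedule in configuration rather than through the UI, a minimal savedsearches.conf sketch is below. The stanza name, cron schedule, and hourly window are assumptions; adjust them to however often you actually run the maintenance search.
# savedsearches.conf (stanza name and schedule are assumptions)
[Maintain myindex_host_status lookup]
enableSched = 1
cron_schedule = 5 * * * *
dispatch.earliest_time = -1h@h
dispatch.latest_time = @h
search = index=myindex | stats min(_time) as oldest, max(_time) as newest by host \
| inputlookup append=true myindex_host_status.csv \
| stats min(oldest) as oldest, max(newest) as newest by host \
| outputlookup myindex_host_status.csv
Now, we can use that state: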
| inputlookup myindex_host_status.csv | where oldest > now() - (86400 * 3)
Giving us a list of hosts that first sent data within the past three days. Or:
| inputlookup myindex_host_status.csv | where newest < now() - (86400 * 7)
Giving us a list of every host that has not sent any new data in the past 7 days.
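If you want both views at once, you can label every host from the lookup in one pass; the status names and the three-day/seven-day thresholds below are just the assumptions carried over from the two searches above.
| inputlookup myindex_host_status.csv
| eval status=case(oldest > now() - (86400 * 3), "new", newest < now() - (86400 * 7), "missing", true(), "active")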
The trick here is that we're using lookups to hold long-term state, and taking advantage of the fact that Splunk stores _time as epoch time: a number of seconds that only grows as time goes on. Every day is exactly 86,400 seconds, and larger values mean later times, so simple mathematical functions like min() and max() are all it takes to compute the earliest and latest times.
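Because oldest and newest are ordinary epoch values, the usual eval time functions work on them too. For example, relative_time() can stand in for the hand-rolled 86,400 arithmetic, and strftime() makes the lookup readable (the format string here is just an example):
| inputlookup myindex_host_status.csv
| where newest < relative_time(now(), "-7d")
| eval newest_readable=strftime(newest, "%Y-%m-%d %H:%M:%S")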