We keep some of our security data in Splunk for a longer retention period (6 months). Lately we have noticed that some data appears to be getting dropped (still working on this with Splunk support). One issue that has come up is: how do we track and validate that certain data is coming in and being retained in Splunk? For example, I expect a certain set of hosts to report in every day, and I expect that data to stay in the system for a set period of time. We initially used the metadata command, but according to Splunk support that isn't a reliable method. Any suggestions?
You can use tstats to replace metadata, for example:
| tstats count earliest(_time) latest(_time) where index=important by host
tstats is already accelerated, and you can also group by time:
| tstats count earliest(_time) latest(_time) where index=important by host _time span=1d
If you suspect data is being lost some time after it is indexed, you should run this search frequently, save the counts to a summary index, and look for changes between runs.
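A minimal sketch of that summary-index approach, assuming a scheduled search and a summary index named summary_ingest (the index name and schedule are placeholders, not anything from this thread):

| tstats count where index=important by host _time span=1d
| collect index=summary_ingest

Run it on a schedule (e.g. daily), then compare successive snapshots of the same host/day pairs: if a count shrinks between runs, events in the middle of the retention window are being lost after ingestion.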
Thanks, this is good for the beginning/end of the range. Any idea how I would do this for each day? We are seeing data in the middle of a time period drop. Should I save counts to a summary index, or use an accelerated report?
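One sketch for the per-day check, reusing the same tstats search and pivoting it so missing days show up as zeros (index=important is carried over from the example above):

| tstats count where index=important by host _time span=1d
| xyseries _time host count
| fillnull value=0

Any 0 cell is a host/day combination with no retained events. Note this only catches hosts that appear somewhere in the search window; a host missing entirely would need to be checked against a lookup of your expected hosts.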