This might be an easy thing to do, but I haven't figured out a good way to do it.
I have a database input that runs a query for all open tickets in my helpdesk and indexes the full dump every hour. The timestamp is the time that the query was run. This allows me to do things like a timechart showing the number of open tickets each hour over time.
But I also want to run searches on just the most recent dump to answer questions like, "Which technician has the most open tickets right now?" or "List all open tickets that were created over 4 hours ago." I have to be sure the results reflect the most recent data and don't include tickets from previous dumps that may since have been closed and are therefore missing from the latest dump.
I know how to get this info if I can limit the search to just the latest dump of the data, but because of the way it's timestamped, I'm not sure how to exclude tickets that only exist in previous dumps. The dirty workaround I've found is to set the earliest/latest times to try to catch a single dump in the range, but if the update frequency is ever increased, I'd have to change all my searches.
As long as I can get the data for the latest data dump, I can create all the searches necessary to extract Created_time, Responded_time, etc. I just need to make sure that I'm only looking at the newest data.
I am trying to do the same thing. I'm not sure this is the best solution, but I used a subsearch to find the latest timestamp of the data and return it as a field named earliest, which the main search then picks up as its earliest time modifier.
index=myIndex sourcetype=mySourceType report=myReport [search index=myIndex sourcetype=mySourceType report=myReport | stats latest(_time) as earliest]
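Once the search is limited to the latest dump, the questions from the original post become plain stats searches. A sketch building on the subsearch above, assuming the dump contains a field named Technician (that field name is an assumption, not from the original post):

```spl
index=myIndex sourcetype=mySourceType report=myReport
    [search index=myIndex sourcetype=mySourceType report=myReport
     | stats latest(_time) as earliest]
| stats count as open_tickets by Technician
| sort - open_tickets
```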
We hoped dedup would work, but we couldn't get it working: we always found some new scenario where dedup "found" the wrong ticket (not the latest one, but something else). We gave up, and now we dump all the tickets every 5 minutes (yes, it's crazy), add the current time to each event (we name the field INDEX_TIME), and finally use this in our search: | eventstats max(INDEX_TIME) as last | where INDEX_TIME=last
Now we are trying to figure out how to get rid of events whose INDEX_TIME is older than 1 hour, in order to keep the dashboard as quick as possible. We set the index freeze time to 1 hour, but it doesn't work as we expected...
All in all our setup works, but we hope to find a more efficient way. Basically we need a solution where the search always finds the latest update of each ticket. And the ticket source is a relational DB...
No need to add an INDEX_TIME field, just use the existing field _indextime provided by Splunk out of the box.
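With the built-in field, the eventstats filter from the earlier reply needs no extra field at ingest time. A sketch, assuming the same index and sourcetype names used earlier in the thread:

```spl
index=myIndex sourcetype=mySourceType
| eventstats max(_indextime) as last
| where _indextime=last
```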
You're dumping a DB into Splunk every five minutes and want to delete it after an hour? Consider using the DB live as it is through
If you absolutely have to do it like this, consider reducing maxHotSpanSecs for that index to something under an hour; that enables frequent freezing of old buckets.
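On the indexer, that tuning might look like the following indexes.conf stanza. This is a sketch with illustrative values, assuming an index named myIndex: maxHotSpanSecs caps the time span of a hot bucket so buckets roll often, and frozenTimePeriodInSecs makes data eligible for freezing (deletion, by default) after an hour.

```ini
[myIndex]
# roll hot buckets at most an hour wide so old data can freeze quickly
maxHotSpanSecs = 3600
# data older than 1 hour becomes eligible for freezing (deleted by default)
frozenTimePeriodInSecs = 3600
```

Note that freezing happens per bucket, not per event, so events are only removed once the whole bucket has aged out.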