Archive

Splunk DB Connect: How to search the latest database dump ONLY?

Explorer

This might be an easy thing to do, but I haven't figured out a good way to do it.

I have a database input that runs a query for all open tickets in my helpdesk and indexes the full dump every hour. The timestamp is the time that the query was run. This allows me to do things like a timechart showing the number of open tickets each hour over time.

But I also want to run searches on just the most recent dump to answer questions like, "Which technician has the most open tickets right now?" or "List all open tickets that were created over 4 hours ago". I have to be sure that the fields for the ticket represent the most recent data and doesn't include tickets from previous dumps that might have been closed and thus missing from the most recent dump.

I know how to get this info if I can limit the search to just the latest dump of the data. but because of the way it's timestamped, I'm not sure how to limit the search to leave out tickets that exist in previous dumps. The dirty way I've found is to try to set the earliest / latest times trying to catch a single dump in the range, but if the frequency of the updates is increased, I'd have to change my searches.

As long as I can get the data for the latest data dump, I can create all the searches necessary to extract Created_time, Responded_time, etc. I just need to make sure that I'm only looking at the newest data.

Any suggestions?

Engager

I am trying to do the same thing. I'm not sure this is the best solution, but I used a subsearch to find the latest index time of the data and then used that in the main search.

index=myIndex sourcetype=mySourceType report=myReport [|search index=myIndex sourcetype=mySourceType report=myReport | stats latest(_time) as earliest]

Engager

Hopefully the dedup works, but we couldn't get it working. We always found some new scenario where dedup "found" the wrong ticket (not the latest one, but something else). We gave up and now we dump all the tickets every 5 minutes (yes it's crazy), then we add current time to the event (we name it as INDEX_TIME) and finally we have this in our search: | eventstats max(INDEX_TIME) as last | where INDEX_TIME=last

Now we are trying to figure out how to get rid of the events where INDEX_TIME older than 1 hour in order to keep the dashboard as quick as possible. We set the index freeze time to 1 hour, but doesn't work as we expected...

All in all our setup works, but we hope to find more efficient way. Basically we need the solution where the search finds always the latest update of the ticket. And the ticket source is relational DB...

0 Karma

SplunkTrust
SplunkTrust

No need to add a field INDEX_TIME, just use the existing field _indextime provided by Splunk out of the box.

You're dumping a DB into Splunk every five minutes and want to delete it after an hour? Consider using the DB live as it is through dbquery instead.

If you absolutely have to do it like this, consider reducing maxHotSpanSecs for that index to something under an hour, that enables a frequent freezing of old buckets.

SplunkTrust
SplunkTrust

Assuming your tickets have a unique ID, you can append | dedup ticketId to your search. Set the time range to e.g. past four hours to be sure to have at least one dump in range.

0 Karma

Splunk Employee
Splunk Employee

What does your search look like right now?

0 Karma
State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!