This is going to be long, but I hope it presents an interesting problem that has an elegant solution.
One of the things that really sold me on Splunk was the ability to throw a huge CSV file at it, index it as a one-off, and quickly create meaningful graphs and reports. The problem is that I have to manually copy and index each CSV snapshot as a one-off, then modify my searches to point to the new source file. I want to automate this so that it requires no manual intervention. My saved searches and dashboards should always have the latest information.
Here's an idea of what the CSV looks like.
Created_Time,Request_ID,Request_Type,Request_Status,Completed_Time
DD-MM-YYYY HH:MM:SS,#100,incident,open,,
DD-MM-YYYY HH:MM:SS,#101,request,closed,DD-MM-YYYY HH:MM:SS,
So I'm basically treating tickets as events, with _time = Created_Time.
#100 is open. The next time I take a snapshot of my tickets, #100 might be closed. I would want my dashboards to pick that up and report on that ticket in its closed state the next time they are run. But Splunk can't change field values after indexing them. If I made Splunk monitor the CSV for changes, and overwrote it with the entire snapshot, it would re-index the file, duplicating all the tickets from the previous snapshot.
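To make the duplication concrete, here is a rough Python sketch of what I think happens. It is not Splunk code: the `index` list only stands in for an append-only index, and the dates and ticket #102 are made-up sample values in the same layout as my CSV.

```python
import csv
import io
from collections import Counter

# Hypothetical sample snapshots (made-up values, same columns as the real CSV).
SNAPSHOT_1 = """Created_Time,Request_ID,Request_Type,Request_Status,Completed_Time
01-03-2024 09:00:00,#100,incident,open,
01-03-2024 09:30:00,#101,request,closed,01-03-2024 10:00:00
"""

SNAPSHOT_2 = """Created_Time,Request_ID,Request_Type,Request_Status,Completed_Time
01-03-2024 09:00:00,#100,incident,closed,02-03-2024 08:15:00
01-03-2024 09:30:00,#101,request,closed,01-03-2024 10:00:00
02-03-2024 11:00:00,#102,incident,open,
"""


def read_snapshot(text):
    """Parse one CSV snapshot into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in for an append-only index: rows can be added but never changed.
index = []
for snapshot in (SNAPSHOT_1, SNAPSHOT_2):
    index.extend(read_snapshot(snapshot))  # whole file is indexed again each time

# Count how many events each ticket ended up with.
events_per_ticket = Counter(event["Request_ID"] for event in index)
statuses_100 = [e["Request_Status"] for e in index if e["Request_ID"] == "#100"]
```

After two snapshots, every ticket from the first snapshot is in the index twice, and #100 exists as both an open and a closed event, so a naive count of tickets or of open tickets would be inflated.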
If I were to overwrite the CSV with only those tickets that have been newly created, then over time every ticket in my indexed data would stay frozen in the state it had when first indexed (mostly open), because later closures would never appear in the updates.
If I updated with newly created tickets plus tickets that have changed state, tickets would be duplicated in the index every time they changed status. Maybe this isn't such a bad thing, but I'm not sure how to search on it without counting the duplicates.
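Here is a minimal Python sketch of that third option, under my own assumptions: `changed_or_new` and `latest_state` are hypothetical helper names, the sample rows are made up, and the list `index` again only stands in for an append-only index. The point is that if only the last-seen version of each Request_ID is kept at search time, the duplicates stop mattering.

```python
# Hypothetical sample data (made-up values, same columns as the real CSV).
snapshot_1 = [
    {"Created_Time": "01-03-2024 09:00:00", "Request_ID": "#100",
     "Request_Type": "incident", "Request_Status": "open", "Completed_Time": ""},
    {"Created_Time": "01-03-2024 09:30:00", "Request_ID": "#101",
     "Request_Type": "request", "Request_Status": "closed",
     "Completed_Time": "01-03-2024 10:00:00"},
]
snapshot_2 = [
    {"Created_Time": "01-03-2024 09:00:00", "Request_ID": "#100",
     "Request_Type": "incident", "Request_Status": "closed",
     "Completed_Time": "02-03-2024 08:15:00"},
    {"Created_Time": "01-03-2024 09:30:00", "Request_ID": "#101",
     "Request_Type": "request", "Request_Status": "closed",
     "Completed_Time": "01-03-2024 10:00:00"},
    {"Created_Time": "02-03-2024 11:00:00", "Request_ID": "#102",
     "Request_Type": "incident", "Request_Status": "open", "Completed_Time": ""},
]


def changed_or_new(previous, current):
    """Return rows of `current` that are new or differ from `previous`."""
    prev_by_id = {row["Request_ID"]: row for row in previous}
    return [row for row in current if prev_by_id.get(row["Request_ID"]) != row]


def latest_state(events):
    """Keep only the last-seen version of each ticket (events in arrival order)."""
    latest = {}
    for event in events:
        latest[event["Request_ID"]] = event  # later versions overwrite earlier ones
    return list(latest.values())


index = list(snapshot_1)                          # first load: everything
index += changed_or_new(snapshot_1, snapshot_2)   # later loads: only the delta

current = latest_state(index)
open_count = sum(1 for t in current if t["Request_Status"] == "open")
```

In this sketch the index holds four events for three tickets, but after collapsing to the latest version per Request_ID only #102 counts as open. One thing I'm unsure about on the Splunk side: since every version of a ticket shares the same Created_Time, "latest" would probably have to be decided by when the event was indexed rather than by _time.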
Then I thought about using a lookup table. The CSV could be loaded as a lookup file, keyed by Request_ID and Created_Time, and replaced with the full list of tickets at each snapshot. I'd then index only a copy of each ticket's Request_ID and Created_Time, and every search would look up the remaining fields in the lookup file, so it would always see the latest field values. Does this sound like it can work?
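In rough Python terms, the lookup idea looks like the sketch below. This is only the shape of the idea, not Splunk's lookup mechanism: `indexed_events`, `build_lookup`, and `enrich` are hypothetical names, and the sample values are made up.

```python
# Indexed once and never changed: just the key fields of each ticket.
indexed_events = [
    {"Request_ID": "#100", "Created_Time": "01-03-2024 09:00:00"},
    {"Request_ID": "#101", "Created_Time": "01-03-2024 09:30:00"},
]

# Hypothetical latest snapshot (made-up values); it replaces the old lookup file.
latest_snapshot = [
    {"Created_Time": "01-03-2024 09:00:00", "Request_ID": "#100",
     "Request_Type": "incident", "Request_Status": "closed",
     "Completed_Time": "02-03-2024 08:15:00"},
    {"Created_Time": "01-03-2024 09:30:00", "Request_ID": "#101",
     "Request_Type": "request", "Request_Status": "closed",
     "Completed_Time": "01-03-2024 10:00:00"},
]


def build_lookup(snapshot):
    """Map (Request_ID, Created_Time) to the mutable fields of each ticket."""
    return {
        (row["Request_ID"], row["Created_Time"]): {
            "Request_Type": row["Request_Type"],
            "Request_Status": row["Request_Status"],
            "Completed_Time": row["Completed_Time"],
        }
        for row in snapshot
    }


def enrich(event, table):
    """Attach the current field values to an indexed event at search time."""
    key = (event["Request_ID"], event["Created_Time"])
    return {**event, **table.get(key, {})}  # unknown keys stay un-enriched


lookup_table = build_lookup(latest_snapshot)
results = [enrich(event, lookup_table) for event in indexed_events]
```

Each ticket is indexed exactly once, and its status comes from whatever snapshot was loaded most recently. The open question in this design is that newly created tickets would still need their key fields indexed with each snapshot, otherwise they would exist in the lookup file but never show up as events.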
I'm just learning how to use Splunk and this has been racking my brain.