Hi All,
I have a lookup file which changes frequently. Currently, we have a base CSV file physically on the server. Any changes are made to the server copy of the CSV file, and the updated CSV file is then uploaded to Splunk as a lookup table, so that the lookup file always has current data.
The drawback of this is that I can't keep track of changes within Splunk (I could log them in a file, but that's an overhead).
Lookup file format:
field1,field2,field3
The solution that I was trying is this:
1. Add history-related fields like changed_date and changed_by to the base CSV file:
changed_date,field1,field2,field3,changed_by
2. Add this base CSV file as a lookup table file in another app (say MgmtApp).
3. Have Splunk monitor this base CSV lookup file in MgmtApp into an index, say lookupHistory.
4. Use "Sideview Utils - The Lookup Updater" to add/update data (data is not deleted) in the base CSV lookup file. All changes (adds/updates) should go into the lookupHistory index with an updated timestamp.
5. Use a scheduled search to take the latest values for each field1,field2,field3 combination and use the outputlookup command to generate the actual lookup file in another app (say MainApp); a sketch of such a search is below.
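A minimal sketch of that scheduled search, assuming the CSV fields (field1, field2, field3) are extracted at search time, field1 is the lookup key whose latest values I want, and mainapp_lookup.csv is just a placeholder name:

    index=lookupHistory
    | sort 0 - _time
    | dedup field1
    | table field1, field2, field3
    | outputlookup mainapp_lookup.csv

The sort is only a safety net (events normally come back newest first), and saving the search in MainApp should make outputlookup write the file under that app.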
In theory this approach looked fine, as I would have the history data and a way to keep the latest lookup file with less manual intervention. But the problem I am facing is that every time I add/update a row using the Lookup Updater view, Splunk re-indexes the whole base CSV file (say the initial event count in index=lookupHistory is 50; I add a new row and the total event count becomes 101; if I update a value now, it becomes 152).
Is there a way to instruct Splunk to index just the new/updated data and not re-index the whole file?
Or, if you have any suggestions on what I am trying to achieve, they would be appreciated.
Alright. Yes, in hindsight it's a huge feature that was missing. I'll add something to my queue to log changes, probably via its own standalone log in /var/log/splunk.
Thanks, awaiting your new release.
You got it right on the money 🙂
The Lookup Updater itself is a very good tool; having this feature (storing history) will make it even more useful.
Am I right in thinking that the core requirement here is auditability/visibility of admin changes to the lookup? That you need to know who changes the lookup, when, and how?
It sounds like a good general feature that the Lookup Updater is currently lacking.
The values have to be updated manually in the base CSV file (that's the reason I tried the Lookup Updater, so that updates can be done from the UI instead of a manual file change and upload to the Splunk server). Another option (I guess you're suggesting the same) is to index the base CSV file once and after that just keep indexing the new changes (instead of monitoring the file, it would be a one-time indexing for every change), and then write a search to get the latest values into the lookup (using the outputlookup command). But this still leaves me with manually changing the file, copying it to the Splunk server, and indexing it manually.
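A minimal sketch of that one-time indexing step, assuming the Splunk CLI is available on the server (the path, index, and sourcetype names are placeholders):

    # index the changed copy of the file once into the history index
    $SPLUNK_HOME/bin/splunk add oneshot /path/to/base_lookup.csv -index lookupHistory -sourcetype lookup_history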
Can't you use outputlookup and keep all the records rather than updating them? Index and dedup all the records to get the latest record. But you would need to add the new columns you specified in order to keep track of them.
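A rough sketch of that idea, assuming a history lookup called lookup_history.csv that keeps every record along with the extra tracking columns, field1 as the key, and changed_date stored in a sortable (e.g. ISO) format; all names here are hypothetical:

    | inputlookup lookup_history.csv
    | sort 0 - changed_date
    | dedup field1
    | table field1, field2, field3
    | outputlookup current_lookup.csv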
It's a file which holds today's rate and is updated daily by users with the admin role. I have 6 such files.
Who updates the lookup file?