I've been looking in the docs and in Answers for a solution to this problem. Say, for example, I want to look up a customer table with a rising ID field. Fine and dandy. But what happens if the customer's details change, for instance their address or phone changes? Splunk already has an event with this identifier, and won't re-index it as far as I can tell. But even if it did, you'd then have two events for the customer.
What needs to happen is to make the first event go away e.g. using | delete.
But I don't see anywhere that DB Connect has this functionality. If the DB records change rarely, is there any other solution than giving this source its own index, then dropping the whole thing and re-indexing whenever there is a single change?
So...what gives?
You have a couple of options.
1) DB Connect will re-index a changed row if the rising_column value for that row is higher than the last rising_column value read. Using a 'modificationTime' column as the rising_column would do here. Your query would then need to use the dedup command to filter out duplicate rows.
2) Consider periodically reading the database in batch mode into a lookup file (or KV store). Each read would overwrite the existing lookup file so you'd only have the most recent data in Splunk.
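As a rough sketch of option 1, the search below keeps only the newest event per customer. The index, sourcetype, and field names (example_customers, customer_db, customer_id, and so on) are made up for illustration, and it assumes each event's _time reflects when the row was last modified:

```
index=example_customers sourcetype=customer_db
| dedup customer_id sortby -_time
| table customer_id name address phone
```

dedup keeps the first event it sees for each customer_id, so sorting newest-first leaves just the latest version of each row. The older events stay in the index; they are only filtered out at search time.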
Hello,
Can you please expand on how option 2 could be implemented? ("Consider periodically reading the database in batch mode into a lookup file (or KV store). Each read would overwrite the existing lookup file so you'd only have the most recent data in Splunk.")
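One way option 2 could look, assuming your DB Connect version provides the dbxquery command. The connection name, table, columns, and lookup file name are placeholders, not anything from a real environment:

```
| dbxquery connection="example_connection" query="SELECT customer_id, name, address, phone FROM customers"
| outputlookup customers_lookup.csv
```

Saved as a scheduled search (nightly, say), this rewrites customers_lookup.csv on every run, because outputlookup replaces the existing file unless you set append=true. Other searches can then enrich their events from it:

```
index=example_orders
| lookup customers_lookup.csv customer_id OUTPUT name address phone
```

For a KV store, point outputlookup at a lookup definition backed by a collection instead of a .csv file name.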