We correlate multiple data sources into a master lookup that also gets exported periodically, and we would like to implement a last_modified_timestamp field so that only new rows, or rows whose values have changed since the last export, are exported. What is the best way to compare what is currently in the KV store to what is being inserted into it? We have a lot of data and a lot of fields, so efficiency is key.
Is the KV updated by just a few well-maintained searches? i.e., will it only be updated in controllable, predictable ways?
If so, you can do this by adding a last_modified field to the KV:
| inputlookup example_kv | eval last_modified=now() | outputlookup example_kv
and then keeping it up to date by making sure that all of your updating queries set that last_modified field as well (using now() as the value).
If you do this, your updating query will go from something like this:
| inputlookup example_kv | search _key=544948df3ec32d7a4c1d9755 | eval CustName="Marge Simpson" | eval CustCity="Springfield" | outputlookup example_kv append=true
to something like this:
| inputlookup example_kv | search _key=544948df3ec32d7a4c1d9755 | eval CustName="Marge Simpson" | eval CustCity="Springfield" | eval last_modified=now() | outputlookup example_kv append=true
Then when you're exporting you can just search for
... | where last_modified>relative_time(now(), -1d@d) (adjust as needed).
The _key is key. Make sure you form the _key value for your rows so that an update targets precisely one row; then, when you also update a time-based column with the latest time, you should get the result you want.
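As a quick sketch of that idea (the field names here are hypothetical), you can build _key from a field that uniquely identifies the row, so repeated writes always land on the same row:
| makeresults | eval CustId="1001", CustName="Marge Simpson", CustCity="Springfield" | eval _key=CustId, last_modified=now() | outputlookup example_kv append=true
Because append=true replaces any existing row with the same _key, running this again for the same CustId updates that one row in place instead of creating a duplicate.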
Hi starcher. Thanks for replying. We have already tried that, but because we pull data every day that may or may not have changed, our "updates" are not true updates, so that solution was not sufficient for our purposes. We want to determine whether the data in a row has truly changed.
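One way to detect true changes (a sketch only, untested against your data) is to store a hash of the value-carrying fields alongside each row and compare it on ingest, writing out only the rows whose hash is new or different. This assumes example_kv is also the name of the lookup definition, that CustId uniquely identifies a row, and that a row_hash field has been added to the collection:
... your daily data search ...
| eval row_hash=sha256(coalesce(CustName,"") . "|" . coalesce(CustCity,""))
| lookup example_kv CustId OUTPUT row_hash AS stored_hash
| where isnull(stored_hash) OR row_hash!=stored_hash
| eval _key=CustId, last_modified=now()
| fields _key CustId CustName CustCity row_hash last_modified
| outputlookup example_kv append=true
Unchanged rows never reach outputlookup, so their existing last_modified values are preserved, and the export search can keep filtering on last_modified alone.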