Hello,
It's not the first time I've noticed this issue, but I can't find a workaround this time.
I'm trying to overwrite a KV store with a subset of a CSV file.
When I overwrite the KV store with a subset that contains more elements than the store, it works fine.
But when the subset is smaller than the KV store, it fails silently: there is no error, but the KV store is not modified.
| inputcsv accounts_temp
| eval key = username
| search account_type="TA"
| outputlookup append=false key_field=key accounts
When I look at the search.log:
01-22-2018 13:28:01.703 INFO outputcsv - 4616 events written to accounts_collection
But when I read the KV store back, I still see more than 31,000 elements:
| inputlookup accounts
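For reference, I'm counting the rows like this (nothing fancy, just stats over the same inputlookup):
| inputlookup accounts
| stats count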
Any idea what's going on?
Thanks
Here's what I found (with the help of the Perplexity engine) - it saved me:
It seems that as soon as you add the key_field argument, the append=false option is ignored (despite what the documentation says).
In my case I was trying to overwrite the collection with this:
| outputlookup append=false key_field=host_id <kv_lookup_ref>
I overcame the problem with the following approach:
| rename host_id as _key
| outputlookup <kv_lookup_ref>
This overwrote the collection successfully while still using my desired _key field (host_id) rather than system-generated _key values.
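Applied to the search in the original question, the same workaround should look something like this (an untested sketch that just swaps in the field and collection names from the question):
| inputcsv accounts_temp
| search account_type="TA"
| rename username as _key
| outputlookup accounts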
This worked beautifully, thank you!
This is not a bug; it's working as intended. You're doing
| outputlookup append=false key_field=key accounts
The important setting is key_field=key. This will update rows in your kv store, identified by the keys present in the results, and leave the others as they are. So if you have this in your accounts kv store:
_key val
1 foo
2 bar
3 baz
4 baf
and do
| makeresults count=2
| streamstats count as key
| eval val = "update"
| outputlookup append=f key_field=key accounts
your kv store will still have four rows but they will look like this:
_key val
1 update
2 update
3 baz
4 baf
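You can check this for yourself after running the search above; since _key starts with an underscore, copy it into a visible field first:
| inputlookup accounts
| eval key = _key
| table key, val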
If you want to change your entire lookup to what your search results are, no problem - either drop the key_field=key from your outputlookup and live with the system-generated keys (if you're doing your lookup based on another field such as key (no underscore), you might want to accelerate it), or do the following:
| inputcsv accounts_temp
| eval key = username
| search account_type="TA"
| append [| outputlookup accounts]
| outputlookup append=false key_field=key accounts
The subsearch for append in this search will run before the main search (as all subsearches do) and empty the entire kv store. Personally, I'd go with the former; the latter is more of a hack.
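For completeness, the former option applied to the search from the question would be roughly this (a sketch, not tested against your data):
| inputcsv accounts_temp
| eval key = username
| search account_type="TA"
| outputlookup append=false accounts
And if you keep looking things up by key (no underscore), accelerating that field in collections.conf might look something like this - accounts_collection is taken from your search.log snippet, and the acceleration name is arbitrary:
[accounts_collection]
accelerated_fields.key_accel = {"key": 1}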
I have this problem on a somewhat larger scale... mine's an ~2,000,000-row kvstore refusing to be squished down to ~400,000 rows.
The use case is an active-session state table where one scheduled search adds new rows and updates existing rows, and another scheduled search prunes old expired sessions.
The latter search just plain doesn't work - the size of the kvstore keeps creeping up. The only solution is to periodically purge the entire kvstore with a bare
| outputlookup huge_kvstore
I'd be a lot less perplexed if this were a problem that didn't previously exist.
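Based on the rename/overwrite workaround earlier in this thread, I'd expect a prune along these lines to rewrite the collection with only the live rows (a rough sketch - last_seen and the 24-hour window are placeholders, not my actual saved search, and I'd verify first whether the original _key values survive the round trip or get regenerated):
| inputlookup huge_kvstore
| where last_seen >= relative_time(now(), "-24h")
| outputlookup huge_kvstore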