Hi All,
Hoping someone out there can help me unravel the mystery I'm currently facing.
We have a KV Store that we use to hold MISP values, which is checked against when running various security alerts.
We have 3 searches that query the MISP data source and, based on the results, should add any new entries into the KV Store.
Basics of the search we run are below:
| misp command to get new records in last 24hrs
| bunch of evals to format data
| append
[| inputlookup MispKVstore]
| dedup
| outputlookup append=false MispKVstore
We have this running 3 times to get details for different types of values - but all are stored in the same KVstore.
The issue we are having is that once we reach 50 rows in the KV Store, updates are not being made as expected.
Each time a search runs, it adds new entries for its own category, but seems to delete / discard the values added by the other searches.
All column names are consistent between the searches. I have updated Max_rows_per_query as we thought we might be hitting the 50k limit, but this has not resolved the issue.
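For what it's worth, here is a minimal Python sketch (purely illustrative - the 5-row cap, row names, and function are all made up, not Splunk APIs) of the failure mode we suspect: if the `| inputlookup` subsearch is capped, then the `append=false` rewrite silently drops whatever the cap excluded.

```python
# Hypothetical model of the search pattern: new rows are appended to a
# capped read-back of the KV Store, deduped, and written back with
# append=false (i.e. a full overwrite).
SUBSEARCH_LIMIT = 5  # stand-in for Splunk's 50k subsearch cap

def run_search(new_rows, kvstore):
    existing = kvstore[:SUBSEARCH_LIMIT]     # | append [| inputlookup ...] (truncated)
    combined = new_rows + existing           # new results + read-back
    deduped = list(dict.fromkeys(combined))  # | dedup (order-preserving)
    return deduped                           # | outputlookup append=false (overwrite)

kvstore = [f"old{i}" for i in range(8)]      # 8 existing entries, 3 over the cap
kvstore = run_search(["new1", "new2"], kvstore)
print(kvstore)  # old5..old7 are silently lost in the overwrite
```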
Seeking any tips, tricks, troubleshooting advice anyone is able to give to help get this sorted.
Thanks in advance 🙂
Do you mean when you reach 50 rows or 50k rows?
Anyway, I wonder why you are rewriting the whole kv store each time rather than only adding new rows, which you could do with
| misp command to get new records in last 24hrs
| bunch of evals to format data
| lookup MispKVstore fields_to_determine_existence OUTPUT _key
| outputlookup append=t MispKVstore
then you would only add new records (and update existing depending on how you do the lookup)
As for the overwrites you are seeing - if you are up at 50k then the subsearch will have a 50k limit, but changing to use the lookup form above would solve that.
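If it helps to see the idea outside SPL, here's a rough Python model of that upsert behaviour (field names like `value` are placeholders for whatever fields determine existence in your data - this is a sketch, not a Splunk API):

```python
# Rough model of lookup-then-append: rows whose key already exists in
# the store update in place; rows with no match insert as new records.
# Nothing else in the store is touched, so no overwrites across searches.
def upsert(new_rows, kvstore):
    # kvstore: dict of key -> record, mimicking the KV Store collection
    for row in new_rows:
        key = row["value"]   # assume "value" determines existence
        kvstore[key] = row   # update if present, insert if new
    return kvstore

store = {"1.2.3.4": {"value": "1.2.3.4", "category": "ip"}}
store = upsert([{"value": "evil.com", "category": "domain"}], store)
print(sorted(store))  # both the old and the new entry survive
```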
You comment: "if you are up at 50k then the subsearch will have a 50k limit". Does this apply even to an append search where we are doing | inputlookup - i.e. because this is not the main search, are we only appending the first 50k rows from the lookup?
Your append with the inputlookup is a subsearch, so that will only return the first 50k rows of the lookup file.
It's easy to test this. Just run this
| makeresults
| eval c=0
| append [
| inputlookup yourlookup.csv
]
| streamstats c
| table c
You will see 50001 results with values of c from 0 to 50,000
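Same idea in a quick Python analogue, if you want to reason about it offline (the hard-coded 50k cap mirrors the default subsearch limit; the function is illustrative, not a Splunk API):

```python
# An append that silently truncates its subsearch makes the row count
# top out at limit + 1, no matter how big the lookup file is.
SUBSEARCH_LIMIT = 50000

def append_with_limit(base, lookup_rows):
    # base: the single makeresults row; lookup_rows: the full lookup file
    return base + lookup_rows[:SUBSEARCH_LIMIT]

lookup = list(range(60000))               # a 60k-row lookup file
results = append_with_limit([0], lookup)  # makeresults row + capped append
print(len(results))  # 50001 - the other 10k rows never arrive
```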
Hi @bowesmana,
The search was one that pre-existed my involvement and I'm just troubleshooting it.
Just a note: if I run the search without the outputlookup, the returned results are accurate for what should be written to the KV Store, and for that specific search the results do get added. It's just that the next search, which adds a different category of data, seems to overwrite the results from the previous search.
I did think about just adding new values and had tried doing a NOT in the search, but this seemed to take a long time.
I'll try your method tomorrow to see how that performs and let you know if it resolves my issues.
Do you have a search head cluster? I am not totally sure about KV store sync between nodes in a search head cluster. Do these searches run as saved searches and if so, when and where. Do they run at the same time as each other? I wonder if there is some issue around syncing the changes in a clustered environment.
Not my area of expertise though...
We will run as a cluster at some point, but right now - no, this is just a single SH, and the instance I'm testing on is an all-in-one server, so there won't even be any lag from sync between SH and IDX.
The search is scheduled and runs every hour. Initially all the searches were set to run at the same time, but I have adjusted this and now they run 20 mins apart.