
Is there any KV Store Limits setting Documentation??

Contributor

The docs list a number of kvstore settings in limits.conf that can be tweaked. However, very little is documented about what they do or when they should be modified. For example, for large-scale KV Store collections of a million+ rows that are regularly updated or appended to, I am going to guess that at least one of these might need to be modified.

For example, I currently have a large KV lookup that I am trying to append to but it is failing after a certain number of rows - almost as if it hits some limit then stops.

Here are the settings that are somewhat cryptic to me - I am hopeful that someone could maybe just reply to let us know when each one of these should be modified and why.

max_queries_per_batch = <unsigned int>
* The maximum number of queries that can be run in a single batch
* Defaults to 1000

max_size_per_result_mb = <unsigned int>
* The maximum size of the result that will be returned for a single query to a
  collection in MB.
* Defaults to 50 MB

max_size_per_batch_save_mb = <unsigned int>
* The maximum size of a batch save query in MB
* Defaults to 50 MB

max_documents_per_batch_save = <unsigned int>
* The maximum number of documents that can be saved in a single batch
* Defaults to 1000

max_size_per_batch_result_mb = <unsigned int>
* The maximum size of the result set from a set of batched queries
* Defaults to 100 MB

max_rows_in_memory_per_dump = <unsigned int>
* The maximum number of rows in memory before flushing it to the CSV projection
  of KVStore collection.
* Defaults to 200
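
To make the two batch-save limits above concrete, here is a minimal Python sketch of how a client could split documents into batches that stay under both a document-count cap and a size cap. The function name `split_into_batches` and the JSON-based size estimate are assumptions for illustration only; this is not how Splunk enforces the limits internally.

```python
import json


def split_into_batches(docs, max_docs=1000, max_mb=50):
    """Yield lists of docs that respect a count cap and a size cap.

    max_docs mirrors max_documents_per_batch_save and max_mb mirrors
    max_size_per_batch_save_mb (both hypothetical stand-ins here).
    """
    max_bytes = max_mb * 1024 * 1024
    batch, batch_bytes = [], 0
    for doc in docs:
        # Assumption: size is estimated from the JSON-encoded document.
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch if adding this doc would break either cap.
        if batch and (len(batch) >= max_docs or batch_bytes + doc_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


if __name__ == "__main__":
    rows = [{"id": i} for i in range(2500)]
    print([len(b) for b in split_into_batches(rows)])
```

Lowering `max_docs` produces more, smaller batches, which is the trade-off to keep in mind when changing these values.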

Re: Is there any KV Store Limits setting Documentation??

Contributor

I've been pushing tens of millions of rows into KV Store on Splunk 6.3.3 (on Windows). During large data inputs we occasionally ran into issues with the load not completing. I worked with Splunk support and changed one setting to resolve the issue.

[kvstore]
max_documents_per_batch_save = 500

This slows down the data input, but the load completed.
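
As a rough illustration of why a smaller batch size slows things down, the sketch below counts how many batch saves a load needs at a given batch size. The helper `batches_needed` is hypothetical, and the assumption that run time scales with the number of batches is mine, not something Splunk support stated.

```python
def batches_needed(rows, docs_per_batch):
    """Return how many batch saves are needed to write `rows` documents."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-rows // docs_per_batch)


if __name__ == "__main__":
    for size in (1000, 500):
        print(size, batches_needed(8_000_000, size))
```

For an 8,000,000-row load, halving the batch size from 1000 to 500 doubles the number of batch saves from 8,000 to 16,000.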

Splunk 6.4.x has improvements to the KV Store, according to Splunk support (I have not tested this myself). The output I received from support was:

Used search: index=kvcheck1 | outputlookup kvstorecoll
6.3.3 result (failure)

Job inspector
This search did not successfully execute. Any results returned from this job are not consistent and should not be used.

And our favorite entry:
search.log
06-13-2016 10:04:59.730 ERROR KVStorageProvider - An error occurred during the last operation ('saveBatchData', domain: '2', code: '4'): Failed to read 4 bytes from socket within 300000 milliseconds.
06-13-2016 10:04:59.748 ERROR KVStoreLookup - KV Store output failed with code -1 and message '[ "{ \"ErrorMessage\" : \"Failed to read 4 bytes from socket within 300000 milliseconds.\" }" ]'


6.4.1 result (success)

Job inspector
This search has completed and has returned 1,999,744 results by scanning 1,999,744 events in 534.54 seconds.

Duration (seconds)  Component             Invocations  Input count  Output count
325.37              command.outputlookup  1            8,000,000    8,009,664

As stated, the socket timeout problem (a Windows-only issue!) should be fixed in the 6.4.x releases.

So for most purposes, up to 70,000,000+ rows, the defaults work well in my experience.

Re: Is there any KV Store Limits setting Documentation??

Builder

Thanks,
I'm going to develop an app for a customer who needs to track the status of 4-5 million SIMs connecting to their systems, and I was worried about hitting a limit. So far I have simulated the app with just 200K unique SIMs, and the "extract new connections -> inputlookup in append -> update -> outputlookup" pipeline is working quite well, but I have no idea how it will perform with 5 million entries in the KV Store.

Marco


Re: Is there any KV Store Limits setting Documentation??

New Member

Doesn't Splunk have a 4 GB RAM limitation? If so, won't a KV Store with that many rows (7 crore, i.e. 70 million) cause problems?
