Splunk Search

CSV vs KV store lookup. How large is large?

MonkeyK
Builder

The documentation comparing CSV and KV Store lookups notes that KV Store is preferred over CSV for large lookups:
http://dev.splunk.com/view/SP-CAAAEY7#kvsvscsv

What is the definition of "large"? Is it measured in total bytes? Number of records? And in either case, how much?
I have also read that up to a point, CSV would be preferred because it gets loaded into memory. What is that point?


hunters_splunk
Splunk Employee

Hi MonkeyK,

CSV lookups are preferred for small tables that change infrequently; most CSV lookups contain no more than 100 rows of data.
KV Store is designed for large key-value data collections that change frequently, for example:

– Tracking workflow state changes (an incident-review system)
– Keeping a list of environment assets assigned to users, along with their metadata
– Controlling a job queue or application state as the user interacts with the app

KV Store can:
– Enable per-record CRUD operations using the lookup commands and the REST API
– Access key-value data seamlessly across a search head cluster
– Back up and restore KV Store data

Optionally, KV Store can also:
– Allow data type enforcement on write operations
– Perform field accelerations and automatic lookups
– Work with distributed searches on the search peers (indexers)
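For example, per-record access with the lookup commands might look like the two separate searches below, run against a hypothetical KV Store lookup named incident_lookup with fields incident_id and status (all names invented for illustration):

```
| inputlookup incident_lookup where status="open"

| makeresults
| eval incident_id="INC-1234", status="open"
| outputlookup append=true incident_lookup
```

The first search reads only the matching records; the second appends a single new record without rewriting the whole table, which is the kind of per-record operation a CSV lookup cannot do.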

Hope this helps. Thanks!
Hunter


MonkeyK
Builder

Doesn't really help. I was looking for a more quantitative definition of "large", and now you have added another undefined term, "frequently". How do I evaluate "large" and "frequently"?

Most of what I am seeing are qualitative criteria that have little to do with anything that I know about.

I am interested in understanding the quantitative measures in general, but right now I am evaluating them against a specific use case: creating and maintaining Indicator of Compromise lists. These would be on the order of 100-1000 IP addresses or URLs (separate lists). Each would be used once in a local search and then accumulated into a central list used as part of a nightly search/alert. Each day, any indicators older than a set amount of time (say, two weeks) would be removed.
I know that I can implement my use case with CSV-based lookups, but I do not know how important it is or will be to consider KV Store.
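For reference, the nightly aging step I have in mind would look roughly like this, assuming a lookup named ioc_list with an epoch-time field added recording when each indicator was collected (both names invented for illustration):

```
| inputlookup ioc_list
| where added > relative_time(now(), "-14d@d")
| outputlookup ioc_list
```

Reading the list, keeping only rows newer than two weeks, and writing it back works the same way whether ioc_list is a CSV or a KV Store lookup, which is part of why I am unsure the distinction matters at this scale.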


MonkeyK
Builder

As a follow-up, I have built out some CSV-based lookup tables with several thousand records. No problems in the searches.


myu_splunk
Splunk Employee

Hi MonkeyK, if you have a lookup table file that is 100 MB or larger, I'd consider that a large lookup table file. As for the CSV vs. KV Store question: KV Store collections live on the search head and are not passed down to the indexers, while CSV lookups are replicated to the indexers. So if your lookup table changes frequently, that repeated replication could lead to performance problems.
