Splunk Enterprise

KV store scalability

yoho
Contributor

I'm building an app in which Splunk receives a large number of IDs, each with a property that I need to sum over time. Take this simple example:

_time   id   amount
00:00   A    1
00:01   C    8
01:01   B    2
01:02   A    4

At 01:03, a user asks: "What is the sum of amount per id, for each id you've seen in the last 15 minutes?" Keep in mind that this is a very limited example; in practice there will be millions of unique ids and a high event rate (hundreds of events per second). The expected answer is:

id   sum(amount)
A    5
B    2

My first reflex was something like "stats sum(amount) earliest=0 by id [ |search id ]" over the last hour. But that is not really scalable: over time, the first part of the query has to sum more and more events.

My second thought was to add an intermediate summary index that computes sum(amount) over a small period of time (15 min) and keeps the result. Yes, that speeds things up, but after a year or so I will still have performance and scalability issues.
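To make the summary-index idea concrete, here is a rough sketch in plain Python (not SPL, and not how Splunk implements summary indexing). It collapses the sample events into one row per 15-minute bucket and id, then computes the all-time sum by scanning the summary rows. The names `summarize`, `totals_from_summary`, and `to_minutes` are made up for illustration, and the `_time` values are assumed to be hh:mm.

```python
from collections import defaultdict

BUCKET_MINUTES = 15  # summary span mentioned in the post


def to_minutes(hhmm):
    """Convert an 'hh:mm' string (assumed format) to minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def summarize(events):
    """Collapse raw (time, id, amount) events into one row per (bucket, id)."""
    summary = defaultdict(int)
    for time, event_id, amount in events:
        bucket = to_minutes(time) // BUCKET_MINUTES
        summary[(bucket, event_id)] += amount
    return dict(summary)


def totals_from_summary(summary, ids):
    """All-time sum per id, obtained by scanning every summary row."""
    totals = defaultdict(int)
    for (_bucket, event_id), amount in summary.items():
        if event_id in ids:
            totals[event_id] += amount
    return dict(totals)


events = [("00:00", "A", 1), ("00:01", "C", 8), ("01:01", "B", 2), ("01:02", "A", 4)]
summary = summarize(events)
print(totals_from_summary(summary, {"A", "B"}))  # {'A': 5, 'B': 2}
```

The scan in `totals_from_summary` still touches every bucket ever written for an id, so the work per query keeps growing with the age of the data, which is the scalability concern described above.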

In the end, the only thing I need is to keep the latest "sum(amount)" per id and keep counting from that value every time a new event arrives. That's why I'm wondering whether we could use the KV store to keep counting: every time we see a record, we simply update the stored record with:

  • key = id
  • value = value(id) + amount
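The update rule above can be sketched in plain Python, with a dictionary standing in for the KV store collection. This is only a model of the logic, not the Splunk KV store API: the class `RunningTotals` and its methods are hypothetical, and the `last_seen` field is an extra assumption added so that "ids seen in the last 15 minutes" can be answered from the stored records alone.

```python
def to_minutes(hhmm):
    """Convert an 'hh:mm' string (assumed format) to minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


class RunningTotals:
    """In-memory stand-in for a collection keyed by id (hypothetical)."""

    def __init__(self):
        # id -> {"total": running sum, "last_seen": minutes of latest event}
        self.records = {}

    def update(self, time, event_id, amount):
        """key = id, value = value(id) + amount."""
        record = self.records.setdefault(event_id, {"total": 0, "last_seen": 0})
        record["total"] += amount
        record["last_seen"] = max(record["last_seen"], to_minutes(time))

    def totals_seen_since(self, now, window_minutes=15):
        """Running total for every id seen within the window ending at `now`."""
        cutoff = to_minutes(now) - window_minutes
        return {
            event_id: record["total"]
            for event_id, record in self.records.items()
            if record["last_seen"] >= cutoff
        }


store = RunningTotals()
for time, event_id, amount in [("00:00", "A", 1), ("00:01", "C", 8),
                               ("01:01", "B", 2), ("01:02", "A", 4)]:
    store.update(time, event_id, amount)

print(store.totals_seen_since("01:03"))  # {'A': 5, 'B': 2}
```

Each event costs one read and one write regardless of how much history exists, which is the appeal of this approach. The sketch ignores concurrent writers: a read-modify-write like `update` would need some form of serialization if several processes updated the same id at once.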

Has anyone had a similar experience with the KV store, or faced a similar issue?


yoho
Contributor

I made a mistake but can't edit my message: please read "over the last 15 minutes" instead of "over the last hour" in the sentence below the second table.
