Splunk Enterprise

KV store scalability

yoho
Contributor

I'm building an app where Splunk receives a large number of IDs, each with a property (amount) that I need to sum over time. Here is a simple example:

_time   id   amount
00:00   A    1
00:01   C    8
01:01   B    2
01:02   A    4

 

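At 01:03, a user asks: "What is the sum of amount per id, for each id you've seen in the last 15 minutes?" Note that the sum covers each id's full history, not just the last 15 minutes; the 15-minute window only selects which ids to report. Keep in mind that this is a deliberately small example: in practice there will be millions of unique ids and a high event rate (hundreds of events per second). The expected answer is: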

id   sum(amount)
A    5
B    2

 

My first reflex was something like this: "stats sum(amount) earliest=0 by id [ |search id ]" over the last hour. But it's not really scalable since, over time, the first part of the query has to sum more and more events.

My second thought was to add an intermediate summary index that computes sum(amount) over a short period (15 min) and stores the result. That does speed things up, but after a year or so I will still have performance and scalability issues.

In the end, the only thing I need is to keep the last value of sum(amount) per id and continue counting from that value every time a new event arrives. That's why I'm wondering whether we could use the KV store to keep counting: every time we see a record, we simply update the stored record with:

  • key = id
  • value = value(id) + amount
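
To make the idea concrete, here is a minimal Python sketch of the running-total update described above. It is only an illustration: a plain dict stands in for the KV store collection, and the names `store`, `record_event`, `totals_for_recent_ids` and the `last_seen` field are hypothetical, not part of any Splunk API. The `last_seen` field is an assumption added so that the "ids seen in the last 15 minutes" filter can be answered without going back to the raw events.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory stand-in for a KV store collection:
# one record per id, holding the running total and the last time the id was seen.
store = {}

def record_event(event_time, event_id, amount):
    """Upsert: value(id) = value(id) + amount, and remember when the id was last seen."""
    rec = store.setdefault(event_id, {"total": 0, "last_seen": event_time})
    rec["total"] += amount
    rec["last_seen"] = max(rec["last_seen"], event_time)

def totals_for_recent_ids(now, window=timedelta(minutes=15)):
    """All-time total for every id seen within the window ending at `now`."""
    return {
        event_id: rec["total"]
        for event_id, rec in store.items()
        if now - rec["last_seen"] <= window
    }

def at(hhmm):
    """Parse an HH:MM string from the example table into a datetime."""
    return datetime.strptime(hhmm, "%H:%M")

# Replay the four events from the example table.
for t, i, a in [("00:00", "A", 1), ("00:01", "C", 8), ("01:01", "B", 2), ("01:02", "A", 4)]:
    record_event(at(t), i, a)

result = totals_for_recent_ids(at("01:03"))
print(result)  # {'A': 5, 'B': 2}
```

With this layout the cost of a query depends on the number of ids rather than on the number of events ever received. Two caveats with the sketch: the read-modify-write in `record_event` is not atomic, so concurrent writers could lose updates, and the filter on `last_seen` scans every record, which with millions of ids would presumably need some kind of index on that field.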

Has anyone had a similar experience with the KV store, or run into a similar issue?


yoho
Contributor

I made a mistake but can't edit my message: in the sentence below the second table, please read "over the last 15 minutes" instead of "over the last hour".
