Incremental Update of Events

New Member

We would like to use Splunk to build dashboards for business-level metrics. For these metrics, we would like to populate the "current day" information and then update the metric (event) as the day continues. The resulting metric would look something like:


2013-08-20,Apples Sold,200

As the day progresses, we would like to update that count (300 apples, 400 apples, etc.), so that at the end of the day there is one event that contains the total number of apples for the day.

While we could run incremental queries (count the number of apples sold since the last query) and then aggregate those metrics into a single number, in this case it's far more efficient to simply have the database produce the number.

With ElasticSearch we'd define the "key" to be the date and the metric. That way, each time the same row is seen, it would be updated. Is there a way to achieve similar functionality in Splunk?

0 Karma



Let me start off by saying that, with my background in accounting, I empathize with you. My first degree was in "Data Processing, Business Option", which tells you it has been a few decades. It would be nice to know there was a ledger somewhere that has the up-to-the-moment number that is the exact net number of apple boxes we have in the warehouse right now. Of course, in reality, that number is already off: a couple of boxes have been moved in with the rutabagas, an extra box wasn't counted in the last inventory, a few have spoiled, and one box was juiced for an employee's birthday party.

So, this beautiful idea of having a single (wishfully accurate) number stored in a FIELD somewhere... that's different thinking from Splunk's inherent paradigm. Calling Splunk a database is one of the giveaways. Splunk's more like a crowd of really fast trained squirrels who can sort through your junk closets with lightning speed, and will do exactly what you tell them... even if that has NO discernible relationship to what you want.


Okay, two things here - first, updating the "database" every time a transaction is entered is not necessarily more efficient than calculating the value when you need it. That is true even in relational databases - storing the value of a field you can calculate at will is discouraged in many (most?) modern systems design methodologies.

Regarding Splunk, I just don't want to be adding/updating/deleting records from the indexes willy-nilly. The design, code and maintenance overhead (as well as ensuring the ACID factors) could completely eat my lunch, and human time is much more expensive than machine time. At any given moment, I would have to figure out which events (transactions) had already been processed into the prior summary record, then MARK them somehow when I produce the new summary, and identify which records were relevant and which weren't... ouch. Just not my idea of a good time.

Much simpler, and more efficient, to add up whatever has come in up to this point in time, and present the number.
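
To make that concrete, here is a rough sketch in Python (not Splunk syntax). The event layout and the name `total_so_far` are made up for illustration; the only point is that the running total falls out of the raw events whenever you ask for it, with nothing to update in place.

```python
from datetime import date

# Hypothetical raw events: each sale is appended as it arrives and never updated.
events = [
    {"day": date(2013, 8, 20), "metric": "Apples Sold", "count": 200},
    {"day": date(2013, 8, 20), "metric": "Apples Sold", "count": 100},
    {"day": date(2013, 8, 20), "metric": "Apples Sold", "count": 100},
    {"day": date(2013, 8, 19), "metric": "Apples Sold", "count": 350},
]

def total_so_far(events, day, metric):
    """Sum every matching event seen so far; no stored running total to maintain."""
    return sum(
        e["count"] for e in events
        if e["day"] == day and e["metric"] == metric
    )

print(total_so_far(events, date(2013, 8, 20), "Apples Sold"))  # 400
```

Each time the dashboard refreshes, the same calculation simply runs over whatever events exist at that moment.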

On the other hand ---


Second, you CAN calculate your daily results and put them into an end-of-day summary event, so that your in-day searches would only contain the prior day's summary event and the current day's new events. That's going to be pretty efficient, and it is typical Splunk usage as designed.

Review the "collect" command in the Splunk Search Reference documentation.
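
The summary approach can be sketched the same way, again in Python rather than Splunk syntax and with hypothetical names (`summarize_closed_days`, `dashboard_totals`) and a made-up event layout. Closed days are rolled up once into one record each, and only today's events get summed live.

```python
from collections import defaultdict
from datetime import date

def summarize_closed_days(raw_events, today):
    """Roll every day before `today` into one total per (day, metric)."""
    totals = defaultdict(int)
    for e in raw_events:
        if e["day"] < today:
            totals[(e["day"], e["metric"])] += e["count"]
    return dict(totals)

def dashboard_totals(summaries, raw_events, today):
    """Combine the stored summaries with a live sum over today's events only."""
    result = dict(summaries)
    for e in raw_events:
        if e["day"] == today:
            key = (e["day"], e["metric"])
            result[key] = result.get(key, 0) + e["count"]
    return result

# Hypothetical data: one closed day and three sales so far today.
today = date(2013, 8, 20)
raw_events = [
    {"day": date(2013, 8, 19), "metric": "Apples Sold", "count": 350},
    {"day": today, "metric": "Apples Sold", "count": 200},
    {"day": today, "metric": "Apples Sold", "count": 100},
    {"day": today, "metric": "Apples Sold", "count": 100},
]

summaries = summarize_closed_days(raw_events, today)  # done once, at end of day
totals = dashboard_totals(summaries, raw_events, today)
print(totals[(today, "Apples Sold")])  # 400
```

The summaries never change once a day is closed, so nothing ever has to be updated or marked as already processed.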

0 Karma

Super Champion

Have you tried using the Other > Today time range?

0 Karma