Splunk Search

How do you chart a cumulative sum?

Communicator

I'm calculating the sum of spending over a month period.

* | timechart sum(value) span=1mon

This will produce the cumalative amount, but it won't show you how you arrived at the amount in day incements. Changing the span to 1 day, doesn't produce the desired result nor does bucketing ahead of the timechart.

 * | timechart sum(value) span=1mon

How do you achieve this without some major delta hack?

1 Solution

Communicator

The answer is not pretty but it works, thanks Ayn.

enter code here| reverse | accum value as totalvalue | timechart last(totalvalue) span=1d

View solution in original post

SplunkTrust
SplunkTrust

you want to use the streamstats command.

1) simple example, running the timechart first and using streamstats to create the cumulative total on the timechart output rows.

* | timechart count| streamstats sum(count) as cumulative

2) similar, but with a field value instead of the count:

index=_internal source=*metrics.log group=per_sourcetype_thruput | timechart sum(kb) as totalKB | streamstats sum(totalKB) as totalCumulativeKB

3) If you want to go the other way, and use streamstats on the raw events, you can do that, but then you have to use the reverse command.

index=_internal source=*metrics.log group=per_sourcetype_thruput | reverse | streamstats sum(kb) as cumulativeKB | timechart max(cumulativeKB)

4) And streamstats also allows a 'by' term, so for example it can keep track of all of these cumulative numbers separately by some field value like 'series':

With the streamstats before the reporting command:

index=_internal source=*metrics.log group=per_sourcetype_thruput | reverse | streamstats sum(kb) as cumulativeKB by series | timechart max(cumulativeKB) by series

and last but not leasat, if you want to use the other way and use streamstats after the reporting command, you have to get a little more hands-on with stats and bin.

index=_internal source=*metrics.log group=per_sourcetype_thruput | bin _time span=1h | streamstats sum(kb) as totalKB by _time series | timechart sum(totalKB) by series

Motivator

I found this to work well: statsing by _time and your split-by field. It is more efficient as you are not potentially reversing hundreds of thousands of events: index=_internal source=*license_usage.log type=Usage | eval MB=b/1024/1024 | bucket _time span=1h | stats sum(MB) as MB by st | streamstats sum(MB) as MB by st | timechart span=1h last(MB) as MB by st is a nice cumulative graph of indexing today by sourcetype. And over a day it creates 25 rows, which is a lot easier to reverse than 800,000 original log lines...

0 Karma

Communicator

The answer is not pretty but it works, thanks Ayn.

enter code here| reverse | accum value as totalvalue | timechart last(totalvalue) span=1d

View solution in original post

Motivator

This will be a lot more efficient if you do a stats table first, statsing by _time and the value you're interested in. Plus, you won't need a reverse anymore, because the stats table will be ordered by time: search ... | bucket _time span=1d | stats sum(value) as value | accum value as totalvalue | timechart last(totalvalue) span=1d - use the same bucketing of _time as the span in timechart and splunk has even less work to do in the timechart.

0 Karma

Builder

I used the same function but but it is accumulating all the sum of 3 status where i want to accumulate the sum by status. Any idea how can i do that ?
| accum value as totalvalue | timechart last(totalvalue) by STATUS

0 Karma

Legend

You could use accum to create the cumulative sum and then do a timechart last() on this sum to get the last value at the breakpoint of each interval and finally arriving at the total sum:

... | accum value as totalvalue | timechart last(value) span=1d

Communicator

Ironically it produces the opposite result. It's stepping down from the cumulative total.
| accum value as totalvalue | timechart last(totalvalue) span=1d