Splunk Search

tracking number of indexed events: how to chart, how to summarize

Communicator

The following tells me how many events I'm indexing every 5 minutes.

index="_internal" group="thruput" | bucket _time span=5m | timechart span=5m sum(instantaneous_eps)

The graph isn't accurate: it shows the number of results in each bucket. It'd be nice if it looked better.

More important, it means I have a large number of rows to look through. What I'd really like is to have a per-day min/max/avg of these bucketed events. In other words, summarize the 5-minute span of events into three per-day numbers.

Can someone show off their advanced Splunk skills on this for me?

1 Solution

SplunkTrust

some notes -

The bucket command in your search is redundant, since timechart will call bucket automatically.

And maybe you know what you're doing, but I'm suspicious of sum(instantaneous_eps). I don't know what it means and I'm curious as to what you think it means. It definitely cannot be safely treated as though it were a number of events corresponding uniquely to that event. (I believe in 4.2 they are finally adding an eventCount=12321 to each thruput line.)

But assuming that you're interested in the lowest, highest and median values for instantaneous_eps, you can pipe to timechart twice, as below:

index="_internal" group="thruput" sourcetype="splunkd" | timechart span=5m max(instantaneous_eps) as max min(instantaneous_eps) as min median(instantaneous_eps) as median | timechart span=1d max(max) as max min(min) as min median(median) as median

The first pipe gives you min, max and median for the 5-minute buckets, and the second takes the max(max), min(min) and median(median) over 1-day buckets.
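For anyone who wants to see the two-stage rollup outside Splunk, here is a minimal Python sketch of the same idea. The function names and the sample values are hypothetical, and the buckets are aligned to epoch seconds (UTC) for simplicity, which is an assumption and not necessarily how timechart snaps its buckets.

```python
from collections import defaultdict
from statistics import median

FIVE_MINUTES = 300
ONE_DAY = 86400

def five_minute_stats(samples):
    """First stage: min, max and median of the values in each 5-minute bucket."""
    buckets = defaultdict(list)
    for ts, eps in samples:
        buckets[ts - ts % FIVE_MINUTES].append(eps)
    return {
        start: {"min": min(v), "max": max(v), "median": median(v)}
        for start, v in buckets.items()
    }

def daily_stats(five_minute):
    """Second stage: min(min), max(max) and median(median) per day."""
    days = defaultdict(lambda: {"min": [], "max": [], "median": []})
    for start, stats in five_minute.items():
        day = start - start % ONE_DAY
        for name, value in stats.items():
            days[day][name].append(value)
    return {
        day: {
            "min": min(cols["min"]),
            "max": max(cols["max"]),
            "median": median(cols["median"]),
        }
        for day, cols in days.items()
    }

# Hypothetical (epoch_seconds, instantaneous_eps) samples, not real metrics.log data.
samples = [(0, 10.0), (30, 20.0), (300, 5.0), (330, 7.0), (86400, 1.0), (86430, 3.0)]
result = daily_stats(five_minute_stats(samples))
```

Each day ends up with three numbers, which is the shape of output the second timechart produces.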


SplunkTrust

You're saying that max-1d(max-5m(x)) is the same as max-1d(x) because the metrics log happens to write an event only every 5 minutes? Because that's definitely not true generally.

Communicator

I hadn't considered that sum(instantaneous_eps) would be skewed by treating the values as independent events. And you've both made me realize that it isn't necessary, since min/med/max of the bucketed value is sufficient. Thanks.

Splunk Employee

Hmm, I would actually just do ... | timechart span=5m avg(instantaneous_eps) as eps | timechart span=1d max(eps) min(eps) median(eps), since the way you have it above, the first timechart isn't needed. (max-1d(max-5m(x)) is just max-1d(x), same for min(), and same for median() assuming equal numbers of observations in every equal-size interval, which we do get in the metrics log.) But I also question the meaning of instantaneous_eps, and question even more the meaning of summing or averaging that value over 5 minutes in the first place.
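The nesting claims are easy to sanity-check with a few made-up numbers. The Python sketch below uses hypothetical values split into equal-size groups (standing in for 5-minute buckets within one day). Max and min nest cleanly, but with this particular data the median of the group medians differs from the overall median even though every group has the same size, so the median case is worth checking against real data before relying on it.

```python
from statistics import median

def chunks(values, size):
    """Split values into consecutive groups of `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]

# Hypothetical observations for one day, three per bucket.
values = [1, 2, 9, 3, 4, 5, 6, 7, 8]
groups = chunks(values, 3)  # [[1, 2, 9], [3, 4, 5], [6, 7, 8]]

max_of_max = max([max(g) for g in groups])
min_of_min = min([min(g) for g in groups])
median_of_medians = median([median(g) for g in groups])

print(max_of_max == max(values))            # True
print(min_of_min == min(values))            # True
print(median_of_medians, median(values))    # 4 5
```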
