Hi Splunkers,
I want to create an instance overview dashboard, and one KPI should be today's estimated indexing volume. The daily traffic varies greatly by time of day (significantly more during working hours, much less at night), which makes it hard to simply sum up the already indexed data and extrapolate the value.
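For context, the intraday pattern is easy to see with an hourly breakdown (same filters as my search below, just bucketed by hour over the last week):
index=_internal source=*metrics.log* sourcetype=splunkd group=per_host_thruput host=indexhost* earliest=-7d@d | timechart span=1h sum(kb) AS hourly_kb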
Currently I am trying to get a value by using a rolling average with streamstats like this:
index=_internal source=*metrics.log* sourcetype=splunkd group=per_host_thruput host=indexhost* earliest=@d | timechart per_day(kb) as daily | streamstats window=0 avg(daily) as davg
The davg value is displayed as a single value. My problem is: the value will be much too low in the morning hours and too high in the early afternoon. I have already tried using data from the last 24h to get a better average, but with limited success.
Is there a way to properly account for the changing traffic over the course of the day here?
If you are looking to estimate the usage of your license quota, the only source of truth is the events of the license_usage.log file as they are recorded on your license master. The panels of the License Usage view in the Distributed Management Console provide authoritative searches on this matter.
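For reference, a quick ad-hoc search over those events looks something like this (type=Usage events carry the byte counts in the b field; the DMC panels remain the authoritative versions):
index=_internal source=*license_usage.log* type=Usage earliest=@d | stats sum(b) AS bytes by idx | eval GB=round(bytes/1024/1024/1024, 3) | sort - GB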
Now, if you are looking to estimate your daily indexing throughput (whether the data counts against your license quota or not), I would recommend leveraging the group=thruput name=index_thruput events in metrics.log, like so:
index=_internal group=thruput name=index_thruput | timechart span=1d sum(kb) AS daily_KB
Do not attempt to use the group=per_*_thruput events to accurately determine license usage or indexing throughput, as those represent a sampled measurement.
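If you also want today's running total from the same events, something along these lines should work (the GB conversion is just for readability):
index=_internal group=thruput name=index_thruput earliest=@d | stats sum(kb) AS kb_today | eval GB_today=round(kb_today/1024/1024, 2)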
I have marked this as accepted, as it comes closest to what I needed to achieve.
We built a dashboard on the group=thruput name=index_thruput metrics and did some averaging to get reasonable results (rough sketch below).
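Roughly, the averaging works like this (a simplified sketch, not our exact dashboard search; it assumes a seven-day lookback and scales today's volume so far by the fraction of the daily total that previous days had reached at the same time of day):
index=_internal source=*metrics.log* sourcetype=splunkd group=thruput name=index_thruput earliest=-7d@d
| eval day=strftime(_time, "%Y-%m-%d")
| eval is_today=if(day==strftime(now(), "%Y-%m-%d"), 1, 0)
| eval kb_so_far=if((_time - relative_time(_time, "@d")) <= (now() - relative_time(now(), "@d")), kb, 0)
| stats sum(kb) AS total_kb sum(kb_so_far) AS sofar_kb by day is_today
| eval frac=if(is_today==1, null(), sofar_kb / total_kb)
| eventstats avg(frac) AS avg_frac
| where is_today==1
| eval estimated_today_kb=round(sofar_kb / avg_frac, 0)
| table day sofar_kb avg_frac estimated_today_kb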
Thanks!
Hi there!
If it helps, for licensing you can consult a report built into Splunk: Settings -> Licensing -> Usage report.
If you open the searches behind each of its panels, you might find some useful stuff there.
Hi DMohn,
The LicenseManager search will not count things like index=_internal and index=_audit data, because that volume doesn't count against your license, whereas the per_host search does.
However, you can use the per_index_thruput numbers and then filter out the indexes that have leading underscores.
index=_internal source=*metrics.log* splunk_server="*" group="per_index_thruput" | eval MB=kb/1024 | stats sum(MB) by series | rename series as index | search index!=_* | sort sum(MB) | addcoltotals | fillnull value="[ Total Indexed Volume ] last 24 hours" index
If I run this search against the data from yesterday and compare it to the LicenseManager's search from today (necessary because the LicenseManager runs just after midnight and reports on yesterday), the numbers seem very close to each other, but oddly they are not equal. I'm not sure why.
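For the comparison I use something along these lines (RolloverSummary is the once-per-day event the license manager writes just after midnight with the previous day's totals; treat this as a sketch and verify against your own license_usage.log):
index=_internal source=*license_usage.log* type=RolloverSummary earliest=@d | eval GB=round(b/1024/1024/1024, 3) | stats sum(GB) AS licensed_GB_yesterday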
For other solutions, follow this link; it has searches that break the volume down per sourcetype/host/source and per pool:
http://wiki.splunk.com/Community:TroubleshootingIndexedDataVolume
Another query that, when you plug it in, returns nothing 😞
Hi,
Thanks for your reply. However, the stated search only sums up the last 24 hours; what I need is a prediction for the current day.
Any idea how to accomplish that?