Hi,
How can I calculate the daily indexing volume / disk space usage, in GB, for the _internal index (internal DB)? Is there a specific query to find the top hosts and source types sending the most logs to the _internal index? Although _internal log volume is not counted against the license, it is consuming a lot of disk space, and I need to identify the hosts sending a high volume of internal logs. Is there a specific search query, tstats, or dbinspect command for this? Please advise.
Hi thezero,
Try this:
index=_internal source=*license_usage.log type=Usage | eval GB=b/1024/1024/1024 | timechart span=1d sum(GB) by host useother=false | untable _time host sum(GB) | top limit=1 host | join _time [search index=_internal source=*license_usage.log type=Usage | eval GB=b/1024/1024/1024 | timechart span=1d sum(GB) by sourcetype useother=false | untable _time sourcetype sum(GB)]
Also see the following:
http://www.splunk.com/wiki/Community:TroubleshootingIndexedDataVolume
If you are looking to estimate your daily indexing throughput (whether or not the data counts against your license quota), I would recommend leveraging the events of group=thruput name=index_thruput in metrics.log, like so:
index=_internal group=thruput name=index_thruput | timechart span=1d sum(kb) AS daily_GB | eval daily_GB=(daily_GB/(1024*1024))."GB"
Do not attempt to use events of group=per_*_thruput to accurately determine license usage or indexing throughput, as those represent a sampled measurement.
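For the per-host and per-sourcetype breakdown asked about in the question, one option is to sum the raw event length over the _internal index itself. This is only a rough sketch: len(_raw) counts characters of raw text, so it approximates the indexed raw volume rather than the size on disk, and the field names bytes and GB are just labels chosen here. It scans every event, so run it over a short time range first.

```
index=_internal
| eval bytes=len(_raw)
| bin _time span=1d
| stats sum(bytes) AS bytes by _time, host, sourcetype
| eval GB=round(bytes/1024/1024/1024, 3)
| sort - GB
```

To check how much disk space the index currently occupies, dbinspect reports per-bucket sizes that can be summed. Note that this gives the total for the buckets that exist now, not a per-day figure, and the output names disk_GB and raw_GB are again only illustrative.

```
| dbinspect index=_internal
| stats sum(sizeOnDiskMB) AS disk_MB, sum(rawSize) AS raw_bytes
| eval disk_GB=round(disk_MB/1024, 2), raw_GB=round(raw_bytes/1024/1024/1024, 2)
```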