How to calculate and troubleshoot indexing volume for the _internal index

Hi,

How can I calculate the indexing volume and disk space usage of the _internal index (the internal DB) per day, in GB? Is there a specific query to find the top hosts and source types sending the most logs to the _internal index? Although _internal log volume is not counted against the license, it is consuming a lot of disk space, so I need to identify the hosts sending a high volume of internal logs. Is there a specific search query, tstats, or dbinspect command for this? Please advise.

Hi thezero,

Try this:

index=_internal source=*license_usage.log type=Usage | eval GB=b/1024/1024/1024 | timechart span=1d sum(GB) by host useother=false | untable _time host sum(GB) | top limit=1 host | join _time [search index=_internal source=*license_usage.log type=Usage | eval GB=b/1024/1024/1024 | timechart span=1d sum(GB) by sourcetype useother=false | untable _time sourcetype sum(GB)]

Also see the following:

http://www.splunk.com/wiki/Community:TroubleshootingIndexedDataVolume
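
For the _internal index itself, a rough breakdown by host and sourcetype can be sketched by summing raw event sizes. This is only an approximation: it assumes len(_raw) (a character count) is an acceptable stand-in for indexed volume, and it ignores compression and index overhead. The field names bytes and GB are made up for illustration.

```
index=_internal earliest=-24h
| eval bytes=len(_raw)
| stats sum(bytes) AS bytes BY host, sourcetype
| eval GB=round(bytes/1024/1024/1024, 3)
| sort - GB
| head 10
```

Scanning every event this way is expensive, so a short time range is advisable.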

If you are looking to estimate your daily indexing throughput (whether or not the data counts against your license quota), I recommend using the group=thruput name=index_thruput events in metrics.log, like so:

index=_internal group=thruput name=index_thruput | timechart span=1d sum(kb) AS daily_GB | eval daily_GB=daily_GB/(1024*1024)."GB"

Do not use events of group=per_*_thruput to determine license usage or indexing throughput accurately, because those represent a sampled measurement.
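
For the disk-space side of the question, the dbinspect command reports per-bucket sizes. A minimal sketch, assuming the sizeOnDiskMB and splunk_server fields that dbinspect returns; size_MB and size_GB are illustrative output names:

```
| dbinspect index=_internal
| stats sum(sizeOnDiskMB) AS size_MB BY splunk_server
| eval size_GB=round(size_MB/1024, 2)
| fields splunk_server size_GB
```

This shows the current on-disk footprint of _internal per indexer rather than a per-day figure, so it complements the throughput search above instead of replacing it.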
