Getting Data In

How to calculate and troubleshoot indexing volume for the _internal index?

thezero
Path Finder

Hi,

How can I calculate the indexing volume and disk space usage of the _internal index (the internal DB) per day, in GB? Is there a specific query to find the top hosts and source types that send the most logs to the _internal index? Although _internal log volume is not counted against the license, it is consuming a lot of disk space, and I need to identify the hosts sending a high volume of internal logs. Is there a specific search query, tstats, or dbinspect command for this? Please advise.

0 Karma
1 Solution

ngatchasandra
Builder

Hi thezero,

Try this:

index=_internal source=*license_usage.log type=Usage | eval GB=b/1024/1024/1024 | timechart span=1d sum(GB) by host useother=false | untable _time host sum(GB) | top limit=1 host | join _time [search index=_internal source=*license_usage.log type=Usage | eval GB=b/1024/1024/1024 | timechart span=1d sum(GB) by sourcetype useother=false | untable _time sourcetype sum(GB)]

Also see the following:

http://www.splunk.com/wiki/Community:TroubleshootingIndexedDataVolume
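
As a complementary sketch (not part of the query above), the raw event size in _internal can be approximated with len(_raw) and grouped by host and sourcetype. The field names bytes and GB and the 7-day window are illustrative assumptions. Note that len(_raw) measures the length of the raw event text, not the compressed size on disk, so the result is better read as a relative ranking of senders than as exact disk usage.

```
index=_internal earliest=-7d@d latest=@d
| eval bytes=len(_raw)
| bin _time span=1d
| stats sum(bytes) AS bytes BY _time host sourcetype
| eval GB=round(bytes/1024/1024/1024, 3)
| sort 0 -GB
```

This scans every event in the time range, so it may be slow on a busy deployment; narrowing the time window first is a reasonable way to try it out.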

0 Karma

fdi01
Motivator

If you are looking to estimate your daily indexing throughput (whether or not the data counts against your license quota), I recommend using the events with group=thruput name=index_thruput in metrics.log, like so:

index=_internal group=thruput name=index_thruput | timechart span=1d sum(kb) AS daily_GB | eval daily_GB=daily_GB/(1024*1024)."GB"

Do not attempt to use events of group=per_*_thruput to accurately determine license usage or indexing thruput as those represent a sampled measurement.
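
For the disk space side of the question, one possible sketch uses the dbinspect command, which reports per-bucket details for an index. The output field names size_mb, raw_bytes, size_gb, and raw_gb are illustrative. Treat this as a starting point under those assumptions rather than a definitive report.

```
| dbinspect index=_internal
| stats sum(sizeOnDiskMB) AS size_mb sum(rawSize) AS raw_bytes BY state
| eval size_gb=round(size_mb/1024, 2), raw_gb=round(raw_bytes/1024/1024/1024, 2)
```

Here sizeOnDiskMB is the on-disk size of each bucket and rawSize is the uncompressed raw data size in bytes, so comparing size_gb with raw_gb gives a rough idea of how much space the index occupies relative to the data it holds. Because buckets span arbitrary time ranges, this gives totals per bucket state rather than an exact per-day figure.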

0 Karma