Solved: Does the license master have disk usage info from ...

bkumarm · ‎03-04-2016

I have a distributed Splunk setup with a License Master and around 10 slaves, all using the same license pool.
We want a report that shows disk usage for all indexes in the following format:

Index sourcetype(s) Indexer Diskallocated DiskUsed PCtDiskUsage

I have tried dbinspect, but it only returns data about the instance itself, not the slaves.
I have looked at the data under index=_introspection.
I have tried a few other sources, such as license_usage.log and metrics.log.
I have also tried installing the FireBrigade, Splunk On Splunk, Utilization Monitor, and Splunk Health Monitor apps.
I doubt whether the relevant data is available at the License Master level at all.
It looks like the data may be available on the search head instead.

Any help on this, please?

MuS · ‎03-28-2016

Hi bkumarm,

Since you're not using any license-specific information here, you can run the following query from your search head. It is totally un-tuned and uses three subsearches (bad, bad, bad, and not really performing well!), but it is one way to combine the information you need:

index=_internal source="/opt/splunk/var/log/splunk/metrics.log" group=per_sourcetype_thruput
| eval sizeMB = round(kb/1024,2)
| stats sum(sizeMB) by series, host, index
| rename sum(sizeMB) AS UsedInMB series AS sourcetype
| join sourcetype [ | eventcount summarize=false index=* index=_* | dedup index | fields index | map maxsearches=100 search="|metadata type=sourcetypes index=\"$index$\" | eval index=\"$index$\"" | fields index sourcetype ]
| join index [ | rest /services/data/indexes | table title maxTotalDataSizeMB | rename title as index ]
| stats values(sourcetype) AS sourcetypes sum(UsedInMB) AS UsedInMB values(maxTotalDataSizeMB) AS IDXmaxTotalSizeMB by index, host
| eval PCtDiskUsage = UsedInMB*100/IDXmaxTotalSizeMB

Everything before the first join gets you the information about the disk space usage per sourcetype. The first join gets all sourcetypes from all indexes, and the second join gets the max size per index. The final stats groups everything, and the eval calculates the percentage of disk space used for this sourcetype in this index.

The result will look like this (screenshot not included):

If you want to know why I wrote "bad bad bad" about my three subsearches, you can read some details over here https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and get some general guidance on this topic.

Hope this helps ...

cheers, MuS

bkumarm · ‎03-29-2016

We also found another version of it:

| rest /services/data/indexes/ count=0
| rename title AS index splunk_server AS Indexer currentDBSizeMB AS usage maxTotalDataSizeMB AS size
| join index [| tstats values(sourcetype) AS SourceTypes by index]
| stats sum(usage) AS usage sum(size) AS size by index, SourceTypes, Indexer
| eval DiskPer=((usage*100)/size)
| rename usage as DiskUsage(MB), size AS DiskQuota(MB), DiskPer AS Used(%)

MuS · ‎03-29-2016

Nice one as well, but the tstats subsearch seems incorrect: it does not return anything. Try this one instead:

| rest /services/data/indexes/ count=0 
| rename title AS index splunk_server AS Indexer currentDBSizeMB AS usage maxTotalDataSizeMB AS size 
| join index [ | tstats count where index=* by sourcetype, index ] 
| stats sum(usage) AS usage sum(size) AS size by index, sourcetype, Indexer
| eval DiskPer=((usage*100)/size) | rename usage as DiskUsage(MB), size AS DiskQuota(MB), DiskPer AS Used(%)

cheers, MuS

snaikwade_splun · ‎11-08-2017

How can we avoid the JOIN operation and use just the stats command?

MuS · ‎11-08-2017

The join in this example is actually not bad, because these are two generating searches and the tstats subsearch is unlikely to hit any of the nasty subsearch limits. But if you insist on removing the join, just use this as the tstats search:

  | tstats append=true prestats=true count where index=* by sourcetype, index
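
For comparison, here is a minimal sketch of a different join-free pattern, not the tstats append/prestats approach above. It appends the sourcetype list to the REST rows and lets eventstats copy it onto every row of the same index. It still uses a subsearch (via append), and it assumes the search runs on a search head that has the indexers as search peers. The field names sourcetypes, DiskUsedMB and DiskAllocatedMB are illustrative; PCtDiskUsage follows the format asked for in the question.

```
| rest /services/data/indexes count=0
| rename title AS index splunk_server AS Indexer
| fields index Indexer currentDBSizeMB maxTotalDataSizeMB
| append [ | tstats count where index=* by index, sourcetype | fields index sourcetype ]
| eventstats values(sourcetype) AS sourcetypes by index
| where isnotnull(Indexer)
| eval PCtDiskUsage = round(currentDBSizeMB*100/maxTotalDataSizeMB, 2)
| table index sourcetypes Indexer maxTotalDataSizeMB currentDBSizeMB PCtDiskUsage
| rename maxTotalDataSizeMB AS DiskAllocatedMB currentDBSizeMB AS DiskUsedMB
```

The where clause drops the appended tstats rows once their sourcetypes have been copied over, so each remaining row is one index on one indexer, with all sourcetypes of that index listed in a single cell.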

cheers, MuS

bkumarm · ‎03-30-2016

While trying to save it as a dashboard, it fails to get the value of $index$.
Is there a way to fix this? I want to display it as a dashboard panel showing data for the last 30 days.
Thanks

MuS · ‎03-30-2016

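Easy as this: double the dollar signs, so the dashboard passes a literal $index$ to map instead of treating it as a dashboard token: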

index=_internal source="/opt/splunk/var/log/splunk/metrics.log" group=per_sourcetype_thruput
| eval sizeMB = round(kb/1024,2)
| stats sum(sizeMB) by series, host, index
| rename sum(sizeMB) AS UsedInMB series AS sourcetype
| join sourcetype [ | eventcount summarize=false index=* index=_* | dedup index | fields index | map maxsearches=100 search="|metadata type=sourcetypes index=\"$$index$$\" | eval index=\"$$index$$\"" | fields index sourcetype ]
| join index [ | rest /services/data/indexes | table title maxTotalDataSizeMB | rename title as index ]
| stats values(sourcetype) AS sourcetypes sum(UsedInMB) AS UsedInMB values(maxTotalDataSizeMB) AS IDXmaxTotalSizeMB by index, host
| eval PCtDiskUsage = UsedInMB*100/IDXmaxTotalSizeMB

Hint: http://docs.splunk.com/Documentation/Splunk/6.3.1/Viz/tokens

bkumarm · ‎03-07-2016

The License Master does not seem to store sourcetype-level storage information.
Search heads can access this data from the indexers. However, we cannot get an individual breakdown per sourcetype. One way to present this is to list all the sourcetype values in a single cell.
The Splunk team should be informed that customers may ask for such information, and should start logging it.

somesoni2 · ‎03-04-2016

Splunk stores data on disk in buckets, which are specific to indexes. So if you're looking for a split of disk usage by sourcetype, it's not available.

The dbinspect command can give you disk usage by index and indexer (the splunk_server field in the output is the indexer), and it can be run from the search head or the License Master, as long as the indexers are added as search peers.
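
A minimal sketch of such a dbinspect search, assuming it is run on an instance that has the indexers as search peers. The output names DiskUsedMB and Indexer are only illustrative; sizeOnDiskMB and splunk_server are fields that dbinspect returns for each bucket.

```
| dbinspect index=*
| stats sum(sizeOnDiskMB) AS DiskUsedMB by index, splunk_server
| rename splunk_server AS Indexer
| sort - DiskUsedMB
```

This sums the bucket sizes per index and indexer. It carries no sourcetype breakdown, for the reason given above.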

Does the license master have disk usage info from all slaves (all indexes, sourcetypes, sources) to report on?
