Variance between license_usage.log and metrics.log

mark
Path Finder

Hi,

Using v4.3.3, I’m attempting to track license usage per index. I see a sizeable discrepancy between the figures from license_usage.log and metrics.log.

For example, the following search gives me the license volume used per day (there is only one license pool):
index=_internal source=*license_usage.log type=RolloverSummary | eval GB = b/1024/1024/1024 | eval _time = _time - 43200 | timechart span=1d sum(GB) AS "Total GB used"
Sample result is: 59.5
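
For readers unsure what the two eval steps in that search do, here is a minimal Python sketch of the same arithmetic. It assumes the rollover event is written shortly after midnight and summarises the previous day, which is presumably why the search moves _time back by 43200 seconds (12 hours). The function name and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone

BYTES_PER_GB = 1024 ** 3  # same as b/1024/1024/1024 in the search


def rollover_to_usage_day(event_time, bytes_used):
    """Return (day the usage is charted under, usage in GB) for one rollover event."""
    # Same as: eval _time = _time - 43200
    shifted = event_time - timedelta(seconds=43200)
    return shifted.date(), bytes_used / BYTES_PER_GB


# Hypothetical rollover event written just after midnight on 2 June (assumption).
event_time = datetime(2012, 6, 2, 0, 0, 5, tzinfo=timezone.utc)
day, gb = rollover_to_usage_day(event_time, 59.5 * BYTES_PER_GB)
print(day, gb)
```

Without the shift, a span=1d chart would plot each day's total under the following day.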

I’d like to break this down per index. I’m using something quite simple such as:
index=_internal source=*metrics.log group="per_index_thruput" | timechart span=1d sum(eval(kb/1024/1024)) AS "GB indexed" by series
Sample total result for yesterday is: 55.3

I realise that I should exclude summary indexes, etc. However, the latter search already returns less volume than the former, so excluding them would only make the difference even greater.

Furthermore, if I split by throughput group:
index=_internal source=*metrics.log | where like(group, "per%") | timechart span=1d sum(eval(kb/1024/1024)) by group

I get the following results over the same interval:
license_usage.log: 59.5
per_host_thruput: 42.5
per_index_thruput: 55.3
per_source_thruput: 55.6
per_sourcetype_thruput: 53.8

That seems to be a significant variance. Surely the source, sourcetype and index thruput stats should have the same value!? Can anyone make sense of this, or suggest a sensible way for me to get a breakdown of license usage per index without the variances I see here?

Thanks,
Mark

1 Solution

MuS
Legend

Hi mark

I'll try to answer this:

In the docs about what Splunk logs about itself, you can find the following information:

  • metrics.log

    Contains periodic snapshots of Splunk performance and system data, including information about CPU usage by internal processors and queue usage in Splunk's data processing. The metrics.log file is a sampling of the top ten items in each category in 30 second intervals, based on the size of _raw. It can be used for limited analysis of volume trends for data inputs.

  • license_usage.log

    Indexed volume in bytes per pool, source, sourcetype, and host.

This means that if you want reliable information about your license usage, you have to use license_usage.log. metrics.log can give different results because it only records the top 10 items in each category per sampling interval. You can change this setting in the [metrics] stanza in limits.conf.
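
To make the effect of the top-ten sampling concrete, here is a small Python sketch. It is only a simulation under assumed numbers, not how Splunk computes anything internally: the index names, the KB figures and the sampled_volume helper are all hypothetical.

```python
def sampled_volume(series_kb, maxseries=10):
    """Sum only the `maxseries` largest series, as a top-N sample would."""
    return sum(sorted(series_kb.values(), reverse=True)[:maxseries])


# Hypothetical KB per index in one sampling interval: 3 busy and 12 quiet indexes.
interval = {f"busy{i}": 1000 for i in range(3)}
interval.update({f"quiet{i}": 50 for i in range(12)})

actual = sum(interval.values())      # all 15 series: 3*1000 + 12*50
reported = sampled_volume(interval)  # top 10 only:   3*1000 + 7*50
print(actual, reported, actual - reported)
```

The shortfall grows with the number of distinct series in a category, which would be consistent with the per-host, per-source, per-sourcetype and per-index totals all differing from each other and from license_usage.log.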

Find further information on working with metrics.log here.

hope this helps and if my answer was wrong, I'm sure hexx and/or Ayn can provide a superb and detailed one 🙂

cheers,

MuS

MuS
Legend

Hi Mark,

No, I see no reason why this should be a bad idea. If you find that it causes you any trouble, then simply revert to the original values.

cheers,
MuS

mark
Path Finder

Hi MuS

That completely answers my questions about the differences between the various thruput volumes, and between metrics.log and license_usage.log.

A final question though. For more accurate volume-per-index reporting, do you believe there is any performance hit in increasing the [metrics] maxseries significantly? That is, 10x more series, sampled 10x less frequently:

[metrics]
maxseries=100
interval=300

This, I assume, would make the per-index volumes in metrics.log much closer to the actual volume count in license_usage.log.

Any reason you can see why this would be a bad idea?

Thanks,
Mark
