Solved: Variance between license_usage.log and metrics.log

mark · ‎03-07-2013

Hi,

Using v4.3.3 - I’m attempting to track license usage per index. I have quite a large discrepancy between the figures in license_usage.log and metrics.log.

For example, using the following I get a figure of license used per day… (there is only one license pool)
index=_internal source=*license_usage.log type=RolloverSummary | eval GB = b/1024/1024/1024 | eval _time = _time - 43200 | timechart span=1d sum(GB) AS "Total GB used"
Sample result is: 59.5

I’d like to break this down per index. I’m using something quite simple such as:
index=_internal source=*metrics.log group="per_index_thruput" | timechart span=1d sum(eval(kb/1024/1024)) AS "GB indexed" by series
Sample total result for yesterday is: 55.3

I realise here that I should exclude summary indexes, etc. However, the latter search already returns less volume than the former, so excluding them would only make the difference even greater.

Furthermore, if I split by throughput group:
index=_internal source=*metrics.log | where like(group, "per%") | timechart span=1d sum(eval(kb/1024/1024)) by group

I get the following results over the same interval:
license_usage.log: 59.5
per_host_thruput: 42.5
per_index_thruput: 55.3
per_source_thruput: 55.6
per_sourcetype_thruput: 53.8
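
To put the gap in proportion, here is a small sketch that expresses each metrics.log total as a share of the license_usage.log total. It uses only the five figures listed above; the variable names are just for illustration.

```python
# Daily totals in GB, copied from the results listed above.
license_gb = 59.5
metrics_gb = {
    "per_host_thruput": 42.5,
    "per_index_thruput": 55.3,
    "per_source_thruput": 55.6,
    "per_sourcetype_thruput": 53.8,
}

# Share of the license_usage.log volume that each metrics.log group accounts for.
coverage = {group: gb / license_gb for group, gb in metrics_gb.items()}

for group, share in coverage.items():
    shortfall = license_gb - metrics_gb[group]
    print(f"{group}: {share:.1%} of license volume, short by {shortfall:.1f} GB")
```

The per_host_thruput figure is the furthest from the license total, and the other three groups are all within 10% of it.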

Seems to be a significant variance – surely the source, sourcetype and index thruput stats should have the same value!? Can anyone make sense of this, or provide a sensible way for me to get a breakdown of license usage per index without the variances I see here?

Thanks,
Mark

MuS · ‎03-07-2013

Hi mark

I'll try to answer this:

In the docs about what Splunk logs about itself, you can find the following information:

metrics.log

Contains periodic snapshots of Splunk performance and system data, including information about CPU usage by internal processors and queue usage in Splunk's data processing. The metrics.log file is a sampling of the top ten items in each category in 30 second intervals, based on the size of _raw. It can be used for limited analysis of volume trends for data inputs.

license_usage.log

Indexed volume in bytes per pool, source, sourcetype, and host.

This means that if you want reliable information about your license usage, you have to use license_usage.log. metrics.log can give different results because, for each category, it only records the top 10 series in each 30 second interval; anything outside the top 10 in that interval is not logged. You can change this limit (maxseries) in the [metrics] stanza in limits.conf.
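
As a rough illustration of why the totals differ per category, here is a minimal sketch that mimics top-N sampling on made-up data. It is not how Splunk implements metrics.log; the function names and the synthetic volumes (series i sends 1000/(i+1) KB per interval) are assumptions chosen only to show the effect.

```python
def logged_fraction(intervals, max_series=10):
    """Fraction of total volume kept when only the top `max_series`
    series (by volume) are recorded in each interval."""
    total = 0.0
    kept = 0.0
    for interval in intervals:
        volumes = sorted(interval.values(), reverse=True)
        total += sum(volumes)
        kept += sum(volumes[:max_series])
    return kept / total if total else 1.0


def synthetic_intervals(n_series, n_intervals=4):
    """Hypothetical data: series i sends 1000/(i+1) KB in every interval."""
    return [
        {f"series{i}": 1000.0 / (i + 1) for i in range(n_series)}
        for _ in range(n_intervals)
    ]


# A category with few distinct series (e.g. a handful of indexes) loses nothing.
few = logged_fraction(synthetic_intervals(8))
# A category with many distinct series (e.g. hundreds of hosts) loses a lot.
many = logged_fraction(synthetic_intervals(200))
# Raising the cap recovers part of the missing volume.
raised = logged_fraction(synthetic_intervals(200), max_series=100)

print(f"8 series, cap 10:    {few:.1%} of volume logged")
print(f"200 series, cap 10:  {many:.1%} of volume logged")
print(f"200 series, cap 100: {raised:.1%} of volume logged")
```

The more distinct values a category has in an interval, the more volume falls outside the cap. That would be consistent with per_host_thruput showing the lowest total if there are many more hosts than indexes.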

Find further information on working with metrics.log here.

hope this helps and if my answer was wrong, I'm sure hexx and/or Ayn can provide a superb and detailed one 🙂

cheers,

MuS

MuS · ‎03-11-2013

Hi Mark,

No, I see no reason why this should be a bad idea. If you find that it causes any trouble, then simply revert to the original values.

cheers,
MuS

mark · ‎03-10-2013

Hi MuS

That completely answers my questions about the differences between the various thruput volumes, and between metrics.log and license_usage.log.

A final question though. For more accurate volume-per-index reporting, do you believe there is any performance hit in increasing the [metrics] maxseries significantly? I'm thinking of 10x more series, sampled 10x less frequently:

[metrics]
maxseries=100
interval=300

This I assume would make the per index volumes in metrics.log much closer to actual volume count in license_usage.log.

Any reason you can see why this would be a bad idea?

Thanks,
Mark
