Hi,
I have a customer who is challenging the license numbers being reported by Splunk for his hosts. Is there a way to actually count the number of bytes for all of his events over a time period?
Depending on the amount of data and what parts of Splunk's internal counting you trust or mistrust, there are several approaches.
In any case, you're probably comparing against the license usage view, so Settings -> License or something like that on your license master. That's nice visually, but underneath there is an actual log you want to look at: index=_internal source=*license_usage* component=LicenseUsage
type=RolloverSummary has daily summaries; that's what is displayed in the 30-day view by default.
type=Usage has detailed usage at 30-second intervals (iirc); that's what is displayed in the 30-day view if you split by some of the more specific fields.
Assuming the view itself isn't broken and is reporting that log correctly, you'll want to compare other sources of information against that log. You can use short timespans and compare with Usage over that span, or whole days and compare with RolloverSummary, or both.
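For a quick baseline from that log, a sketch like this should give yesterday's totals per index (b and idx are the field names license_usage.log normally uses for bytes and index; adjust if your events differ):
index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=@d
| stats sum(b) AS bytes by idx
| eval GB = round(bytes/1024/1024/1024, 3)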
As for other sources of info, here are a few ideas.
brute force
Pick a suspicious, high-volume set of data, like a certain sourcetype or index, and pipe it through | eval length = len(_raw) | stats sum(length), then compare that number to the data you get from license usage logging. The search may be infeasible for larger sets of data, but it should be the most precise.
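As a fuller sketch of that brute-force approach (my_index and my_sourcetype are just placeholders, and len(_raw) counts characters, which matches bytes for plain single-byte data):
index=my_index sourcetype=my_sourcetype earliest=-1d@d latest=@d
| eval bytes = len(_raw)
| stats sum(bytes) AS raw_bytes
| eval GB = round(raw_bytes/1024/1024/1024, 3)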
metrics logging
In the internal index there's also source=*metrics* that provides a second source of data. It won't always line up with licensing, but for your larger indexes, sourcetypes, etc. the group=per_index_thruput or group=per_sourcetype_thruput data should be pretty good.
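A sketch of what that could look like for per-index throughput (kb and series are the usual metrics.log field names; metrics only track the busiest series per interval, so treat this as an estimate):
index=_internal source=*metrics.log* group=per_index_thruput earliest=-1d@d latest=@d
| stats sum(kb) AS kb by series
| eval GB = round(kb/1024/1024, 3)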
For a twist, forwarders also log metrics... but getting the right metric from the right set of forwarders can be tricky. I'd recommend starting with indexer metrics logs.
You might ask "but that's splunk counting, I don't trust splunk's counting!" ... well yeah, but even if the license counter had a bug in your case, the metrics counter might not have that same bug.
dbinspect
This should be the fastest, least-splunk-counter-dependent way... but also the least accurate. When you run | dbinspect index=foo you get a rawSize field for each bucket. If you have an index that has never had buckets rolled out, and you have license usage log data for the entire life of the index, the total rawSize should line up with the total license usage for that index.
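For example, a rough sketch (foo is a placeholder index name; rawSize should be in bytes, as far as I recall):
| dbinspect index=foo
| stats sum(rawSize) AS raw_bytes
| eval GB = round(raw_bytes/1024/1024/1024, 3)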
Finally, make sure you're comparing apples with apples.
Have you looked at all indexers connected to your license master?
Have you looked at all data for the to-be-compared time range? Data for yesterday can still come in today; it'll be sorted into yesterday when you search by time range, but reported against today in license usage.
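One way to sanity-check that last point is to compare event time against index time, e.g. with a sketch like this (my_index is a placeholder):
index=my_index earliest=-1d@d latest=@d
| eval indexed_day = strftime(_indextime, "%Y-%m-%d")
| stats count by indexed_day
Anything indexed on a different day than its event time is exactly the data that shows up in a different day's license report.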
Cross referencing similar post: https://answers.splunk.com/answers/476227/help-with-license-validation.html
This is spot on. Summarizing: you have bytes (b) broken down by source (s), sourcetype (st), host (h), etc. in license_usage.log when type=Usage. It's possible that those get collapsed if usage is too intense. So, as @martin_mueller points out, there are other data points that can be used, OR you can go to the data itself with an eval size = len(_raw).
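Since the pushback is about per-host numbers, here's a sketch of the per-host split from that log (h and b are the field names license_usage.log normally uses; if the breakdown was collapsed as described above, you may see blank h values):
index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=@d
| stats sum(b) AS bytes by h
| eval MB = round(bytes/1024/1024, 2)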
If these numbers add up and the customer is still pushing back, feel free to engage the account team from Splunk for assistance.
Gooooood info. Thanks!
Well, you could use Splunk to count it for you, but if you suspect Splunk is reporting it incorrectly then this may not solve your problem. Otherwise you'll have to go and tally up the file sizes on the servers for a specific time period and sum them up.
I'd rather go the route of having Splunk count it, at least initially. How can I do that?
You can get an approximate count by taking the average of len(_raw) over a small set of events, then multiplying that average length by the total event count from | tstats, and converting your number of characters into bytes, MB, and so on. Again, this is clearly an approximation, but it gives you a fair idea.
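A sketch of that approximation in one search (my_index and my_sourcetype are placeholders; the 10,000-event sample and the one-character-equals-one-byte assumption are both rough):
index=my_index sourcetype=my_sourcetype | head 10000
| eval bytes = len(_raw)
| stats avg(bytes) AS avg_bytes
| appendcols [| tstats count WHERE index=my_index sourcetype=my_sourcetype]
| eval est_MB = round(avg_bytes * count / 1024 / 1024, 2)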