Monitoring Splunk

Host License Usage

DataWrangler
Explorer

When running license usage reports by host, we are hitting the squash_threshold in server.conf.
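A rough way to confirm how much usage is being squashed (a sketch; it assumes, as the docs describe, that the h and s fields in license_usage.log are left empty once squashing kicks in):

index=_internal source=*license_usage.log* type=Usage
| eval squashed = if(isnull(h) OR h="", "yes", "no")
| stats sum(b) AS bytes by squashed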

I've researched this and the only solution I can see is to increase the squash_threshold beyond the number of combinations of index, host, source, and sourcetype, which I calculate by running this search:

| tstats count where index IN (uk*, us*) by index host source sourcetype
| stats count AS tuples
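For reference, the setting in question lives in the [license] stanza; the value below is purely illustrative and would need to exceed the tuple count from the search above (the default is 2000):

# server.conf on the indexers
[license]
squash_threshold = 10000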

The docs say there will be an impact on memory, though there's no indication of what that might look like.

Do you have any real world experience of the impact of doing this?

Is the license usage log the only method of calculating host-based usage? From what I've found in my reading so far, it seems to be.

Thanks!


tej57
Builder

You can also configure the Chargeback App for Splunk, which was created for exactly this purpose. The initial configuration is quite complex, but once you set up the lookup files it needs, you'll get a clear view of how much of the budget is consumed by each part of the project/team/business, etc.


DataWrangler
Explorer

If you or anyone else has experience of using the Chargeback app, please do share it. I'm sure others will find it helpful too.


tej57
Builder

Hey @DataWrangler,

I have helped customers set up the Splunk App for Chargeback, and they've said they were able to distribute costs across their different business units/orgs as required. You need to make sure that you enter the right proportions in the lookup while configuring the business units. I do agree that the one-time setup is a complicated process, but it does eventually help with the task.

Thanks,
Tejas. 

PickleRick
SplunkTrust

You can either create an indexed field holding the raw event length so you can quickly run tstats over it.

Or, even better, create a simple data model holding the length of your events as a calculated field. And accelerate it.

It will have _some_ impact on your environment, but it's still better than plowing through raw data every time you need a report on your license usage.
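Roughly, the accelerated version could then be queried like this (the data model name License_Usage and the field event_bytes are made up for illustration; the calculated field itself would just be an eval of len(_raw)):

| tstats summariesonly=true sum(License_Usage.event_bytes) AS bytes from datamodel=License_Usage by License_Usage.host
| eval gb = round(bytes/1024/1024/1024, 2)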

DataWrangler
Explorer

Two good options there. So, as is often the case with Splunk, we have different ways to tackle this issue.

  1. Increase the squash_threshold in server.conf to something larger than our tuple count - has an unquantified memory impact.
  2. Eval a field from the _raw event length with len() - since characters aren't always 1 byte this won't be 100% accurate, but it's close enough to track usage.
  3. Create a datamodel to store the field from option 2 using a calculated field.
  4. Invest time to figure out the Splunk App for Chargeback.

I will go with the quick option 2 and constrain it to the indexes / hosts I need to report on. I can work through the others when I get some quiet time 😁
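As a rough sketch (the index and host filters are just placeholders for whatever scope is needed):

index IN (uk*, us*) host=web*
| eval bytes = len(_raw)
| stats sum(bytes) AS total_bytes by host
| eval gb = round(total_bytes/1024/1024/1024, 2)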

Thanks everyone for your responses which helped me think this through and find a practical solution.


PickleRick
SplunkTrust

The third option is actually kinda like "option 2 on steroids". For a one-off thing, a simple search over raw data will probably suffice. If you're planning on doing this often and especially if your data set is big, you will want to accelerate that somehow.

DataWrangler
Explorer

You are right. For now this will be an infrequent ask, perhaps monthly or quarterly, so I will run it ad hoc or schedule it to run as a report overnight.

Definitely worth looking at making this more efficient.


isoutamo
SplunkTrust

What is the real-world issue you are trying to solve? And is this a one-shot question, or do you need the answer continuously?


DataWrangler
Explorer

We run Splunk in a project, and the costs are billed to each part of the project, as the parts are funded separately through separate changes.

So when a new set of servers is built, I'd like to be able to report with reasonable accuracy that these 10 new web servers are using, say, 5GB of license on average per day for billing.

This will also help us predict future usage when we add another 5 servers of the same type.

It will be an ongoing requirement to accurately report costs and bill correctly.

I've started investigating the Splunk App for Chargeback which seems useful but overly complex for this requirement.

https://splunkbase.splunk.com/app/5688
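For the average-per-day figure above, a rough sketch of the kind of search I have in mind (host=web* is a placeholder for the new servers):

index IN (uk*, us*) host=web*
| eval bytes = len(_raw)
| bin _time span=1d
| stats sum(bytes) AS daily_bytes by _time
| stats avg(daily_bytes) AS avg_daily_bytes
| eval avg_gb_per_day = round(avg_daily_bytes/1024/1024/1024, 2)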

tej57
Builder

Hey @DataWrangler,

I agree with the docs' advice not to increase the squash_threshold. However, if you want host-based utilization, you can check metrics.log with group=per_host_thruput and then sum up the values of kb, grouped by series. The series field contains the host value for group=per_host_thruput events.

index=_internal source=*var/log/splunk/metrics.log* group=per_host_thruput
| eval gb = kb/1024/1024
| timechart sum(gb) AS total_ingestion by series

I haven't experimented with the value of squash_threshold myself.

Thanks,
Tejas.

---
If the above solution helps, an upvote is appreciated!

 


isoutamo
SplunkTrust

As the metrics log only reports the top N series per sampling interval (the top 10 by default, controlled by maxseries in limits.conf), this won't give you the real amount; there will probably be many periods when other nodes have more traffic and your hosts drop out of that top list.