Installation

How come the sum(len(_raw)) of my data does not correlate with my license usage reports?

dwfarris
Explorer

OK, I am working to trim back some of our indexed data. I initially tried to drill down using a basic sum(len(_raw)) for all indexes, broken down by various other fields.

The problem is that the sums don't match the Splunk license usage reported for the index.

In this specific test case, I am comparing the Splunk license usage for ONE index for ONE day. I compare it to the byte sum of all of the _raw records for that SAME index for the SAME ONE day. . .

I expected the counts to at least be similar. . .

My query from a Splunk source to get license info. . .

index=_internal sourcetype=splunkd source=*license_usage.log [| rest splunk_server_group=dmc_group_indexer /services/server/info | rename guid AS i | fields i ] | eval gb=b/1024/1024/1024 | join i [| rest splunk_server_group=dmc_group_indexer /services/server/info | rename guid AS i | fields serverName i] | search serverName=*rtp* idx=xyzzy_logs | stats sum(gb) by serverName idx

...yields between 50 GB and 53 GB per indexer for that ONE index for that ONE day.

Versus...

index=xyzzy_logs splunk_server=*rtp* | eval leng=len(_raw)/1024/1024/1024 | stats sum(leng) as totalgb by splunk_server | table splunk_server, totalgb 

...which yields only 14.7 GB to 15.66 GB per indexer for the SAME index for the SAME day.

Again, I expected them not to be exactly the same, but I thought they would be closer than a 300%+ difference.

What is Splunk licensing counting that does not seem to show up in my indexes?

I tried looking for answers to this. I found other posts with accepted answers that use sum(len(_raw)) as a "brute force" way to drill down on sizes.

See Splunk Answer: How to get license usage data for a particular index with a breakdown of usage by a field?

sloshburch
Splunk Employee

Eval's len function counts characters. Your license usage is measured in bytes. I've gotten burned by this one in the past as well.

"This function returns the character length of a string X."
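To make the character-versus-byte distinction concrete, here is a minimal Python sketch (not SPL). It assumes the events are stored as UTF-8; the helper name char_and_byte_len and both sample events are made up for illustration. For pure ASCII data the two counts are identical, so this effect only shows up when events contain multibyte characters, and it may not account for the whole gap on its own.

```python
from typing import Tuple


def char_and_byte_len(event: str) -> Tuple[int, int]:
    """Return (character count, UTF-8 byte count) for one event string."""
    # len() on a str counts characters, which is what eval's len() reports.
    # Encoding first and then measuring counts bytes (assumption: UTF-8 storage).
    return len(event), len(event.encode("utf-8"))


# Hypothetical sample events, for illustration only.
ascii_event = "user=alice action=login"
multibyte_event = "user=アリス action=ログイン"

for event in (ascii_event, multibyte_event):
    chars, size = char_and_byte_len(event)
    print(f"chars={chars} bytes={size}")
```

The ASCII event reports the same number for both counts, while the event with Japanese text reports more bytes than characters, because each of those characters takes three bytes in UTF-8.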

splunkreal
Motivator

Hello,

Thanks for this information. It looks like the difference depends on the data. License usage can also increase if a universal forwarder (UF) resends the same data several times.

Did you find a reliable way to measure the volume of a specific set of data or search results (other than the index/host/source breakdown)?

Best regards.

* If this helps, please upvote or accept solution 🙂 *