How come the sum(len(_raw)) of my data does not correlate with my license usage reports?

dwfarris
Explorer

OK, I am working to trim back some of our indexed data. I initially tried to drill down using a basic sum(len(_raw)) across all indexes, broken down by various other fields.

The problem is that the sums don't match the Splunk license usage reported for the index.

In this specific test case, I am comparing the Splunk license usage for ONE index for ONE day to the byte sum of all of the _raw records for that SAME index for the SAME day.

I expected the counts to at least be similar.

My query from a Splunk source to get license info:

index=_internal sourcetype=splunkd source=*license_usage.log [| rest splunk_server_group=dmc_group_indexer /services/server/info | rename guid AS i | fields i ] | eval gb=b/1024/1024/1024 | join i [|rest splunk_server_group=dmc_group_indexer /services/server/info | rename guid AS i | fields serverName i]   | search serverName=*rtp* idx=xyzzy_logs | stats sum(gb) by  serverName idx

...yields between 50 GB and 53 GB per indexer for that ONE index for that ONE day.

Versus...

index=xyzzy_logs splunk_server=*rtp* | eval leng=len(_raw)/1024/1024/1024 | stats sum(leng) as totalgb by splunk_server | table splunk_server, totalgb 

...which yields only 14.7 GB to 15.66 GB per indexer for the SAME index for the SAME day.

Again, I did not expect them to be exactly the same, but I thought they would be closer than 300%+ apart.

What is Splunk licensing counting that does not seem to show up in my indexes?

I tried looking for answers to this. I found other posts with accepted answers that use sum(len(_raw)) as a "brute force" way to drill down on sizes.

See Splunk Answer: How to get license usage data for a particular index with a breakdown of usage by a field?

sloshburch
Splunk Employee

Eval's len function counts characters. Your license usage is measured in bytes. I've gotten burned by this one in the past as well.

"This function returns the character length of a string X."
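The character-versus-byte distinction can be illustrated outside Splunk. The following is a minimal Python sketch, not Splunk code: the helper name `char_and_byte_length` and the sample events are hypothetical, and it assumes the text is encoded as UTF-8. For plain ASCII the two counts are equal, so how far they diverge depends on how many multi-byte characters the data contains.

```python
def char_and_byte_length(text: str, encoding: str = "utf-8") -> tuple[int, int]:
    """Return (character count, encoded byte count) for one event string.

    Hypothetical helper for illustration only. UTF-8 is an assumption;
    a different encoding would give different byte counts.
    """
    return len(text), len(text.encode(encoding))


# Made-up sample events, not taken from any real index.
samples = [
    "GET /index.html 200",      # plain ASCII text
    "user=José action=login",   # contains one accented character
    "msg=ログイン成功",           # contains Japanese characters
]

for event in samples:
    chars, nbytes = char_and_byte_length(event)
    print(f"{chars:>3} chars  {nbytes:>3} bytes  {event}")
```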

splunkreal
Motivator

Hello,

Thanks for this information. It looks like the difference depends on the data. License usage can also increase if a UF (universal forwarder) resends data several times.

Did you find a valid way to measure the volume of a specific set of data or a specific search (other than the index/host/source breakdown)?

Best regards.
