
License says twice as much data indexed as what index says

redc
Builder

I just indexed a data set. According to the Indexes page in the Manager, the index is 126MB. However, the Licensing page showed 252MB of consumption. So far as I can tell, the data isn't being indexed twice.

Anyone know what's going on? I have several 100-300MB data sets to load and I don't want to overflow our 10GB license.


lukejadamec
Super Champion

You are seeing the compressed data in the index, and the uncompressed data in the license manager.
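As a quick check of that explanation, here is a minimal Python sketch that works out how much of the raw volume the index takes up on disk, using the two figures from the question. The function name `disk_ratio` is made up for illustration and is not part of Splunk.

```python
# Figures reported in the question, in MB.
index_size_mb = 126     # Indexes page: size on disk (compressed)
license_usage_mb = 252  # Licensing page: raw, uncompressed volume


def disk_ratio(on_disk_mb, raw_mb):
    """Fraction of the raw volume that the index occupies on disk."""
    return on_disk_mb / raw_mb


ratio = disk_ratio(index_size_mb, license_usage_mb)
print(f"Index occupies {ratio:.0%} of the raw volume")  # prints: Index occupies 50% of the raw volume
```

So the 2:1 difference is what you would expect if the index ends up at about half the size of the raw input.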


aelliott
Motivator

If this is a one-time bulk load, it is OK to go over your license a few times a month, as long as you don't collect so many warnings that Splunk stops your searches from working. After 30 days those warnings expire and you can do it again. This lets you load historical data when needed without scaling up your Splunk instance.
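To keep track of this, a rough Python sketch of the rolling 30-day bookkeeping could look like the following. It is only an illustration: the function names are hypothetical, and `max_warnings` (the number of warnings at which search gets disabled) is deliberately left as a parameter because the thread does not state it. Check the licensing docs for the value that applies to your license.

```python
from datetime import date, timedelta


def warnings_in_window(warning_dates, today, window_days=30):
    """Count warnings that still fall inside the rolling window."""
    cutoff = today - timedelta(days=window_days)
    return sum(1 for d in warning_dates if cutoff < d <= today)


def can_exceed_today(warning_dates, today, max_warnings, window_days=30):
    """True if one more warning today keeps the count below max_warnings.

    max_warnings is an assumption supplied by the caller, not a value
    taken from this thread.
    """
    active = warnings_in_window(warning_dates, today, window_days)
    return active + 1 < max_warnings
```

For example, with warnings on three earlier dates of which two are still inside the window, `can_exceed_today(dates, today, max_warnings=5)` returns `True`.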


MuS
SplunkTrust

Hi redc,

License consumption is always based on the amount of raw data, not the amount of data left after the indexing process. This means that with your 10GB license, you can index 10GB of raw data per day. For more details on how licensing works, please check the docs.
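To plan several loads against a daily quota, a small Python sketch like this adds up the raw sizes and shows what is left. The list of sizes is hypothetical, the function name is made up for illustration, and treating 1 GB as 1024 MB is an assumption.

```python
def daily_license_usage(raw_sizes_mb, quota_gb=10, mb_per_gb=1024):
    """Return (used_mb, remaining_mb) for one day's worth of raw data.

    raw_sizes_mb holds the RAW (pre-indexing) size of each data set,
    since that is what counts against the license.
    """
    quota_mb = quota_gb * mb_per_gb
    used_mb = sum(raw_sizes_mb)
    return used_mb, quota_mb - used_mb


# Hypothetical raw sizes in MB for four data sets loaded on the same day.
used, remaining = daily_license_usage([252, 300, 300, 200])
print(f"Used {used} MB, {remaining} MB left today")
```

A negative remainder would mean that day's loads exceed the quota.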

hope this helps ...

cheers, MuS


martin_mueller
SplunkTrust

The compression ratio for the raw data alone is actually better than 50%. The index size you see also includes index files such as the .tsidx files.


lukejadamec
Super Champion

Check out this wiki page. It contains a number of index volume searches that you can run and might find useful.

http://wiki.splunk.com/Community:TroubleshootingIndexedDataVolume


redc
Builder

It's SQL data being exported as XML through a sqlcmd.exe script, so I have no idea what the size is.


the_wolverine
Champion

If you have relatively consistent data, like syslog, you could have a relatively consistent compression rate. It sounds like your compression rate is 50%.


lukejadamec
Super Champion

How large was the original data set? It should match the license manager.
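If the export is written to disk before it is indexed, one way to get that number is to add up the file sizes. This is a minimal Python sketch, assuming the exported XML ends up in a file or directory you can read; the function name is made up for illustration, and 1 MB is taken as 1024 * 1024 bytes.

```python
import os


def raw_size_mb(path):
    """Total size in MB of one file, or of every file under a directory."""
    if os.path.isfile(path):
        total_bytes = os.path.getsize(path)
    else:
        total_bytes = 0
        # Walk the directory tree and add up every file found.
        for root, _dirs, files in os.walk(path):
            for name in files:
                total_bytes += os.path.getsize(os.path.join(root, name))
    return total_bytes / (1024 * 1024)
```

Calling `raw_size_mb` on the export location gives a figure to compare against the Licensing page.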


redc
Builder

It seems a little odd that it would always be exactly twice as much, though (this isn't the first time I've noticed this behavior). It is all text in a similar format, but it seems like there should be SOME variation.


MuS
SplunkTrust

aaah, I was typing too long 😉
