I just indexed a data set. According to the Indexes page in the Manager, the index is 126MB. However, the Licensing page showed 252MB of consumption. So far as I can tell, the data isn't being indexed twice.
Anyone know what's going on? I have several 100-300MB data sets to load and I don't want to overflow our 10GB license.
You are seeing the compressed data in the index, and the uncompressed data in the license manager.
If you are doing a one-time huge load, it is OK to go over the license a few times a month, as long as you don't collect so many warnings that search stops working in Splunk. After 30 days, those warnings will be gone and you can do it again. This lets you load historical data when needed without scaling up your Splunk instance.
Hi redc,
License consumption is always based on the amount of raw data, not the size the data has after the indexing process. This means that with your 10GB license you can index 10GB of raw data per day. For more details on how licensing works, please check the docs.
hope this helps ...
cheers, MuS
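To make the "raw data per day" point concrete, here is a minimal Python sketch for checking whether a load fits before you index it. It assumes the licensed volume is roughly the uncompressed size of the files on disk and that the quota is counted in binary units (1GB = 1024^3 bytes). The helper names `raw_bytes` and `fits_daily_quota` are made up for illustration and are not part of Splunk.

```python
import os

# Assumption: binary units; adjust if your license is counted differently.
MB = 1024 ** 2
GB = 1024 ** 3


def raw_bytes(paths):
    """Sum the uncompressed on-disk size of the files to be loaded."""
    return sum(os.path.getsize(p) for p in paths)


def fits_daily_quota(new_bytes, used_today_bytes, quota_bytes=10 * GB):
    """Return True if loading new_bytes today stays within the daily quota."""
    return used_today_bytes + new_bytes <= quota_bytes


if __name__ == "__main__":
    # Hypothetical days: one 300MB data set still to load in each case.
    print(fits_daily_quota(300 * MB, int(9.8 * GB)))  # False: would go over
    print(fits_daily_quota(300 * MB, 9 * GB))         # True: still fits
```

In practice you would pass the exported files to `raw_bytes` and feed the result in as `new_bytes`, with `used_today_bytes` taken from the Licensing page.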
The compression ratio for the raw data itself is actually better than 50%: the index size shown in the Manager also includes index files such as .tsidx files, not just the compressed raw data.
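As a rough sketch of that arithmetic in Python, using the 126MB and 252MB figures from the question: the function names are invented for illustration, and the .tsidx size below is a made-up placeholder, since the thread never states the real split between compressed raw data and index files.

```python
MB = 1024 ** 2


def on_disk_ratio(index_bytes, licensed_bytes):
    """Whole index size (rawdata plus index files) relative to licensed raw volume."""
    return index_bytes / licensed_bytes


def rawdata_ratio(index_bytes, licensed_bytes, tsidx_bytes):
    """Same ratio after subtracting the space taken by index files such as .tsidx."""
    return (index_bytes - tsidx_bytes) / licensed_bytes


index_size = 126 * MB   # from the Indexes page in the question
licensed = 252 * MB     # from the Licensing page in the question

print(on_disk_ratio(index_size, licensed))  # 0.5

# Hypothetical: suppose 40MB of the index were .tsidx files (placeholder value).
hypothetical_tsidx = 40 * MB
print(rawdata_ratio(index_size, licensed, hypothetical_tsidx))
```

Whatever the real .tsidx size turns out to be, the second number is always lower than the first, which is why the raw data alone compresses better than the 50% the two pages suggest.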
Check out this wiki page. It contains a number of index volume searches that you can run and might find useful.
http://wiki.splunk.com/Community:TroubleshootingIndexedDataVolume
It's SQL data being exported as XML through a sqlcmd.exe script, so I have no idea what the size is.
If you have relatively uniform data, like syslog, you could see a fairly consistent compression ratio. It sounds like your compression ratio is 50%.
How large was the original data set? It should match the license manager.
It seems a little odd that it would always be exactly twice as much, though (this isn't the first time I've noticed this behavior). It is all text in a similar format, but it seems like there should be SOME variation.
aaah, I was typing too long 😉