Deployment Architecture

How does Splunk estimate the total raw data size?

srajarat2
Path Finder

I had just set up Splunk with indexer clustering (RF=3, SF=2) and no data, then loaded a 1TB syslog file using oneshot. The "Index Detail: Deployment" page showed a total index size of 1121GB but a total raw data size (uncompressed) of 1783GB, and hence a Raw to Index Size Ratio of 1.59:1.

My question is: how is it possible for a 1024GB (1TB) file to be treated as 1783GB?

0 Karma

somesoni2
Revered Legend

The index size doesn't depend only on the uncompressed raw data size. Splunk creates compressed raw data files as well as a set of index files that make the data searchable, and the index consists of both types of files. The compression ratio of the raw data files and the size of the index files depend on various factors. For more information, see the following documentation (the second link has an example of how Splunk calculates space).

http://docs.splunk.com/Documentation/Splunk/6.5.1/Indexer/HowSplunkstoresindexes
http://docs.splunk.com/Documentation/Splunk/6.5.1/Indexer/Systemrequirements#Storage_requirement_exa...
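As a rough illustration of the kind of sizing calculation described above, here is a minimal sketch for a clustered deployment. The 15% (compressed rawdata) and 35% (index files) ratios are commonly quoted rules of thumb and are assumptions here, not figures from this thread; actual ratios vary with the data. The function name `estimate_cluster_disk_gb` is hypothetical.

```python
def estimate_cluster_disk_gb(raw_gb, replication_factor, search_factor,
                             rawdata_ratio=0.15, index_ratio=0.35):
    """Rough disk estimate for an indexer cluster.

    rawdata_ratio and index_ratio are assumed rule-of-thumb values;
    real compression depends heavily on the data being indexed.
    """
    # Every replicated copy of a bucket stores the compressed rawdata.
    rawdata_gb = raw_gb * rawdata_ratio * replication_factor
    # Only searchable copies also store the index (tsidx) files.
    index_files_gb = raw_gb * index_ratio * search_factor
    return rawdata_gb, index_files_gb, rawdata_gb + index_files_gb


# Values from the question: 1024GB ingested, RF=3, SF=2.
rawdata_gb, index_files_gb, total_gb = estimate_cluster_disk_gb(1024, 3, 2)
print(f"rawdata copies: {rawdata_gb:.1f}GB")
print(f"index files:    {index_files_gb:.1f}GB")
print(f"estimated total: {total_gb:.1f}GB")
```

Under these assumed ratios the estimate is in the same range as the 1121GB total index size reported in the question, but it says nothing about how the "Total Raw Data size (uncompressed)" figure is derived.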

0 Karma

srajarat2
Path Finder

Sorry if I was not clear. I am not asking about the index size. I do understand the sizing calculations for rawdata (RF) and tsidx (SF) in clustered indexer mode. My question is specifically about the "Index Detail: Deployment" page, which shows the following information under "Index Structure Overview" (in 6.5.1).

  8 (Indexers)    1121GB (Total Index Size)    1783GB (Total Raw Data size (uncompressed))    1.59:1 (Raw to Index Size Ratio)

My question is specifically how Splunk measures the "Total Raw Data size (uncompressed)": I just ingested a 1024GB syslog file and was hoping to see that as the total raw data size, not 1783GB.
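To make the discrepancy concrete, the arithmetic behind the reported figures can be sketched as follows. The variable names are illustrative only; the numbers are the ones quoted from the page and from the ingested file size.

```python
# Figures quoted from the "Index Structure Overview" panel.
total_index_gb = 1121   # Total Index Size
reported_raw_gb = 1783  # Total Raw Data size (uncompressed)
# Size of the syslog file that was actually ingested.
ingested_gb = 1024

# The ratio shown on the page is reported raw size over index size.
raw_to_index = reported_raw_gb / total_index_gb
# How much larger the reported raw size is than the ingested file.
expansion = reported_raw_gb / ingested_gb
extra_gb = reported_raw_gb - ingested_gb

print(f"Raw to Index Size Ratio: {raw_to_index:.2f}:1")
print(f"Reported raw / ingested: {expansion:.2f}x")
print(f"Unexplained difference:  {extra_gb}GB")
```

The first value reproduces the 1.59:1 ratio on the page; the second shows that the reported raw size is about 1.74 times the size of the ingested file, which is the gap the question is about.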

0 Karma

srajarat2
Path Finder

You can see the screenshot here.

[screenshot]

0 Karma