REST endpoint: data/indexes-extended - Why is total_raw_size bigger than total_size?

thenhaque
Explorer

I tried to interpret the output of the REST endpoint described in the Splunk docs:
http://docs.splunk.com/Documentation/Splunk/7.0.2/RESTREF/RESTintrospect#data.2Findexes-extended.2F....
and I have trouble understanding the two output parameters total_raw_size and total_size.

API:
data/indexes-extended/{name}

Usage details
total_raw_size (If total_size > 0) Cumulative size (fractional MB) on disk of the /rawdata/ directories of all buckets in this index, excluding frozen.
total_size Size (fractional MB) on disk of this index.

Example:
<s:key name="total_raw_size">28.000</s:key>
<s:key name="total_size">22.000</s:key>

Question:
Why is total_raw_size bigger than total_size? Note that I got the same result when calling this API on my cluster.

bandit
Motivator

total_raw_size: essentially the uncompressed bytes indexed on this indexer for this index
total_size: essentially the size on disk, after compression and including indexing metadata, on this indexer for this index

On average, it is normal for total_size to be about 50% of total_raw_size.
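
As a rough check, the two values from the example in the question can be parsed and compared against that rule of thumb. This is only a minimal sketch: it assumes the endpoint returns Splunk's usual Atom XML with <s:key name="..."> elements, and SAMPLE is a hand-written stand-in for the response, not real output. The helper name read_sizes is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace that Splunk's Atom responses bind to the "s:" prefix.
S_NS = "{http://dev.splunk.com/ns/rest}"

# Hypothetical, trimmed-down fragment using the values from the question.
SAMPLE = """
<s:dict xmlns:s="http://dev.splunk.com/ns/rest">
  <s:key name="total_raw_size">28.000</s:key>
  <s:key name="total_size">22.000</s:key>
</s:dict>
"""

def read_sizes(xml_text):
    """Return {key name: size in MB} for total_raw_size and total_size."""
    root = ET.fromstring(xml_text.strip())
    wanted = {"total_raw_size", "total_size"}
    sizes = {}
    for key in root.iter(S_NS + "key"):
        name = key.get("name")
        if name in wanted and key.text:
            sizes[name] = float(key.text)
    return sizes

sizes = read_sizes(SAMPLE)
# Fraction of the raw size that the index occupies on disk.
ratio = sizes["total_size"] / sizes["total_raw_size"]
print(sizes, round(ratio, 3))
```

For the example values the ratio comes out near 0.79, so total_size is smaller than total_raw_size but still well above the 50% average.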

strive
Influencer

Hi,

rawSize: The volume in bytes of the raw data files in each bucket. This value represents the volume before compression and the addition of index files.

sizeOnDisk: The size in MB of disk space that the bucket takes up expressed as a floating point number. This value represents the volume of the compressed raw data files and the index files.

http://docs.splunk.com/Documentation/Splunk/7.0.2/SearchReference/Dbinspect

Thanks
Strive
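
Note that the two dbinspect fields quoted above use different units (rawSize in bytes, sizeOnDisk in MB), so they need to be normalized before comparing them per bucket. A small sketch of that conversion follows; it assumes 1 MB = 1024 * 1024 bytes, and the function name and the bucket numbers are hypothetical.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes

def disk_to_raw_ratio(raw_size_bytes, size_on_disk_mb):
    """Return sizeOnDisk / rawSize after converting rawSize to MB."""
    if raw_size_bytes <= 0:
        raise ValueError("rawSize must be positive")
    raw_size_mb = raw_size_bytes / BYTES_PER_MB
    return size_on_disk_mb / raw_size_mb

# Hypothetical bucket: 100 MB of raw data taking 45 MB on disk.
ratio = disk_to_raw_ratio(100 * BYTES_PER_MB, 45.0)
print(f"{ratio:.2f}")
```

A ratio below 1 means the bucket takes less space on disk than the raw data it was built from.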