Disk space issue on Indexer

siva_cg
Path Finder

Hi All,

We have a replication factor of 2 and a search factor of 2 across 2 sites in a clustered environment.
For an index with 11 GB of license consumption per day, about 40 GB of disk space was consumed. What is the relationship between the two, and which settings affect disk usage on the indexer? Thanks in advance.

0 Karma

hunderliggur
Path Finder

It all depends on what you are doing with the 11 GB you are indexing. If it is "classic" log files (syslog), you are looking at about 15% of the ingested volume for raw data and 35% for index files (50% in total). With rf:2 and sf:2 you have two copies of each, so you will use about 100% of ingest, or 11 GB.

If your data is very verbose (e.g., JSON data) and you are using indexed extractions, storage can run upwards of 150% of ingest, x 2 copies = 300% (33 GB).

Any other processing you do (summary indexing, reporting, etc.) will take additional space.
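
To make the arithmetic above explicit, here is a minimal Python sketch. The function name and its parameters are made up for illustration, and the ratios are only the rules of thumb quoted above, not measured values.

```python
def estimated_disk_gb(daily_ingest_gb: float, storage_ratio: float, copies: int) -> float:
    """Rough disk usage for one day of ingest, summed over all copies in the cluster.

    storage_ratio is disk used per copy as a fraction of ingested volume
    (0.50 for syslog-like data, 1.50 for verbose data with indexed extractions,
    both taken from the rule of thumb above).
    """
    return daily_ingest_gb * storage_ratio * copies


# 11 GB/day with two full copies (rf:2, sf:2)
syslog_gb = estimated_disk_gb(11, 0.50, 2)
json_gb = estimated_disk_gb(11, 1.50, 2)

print(f"syslog-like data:   {syslog_gb:.1f} GB per day")
print(f"indexed extractions: {json_gb:.1f} GB per day")
```

The observed 40 GB sits a little above the second estimate, which is consistent with heavy use of indexed extractions plus some additional overhead.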

0 Karma

hunderliggur
Path Finder

...and if you are running a multisite cluster, we would need to know the replication settings in server.conf on the master (for example site_replication_factor and site_search_factor).
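
As an illustration of why those settings matter: multisite factors are written as comma-separated `name:count` pairs, and the `total` entry is the number of copies kept across the whole cluster. The sketch below parses such a value in Python. The helper name and the sample value are assumptions for illustration only, not the poster's actual configuration.

```python
def parse_site_factor(value: str) -> dict[str, int]:
    """Parse a multisite factor string such as 'origin:1,total:2' into a dict."""
    factors = {}
    for part in value.split(","):
        name, _, count = part.strip().partition(":")
        factors[name] = int(count)
    return factors


# Hypothetical value; check the [clustering] stanza on your own master.
example = parse_site_factor("origin:1,total:2")
print(f"copies across the cluster: {example['total']}")
```

Whatever the `total` values turn out to be, they are the multipliers to use when estimating disk usage from ingest volume.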

0 Karma

siva_cg
Path Finder

Thank you @hunderliggur. Your explanation matches my environment fairly well, as we use many indexed extractions and tstats. Since we have many reports based on tstats, what would be a better way to optimize disk space? Thanks in advance.

0 Karma

hunderliggur
Path Finder

The only quick solution would be to reduce your searchable copies (the search factor) to 1. However, if a node fails, searches will be delayed while the cluster makes another copy searchable, as explained here https://docs.splunk.com/Documentation/Splunk/7.3.1/Indexer/Thesearchfactor
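
A rough way to see what that change would save: a non-searchable copy keeps only the raw data, so index files scale with the search factor while raw data scales with the replication factor. The Python sketch below assumes the syslog-style ratios mentioned earlier in this thread (15% raw data, 35% index files). The names are made up for illustration, and with heavy indexed extractions the index share, and therefore the saving, would likely be larger.

```python
RAW_RATIO = 0.15    # raw data as a fraction of ingest (rule of thumb, not measured)
INDEX_RATIO = 0.35  # index files as a fraction of ingest (rule of thumb, not measured)


def disk_gb(ingest_gb: float, replication_factor: int, search_factor: int) -> float:
    """Estimate disk usage when only searchable copies carry index files."""
    rawdata = ingest_gb * RAW_RATIO * replication_factor
    index_files = ingest_gb * INDEX_RATIO * search_factor
    return rawdata + index_files


current = disk_gb(11, replication_factor=2, search_factor=2)
reduced = disk_gb(11, replication_factor=2, search_factor=1)

print(f"rf:2 sf:2 -> {current:.2f} GB per day")
print(f"rf:2 sf:1 -> {reduced:.2f} GB per day")
```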

Otherwise, you need to quantify where your space is being used. Is it index buckets and metadata, dispatch files, job history, etc?
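
One way to start quantifying this is to walk an index directory on the indexer and total the file sizes by type. The Python sketch below is a hedged example: the function name, the three categories, and the path in the usage section are assumptions, so adjust them to your own layout.

```python
import os
from collections import defaultdict


def usage_by_category(index_path: str) -> dict:
    """Sum file sizes in bytes under index_path, grouped into rough categories."""
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(index_path):
        # Path components between index_path and the current directory
        parts = os.path.relpath(root, index_path).split(os.sep)
        for name in files:
            try:
                size = os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # file was rolled or removed during the scan
            if "rawdata" in parts:
                totals["rawdata"] += size
            elif name.endswith(".tsidx"):
                totals["tsidx"] += size
            else:
                totals["other"] += size
    return dict(totals)


if __name__ == "__main__":
    # Assumed location of an index's buckets; replace with your own path.
    path = "/opt/splunk/var/lib/splunk/defaultdb/db"
    for category, size in sorted(usage_by_category(path).items()):
        print(f"{category:8s} {size / 1024 ** 3:8.2f} GB")
```

If the tsidx total dominates, the index files (and therefore indexed extractions) are the main consumer. If "other" is large, look outside the buckets at dispatch and job artifacts.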

0 Karma

codebuilder
Influencer

The 11GB reported by your license consumption includes only the amount of data that was actually indexed. Replication does not count against your license.

The 40GB would reflect indexed data, replicated data, meta files, logs, etc.

Replication factor would be the biggest attribute affecting disk consumption, but you may also want to look at compression, if you do not already have it enabled.

----
An upvote would be appreciated and Accept Solution if it helps!
0 Karma