Solved: recommended index sizes

awurster · ‎10-29-2012

hi guys -

i have a stand-alone splunk server that i'm trying to size appropriately. we have a fixed 3TB volume to work with.

i am wondering how large or small to make the various indexes, especially the built-in ones: summary, _internal, etc.

it seems like the default sizes would theoretically allow for overrun on the volume (500,000 MB). so i guess my questions are:

1 - can / should we resize the internal indexes (i.e. _internal, history, _audit) to be more aware of the given storage volumes?
2 - what percentage should we reserve for summary indexing? 25% of desired index (and/or main)?

cheers,

andrew

bmacias84 · ‎10-29-2012

That all depends on your requirements for the data stored in your indexes. Also, how is your storage broken out for your hot, warm, cold, frozen, and archive buckets? Will you be summary indexing all your data, and how will that be broken out: hourly, daily, weekly, monthly? Do you have a retention/security policy for certain data sources/types?

This varies dramatically depending on your requirements.

The 500,000 MB is the maximum size of a single index across all of its buckets (hot, warm, cold).

Additional Reading:

How Splunk stores indexes

Set a retirement and archiving policy

Set up multiple indexes

How indexing works

Estimate index size <-- Splunk Wiki on how to perform estimations.

bmacias84 · ‎11-02-2012

In most cases the data will have rolled to frozen and been deleted before the max DB size is approached. Be careful when you modify indexes.conf: the change may cause buckets to roll over and data to be deleted.
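As a rough way to see which limit an index would reach first, here is a minimal Python sketch. It assumes the two limits in play are the age limit (frozenTimePeriodInSecs) and the size cap (maxTotalDataSizeMB) from indexes.conf. The 0.5 disk ratio (on-disk size as a fraction of raw data) is only a rule-of-thumb assumption, and the function names and sample numbers are made up for illustration; measure your own data before relying on any of it.

```python
SECONDS_PER_DAY = 86_400


def days_until_size_limit(daily_raw_mb, max_total_mb, disk_ratio=0.5):
    """Days of data that fit under the size cap.

    disk_ratio is an assumed on-disk/raw ratio, not a measured value.
    """
    return max_total_mb / (daily_raw_mb * disk_ratio)


def first_limit(daily_raw_mb, max_total_mb, frozen_secs, disk_ratio=0.5):
    """Return which limit is reached first and after how many days."""
    size_days = days_until_size_limit(daily_raw_mb, max_total_mb, disk_ratio)
    age_days = frozen_secs / SECONDS_PER_DAY
    if age_days <= size_days:
        return ("age", age_days)
    return ("size", size_days)


# Hypothetical index: 10,000 MB/day of raw data, 500,000 MB cap.
# With a 30-day age limit, the age limit is reached first.
print(first_limit(10_000, 500_000, 30 * SECONDS_PER_DAY))
# With a much longer age limit, the size cap is reached first.
print(first_limit(10_000, 500_000, 188_697_600))
```

If the size cap comes first, the oldest buckets are frozen earlier than the age limit alone would suggest, which is the case to watch for when changing indexes.conf.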

awurster · ‎11-01-2012

thank you. so i think it's fair to say that the sum of all your indexes should ideally not exceed the size of your available disk space / volume(s). it seems very unlikely for the internal indexes and so on to really use up much space, however your main / primary indexes should never exceed 100% of available space - perhaps even 90 or 95% is better.
i'm somewhat comparing this to when you partition new disk(s) during an initial OS install (i.e. swap, home, os, etc). the installation process in most cases won't let you allocate more than 100%.
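The rule of thumb above (the sum of all index caps stays under roughly 90% of the volume) can be checked with a short Python sketch. The index names and per-index caps below are hypothetical examples, not recommendations, and the 0.90 headroom is simply the figure suggested in this post.

```python
def check_index_budget(index_caps_mb, volume_mb, headroom=0.90):
    """Compare the sum of per-index size caps against a share of the volume."""
    budget_mb = volume_mb * headroom
    allocated_mb = sum(index_caps_mb.values())
    return {
        "allocated_mb": allocated_mb,
        "budget_mb": budget_mb,
        "fits": allocated_mb <= budget_mb,
    }


# 3 TB volume expressed in MB.
VOLUME_MB = 3 * 1024 * 1024

# Hypothetical caps chosen for illustration only.
caps = {
    "main": 2_000_000,
    "summary": 500_000,
    "_internal": 100_000,
    "_audit": 50_000,
}

result = check_index_budget(caps, VOLUME_MB)
print(result["allocated_mb"], result["fits"])
```

Leaving every index at a 500,000 MB cap fails the same check once there are seven or more indexes, which is the overrun the original question was worried about.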

bmacias84 · ‎10-30-2012

In my env I have different types of storage for HOT (local SSD), WARM (tier 2 SAN), and COLD (tier 3 SAN). In the end it comes down to knowing your data and configuring indexes based on retention/security/importance, including settings like maxHotSpanSecs (upper bound on the timespan of a hot bucket) and maxHotIdleSecs (maximum life of a hot bucket). Hope this helps.

bmacias84 · ‎10-30-2012

@awurster, "what happens when an indexer runs out of space on disk?" Your indexers will pause (stop indexing), which has the potential for data loss. You can minimize possible data loss by using indexer acknowledgement and by increasing the input and output queueSize for streamed data sources. _internal and summary indexes are just indexes, so the same rules apply and they will be paused too. Once the disk space issue has been resolved, your indexer will continue indexing. An indexer pauses at 2000 MB of free disk space by default. http://docs.splunk.com/Documentation/Splunk/5.0/Indexer/Setlimitsondiskusage
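To watch for this from outside Splunk, a minimal Python sketch can compare free space on the index volume against the pause threshold. The 2000 MB default comes from the figure quoted above; the function names are hypothetical, and the path you pass in is whatever volume holds your indexes.

```python
import shutil

MB = 1024 * 1024


def free_mb_on(path):
    """Free space, in MB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / MB


def would_pause(free_mb, min_free_mb=2000):
    """True if free space is below the threshold at which indexing pauses."""
    return free_mb < min_free_mb


def headroom_mb(free_mb, min_free_mb=2000):
    """MB that can still be written before the threshold is crossed."""
    return max(0.0, free_mb - min_free_mb)


# Example: check the filesystem of the current directory.
free = free_mb_on(".")
print(f"free: {free:.0f} MB, would pause: {would_pause(free)}")
```

Running something like this on a schedule and alerting well before the headroom reaches zero gives you time to adjust index caps or retention before indexing stops.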

awurster · ‎10-30-2012

thanks.

i guess in that case my question is more towards "what happens when an indexer runs out of space on disk?" and then "if something like main or another regular index fills up - what happens to retention of data in other key places like _internal or summary?"

just want to avoid any disasters once the disk fills up.
