hi guys -
i have a stand-alone splunk server that i'm trying to size appropriately. we have a fixed 3 TB volume to work with.
i'm wondering how large or small to make the various indexes, especially the built-in ones (summary, _internal, etc.).
it seems like the default sizes (500,000 MB each) could theoretically let the indexes overrun the volume. so i guess my questions are:
1 - can / should we resize the internal indexes (_internal, history, _audit, etc.) to fit within the given storage volume?
2 - what percentage should we reserve for summary indexing? 25% of the desired index (and/or main)?
cheers,
andrew
That all depends on your requirements for the data stored in your indices. How is your storage broken out across hot, warm, cold, frozen, and archive buckets? Will you be summary indexing all your data, and how will that be broken out: hourly, daily, weekly, monthly? Do you have retention/security policies for certain data sources or types?
This varies dramatically depending on your requirements.
The 500,000 MB default is how large your index can grow across all buckets (hot, warm, cold).
Additional reading:
Set a retirement and archiving policy
Estimate index size <-- Splunk wiki page on how to perform estimations.
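To make that concrete, here is a minimal sketch of what the sizing and retention knobs look like in indexes.conf. The cap and retention values are placeholders for illustration, not recommendations:

```ini
# indexes.conf -- illustrative values only; size to your own retention needs
[main]
homePath   = $SPLUNK_DB/defaultdb/db
coldPath   = $SPLUNK_DB/defaultdb/colddb
thawedPath = $SPLUNK_DB/defaultdb/thaweddb
# Cap across hot/warm/cold buckets; buckets roll to frozen (deleted by
# default) when either this cap or the retention time below is hit first.
maxTotalDataSizeMB     = 2000000
frozenTimePeriodInSecs = 15552000   # ~180 days
```

The same two settings exist per index, so internal and summary indexes can be capped the same way.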
In most cases the data will have rolled to frozen and been deleted before the max DB size is approached. Be aware that if you modify indexes.conf, buckets may roll over and cause data to be deleted.
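If deletion on freezing is the worry, one hedge is to archive frozen buckets instead of deleting them. A hypothetical fragment (the archive path is a placeholder):

```ini
# indexes.conf -- shrinking maxTotalDataSizeMB below what is already on disk
# forces the oldest buckets to roll to frozen, so set the cap deliberately.
[main]
maxTotalDataSizeMB = 500000
# Without this setting, frozen buckets are deleted; with it, they are
# copied to the given directory instead.
coldToFrozenDir = /archive/splunk/main
```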
thank you. so i think it's fair to say that the sum of all your indexes' size caps should ideally not exceed your available disk space / volume(s). the internal indexes seem unlikely to use up much space, but your main / primary indexes should never be allowed to exceed 100% of available space - perhaps 90 or 95% is even better.
i'm comparing this somewhat to partitioning new disk(s) during an initial OS install (swap, home, os, etc.) - the installer in most cases won't let you allocate more than 100%.
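That budgeting arithmetic can be sketched as a small helper. The index names, split percentages, and 10% reserve below are illustrative assumptions, not recommendations:

```python
# Hypothetical sizing helper: splits a fixed volume across indexes so the
# configured maxTotalDataSizeMB caps can never exceed the disk.
def size_indexes_mb(volume_gb, reserve_pct=0.10, splits=None):
    """Return per-index size caps (MB) that sum to the usable budget."""
    # Example split: assumption for illustration only.
    splits = splits or {"main": 0.70, "summary": 0.25, "_internal": 0.05}
    assert abs(sum(splits.values()) - 1.0) < 1e-9, "splits must total 100%"
    usable_mb = volume_gb * 1024 * (1 - reserve_pct)   # hold back headroom
    return {name: int(usable_mb * frac) for name, frac in splits.items()}

# 3 TB volume, 10% held in reserve
budget = size_indexes_mb(3 * 1024)
```

The resulting numbers would go into each index's maxTotalDataSizeMB; the reserve keeps the sum of caps safely under the volume size.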
In my environment I have different types of storage for hot (local SSD), warm (tier 2 SAN), and cold (tier 3 SAN). In the end it comes down to knowing your data and configuring indexes based on retention/security/importance, using settings like maxHotSpanSecs (the upper bound of a hot bucket's timespan) and maxHotIdleSecs (the maximum idle life of a hot bucket). Hope this helps.
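A tiered layout like that might look roughly like the following indexes.conf fragment. The index name, mount paths, and values are placeholders:

```ini
# indexes.conf -- hypothetical tiered storage; paths/values are illustrative
[firewall]
homePath   = /ssd/splunk/firewall/db             # hot/warm on local SSD
coldPath   = /san_tier3/splunk/firewall/colddb   # cold on slower SAN
thawedPath = /san_tier3/splunk/firewall/thaweddb
maxHotSpanSecs = 86400   # a hot bucket spans at most one day of events
maxHotIdleSecs = 3600    # roll a hot bucket after an hour with no new data
```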
@awurster, "what happens when an indexer runs out of space on disk?" Your indexer will pause (stop indexing), which has the potential for data loss. You can minimize possible data loss by using indexer acknowledgement and increasing the input and output queue sizes for streamed data sources. _internal and summary indexes are just indexes: the same rules apply, and they will be paused too. Once the disk space issue has been resolved, your indexer will resume indexing. By default an indexer pauses at 2000 MB of free disk space. http://docs.splunk.com/Documentation/Splunk/5.0/Indexer/Setlimitsondiskusage
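As a sketch of those mitigations, here are the relevant knobs; the output group name and the exact values are assumptions for illustration:

```ini
# server.conf on the indexer -- raise the pause threshold above the
# 2000 MB default for earlier warning (value is illustrative).
[diskUsage]
minFreeSpace = 5000

# outputs.conf on a forwarder -- indexer acknowledgement plus a larger
# in-memory queue helps ride out a paused indexer.
[tcpout:primary]
useACK = true
maxQueueSize = 7MB
```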
thanks.
i guess in that case my question is really two questions: "what happens when an indexer runs out of space on disk?" and "if something like main or another regular index fills up, what happens to retention of data in other key places like _internal or summary?"
just want to avoid any disasters once the disk fills up.
