Deployment Architecture

What is the typical size of the (compressed) buckets; given 10gb a day in indexed data, what kind of growth will I see on disk?

Dimitri_McKay
Splunk Employee

I know this depends on the variance of the data being indexed, but since the indexing mechanism is proprietary, I'd like some real-world numbers.

1 Solution

Dimitri_McKay
Splunk Employee

50% (a 2:1 compression ratio) is a safe bet. That covers not only the index files but also the compressed raw log data. So for 10GB/day of indexed data, allocating 5GB of disk per day is reasonable, though I'd add at least 20% on top for growth.

Also, when talking storage, you'll want to consider your typical search window. The majority of searches run over "last 24 hours" or "last 7 days"; searches rarely go beyond that 7-day period. So plan on roughly 40GB of local storage, as fast as possible, since that disk handles collection, compression, indexing, and search for that short time period. Older data can then be rolled out to slower storage afterward.
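The rule of thumb above can be sketched as a quick calculation. This is just a back-of-the-envelope estimate based on the numbers in this answer; the function name, parameters, and the 90-day retention default are illustrative assumptions, not Splunk-documented values.

```python
def estimate_storage(daily_ingest_gb: float,
                     compression_ratio: float = 0.5,   # ~2:1, per the rule of thumb above
                     growth_headroom: float = 0.2,     # +20% for growth
                     hot_days: int = 7,                # typical search window
                     retention_days: int = 90) -> dict:  # assumed retention, adjust to taste
    """Rough Splunk disk-sizing estimate; all defaults are assumptions."""
    daily_on_disk = daily_ingest_gb * compression_ratio * (1 + growth_headroom)
    return {
        "daily_on_disk_gb": round(daily_on_disk, 1),
        "hot_fast_storage_gb": round(daily_on_disk * hot_days, 1),
        "total_storage_gb": round(daily_on_disk * retention_days, 1),
    }

# 10GB/day of indexed data, as in the question:
est = estimate_storage(10)
print(est)  # {'daily_on_disk_gb': 6.0, 'hot_fast_storage_gb': 42.0, 'total_storage_gb': 540.0}
```

Note the hot-tier figure (~42GB) lines up with the "40GB of local storage" suggestion above: 5GB/day compressed, plus 20% headroom, over a 7-day search window.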

