Considerations on using SSD for hot/warm indexes

Skins
Path Finder

For a small-scale distributed Splunk deployment (30 GB/day) with the indexes currently on a single disk.

We are planning to introduce an SSD for the hot/warm buckets.

I have read various posts on the subject, but a few questions remain.

If we were to configure the indexes to keep, say, 30-60 days of hot/warm data before rolling it to the slower disks, would there be anything to consider, such as:

What happens when a premium app such as ES also comes into play and the data model summary ranges are longer than the hot/warm retention?
E.g. the hot/warm buckets are kept on SSD for 30 days and then moved to the slower disk, but the Authentication data model summary range is configured for 1 year. Would that be a factor to consider or not?

Anything else to consider?
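
For reference, a minimal indexes.conf sketch of the setup being described, assuming hypothetical mount points /mnt/ssd and /mnt/hdd and using the main index as the example (names are illustrative only):

[volume:ssd_hotwarm]
path = /mnt/ssd/splunk
# cap the volume so warm buckets roll to cold before the SSD fills
maxVolumeDataSizeMB = 900000

[volume:hdd_cold]
path = /mnt/hdd/splunk

[main]
homePath = volume:ssd_hotwarm/defaultdb/db
coldPath = volume:hdd_cold/defaultdb/colddb
# thawedPath may not reference a volume
thawedPath = $SPLUNK_DB/defaultdb/thaweddb
maxWarmDBCount = 300

One caveat: Splunk rolls warm buckets to cold based on bucket count and volume size rather than on age, so a "30-60 days on SSD" target is approximated by sizing maxWarmDBCount and maxVolumeDataSizeMB against the daily ingest rate.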

Thanks.

1 Solution

s2_splunk
Splunk Employee

You can configure the storage location for DMA summaries separately via the tstatsHomePath setting in indexes.conf (see the sketch below).
Switching to SSD will greatly improve search performance for sparse and rare term searches, where random access speeds are important.
For dense searches, things will become CPU-bound: once the I/O constraint is removed, your server will spend most of its time decompressing buckets.
Hope that helps.
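
A minimal sketch of that setting, reusing the hypothetical ssd_hotwarm volume from the question above (note that tstatsHomePath must be defined in terms of a volume):

[volume:ssd_hotwarm]
path = /mnt/ssd/splunk

[main]
# put the DMA summary TSIDX files for this index on the SSD volume
tstatsHomePath = volume:ssd_hotwarm/defaultdb/datamodel_summary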


Skins
Path Finder

Thanks,

Would it be best practice to host the tstatsHomePath on the SSD as well?


s2_splunk
Splunk Employee

If you have sufficient space, yes, absolutely.
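
If you want to move all DMA summaries in one go, here is a sketch under the assumption that your version defaults tstatsHomePath to the _splunk_summaries volume (the default in recent Splunk versions); repointing that volume relocates every index's summaries:

[volume:_splunk_summaries]
# hypothetical SSD path; all indexes' datamodel_summary directories land here
path = /mnt/ssd/splunk/summaries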


Skins
Path Finder

Thanks, squire.

So, using the figures from the following search:

| dbinspect index=*
| search tsidxState="full"
| stats min(startEpoch) as MinStartTime max(startEpoch) as MaxStartTime min(endEpoch) as MinEndTime max(endEpoch) as MaxEndTime max(hostCount) as MaxHosts max(sourceTypeCount) as MaxSourceTypes sum(eventCount) as TotalEvents sum(rawSize) as TotalRawDataSizeMB sum(sizeOnDiskMB) as TotalDiskDataSizeMB by state
| eval TotalRawDataSizeMB=round((TotalRawDataSizeMB/1024/1024),6)
| eval MinStartTime=strftime(MinStartTime,"%Y/%m/%d %H:%M:%S")
| eval MaxStartTime=strftime(MaxStartTime,"%Y/%m/%d %H:%M:%S")
| eval MinEndTime=strftime(MinEndTime,"%Y/%m/%d %H:%M:%S")
| eval MaxEndTime=strftime(MaxEndTime,"%Y/%m/%d %H:%M:%S")
| eval PercentSizeReduction=round(((TotalRawDataSizeMB-TotalDiskDataSizeMB)/TotalRawDataSizeMB)*100,2)

Run over a 90-day period (if that were how long I wanted to keep my hot/warm data before rolling to cold), it returns:

state   TotalRawDataSizeMB   TotalDiskDataSizeMB   PercentSizeReduction
cold    27315.003618         8304.898440           69.60
hot     49257.884926         15460.234388          68.61
warm    1569389.609292       599056.425956         61.83

Total hot & warm usage on disk = 15460 + 599056 ≈ 614516 MB, roughly 600 GB.

So a 1 TB SSD would suffice in this instance?

If a disk of that size were unavailable, could we split the indexes, putting the ones we search most heavily on the SSD and leaving the others where they are?
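
A sketch of that split, assuming a hypothetical heavily-searched index named firewall; indexes you don't override keep their existing paths:

[volume:ssd_hotwarm]
path = /mnt/ssd/splunk

[firewall]
# hot/warm for this busy index moves to the SSD volume
homePath = volume:ssd_hotwarm/firewall/db
# cold (and thawed) stay on the existing disk
coldPath = $SPLUNK_DB/firewall/colddb
thawedPath = $SPLUNK_DB/firewall/thaweddb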

How would you make the same calculation for the DMA summaries?
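
dbinspect does not report DMA summary sizes, but the summarization REST endpoint does. A sketch rather than a definitive answer, assuming it is run on the search head that built the summaries (summary.size is reported in bytes):

| rest splunk_server=local count=0 /services/admin/summarization by_tstats=t
| eval datamodel=replace('summary.id', "DM_".'eai:acl.app'."_", "")
| stats sum(summary.size) as TotalSummarySizeBytes by datamodel
| eval TotalSummarySizeMB=round(TotalSummarySizeBytes/1024/1024, 2)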
