Monitoring Splunk

daily on-disk rate for a given index?

tedder
Communicator

I know how to get my indexing volume per index. Here's what I use.

index="_internal" source="*metrics.log" per_index_thruput | eval GB=kb/(1024*1024) | stats sum(GB) as total by series date_mday | sort total | fields + date_mday,series,total | reverse

I know how to profile an index by what days it contains, what its size is, etc.

| dbinspect timeformat="%s" index=INDEXNAME
| rename state as category 
| stats min(earliestTime) as earliestTime max(latestTime) as latestTime sum(sizeOnDiskMB) as MB by category 
| convert timeformat="%m/%d/%Y" ctime(earliestTime) as earliestTime ctime(latestTime) as latestTime

What I really need to know is the "conversion rate": in other words, if I have 5 GB hitting a given index per day, how much space does that actually take on disk? It's hard to get that with "df" because the indexes are always changing in size, buckets are rolling along the hot->warm->cold->frozen train, and so on.
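One rough way to approximate that ratio, sketched only (INDEXNAME is a placeholder, and it assumes the _internal metrics and the buckets being inspected cover roughly the same time window), is to put the dbinspect disk total next to the indexed volume reported in metrics.log:

| dbinspect index=INDEXNAME
| stats sum(sizeOnDiskMB) as disk_mb
| appendcols
    [ search index=_internal source=*metrics.log group=per_index_thruput series=INDEXNAME
      | stats sum(kb) as indexed_kb ]
| eval indexed_mb=indexed_kb/1024
| eval disk_per_indexed_mb=round(disk_mb/indexed_mb, 2)

Keep in mind that _internal is typically retained for about 30 days by default, while dbinspect sees every bucket still on disk, so constrain both sides to the same window before trusting the number.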

sowings
Splunk Employee

I've done a bit of digging on this myself, comparing the uncompressed size of the raw data (gzip -dc warm_bucket/rawdata/journal.gz | wc -c) to the total size of the bucket (du -s), and I've been able to estimate per-bucket compression ratios. It's not the easiest answer to get to, since it involves direct shell access, traversing many buckets, and may not be accurate for hot buckets, but it's the best I've been able to come up with. HTH.
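If shell access isn't practical, a similar estimate can be pulled straight from dbinspect, assuming your version reports rawSize (the uncompressed raw data volume, in bytes) alongside sizeOnDiskMB. This is just an SPL restatement of the same comparison, sketched against the INDEXNAME placeholder from above:

| dbinspect index=INDEXNAME
| eval raw_mb=rawSize/1024/1024
| stats sum(raw_mb) as raw_mb sum(sizeOnDiskMB) as disk_mb by state
| eval disk_per_raw_mb=round(disk_mb/raw_mb, 2)

The ratio folds in both the journal.gz compression and the tsidx/index overhead, which is effectively what du -s against the whole bucket measures; the hot-bucket row is subject to the same accuracy caveat as above, so read it loosely or filter it out.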
