
Daily on-disk rate for a given index?

tedder
Communicator

I know how to get my indexing volume per index. Here's what I use.

index="_internal" source="*metrics.log" per_index_thruput | eval GB=kb/(1024*1024) | stats sum(GB) as total by series date_mday | sort total | fields + date_mday,series,total | reverse

I know how to profile an index by what days it contains, what its size is, etc.

| dbinspect timeformat="%s" index=INDEXNAME
| rename state as category 
| stats min(earliestTime) as earliestTime max(latestTime) as latestTime sum(sizeOnDiskMB) as MB by category 
| convert timeformat="%m/%d/%Y" ctime(earliestTime) as earliestTime ctime(latestTime) as latestTime

What I really need to know is the "conversion rate". In other words, if I have 5 GB hitting a given index per day, how much space does that take on disk? It's hard to get that with "df" because the indexes are always changing in size, buckets are moving along the hot->warm->cold->frozen train, and so on.
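
One rough way to approximate this from within Splunk might be to compare the rawSize and sizeOnDiskMB fields that dbinspect reports per bucket — just a sketch, assuming rawSize holds the uncompressed raw data size in bytes (hot buckets may report incomplete numbers):

| dbinspect index=INDEXNAME
| stats sum(rawSize) as rawBytes sum(sizeOnDiskMB) as diskMB
| eval rawGB=round(rawBytes/1024/1024/1024,2), diskGB=round(diskMB/1024,2)
| eval disk_per_raw=round(diskGB/rawGB,2)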


sowings
Splunk Employee

I've done a bit of digging on this myself, comparing the uncompressed size of the raw data (gzip -dc warm_bucket/rawdata/journal.gz | wc -c) to the total size of the bucket (du -s), and been able to estimate some per-bucket compression ratios. It's not the easiest answer to get to, since it involves direct shell access and traversing many buckets, and it may not be accurate for hot buckets, but it's the best I've been able to come up with. HTH.
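
A minimal sketch of that per-bucket check as a loop (the $SPLUNK_DB path and the db_* glob are illustrative and only match warm buckets under db/; adjust for your layout and for colddb if needed):

SPLUNK_DB=/opt/splunk/var/lib/splunk
for bucket in "$SPLUNK_DB"/INDEXNAME/db/db_*; do
    # uncompressed raw data size vs. total bucket size on disk
    raw_kb=$(gzip -dc "$bucket/rawdata/journal.gz" | wc -c | awk '{print int($1/1024)}')
    disk_kb=$(du -sk "$bucket" | awk '{print $1}')
    awk -v b="$bucket" -v r="$raw_kb" -v d="$disk_kb" \
        'BEGIN { printf "%s raw=%dKB disk=%dKB disk/raw=%.2f\n", b, r, d, (r ? d/r : 0) }'
done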
