Monitoring Splunk

daily on-disk rate for a given index?

tedder
Communicator

I know how to get my indexing volume per index. Here's what I use.

index="_internal" source="*metrics.log" per_index_thruput | eval GB=kb/(1024*1024) | stats sum(GB) as total by series date_mday | sort total | fields + date_mday,series,total | reverse

I know how to profile an index by what days it contains, what its size is, etc.

| dbinspect timeformat="%s" index=INDEXNAME
| rename state as category 
| stats min(earliestTime) as earliestTime max(latestTime) as latestTime sum(sizeOnDiskMB) as MB by category 
| convert timeformat="%m/%d/%Y" ctime(earliestTime) as earliestTime ctime(latestTime) as latestTime

What I really need to know is the "conversion rate": in other words, if I have 5 GB hitting a given index per day, how much space does that actually take on disk? It's hard to get that with "df" because the indexes are always changing in size, buckets are rolling along the hot->warm->cold->frozen train, and so on.
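One rough way to approximate that ratio, sketched only (INDEXNAME is a placeholder, and it assumes the _internal metrics and the buckets being inspected cover roughly the same time window), is to put the dbinspect disk total next to the indexed volume reported in metrics.log:

| dbinspect index=INDEXNAME
| stats sum(sizeOnDiskMB) as disk_mb
| appendcols
    [ search index=_internal source=*metrics.log group=per_index_thruput series=INDEXNAME
      | stats sum(kb) as indexed_kb ]
| eval indexed_mb=indexed_kb/1024
| eval disk_per_indexed_mb=round(disk_mb/indexed_mb, 2)

Keep in mind that _internal is typically retained for about 30 days by default, while dbinspect sees every bucket still on disk, so constrain both sides to the same window before trusting the number.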

sowings
Splunk Employee

I've done a bit of digging on this myself, comparing the uncompressed size of the raw data (gzip -dc warm_bucket/rawdata/journal.gz | wc -c) to the total size of the bucket (du -s), and I've been able to estimate per-bucket compression ratios. It's not the easiest answer to get to, since it involves direct shell access, traversing many buckets, and may not be accurate for hot buckets, but it's the best I've been able to come up with. HTH.
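If shell access isn't practical, a similar estimate can be pulled straight from dbinspect, assuming your version reports rawSize (the uncompressed raw data volume, in bytes) alongside sizeOnDiskMB. This is just an SPL restatement of the same comparison, sketched against the INDEXNAME placeholder from above:

| dbinspect index=INDEXNAME
| eval raw_mb=rawSize/1024/1024
| stats sum(raw_mb) as raw_mb sum(sizeOnDiskMB) as disk_mb by state
| eval disk_per_raw_mb=round(disk_mb/raw_mb, 2)

The ratio folds in both the journal.gz compression and the tsidx/index overhead, which is effectively what du -s against the whole bucket measures; the hot-bucket row is subject to the same accuracy caveat as above, so read it loosely or filter it out.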
