Getting Data In

How to create a search that shows how much old data is used past 30 days?

bsrikanthreddy5
Path Finder

Hi, 

I have smartstore cluster in AWS  with frozenTimePeriodInSecs =(7 years) and In DMC I see there are lots of downloading buckets from S3. I would like to know how much old data is retrieved so that I can efficiently allocate space to the cache, does anyone have any spl query to get details on how much old data is retrieved per index. 

Tags (1)
0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @bsrikanthreddy5m,

id DMC, you can have all the informations you need.

how many events you have in each index, home many storage you're using for each index.

If you want to customize searches, you could start from the DMC searches; e.g. to have the Total event count and the storage usage, you could see at [Settings -- Monitoring Console -- Indexing -- Indexes and Volums -- Indexes and Volums:Instance], or run the relative search

| rest splunk_server=DESKTOP-KBVMP9Q /services/data/indexes 
  | join title type=outer [
  | rest splunk_server=DESKTOP-KBVMP9Q /services/data/indexes-extended 
  | eval cold_bucket_size = if(isnotnull('bucket_dirs.cold.bucket_size'), 'bucket_dirs.cold.bucket_size', 'bucket_dirs.cold.size')
  | fields title, cold_bucket_size, total_size, total_bucket_count]
| `dmc_exclude_indexes`
| fields title datatype maxTotalDataSizeMB currentDBSizeMB frozenTimePeriodInSecs minTime coldPath.maxDataSizeMB homePath.maxDataSizeMB, homePath, coldPath, cold_bucket_size, total_size, total_bucket_count, totalEventCount
| eval currentDBSizeGB = if(isnotnull(currentDBSizeMB), round(currentDBSizeMB / 1024, 2), 0)
| eval maxTotalDataSizeGB = if((maxTotalDataSizeMB == 0) OR isnull(maxTotalDataSizeMB), "unlimited", round(maxTotalDataSizeMB / 1024, 2))
| eval disk_usage_gb = currentDBSizeGB." / ".maxTotalDataSizeGB
| eval currentTimePeriodDay = round((now() - strptime(minTime,"%Y-%m-%dT%H:%M:%S%z")) / 86400, 0)
| eval currentTimePeriodDay = if(isnull(currentTimePeriodDay), 0, currentTimePeriodDay)
| eval frozenTimePeriodDay = round(frozenTimePeriodInSecs / 86400, 0)
| eval frozenTimePeriodDay = if(isnull(frozenTimePeriodDay) OR frozenTimePeriodDay == 0, "unlimited", frozenTimePeriodDay)
| eval freeze_period_viz = currentTimePeriodDay." / ".frozenTimePeriodDay
| eval total_bucket_count = if(isnotnull(total_bucket_count), total_bucket_count, 0)
| eval totalEventCount = if(isnotnull(totalEventCount), totalEventCount, 0)
| eval home_bucket_size_gb = round((total_size - if(isnull(cold_bucket_size), 0, cold_bucket_size)) / 1024, 2)
| eval home_bucket_size_gb = if(isnull(home_bucket_size_gb), 0, home_bucket_size_gb)
| eval home_bucket_capacity_gb = if(isnull('homePath.maxDataSizeMB') OR 'homePath.maxDataSizeMB' = 0, "unlimited", round('homePath.maxDataSizeMB' / 1024, 2))
| eval home_bucket_usage_gb = home_bucket_size_gb." / ".home_bucket_capacity_gb
| eval cold_bucket_size_gb = if(isnull(cold_bucket_size), 0, round(cold_bucket_size / 1024, 2))
| eval cold_bucket_capacity_gb = if(isnull('coldPath.maxDataSizeMB') OR 'coldPath.maxDataSizeMB' = 0, "unlimited", round('coldPath.maxDataSizeMB' / 1024, 2))
| eval cold_bucket_usage_gb = cold_bucket_size_gb." / ".cold_bucket_capacity_gb
| fields title, datatype, freeze_period_viz, disk_usage_gb, home_bucket_usage_gb, cold_bucket_usage_gb, total_bucket_count, totalEventCount, currentDBSizeGB,
      cold_bucket_size_gb, home_bucket_size_gb, homePath, coldPath | fields title, datatype, freeze_period_viz, disk_usage_gb, home_bucket_usage_gb, cold_bucket_usage_gb, totalEventCount, total_bucket_count
            | eval total_bucket_count=tostring(total_bucket_count, "commas")
            | eval totalEventCount=tostring(totalEventCount, "commas")
            | rename title as Index, datatype as "Data Type", disk_usage_gb as "Index Usage (GB)", freeze_period_viz as "Data Age vs Frozen Age (days)", home_bucket_usage_gb as "Home Path Usage (GB)", cold_bucket_usage_gb as "Cold Path Usage (GB)", total_bucket_count as "Total Bucket Count", totalEventCount as "Total Event Count"

Obviously you can use the time period you need.

Ciao.

Giuseppe

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Observability Simplified: Combining User Experience, Application Performance & ...

Tech Talk Observability Simplified: Combining User Experience, Application Performance & Network ...

Event Series May & June: From Network Visibility to Service Intelligence

Unifying the Network: Moving from Alert Noise to Service Intelligence with Splunk ITSI In today’s hybrid ...

Global Splunk User Group Events: May + June 2026

Your Splunk Community Awaits: Discover Upcoming User Group Events Worldwide    Staying ahead in the fast-paced ...