Getting Data In

How to create a search that shows how much old data is used past 30 days?

bsrikanthreddy5
Path Finder

Hi, 

I have smartstore cluster in AWS  with frozenTimePeriodInSecs =(7 years) and In DMC I see there are lots of downloading buckets from S3. I would like to know how much old data is retrieved so that I can efficiently allocate space to the cache, does anyone have any spl query to get details on how much old data is retrieved per index. 

Labels (1)
Tags (1)
0 Karma

gcusello
Esteemed Legend

Hi @bsrikanthreddy5m,

id DMC, you can have all the informations you need.

how many events you have in each index, home many storage you're using for each index.

If you want to customize searches, you could start from the DMC searches; e.g. to have the Total event count and the storage usage, you could see at [Settings -- Monitoring Console -- Indexing -- Indexes and Volums -- Indexes and Volums:Instance], or run the relative search

| rest splunk_server=DESKTOP-KBVMP9Q /services/data/indexes 
  | join title type=outer [
  | rest splunk_server=DESKTOP-KBVMP9Q /services/data/indexes-extended 
  | eval cold_bucket_size = if(isnotnull('bucket_dirs.cold.bucket_size'), 'bucket_dirs.cold.bucket_size', 'bucket_dirs.cold.size')
  | fields title, cold_bucket_size, total_size, total_bucket_count]
| `dmc_exclude_indexes`
| fields title datatype maxTotalDataSizeMB currentDBSizeMB frozenTimePeriodInSecs minTime coldPath.maxDataSizeMB homePath.maxDataSizeMB, homePath, coldPath, cold_bucket_size, total_size, total_bucket_count, totalEventCount
| eval currentDBSizeGB = if(isnotnull(currentDBSizeMB), round(currentDBSizeMB / 1024, 2), 0)
| eval maxTotalDataSizeGB = if((maxTotalDataSizeMB == 0) OR isnull(maxTotalDataSizeMB), "unlimited", round(maxTotalDataSizeMB / 1024, 2))
| eval disk_usage_gb = currentDBSizeGB." / ".maxTotalDataSizeGB
| eval currentTimePeriodDay = round((now() - strptime(minTime,"%Y-%m-%dT%H:%M:%S%z")) / 86400, 0)
| eval currentTimePeriodDay = if(isnull(currentTimePeriodDay), 0, currentTimePeriodDay)
| eval frozenTimePeriodDay = round(frozenTimePeriodInSecs / 86400, 0)
| eval frozenTimePeriodDay = if(isnull(frozenTimePeriodDay) OR frozenTimePeriodDay == 0, "unlimited", frozenTimePeriodDay)
| eval freeze_period_viz = currentTimePeriodDay." / ".frozenTimePeriodDay
| eval total_bucket_count = if(isnotnull(total_bucket_count), total_bucket_count, 0)
| eval totalEventCount = if(isnotnull(totalEventCount), totalEventCount, 0)
| eval home_bucket_size_gb = round((total_size - if(isnull(cold_bucket_size), 0, cold_bucket_size)) / 1024, 2)
| eval home_bucket_size_gb = if(isnull(home_bucket_size_gb), 0, home_bucket_size_gb)
| eval home_bucket_capacity_gb = if(isnull('homePath.maxDataSizeMB') OR 'homePath.maxDataSizeMB' = 0, "unlimited", round('homePath.maxDataSizeMB' / 1024, 2))
| eval home_bucket_usage_gb = home_bucket_size_gb." / ".home_bucket_capacity_gb
| eval cold_bucket_size_gb = if(isnull(cold_bucket_size), 0, round(cold_bucket_size / 1024, 2))
| eval cold_bucket_capacity_gb = if(isnull('coldPath.maxDataSizeMB') OR 'coldPath.maxDataSizeMB' = 0, "unlimited", round('coldPath.maxDataSizeMB' / 1024, 2))
| eval cold_bucket_usage_gb = cold_bucket_size_gb." / ".cold_bucket_capacity_gb
| fields title, datatype, freeze_period_viz, disk_usage_gb, home_bucket_usage_gb, cold_bucket_usage_gb, total_bucket_count, totalEventCount, currentDBSizeGB,
      cold_bucket_size_gb, home_bucket_size_gb, homePath, coldPath | fields title, datatype, freeze_period_viz, disk_usage_gb, home_bucket_usage_gb, cold_bucket_usage_gb, totalEventCount, total_bucket_count
            | eval total_bucket_count=tostring(total_bucket_count, "commas")
            | eval totalEventCount=tostring(totalEventCount, "commas")
            | rename title as Index, datatype as "Data Type", disk_usage_gb as "Index Usage (GB)", freeze_period_viz as "Data Age vs Frozen Age (days)", home_bucket_usage_gb as "Home Path Usage (GB)", cold_bucket_usage_gb as "Cold Path Usage (GB)", total_bucket_count as "Total Bucket Count", totalEventCount as "Total Event Count"

Obviously you can use the time period you need.

Ciao.

Giuseppe

0 Karma
Get Updates on the Splunk Community!

How I Instrumented a Rust Application Without Knowing Rust

As a technical writer, I often have to edit or create code snippets for Splunk's distributions of ...

Splunk Community Platform Survey

Hey Splunk Community, Starting today, the community platform may prompt you to participate in a survey. The ...

Observability Highlights | November 2022 Newsletter

 November 2022Observability CloudEnd Of Support Extension for SignalFx Smart AgentSplunk is extending the End ...