Getting Data In

How to create a search that shows how much data older than 30 days is being used?

bsrikanthreddy5
Path Finder

Hi, 

I have a SmartStore cluster in AWS with frozenTimePeriodInSecs set to 7 years, and in the DMC I see lots of buckets being downloaded from S3. I would like to know how much old data is being retrieved so that I can allocate cache space efficiently. Does anyone have an SPL query that shows how much old data is retrieved per index?

gcusello
SplunkTrust

Hi @bsrikanthreddy5,

In the DMC, you can find all the information you need:

how many events you have in each index, and how much storage each index is using.

If you want to customize the searches, you could start from the DMC searches; e.g., to get the total event count and storage usage, look at [Settings -- Monitoring Console -- Indexing -- Indexes and Volumes -- Indexes and Volumes: Instance], or run the corresponding search (replace splunk_server=local with the name of the instance you want to query if you run it from the Monitoring Console):

| rest splunk_server=local /services/data/indexes 
  | join title type=outer [
  | rest splunk_server=local /services/data/indexes-extended 
  | eval cold_bucket_size = if(isnotnull('bucket_dirs.cold.bucket_size'), 'bucket_dirs.cold.bucket_size', 'bucket_dirs.cold.size')
  | fields title, cold_bucket_size, total_size, total_bucket_count]
| `dmc_exclude_indexes`
| fields title datatype maxTotalDataSizeMB currentDBSizeMB frozenTimePeriodInSecs minTime coldPath.maxDataSizeMB homePath.maxDataSizeMB, homePath, coldPath, cold_bucket_size, total_size, total_bucket_count, totalEventCount
| eval currentDBSizeGB = if(isnotnull(currentDBSizeMB), round(currentDBSizeMB / 1024, 2), 0)
| eval maxTotalDataSizeGB = if((maxTotalDataSizeMB == 0) OR isnull(maxTotalDataSizeMB), "unlimited", round(maxTotalDataSizeMB / 1024, 2))
| eval disk_usage_gb = currentDBSizeGB." / ".maxTotalDataSizeGB
| eval currentTimePeriodDay = round((now() - strptime(minTime,"%Y-%m-%dT%H:%M:%S%z")) / 86400, 0)
| eval currentTimePeriodDay = if(isnull(currentTimePeriodDay), 0, currentTimePeriodDay)
| eval frozenTimePeriodDay = round(frozenTimePeriodInSecs / 86400, 0)
| eval frozenTimePeriodDay = if(isnull(frozenTimePeriodDay) OR frozenTimePeriodDay == 0, "unlimited", frozenTimePeriodDay)
| eval freeze_period_viz = currentTimePeriodDay." / ".frozenTimePeriodDay
| eval total_bucket_count = if(isnotnull(total_bucket_count), total_bucket_count, 0)
| eval totalEventCount = if(isnotnull(totalEventCount), totalEventCount, 0)
| eval home_bucket_size_gb = round((total_size - if(isnull(cold_bucket_size), 0, cold_bucket_size)) / 1024, 2)
| eval home_bucket_size_gb = if(isnull(home_bucket_size_gb), 0, home_bucket_size_gb)
| eval home_bucket_capacity_gb = if(isnull('homePath.maxDataSizeMB') OR 'homePath.maxDataSizeMB' = 0, "unlimited", round('homePath.maxDataSizeMB' / 1024, 2))
| eval home_bucket_usage_gb = home_bucket_size_gb." / ".home_bucket_capacity_gb
| eval cold_bucket_size_gb = if(isnull(cold_bucket_size), 0, round(cold_bucket_size / 1024, 2))
| eval cold_bucket_capacity_gb = if(isnull('coldPath.maxDataSizeMB') OR 'coldPath.maxDataSizeMB' = 0, "unlimited", round('coldPath.maxDataSizeMB' / 1024, 2))
| eval cold_bucket_usage_gb = cold_bucket_size_gb." / ".cold_bucket_capacity_gb
| fields title, datatype, freeze_period_viz, disk_usage_gb, home_bucket_usage_gb, cold_bucket_usage_gb, totalEventCount, total_bucket_count
| eval total_bucket_count = tostring(total_bucket_count, "commas")
| eval totalEventCount = tostring(totalEventCount, "commas")
| rename title as Index, datatype as "Data Type", disk_usage_gb as "Index Usage (GB)", freeze_period_viz as "Data Age vs Frozen Age (days)", home_bucket_usage_gb as "Home Path Usage (GB)", cold_bucket_usage_gb as "Cold Path Usage (GB)", total_bucket_count as "Total Bucket Count", totalEventCount as "Total Event Count"
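
If you also want to quantify how much of each index is older than 30 days (the threshold in your title), a rough sketch based on dbinspect could be a starting point. Note that on SmartStore, sizeOnDiskMB only reflects buckets currently cached locally, so treat the sizes as approximate:

| dbinspect index=*
| eval age_days = round((now() - endEpoch) / 86400, 0)
| stats sum(eval(if(age_days > 30, sizeOnDiskMB, 0))) AS mb_older_30d, sum(sizeOnDiskMB) AS total_mb by index
| eval pct_older_30d = round(mb_older_30d / total_mb * 100, 1)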

Obviously, you can adjust the time range to whatever period you need.
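
For the SmartStore side of your question (how much old data is actually being pulled back from S3), you could also start from the cache manager events in _internal. The sketch below counts bucket downloads per index per day; please verify the field names (component, action, cache_id) against your own splunkd.log events, since they can vary between versions:

index=_internal sourcetype=splunkd component=CacheManager action=download
| rex field=cache_id "bid\|(?<idx>[^~]+)~"
| timechart span=1d count AS bucket_downloads by idx

Joining the downloaded bucket IDs back to the dbinspect output would also let you check how old the downloaded buckets are.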

Ciao.

Giuseppe
