Getting Data In

How to create a search that shows how much data older than 30 days is being used?

bsrikanthreddy5
Path Finder

Hi, 

I have a SmartStore cluster in AWS with frozenTimePeriodInSecs set to 7 years, and in the DMC I see a lot of buckets being downloaded from S3. I would like to know how much old data is being retrieved so that I can allocate cache space efficiently. Does anyone have an SPL query to get details on how much old data is retrieved per index?

gcusello
SplunkTrust

Hi @bsrikanthreddy5,

In the DMC you can find all the information you need: how many events you have in each index and how much storage you're using for each index.

If you want to customize the searches, you can start from the DMC searches; e.g., to get the total event count and the storage usage, you can look at [Settings -- Monitoring Console -- Indexing -- Indexes and Volumes -- Indexes and Volumes: Instance], or run the underlying search:

| rest splunk_server=DESKTOP-KBVMP9Q /services/data/indexes 
  | join type=outer title [
  | rest splunk_server=DESKTOP-KBVMP9Q /services/data/indexes-extended 
  | eval cold_bucket_size = if(isnotnull('bucket_dirs.cold.bucket_size'), 'bucket_dirs.cold.bucket_size', 'bucket_dirs.cold.size')
  | fields title, cold_bucket_size, total_size, total_bucket_count]
| `dmc_exclude_indexes`
| fields title datatype maxTotalDataSizeMB currentDBSizeMB frozenTimePeriodInSecs minTime coldPath.maxDataSizeMB homePath.maxDataSizeMB, homePath, coldPath, cold_bucket_size, total_size, total_bucket_count, totalEventCount
| eval currentDBSizeGB = if(isnotnull(currentDBSizeMB), round(currentDBSizeMB / 1024, 2), 0)
| eval maxTotalDataSizeGB = if((maxTotalDataSizeMB == 0) OR isnull(maxTotalDataSizeMB), "unlimited", round(maxTotalDataSizeMB / 1024, 2))
| eval disk_usage_gb = currentDBSizeGB." / ".maxTotalDataSizeGB
| eval currentTimePeriodDay = round((now() - strptime(minTime,"%Y-%m-%dT%H:%M:%S%z")) / 86400, 0)
| eval currentTimePeriodDay = if(isnull(currentTimePeriodDay), 0, currentTimePeriodDay)
| eval frozenTimePeriodDay = round(frozenTimePeriodInSecs / 86400, 0)
| eval frozenTimePeriodDay = if(isnull(frozenTimePeriodDay) OR frozenTimePeriodDay == 0, "unlimited", frozenTimePeriodDay)
| eval freeze_period_viz = currentTimePeriodDay." / ".frozenTimePeriodDay
| eval total_bucket_count = if(isnotnull(total_bucket_count), total_bucket_count, 0)
| eval totalEventCount = if(isnotnull(totalEventCount), totalEventCount, 0)
| eval home_bucket_size_gb = round((total_size - if(isnull(cold_bucket_size), 0, cold_bucket_size)) / 1024, 2)
| eval home_bucket_size_gb = if(isnull(home_bucket_size_gb), 0, home_bucket_size_gb)
| eval home_bucket_capacity_gb = if(isnull('homePath.maxDataSizeMB') OR 'homePath.maxDataSizeMB' == 0, "unlimited", round('homePath.maxDataSizeMB' / 1024, 2))
| eval home_bucket_usage_gb = home_bucket_size_gb." / ".home_bucket_capacity_gb
| eval cold_bucket_size_gb = if(isnull(cold_bucket_size), 0, round(cold_bucket_size / 1024, 2))
| eval cold_bucket_capacity_gb = if(isnull('coldPath.maxDataSizeMB') OR 'coldPath.maxDataSizeMB' == 0, "unlimited", round('coldPath.maxDataSizeMB' / 1024, 2))
| eval cold_bucket_usage_gb = cold_bucket_size_gb." / ".cold_bucket_capacity_gb
| fields title, datatype, freeze_period_viz, disk_usage_gb, home_bucket_usage_gb, cold_bucket_usage_gb, totalEventCount, total_bucket_count
| eval total_bucket_count=tostring(total_bucket_count, "commas")
| eval totalEventCount=tostring(totalEventCount, "commas")
| rename title as Index, datatype as "Data Type", disk_usage_gb as "Index Usage (GB)", freeze_period_viz as "Data Age vs Frozen Age (days)", home_bucket_usage_gb as "Home Path Usage (GB)", cold_bucket_usage_gb as "Cold Path Usage (GB)", total_bucket_count as "Total Bucket Count", totalEventCount as "Total Event Count"
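The "Data Age vs Frozen Age (days)" column compares the age of the oldest data in each index (derived from minTime) with the frozenTimePeriodInSecs you configured, so you can see at a glance how far back each index currently goes.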

Obviously, you can use whatever time period you need.
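
For the SmartStore-specific part of your question (how much of the data pulled back from S3 is old), a possible starting point is to correlate the CacheManager download events in _internal with each bucket's time range from dbinspect. This is only a rough sketch: the action and cache_id patterns are assumptions based on typical splunkd.log CacheManager lines and can differ between versions, the bucket id embedded in cache_id is assumed to match dbinspect's bucketId, and on large deployments the dbinspect subsearch can hit join limits, so verify the field names against your own logs first.

index=_internal sourcetype=splunkd component=CacheManager action=download
``` action and cache_id below are assumptions - check your own CacheManager log lines ```
| rex "bid\|(?<bucket_id>[^|]+)\|"
| rex field=bucket_id "^(?<cached_index>[^~]+)~"
| join type=left bucket_id
    [| dbinspect index=*
     | rename bucketId AS bucket_id
     | fields bucket_id, startEpoch, endEpoch, sizeOnDiskMB]
| eval bucket_age_days = round((now() - endEpoch) / 86400, 0)
| eval age_band = if(bucket_age_days > 30, "older than 30 days", "30 days or newer")
| stats count AS bucket_downloads, sum(sizeOnDiskMB) AS downloaded_mb BY cached_index, age_band

Run it over the window you want to measure (e.g. the last 24 hours): the age_band split shows, per index, how many bucket downloads touched data older than 30 days and roughly how much volume was localized from S3. A bucket that is evicted and downloaded again is counted every time, which is what you want when sizing the cache.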

Ciao.

Giuseppe
