How to find the index footprint by hot, cold, and frozen?

Admiral_Marith
Explorer

Good morning those more knowledgeable than myself 🙂

The default index usage panel, which shows useful information such as the earliest event, is not quite giving me what I need.

I'm trying to manage hot/warm, cold, and frozen so that 60% of the data is on hot/warm, 40% is on cold, and anything older than 115 days (we promise 90 searchable) goes to frozen.

The frozen data is included in the earliest event calculation, and I'd like to either see my footprint in size for only hot/cold or for all three so I can calculate the hot/cold ratio without the blur introduced by including frozen.

Ideally I'd like to control hot/cold retention entirely by the date range of the data, not its size, but that seems to be impossible to do directly, hence calculating it. Now that frozen is turned on, the data we were using to size the indexes is made fuzzy because the frozen data is included in the earliest event calculation.

So what particular Splunk incantation is needed to parse the index footprint data out like that?

-J

niketn
Legend

Option 1
Use Splunk on Splunk (SoS app) or the Distributed Management Console (renamed Monitoring Console in Splunk 6.5) to check indexing bucket performance and stats. The DMC has optimized searches that you can open and run yourself.

Option 2
Use the Splunk generating command dbinspect.

 | dbinspect index=_internal
 | search tsidxState="full"
 | stats min(startEpoch) as MinStartTime max(startEpoch) as MaxStartTime min(endEpoch) as MinEndTime max(endEpoch) as MaxEndTime max(hostCount) as MaxHosts max(sourceTypeCount) as MaxSourceTypes sum(eventCount) as TotalEvents sum(rawSize) as TotalRawDataSizeMB sum(sizeOnDiskMB) as TotalDiskDataSizeMB by state
 | eval TotalRawDataSizeMB=round((TotalRawDataSizeMB/1024/1024),6)
 | eval MinStartTime=strftime(MinStartTime,"%Y/%m/%d %H:%M:%S")
 | eval MaxStartTime=strftime(MaxStartTime,"%Y/%m/%d %H:%M:%S")
 | eval MinEndTime=strftime(MinEndTime,"%Y/%m/%d %H:%M:%S")
 | eval MaxEndTime=strftime(MaxEndTime,"%Y/%m/%d %H:%M:%S")
 | eval PercentSizeReduction=round(((TotalRawDataSizeMB-TotalDiskDataSizeMB)/TotalRawDataSizeMB)*100,2)

Refer to the Splunk documentation on dbinspect to build the query that you need.
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Dbinspect
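
For the hot/warm versus cold ratio in the question, a minimal sketch along the same lines might look like the following. It assumes that dbinspect only reports buckets that are still searchable (hot, warm, cold, thawed), so frozen data does not appear in it at all. The field names tier, sizeMB, totalMB, and pctOfIndex are made up for this example, and index=* should be narrowed to the indexes you care about. On an indexer cluster, replicated bucket copies would be counted once per peer, so treat the totals as raw disk usage rather than unique data.

```spl
| dbinspect index=*
| search state=hot OR state=warm OR state=cold
| eval tier=if(state="cold", "cold", "hot/warm")
| stats sum(sizeOnDiskMB) as sizeMB by index, tier
| eventstats sum(sizeMB) as totalMB by index
| eval pctOfIndex=round(100*sizeMB/totalMB, 1)
| sort index, tier
```

Each index should come back with one row per tier, so a 60/40 target can be compared directly against pctOfIndex. The size of the frozen archive is not visible this way and would have to be measured on the archive location itself.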

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

esix_splunk
Splunk Employee

I'd also recommend FireBrigade for this. It has a lot more information about in-flight buckets and their status than the current DMC does.


ddrillic
Ultra Champion

In indexes.conf you have a good number of configuration parameters to control the buckets, such as:

maxHotBuckets
homePath.maxDataSizeMB 
frozenTimePeriodInSecs
maxWarmDBCount

Hopefully, it can assist you.
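
As a small worked example for frozenTimePeriodInSecs, the 115-day cutoff from the question can be converted to seconds with a throwaway search. The field name days is only illustrative. Keep in mind that a bucket is frozen only once its newest event is older than this period, so individual events can remain searchable for longer than the configured value.

```spl
| makeresults
| eval days=115
| eval frozenTimePeriodInSecs=days*86400
| table days, frozenTimePeriodInSecs
```

This yields 9936000 seconds for 115 days.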

Admiral_Marith
Explorer

Yes, I'm aware of all those parameters. What I was shooting for was an alternative to the -->settings-->data-->indexes dialog, where you see things like current size, earliest event, etc. This is one of the few places where there is no 'open in search' button.

When you sort on earliest event, it appears you get events that are included in frozen. I'm trying to know just the event date ranges for hot/warm/cold.

I have a retention policy of 90 days searchable (delivering 120) and remainder of year archive (275 days but delivering 295).

Since Splunk index sizes are determined by data volume and not event age, I need to know the earliest event in my indexes that does not include frozen, as a sort of retention audit.

Now, however, I've seen a 42% increase in my index usage all in firewall traffic, so monitoring the earliest event in hot/warm/cold is even more of a concern while we nail down the cause of the spike.
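
A retention audit of that kind could be sketched with dbinspect as below. This is an assumption-laden sketch, not a verified report: it relies on dbinspect listing only hot, warm, cold, and thawed buckets (so frozen data is excluded by construction), and the field names earliestEpoch, earliestEvent, and ageDays are invented for the example. A few events with bad timestamps can pull startEpoch far into the past, so outliers are worth checking per bucket.

```spl
| dbinspect index=*
| search state=hot OR state=warm OR state=cold
| stats min(startEpoch) as earliestEpoch by index
| eval ageDays=round((now()-earliestEpoch)/86400, 1)
| eval earliestEvent=strftime(earliestEpoch, "%Y/%m/%d %H:%M:%S")
| table index, earliestEvent, ageDays
| sort - ageDays
```

Any index whose ageDays drops below the promised 90 days would be one where volume-based sizing is rolling data to frozen too early.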


Admiral_Marith
Explorer

I should be a bit clearer. When looking at the footprint for hot/cold/frozen, I'd like to see the size of the data on each of hot, cold, and frozen separately, not an aggregate number.
