Best Practices for Handling High-Cardinality Dimensions in Metric Indices?

grunt
New Member

We are using a metrics index to store metric events. These metric events are linked to a different parent dataset through a unique ID dimension. This ID dimension can have tens of thousands of unique values, and the parent dataset primarily consists of string values.

Given the cardinality issues associated with metric indices (where it's best to avoid dimensions with a large range of unique values), what would be the best practice in this scenario?
https://docs.splunk.com/Documentation/Splunk/latest/Metrics/BestPractices#Cardinality_issues 

Would it be a good idea to use a key-value store (kvstore) for the parent data and perform lookups from the metric data? How would this approach impact performance?

Brett
SplunkTrust

Every bucket has to store every dimension value once, so if you are using a million unique IDs to reference combinations of fewer than a million unique dimension strings, you are making the situation worse: the IDs add more distinct values than the strings they stand in for.
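To make the arithmetic concrete, here is a rough sketch in plain Python (not Splunk code). The parent records, the `region` and `tier` dimension names, and the counts are all made up for illustration. It only counts distinct dimension values under the two layouts, as a crude proxy for what each bucket would have to store.

```python
from itertools import product

# Hypothetical parent dataset: 6,000 records built from a handful of strings.
regions = ["us-east", "us-west", "eu-central"]
tiers = ["web", "db"]
parents = [
    {"parent_id": f"id-{n}", "region": region, "tier": tier}
    for n, (region, tier) in enumerate(list(product(regions, tiers)) * 1000)
]

def distinct_dimension_values(rows, dims):
    """Count distinct (dimension, value) pairs across the given dimensions."""
    return len({(dim, row[dim]) for row in rows for dim in dims})

# Layout A: store the descriptive strings directly as dimensions.
direct = distinct_dimension_values(parents, ["region", "tier"])

# Layout B: store only a surrogate unique ID that points at the parent record.
surrogate = distinct_dimension_values(parents, ["parent_id"])

print(direct, surrogate)
```

With these made-up numbers, layout A yields 5 distinct values and layout B yields 6,000, which is the "making the situation worse" case. If the parent strings were themselves nearly unique per record, the gap would shrink or reverse, so it is worth counting on your own data first.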

Using KV Store is a great idea for repetitive asset information, such as adding context to a hostname, but even then you should still store the meaningful unique identifier (the hostname) as a dimension.

I believe your best solution will be some combination of dimensions and KV Store lookups to enrich them. Don't go 100% in either direction, and if you start creating new unique keys just to make it work, I think that's going too far.
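As a rough illustration of that combination, the sketch below uses plain Python stand-ins rather than any Splunk API: a dict plays the role of the KV Store collection, and the `host`, `owner`, and `site` fields and all values are hypothetical. The data points keep the meaningful identifier as a dimension, get aggregated first, and are only then enriched from the lookup.

```python
from collections import defaultdict

# Stand-in for a KV Store collection keyed by hostname (hypothetical fields).
asset_lookup = {
    "web-01": {"owner": "team-a", "site": "london"},
    "db-01": {"owner": "team-b", "site": "paris"},
}

# Metric data points carry the meaningful identifier (host) as a dimension.
points = [
    {"host": "web-01", "metric_name": "cpu.pct", "value": 40.0},
    {"host": "web-01", "metric_name": "cpu.pct", "value": 60.0},
    {"host": "db-01", "metric_name": "cpu.pct", "value": 20.0},
]

def avg_by_host(rows):
    """Aggregate first: average value per host."""
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        acc = sums[row["host"]]
        acc[0] += row["value"]
        acc[1] += 1
    return {host: total / count for host, (total, count) in sums.items()}

def enrich(averages, lookup):
    """Enrich after aggregation: one lookup per host, not one per data point."""
    return [
        {"host": host, "avg": avg, **lookup.get(host, {})}
        for host, avg in averages.items()
    ]

report = enrich(avg_by_host(points), asset_lookup)
print(report)
```

The ordering is the design choice being sketched: doing the lookup on the aggregated rows keeps the number of lookups tied to the number of hosts rather than the number of data points. Whether that matches what you see in practice will depend on your data volumes and search shape.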

The only other suggestion I have: if you have large logical groups of systems without overlapping dimensions, you could put them into separate indexes and use wildcards in your index filter to access them all. That will keep the TSIDX files smaller and performance higher.

isoutamo
SplunkTrust

@Brett have you any answers to this?
