Splunk Cloud Platform

Is this a Splunk Bug? tstats + PREFIX() over a summary index doesn't return all DISTINCT COUNTS?

isaiz

Hello again.

 

I am testing a "light" version of an index that is fully compatible with the tstats + PREFIX() method (keeping only the fields I work with and removing all major breakers from the field values in _raw) as an alternative to data models, since it is way faster.

[Screenshot: newraw_test gen.PNG]

 

My first test computes the distinct count of a field (sessionid) that has extremely high cardinality but no major breakers (so it is PREFIX-compatible), in both the original index and my summary index, for a given hour.

ORIGINAL INDEX: 48.692.463 distinct session ids

[Screenshot: fortinet_data dc.PNG]

SUMMARY INDEX: 6.016.022 distinct session ids

[Screenshot: newraw_test dc.PNG]

 

However, if I compute the distinct count the alternative way (count by sessionid, so that each distinct sessionid produces one row, and then count the rows), I get the correct result.

SUMMARY INDEX with count of counts method: 48.692.463 distinct session ids
[Screenshot: newraw_test count count.PNG]
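
For readers who cannot see the screenshots, the two approaches compared above can be sketched in SPL roughly as follows. This is only a sketch: the index name newraw_test is inferred from the screenshot filenames, the one-hour window is an assumption, and the exact searches are the ones in the images.

```
# Direct distinct count (the variant that returns the lower number)
| tstats dc(PREFIX(sessionid=)) AS distinct_sessions
    WHERE index=newraw_test earliest=-1h@h latest=@h

# "Count of counts": one row per sessionid, then count the rows
| tstats count
    WHERE index=newraw_test earliest=-1h@h latest=@h
    BY PREFIX(sessionid=)
| stats count AS distinct_sessions
```

The lines starting with # are annotations for this post, not part of the searches. The second form moves the deduplication into the BY clause, so the final stats only has to count rows instead of merging distinct-count results.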

 

So the problem is in the dc() function. It seems the issue occurs when Splunk gathers the dc chunks to generate the final result, but tuning the chunk_size parameter has no effect whatsoever. When I run the same test over smaller time ranges (so distinct sessions >1.000.000), the distinct counts from the original index and from the summary index are the same.

 

How can I solve the problem? Is this a Splunk bug?
