Splunk Search

tstats distinct_count returns an incorrect value on high cardinality source field

yotamros
Explorer

Hey

I've been working on a distributed Splunk environment in which one of our indexes has a very high-cardinality "source" field (basically a different value for each event).

I've noticed that when I use tstats with distinct_count (dc) to count the number of sources, I get an incorrect result (far from one per event).

The query looks something like:

| tstats dc(source) where index=my_index

 

I've noticed that when I search on a smaller number of events (~100,000 instead of ~5,000,000), the result is correct.

In addition, when using estdc I get a better result than dc (which is wildly wrong).

Finally, when using stats instead of tstats, I get the correct value:

index=my_index | stats dc(source)

 

Any ideas? My guess is that I'm hitting some memory barrier, but there is no indication of this.
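
For anyone reproducing this, the exact and estimated counts described above can be requested in a single tstats call and compared against the event count. This is only a sketch: the output field names (events, dc_sources, estdc_sources) are illustrative labels, and my_index is the placeholder index from the queries above.

```spl
| tstats count AS events dc(source) AS dc_sources estdc(source) AS estdc_sources where index=my_index
```

If source is effectively unique per event, dc_sources should match events and estdc_sources should be close to it, so a large gap between dc_sources and the other two points at the behaviour described here.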

bowesmana
SplunkTrust

There is a flag you can give to tstats - chunk_size - see the docs here

https://docs.splunk.com/Documentation/Splunk/9.1.1/SearchReference/tstats

It talks about high cardinality distinct counts - you could experiment to see if that makes a difference
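
As a rough sketch of that experiment, chunk_size is passed as an argument before the aggregation. The value below is an arbitrary example rather than a recommendation; check the linked docs for the default and try a few lower values.

```spl
| tstats chunk_size=1000000 dc(source) where index=my_index
```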


yotamros
Explorer

Sadly setting chunk_size doesn't make a difference.

I've since tried playing around with limits.conf on both search heads and indexers to no avail.

Also, the query does seem to work on the indexers (when querying them directly rather than through the search head).

Another note that might be helpful - the query works on Splunk 7.3 but not on 8.2.2.


bowesmana
SplunkTrust

Interesting, it sounds like you have the energy to dig a little deeper. Take a look at these links

https://www.splunk.com/en_us/blog/tips-and-tricks/splunk-clara-fication-job-inspector.html

https://conf.splunk.com/files/2020/slides/TRU1143C.pdf

which show how you can dive into debug logging and the search log - maybe that will throw up something useful.

 

yotamros
Explorer

Ended up looking at the search.log and finding the following ERROR:

"SRSSerializer - max str len exceeded - probably corrupt"

After checking the known issues page, I found SPL-166001, which states that this happens with events larger than 16MB. Even though that isn't the case here, I tried the workaround offered there:

[search]
results_serial_format=csv

 

This did fix the issue; sadly, though, the setting is supposed to affect the performance of all searches.

bowesmana
SplunkTrust

Kudos for digging - glad you found a solution - could you quantify the performance hit?
