Splunk Search

tstats distinct_count returns an incorrect value on high cardinality source field

yotamros
Explorer

Hey

I've been working on a distributed Splunk environment, where in one of our indexes we have a very high cardinality "source" field (basically different for each event).

When I use tstats with distinct_count (dc) to count the number of sources, I get an incorrect result (far from one per event).

The query looks something like:

|tstats dc(source) where index=my_index

 

I've noticed that when I search on a smaller number of events (~100,000 instead of ~5,000,000), the result is correct.

In addition, when using estdc I get a better result than dc (which is wildly wrong).

Finally, when using stats instead of tstats, I get the correct value:

index=my_index | stats dc(source)

 

Any ideas? My guess is that I'm hitting some memory barrier, but there is no indication of this.

bowesmana
SplunkTrust

There is a flag you can give to tstats - chunk_size - see the docs here

https://docs.splunk.com/Documentation/Splunk/9.1.1/SearchReference/tstats

It talks about high cardinality distinct counts - you could experiment to see if that makes a difference

 


yotamros
Explorer

Sadly setting chunk_size doesn't make a difference.

I've since tried playing around with limits.conf on both search heads and indexers to no avail.

Also, the query does seem to work on the indexers (when querying them directly, rather than through the search head).

Another note that might be helpful - the query works on Splunk 7.3 but not on 8.2.2.

 


bowesmana
SplunkTrust

Interesting, it sounds like you have the energy to dig a little deeper. Take a look at these links

https://www.splunk.com/en_us/blog/tips-and-tricks/splunk-clara-fication-job-inspector.html

https://conf.splunk.com/files/2020/slides/TRU1143C.pdf

which show how you can dive into debug logging and the search log - maybe that will throw up something useful.

 

yotamros
Explorer

Ended up looking at the search.log and finding the following ERROR:

"SRSSerializer - max str len exceeded - probably corrupt"

After looking at the known issues page, I found SPL-166001, which states that this happens with events larger than 16MB. Even though that isn't the case here, I tried the workaround offered there:

[search]
results_serial_format=csv

 

This did fix the issue. Sadly, though, the setting is supposed to affect the performance of all searches.
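
For anyone retracing the search.log step, here is a minimal Python sketch that pulls ERROR lines out of a log and counts them per component. It assumes each line looks like "<date> <time> <LEVEL> <Component> - <message>"; the function name find_errors and the sample lines are made up for illustration, and only the SRSSerializer message is taken from this thread.

```python
import re
from collections import Counter

# Assumed line layout: "<date> <time> <LEVEL> <Component> - <message>".
LINE_RE = re.compile(
    r"^(?P<date>\S+)\s+(?P<time>\S+)\s+(?P<level>[A-Z]+)\s+"
    r"(?P<component>\S+)\s+-\s+(?P<message>.*)$"
)


def find_errors(lines):
    """Return (component, message) pairs for every ERROR line."""
    errors = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            errors.append((match.group("component"), match.group("message")))
    return errors


# Hypothetical sample lines; only the ERROR text comes from this thread.
sample_log = [
    "01-01-2024 10:00:00.000 INFO  SearchParser - PARSING: | tstats dc(source) where index=my_index",
    "01-01-2024 10:00:05.000 ERROR SRSSerializer - max str len exceeded - probably corrupt",
    "01-01-2024 10:00:05.100 ERROR SRSSerializer - max str len exceeded - probably corrupt",
]

errors = find_errors(sample_log)
counts = Counter(errors)
for (component, message), n in counts.items():
    print(f"{n}x {component}: {message}")
```

To run it against a real job, pass the open search.log file in place of sample_log, for example find_errors(open(path)) with path pointing at the log you got from the Job Inspector.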

bowesmana
SplunkTrust

Kudos for digging - glad you found a solution - could you quantify the performance hit?
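
One rough way to put a number on it would be to run the same search a few times with and without the setting, note the run time the Job Inspector reports for each run, and compare medians. A minimal Python sketch, with a hypothetical helper name and placeholder timings that are not measurements from any real environment:

```python
from statistics import median


def slowdown_percent(baseline_runs, changed_runs):
    """Percent change in median run time relative to the baseline."""
    base = median(baseline_runs)
    changed = median(changed_runs)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    return (changed - base) / base * 100.0


# Placeholder timings in seconds, NOT real measurements.
default_runs = [10.0, 12.0, 11.0]  # search run with the default setting
csv_runs = [13.0, 14.0, 12.0]      # same search with results_serial_format=csv

print(f"{slowdown_percent(default_runs, csv_runs):.1f}% change in median run time")
```

Medians are used here rather than means so that a single slow run (for example a cold cache) does not dominate the comparison.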
