Splunk Search

What's the difference between dc (distinct count) and estdc (estimated distinct count)?

khourihan_splun
Splunk Employee

I have a search that returns a unique-visitor count over 30 days' worth of logs.

Using dc() it was a lot slower. Here is the comparison:

estdc: 3300 seconds, 15351270
dc: 17700 seconds, 15134261

estdc looks good enough, given that it's fairly accurate (about 1.4% difference) and MUCH faster. Any information on how it works will be appreciated.
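For reference, the comparison can be checked with a few lines of Python. This is only arithmetic on the two results posted above; the variable names are made up for illustration.

```python
# Figures copied from the comparison above.
estdc_count, estdc_seconds = 15351270, 3300
dc_count, dc_seconds = 15134261, 17700

# How far the estimate is from the exact count, relative to the exact count.
relative_difference = (estdc_count - dc_count) / dc_count

# How many times longer the exact search took.
speedup = dc_seconds / estdc_seconds

print(f"relative difference: {relative_difference:.2%}")  # 1.43%
print(f"speedup: {speedup:.1f}x")                         # 5.4x
```

In this run estdc overcounted, which is consistent with an estimator that can err in either direction.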

1 Solution

khourihan_splun
Splunk Employee

Basically, the technique is based on hashing and hash collisions. You can estimate how many distinct items you have tried to hash based on the number of hash collisions and the size of the hash bucket.

More or less, it will use constant time and resources regardless of the number of unique values. The technique is accurate to about 1-2%, although it may over- or undercount.
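To make the idea concrete, here is a minimal Python sketch of one well-known estimator of this kind, linear counting. It is an illustration only: the thread does not say which algorithm estdc actually uses, and the function name `estimate_distinct`, the bucket count, and the choice of SHA-1 are all assumptions made for the example. Each value is hashed into a fixed-size bitmap, and the distinct count is estimated from how many buckets are still empty.

```python
import hashlib
import math


def estimate_distinct(values, num_buckets=1 << 16):
    """Estimate the number of distinct values with linear counting.

    Hypothetical helper for illustration; not Splunk's implementation.
    """
    # Fixed-size bitmap: memory does not grow with the number of distinct values.
    bitmap = bytearray(num_buckets)

    for value in values:
        # Hash each value once and mark the bucket it lands in.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % num_buckets
        bitmap[index] = 1

    empty = num_buckets - sum(bitmap)
    if empty == 0:
        raise ValueError("bitmap is saturated; use more buckets")

    # With n distinct values spread over m buckets, the expected fraction of
    # empty buckets is about exp(-n / m), so n is about -m * ln(empty / m).
    return round(-num_buckets * math.log(empty / num_buckets))


# 100,000 events but only 20,000 distinct visitors.
events = [f"user{i % 20000}" for i in range(100000)]
print(estimate_distinct(events))
```

In this sketch every event is still hashed once, so the run time grows with the number of events. What stays fixed is the memory and the work per event: no set of already-seen values has to be stored or compared, which is where an exact distinct count gets expensive at high cardinality.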


VatsalJagani
Champion

@khourihan_splunk - Could you please elaborate on how it uses constant time and resources regardless of the number of values? As I understand it, if I search for estdc(bytes), it needs to calculate the hash of each value of bytes and then go through all the hashes to count the collisions.
