Hey, I've been working on a distributed Splunk environment where one of our indexes has a very high-cardinality "source" field (basically different for each event). I've noticed that when I use tstats with distinct_count to count the number of sources, I get an incorrect result (far from one per event). The query looks something like:

| tstats dc(source) where index=my_index

When I search over a smaller number of events (~100,000 instead of ~5,000,000), the result is correct. In addition, estdc gives a better result than dc, which is wildly wrong. Finally, when I use stats instead of tstats, I get the correct value:

index=my_index | stats dc(source)

Any ideas? My guess is that I'm hitting some memory limit, but there is no indication of this.