Splunk Search

Why does dedup count and dc return a different number of values?

tmaltizo
Path Finder

Separate searches using dc return per-status numbers that don't match the ones I get from a dedup followed by count; only the totals agree. These results are for the "All time" time range, but the mismatch persists regardless of the time range.

=====================================================

Using dc

index="forescout" sourcetype="fs_av_compliance" description="Server*" status="compliant" | stats dc(src_ip)

2804

index="forescout" sourcetype="fs_av_compliance" description="Server*" status="non-compliant" | stats dc(src_ip)

614

index="forescout" sourcetype="fs_av_compliance" description="Server*" | stats dc(src_ip)

2922

=====================================================

Using count

index="forescout" sourcetype="fs_av_compliance" description="Server*" | dedup src_ip | stats count by status | addcoltotals

compliant = 2767
non-compliant = 155
addcoltotals = 2922

Any insight is much appreciated!
Trista

1 Solution

sundareshr
Legend

Here's an example

_time=1 index=forescout ip=x.x.x.x status=compliant
_time=2 index=forescout ip=x.x.x.x status=compliant
_time=3 index=forescout ip=x.x.x.x status=non-compliant

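With the above sample data, dc(ip) returns 1 for compliant and 1 for non-compliant, whereas dedup ip | stats count by status returns a single row with a count of 1. dedup keeps only the first event returned for each ip, so only that event's status is counted (compliant, if the events are processed in the order listed) and the other status is discarded.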

For a more appropriate comparison, try 'dedup ip status | stats count by status | addtotals'
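The difference between the two dedup variants can be modelled outside Splunk. The following is a minimal Python sketch, not SPL: the `dedup` and `count_by_status` helpers are hypothetical stand-ins for the Splunk commands, and the events are assumed to be processed in the order listed above.

```python
# Three events for one IP, in the order listed in the example above.
events = [
    {"ip": "x.x.x.x", "status": "compliant"},
    {"ip": "x.x.x.x", "status": "compliant"},
    {"ip": "x.x.x.x", "status": "non-compliant"},
]

def dedup(rows, *fields):
    """Hypothetical model of `dedup`: keep the first row per field combination."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

def count_by_status(rows):
    """Hypothetical model of `stats count by status`."""
    counts = {}
    for row in rows:
        counts[row["status"]] = counts.get(row["status"], 0) + 1
    return counts

by_ip = count_by_status(dedup(events, "ip"))
by_ip_status = count_by_status(dedup(events, "ip", "status"))
print(by_ip)         # {'compliant': 1}
print(by_ip_status)  # {'compliant': 1, 'non-compliant': 1}
```

Deduplicating on both fields keeps one event per ip/status pair, which is why its per-status counts line up with the per-status dc results.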


somesoni2
Revered Legend

Suppose your data set is this

src_ip  status
--------------------
src1    Compliance
src1    Compliance
src2    Non-compliance
src1    Non-compliance
src2    Compliance
src3    Compliance
src4    Non-compliance

Output of query 1 (distinct count of src_ip where status=Compliance) is 3 (src1, src2 and src3)
Output of query 2 (distinct count of src_ip where status=Non-compliance) is 3 (src2, src1 and src4)
Output of query 3 (distinct count of src_ip regardless of status) is 4 (src1, src2, src3 and src4)

This is what query 4 (the dedup search) has to work with once dedup src_ip has run, since dedup keeps only the first event for each src_ip:

src_ip  status
-----------
src1    Compliance
src2    Non-compliance
src3    Compliance
src4    Non-compliance

So the count of src_ip with status=Compliance is now 2,
the count of src_ip with status=Non-compliance is now 2,
and the total count is still 4, as there are still 4 distinct src_ip values.
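The numbers above can be checked with a small Python model of the sample table. This is only a sketch: sets stand in for dc(src_ip), and a first-seen dictionary stands in for dedup src_ip, assuming rows are processed in the order shown.

```python
from collections import Counter

# The sample table, in the order listed above.
rows = [
    ("src1", "Compliance"),
    ("src1", "Compliance"),
    ("src2", "Non-compliance"),
    ("src1", "Non-compliance"),
    ("src2", "Compliance"),
    ("src3", "Compliance"),
    ("src4", "Non-compliance"),
]

# Model of `stats dc(src_ip)` with and without a status filter.
dc_compliance = len({ip for ip, status in rows if status == "Compliance"})
dc_non_compliance = len({ip for ip, status in rows if status == "Non-compliance"})
dc_all = len({ip for ip, _ in rows})

# Model of `dedup src_ip`: remember only the first status seen per src_ip.
first_status = {}
for ip, status in rows:
    first_status.setdefault(ip, status)
after_dedup = Counter(first_status.values())

print(dc_compliance, dc_non_compliance, dc_all)  # 3 3 4
print(dict(after_dedup))  # {'Compliance': 2, 'Non-compliance': 2}
```

src1 and src2 appear with both statuses, so the filtered distinct counts include them twice overall (3 + 3 = 6), while dedup assigns each of them to a single status (2 + 2 = 4).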

Hope this helps.

tmaltizo
Path Finder

This definitely helps @somesoni2! Thank you!


tmaltizo
Path Finder

Thanks for the clarification @sundareshr!


tmaltizo
Path Finder

@sundareshr,

If dc counts each unique ip/status and dedup counts only the first instance, then why are the totals the same?

... | dedup src_ip | stats count(src_ip) = 2928
... | stats dc(src_ip) = 2928

When I run the following....
... | dedup src_ip status | stats count by status | addtotals

compliant = 2809, total=2809
non-compliant = 616, total=616
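
One way to reconcile these figures: dedup src_ip leaves exactly one event per distinct src_ip, so counting those events is the same as dc(src_ip), and the two totals have to agree. After dedup src_ip status, an IP seen with both statuses is kept once per status, so the per-status counts add up to more than the number of distinct IPs. A rough Python sketch of the arithmetic, using only the numbers quoted in this post and assuming every event carries exactly one of the two statuses:

```python
# Figures quoted in this thread (not recomputed here).
dedup_ip_status = {"compliant": 2809, "non-compliant": 616}
distinct_ips = 2928  # from `stats dc(src_ip)` and `dedup src_ip | stats count`

# Column total that `addcoltotals` would append.
column_total = sum(dedup_ip_status.values())

# Assumption: every event has exactly one of these two statuses, so an IP
# counted twice must have appeared with both of them.
ips_with_both = column_total - distinct_ips

print(column_total)   # 3425
print(ips_with_both)  # 497
```

Note that addtotals sums across each row by default, which is why each row above shows its own count as the total; addcoltotals (used in the original search) appends a total for the column instead.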
