Splunk Search

Why do dedup with count and dc return different numbers of values?

tmaltizo
Path Finder

Running separate searches with dc returns numbers that don't match the counts I get from dedup followed by stats count, except for the total. These results are for the "All time" time range, but the issue persists regardless of the time range.

=====================================================

Using dc

index="forescout" sourcetype="fs_av_compliance" description="Server*" status="compliant" | stats dc(src_ip)

2804

index="forescout" sourcetype="fs_av_compliance" description="Server*" status="non-compliant" | stats dc(src_ip)

614

index="forescout" sourcetype="fs_av_compliance" description="Server*"| stats dc(src_ip)

2922

=====================================================

Using count

index="forescout" sourcetype="fs_av_compliance" description="Server*" | dedup src_ip | stats count by status | addcoltotals

compliant = 2767
non-compliant = 155
addcoltotals = 2922

Any insight is much appreciated!
Trista

sundareshr
Legend

Here's an example

_time=1 index=forescout ip=x.x.x.x status=compliant
_time=2 index=forescout ip=x.x.x.x status=compliant
_time=3 index=forescout ip=x.x.x.x status=non-compliant

With the above sample data, dc(ip) will return 1 for compliant and 1 for non-compliant, whereas dedup ip | stats count by status will return only 1, for compliant. That assumes the events are processed in the order shown, because dedup keeps only the first event it sees for each ip.

For a more appropriate comparison, try 'dedup ip status | stats count by status | addtotals'
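
The difference can also be modeled outside Splunk. The following is a minimal Python sketch, not Splunk's implementation: the helper names dc_by_status, dedup and count_by_status are made up for illustration, and it assumes the events are processed in the order listed above, with dedup keeping the first event seen for each key.

```python
from collections import Counter

# The three sample events, in the order listed above (assumed processing order).
events = [
    {"_time": 1, "ip": "x.x.x.x", "status": "compliant"},
    {"_time": 2, "ip": "x.x.x.x", "status": "compliant"},
    {"_time": 3, "ip": "x.x.x.x", "status": "non-compliant"},
]

def dc_by_status(events):
    """Model of running `status=<value> | stats dc(ip)` once per status value."""
    seen = {}
    for e in events:
        seen.setdefault(e["status"], set()).add(e["ip"])
    return {status: len(ips) for status, ips in seen.items()}

def dedup(events, *fields):
    """Model of dedup: keep the first event seen for each combination of fields."""
    seen, kept = set(), []
    for e in events:
        key = tuple(e[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(e)
    return kept

def count_by_status(events):
    """Model of `stats count by status`."""
    return dict(Counter(e["status"] for e in events))

print(dc_by_status(events))                            # {'compliant': 1, 'non-compliant': 1}
print(count_by_status(dedup(events, "ip")))            # {'compliant': 1}
print(count_by_status(dedup(events, "ip", "status")))  # {'compliant': 1, 'non-compliant': 1}
```

Deduplicating on ip alone discards the non-compliant event before stats ever sees it; deduplicating on the ip and status pair keeps one event per pair, which lines up with the per-status dc results.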

somesoni2
Revered Legend

Suppose your data set is this

src_ip  status
--------------------
src1    Compliance
src1    Compliance
src2    Non-compliance
src1    Non-compliance
src2    Compliance
src3    Compliance
src4    Non-compliance

Query 1 (distinct count of src_ip where status=Compliance) returns 3 (src1, src2 and src3).
Query 2 (distinct count of src_ip where status=Non-compliance) returns 3 (src2, src1 and src4).
Query 3 (distinct count of src_ip regardless of status) returns 4 (src1, src2, src3 and src4).

In query 4, this is what remains after the dedup src_ip step, which keeps only the first event for each src_ip:

src_ip  status
-----------
src1    Compliance
src2    Non-compliance
src3    Compliance
src4    Non-compliance

So the count of src_ip with status=Compliance is now 2,
the count of src_ip with status=Non-compliance is now 2,
and the total count is still 4, because there are still 4 distinct src_ip values.
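
The same walk-through can be checked with a short Python sketch. It only models the behavior described above and is not how Splunk works internally; the function names distinct_ips and dedup_src_ip are hypothetical, and the rows are assumed to be processed in the order shown in the table.

```python
# The sample rows as (src_ip, status), in the order shown above.
rows = [
    ("src1", "Compliance"),
    ("src1", "Compliance"),
    ("src2", "Non-compliance"),
    ("src1", "Non-compliance"),
    ("src2", "Compliance"),
    ("src3", "Compliance"),
    ("src4", "Non-compliance"),
]

def distinct_ips(rows, status=None):
    """Model of `stats dc(src_ip)`, optionally after filtering on status."""
    return {ip for ip, s in rows if status is None or s == status}

def dedup_src_ip(rows):
    """Model of `dedup src_ip`: keep only the first row seen for each src_ip."""
    first = {}
    for ip, s in rows:
        first.setdefault(ip, s)
    return list(first.items())

q1 = len(distinct_ips(rows, "Compliance"))
q2 = len(distinct_ips(rows, "Non-compliance"))
q3 = len(distinct_ips(rows))

# Query 4: dedup first, then count the surviving rows per status.
q4 = {}
for _, s in dedup_src_ip(rows):
    q4[s] = q4.get(s, 0) + 1

print(q1, q2, q3)        # 3 3 4
print(q4)                # {'Compliance': 2, 'Non-compliance': 2}
print(sum(q4.values()))  # 4
```

The per-status numbers shrink after dedup because src1 and src2 each appear with both statuses but survive with only one, while the total is unchanged because every src_ip still contributes exactly one row.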

Hope this helps.

tmaltizo
Path Finder

This definitely helps @somesoni2! Thank you!

tmaltizo
Path Finder

Thanks for the clarification @sundareshr!

tmaltizo
Path Finder

@sundareshr,

If dc counts each unique ip/status and dedup counts only the first instance, then why are the totals the same?

... | dedup src_ip | stats count(src_ip) = 2928
... | stats dc(src_ip) = 2928

When I run the following, I get these results:
... | dedup src_ip status | stats count by status | addtotals

compliant = 2809, total=2809
non-compliant = 616, total=616
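
One way to reconcile these numbers, sketched in Python using only the figures quoted above: dedup src_ip keeps exactly one event per address, so its count has to equal dc(src_ip). dedup src_ip status keeps one event per (src_ip, status) pair instead, so an address that has reported both statuses is kept twice. The overlap below is derived by inclusion-exclusion and assumes that all three figures come from the same base search and that status only ever takes these two values. Note also that addtotals adds a per-row total by default, which is why each row repeats its own count, whereas addcoltotals appends a column sum.

```python
# Figures quoted in the post above.
total_distinct_ips = 2928   # dedup src_ip | stats count(src_ip), and stats dc(src_ip)
compliant_pairs = 2809      # dedup src_ip status | stats count by status
non_compliant_pairs = 616

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|.
# Assumes status is always either compliant or non-compliant.
pairs_total = compliant_pairs + non_compliant_pairs
ips_with_both_statuses = pairs_total - total_distinct_ips

print(pairs_total)             # 3425
print(ips_with_both_statuses)  # 497
```

Under those assumptions, 497 addresses have been seen as both compliant and non-compliant at some point, which is why the per-status counts add up to more than the number of distinct addresses.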
