Splunk Search

Why does dedup count and dc return a different number of values?

Explorer

Separate searches using dc return numbers that don't match those from a dedup followed by a count, except for the total. The results below are for the "All time" time range, but the discrepancy persists regardless of the time range.

=====================================================

Using dc

index="forescout" sourcetype="fsavcompliance" description="Server*" status="compliant" | stats dc(src_ip)

2804

index="forescout" sourcetype="fsavcompliance" description="Server*" status="non-compliant" | stats dc(src_ip)

614

index="forescout" sourcetype="fsavcompliance" description="Server*"| stats dc(src_ip)

2922

=====================================================

Using count

index="forescout" sourcetype="fsavcompliance" description="Server*" | dedup src_ip | stats count by status | addcoltotals

compliant = 2767
non-compliant = 155
addcoltotals = 2922

Any insight is much appreciated!
Trista

1 Solution

Legend

Here's an example

_time=1 index=forescout ip=x.x.x.x status=compliant
_time=2 index=forescout ip=x.x.x.x status=compliant
_time=3 index=forescout ip=x.x.x.x status=non-compliant

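With the above sample data, dc(ip) will return 1 for compliant and 1 for non-compliant, because each status is searched separately. In contrast, dedup ip | stats count by status keeps only one event for the ip, so it returns a count of 1 for a single status: the status of the first event dedup encounters (compliant, if the events are read in the order listed).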

For a more appropriate comparison, try `dedup ip status | stats count by status | addtotals`
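
One way to check this explanation against the real data is to count how many src_ip values have been seen with both statuses. This is a minimal sketch, assuming the same base search as the question and that status only ever takes the values compliant and non-compliant; the names status_count and ips_with_both_statuses are made up for illustration.

```spl
index="forescout" sourcetype="fsavcompliance" description="Server*"
| stats dc(status) AS status_count BY src_ip
| where status_count > 1
| stats count AS ips_with_both_statuses
```

If the explanation holds, the result should equal the two per-status dc values minus the overall distinct count, which for the numbers in the question is 2804 + 614 - 2922 = 496.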


SplunkTrust

Suppose your data set is this

src_ip  status
--------------------
src1    Compliance
src1    Compliance
src2    Non-compliance
src1    Non-compliance
src2    Compliance
src3    Compliance
src4    Non-compliance

Output of query 1 (distinct count of src_ip where status=Compliance) is 3 (src1, src2 and src3)
Output of query 2 (distinct count of src_ip where status=Non-compliance) is 3 (src2, src1 and src4)
Output of query 3 (distinct count of src_ip regardless of status) is 4 (src1, src2, src3 and src4)

This will be the output of query 4 after the dedup src_ip step, which keeps only the first event for each src_ip:

src_ip  status
-----------
src1    Compliance
src2    Non-compliance
src3    Compliance
src4    Non-compliance

So, the count of src_ip with status=Compliance is now 2,
the count of src_ip with status=Non-compliance is now 2,
and the total count is still 4, as there are still 4 distinct src_ip values.

Hope this helps.
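
The walkthrough above can be reproduced without any indexed data. The following is only a sketch: it builds the seven sample rows with makeresults, and the field names raw, dc_by_status, dedup_count and dc_count are illustrative and do not come from the original posts.

```spl
| makeresults
| eval raw="src1,Compliance;src1,Compliance;src2,Non-compliance;src1,Non-compliance;src2,Compliance;src3,Compliance;src4,Non-compliance"
| makemv delim=";" raw
| mvexpand raw
| rex field=raw "^(?<src_ip>[^,]+),(?<status>.+)$"
| eventstats dc(src_ip) AS dc_by_status BY status
| dedup src_ip
| stats count AS dedup_count max(dc_by_status) AS dc_count BY status
```

The eventstats step records the per-status distinct count on every row before dedup discards any events, so the final table shows both numbers side by side. For this sample, each status should show dc_count=3 and dedup_count=2, matching the figures in the explanation.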

Explorer

This definitely helps @somesoni2! Thank you!


Explorer

Thanks for the clarification @sundareshr!


Explorer

@sundareshr,

If dc counts each unique ip/status and dedup counts only the first instance, then why are the totals the same?

... | dedup src_ip | stats count(src_ip) = 2928
... | stats dc(src_ip) = 2928

When I run the following:
... | dedup src_ip status | stats count by status | addtotals

compliant = 2809, total=2809
non-compliant = 616, total=616
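
On the totals: addtotals sums across each row by default, which is why each total above simply repeats that row's count. A sketch that asks for a column total instead, assuming the same base search as the original question:

```spl
index="forescout" sourcetype="fsavcompliance" description="Server*"
| dedup src_ip status
| stats count BY status
| addcoltotals labelfield=status label="Total"
```

With the counts above, the column total would be 2809 + 616 = 3425, which is higher than 2928 because an src_ip seen with both statuses is counted once under each status. By contrast, dedup src_ip | stats count and stats dc(src_ip) agree because both count every src_ip exactly once.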
