Splunk Search

How to dedup after | stats list(blah)?

packet_hunter
Contributor

Scenario: I am extracting sender domains with the following code:

index=mail sourcetype=xemail

    [search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 

        |stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 

            |search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats list(domain) 

Problem: I am not able to find the correct dedup syntax so I only list one unique domain once.
Currently I see
domain_A.tld
domain_A.tld
domain_A.tld
domain_B.tld
domain_C.tld
domain_A.tld

When I want just
domain_A.tld
domain_B.tld
domain_C.tld

Please provide an example.

Thank you

Tags (3)
0 Karma
1 Solution

MuS
Legend

Hi packet_hunter,

just use values() instead of list() in the last stats:

index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats values(domain) 

This will return uniq values for domain http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/CommonStatsFunctions .

values(X)   Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.

Also, you may want to look at this answer https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and try to use stats in your search, because subsearch have limits and are expensive and slow to run over a large data set.

Hop this helps ...

cheers, MuS

View solution in original post

MuS
Legend

Hi packet_hunter,

just use values() instead of list() in the last stats:

index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats values(domain) 

This will return uniq values for domain http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/CommonStatsFunctions .

values(X)   Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.

Also, you may want to look at this answer https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and try to use stats in your search, because subsearch have limits and are expensive and slow to run over a large data set.

Hop this helps ...

cheers, MuS

Get Updates on the Splunk Community!

Congratulations to the 2025-2026 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Community Feedback

We Want to Hear from You! Share Your Feedback on the Splunk Community   The Splunk Community is built for you ...

Manual Instrumentation with Splunk Observability Cloud: Implementing the ...

In our observability journey so far, we've built comprehensive instrumentation for our Worms in Space ...