Splunk Search

How to dedup after | stats list(blah)?

packet_hunter
Contributor

Scenario: I am extracting sender domains with the following code:

index=mail sourcetype=xemail

    [search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 

        |stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 

            |search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats list(domain) 

Problem: I am not able to find the correct dedup syntax so I only list one unique domain once.
Currently I see
domain_A.tld
domain_A.tld
domain_A.tld
domain_B.tld
domain_C.tld
domain_A.tld

When I want just
domain_A.tld
domain_B.tld
domain_C.tld

Please provide an example.

Thank you

Tags (3)
0 Karma
1 Solution

MuS
SplunkTrust
SplunkTrust

Hi packet_hunter,

just use values() instead of list() in the last stats:

index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats values(domain) 

This will return uniq values for domain http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/CommonStatsFunctions .

values(X)   Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.

Also, you may want to look at this answer https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and try to use stats in your search, because subsearch have limits and are expensive and slow to run over a large data set.

Hop this helps ...

cheers, MuS

View solution in original post

MuS
SplunkTrust
SplunkTrust

Hi packet_hunter,

just use values() instead of list() in the last stats:

index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats values(domain) 

This will return uniq values for domain http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/CommonStatsFunctions .

values(X)   Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.

Also, you may want to look at this answer https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and try to use stats in your search, because subsearch have limits and are expensive and slow to run over a large data set.

Hop this helps ...

cheers, MuS

Get Updates on the Splunk Community!

Accelerating Observability as Code with the Splunk AI Assistant

We’ve seen in previous posts what Observability as Code (OaC) is and how it’s now essential for managing ...

Integrating Splunk Search API and Quarto to Create Reproducible Investigation ...

 Splunk is More Than Just the Web Console For Digital Forensics and Incident Response (DFIR) practitioners, ...

Congratulations to the 2025-2026 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...