Splunk Search

How to dedup after | stats list(blah)?

packet_hunter
Contributor

Scenario: I am extracting sender domains with the following code:

index=mail sourcetype=xemail

    [search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 

        |stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 

            |search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats list(domain) 

Problem: I am not able to find the correct dedup syntax so I only list one unique domain once.
Currently I see
domain_A.tld
domain_A.tld
domain_A.tld
domain_B.tld
domain_C.tld
domain_A.tld

When I want just
domain_A.tld
domain_B.tld
domain_C.tld

Please provide an example.

Thank you

Tags (3)
0 Karma
1 Solution

MuS
SplunkTrust
SplunkTrust

Hi packet_hunter,

just use values() instead of list() in the last stats:

index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats values(domain) 

This will return uniq values for domain http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/CommonStatsFunctions .

values(X)   Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.

Also, you may want to look at this answer https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and try to use stats in your search, because subsearch have limits and are expensive and slow to run over a large data set.

Hop this helps ...

cheers, MuS

View solution in original post

MuS
SplunkTrust
SplunkTrust

Hi packet_hunter,

just use values() instead of list() in the last stats:

index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID] 
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID 
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats values(domain) 

This will return uniq values for domain http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/CommonStatsFunctions .

values(X)   Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.

Also, you may want to look at this answer https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and try to use stats in your search, because subsearch have limits and are expensive and slow to run over a large data set.

Hop this helps ...

cheers, MuS

Get Updates on the Splunk Community!

Index This | What is broken 80% of the time by February?

December 2025 Edition   Hayyy Splunk Education Enthusiasts and the Eternally Curious!    We’re back with this ...

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Hello Splunk Community,   We're thrilled to share an exciting update that will help you manage your data more ...

Splunk MCP & Agentic AI: Machine Data Without Limits

Discover how the Splunk Model Context Protocol (MCP) Server can revolutionize the way your organization uses ...