Scenario: I am extracting sender domains with the following code:
index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID]
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats list(domain)
Problem: I am not able to find the correct dedup syntax so I only list one unique domain once.
Currently I see
domain_A.tld
domain_A.tld
domain_A.tld
domain_B.tld
domain_C.tld
domain_A.tld
When I want just
domain_A.tld
domain_B.tld
domain_C.tld
Please provide an example.
Thank you
Hi packet_hunter,
just use values()
instead of list()
in the last stats:
index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID]
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats values(domain)
This will return uniq values for domain http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/CommonStatsFunctions .
values(X) Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.
Also, you may want to look at this answer https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and try to use stats
in your search, because subsearch have limits and are expensive and slow to run over a large data set.
Hop this helps ...
cheers, MuS
Hi packet_hunter,
just use values()
instead of list()
in the last stats:
index=mail sourcetype=xemail
[search index=mail sourcetype=xemail subject = "Blah" |stats count by UID| fields UID]
|stats list(subject) as subj list(sender) as sender list(recipient) as recp list(vendor_action) as status by UID
|search status= "Message done" |rex field=sender "[@\.](?<domain>\w+\.\w+)$" | stats values(domain)
This will return uniq values for domain http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/CommonStatsFunctions .
values(X) Returns the list of all distinct values of the field X as a multivalue entry. The order of the values is lexicographical.
Also, you may want to look at this answer https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... and try to use stats
in your search, because subsearch have limits and are expensive and slow to run over a large data set.
Hop this helps ...
cheers, MuS