Splunk Search

Limiting Results of matching values in an array field

yepyepyayyooo
New Member

Does anyone know a way to return only the values of a string array field in the parent search that match the results of a subsearch?

index="email" sourcetype="email_links" 
    [ search index="sinkholed" sourcetype="bad_http" 
    | rename raw_host as "extracted_host{}"
    | fields "extracted_host{}" ] 
| stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
| sort recipient_dc

The query works, except that I'm getting back more than I want. The "extracted_host{}" field in the results contains every value in the matching event's array instead of just the values that matched. For example, let's say the subsearch returns a sinkholed domain called baddomain.com. The results I see in "extracted_host{}" are:

baddomain.com
www.w3.org
abc123advertisement.com
etcetcetc.com

I would like to return only the values that matched the subsearch. Any assistance is greatly appreciated.

manjunathmeti
Champion

The field "extracted_host{}" in the main search is a JSON array, so it is a multivalue field. When you filter on extracted_host{} = baddomain.com in the main search, every event whose array contains baddomain.com matches, and all the other values in those arrays appear in the results as well.

You need to expand the field extracted_host{} in the main search before filtering it with the subsearch.

index="email" sourcetype="email_links"
| mvexpand extracted_host{}
| search
    [ search index="sinkholed" sourcetype="bad_http"
    | rename raw_host as "extracted_host{}"
    | fields "extracted_host{}" ]
| stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
| sort recipient_dc
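
A quick way to see the difference is with synthetic data. The sketch below uses makeresults with made-up values, and the field is named extracted_host (no braces) only to keep the example short, so treat it as an illustration rather than the real data. Without mvexpand, the event matches because one of its values matches, and the whole array comes back:

```
| makeresults
| eval extracted_host=split("baddomain.com,www.w3.org,abc123advertisement.com", ",")
| search extracted_host="baddomain.com"
```

With mvexpand placed before the filter, each value becomes its own row, so only the matching row survives:

```
| makeresults
| eval extracted_host=split("baddomain.com,www.w3.org,abc123advertisement.com", ",")
| mvexpand extracted_host
| search extracted_host="baddomain.com"
```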

yepyepyayyooo
New Member

Thanks! That worked. How would you go about performing this on multiple multi-value fields?

| mvexpand extracted_host{}, url

manjunathmeti
Champion

You're welcome! Yes, you can filter on multiple multivalue fields. mvexpand takes a single field, so expand each field with its own mvexpand command:

<base search>
| mvexpand extracted_host{}
| mvexpand url
| search
[<search> | fields extracted_host{}, url]
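
One thing to keep in mind: two consecutive mvexpand commands produce every combination of the two fields' values, so an event with 3 hosts and 4 URLs becomes 12 rows, including pairings that never occurred together. If the two fields are parallel arrays, where the Nth url belongs to the Nth host (this is an assumption about the data, not something stated in the thread), a common pattern is to zip them with mvzip and expand the zipped field once. A minimal sketch, where pair is a hypothetical temporary field and "###" is a delimiter assumed not to occur in either field:

```
<base search>
| rename "extracted_host{}" as extracted_host
| eval pair=mvzip(extracted_host, url, "###")
| mvexpand pair
| eval extracted_host=mvindex(split(pair, "###"), 0), url=mvindex(split(pair, "###"), 1)
| fields - pair
| search
    [ <search> | fields extracted_host, url ]
```

The subsearch has to return its fields under the same names used here (extracted_host and url). Each row it returns is applied as an AND of its fields, so a row in the outer search is kept only when both values match the same subsearch row.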

wmyersas
Builder

I'll presume raw_host is a multivalue field. If that is the case, try the following:

index=email sourcetype=email_links
| search
    [ search index=sinkholed sourcetype=bad_http
    | mvexpand raw_host
    | stats count by raw_host
    | fields - count
    | rename raw_host as <field-in-outer-search> ]
| <rest of search>

That should show you only the email_links events that reference sinkholed domains.
