Splunk Search

Limiting Results of matching values in an array field

yepyepyayyooo
New Member

Does anyone know of a way to return only the values of a string array field in the parent search that match the results of a subsearch?

index="email" sourcetype="email_links" 
    [ search index="sinkholed" sourcetype="bad_http" 
    | rename raw_host as "extracted_host{}"
    | fields "extracted_host{}" ] 
| stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
| sort recipient_dc

The query works, but it returns more than I want. The "extracted_host{}" field in the results contains every value in that event's array, not just the values that matched. For example, say the subsearch returns a sinkholed domain called baddomain.com. The values I see in "extracted_host{}" are:

baddomain.com
www.w3.org
abc123advertisement.com
etcetcetc.com

I would like to return only the values that matched in the subsearch. Any assistance is greatly appreciated.

0 Karma
1 Solution

manjunathmeti
Champion

The field "extracted_host{}" in the main search is a JSON array, so it is extracted as a multivalue field. When you filter on extracted_host{} = baddomain.com in the main search, the filter selects whole events, so every other value in an array that contains baddomain.com also appears in the search results.

You need to expand the field extracted_host{} in the main search before filtering it with a subsearch.

index="email" sourcetype="email_links"
| mvexpand extracted_host{}
| search
    [ search index="sinkholed" sourcetype="bad_http"
    | rename raw_host as "extracted_host{}"
    | fields "extracted_host{}" ]
| stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
| sort recipient_dc
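
To see the effect in isolation, here is a minimal sketch that builds one synthetic event with makeresults, so it can be run without the email or sinkholed indexes. The field names sender, extracted_host (without the braces) and matched_hosts, and the sender address, are made up for illustration; only the host values come from the question.

```spl
| makeresults
| eval sender="sender@example.com"
| eval extracted_host=split("baddomain.com,www.w3.org,abc123advertisement.com,etcetcetc.com", ",")
| mvexpand extracted_host
| search extracted_host="baddomain.com"
| stats values(extracted_host) as matched_hosts by sender
```

With the mvexpand line removed, the search still matches the event, but values(extracted_host) returns all four hosts, which is the behavior described in the question. With mvexpand in place, each host becomes its own row before the filter runs, so only baddomain.com should remain.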

0 Karma

yepyepyayyooo
New Member

Thanks! That worked. How would you go about performing this on multiple multi-value fields?

| mvexpand extracted_host{}, url

0 Karma

manjunathmeti
Champion

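Welcome! Yes, you can filter on multiple multi-value fields. mvexpand accepts only one field per call, so expand each field with its own mvexpand before the search:

<base search>
| mvexpand extracted_host{}
| mvexpand url
| search
    [ <search> | fields extracted_host{}, url ]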
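
One thing to keep in mind: two mvexpand commands in a row produce one row for every combination of values, not one row per aligned pair. The sketch below shows this on a synthetic event; the field name extracted_host (without braces) and the URLs are made up for illustration.

```spl
| makeresults
| eval extracted_host=split("baddomain.com,www.w3.org", ",")
| eval url=split("http://baddomain.com/a,http://www.w3.org/b", ",")
| mvexpand extracted_host
| mvexpand url
| stats count by extracted_host, url
```

Two hosts and two URLs give four rows here, including pairings such as www.w3.org with the baddomain.com URL. Also, as far as I know, a subsearch that returns both fields is expanded so that both must match in the same row, so check that the subsearch emits the host and url pairs you actually expect.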
0 Karma

wmyersas
Builder

I'll presume raw_host is a multivalue field. If so, try the following:

index=email sourcetype=email_links
| search
    [ search index=sinkholed sourcetype=bad_http
    | mvexpand raw_host
    | stats count by raw_host
    | fields - count
    | rename raw_host as <field-in-outer-search> ]
| <rest of search>

That should show you only the email_links events that link to sinkholed domains.
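
If the filter does not behave as expected, one way to check what the subsearch hands to the outer search is to run it on its own and append format, which renders the result rows as a search expression. This is only a sketch that reuses the index, sourcetype and field names from the question; add the mvexpand raw_host step if raw_host really is multivalue in your data.

```spl
index="sinkholed" sourcetype="bad_http"
| stats count by raw_host
| fields - count
| rename raw_host as "extracted_host{}"
| format
```

The single result should contain a field named search holding the OR-ed list of extracted_host{} values that the outer search will filter on.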

0 Karma