Splunk Search

Limiting Results of matching values in an array field

yepyepyayyooo
New Member

Anyone know of a way to only return the matching values of a sub search to the string array field in the parent search?

index="email" sourcetype="email_links" 
    [ search index="sinkholed" sourcetype="bad_http" 
    | rename raw_host as "extracted_host{}"
    | fields "extracted_host{}" ] 
| stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
| sort recipient_dc

The query works fine except I'm getting back more than I want. The results I get back in the "extracted_host{}" field are everything in that particular field value array instead of just the matching criteria. For example, in the sub-search let's say there is a sinkhole domain called baddomain.com. The results I see in "extracted_host{}" are:

baddomain.com
www.w3.org
abc123advertisement.com
etcetcetc.com

Would like to only return what matched in the sub-search. Any assistance is greatly appreciated.

0 Karma
1 Solution

manjunathmeti
Champion

Field "extracted_host{}" in main search is a json array. So when you filter extracted_host{} = baddomain.com in main search all other values in arrays containing baddomain.com will also appear in search results.

You need to expand field extracted_host{} in main search before filtering it with a sub-search.

index="email" sourcetype="email_links" 
| mvexpand extracted_host{}
| search
     [ search index="sinkholed" sourcetype="bad_http" 
     | rename raw_host as "extracted_host{}"
     | fields "extracted_host{}" ] 
 | stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
 | sort recipient_dc

View solution in original post

0 Karma

manjunathmeti
Champion

Field "extracted_host{}" in main search is a json array. So when you filter extracted_host{} = baddomain.com in main search all other values in arrays containing baddomain.com will also appear in search results.

You need to expand field extracted_host{} in main search before filtering it with a sub-search.

index="email" sourcetype="email_links" 
| mvexpand extracted_host{}
| search
     [ search index="sinkholed" sourcetype="bad_http" 
     | rename raw_host as "extracted_host{}"
     | fields "extracted_host{}" ] 
 | stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
 | sort recipient_dc
0 Karma

yepyepyayyooo
New Member

Thanks! That worked. How would you go about performing this on multiple multi-value fields?

| mvexpand extracted_host{}, url

0 Karma

manjunathmeti
Champion

Welcome! Yes, you can filter on multiple multi-value fields.

<base search>
| mvexpand extracted_host{}
| mvexpand url
| search
[<search> | fields extracted_host{}, url]
0 Karma

wmyersas
Builder

I'll presume raw_host is a multivalue field

Presuming that is the case, do the following:

index=email sourcetype=email_links
| search
    [ search index=sinkholed sourcetype=bad_http
    | mvexpand raw_host
    | stats count by raw_host
    | fields - count
    | rename raw_host as <field-in-outer-search> ]
| <rest of search>

That should only show you email_links to domains that were sinkholed

0 Karma
Get Updates on the Splunk Community!

Uncovering Multi-Account Fraud with Splunk Banking Analytics

Last month, I met with a Senior Fraud Analyst at a nationally recognized bank to discuss their recent success ...

Secure Your Future: A Deep Dive into the Compliance and Security Enhancements for the ...

What has been announced?  In the blog, “Preparing your Splunk Environment for OpensSSL3,”we announced the ...

New This Month in Splunk Observability Cloud - Synthetic Monitoring updates, UI ...

This month, we’re delivering several platform, infrastructure, application and digital experience monitoring ...