Splunk Search

inputlookup with fuzzy matching

warren
Explorer

Hello,

I'm building a query which matches entries in an inputlookup table against a set of log data. The original working query (thanks to @ITWhisperer ) is:

dataFeedTypeId=AS
[ | inputlookup approvedsenders | fields Value
| rename Value as sender]
| stats count as cnt_sender by sender
| append
[inputlookup approvedsenders | fields Value | rename Value as sender]
| fillnull cnt_sender
| stats sum(cnt_sender) as count BY sender

This is correctly providing a list of all of the emails address entries in the lookup file with the number of times they occur in the email address field (sender) of the dataset.

However, the "Value" field from the lookup file may contain an email address, an IP address, or a domain name:

"Value,description,spoof,spam,heuristic,newsletter
training.cloud@bob,Email Tests,Y,Y,Y,Y
mk.mimecast.com,mimecast emails,Y,Y,Y,Y
blah@yahoo.com,more belgrat,Y,Y,Y,Y
bbc.co.uk,BBC sends me lots of useful information,N,Y,Y,Y
81.96.24.195,test IP,Y,Y,Y,Y
yahoo.com, Yahoo domain,Y,Y,Y,Y"

What I need to do next is to widen the search, at the moment it is doing an exact match on the sender field and I need the entries in the lookup file to be used against a number of fields in the log data.

Examples: This query seems to provide the correct results but it does not show the null values as above(n.b. the return 1000 isn't optimal code, I'd prefer it to simply return all of the results):

dataFeedTypeId=AS
[| inputlookup approvedsenders
| fields Value
| return 1000 $Value]
| stats count as cnt_sender by sender

This query also provides the correct results but again does not show the null values (there is no domain field in the log data, I am using an app to extract it from the sender field)

dataFeedTypeId=AS
| rex field=sender "\@(?<domain_detected>.*)"
| eval list="mozilla"
| `ut_parse_extended(domain_detected, list)`
| lookup approvedsenders Value AS domain_detected OUTPUTNEW description Value
| lookup approvedsenders Value AS sender OUTPUTNEW description Value
| lookup approvedsenders Value AS senderIp OUTPUTNEW description Value
| search description="*"
| table subject sender domain_detected senderIP description Value
| lookup approvedsenders description OUTPUTNEW Value as Matched
| stats count by Matched

 

My apologies for the long question, thank you for taking the time to read this far.

Labels (1)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

You need to clarify the question.

Examples: This query seems to provide the correct results but it does not show the null values as above(n.b. the return 1000 isn't optimal code, I'd prefer it to simply return all of the results):


What do you mean by "null values" here?  Do you mean the 0s that fillnull command supplied?  You can apply the same trick as before, and it will have the same effect. (BTW, do not do fillnull outside of append subsearch.  That adds to compute.)

dataFeedTypeId=AS
[| inputlookup approvedsenders
| fields Value
| return 9999 $Value]
| stats count as cnt_sender by sender
| append
  [inputlookup approvedsenders
  | fields Value
  | rename Value AS sender]
| fillnull cnt_sender
| stats sum(cnt_sender) as cnt_sender by sender

(I changed 1000 to 9999 just as a suggestion simply give it a large enough number.)  Is this what you are looking for?

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

.conf25 Global Broadcast: Don’t Miss a Moment

Hello Splunkers, .conf25 is only a click away.  Not able to make it to .conf25 in person? No worries, you can ...

Observe and Secure All Apps with Splunk

 Join Us for Our Next Tech Talk: Observe and Secure All Apps with SplunkAs organizations continue to innovate ...

What's New in Splunk Observability - August 2025

What's New We are excited to announce the latest enhancements to Splunk Observability Cloud as well as what is ...