Splunk Search

inputlookup with fuzzy matching

warren
Explorer

Hello,

I'm building a query which matches entries in an inputlookup table against a set of log data. The original working query (thanks to @ITWhisperer ) is:

dataFeedTypeId=AS
[ | inputlookup approvedsenders | fields Value
| rename Value as sender]
| stats count as cnt_sender by sender
| append
[inputlookup approvedsenders | fields Value | rename Value as sender]
| fillnull cnt_sender
| stats sum(cnt_sender) as count BY sender

This is correctly providing a list of all of the emails address entries in the lookup file with the number of times they occur in the email address field (sender) of the dataset.

However, the "Value" field from the lookup file may contain an email address, an IP address, or a domain name:

"Value,description,spoof,spam,heuristic,newsletter
training.cloud@bob,Email Tests,Y,Y,Y,Y
mk.mimecast.com,mimecast emails,Y,Y,Y,Y
blah@yahoo.com,more belgrat,Y,Y,Y,Y
bbc.co.uk,BBC sends me lots of useful information,N,Y,Y,Y
81.96.24.195,test IP,Y,Y,Y,Y
yahoo.com, Yahoo domain,Y,Y,Y,Y"

What I need to do next is to widen the search, at the moment it is doing an exact match on the sender field and I need the entries in the lookup file to be used against a number of fields in the log data.

Examples: This query seems to provide the correct results but it does not show the null values as above(n.b. the return 1000 isn't optimal code, I'd prefer it to simply return all of the results):

dataFeedTypeId=AS
[| inputlookup approvedsenders
| fields Value
| return 1000 $Value]
| stats count as cnt_sender by sender

This query also provides the correct results but again does not show the null values (there is no domain field in the log data, I am using an app to extract it from the sender field)

dataFeedTypeId=AS
| rex field=sender "\@(?<domain_detected>.*)"
| eval list="mozilla"
| `ut_parse_extended(domain_detected, list)`
| lookup approvedsenders Value AS domain_detected OUTPUTNEW description Value
| lookup approvedsenders Value AS sender OUTPUTNEW description Value
| lookup approvedsenders Value AS senderIp OUTPUTNEW description Value
| search description="*"
| table subject sender domain_detected senderIP description Value
| lookup approvedsenders description OUTPUTNEW Value as Matched
| stats count by Matched

 

My apologies for the long question, thank you for taking the time to read this far.

Labels (1)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

You need to clarify the question.

Examples: This query seems to provide the correct results but it does not show the null values as above(n.b. the return 1000 isn't optimal code, I'd prefer it to simply return all of the results):


What do you mean by "null values" here?  Do you mean the 0s that fillnull command supplied?  You can apply the same trick as before, and it will have the same effect. (BTW, do not do fillnull outside of append subsearch.  That adds to compute.)

dataFeedTypeId=AS
[| inputlookup approvedsenders
| fields Value
| return 9999 $Value]
| stats count as cnt_sender by sender
| append
  [inputlookup approvedsenders
  | fields Value
  | rename Value AS sender]
| fillnull cnt_sender
| stats sum(cnt_sender) as cnt_sender by sender

(I changed 1000 to 9999 just as a suggestion simply give it a large enough number.)  Is this what you are looking for?

0 Karma
Get Updates on the Splunk Community!

CX Day is Coming!

Customer Experience (CX) Day is on October 7th!! We're so excited to bring back another day full of wonderful ...

Strengthen Your Future: A Look Back at Splunk 10 Innovations and .conf25 Highlights!

The Big One: Splunk 10 is Here!  The moment many of you have been waiting for has arrived! We are thrilled to ...

Now Offering the AI Assistant Usage Dashboard in Cloud Monitoring Console

Today, we’re excited to announce the release of a brand new AI assistant usage dashboard in Cloud Monitoring ...