Splunk Search

How to match email from index with a lookup file.csv ? same "email prefix" but different email domain.

harold_ziedler
Engager

Hi, I'm trying to use index and lookup function. However values in those fields are not an exact match but those email address belongs to one person. How can i get the non exact match to work?

 

eg.

from index= user:

email_address, team

john.doe@xyz.com, blue

 

from file.csv:

email_address, department

john.doe@xyz.com.au, HR

example search:

"index=user
| lookup "file.csv"
| table email_address department 

Labels (3)
Tags (1)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

First, that's a lot of faith in E-mail domains.  I guess if this is regarding one enterprise that has full control, it could work.

Then, the point of using lookup is so an efficient search can be used.  The lookup data should be immediately searchable by the real match term, the common denominator, so to speak.  In other words, the lookup file should contain

email_addressdepartment
john.doe@xyz.comHR

Such a file can be easily produced from the current format, or the developer could make a simple change to produce this format.  When such a new lookup is produced, the search would be simply

 

index=user
| eval stripped_email_address = replace(email_address, "@xyz.com.*", "@xyz.com")
| lookup "reallookupfile.csv" email_address AS stripped_email_address
| table email_address department

 

Otherwise, you will be forced to treat the lookup file as another input stream, and the calculation becomes much more expensive.  Effectively, you are no longer using a lookup files.

index=user
| append
    [| inputlookup lookup.csv
    | eval fromlookup = "true"]
| eval stripped_email_address = replace(email_address, "@xyz.com.*", "@xyz.com")
| stats list(email_address) as email_address list(department) as department values(fromlookup) as fromlookup by stripped_email_address
| where mvcount(email_address) > 1 OR isnull(fromlookup)

View solution in original post

0 Karma

yuanliu
SplunkTrust
SplunkTrust

First, that's a lot of faith in E-mail domains.  I guess if this is regarding one enterprise that has full control, it could work.

Then, the point of using lookup is so an efficient search can be used.  The lookup data should be immediately searchable by the real match term, the common denominator, so to speak.  In other words, the lookup file should contain

email_addressdepartment
john.doe@xyz.comHR

Such a file can be easily produced from the current format, or the developer could make a simple change to produce this format.  When such a new lookup is produced, the search would be simply

 

index=user
| eval stripped_email_address = replace(email_address, "@xyz.com.*", "@xyz.com")
| lookup "reallookupfile.csv" email_address AS stripped_email_address
| table email_address department

 

Otherwise, you will be forced to treat the lookup file as another input stream, and the calculation becomes much more expensive.  Effectively, you are no longer using a lookup files.

index=user
| append
    [| inputlookup lookup.csv
    | eval fromlookup = "true"]
| eval stripped_email_address = replace(email_address, "@xyz.com.*", "@xyz.com")
| stats list(email_address) as email_address list(department) as department values(fromlookup) as fromlookup by stripped_email_address
| where mvcount(email_address) > 1 OR isnull(fromlookup)
0 Karma

fredclown
Contributor

The inputlookup command has an append option for it so that you do not have to use a subsearch and the limitations that come with those.

index=user
| inputlookup lookup.csv append=true
| eval fromlookup=if(isnull(source), "true", null())
| eval stripped_email_address = replace(email_address, "@xyz.com.*", "@xyz.com")
| stats list(email_address) as email_address list(department) as department values(fromlookup) as fromlookup by stripped_email_address
| where mvcount(email_address) > 1 OR isnull(fromlookup)

 

0 Karma
Get Updates on the Splunk Community!

Threat Hunting Unlocked: How to Uplevel Your Threat Hunting With the PEAK Framework ...

WATCH NOWAs AI starts tackling low level alerts, it's more critical than ever to uplevel your threat hunting ...

Splunk APM: New Product Features + Community Office Hours Recap!

Howdy Splunk Community! Over the past few months, we’ve had a lot going on in the world of Splunk Application ...

Index This | Forward, I’m heavy; backward, I’m not. What am I?

April 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...