Splunk Search

Comparing Two Files with Eval and Set

pradeepk_splunk
Splunk Employee
Splunk Employee

Hello! I started using Splunk about 3 hours ago and am getting stuck on something that may be very simple:

I have two csv files:

fullname.txt. Contents of file:

full_name, date_of_birth
Homer Simpson, 01-01-1980
Bart Simpson, 02-02-2009
Apu Nahasapeemapetilon, 03-03-1984
Montgomery Burns, 06-06-1881

db_export.csv. Contents of file:

id, first_name, last_name, create_date
001, HoM er, siMpson, 01-01-2018
002, bART, Simp son, 07-01-2018
003, ApU , Nahasapeemapetilon , 09-03-2018
004, Montgomery, bUrns, 11-14-2018

What I'm trying to do: Compare full_name in fullname.txt to first_name+last_name in db_export.csv. Note that there are capitalization and spacing issues in the db_export csv, so I'm attempting to use eval to 'normalize' the fields by creating new values that contain lowercased letters, then comparing those.

Here's what I have so far:

source="fullname.txt" OR source="db-export.csv" | eval spring_full_name = full_name, simps_full_name = first_name." ".last_name, spring_low_name = lower(full_name), simps_low_name = lower(simps_full_name) | set intersect *... this is about the point where I'm getting lost*

I'd also like to concatenate the eval'd full_name field and compare that to the concatenated first_name+last_name eval'd field, due to the spacing issues in db_export.csv. Finally, I would like to have one search show all of the matches and another search show the non-matches(or combined into one search, if possible). Ultimately, I would also like to display a report that contains the full_name and date_of_birth of fullname.txt combined with the id and create_date of db_export.csv

Thanks in advance!

0 Karma
1 Solution

sdchakraborty
Contributor

Hi,

Can you try the below search, Please note that I indexed those files into my main index. I also assumed that in your db export you have only space issues. If any other special character issue is there into db export we may need to handle it through regular expression. Also I assumed your db_export is the master data so the below query will also return names which are present in db export but not in fullname.txt.

index=main source="db_export.csv"
| table id, first_name, last_name, create_date
| eval fullname_formatted = lower(replace(first_name," ","")).lower(replace(last_name," ",""))
| join type="left" fullname_formatted 
    [search index=main source="fullname.csv"
    | table full_name, date_of_birth
    | eval fullname_formatted = lower(replace(full_name," ",""))]

Sid

View solution in original post

sdchakraborty
Contributor

Hi,

Can you try the below search, Please note that I indexed those files into my main index. I also assumed that in your db export you have only space issues. If any other special character issue is there into db export we may need to handle it through regular expression. Also I assumed your db_export is the master data so the below query will also return names which are present in db export but not in fullname.txt.

index=main source="db_export.csv"
| table id, first_name, last_name, create_date
| eval fullname_formatted = lower(replace(first_name," ","")).lower(replace(last_name," ",""))
| join type="left" fullname_formatted 
    [search index=main source="fullname.csv"
    | table full_name, date_of_birth
    | eval fullname_formatted = lower(replace(full_name," ",""))]

Sid

pradeepk_splunk
Splunk Employee
Splunk Employee

Wow...fantastic! That is exactly what I was looking for! Thank you Sid.

0 Karma

sdchakraborty
Contributor

no worries. Have a great day 🙂

0 Karma
Get Updates on the Splunk Community!

SOC4Kafka - New Kafka Connector Powered by OpenTelemetry

The new SOC4Kafka connector, built on OpenTelemetry, enables the collection of Kafka messages and forwards ...

Your Voice Matters! Help Us Shape the New Splunk Lantern Experience

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Building Momentum: Splunk Developer Program at .conf25

At Splunk, developers are at the heart of innovation. That’s why this year at .conf25, we officially launched ...