Hello! I started using Splunk about 3 hours ago and am getting stuck on something that may be very simple:
I have two csv files:
fullname.txt. Contents of file:
full_name, date_of_birth
Homer Simpson, 01-01-1980
Bart Simpson, 02-02-2009
Apu Nahasapeemapetilon, 03-03-1984
Montgomery Burns, 06-06-1881
db_export.csv. Contents of file:
id, first_name, last_name, create_date
001, HoM er, siMpson, 01-01-2018
002, bART, Simp son, 07-01-2018
003, ApU , Nahasapeemapetilon , 09-03-2018
004, Montgomery, bUrns, 11-14-2018
What I'm trying to do: Compare full_name in fullname.txt to first_name+last_name in db_export.csv. Note that there are capitalization and spacing issues in the db_export csv, so I'm attempting to use eval to 'normalize' the fields by creating new values that contain lowercased letters, then comparing those.
Here's what I have so far:
source="fullname.txt" OR source="db-export.csv" | eval spring_full_name = full_name, simps_full_name = first_name." ".last_name, spring_low_name = lower(full_name), simps_low_name = lower(simps_full_name) | set intersect *... this is about the point where I'm getting lost*
I'd also like to concatenate the eval'd full_name field and compare that to the concatenated first_name+last_name eval'd field, due to the spacing issues in db_export.csv. Finally, I would like to have one search show all of the matches and another search show the non-matches(or combined into one search, if possible). Ultimately, I would also like to display a report that contains the full_name and date_of_birth of fullname.txt combined with the id and create_date of db_export.csv
Thanks in advance!
Hi,
Can you try the below search, Please note that I indexed those files into my main index. I also assumed that in your db export you have only space issues. If any other special character issue is there into db export we may need to handle it through regular expression. Also I assumed your db_export is the master data so the below query will also return names which are present in db export but not in fullname.txt.
index=main source="db_export.csv"
| table id, first_name, last_name, create_date
| eval fullname_formatted = lower(replace(first_name," ","")).lower(replace(last_name," ",""))
| join type="left" fullname_formatted
[search index=main source="fullname.csv"
| table full_name, date_of_birth
| eval fullname_formatted = lower(replace(full_name," ",""))]
Sid
Hi,
Can you try the below search, Please note that I indexed those files into my main index. I also assumed that in your db export you have only space issues. If any other special character issue is there into db export we may need to handle it through regular expression. Also I assumed your db_export is the master data so the below query will also return names which are present in db export but not in fullname.txt.
index=main source="db_export.csv"
| table id, first_name, last_name, create_date
| eval fullname_formatted = lower(replace(first_name," ","")).lower(replace(last_name," ",""))
| join type="left" fullname_formatted
[search index=main source="fullname.csv"
| table full_name, date_of_birth
| eval fullname_formatted = lower(replace(full_name," ",""))]
Sid
Wow...fantastic! That is exactly what I was looking for! Thank you Sid.
no worries. Have a great day 🙂