Help with a regular expression to extract first and last names from a CSV file

Builder

Hi.
I have a CSV file that looks like this:

123,"firstname secondname","firstlast secondlast",
124,firstname secondname, firstlast secondlast,
125,"firstname","firstlast",
126,firstname,firstlast,

I need to extract the fields (first names and last names), regardless of whether a field holds one or two names and whether or not it is wrapped in quotation marks. How can I do this?
My regular expressions each handle only one case: only quoted values, only unquoted values, or only two-word first names, but never all of them.

Thanks for the help

Communicator

This should do the trick:

(?<number>[^,]+)(?:\,\"|\,)(?<first>[^",]+)(?:\"\,\"|\,)(?<last>[^,"]+)
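
For anyone who wants to check the pattern outside Splunk, here is a minimal Python sketch that runs it against the four sample rows from the question. Two things here are my assumptions and not part of the answer above: Python's `re` module needs the `(?P<name>...)` spelling for named groups, and the helper name `extract` plus the whitespace stripping are only for illustration (the unquoted row has a space after the second comma, which the pattern captures as part of the last name).

```python
import re

# Same pattern as above, with two mechanical changes for Python:
# named groups use (?P<name>...) and the unneeded backslashes are dropped.
PATTERN = re.compile(
    r'(?P<number>[^,]+)(?:,"|,)(?P<first>[^",]+)(?:","|,)(?P<last>[^,"]+)'
)

# The four sample rows from the question.
ROWS = [
    '123,"firstname secondname","firstlast secondlast",',
    '124,firstname secondname, firstlast secondlast,',
    '125,"firstname","firstlast",',
    '126,firstname,firstlast,',
]


def extract(row):
    """Return the named groups as a dict, or None if the row does not match."""
    match = PATTERN.search(row)
    if match is None:
        return None
    # Strip surrounding whitespace (hypothetical cleanup step, not in the regex).
    return {name: value.strip() for name, value in match.groupdict().items()}


for row in ROWS:
    print(extract(row))
```

Note that the pattern expects both name fields in a row to be quoted the same way; a row that quotes only one of them would need a looser pattern.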

Communicator

Yeah, it looks pretty good.

Esteemed Legend

If you just add a header line to the CSV file, then inputcsv and similar commands will create the fields automatically and you will not need to do anything else. How are you bringing this data into Splunk?
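
To illustrate the general idea (this is not Splunk itself), here is a small Python sketch showing how a CSV-aware reader uses a header line to name the fields and deals with the quotation marks on its own. The header names `id`, `firstnames`, and `lastnames` are hypothetical, as is the `parse` helper; the header ends with a comma only because the sample rows do.

```python
import csv
import io

# Sample rows from the question with a hypothetical header line added on top.
DATA = '''id,firstnames,lastnames,
123,"firstname secondname","firstlast secondlast",
124,firstname secondname, firstlast secondlast,
125,"firstname","firstlast",
126,firstname,firstlast,
'''


def parse(text):
    """Read CSV text and return one dict per row, keyed by the header names."""
    # skipinitialspace drops the blank that follows a comma in the unquoted row.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for record in reader:
        rows.append(
            {
                "id": record["id"],
                "firstnames": record["firstnames"],
                "lastnames": record["lastnames"],
            }
        )
    return rows


for row in parse(DATA):
    print(row)
```

Quoted and unquoted values come out the same way, so no regular expression is needed for this step.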

Builder

What regex did you use?

Contributor

Do you want to extract the firstname and secondname into separate fields or would you be good with just grabbing the name between the quotation marks?

Maybe this would work for you:

\d+,\"(?<firstname>[A-Za-z\s]+)\",\"(?<lastname>[A-Za-z\s]+)\",

Otherwise, you can write one with optional matches to extract the second name as its own field.

A good site for experimenting with patterns is https://regex101.com/

Contributor

This should get the separate names:

\d+,\"(?<firstname>[A-Za-z]+)(\s)?(?<secondname>[A-Za-z]+)?\",\"(?<lastname>[A-Za-z]+)(\s)?(?<secondlast>[A-Za-z]+)?\",

You can tweak it as needed. The ? makes the token or group right before it optional, which is how the second name and the space in front of it are allowed to be missing. You can also use curly braces with a range, such as {0,1}. Check out http://www.regular-expressions.info/optional.html
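
As one possible tweak, here is a Python sketch that also makes the quotation marks optional, so it covers the unquoted rows from the question as well. This variant is my assumption and not the exact pattern above: it uses Python's `(?P<name>...)` group syntax, puts the space and the second name together in one optional group, and allows spaces after a comma. The helper name `split_names` is hypothetical.

```python
import re

# Optional pieces are marked with ?:
#   "?               an optional quotation mark
#   (?:\s+(...))?    an optional second name, together with its leading space
PATTERN = re.compile(
    r'^\d+,\s*"?(?P<firstname>[A-Za-z]+)(?:\s+(?P<secondname>[A-Za-z]+))?"?'
    r',\s*"?(?P<lastname>[A-Za-z]+)(?:\s+(?P<secondlast>[A-Za-z]+))?"?,'
)


def split_names(row):
    """Return the four name fields as a dict; missing optional names are None."""
    match = PATTERN.match(row)
    return match.groupdict() if match else None


print(split_names('123,"firstname secondname","firstlast secondlast",'))
print(split_names('124,firstname secondname, firstlast secondlast,'))
print(split_names('125,"firstname","firstlast",'))
print(split_names('126,firstname,firstlast,'))
```

Because the quotes are optional on both sides independently, this version would also accept a stray unbalanced quote, so tighten it if your data needs stricter checking.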
