Hi.
I have a csv file that looks like:
123,"firstname secondname","firstlast secondlast",
124,firstname secondname, firstlast secondlast,
125,"firstname","firstlast",
126,firstname,firstlast,
I need to extract the fields (first names and last names), regardless of whether a row has one or two first/last names or whether the values are in quotation marks. How can I do this?
My regular expressions only match rows with quotation marks, or rows without them, or rows with exactly two words in the first names, but never all of the cases.
Thanks for the help
This should do the trick:
(?<number>[^,]+)(?:\,\"|\,)(?<first>[^",]+)(?:\"\,\"|\,)(?<last>[^,"]+)
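A quick way to check this pattern against the four sample rows is a small Python sketch. Note that Python's `re` module spells named groups as `(?P<name>...)` rather than `(?<name>...)`, so the pattern is rewritten accordingly; the helper name `extract` is only for illustration.

```python
import re

# Same pattern as above, rewritten with Python's (?P<name>...) group syntax.
PATTERN = re.compile(
    r'(?P<number>[^,]+)(?:,"|,)(?P<first>[^",]+)(?:","|,)(?P<last>[^,"]+)'
)

# The four sample rows from the question.
ROWS = [
    '123,"firstname secondname","firstlast secondlast",',
    '124,firstname secondname, firstlast secondlast,',
    '125,"firstname","firstlast",',
    '126,firstname,firstlast,',
]

def extract(row):
    """Return (number, first, last) for one row, or None if it does not match."""
    m = PATTERN.match(row)
    if m is None:
        return None
    # strip() removes the stray space after the comma in row 124.
    return tuple(m.group(name).strip() for name in ("number", "first", "last"))

for row in ROWS:
    print(extract(row))
```

One limitation worth knowing: a row that mixes styles (first field quoted, second field unquoted) will not match, because the separator between the two names is either `","` or a bare `,`.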
Yeah, it looks pretty good.
If you just add a header line to the CSV file, inputcsv, etc. will automatically create the fields and you will not need to do anything. How are you bringing this into Splunk?
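Outside Splunk, the same idea (let a CSV parser handle the quotes and take the field names from a header line) can be sketched in Python with the standard `csv` module. The column names `id`, `firstnames`, and `lastnames` are assumptions chosen for illustration, and this is not a description of how inputcsv works internally.

```python
import csv
import io

# Sample data with an assumed header line prepended (column names are hypothetical).
DATA = '''id,firstnames,lastnames
123,"firstname secondname","firstlast secondlast",
124,firstname secondname, firstlast secondlast,
125,"firstname","firstlast",
126,firstname,firstlast,
'''

def read_rows(text):
    """Parse CSV text and return a list of dicts keyed by the header names."""
    # skipinitialspace=True drops the space that follows the comma in row 124.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for record in reader:
        # The trailing comma on each row yields one extra empty value, which
        # DictReader stores under the key None; drop it.
        record.pop(None, None)
        rows.append(dict(record))
    return rows

for row in read_rows(DATA):
    print(row)
```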
What's the regex you used?
Do you want to extract the firstname and secondname into separate fields or would you be good with just grabbing the name between the quotation marks?
maybe this would work for you?
\d+,\"(?<firstname>[A-Za-z\s]+)\",\"(?<lastname>[A-Za-z\s]+)\"\,
Otherwise you can write one that has optional matches to extract the secondname as its own field.
A good site for experimenting with regexes is https://regex101.com/
This should get the separate names:
\d+,\"(?<firstname>[A-Za-z]+)(\s)?(?<secondname>[A-Za-z]+)?\",\"(?<lastname>[A-Za-z]+)(\s)?(?<secondlast>[A-Za-z]+)?\",
You can tweak it as needed. The ? makes the token or group right before it optional. You can also use curly braces with a range (for example, {0,1} is equivalent to ?). Check out http://www.regular-expressions.info/optional.html
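The pattern above still requires the quotation marks, so it only covers rows 123 and 125. As a minimal sketch of one possible tweak, the same `?` trick can make the quotes optional as well. This is written in Python, which uses `(?P<name>...)` for named groups; the `\s*` after each comma and the helper name `split_names` are additions for illustration, not part of the original answer.

```python
import re

# Quotes ("?) and the second word of each name ((?:...)?) are both optional.
# \s* after a comma tolerates the stray space seen in row 124.
PATTERN = re.compile(
    r'\d+,\s*"?(?P<firstname>[A-Za-z]+)(?:\s(?P<secondname>[A-Za-z]+))?"?'
    r',\s*"?(?P<lastname>[A-Za-z]+)(?:\s(?P<secondlast>[A-Za-z]+))?"?,'
)

def split_names(row):
    """Return a dict of the four name parts (missing parts are None), or None."""
    m = PATTERN.match(row)
    return m.groupdict() if m else None

for row in [
    '123,"firstname secondname","firstlast secondlast",',
    '124,firstname secondname, firstlast secondlast,',
    '125,"firstname","firstlast",',
    '126,firstname,firstlast,',
]:
    print(split_names(row))
```

Because each quote is optional on its own, this variant would also accept a row with an unbalanced quote; if that matters, the quoted and unquoted forms need to be separate alternatives.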