Hi, I wonder whether someone could help me please.
I'm trying to put together a rex to extract the First Name from the data below.
{"matchingDataset":{"surnames":[{"value":"Smith","verified":true}],"gender":{"value":"MALE","verified":true},"dateOfBirth":{"value":"1973-12-26","verified":true},"firstName":{"value":"John","verified":true},"addresses":[{"verified":true,"postCode":"AB1 1BC","lines":["1 A Street","A Town","GB"]}],"middleNames":{"value":"john","verified":true}},"hashedPid":"123","matchId":"_123","levelOfAssurance":"LEVEL_2"}
From guidance I received from a post I made here I've been trying to come up with a suitable solution as shown below.
"\"firstName\":\"{\"\"value\":\"(?<idaFName>[^\"]+)"
Unfortunately, although I'm not receiving any error message, I'm not able to extract the information I need.
I tried Regex101 yesterday afternoon and again this morning without success, and I admit I am a complete novice when it comes to regex.
I just wondered whether someone could look at this please and let me know where I've gone wrong.
Many thanks and kind regards
Chris
Hi,
you simply put some extra " signs in your regex around the { that you do not need.
Try this:
| rex "\"firstName\":{\"value\":\"(?<idaFName>[^\"]+)"
Hi IRHM73,
based on the provided examples this will match:
\"firstName\":\{\"value\":\"(?<idaFName>[^\"]*)
This is the pure regex so to use it in Splunk use it this way:
base search here | rex "\"firstName\":\{\"value\":\"(?<idaFName>[^\"]*)" | do more foo here
But... this looks like JSON. Have you tried spath (http://docs.splunk.com/Documentation/Splunk/6.3.0/SearchReference/Spath) or KV_MODE = json in props.conf?
Hope this helps ...
cheers, MuS
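To illustrate the point that the event is JSON, here is a minimal sketch in Python (not SPL). It only mimics the idea behind spath: parse the event as JSON and walk the path matchingDataset.firstName.value instead of pattern-matching on the text. The variable names raw_event, event and ida_fname are made up for this example, and the snippet is an offline illustration, not something Splunk runs.

```python
import json

# Sample event copied from the original question.
raw_event = (
    '{"matchingDataset":{"surnames":[{"value":"Smith","verified":true}],'
    '"gender":{"value":"MALE","verified":true},'
    '"dateOfBirth":{"value":"1973-12-26","verified":true},'
    '"firstName":{"value":"John","verified":true},'
    '"addresses":[{"verified":true,"postCode":"AB1 1BC",'
    '"lines":["1 A Street","A Town","GB"]}],'
    '"middleNames":{"value":"john","verified":true}},'
    '"hashedPid":"123","matchId":"_123","levelOfAssurance":"LEVEL_2"}'
)

# Parse the whole event as JSON rather than matching on the raw text.
event = json.loads(raw_event)

# Walk the nested keys; this is the same path a JSON-aware extraction
# would address as matchingDataset.firstName.value (assumed field path).
ida_fname = event["matchingDataset"]["firstName"]["value"]
print(ida_fname)  # John
```

The advantage over a regex is that the lookup does not depend on key order or spacing in the event.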
Hi @MuS, thank you for taking the time to reply to my post. JSON was a possible solution that was suggested in earlier posts I've made, but I must admit I'm finding regex a little scary at the moment, so I'm not too confident with JSON. I now know that this is possibly the next step I need to take to become more adept with Splunk.
Many thanks and kind regards
Chris
dammit - 48 secs faster 🙂
Haha,
sorry MuS 😄
Hi @Tom, thank you very much for taking the time to reply to my post, again! The solution works great, but could I ask one more thing please? I've been trying to use Regex101 to understand how to put these queries together, and when I add the raw data from my original post and your solution to Regex101, it doesn't work.
Am I doing something wrong?
Many thanks and kind regards
Chris
You need to drop the outer " signs when you want to use it in regex101. It is the rex command that requires those.
So in regex101 just use:
\"firstName\":{\"value\":\"(?<idaFName>[^\"]+)
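For anyone who wants a second check besides regex101, here is a rough Python sketch that runs both the original and the corrected pattern against the sample event. Two assumptions to be aware of: Python's re module needs (?P<idaFName>...) for a named group, whereas rex and regex101 (PCRE) accept (?<idaFName>...), and the brace in the corrected pattern is escaped the way MuS wrote it. The variable names are invented for the example.

```python
import re

# Sample event copied from the original question.
raw_event = (
    '{"matchingDataset":{"surnames":[{"value":"Smith","verified":true}],'
    '"gender":{"value":"MALE","verified":true},'
    '"dateOfBirth":{"value":"1973-12-26","verified":true},'
    '"firstName":{"value":"John","verified":true},'
    '"addresses":[{"verified":true,"postCode":"AB1 1BC",'
    '"lines":["1 A Street","A Town","GB"]}],'
    '"middleNames":{"value":"john","verified":true}},'
    '"hashedPid":"123","matchId":"_123","levelOfAssurance":"LEVEL_2"}'
)

# The pattern from the question, without the outer quotes that rex needs.
# It expects a quote before { and two quotes after it, which the data lacks.
original = r'\"firstName\":\"{\"\"value\":\"(?P<idaFName>[^\"]+)'

# The corrected pattern, again without the outer quotes.
# (?P<...>) is Python's named-group spelling (assumption noted above).
corrected = r'\"firstName\":\{\"value\":\"(?P<idaFName>[^\"]+)'

no_match = re.search(original, raw_event)
match = re.search(corrected, raw_event)

print(no_match)                 # None
print(match.group("idaFName"))  # John
```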
HeHe, so my answer would fit better because it shows both ways 😉 Never mind ... blop - a well-earned Flens 🙂 Yes, you can get them here too 😉
I'm not sure this fixes all your problems, but you have a backslash in your capturing group's character class. I doubt you want that, because you don't need to escape a double quote in there (see here for more info). The way this works at the moment is the regex arrives at the end of the name you want to capture (the double quote in the data) and checks whether this is not a backslash. As this works, it continues to match your character class - bad. (Edit: Oops, see note below)
Also, I figure the double quotes enclosing the term belong to your rex command in Splunk. Because your expression contains double quotes itself, you should change those outer ones to single quotes.
PS: Sorry, I was wrong there. Although not needed, you can escape a double quote inside a character class - in fact, the backslash is one of the few characters you have to escape, so the backslash in your character class does no harm in your case. It seems Tom and MuS have it right though.
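The correction in the PS can be checked with a small Python sketch: a character class with an escaped double quote and one with a bare double quote capture the same text. The sample string is a fragment of the event from the question, the variable names are hypothetical, and this only demonstrates the regex behaviour; inside a double-quoted rex string the inner quotes still need their backslashes.

```python
import re

# Fragment of the event from the original question.
text = '"firstName":{"value":"John","verified":true}'

# Character class with an escaped double quote, as in the thread's patterns.
escaped = re.search(r'"value":"([^\"]+)', text)

# Same character class with a bare double quote.
unescaped = re.search(r'"value":"([^"]+)', text)

# Both stop at the closing quote after the name.
print(escaped.group(1), unescaped.group(1))  # John John
```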
Hi @jeffland, thank you very much for this and for taking the time to reply to my post.
Kind Regards
Chris