Hi,
I wonder whether someone could help me please.
I'm trying to extract the first name from the data as shown below:
[{"name":{"current":{"firstName":"M","lastName":"SMITH"}},"ids":{"nino":"AA111111A"},"dateOfBirth":"26121973"}]
So I've put together the following rex:
rex field="detail.output-cid-response" "\"firstName\":\"(?<cidFName>[^\"]+)"
The problem I have is that although there is data there, it is not extracting the "cidFName" for all the records and to be honest I'm at a loss why.
Could someone perhaps shed some light on where I'm going wrong please.
Many thanks and kind regards
Chris
Can you try this one?
rex field="detail.output-cid-response" "firstName\\\":\\\"(?<cidFName>[^\\]+)\\"
I indexed your sample data and was able to use the following regexes to extract "JOHN" as the "firstName" field. One rex extracts from the _raw field as its source, and the other extracts from the detail.output-cid-response field as its source. Please see if either fits your needs:
| rex field=_raw "firstName[\\\]\":[\\\]\"(?<firstNameRaw>[^\\\]+)[\\\]"
| rex field="detail.output-cid-response" "firstName\":\"(?<firstName>[^\"]+)\""
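To see why the two sources need different patterns, here is a minimal sketch in Python rather than SPL. Python's `re` module stands in for the rex command, Python spells named groups `(?P<name>...)`, and Splunk's own search-string quoting adds a further layer of escaping that is not modelled here. The two strings are trimmed from the sample event in this thread; the variable names are illustrative only.

```python
import re

# Trimmed from the sample event: in _raw the inner JSON is stored as a string,
# so every inner double quote is preceded by a backslash.
raw_event = r'{"detail":{"output-cid-response":"[{\"name\":{\"current\":{\"firstName\":\"JOHN\",\"lastName\":\"SMITH\"}}}]"}}'

# The same data as the extracted field value: the backslashes are gone.
field_value = '[{"name":{"current":{"firstName":"JOHN","lastName":"SMITH"}}}]'

# Pattern for the field value: plain quotes around key and value.
field_pattern = re.compile(r'"firstName":"(?P<firstName>[^"]+)"')

# Pattern for the raw text: a literal backslash (written \\) before each quote,
# and the name runs until the next backslash.
raw_pattern = re.compile(r'firstName\\":\\"(?P<firstNameRaw>[^\\]+)\\')

field_match = field_pattern.search(field_value)
raw_match = raw_pattern.search(raw_event)
plain_on_raw = field_pattern.search(raw_event)  # plain pattern against raw text

print(field_match.group("firstName"))   # JOHN
print(raw_match.group("firstNameRaw"))  # JOHN
print(plain_on_raw)                     # None
```

The last line is the interesting one: a pattern written for unescaped quotes finds nothing in text where every quote carries a backslash.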
Eureka!!!!
@Wpreston, thank you for coming back to me with this; it is greatly appreciated. I've tried the queries you kindly provided and this one worked: | rex field=_raw "firstName[\\\]\":[\\\]\"(?<firstNameRaw>[^\\\]+)[\\\]"
So that I can learn from this, could I ask please what the '[' and ']' do?
Many thanks and kind regards
Chris
The "[" and "]" characters are used to make a regular expression character class. They are typically used when you want to match one of several characters. Consider a commonly misspelled word like "separate". If you were looking for all instances of this word, you might want to make allowances for people who spelled it "seperate" as well. Using a character class, your regex would be sep[ae]rate.
Regular-expressions.info has a good write up on character classes and could explain them much better than I can. Glad that this worked for you!
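As a small illustration of the character class idea, here is a sketch using Python's `re` module (not Splunk's rex). The word list and the sample string are made up for the example.

```python
import re

# [ae] matches exactly one character that is either "a" or "e".
pattern = re.compile(r"sep[ae]rate")

words = ["separate", "seperate", "seporate"]
matches = [w for w in words if pattern.fullmatch(w)]
print(matches)  # ['separate', 'seperate']

# A negated class such as [^"] matches any single character except a double
# quote, which is how the patterns in this thread stop at the end of the name.
name = re.search(r'"firstName":"([^"]+)', '"firstName":"JOHN","lastName":"SMITH"')
print(name.group(1))  # JOHN
```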
Hi @wpreston, thank you very much for the explanation and for the link, which is a really great article.
Kind Regards
Chris
Hi @wpreston, thank you for this.
Unfortunately, when I run this I receive this error:
Error in 'rex' command: Encountered
the following error while compiling
the regex
'firstName\":\"(?[^]+)\':
Regex: \ at end of pattern
Many thanks and kind regards
Chris
Hmm, ok how about this one?
rex field=detail.output-cid-response "firstName.\":.\"(?<NewField>.+)[\\\]\","
Hi IRHM73, that either means that the regex isn't valid for all values of the "detail.output-cid-response" field, or that the "detail.output-cid-response" field doesn't exist for all events.
I would run the regex over _raw, which is the field the rex command uses by default when no field is specified.
So, in that way, try running
rex "\"firstName\":\"(?<cidFName>[^\"]+)"
If that doesn't pull all the cidFName fields as you would expect, post the _raw for the events where the field isn't extracting properly.
Please let me know how this works! 😄
Hi @muebel, thank you for taking the time to reply to my post.
I tried the query you kindly sent but found I had to put 'rex field....' in front.
Unfortunately, the details are still missing on some of the records, including the one whose raw event is shown below:
{"auditSource":"matching","auditType":"TxSucceeded","eventId":"cc642788","tags":{"X-Request-ID":"uke83d","transactionName":"Search"},"detail":{"output-cid-response":"[{\"name\":{\"current\":{\"firstName\":\"JOHN\",\"lastName\":\"SMITH\"}},\"ids\":{\"nino\":\"AA111111A\"},\"dateOfBirth\":\"26121973\"}]","output-cycle":"CYCLE3","output-matching-time-in-millis":"120","input-searchRequest":"IncomingSearchRequest(Some(AA111111A),Some(John),Some(Smith),Some(1973-12-26))","output-errors":"[]","output-result":"match found","input-nino":"AA111111A"},"generatedAt":"2015-10-20T20:04:14.728Z"}
I can confirm that detail.output-cid-response is present in all records and, as far as I can see, the records are exactly the same apart from differing usernames, NINOs, etc.
Many thanks and kind regards
Chris
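The raw event above is JSON whose "output-cid-response" value is itself a JSON document stored as a string, which is where the backslash-escaped quotes come from. The following sketch uses Python's `json` module, not Splunk and not a regex, purely to make that nesting visible; the event is trimmed from the one posted above and the variable names are illustrative.

```python
import json

# Trimmed copy of the posted raw event. The inner JSON is a string value,
# so its double quotes appear as \" in the raw text.
raw_event = r'{"auditSource":"matching","detail":{"output-cid-response":"[{\"name\":{\"current\":{\"firstName\":\"JOHN\",\"lastName\":\"SMITH\"}},\"ids\":{\"nino\":\"AA111111A\"},\"dateOfBirth\":\"26121973\"}]","output-result":"match found"}}'

# First decode: the outer event. The field value comes back as a plain string
# with the backslashes removed.
event = json.loads(raw_event)
inner_text = event["detail"]["output-cid-response"]

# Second decode: the string is itself JSON (a list with one record).
inner = json.loads(inner_text)
first_name = inner[0]["name"]["current"]["firstName"]

print(inner_text)
print(first_name)  # JOHN
```

This suggests a pattern has to account for the backslashes when it runs against the raw text, and must not expect them when it runs against the decoded field value.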
Are the double quotes escaped within the _raw of all the events? In that case, try escaping the slashes as well:
rex field=_raw "\\\"firstName\\\":\\\"(?<cidFName>[^\"]+)"
Hi I really appreciate you coming back to me with this.
In answer to your question, the double quotes are escaped in all the raw events.
I tried the query you provided, but unfortunately I receive the following error:
Error in 'SearchParser': Missing a
search command before '^'. Error at
position '470' of search query 'search
index=main auditSource="matching"
auditType...{snipped} {errorcontext =
Name":"(?[^"]+)" | e}'.
Many thanks and kind regards
Chris
If any event has two names in this field, it is better to use this:
firstName\":\"(?P<cidFname>.*?)\"
or
firstName\":\"(?P<cidFname>[\w\s]+)\"
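A quick sketch of how these alternatives behave on a two-word name, using Python's `re` module as a stand-in for rex. The value "MARY ANN" is hypothetical and does not come from the thread's data.

```python
import re

# Hypothetical record with two first names (an assumption for illustration).
text = '{"firstName":"MARY ANN","lastName":"SMITH"}'

# Lazy match: take as little as possible up to the next double quote.
lazy = re.search(r'firstName":"(?P<cidFname>.*?)"', text)

# Character class of word characters and whitespace, then the closing quote.
word_space = re.search(r'firstName":"(?P<cidFname>[\w\s]+)"', text)

# Word characters only, no closing quote: stops at the first space.
word_only = re.search(r'firstName":"(?P<cidFname>\w+)', text)

print(lazy.group("cidFname"))        # MARY ANN
print(word_space.group("cidFname"))  # MARY ANN
print(word_only.group("cidFname"))   # MARY
```

Note that `[\w\s]+` would not match names containing hyphens or apostrophes, whereas the lazy form accepts any character up to the next quote.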
Hi @renatobamorim, thank you for taking the time to come back to me with this.
I must admit I wasn't quite sure what to do with the query you kindly sent but using the snippet as the following:
rex field="detail.output-cid-response" ""firstName":"(?.*?)""
I receive this error:
Error in 'rex' command: Encountered
the following error while compiling
the regex 'firstName:(?.*?)': Regex:
unrecognized character after (? or (?-
I had to double the quotation marks; otherwise I receive an unbalanced quotes error message.
Many thanks and kind regards
Chris
You'll need to escape the double quote, like this:
rex field="detail.output-cid-response" "\"firstName\":\"(?P<field_name>.*?)\""
or
rex field="detail.output-cid-response" "\"firstName\":\"(?P<field_name>[^\"]+)"
Hi, thank you for clarifying how to use the queries.
Unfortunately there was no change using firstName\":\"(?P.*?)\"
and firstName\":\"(?P[\w\s]+)\""
didn't extract any information.
I then tried:
firstName\":\"(?P<cidFname>.*?)\"
firstName\":\"(?P<cidFname>[\w\s]+)\"
\"firstName\":\"(?<cidFName>[^\"]+)"
All with rex field=_raw, and unfortunately these did not extract any of the information.
Many thanks and kind regards
Chris
try this....
\"firstName\":\"(?<cidFName>[\w]+)"
Hi @krish3, thank you for taking the time to reply to my post,
I've tried the query you kindly provided, but unfortunately this hasn't made any difference.
Many thanks and kind regards
Chris
Can you please share the value of the field detail.output-cid-response?
Hi @krish3, my apologies for not making this clear, but detail.output-cid-response is the raw data shown in my initial post, i.e. [{"name":{"current":{"firstName":"CHRIS","lastName":"SMITH"}},"ids":{"nino":"AA111111A"},"dateOfBirth":"26121973"}]
Many thanks and kind regards
Chris
Can you post a few more lines of your logs? I do not see any issues with the regex pattern.