Splunk Search

Rex search for a specific pattern

nisheethbaxi
Loves-to-Learn

I have a Splunk query whose events contain the following text in the message field:

"message":"sypher:[tokenized] build successful -\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\}}"

I need to extract the value ABC123XYZ, which sits between account_id\":\" and \",\"activity. I tried the following query, but it returns no data.

index=prod_logs app_name="abc" 
| rex field=_raw "account_id\\\"\:\\\"(?<accid>[^\"]+)\\\"\,\\\"activity"
| where isnotnull (accid)
| table accid

 


yuanliu
SplunkTrust

Your data illustration strongly suggests that it is part of a JSON event like this:

 

 

{"message":"sypher:[tokenized] build successful -\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}", "some_field":"somevalue", "some_other_field": "morevalue"}

 

 

In this case, Splunk should have given you a field named "message"  that has this value: 

 

 

"message":"sypher:[tokenized] build successful -\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}"

 

 

What the developer is trying to do is to embed more data in this field, partially also in JSON.  For long-term maintainability, it is best not to treat that as text, either.  This means that regex is not the right tool for the job.  Instead,  try to get the embedded JSON first.

There is just one problem (in addition to the missing closing double quote for the time value): the string \xxxxy is illegal in JSON.  If this is the real data, Splunk would have bailed and NOT given you a field named "message".  In that case, you will have to deal with that first.  Let's explore how later.

For now, suppose your data is actually

 

{"message":"sypher:[tokenized] build successful -\\\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}", "some_field":"somevalue", "some_other_field": "morevalue"}

 

As such, Splunk would have given you a value for message like this:

 

sypher:[tokenized] build successful -\xxxxy {"data":{"account_id":"ABC123XYZ","activity":{"time":"2024-05-31T12:37:25Z"}}

 

Consequently, all you need to do is

 

| eval jmessage = replace(message, "^[^{]+", "")
| spath input=jmessage

 

You will get the following fields

data.account_id   data.activity.time     some_field   some_other_field
ABC123XYZ         2024-05-31T12:37:25Z   somevalue    morevalue
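
If only the account ID is needed, as in the original question, the same idea can be written with the eval form of spath instead of extracting every field. This is a sketch, assuming the message field was extracted as described above; accid is simply the field name used in the original query.

```spl
| eval jmessage = replace(message, "^[^{]+", "")
``` jmessage now starts at the first opening brace of the embedded JSON ```
| eval accid = spath(jmessage, "data.account_id")
| where isnotnull(accid)
| table accid
```

Events where the embedded JSON has no data.account_id key are dropped by the where clause.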

Here is an emulation of the "correct" data you can play with and compare with real data

 

| makeresults
| eval _raw = "{\"message\":\"sypher:[tokenized] build successful -\\\xxxxy {\\\"data\\\":{\\\"account_id\\\":\\\"ABC123XYZ\\\",\\\"activity\\\":{\\\"time\\\":\\\"2024-05-31T12:37:25Z\\\"}}\", \"some_field\":\"somevalue\", \"some_other_field\": \"morevalue\"}"
| spath
``` data emulation above ```

 

Now, if your raw data indeed contains \xxxxy inside a JSON block, you can still rectify that with text manipulation so you get a legal JSON.  But you have to tell your developer that they are logging bad JSON. (Recently there was a case where an IBM mainframe plugin sent Splunk bad data like this.  It is best for the developer to fix this kind of problem.)
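
One way to do that text manipulation, as a sketch only: remove the stray backslash from _raw before calling spath. The emulation below is the same as the one above except that the event carries the single, illegal backslash. The pattern assumes the offending sequence is literally a hyphen followed by \x, as in the sample; real data will need its own pattern. Inside eval, a literal backslash in the regex has to be written as four backslashes, because eval unescapes the string once before the regex engine sees it.

```spl
| makeresults
| eval _raw = "{\"message\":\"sypher:[tokenized] build successful -\\xxxxy {\\\"data\\\":{\\\"account_id\\\":\\\"ABC123XYZ\\\",\\\"activity\\\":{\\\"time\\\":\\\"2024-05-31T12:37:25Z\\\"}}\", \"some_field\":\"somevalue\", \"some_other_field\": \"morevalue\"}"
``` emulation of the malformed event above; the single backslash before x is an assumption ```
| eval _raw = replace(_raw, "-\\\\x", "-x")
``` _raw no longer contains the illegal escape, so spath can parse the outer JSON ```
| spath
| eval jmessage = replace(message, "^[^{]+", "")
| spath input=jmessage
```

This only repairs events at search time. The logging side still needs to be fixed for the data to be legal JSON at the source.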


nisheethbaxi
Loves-to-Learn

I tried both expressions and got the same error with each:

regex 'account_id\\":\\"(?<account_id>[^\]+"activity)': Regex: missing terminating ] for character class.
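
A possible reading of this error, offered as an assumption rather than confirmed behavior: the search parser reduces each \\ in the quoted pattern to a single \ before the regex engine sees it, so a class written as [^\\] arrives as [^\], where the backslash escapes the closing bracket. The pattern echoed in the error message shows exactly that class. Under that assumption, writing four backslashes inside the class keeps it terminated. A minimal sketch against an emulated event (the makeresults data is made up to mirror the sample and is not the real log):

```spl
| makeresults
| eval _raw = "{\"message\":\"sypher:[tokenized] build successful {\\\"data\\\":{\\\"account_id\\\":\\\"ABC123XYZ\\\",\\\"activity\\\":{\\\"time\\\":\\\"2024-05-31T12:37:25Z\\\"}}\"}"
``` emulated event above; _raw keeps a literal backslash before each inner double quote ```
| rex "account_id\\\\\":\\\\\"(?<accid>[^\\\\]+)"
``` the four backslashes inside the class are intended to mean any character except a backslash ```
| table accid
```

If the real events turn out not to contain literal backslashes, this pattern will not match, and parsing the embedded JSON with spath is the more robust route.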


ITWhisperer
SplunkTrust

Try it like this:

index=prod_logs app_name="abc" 
| rex field=_raw "account_id\\\"\:\\\"(?<accid>[^\\]+)\\\"\,\\\"activity"
| where isnotnull (accid)
| table accid

gcusello
SplunkTrust

Hi @nisheethbaxi ,

if you're sure the backslashes are present in your logs, you could try this regex:

| rex "account_id\\\":\\\"(?<account_id>[^\\]+)"

that you can test at https://regex101.com/r/maaQBE/1

or the following (there is an issue with regexes in Splunk when the pattern contains a backslash):

| rex "account_id\\\\\":\\\\\"(?<account_id>[^\\]+)"

Ciao.

Giuseppe
