Splunk Search

Rex search for a specific pattern

nisheethbaxi
Loves-to-Learn

I have a Splunk query whose results contain the following text in the message field:

"message":"sypher:[tokenized] build successful -\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\}}"

I need to extract the value ABC123XYZ, which sits between account_id\":\" and \",\"activity. I tried the following query, but it returns no data.

index=prod_logs app_name="abc" 
| rex field=_raw "account_id\\\"\:\\\"(?<accid>[^\"]+)\\\"\,\\\"activity"
| where isnotnull (accid)
| table accid

 


yuanliu
SplunkTrust

Your data illustration strongly suggests that it is part of a JSON event like this:

 

 

{"message":"sypher:[tokenized] build successful -\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}", "some_field":"somevalue", "some_other_field": "morevalue"}

 

 

In this case, Splunk should have given you a field named "message"  that has this value: 

 

 

"message":"sypher:[tokenized] build successful -\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}"

 

 

The developer is embedding more data in this field, part of it also in JSON. For long-term maintainability, it is best not to treat that embedded JSON as plain text, which means regex is not the right tool for the job. Instead, extract the embedded JSON first.

There is just one problem (in addition to the missing closing double quote on the time value): the sequence \xxxxy is illegal inside a JSON string, because \x is not a valid JSON escape. If this is the real data, Splunk would have bailed and NOT given you a field named "message". In that case, you will have to deal with that first. Let's explore how later.

For now, suppose your data is actually

 

{"message":"sypher:[tokenized] build successful -\\\xxxxy {\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}", "some_field":"somevalue", "some_other_field": "morevalue"}

 

As such, Splunk would have given you a value for message like this:

 

sypher:[tokenized] build successful -\xxxxy {"data":{"account_id":"ABC123XYZ","activity":{"time":"2024-05-31T12:37:25Z"}}

 

Consequently, all you need to do is

 

| eval jmessage = replace(message, "^[^{]+", "")
| spath input=jmessage

 

You will get the following fields

data.account_id   data.activity.time     some_field   some_other_field
ABC123XYZ         2024-05-31T12:37:25Z   somevalue    morevalue
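
The same two steps can be checked outside Splunk. Below is a minimal Python sketch, not Splunk code: it strips everything before the first { (the same pattern used in replace() above) and parses what remains as JSON. The variable names are illustrative. It assumes the embedded JSON is complete; the sample as posted appears to be one closing brace short, which a strict parser such as Python's json module would reject, so the sketch adds the brace.

```python
import json
import re

# Value of the "message" field as Splunk would present it (hypothetical sample).
# One closing brace is added here so that the embedded JSON is balanced.
message = (
    r'sypher:[tokenized] build successful -\xxxxy '
    r'{"data":{"account_id":"ABC123XYZ",'
    r'"activity":{"time":"2024-05-31T12:37:25Z"}}}'
)

# Step 1: drop everything before the first "{", like replace(message, "^[^{]+", "").
jmessage = re.sub(r"^[^{]+", "", message)

# Step 2: parse the remainder as JSON, the rough equivalent of spath input=jmessage.
embedded = json.loads(jmessage)

account_id = embedded["data"]["account_id"]
event_time = embedded["data"]["activity"]["time"]
print(account_id, event_time)
```

If the prefix before the JSON could itself contain a {, the ^[^{]+ pattern would cut in the wrong place, so it is worth confirming that against real events.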

Here is an emulation of the "correct" data that you can play with and compare with your real data:

 

| makeresults
| eval _raw = "{\"message\":\"sypher:[tokenized] build successful -\\\xxxxy {\\\"data\\\":{\\\"account_id\\\":\\\"ABC123XYZ\\\",\\\"activity\\\":{\\\"time\\\":\\\"2024-05-31T12:37:25Z\\\"}}\", \"some_field\":\"somevalue\", \"some_other_field\": \"morevalue\"}"
| spath
``` data emulation above ```

 

Now, if your raw data indeed contains \xxxxy inside a JSON block, you can still rectify that with text manipulation so that you get legal JSON.  But you have to tell your developer that they are logging bad JSON. (Recently there was a case where an IBM mainframe plugin sent Splunk bad data like this.  It is best for the developer to fix this kind of problem.)
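
One possible form of that text manipulation, sketched in Python rather than SPL so that it can be run standalone: double every backslash that does not begin a valid JSON escape, then parse. The helper name fix_bad_escapes is hypothetical, the sample event is the one from this thread with a closing brace added so that it is balanced, and the sketch does not validate the four hex digits that must follow \u.

```python
import json
import re

# JSON allows only these characters directly after a backslash inside a string.
VALID_ESCAPE_CHARS = '"\\/bfnrtu'

def fix_bad_escapes(raw: str) -> str:
    """Double the backslash of any escape that JSON does not define (hypothetical helper)."""
    def repair(match: re.Match) -> str:
        follower = match.group(1)
        if follower in VALID_ESCAPE_CHARS:
            return match.group(0)      # legal escape such as \" stays untouched
        return "\\\\" + follower       # \x becomes \\x: a literal backslash plus x
    # Consume backslash pairs left to right so an already escaped "\\" is kept intact.
    return re.sub(r"\\(.)", repair, raw)

raw_event = (
    r'{"message":"sypher:[tokenized] build successful -\xxxxy '
    r'{\"data\":{\"account_id\":\"ABC123XYZ\",'
    r'\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}}", '
    r'"some_field":"somevalue"}'
)

try:
    json.loads(raw_event)
    parses_as_is = True
except json.JSONDecodeError:
    parses_as_is = False               # the \x escape is rejected

event = json.loads(fix_bad_escapes(raw_event))
embedded = json.loads(re.sub(r"^[^{]+", "", event["message"]))
print(parses_as_is, embedded["data"]["account_id"])
```

The repair only makes the event parseable; the cleaner fix remains having the developer emit valid JSON in the first place.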


nisheethbaxi
Loves-to-Learn

I tried both expressions and get the same error with each:

regex 'account_id\\":\\"(?<account_id>[^\]+"activity)': Regex: missing terminating ] for character class.


ITWhisperer
SplunkTrust

Try it like this:

index=prod_logs app_name="abc" 
| rex field=_raw "account_id\\\"\:\\\"(?<accid>[^\\]+)\\\"\,\\\"activity"
| where isnotnull (accid)
| table accid

gcusello
SplunkTrust

Hi @nisheethbaxi ,

if you're sure the backslashes are really present in your logs, you could try this regex:

| rex "account_id\\\":\\\"(?<account_id>[^\\]+)"

You can test it at https://regex101.com/r/maaQBE/1

or the following (Splunk has an issue with regexes that contain backslashes):

| rex "account_id\\\\\":\\\\\"(?<account_id>[^\\]+)"
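
The layers of escaping are easier to see outside Splunk's quoting. The following is a minimal Python sketch, not SPL, showing the pattern as the regex engine itself needs to see it: a literal backslash is written \\, both before each quote and inside the character class. A class written [^\] fails to compile because \] is read as an escaped bracket, which matches the error reported earlier in this thread. Python's re module stands in for Splunk's PCRE here, (?P<name>...) is Python's spelling of a named group, and how many extra backslashes the SPL string literal needs on top of this is not covered.

```python
import re

# Fragment of the raw event with literal backslashes before each quote (sample from the thread).
raw = r'{\"data\":{\"account_id\":\"ABC123XYZ\",\"activity\":{\"time\":\"2024-05-31T12:37:25Z\"}}'

# Engine-level pattern: \\ matches one literal backslash, [^\\]+ stops at the next backslash.
pattern = r'account_id\\":\\"(?P<account_id>[^\\]+)\\",\\"activity'
match = re.search(pattern, raw)
account_id = match.group("account_id") if match else None

# With a single backslash in the class, \] escapes the bracket and the class never closes.
try:
    re.compile(r'account_id\\":\\"(?P<account_id>[^\]+)')
    broken_compiles = True
except re.error:
    broken_compiles = False

print(account_id, broken_compiles)
```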

Ciao.

Giuseppe
