Splunk Search

Help with Regex extraction

siksaw33
Path Finder
2023-01-09T16:46:00.780076351Z app_name=default-java environment=e3 ns=one pod_container=default-java pod_name=default stream=stdout message={"name":"com","timestamp":"2023-01-09T16:46:00.779Z","level":"info","schemaVersion":"0.1","application":{"name":"com ","version":"1.2.5"},"request":{"address":{"uri":"Read/1.2.5"},"metadata":{"one-data-correlation-id":"d5d3 ","one-data-trace-id":"0be"}},"message":"Parent Function Address: Read, Request identifier: d5d35c6e-3661-4445-bbe4-f5a3f382d035, REQUEST-RECEIVED: {\"requestIdentifier\":\"d5 \",\"clientIdentifier\":\"CUST \",\"locale\":\"en-US\",\"userId\":\"lkapla\",\"accountNumber\":\"1234\",\"treatmentsFilter\":[\"targeted\",\"messages\"],\"callerType\":\"ADDTL\",\"cancelType\":\"\",\"handle\":\"gsp00a79e6b_b610_3407_90fa_11d5417c0b7f\",\"callTimeStamp\":\"1/9/2023 9:46:00 AM\",\"callIdentifier\":\"01091\",\"geoTelIdentifier\":\"04ba\"}, "}

 

How can I extract the time, userId, and clientIdentifier into a table?

 


yuanliu
SplunkTrust

Similar to your other question, please post JSON objects in code blocks because some combinations turn into smileys.  As I said there, try not to treat JSON objects like text strings.  Use SPL's built-in capabilities to deal with structured data.

With your raw logs, Splunk should already have extracted the field "message". Inside that field there is a JSON node that is also named "message", and spath does not seem to work well with the duplicate name. So, we'll rename the Splunk field "message" first.

 

 

| rename message AS data
| spath input=data
| eval REQUEST_RECEIVED = replace(message, ".*, REQUEST-RECEIVED: ", "")
| spath input=REQUEST_RECEIVED
| fields - REQUEST_RECEIVED data message

 

 

Your sample data, after correction for smileys, gives the output below, which contains multiple time fields as well as other data about the request.

Field                                       Value
accountNumber                               1234
app_name                                    default-java
application.name                            com
application.version                         1.2.5
callIdentifier                              01091
callTimeStamp                               1/9/2023 9:46:00 AM
callerType                                  ADDTL
cancelType                                  (empty)
clientIdentifier                            CUST
environment                                 e3
geoTelIdentifier                            04ba
handle                                      gsp00a79e6b_b610_3407_90fa_11d5417c0b7f
level                                       info
locale                                      en-US
name                                        com
ns                                          one
pod_container                               default-java
pod_name                                    default
request.address.uri                         Read/1.2.5
request.metadata.one-data-correlation-id    d5d3
request.metadata.one-data-trace-id          0be
requestIdentifier                           d5
schemaVersion                               0.1
stream                                      stdout
timestamp                                   2023-01-09T16:46:00.779Z
treatmentsFilter{}                          targeted, messages
userId                                      lkapla
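
To end up with only the three fields the question asks for, the final "fields -" line can be swapped for a table command. This is a minimal sketch, not part of the original answer: it assumes "timestamp" (from the JSON payload) is the time of interest; _time or callTimeStamp could be substituted if the event time or the caller's timestamp is what you need.

```spl
| rename message AS data
| spath input=data
| eval REQUEST_RECEIVED = replace(message, ".*, REQUEST-RECEIVED: ", "")
| spath input=REQUEST_RECEIVED
| table timestamp userId clientIdentifier
```

The first four lines are unchanged from the search above; only the last line differs, keeping the three requested columns instead of dropping the helper fields.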

 


siksaw33
Path Finder

Thank you so much @yuanliu 


gcusello
SplunkTrust

Hi @siksaw33,

this looks like JSON, so first try the spath command (https://docs.splunk.com/Documentation/Splunk/9.0.3/SearchReference/Spath), which automatically extracts all the fields.

Otherwise, you can use this regex:

| rex "^(?<time>[^ ]+).*clientIdentifier\\\":(?<clientIdentifier>[^,]+).*userId\\\":(?<userId>[^,]+)"

that you can test at https://regex101.com/r/Mb2Z3z/1

Ciao.

Giuseppe

siksaw33
Path Finder

FYI I used rex field=_raw "userId\\\\\":\\\\\"(?<userId>[a-z]+)"  for this
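
If both fields are needed through rex, the same escaping can be repeated for clientIdentifier. This is only a sketch under stated assumptions: it copies the five-backslash pattern from the post above, assumes both values consist of word characters (as they do in the sample event), and uses Splunk's default _time field as the time column.

```spl
| rex field=_raw "clientIdentifier\\\\\":\\\\\"(?<clientIdentifier>\w+)"
| rex field=_raw "userId\\\\\":\\\\\"(?<userId>\w+)"
| table _time clientIdentifier userId
```

Using two separate rex commands means the extraction does not depend on the order in which the two keys appear in the event.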
