Rex Field Extraction

parthiban
Path Finder

Hi @All,

I want to extract the correlation_id from the payload below. Can anyone help me write the rex command?

{"message_type": "INFO", "processing_stage": "Deleted message from queue", "message": "Deleted message from queue", "correlation_id": "['321e2253-443a-41f1-8af3-81dbdb8bcc77']", "error": "", "invoker_agent": "arn:aws:sqs:eu-central-1:981503094308:prd-ccm-incontact-ingestor-queue-v1", "invoked_component": "prd-ccm-incontact-ingestor-v1", "request_payload": "", "response_details": "{'ResponseMetadata': {'RequestId': 'a04c3e82-fe3a-5986-b61c-6323fd295e18', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'a04c3e82-fe3a-5986-b61c-6323fd295e18', 'x-amzn-trace-id': 'Root=1-652700cc-f7ed3cf574ce28da63f6625d;Parent=865f4dad6eddf3c1;Sampled=1', 'date': 'Wed, 11 Oct 2023 20:08:51 GMT', 'content-type': 'text/xml', 'content-length': '215', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}", "invocation_timestamp": "2023-10-11T20:08:51Z", "response_timestamp": "2023-10-11T20:08:51Z", "original_source_app": "YMKT", "target_idp_application": "", "retry_attempt": "1", "custom_attributes": {"entity-internal-id": "", "root-entity-id": "", "campaign-id": "", "campaign-name": "", "marketing-area": "", "lead-id": "", "record_count": "1", "country": ["India"]}}

parthiban
Path Finder

Hi @bowesmana, thank you for your response. I need a regular expression to extract the correlation_id because I want to calculate the average time taken for two source events. The samples I provided are as follows:

  • correlation_id: "['321e2253-443a-41f1-8af3-81dbdb8bcc77']"
  • correlation_id: "11315ad3-02a3-419d-a656-85972e07a1a5"

The logs come in two formats: one holds the value in array format and the other as a plain value. Thanks in advance.

yuanliu
SplunkTrust
The samples I provided are as follows:
  • correlation_id: "['321e2253-443a-41f1-8af3-81dbdb8bcc77']"
  • correlation_id: "11315ad3-02a3-419d-a656-85972e07a1a5"

The logs come in two formats: one holds the value in array format and the other as a plain value. Thanks in advance.


Did you forget to provide one of the samples you alluded to?  For the only sample you provided (if it is the raw event), Splunk would make these fields available to you:

field name                            field value
------------------------------------  -----------
correlation_id                        ['321e2253-443a-41f1-8af3-81dbdb8bcc77']
custom_attributes.campaign-id
custom_attributes.campaign-name
custom_attributes.country{}           India
custom_attributes.entity-internal-id
custom_attributes.lead-id
custom_attributes.marketing-area
custom_attributes.record_count        1
custom_attributes.root-entity-id
error
invocation_timestamp                  2023-10-11T20:08:51Z
invoked_component                     prd-ccm-incontact-ingestor-v1
invoker_agent                         arn:aws:sqs:eu-central-1:981503094308:prd-ccm-incontact-ingestor-queue-v1
message                               Deleted message from queue
message_type                          INFO
original_source_app                   YMKT
processing_stage                      Deleted message from queue
request_payload
response_details                      {'ResponseMetadata': {'RequestId': 'a04c3e82-fe3a-5986-b61c-6323fd295e18', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'a04c3e82-fe3a-5986-b61c-6323fd295e18', 'x-amzn-trace-id': 'Root=1-652700cc-f7ed3cf574ce28da63f6625d;Parent=865f4dad6eddf3c1;Sampled=1', 'date': 'Wed, 11 Oct 2023 20:08:51 GMT', 'content-type': 'text/xml', 'content-length': '215', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}
response_timestamp                    2023-10-11T20:08:51Z
retry_attempt                         1
target_idp_application

As you can see, there is only one correlation_id; the value 11315ad3-02a3-419d-a656-85972e07a1a5 is nowhere in this sample.  The field response_details contains a pseudo JSON that can be transformed into conformant JSON, but it also does not contain any embedded key named correlation_id nor any embedded value of 11315ad3-02a3-419d-a656-85972e07a1a5.

I also fail to see the significance of 11315ad3-02a3-419d-a656-85972e07a1a5 vs ['321e2253-443a-41f1-8af3-81dbdb8bcc77'].  In JSON, they are both just strings.  Neither of them is special.  As I mentioned earlier, it is best not to use regex on structured data like this.  As your sample event is conformant JSON, using Splunk's built-in JSON extraction is a lot more robust and saves a lot of headaches in future maintenance.
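
For anyone who wants to try this out, here is a minimal sketch of handling both value formats after the JSON has been parsed, rather than running a regex over the raw text. It builds two synthetic events with makeresults; the second event is only an assumption about what the plain-value log looks like, since no raw event was posted for it. spath extracts correlation_id, and rex is then applied to that single field to pull out the UUID. The field names n and correlation_uuid are illustrative, not something from the original data.

```spl
| makeresults count=2
| streamstats count as n
``` build one event per format; the second payload is an assumed shape ```
| eval _raw=if(n=1, "{\"correlation_id\": \"['321e2253-443a-41f1-8af3-81dbdb8bcc77']\"}", "{\"correlation_id\": \"11315ad3-02a3-419d-a656-85972e07a1a5\"}")
``` let Splunk parse the JSON instead of matching the raw text ```
| spath correlation_id
``` pull the bare UUID out of the extracted value, with or without the [' '] wrapper ```
| rex field=correlation_id "(?<correlation_uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
| table correlation_id correlation_uuid
```

Limiting the regex to the already-extracted field keeps it short and means it cannot accidentally match one of the request IDs inside response_details.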

yuanliu
SplunkTrust

As I always caution people in this forum, do not treat structured data such as JSON as text.  Regex is usually not the right tool.

Is the illustrated JSON the raw event?  If so, Splunk should have given you a field named correlation_id of value ['321e2253-443a-41f1-8af3-81dbdb8bcc77'].  If it is part of a raw event that is compliant JSON, you need to show the full raw event - and Splunk should have given you a field named some_path.correlation_id.  If it is part of a raw event that is not JSON, you need to show the raw event so we can help you extract the JSON part, then you can use spath on the JSON part.  This is much more robust and maintainable than using regex on structured data.
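
To illustrate the last case, here is a rough sketch that assumes the raw event is a single line with some non-JSON text around one JSON object. That layout is hypothetical, since the full raw event has not been shown. rex is used only to isolate the JSON part into a field (json_part is an illustrative name), and spath then does the actual extraction.

```spl
``` capture everything from the first { to the last } in the event (assumed layout) ```
| rex "(?<json_part>\{.*\})"
``` parse the isolated JSON and extract the key by path ```
| spath input=json_part path=correlation_id output=correlation_id
```

If the real events carry more than one JSON object or span several lines, the capture pattern would need to change, so please treat this as a starting point only.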

bowesmana
SplunkTrust

If that is your _raw event, just do

| spath correlation_id

and it will give you the correlation_id field
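
If the goal is the average time between the two source events mentioned in this thread, something along these lines may work. It is a sketch under several assumptions: the base search (index, sourcetype) comes before these pipes and is not shown because it was not given here, _time holds the relevant timestamp for both sources, and each correlation_id appears in exactly two events. The names duration_seconds, event_count and avg_duration_seconds are illustrative.

```spl
| spath correlation_id
``` strip the [' and '] wrapper so both formats yield the same key ```
| eval correlation_id=trim(correlation_id, "[]'")
``` range(_time) is the gap in seconds between the earliest and latest event per id ```
| stats range(_time) as duration_seconds, count as event_count by correlation_id
``` keep only ids that were seen in both events ```
| where event_count=2
| stats avg(duration_seconds) as avg_duration_seconds
```

If the timing should come from invocation_timestamp or response_timestamp instead of _time, those fields would first have to be converted with strptime.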
