Splunk Search

How to fix field extraction issues with regex?

iamsplunker
Communicator

Hello Splunk Community, 

I'm trying to extract the following fields from CloudWatch events: 1) region, 2) arn, 3) startTime, 4) endTime, 5) eventTypeCode, and 6) latestDescription. The regex works fine in regex101, but it does not extract all of the field values in Splunk.

For example: | rex field=_raw "region":\s(?P<_region>"\w+-\w+-\d)"

The rex above extracts only the us-east-1 region, even though the data contains multiple regions. Please help me extract the fields listed above.

sample event:

2020-02-10T17:42:41.088Z 775ab4c6-ccc3-600b-9c84-124320628f00 {"records": [{"value": {"successfulSetoflog": [{"awsAccountId": "123456789123", "event": {"arn": "arn:aws:health:us-east-........................................................


jotne
Builder

For this type of data, you can use the extract command. To make it work, first remove everything before the first {. (That prefix can be saved to a field if needed.)

| makeresults
| eval _raw="2020-02-10T17:42:41.088Z 775ab4c6-ccc3-600b-9c84-124320628f00 {\"records\": [{\"value\": {\"successfulSetoflog\": [{\"awsAccountId\": \"123456789123\", \"event\": {\"arn\": \"arn:aws:health:us-east-1::event/RDS/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED_SOFTWARE_BACKUP_SCHEDULED\", \"eventTypeCategory\": \"scheduledChange\", \"region\": \"us-east-2\", \"startTime\": \"2020-01-20 04:33:00+00:00\", \"endTime\": \"2020-01-22 04:33:00+00:00\", \"lastUpdatedTime\": \"2020-02-22 02:05:17.689000+00:00\", \"statusCode\": \"current\", \"eventStatusCode\": \"NUMBER_SPECIFIC\"}, \"eventTypeCode\": \"AWS_DATABASE_SOFTWARE_UPDATE_AVAILABLE\", \"eventDescription\": {\"latestDescription\": \"We are contacting you to inform you that one or more of your Amazon authena instances listed in the 'Affected resources' tab are scheduled to receive maintenance on the mentioned hardware between 2020-03-10 04:33 UTC (thursday) and2020-03-10 07:33UTC (thursday). The exact time of the maintenance will be determined by the DB instance if you have any questions or concerns, contact AWS Premium Support. \n\nhttp://aws.amazon.com/support\"}}], \"failedSet\": [], \"ResponseMetatype\": {\"RequestId\": \"yz0c12d7-s44d-8b65-k883-f233rb4cb70c\", \"HTTPStatusCode\": 500, \"HTTPHeaders\": {\"x-amzn-requestid\": \"105ab4c6-ccc3-999b-9c84-999320628f00 \", \"context-type\": \"application/x-dvz-json-2.1\", \"content-length\": \"4000\", \"date\": \"Tue, 10 Jan 2020 11:11:11 GMT\"}, \"RetryAttempts\": 0}, \"detail-type\": \"AWS API Health Event\"}}]}"
| rex mode=sed "s/^[^{]+//"
| extract
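
If the prefix is worth keeping, one possible way to save it is to capture all three parts of the event with a single rex and hand only the JSON portion to spath. This is a sketch based on the sample event only; the field names log_time, request_id, and json are made up for illustration, and the pattern assumes the JSON sits on one line, as it does in the sample.

```spl
| rex "^(?<log_time>\S+)\s+(?<request_id>\S+)\s+(?<json>\{.+\})"
| spath input=json
```

With this form _raw stays untouched, and spath names each extracted field by its full path, for example records{}.value.successfulSetoflog{}.event.region.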

yuanliu
SplunkTrust

I always tell people: do not treat structured data as text. You'll regret it later. Use spath to unpack JSON; use mvexpand to flatten a JSON array.

 

| rex "^[^{]+ (?<data>{.+})"
| spath input=data path=records{}
| mvexpand records{}
| spath input=records{}
| spath input=records{} path=value.successfulSetoflog{}
| mvexpand value.successfulSetoflog{}
| spath input=value.successfulSetoflog{}
| fields - data records{} value.successfulSetoflog{}.* value.successfulSetoflog{}

 

The sample data will give you

awsAccountId | event.arn | event.endTime | event.eventStatusCode | event.eventTypeCategory | event.lastUpdatedTime | event.region | event.startTime | event.statusCode | eventDescription.latestDescription | eventTypeCode | value.ResponseMetatype.HTTPHeaders.content-length | value.ResponseMetatype.HTTPHeaders.context-type | value.ResponseMetatype.HTTPHeaders.date | value.ResponseMetatype.HTTPHeaders.x-amzn-requestid | value.ResponseMetatype.HTTPStatusCode | value.ResponseMetatype.RequestId | value.ResponseMetatype.RetryAttempts | value.detail-type
123456789123 | arn:aws:health:us-east-1::event/RDS/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED_SOFTWARE_BACKUP_SCHEDULED | 2020-01-22 04:33:00+00:00 | NUMBER_SPECIFIC | scheduledChange | 2020-02-22 02:05:17.689000+00:00 | us-east-2 | 2020-01-20 04:33:00+00:00 | current | We are contacting you to inform you that one or more of your Amazon authena instances listed in the 'Affected resources' tab are scheduled to receive maintenance on the mentioned hardware between 2020-03-10 04:33 UTC (thursday) and2020-03-10 07:33UTC (thursday). The exact time of the maintenance will be determined by the DB instance if you have any questions or concerns, contact AWS Premium Support. http://aws.amazon.com/support | AWS_DATABASE_SOFTWARE_UPDATE_AVAILABLE | 4000 | application/x-dvz-json-2.1 | Tue, 10 Jan 2020 11:11:11 GMT | 105ab4c6-ccc3-999b-9c84-999320628f00 | 500 | yz0c12d7-s44d-8b65-k883-f233rb4cb70c | 0 | AWS API Health Event
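
To see what the spath and mvexpand pair does in isolation, here is a smaller emulation you can paste into a search bar. It is only a sketch: the two-element records array is invented for illustration, and record is a hypothetical field name chosen via output= so the intermediate field does not carry the {} suffix.

```spl
| makeresults
| eval data = "{\"records\": [{\"region\": \"us-east-1\"}, {\"region\": \"us-west-2\"}]}"
| spath input=data path=records{} output=record
| mvexpand record
| spath input=record
| table region
```

The first spath puts each array element into the multivalue field record, mvexpand turns that into one result row per element, and the second spath unpacks each element, so the table should list one region per row rather than only the first one.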

iamsplunker
Communicator

@yuanliu Thanks for your response. Is the query you provided just an example?
Would you mind sharing an example query that unpacks the fields I highlighted in my question?


yuanliu
SplunkTrust

If you scroll to the right in the output above, you will notice that "arn" is the subnode event.arn, "region" is the subnode event.region, and so on; "eventTypeCode" is simply the node eventTypeCode, and "latestDescription" is the subnode eventDescription.latestDescription.

If you only want to see these, you can use the fields or table command to list them, e.g.,

 

| rex "^[^{]+ (?<data>{.+})"
| spath input=data path=records{}
| mvexpand records{}
| spath input=records{}
| spath input=records{} path=value.successfulSetoflog{}
| mvexpand value.successfulSetoflog{}
| spath input=value.successfulSetoflog{}
| fields - data records{} value.successfulSetoflog{}.* value.successfulSetoflog{} _time
| fields event.arn event.region event.startTime event.endTime eventTypeCode eventDescription.latestDescription

 

Your sample data will give you this listing (again, scroll to the right to see all fields):

event.arn | event.region | event.startTime | event.endTime | eventTypeCode | eventDescription.latestDescription
arn:aws:health:us-east-1::event/RDS/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED_SOFTWARE_BACKUP_SCHEDULED | us-east-2 | 2020-01-20 04:33:00+00:00 | 2020-01-22 04:33:00+00:00 | AWS_DATABASE_SOFTWARE_UPDATE_AVAILABLE | We are contacting you to inform you that one or more of your Amazon authena instances listed in the 'Affected resources' tab are scheduled to receive maintenance on the mentioned hardware between 2020-03-10 04:33 UTC (thursday) and2020-03-10 07:33UTC (thursday). The exact time of the maintenance will be determined by the DB instance if you have any questions or concerns, contact AWS Premium Support. http://aws.amazon.com/support
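
If you would rather have the short names from your question as column headers, one possible variation is to append a rename and a table to the search above. This is a sketch checked only against the field names in the sample event; the short names arn, region, startTime, endTime, and latestDescription are my assumption of what you want the columns to be called.

```spl
| rename event.* AS *, eventDescription.latestDescription AS latestDescription
| table arn region startTime endTime eventTypeCode latestDescription
```

The wildcard rename strips the event. prefix from every event.* field, and table then keeps just the six columns in the order listed.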

This is an emulation for you to play with and compare with real data:

 

| makeresults
| eval _raw = "2020-02-10T17:42:41.088Z 775ab4c6-ccc3-600b-9c84-124320628f00 {\"records\": [{\"value\": {\"successfulSetoflog\": [{\"awsAccountId\": \"123456789123\", \"event\": {\"arn\": \"arn:aws:health:us-east-1::event/RDS/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED_SOFTWARE_BACKUP_SCHEDULED\", \"eventTypeCategory\": \"scheduledChange\", \"region\": \"us-east-2\", \"startTime\": \"2020-01-20 04:33:00+00:00\", \"endTime\": \"2020-01-22 04:33:00+00:00\", \"lastUpdatedTime\": \"2020-02-22 02:05:17.689000+00:00\", \"statusCode\": \"current\", \"eventStatusCode\": \"NUMBER_SPECIFIC\"}, \"eventTypeCode\": \"AWS_DATABASE_SOFTWARE_UPDATE_AVAILABLE\", \"eventDescription\": {\"latestDescription\": \"We are contacting you to inform you that one or more of your Amazon authena instances listed in the 'Affected resources' tab are scheduled to receive maintenance on the mentioned hardware between 2020-03-10 04:33 UTC (thursday) and2020-03-10 07:33UTC (thursday). The exact time of the maintenance will be determined by the DB instance if you have any questions or concerns, contact AWS Premium Support. \\n\\nhttp://aws.amazon.com/support\"}}], \"failedSet\": [], \"ResponseMetatype\": {\"RequestId\": \"yz0c12d7-s44d-8b65-k883-f233rb4cb70c\", \"HTTPStatusCode\": 500, \"HTTPHeaders\": {\"x-amzn-requestid\": \"105ab4c6-ccc3-999b-9c84-999320628f00 \", \"context-type\": \"application/x-dvz-json-2.1\", \"content-length\": \"4000\", \"date\": \"Tue, 10 Jan 2020 11:11:11 GMT\"}, \"RetryAttempts\": 0}, \"detail-type\": \"AWS API Health Event\"}}]}"
``` data emulation above ```

 
