Hello Splunk Community,
I'm trying to extract the following fields from CloudWatch events: 1) region 2) arn 3) startTime 4) endTime 5) eventTypeCode 6) latestDescription. The regex works fine in regex101; however, it's not extracting all field values in Splunk.
For example: | rex field=_raw "region":\s(?P<_region>"\w+-\w+-\d)"
The above rex extracts only the us-east-1 region, although the data contains multiple regions. Please help me extract the fields mentioned above.
sample event:
2020-02-10T17:42:41.088Z 775ab4c6-ccc3-600b-9c84-124320628f00 {"records": [{"value": {"successfulSetoflog": [{"awsAccountId": "123456789123", "event": {"arn": "arn:aws:health:us-east-........................................................
For this type of data, you can use the extract command. To make it work, first remove everything before the first { (the removed prefix can be saved to a field if needed).
| makeresults
| eval _raw="2020-02-10T17:42:41.088Z 775ab4c6-ccc3-600b-9c84-124320628f00 {\"records\": [{\"value\": {\"successfulSetoflog\": [{\"awsAccountId\": \"123456789123\", \"event\": {\"arn\": \"arn:aws:health:us-east-1::event/RDS/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED_SOFTWARE_BACKUP_SCHEDULED\", \"eventTypeCategory\": \"scheduledChange\", \"region\": \"us-east-2\", \"startTime\": \"2020-01-20 04:33:00+00:00\", \"endTime\": \"2020-01-22 04:33:00+00:00\", \"lastUpdatedTime\": \"2020-02-22 02:05:17.689000+00:00\", \"statusCode\": \"current\", \"eventStatusCode\": \"NUMBER_SPECIFIC\"}, \"eventTypeCode\": \"AWS_DATABASE_SOFTWARE_UPDATE_AVAILABLE\", \"eventDescription\": {\"latestDescription\": \"We are contacting you to inform you that one or more of your Amazon authena instances listed in the 'Affected resources' tab are scheduled to receive maintenance on the mentioned hardware between 2020-03-10 04:33 UTC (thursday) and2020-03-10 07:33UTC (thursday). The exact time of the maintenance will be determined by the DB instance if you have any questions or concerns, contact AWS Premium Support. \n\nhttp://aws.amazon.com/support\"}}], \"failedSet\": [], \"ResponseMetatype\": {\"RequestId\": \"yz0c12d7-s44d-8b65-k883-f233rb4cb70c\", \"HTTPStatusCode\": 500, \"HTTPHeaders\": {\"x-amzn-requestid\": \"105ab4c6-ccc3-999b-9c84-999320628f00 \", \"context-type\": \"application/x-dvz-json-2.1\", \"content-length\": \"4000\", \"date\": \"Tue, 10 Jan 2020 11:11:11 GMT\"}, \"RetryAttempts\": 0}, \"detail-type\": \"AWS API Health Event\"}}]}"
| rex mode=sed "s/^[^{]+//"
| extract
I always tell people: do not treat structured data as text. You'll regret it later. Use spath to unpack JSON; use mvexpand to expand a JSON array into one row per element.
| rex "^[^{]+ (?<data>{.+})"
| spath input=data path=records{}
| mvexpand records{}
| spath input=records{}
| spath input=records{} path=value.successfulSetoflog{}
| mvexpand value.successfulSetoflog{}
| spath input=value.successfulSetoflog{}
| fields - data records{} value.successfulSetoflog{}.* value.successfulSetoflog{}
The sample data will give you
awsAccountId | event.arn | event.endTime | event.eventStatusCode | event.eventTypeCategory | event.lastUpdatedTime | event.region | event.startTime | event.statusCode | eventDescription.latestDescription | eventTypeCode | value.ResponseMetatype.HTTPHeaders.content-length | value.ResponseMetatype.HTTPHeaders.context-type | value.ResponseMetatype.HTTPHeaders.date | value.ResponseMetatype.HTTPHeaders.x-amzn-requestid | value.ResponseMetatype.HTTPStatusCode | value.ResponseMetatype.RequestId | value.ResponseMetatype.RetryAttempts | value.detail-type |
123456789123 | arn:aws:health:us-east-1::event/RDS/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED_SOFTWARE_BACKUP_SCHEDULED | 2020-01-22 04:33:00+00:00 | NUMBER_SPECIFIC | scheduledChange | 2020-02-22 02:05:17.689000+00:00 | us-east-2 | 2020-01-20 04:33:00+00:00 | current | We are contacting you to inform you that one or more of your Amazon authena instances listed in the 'Affected resources' tab are scheduled to receive maintenance on the mentioned hardware between 2020-03-10 04:33 UTC (thursday) and2020-03-10 07:33UTC (thursday). The exact time of the maintenance will be determined by the DB instance if you have any questions or concerns, contact AWS Premium Support. http://aws.amazon.com/support | AWS_DATABASE_SOFTWARE_UPDATE_AVAILABLE | 4000 | application/x-dvz-json-2.1 | Tue, 10 Jan 2020 11:11:11 GMT | 105ab4c6-ccc3-999b-9c84-999320628f00 | 500 | yz0c12d7-s44d-8b65-k883-f233rb4cb70c | 0 | AWS API Health Event |
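On the original symptom: rex stops after the first match in each event unless max_match is raised, so an event that carries several regions yields only one value. If a quick regex check is still wanted alongside the spath approach, a minimal sketch follows. The field name region (without the leading underscore) and the [^\"]+ value pattern are assumptions, not the original regex. Underscore-prefixed names are normally treated as internal fields and may be hidden in results, and double quotes inside the quoted regex need to be escaped.

```spl
| rex field=_raw max_match=0 "\"region\":\s*\"(?<region>[^\"]+)\"" ``` assumed pattern; max_match=0 keeps every match as a multivalue field ```
| table region
```

This only confirms which values are present; it does not keep each region paired with its own arn or timestamps, which is what the spath and mvexpand search does.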
@yuanliu Thanks for your response. Is the query you provided just an example?
Would you mind sharing an example query that unpacks the fields I highlighted in my question?
If you scroll to the right, you will notice that "arn" is a subnode event.arn, "region" is subnode event.region, and so on; "eventTypeCode" is just node eventTypeCode, and "latestDescription" is subnode eventDescription.latestDescription.
If you only want to see these, you can use the fields or table command to list them, e.g.:
| rex "^[^{]+ (?<data>{.+})"
| spath input=data path=records{}
| mvexpand records{}
| spath input=records{}
| spath input=records{} path=value.successfulSetoflog{}
| mvexpand value.successfulSetoflog{}
| spath input=value.successfulSetoflog{}
| fields - data records{} value.successfulSetoflog{}.* value.successfulSetoflog{} _time
| fields event.arn event.region event.startTime event.endTime eventTypeCode eventDescription.latestDescription
Your sample data will give you this listing (again, scroll to the right to see all fields):
event.arn | event.region | event.startTime | event.endTime | eventTypeCode | eventDescription.latestDescription |
arn:aws:health:us-east-1::event/RDS/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED_SOFTWARE_BACKUP_SCHEDULED | us-east-2 | 2020-01-20 04:33:00+00:00 | 2020-01-22 04:33:00+00:00 | AWS_DATABASE_SOFTWARE_UPDATE_AVAILABLE | We are contacting you to inform you that one or more of your Amazon authena instances listed in the 'Affected resources' tab are scheduled to receive maintenance on the mentioned hardware between 2020-03-10 04:33 UTC (thursday) and2020-03-10 07:33UTC (thursday). The exact time of the maintenance will be determined by the DB instance if you have any questions or concerns, contact AWS Premium Support. http://aws.amazon.com/support |
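To end up with the short field names used in the question, the dotted paths can be renamed before listing them. A minimal sketch, assuming it is appended to the search above; the target names on the right-hand side are illustrative choices, not something the data requires:

```spl
| rename "event.arn" AS arn, "event.region" AS region, "event.startTime" AS startTime, "event.endTime" AS endTime, "eventDescription.latestDescription" AS latestDescription ``` target names are illustrative ```
| table region arn startTime endTime eventTypeCode latestDescription
```

If the dotted names are kept instead, note that eval and where expressions need them in single quotes (for example 'event.region'); otherwise the dot is read as the concatenation operator.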
This is an emulation for you to play with and compare with real data:
| makeresults
| eval _raw = "2020-02-10T17:42:41.088Z 775ab4c6-ccc3-600b-9c84-124320628f00 {\"records\": [{\"value\": {\"successfulSetoflog\": [{\"awsAccountId\": \"123456789123\", \"event\": {\"arn\": \"arn:aws:health:us-east-1::event/RDS/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED/AWS_RDS_AURORA_SOFTWARE_BACKUP_SCHEDULED_SOFTWARE_BACKUP_SCHEDULED\", \"eventTypeCategory\": \"scheduledChange\", \"region\": \"us-east-2\", \"startTime\": \"2020-01-20 04:33:00+00:00\", \"endTime\": \"2020-01-22 04:33:00+00:00\", \"lastUpdatedTime\": \"2020-02-22 02:05:17.689000+00:00\", \"statusCode\": \"current\", \"eventStatusCode\": \"NUMBER_SPECIFIC\"}, \"eventTypeCode\": \"AWS_DATABASE_SOFTWARE_UPDATE_AVAILABLE\", \"eventDescription\": {\"latestDescription\": \"We are contacting you to inform you that one or more of your Amazon authena instances listed in the 'Affected resources' tab are scheduled to receive maintenance on the mentioned hardware between 2020-03-10 04:33 UTC (thursday) and2020-03-10 07:33UTC (thursday). The exact time of the maintenance will be determined by the DB instance if you have any questions or concerns, contact AWS Premium Support. \\n\\nhttp://aws.amazon.com/support\"}}], \"failedSet\": [], \"ResponseMetatype\": {\"RequestId\": \"yz0c12d7-s44d-8b65-k883-f233rb4cb70c\", \"HTTPStatusCode\": 500, \"HTTPHeaders\": {\"x-amzn-requestid\": \"105ab4c6-ccc3-999b-9c84-999320628f00 \", \"context-type\": \"application/x-dvz-json-2.1\", \"content-length\": \"4000\", \"date\": \"Tue, 10 Jan 2020 11:11:11 GMT\"}, \"RetryAttempts\": 0}, \"detail-type\": \"AWS API Health Event\"}}]}"
``` data emulation above ```
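If only the six requested fields matter, the two-stage unpacking can probably be collapsed by handing spath the full array path in one step. A sketch, assuming it is appended to the emulation above; log is a hypothetical scratch field name, and this variant skips the per-record metadata (value.ResponseMetatype and so on) that the longer search also extracts:

```spl
| rex "^[^{]+ (?<data>{.+})"
| spath input=data path=records{}.value.successfulSetoflog{} output=log ``` log is a hypothetical field name ```
| mvexpand log ``` one row per successfulSetoflog entry ```
| spath input=log
| table event.region event.arn event.startTime event.endTime eventTypeCode eventDescription.latestDescription
```

Because each successfulSetoflog entry becomes its own row before the second spath runs, region, arn and the timestamps stay aligned per entry even when one event carries several of them.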