When using the Splunk Add-on for AWS, we're observing that events for sourcetype aws:cloudwatch:guardduty are not all parsed the same way. Some events have _raw beginning with {"version":"0",... and others begin with {"schemaVersion":"2.0",... . The events that begin with "version" appear to contain the same event data as the "schemaVersion" events, but the data is nested inside a JSON field named "detail".
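For illustration, here are trimmed, hypothetical examples of the two shapes (field names follow the usual CloudWatch Events envelope and GuardDuty finding layout; the values are placeholders, not real data):
{"version":"0","id":"...","detail-type":"GuardDuty Finding","source":"aws.guardduty","account":"...","time":"...","region":"...","resources":[],"detail":{"schemaVersion":"2.0","accountId":"...","type":"...",...}}
{"schemaVersion":"2.0","accountId":"...","region":"...","id":"...","type":"...",...}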
How to fix this?
The problem is with the props/transforms. There's an extraction in the Splunk Add-on for AWS' default transforms.conf:
[extract_detail_from_cloudwatch_events]
DEST_KEY = _raw
REGEX = ^{[^{]+"detail":(.*)}$
FORMAT = $1
PROBLEM:
All the events are formatted the same in the raw source from AWS; however, the transform that rewrites _raw (extracting the nested "detail" JSON field and removing the outer wrapper) is not being applied to every event. This is because some GuardDuty events can be large (up to about 20k characters), while the transform runs with the default LOOKAHEAD of 4k (4096) characters. You can confirm the effective LOOKAHEAD with btool: ./splunk btool transforms list --debug extract_detail_from_cloudwatch_events
SOLUTION:
Create a transforms.conf in the add-on's local folder and add the following stanza:
[extract_detail_from_cloudwatch_events]
LOOKAHEAD = 20480
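In case it helps, here's a minimal sketch of the override file. I'm assuming the add-on's app directory is the usual Splunk_TA_aws and a default install path; adjust for your deployment, and place it on whichever instance performs index-time parsing (indexers or heavy forwarders):
# $SPLUNK_HOME/etc/apps/Splunk_TA_aws/local/transforms.conf  (assumed path; adjust as needed)
[extract_detail_from_cloudwatch_events]
LOOKAHEAD = 20480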
This tells Splunk to scan an event up to the LOOKAHEAD length (20k in this case) when applying the transform. Without it, the end of the "detail" JSON field isn't read in full for large events, the regex doesn't match, and the extraction doesn't happen.
You can find the longest GuardDuty event in your data by searching for your GuardDuty events and appending:
| eval raw_length = len(_raw) | stats max(raw_length)
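For example, as a complete search (the index is a placeholder; the sourcetype is the one from above):
index=<YOUR_aws_index> sourcetype=aws:cloudwatch:guardduty
| eval raw_length = len(_raw)
| stats max(raw_length) AS longest_event_chars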
Or, if you're curious about the distribution of event sizes, use a visualization:
index=<YOUR_aws_index> sourcetype=<YOUR_guardduty_sourcetype> (version=* OR schemaVersion=*)
| eval raw_length_kb=ceil(len(_raw)/1024) | stats count by raw_length_kb | sort raw_length_kb
I'd classify this as a bug: the Splunk Add-on for AWS should ship with an appropriate LOOKAHEAD value defined for this transform. The largest event I've seen in a recent 90-day search is around 20k characters, hence the value of 20480.
Thanks to @ashajambagi for figuring out this issue together.