@jcoates i think there are actually multiple things going on here. we found the following issue when migrating from a homegrown aggregator with a single queue to a queue that serves up notifications from tons of S3 buckets.
i can confirm this was an issue in both 1.1.1 and 2.0 versions of the TA.
records would download fine in most cases, but then we'd see a stacktrace before any events were ingested. we think it was possibly treating the directory / folder itself as a cloudtrail log, and just giving up.
2015-10-28 01:59:04,521 INFO pid=10768 tid=MainThread file=aws_cloudtrail.py:process_S3_notifications:453 | fetched 31 records, wrote 31, discarded 0, redirected 0 from s3:foo-cloudtrails/nnnnnnnn/folder/yyyyyyyyyy/CloudTrail/region/2015/10/28/aaaaaaaaaaaa-foo.json.gz
2015-10-28 01:59:35,704 CRITICAL pid=10798 tid=MainThread file=aws_cloudtrail.py:stream_events:331 | Outer catchall - Traceback:
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/Splunk_TA_aws/bin/aws_cloudtrail.py", line 269, in stream_events
    s3_completed, s3_keys_to_delete, s3_failed=self.process_S3_notifications(s3_conn, s3_notifications)
  File "/opt/splunk/etc/apps/Splunk_TA_aws/bin/aws_cloudtrail.py", line 405, in process_S3_notifications
    message['s3Bucket'], key, type(e).__name__, e))
KeyError: 's3Bucket'
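digging further with a cleaned up debug message (which is super tricky in splunk TBH), it turns out the KeyError was raised while formatting the error message itself: the except handler looked up message['s3Bucket'], that key wasn't there, and the new exception killed the script 😕. with the log line fixed, the real error shows up: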
2015-10-28 05:16:14,005 ERROR pid=26577 tid=MainThread file=aws_cloudtrail.py:process_S3_notifications:407 | problems reading json from s3:foo-cloudtrails/nnnnnnnn/folder/yyyyyyyyyy/: ValueError No JSON object could be decoded
so effectively, we DoS'd ourselves until we patched the script. the main line for us was below, but other exceptions around there would also fail similarly:
except ValueError as e:
    message_failed = True
    logger.log(logging.ERROR, "problems reading json from s3:{}/{}: {} {}".format(
        message['s3']['bucket']['name'], key, type(e).__name__, e))
    #logger.log(logging.ERROR, "problems reading json from s3:{}/{}: {} {}".format(
    #    message['s3Bucket'], key, type(e).__name__, e))
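to keep the handler above from blowing up no matter which notification shape arrives, something like the following might work. this is only a rough sketch, not the TA's actual code: `bucket_name` and `parse_log` are hypothetical helper names, and it assumes the message is a dict that carries either the nested `['s3']['bucket']['name']` key or the top-level `'s3Bucket'` key seen in the two versions of the log line.

```python
import json
import logging

logger = logging.getLogger("cloudtrail_sketch")  # hypothetical logger name


def bucket_name(message):
    """Best-effort bucket lookup that never raises (hypothetical helper)."""
    # shape used in the patched log line: message['s3']['bucket']['name']
    try:
        return message["s3"]["bucket"]["name"]
    except (KeyError, TypeError):
        pass
    # shape used in the original log line: message['s3Bucket']
    if isinstance(message, dict):
        return message.get("s3Bucket", "<unknown-bucket>")
    return "<unknown-bucket>"


def parse_log(message, key, raw_body):
    """Parse raw_body as JSON; log and return None instead of crashing."""
    try:
        return json.loads(raw_body)
    except ValueError as e:
        # the lookup inside the handler can no longer raise KeyError
        logger.error("problems reading json from s3:%s/%s: %s %s",
                     bucket_name(message), key, type(e).__name__, e)
        return None
```

the point is just that nothing inside the except block should be able to raise, so one bad object (like a folder key with an empty body) gets logged and skipped instead of taking down the whole input.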
is there a github or bitbucket repo which we can use to suggest changes? for now, i will toss them in bitbucket:
https://bitbucket.org/awurster/splunk-ta-aws/commits/311f3828e422bf1583148039771a7c051e3cc6c0