We are using SplunkTAAWS 4.6.0 on an EC2 instance with a proper IAM instance profile that has access to SQS and S3. To reproduce the problem, enable an SQS-based S3 input against S3 keys of 1 GB or larger.
The modular input has a watch path on splunktaaws/settings/account/YOURINSTANCEPROFILE and exits whenever that entry changes, even though the previous credentials are still valid for roughly another 6 hours. In other words, the auto-discovered instance profile role credentials cause the modular input to exit prematurely. Each time the input resets because of a credential change, the SQS message it was working on remains in flight, so the number of in-flight messages keeps growing. Those messages are later picked up again and re-ingested, which causes duplicate data.
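The link between premature exits and duplicates can be illustrated with a toy model. This is only a sketch: `ToyQueue` and `run` are hypothetical names, the 300-second visibility timeout is an assumed value, and none of this is the boto3 API or the add-on's code. It simply counts how often one large key is picked up when the input is interrupted before it can delete the message.

```python
class ToyQueue:
    """Toy model of an SQS-style visibility timeout (hypothetical, not boto3)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message id -> time at which it becomes visible again

    def send(self, msg_id):
        self.messages[msg_id] = 0.0

    def receive(self, now):
        # Hand out the first visible message and hide it; it is now "in flight".
        for msg_id, visible_at in self.messages.items():
            if visible_at <= now:
                self.messages[msg_id] = now + self.visibility_timeout
                return msg_id
        return None

    def delete(self, msg_id):
        self.messages.pop(msg_id, None)

    def in_flight(self, now):
        return sum(1 for visible_at in self.messages.values() if visible_at > now)


def run(restart_at, work_seconds, timeout=300.0):
    """Return how many times the single key is picked up for ingestion."""
    queue = ToyQueue(timeout)
    queue.send("big-key")
    now, pickups = 0.0, 0
    restarts = sorted(restart_at)
    while queue.messages:
        msg = queue.receive(now)
        if msg is None:
            # Nothing visible: wait until the visibility timeout lapses.
            now = min(queue.messages.values())
            continue
        pickups += 1
        finish = now + work_seconds
        interrupt = next((r for r in restarts if now < r < finish), None)
        if interrupt is None:
            queue.delete(msg)  # ingestion finished, message removed
            now = finish
        else:
            now = interrupt    # input exited mid-key, message stays in flight
    return pickups
```

In this model, `run([], 600)` picks the key up once, while `run([100, 400], 600)` picks it up three times: every exit before the delete leaves the message in flight and leads to another ingestion attempt.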
As a workaround, we raised the credential check interval in etc/apps/SplunkTAaws/bin/splunksdc/config.py:

    def hasexpired(self):
        now = time.time()
        # The stock interval is 30 seconds. You may want to try a larger value
        # such as 20000, since the previous credentials stay valid for the
        # next 6 hours (6 x 3600 = 21600 seconds).
        if now - self.lastcheck > 30:
            self.lastcheck = now
            self.hasexpired = self.check()
Raising the interval reduced the occurrence of duplicate data by more than 95%.
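To see why the interval matters, the throttling pattern can be sketched in isolation. This is a minimal illustration under assumptions, not the add-on's implementation: the class name `ThrottledExpiryCheck`, the `interval` and `clock` parameters, and the attribute names are all hypothetical, and the check callable stands in for whatever the add-on uses to detect a credential change.

```python
import time


class ThrottledExpiryCheck:
    """Hypothetical stand-in for a credential watcher; not the add-on's class."""

    def __init__(self, check, interval=30.0, clock=time.time):
        self._check = check        # callable returning True when credentials changed
        self._interval = interval  # minimum number of seconds between real checks
        self._clock = clock        # injectable clock, which makes the class testable
        self._last_check = clock()
        self._has_expired = False

    def has_expired(self):
        now = self._clock()
        # Only run the real check once the interval has elapsed;
        # otherwise return the cached answer.
        if now - self._last_check > self._interval:
            self._last_check = now
            self._has_expired = self._check()
        return self._has_expired


if __name__ == "__main__":
    # With a 20000-second interval the check runs at most about once per 5.5 hours,
    # which stays inside the 6-hour (21600-second) validity window.
    watcher = ThrottledExpiryCheck(check=lambda: True, interval=20000)
    print(watcher.has_expired())
```

With the default of 30 seconds, a credential rotation is noticed almost immediately and the input exits mid-key. With 20000, the input has several hours to finish large keys before it reacts to the change.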