Hi Splunk Community,
I’m looking for confirmation or guidance on a gzip handling issue with the Splunk Add-on for AWS when ingesting data from Kinesis Firehose → S3 → SQS-based S3 input.
Splunk Add-on for AWS version: 7.1.0 (also testing an upgrade to the latest version)
Deployment: Heavy Forwarder
Ingestion method: SQS-based S3 input (SDC framework)
Source: API Gateway → Kinesis Firehose → S3
Firehose settings:
Compression: GZIP
Buffer size: 64 MB
Data format: Concatenated / “smushed” JSON (e.g. {"a":1}{"b":2})
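To make the "smushed" JSON format concrete, here is a minimal Python sketch of how such a payload can be split back into individual records. It is only an illustration: the function name `split_concatenated_json` is hypothetical and is not part of the add-on or of Firehose.

```python
import json


def split_concatenated_json(text):
    """Yield each top-level JSON value from back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


records = list(split_concatenated_json('{"a":1}{"b":2}'))
```

Each element of `records` corresponds to one event that should be indexed separately.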
Files delivered by Firehose are valid concatenated GZIP files (multiple gzip members in a single .gz object).
The AWS TA fails to correctly decompress these files:
Events appear as binary garbage (e.g. the gzip magic bytes \x1f\x8b)
Or ingestion stops after the first gzip member
If the same file is downloaded from S3 and re-uploaded manually via the AWS Console, Splunk ingests it correctly; the re-uploaded object contains a single gzip member.
This ingestion path uses splunksdc (splunksdc/aws/s3/archive.py), not Splunkd’s ArchiveProcessor.
props.conf settings such as:
NO_BINARY_CHECK = true
unarchive_cmd = gzip -cd -
do not help, as decompression happens upstream in the add-on code.
The gzip handling in archive.py appears to use Python’s gzip.GzipFile().read(), which does not fully support concatenated gzip members.
The _internal logs show no ArchiveProcessor entries, which is consistent with the SDC path being used.
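For reference, the difference between reading only the first gzip member and reading all of them can be reproduced with a short Python sketch. This is not the add-on's code and does not inspect archive.py; the helper names are hypothetical. Note that in current CPython 3 releases the standard `gzip` module itself reads all members, so the exact cause inside the add-on may differ from what is guessed above.

```python
import gzip
import zlib


def decompress_first_member_only(data):
    """Decompress a single gzip member and ignore anything after it."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)
    return d.decompress(data)


def decompress_all_members(data):
    """Decompress every gzip member in `data` and return the joined payload."""
    out = []
    while data:
        d = zlib.decompressobj(16 + zlib.MAX_WBITS)
        out.append(d.decompress(data))
        out.append(d.flush())
        if not d.eof:
            raise ValueError("truncated gzip member")
        # Bytes after the end of this member belong to the next member.
        data = d.unused_data
    return b"".join(out)


# Two independently compressed members, concatenated as described above.
sample = gzip.compress(b'{"a":1}') + gzip.compress(b'{"b":2}')
```

With `sample`, the first helper returns only the first record, which matches the "ingestion stops after the first gzip member" symptom, while the second returns both records.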
Is concatenated / multi-member GZIP from Kinesis Firehose officially supported by the Splunk Add-on for AWS SQS-based S3 input?
Has this behaviour been fixed in newer versions (8.x)?
The release notes don’t explicitly mention concatenated gzip support.
Is there a recommended configuration or supported workaround, or is a custom patch / upstream decompression (e.g. Lambda) the only option?
Is Splunk planning to align SDC gzip handling with standard multi-member gzip behaviour?
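To illustrate the upstream-decompression workaround mentioned above, here is a minimal sketch of the normalisation step such a Lambda could perform: rewriting a multi-member object as a single gzip member. Reading from and writing back to S3 is deliberately left out, and the function names are hypothetical, so treat this as an outline rather than a tested deployment.

```python
import gzip
import zlib


def to_single_member_gzip(raw):
    """Re-compress a (possibly multi-member) gzip payload as one member."""
    # gzip.decompress reads all members of the input in CPython 3.
    payload = gzip.decompress(raw)
    return gzip.compress(payload)


def is_single_member(raw):
    """Return True if `raw` is exactly one complete gzip member."""
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)
    d.decompress(raw)
    return d.eof and d.unused_data == b""


multi = gzip.compress(b'{"a":1}') + gzip.compress(b'{"b":2}')
single = to_single_member_gzip(multi)
```

The trade-off is that the whole object is held in memory, which matters with a 64 MB Firehose buffer and the Lambda memory limit.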
Hi @alphablue
Can you confirm what you specified as the s3_file_decoder for the SQS-based S3 input? Is it set to CustomLogs?