With the AWS Add-On for Splunk (version 5.0.3) we can pull logs from a CloudFront S3 bucket via the "Generic S3" input type, or from an Application Load Balancer with the ELB, Generic S3 type input.
The problem is that the timestamps in the CloudFront logs have no timezone specified in the S3 objects, and our Splunk instance incorrectly defaults to local time. The ELB logs are correctly converted from UTC to local time in searches.
How might we force the timezone to UTC for these events at ingest?
I tried creating a /opt/splunk/etc/apps/Splunk_TA_aws/local/props.conf file (yes, $SPLUNK_HOME is /opt/splunk) on our heavy forwarder with the content below, then restarted:
[aws:cloudfront:accesslogs]
TZ = UTC
Alas, no dice yet. Suggestions?
Hmm, btool does show the setting:
[splunk@ess1 Splunk_TA_aws]$ splunk btool props list aws:cloudfront:accesslogs | grep TZ
TZ = UTC
Ahh, I think I found the issue. The TA only pulled in data once it thought the events had occurred by the present time, but it was interpreting the UTC timestamps as -0400 (EDT), so it was always 4 hours behind the latest. Once it was unwedged, it kept processing data for another four hours and then stopped, because I had also introduced another problem that caused it to stop abruptly. I fixed that problem and retroactively loaded the old data, so it was able to fill the gap. Sometimes you need to spend many hours on a timezone problem before the evidence catches up with you.
Problem now solved.
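For anyone hitting the same thing, here is a minimal Python sketch of the skew described above. It is only an illustration, not anything the TA itself runs: the sample timestamp is made up, and the fixed -0400 offset stands in for EDT. It reads the same zone-less timestamp once as UTC and once as -0400 and compares the two.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical timestamp in the zone-less "date time" shape described above.
raw = "2021-06-15 12:00:00"
naive = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")

# Intended reading: the digits are UTC (what TZ = UTC asks Splunk to assume).
as_utc = naive.replace(tzinfo=timezone.utc)

# Mistaken reading: the same digits treated as -0400 (EDT) local time.
EDT = timezone(timedelta(hours=-4))  # fixed offset, assumed for illustration
as_edt = naive.replace(tzinfo=EDT)

# The mis-parsed event lands four hours later on the timeline than it should.
skew = as_edt - as_utc
print(skew)  # 4:00:00
```

With the wrong zone assumed, every event is placed four hours away from where it really happened, which matches the four-hour lag seen before the TZ override took effect.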