Answering my own question here: the issue was that Splunk was not recognising the timestamp format of the log entries. I worked this out from the following warnings:
06-29-2016 03:47:33.686 +0000 WARN AggregatorMiningProcessor - Breaking event because limit of 1024 has been exceeded - data_source="s3://logs/access_logs/AWSLogs/...", data_host="...", data_sourcetype="aws:elb:accesslog"
06-29-2016 03:47:33.686 +0000 WARN AggregatorMiningProcessor - Changing breaking behavior for event stream because MAX_EVENTS (1024) was exceeded without a single event break. Will set BREAK_ONLY_BEFORE_DATE to False, and unset any MUST_NOT_BREAK_BEFORE or MUST_NOT_BREAK_AFTER rules. Typically this will amount to treating this data as single-line only. - data_source="...", data_host="...", data_sourcetype="aws:elb:accesslog"
I fixed this by setting the following in props.conf:
BREAK_ONLY_BEFORE_DATE = false
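For anyone hitting the same warnings, here is a minimal sketch of how that setting might sit inside a props.conf stanza. The stanza name is an assumption, taken from the data_sourcetype value shown in the warnings above; use whichever sourcetype (or source/host stanza) your data actually arrives under.

```ini
# props.conf
# Stanza name assumed from data_sourcetype="aws:elb:accesslog" in the warnings;
# adjust it to match your own sourcetype.
[aws:elb:accesslog]
# Do not wait for a recognised timestamp before starting a new event.
BREAK_ONLY_BEFORE_DATE = false
```

With this in place Splunk no longer waits for a date it cannot parse before breaking events, which is what was causing lines to pile up until the 1024-line limit mentioned in the warning was hit.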