I've been trying, unsuccessfully, to configure a Splunk HEC endpoint to consume AWS VPC Flow Logs via Firehose.
Having slowly worked through various errors, including HEC acknowledgement being disabled and SSL certificate issues, I thought I had beaten the last of them. However, I am now getting a rather unhelpful error in my Firehose failed events log:
"attemptsMade":34,"arrivalTimestamp":1567429559545,"errorCode":"Splunk.ConnectionTimeout","errorMessage":"The connection to Splunk timed out. This might be a transient error and the request will be retried. Kinesis Firehose backs up the data to Amazon S3 if all retries fail."
Having had previous errors stating that the HEC indexer acknowledgement was disabled, and that ELB stickiness was not enabled, I'm fairly certain I am getting traffic to and from my Splunk instances. So I am not sure now why this is timing out. Is there any way to understand what is causing this? HEC Acknowledgement timeout is set to 600 seconds, so I don't believe it is this (plus that has its own error and corresponding code).
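One way to get more out of the failed events log is to summarise it by error code, so you can see whether Splunk.ConnectionTimeout is the only failure and when it started. Here is a minimal sketch in Python. It assumes each failure record is a single JSON object per line with the field names shown in the error above; the function name summarize_failures and the sample record are made up for illustration.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def summarize_failures(lines):
    """Group Firehose failure records by errorCode.

    Assumes one JSON object per line, using the field names seen in the
    quoted error (errorCode, attemptsMade, arrivalTimestamp in epoch ms).
    """
    counts = Counter()
    max_attempts = {}
    first_seen = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        code = record.get("errorCode", "unknown")
        counts[code] += 1
        max_attempts[code] = max(max_attempts.get(code, 0),
                                 record.get("attemptsMade", 0))
        ts_ms = record.get("arrivalTimestamp")
        if ts_ms is not None:
            # Whole seconds only; the millisecond part is dropped.
            seen = datetime.fromtimestamp(ts_ms // 1000, tz=timezone.utc)
            if code not in first_seen or seen < first_seen[code]:
                first_seen[code] = seen
    return counts, max_attempts, first_seen


# Illustrative record built from the error quoted above.
sample = [
    '{"attemptsMade":34,"arrivalTimestamp":1567429559545,'
    '"errorCode":"Splunk.ConnectionTimeout",'
    '"errorMessage":"The connection to Splunk timed out."}',
]

counts, max_attempts, first_seen = summarize_failures(sample)
for code, n in counts.most_common():
    print(code, n, max_attempts[code], first_seen[code].isoformat())
```

If several error codes appear, the earliest one is usually the better place to start, since a timeout can be a knock-on effect of an earlier failure.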
Any help gratefully received as I've been through all the documentation I can find, and am now stumped!
One alternative for ingesting AWS VPC Flow Logs into Splunk is NetFlow Optimizer (NFO). NFO also offers additional benefits: data consolidation and enrichment (EC2 instance names, regions, etc.).
I built an app to help troubleshoot Firehose issues: https://github.com/amiracle/kinesis_data_firehose_helper (It will be on Splunkbase soon, once it gets vetted.) You can use it to troubleshoot some of these issues and to confirm that your setup is correct and able to send data into HEC.
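Separately from the app, a quick TCP reachability check can help rule out basic network problems when you see a connection timeout. This is a rough sketch using only the Python standard library; it is not part of the app above, and the function name can_connect and the host in the usage comment are placeholders. Note the limitation: Firehose connects from AWS-side addresses, so a successful check from your own machine does not prove that Firehose can reach the endpoint. It only tells you the port is open from wherever you run it.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This only tests the TCP handshake from the machine running the script;
    it does not validate TLS certificates or HEC tokens.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False


# Usage (hypothetical hostname; substitute your HEC endpoint and port):
# print(can_connect("hec.example.com", 443))
```

If this returns False from a host outside your network, check security groups, load balancer listeners, and any firewall rules in front of the HEC endpoint before looking at Splunk itself.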