We currently have a variety of AWS logs (e.g. CloudWatch, CloudTrail, Config, VPC Flow, Aurora) and non-AWS logs (e.g. Palo Alto, Trend Micro) routed to S3 buckets, 15 buckets in total (5 per AWS account).
We recently purchased and configured an on-premises Splunk ES deployment (distributed, with indexer clustering but no search head clustering yet), and our goal is to begin forwarding these logs to it.
What considerations should we keep in mind? Since we're going with a push approach, we're planning to do the following. Could someone confirm whether this looks right? I'm open to suggestions.
Send the logs from the S3 buckets to Amazon Kinesis Data Firehose. (Since S3 can't write to Firehose directly, we expect to use S3 event notifications that trigger a Lambda function, which reads each new object and submits its records to Firehose.)
Firehose writes batches of events to Splunk via HEC.
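As a sketch of what each record might look like when the Firehose destination uses the HEC "Event" endpoint type: the field names below follow the HEC event JSON format, but the sourcetype and index values are placeholders, not our final naming.

```python
import json


def to_hec_event(raw_event: str, sourcetype: str, index: str) -> str:
    """Wrap a raw log line in the HEC event-endpoint JSON envelope.

    This mirrors the payload shape Splunk's /services/collector/event
    endpoint expects; sourcetype/index here are illustrative only.
    """
    return json.dumps({
        "event": raw_event,
        "sourcetype": sourcetype,
        "index": index,
    })


# Example: wrapping a VPC Flow-style line (placeholder values).
payload = to_hec_event('2 123456789012 eni-abc123 ... ACCEPT OK',
                       "aws:cloudwatchlogs:vpcflow", "aws_logs")
```

Firehose batches many such records into a single HTTP POST to HEC, so the per-event envelope is what matters for sourcetype/index routing.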
Since the indexers are not in an AWS VPC (they reside in a separate Oracle Cloud environment), I'm assuming an SSL certificate needs to be installed on each indexer? Note that Firehose requires the HEC endpoint to present a certificate signed by a publicly trusted CA, so self-signed certificates won't work. We have one indexer cluster in our Production environment and a separate one in our Disaster Recovery environment.
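A minimal sketch of what the HEC SSL settings might look like in inputs.conf on each indexer; the file path and password are placeholders, and the certificate must cover the DNS name Firehose will connect to:

```
# $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf (illustrative)
[http]
disabled = 0
enableSSL = 1
port = 8088
# Placeholder path; must be a CA-signed cert (plus key) covering the
# DNS name assigned to the indexer set
serverCert = /opt/splunk/etc/auth/mycerts/hec_cert.pem
sslPassword = <certificate-key-password>
```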
Assign a DNS name that resolves to the set of indexers that will receive data from Kinesis Firehose.
Install the Splunk Add-on for Amazon Kinesis Firehose on the Splunk Enterprise and ES search heads, as well as on the cluster master (to push it out to the indexers).
Ensure a new index is created for the AWS logs (one sourcetype per AWS log source) and that existing indexes are used for the Palo Alto and Trend Micro logs. If new indexes are needed for the Palo Alto and Trend Micro logs, I'm assuming the data would still need to be CIM-compliant so it maps to the appropriate Splunk ES data models.
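A minimal indexes.conf sketch for the new AWS index, deployed from the cluster master; the index name and retention values are placeholders to be adjusted:

```
# indexes.conf in the master-apps bundle (illustrative values)
[aws_logs]
homePath   = $SPLUNK_DB/aws_logs/db
coldPath   = $SPLUNK_DB/aws_logs/colddb
thawedPath = $SPLUNK_DB/aws_logs/thaweddb
# Replicate buckets across the indexer cluster
repFactor = auto
# Placeholder retention: 90 days
frozenTimePeriodInSecs = 7776000
```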
Configure HEC and create new HEC tokens, one unique token per sourcetype.
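One token per sourcetype could be expressed as separate inputs.conf stanzas like the following; the stanza names, token GUIDs, and sourcetype/index values are placeholders:

```
[http://cloudtrail_hec]
token = 11111111-1111-1111-1111-111111111111
sourcetype = aws:cloudtrail
index = aws_logs

[http://vpcflow_hec]
token = 22222222-2222-2222-2222-222222222222
sourcetype = aws:cloudwatchlogs:vpcflow
index = aws_logs
```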
Configure Amazon Kinesis Firehose to send data to Splunk.
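For reference, the Splunk destination settings on the delivery stream might look roughly like this `SplunkDestinationConfiguration` fragment from the Firehose CreateDeliveryStream API; the endpoint URL, token, ARNs, and timeout values are all placeholders:

```json
{
  "HECEndpoint": "https://splunk-hec.example.com:8088",
  "HECEndpointType": "Event",
  "HECToken": "11111111-1111-1111-1111-111111111111",
  "HECAcknowledgmentTimeoutInSeconds": 300,
  "RetryOptions": { "DurationInSeconds": 3600 },
  "S3BackupMode": "AllEvents",
  "S3Configuration": {
    "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
    "BucketARN": "arn:aws:s3:::my-firehose-backup-bucket"
  }
}
```

Setting `S3BackupMode` to `AllEvents` here is what implements the backup requirement in the step below; it can later be switched to `FailedEventsOnly` once ingestion is verified.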
Ensure all events are backed up to an S3 bucket until we've confirmed that all events are being processed by Splunk. (Firehose supports backing up either failed events only or all events; we'd start with all events.)
Search by sourcetype to confirm that the data is being indexed and is visible.
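A quick validation search per sourcetype might look like the following SPL; the index and sourcetype names are placeholders matching the earlier examples:

```
index=aws_logs sourcetype=aws:cloudtrail earliest=-15m
| stats count by sourcetype, source
```

Running one such search per token/sourcetype pair confirms both that events arrive and that they land in the intended index.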