
Mixed cloud and internal architecture for Splunk

grahamkenville
Engager

We have an internally hosted Splunk environment which indexes logs from most of our internal hosts.

We also have a number of hosts in the Amazon cloud which we would like to Splunk, and we are looking into the best way to achieve this. I hoped there would be a white paper on the subject, or at least forum posts, but I didn't find much.

Obviously all environments are unique, but how are people handling this case?

The first thing that springs to mind is to add a Splunk forwarder in the DMZ and allow the cloud instances to talk to it with authentication via certificates. The DMZ forwarder would then forward logs into the existing Splunk environment.
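To make the idea concrete, here is a rough sketch of what I had in mind (the hostname and certificate paths are just placeholders, not from a real deployment):

```ini
# outputs.conf on a cloud instance's Universal Forwarder
# (dmz-fwd.example.com and the cert paths are placeholders)
[tcpout:dmz_group]
server = dmz-fwd.example.com:9997
sslCertPath = $SPLUNK_HOME/etc/auth/client.pem
sslRootCAPath = $SPLUNK_HOME/etc/auth/cacert.pem
sslVerifyServerCert = true

# inputs.conf on the DMZ forwarder, requiring client certificates
[splunktcp-ssl:9997]

[SSL]
serverCert = $SPLUNK_HOME/etc/auth/server.pem
rootCA = $SPLUNK_HOME/etc/auth/cacert.pem
requireClientCert = true
```

With requireClientCert enabled on the DMZ side, only cloud instances holding a certificate signed by our CA should be able to forward data in.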

Is the Universal Forwarder secure enough to expose it to the public Internet?

I also ran across this PDF:
https://s3.amazonaws.com/aws001/guided_trek/Splunk_and_Amazon_Web_Services_Tech_Brief_Final_6.06.11....

On page two they show an environment with indexers and search heads both internally and in the cloud which allows for centralized searching. Does anyone here have any experience with this design?

Any thoughts or comments would be appreciated.

Thanks,
-Graham


rtadams89
Contributor

You probably don't want to send the log data from the cloud to your private network, as you will eat up Amazon bandwidth and run up quite a bill. Instead, have an indexer (or indexers) in the Amazon cloud where all the cloud data gets indexed, and similarly an indexer or indexers locally where all local data gets indexed. Set up a search head (I would recommend locally, but you could put it in the cloud) and point it at both sets of indexers. This will let you search both zones from one search head and will also limit the amount of data sent over the Internet.
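As a sketch of that last step (hostnames here are made up), the search head's distsearch.conf would list both the local and the cloud indexers as search peers; you would still need to exchange trust between the search head and each peer:

```ini
# distsearch.conf on the search head
# (both hostnames are placeholders for your actual indexers)
[distributedSearch]
servers = https://indexer-local.example.com:8089,https://indexer-aws.example.com:8089
```

Searches then fan out to both zones, and only the (much smaller) search results cross the Internet rather than the raw log data.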


ndsmalle0
Explorer

Your recommendation is in line with the document referenced above. In hindsight, though, one of the determining factors in where the indexer(s) should go is the expected amount of data being ingested. Our experience is with close to 1 TB a day and 30-40 concurrent searches; when we looked at it from a cost/benefit perspective, the number of reserved XL instances and the amount of Amazon S3 storage that would have required made it much cheaper to send the data across the network instead.


ndsmalle0
Explorer

Graham,

We are using our internal Splunk instance to ingest Amazon instance data from both public and VPC-housed instances. On our public instances, Chef installs the Universal Forwarder with an outputs.conf that sends data to heavy forwarders in the cloud; those forward over an SSL connection to heavy forwarders in our environment's DMZ, which in turn forward the data to our internal indexers. This seems to be the easiest way to receive basic log data from Linux hosts (we monitor the /var/log* directory and can sourcetype on the indexer side).
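For reference, the moral equivalent of what Chef drops on each public instance looks roughly like this (hostnames are made up for the example):

```ini
# inputs.conf on the cloud instance's Universal Forwarder
# monitor /var/log recursively; sourcetyping is done on the indexer side
[monitor:///var/log]
recursive = true

# outputs.conf pointing at the cloud-side heavy forwarders
# (both hostnames are placeholders)
[tcpout:cloud_hf]
server = hf1.cloud.example.com:9997,hf2.cloud.example.com:9997
```

Listing both heavy forwarders in the server setting gives you load balancing and some resilience if one of them goes down.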

Hope that helps!

Nate
