
Mixed cloud and internal architecture for Splunk

grahamkenville
Engager

We have an internally hosted Splunk environment which indexes logs from most of our internal hosts.

We also have a number of hosts in the Amazon cloud which we would like to Splunk, and we are looking into the best way to achieve this. I hoped there would be a white paper on the subject, or at least forum posts, but I didn't find much.

Obviously all environments are unique, but how are people handling this case?

The first thing that springs to mind is to add a Splunk forwarder in the DMZ and allow the cloud instances to talk to it with authentication via certificates. The DMZ forwarder would then forward logs into the existing Splunk environment.
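To make the idea concrete, here is a rough sketch of what I had in mind (the hostname and certificate paths are just placeholders, not from a real deployment):

```ini
# outputs.conf on a cloud instance's Universal Forwarder
# (dmz-fwd.example.com and the cert paths are placeholders)
[tcpout:dmz_group]
server = dmz-fwd.example.com:9997
sslCertPath = $SPLUNK_HOME/etc/auth/client.pem
sslRootCAPath = $SPLUNK_HOME/etc/auth/cacert.pem
sslVerifyServerCert = true

# inputs.conf on the DMZ forwarder, requiring client certificates
[splunktcp-ssl:9997]

[SSL]
serverCert = $SPLUNK_HOME/etc/auth/server.pem
rootCA = $SPLUNK_HOME/etc/auth/cacert.pem
requireClientCert = true
```

With requireClientCert enabled on the DMZ side, only cloud instances holding a certificate signed by our CA should be able to forward data in.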

Is the Universal Forwarder secure enough to expose it to the public Internet?

I also ran across this PDF:
https://s3.amazonaws.com/aws001/guided_trek/Splunk_and_Amazon_Web_Services_Tech_Brief_Final_6.06.11....

On page two they show an environment with indexers and search heads both internally and in the cloud which allows for centralized searching. Does anyone here have any experience with this design?

Any thoughts or comments would be appreciated.

Thanks,
-Graham


rtadams89
Contributor

You probably don't want to send the log data from the cloud to your private network, as you will eat up Amazon bandwidth and run up quite a bill. Instead, have an indexer (or indexers) in the Amazon cloud where all the cloud data gets indexed, and similarly an indexer or indexers locally where all local data gets indexed. Set up a search head (I would recommend locally, but you could put it in the cloud) and point it at both sets of indexers. This will let you search both zones from one search head and will also limit the amount of data sent over the Internet.
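As a sketch of that last step (hostnames here are made up), the search head's distsearch.conf would list both the local and the cloud indexers as search peers; you would still need to exchange trust between the search head and each peer:

```ini
# distsearch.conf on the search head
# (both hostnames are placeholders for your actual indexers)
[distributedSearch]
servers = https://indexer-local.example.com:8089,https://indexer-aws.example.com:8089
```

Searches then fan out to both zones, and only the (much smaller) search results cross the Internet rather than the raw log data.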


ndsmalle0
Explorer

Your recommendation is in line with the document referenced above. In hindsight, though, one of the determining factors in where the indexer(s) should go is the expected amount of data being ingested. Our experience is with close to 1 TB a day and 30-40 concurrent searches; when we looked at it from a cost/benefit perspective, the number of reserved XL instances and the amount of Amazon S3 storage that would have required made it much cheaper to send the data across the network instead.


ndsmalle0
Explorer

Graham,

We are using our internal Splunk instance to ingest Amazon instance data from both public and VPC-housed instances. On our public instances, Chef installs the Universal Forwarder with an outputs.conf that sends data to heavy forwarders in the cloud; those forward over an SSL connection to heavy forwarders in our environment's DMZ, which in turn forward the data to our internal indexers. This seems to be the easiest way to receive basic log data from Linux hosts (we monitor the /var/log* directory and can sourcetype on the indexer side).
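For reference, the moral equivalent of what Chef drops on each public instance looks roughly like this (hostnames are made up for the example):

```ini
# inputs.conf on the cloud instance's Universal Forwarder
# monitor /var/log recursively; sourcetyping is done on the indexer side
[monitor:///var/log]
recursive = true

# outputs.conf pointing at the cloud-side heavy forwarders
# (both hostnames are placeholders)
[tcpout:cloud_hf]
server = hf1.cloud.example.com:9997,hf2.cloud.example.com:9997
```

Listing both heavy forwarders in the server setting gives you load balancing and some resilience if one of them goes down.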

Hope that helps!

Nate
