Deployment Architecture

What is the difference between a Distributed and Clustered environment?

aoliullah
Path Finder

Hi. Could someone explain to me the difference between a distributed and a clustered environment in Splunk? I keep thinking they are the same.

Thanks in advance!


skalliger
SplunkTrust

Distributed does not necessarily mean clustered. A distributed environment separates Splunk's indexing and searching functions onto different instances. In a non-distributed (single-instance) environment, you install everything on one machine, which both indexes the data and searches it.

In a distributed environment, however, you would have an indexer that receives data from several inputs and a separate search head that searches across that indexer.
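To make the distinction concrete, here is a minimal Python sketch of the separation described above. It is only a toy model, not anything Splunk ships: the `Instance` class, the `is_distributed` helper, and the host names are hypothetical, and the model ignores the fact that indexers also take part in executing searches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One Splunk instance in a simplified topology model (hypothetical)."""
    host: str
    indexes: bool   # this instance indexes incoming data
    searches: bool  # this instance acts as a search head


def is_distributed(instances):
    """Return True when indexing and searching live on different instances."""
    dedicated_indexer = any(i.indexes and not i.searches for i in instances)
    dedicated_search_head = any(i.searches and not i.indexes for i in instances)
    return dedicated_indexer and dedicated_search_head


# Single machine doing both jobs: not distributed.
single = [Instance("splunk01", indexes=True, searches=True)]

# One indexer and one search head on separate machines: distributed,
# but there is no cluster anywhere in this setup.
distributed = [
    Instance("idx01", indexes=True, searches=False),
    Instance("sh01", indexes=False, searches=True),
]

print(is_distributed(single))       # False
print(is_distributed(distributed))  # True
```

Note that the second topology is distributed without being clustered, which is exactly the point of the answer.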

In a clustered environment, you then combine multiple indexers into an indexer cluster for high availability and data loss prevention (the cluster keeps multiple copies of your data). For disaster recovery, you would use a multisite cluster (a single cluster whose indexers are spread across two or more locations).
You would also combine multiple search heads into a search head cluster, whose members share the search load. Besides those two clusters, you also need a deployer (for the search head cluster) and a master node (for the indexer cluster) to manage them; the two can run on the same machine.
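As a rough illustration of why keeping multiple copies prevents data loss, the sketch below models an indexer cluster's replication factor in Python. The function names, the round-robin placement, and the peer names are assumptions made for the example; a real cluster decides where copies go on its own. The only Splunk fact relied on is that a cluster with replication factor N can lose N - 1 peers and still hold a copy of the data.

```python
def tolerated_peer_failures(replication_factor):
    """Peers that can fail while at least one copy of every bucket remains."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor - 1


def place_copies(bucket, peers, replication_factor):
    """Toy placement (an assumption): copies go on consecutive peers."""
    if replication_factor > len(peers):
        raise ValueError("need at least as many peers as copies")
    return {peers[(bucket + k) % len(peers)] for k in range(replication_factor)}


peers = ["idx01", "idx02", "idx03"]  # hypothetical indexer names
copies = place_copies(bucket=0, peers=peers, replication_factor=2)

# Simulate losing one indexer: a copy still survives on another peer.
survivors = copies - {"idx01"}

print(tolerated_peer_failures(2))  # 1
print(sorted(survivors))           # ['idx02']
```

With a single unclustered indexer, the equivalent of replication factor 1, losing that one machine means the data is unavailable, which is what the cluster is there to avoid.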

Skalli


ChrisG
Splunk Employee

There is a whole manual specifically about this subject. Start your reading at "Scale your deployment with Splunk Enterprise components". The manual includes information about all the dimensions of a distributed deployment, including clustering, and explains a number of typical deployment scenarios.

gokadroid
Motivator

If it's really about differentiating the terms from each other, one way to think of it is that clustering happens within a layer, such as a cluster of indexers or a cluster of search heads. Distributed can be as simple as one search head and one indexer, each on a different machine.

The technical details of each setup may overlap, but that's the simplest way I could think of to put it 🙂
