How many indexers do I need in my Splunk deployment?

adukes_splunk
Splunk Employee

Does anyone have any good resources about indexes and index management?
Before I set up a bunch of indexes, I'd like to know more about how indexes impact my deployment.

adukes_splunk
Splunk Employee

The Splunk Product Best Practices team provided this response. Read more about How Crowdsourcing is Shaping the Future of Splunk Best Practices.

Indexer, indexes, indexing?! What's what?

Well, to start, an indexer stores indexed data in indexes after such data is indexed. However, the real work happens when you run a search and the indexer fetches the indexed data from the indexes. If you're still with us, great. If not, don't panic, because we'll show you how Splunk creates and manages data repositories (indexes), and review the courses designed to help Splunk Administrators keep Splunk installations happy, healthy, and growing.

How multiple indexers help you scale

Believe it or not, the more indexers you have, the better! Slow indexing? Add indexers! Slow searching? Add indexers! That's because Splunk forwarders distribute data across your indexers. That data distribution creates opportunities for parallel processing when you search the data. In other words, you win when you have a bunch of machines each working on a portion of your search rather than one machine trying to handle it all on its own.
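
To make the scaling argument concrete, here is a toy Python model, not Splunk code. It assumes a simple round-robin distribution (real forwarder load balancing may behave differently), and the names `distribute`, `search_one`, and `parallel_search` are hypothetical. Each simulated indexer holds a slice of the events, and a search scans the slices concurrently before merging the results.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(events, indexer_count):
    """Spread events across indexers round-robin (an assumed, simplified policy)."""
    indexers = [[] for _ in range(indexer_count)]
    for position, event in enumerate(events):
        indexers[position % indexer_count].append(event)
    return indexers


def search_one(indexer_events, term):
    """Scan a single indexer's slice of the data for a search term."""
    return [event for event in indexer_events if term in event]


def parallel_search(indexers, term):
    """Run the same search on every indexer at once, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(indexers)) as pool:
        partial_results = list(pool.map(search_one, indexers, [term] * len(indexers)))
    return sorted(hit for hits in partial_results for hit in hits)


# Twelve made-up events; every fourth one is an error.
events = [f"event {n} status={'error' if n % 4 == 0 else 'ok'}" for n in range(12)]
indexers = distribute(events, indexer_count=3)
matches = parallel_search(indexers, "status=error")
```

In this sketch each of the three simulated indexers scans only four of the twelve events, which is the kind of divided workload the paragraph above describes.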

You may ask, "But what happens to my data if one of those indexers goes down?" Great question! The Splunk indexer clustering feature manages multiple copies of the data to increase resiliency for your Splunk-ed data.

Now that you know enough to be smart and safe, take a moment to understand the relationship between indexers, buckets, and indexer clusters. These concepts will help you effectively plan and scale your deployments with Splunk Enterprise components.

Things to know

Indexers play a key role in how data moves through Splunk deployments.

An indexer is a Splunk Enterprise instance that transforms incoming raw event data into searchable events and stores them in an index. Each index can contain a variety of data and is made up of buckets, which are smaller collections of data and their associated index files.
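
As a rough mental model of how an index is made up of buckets, the following Python sketch groups events into fixed-size collections. It is an illustration only: the `ToyIndex` class and the `max_events_per_bucket` limit are hypothetical, and real buckets also hold index files and are managed by rules not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """A named index that stores events in a list of buckets (hypothetical model)."""
    name: str
    max_events_per_bucket: int = 3  # illustrative limit, not a Splunk setting
    buckets: list = field(default_factory=list)

    def add_event(self, event):
        # Start a new bucket when there is none yet or the newest one is full.
        if not self.buckets or len(self.buckets[-1]) >= self.max_events_per_bucket:
            self.buckets.append([])
        self.buckets[-1].append(event)


index = ToyIndex(name="web_logs")
for n in range(7):
    index.add_event(f"event {n}")

bucket_sizes = [len(bucket) for bucket in index.buckets]
```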

An indexer cluster, or the Splunk implementation of index replication, is a group of indexers configured to replicate the data of other indexers in the cluster group to ensure the system has redundant copies of all data. By maintaining multiple, identical copies of data, indexer clusters and index replication prevent data loss and ensure that data is available for searching. Key benefits include: data availability, data fidelity, data recovery, disaster recovery and search affinity.

Replication factor is the number of copies of data that the cluster maintains. For example, to ensure that your system can tolerate the failure of two peers, you would configure a replication factor of 3, which means that the cluster stores three identical copies of each bucket on separate nodes, so one copy survives even if two of those nodes fail. As the replication factor increases, you need to run more indexers and provision more storage for the indexed data. The good news is that data replication itself requires little processing power, so you can take advantage of the multiple indexers in a cluster to ingest and index more data.
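
The replication factor arithmetic can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function names are hypothetical, and the storage estimate simply multiplies the size of one copy by the replication factor, ignoring compression and anything else that affects real disk usage.

```python
def tolerated_peer_failures(replication_factor):
    """With N copies on separate peers, the data survives the loss of N - 1 peers."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


def replication_factor_for(failures_to_tolerate):
    """Smallest replication factor that survives the given number of peer failures."""
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate cannot be negative")
    return failures_to_tolerate + 1


def estimated_storage_gb(single_copy_gb, replication_factor):
    """Rough total storage, assuming every copy is the same size (a simplification)."""
    return single_copy_gb * replication_factor


rf = replication_factor_for(failures_to_tolerate=2)
print(rf, tolerated_peer_failures(rf), estimated_storage_gb(500, rf))
```

Tolerating two failed peers gives a replication factor of 3, and a hypothetical 500 GB copy would need roughly 1,500 GB across the cluster under these assumptions.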

Things to do

gjanders
SplunkTrust

Both the title and the answer refer to "How managing indexes helps you scale your deployment", but you are actually referring to indexers or search peers in the answer.

Perhaps you can update this to "How managing indexers ... "? Initially I expected some advice around index naming standards, the number of indexes in a deployment, et cetera.

adukes_splunk
Splunk Employee

Hi @gjanders, I see your point! The post is about adding indexers. I've updated the title and the heading to more accurately reflect the topic.
