Solved: Can multiple indexers write to a single index

jamesoconnell · ‎08-23-2011

My question is about Splunk topology.

Can multiple indexer processes write to a single physical index? Or is there a file I/O conflict that occurs in this setup?

Regards,
James O'Connell.

araitz · ‎08-23-2011

To add a workaround to dwaddle's answer, a viable approach on non-Windows systems is to have multiple instances of Splunk write to different directories on the same logical volume.

In other words, let's say you have:

 [root@foobar ~]# df -h /data
 Filesystem                       Size  Used Avail Use% Mounted on
 /dev/mapper/VolGroup00-LogVol00  130G   73G   51G  60% /data

Install one instance of Splunk in /data/splunk1 and another in /data/splunk2. Each instance will be writing to discrete compartments of the same logical volume, thus avoiding the issues dwaddle describes above.
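As a quick sanity check before installing, here is a minimal Python sketch that tests whether two candidate install directories sit on the same mounted filesystem and are genuinely separate (neither one nested inside the other). The function names `on_same_volume` and `are_separate` are made up for illustration, and comparing `st_dev` is just one assumed way to check for a shared volume; this is not a Splunk tool.

```python
import os


def on_same_volume(path_a, path_b):
    """Return True if both paths live on the same mounted filesystem."""
    # st_dev identifies the device that holds the file or directory.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def are_separate(path_a, path_b):
    """Return True if neither directory is the other or nested inside it."""
    a = os.path.realpath(path_a)
    b = os.path.realpath(path_b)
    # If the common ancestor is one of the two paths, they overlap.
    return os.path.commonpath([a, b]) not in (a, b)
```

With the layout above you would call `on_same_volume("/data/splunk1", "/data/splunk2")` and `are_separate("/data/splunk1", "/data/splunk2")` and expect both to be true.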


mmattek · ‎11-30-2011

OK, I'm trying to understand this. I have two indexers, only one of which runs Splunk Web, and I'm doing distributed search.

So right now, this has no redundancy. I need to be able to search all logs even if one goes down, although I understand that performance will be reduced.

How do I accomplish this?

mmattek · ‎11-30-2011

done. http://splunk-base.splunk.com/answers/35354/totally-redundant-2-node-cluster

araitz · ‎11-30-2011

This should probably be a separate question.

Please review the following documentation:

http://docs.splunk.com/Documentation/Splunk/4.2.4/Installation/Highavailabilityreferencearchitecture

jamesoconnell · ‎09-12-2011

One other complication: what if one indexer (/opt/splunk1) goes down? Will the data in /data/splunk1 still be picked up by a search that subsequently goes through the /opt/splunk2 indexer?

jamesoconnell · ‎09-12-2011

araitz, thanks again for the response. I want to make sure I understand -- forgive the potentially obvious question.

When you say install one instance of Splunk in /data/splunk1 and another in /data/splunk2 -- are you saying to have 2 indexer instances of Splunk installed in, say, /opt/splunk1 and /opt/splunk2, each having an index called 'sample' and writing to separate db files located in /data/splunk1 and /data/splunk2 respectively?

And a search head will search across the two indexers (/opt/splunk1 & /opt/splunk2) on the same index 'sample'.

Yes?

Thanks, James.

araitz · ‎09-08-2011

You are conflating the directories on an indexer's disks with the notion of a Splunk index. An index is an abstract entity that represents a data container, and may be composed of one or many components on the underlying file systems. The way to scale Splunk is to add more instances and use a distributed search head to search across each instance. Thus, the search "index=main" run on a search head that distributes searches across several index servers is a search against one index ("main"), regardless of how many index servers are involved.
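To make that concrete, here is a toy Python model of the idea, not Splunk's actual implementation. Each indexer holds its own slice of the same logical index "main", and a search head fans the search out to every peer and merges the results. The names `INDEXERS`, `search_peer`, and `distributed_search`, and the sample events, are all invented for illustration.

```python
import heapq

# Hypothetical data: each indexer stores its own slice of the logical index "main"
# as (timestamp, event) pairs.
INDEXERS = {
    "indexer1": {"main": [(1, "login ok"), (4, "disk warning")]},
    "indexer2": {"main": [(2, "login failed"), (3, "logout")]},
}


def search_peer(peer_data, index):
    """Return one peer's events for the given index, sorted by timestamp."""
    return sorted(peer_data.get(index, []))


def distributed_search(indexers, index):
    """Fan the search out to every peer and merge into one time-ordered list."""
    per_peer = [search_peer(data, index) for data in indexers.values()]
    return list(heapq.merge(*per_peer))


results = distributed_search(INDEXERS, "main")
```

The caller asks for one index name and gets one result set, even though the events physically live in two places. In this toy model, dropping a peer from `INDEXERS` also drops its events from the results, since each peer only holds its own slice.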

jamesoconnell · ‎09-08-2011

Thank you for the response. Writing to two different directories is essentially having two indexes, though, isn't it?

dwaddle · ‎08-23-2011

If I understand your question, this is not a workable topology.

Being pedantic, a Splunk index is composed of one or more buckets -- each bucket is a shard of the total index. A bucket can only be written to by a single splunkd instance at a time. Depending on your configuration, a single indexer can have multiple buckets for an index which it writes to in parallel. Similarly, multiple indexers can each have their own buckets for an index, and the data can be sprayed across all of the buckets on all of the indexers in parallel. But, even in this topology, each bucket is only being written to by a single splunkd process.
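A rough Python sketch of that single-writer rule, purely as a mental model: the `Bucket` class, the `spray` helper, and the round-robin distribution are assumptions made for illustration and say nothing about how splunkd is actually implemented.

```python
class Bucket:
    """Toy stand-in for a bucket: one shard of an index with exactly one writer."""

    def __init__(self, bucket_id, owner):
        self.bucket_id = bucket_id
        self.owner = owner  # the only instance allowed to write to this bucket
        self.events = []

    def write(self, writer, event):
        # Enforce the single-writer rule: reject any writer that is not the owner.
        if writer != self.owner:
            raise PermissionError(
                f"{writer} cannot write to bucket {self.bucket_id} owned by {self.owner}"
            )
        self.events.append(event)


def spray(events, buckets_by_indexer):
    """Distribute events round-robin; each indexer writes only to its own bucket."""
    indexers = sorted(buckets_by_indexer)
    for i, event in enumerate(events):
        indexer = indexers[i % len(indexers)]
        buckets_by_indexer[indexer].write(indexer, event)


# Two indexers, each with its own bucket for the same logical index.
buckets = {
    "indexer1": Bucket("main_0", owner="indexer1"),
    "indexer2": Bucket("main_0", owner="indexer2"),
}
spray(["e1", "e2", "e3", "e4"], buckets)
```

Both buckets belong to the same logical index, yet each one only ever sees writes from its owner. A call such as `buckets["indexer1"].write("indexer2", "e5")` raises `PermissionError` in this model, which is the situation the original question is asking about.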

gkanapathy · ‎09-08-2011

Also, there is no advantage to this. If you have the IO capacity, you can set up multiple Splunk instances instead and get better performance on both search and indexing. If you don't have extra IO capacity, this doesn't help you anyway.

araitz · ‎09-08-2011

You absolutely should not have multiple indexers write to the same index. This is not a supported configuration, and in fact is explicitly recommended against on this thread and elsewhere. Your tests might show that it is possible for a short time and under certain conditions, but you risk serious data integrity problems and other unknown issues if you insist on taking this approach.

jamesoconnell · ‎09-08-2011

This answer is logical, but my tests have shown otherwise. I am writing to the same index (same directory and db) from multiple indexers without seeing any issues. Any recommendations on how to confirm this?

Regards, James.
