Deployment Architecture

Where does the Master node in an index cluster store bucket replication information?

jcspigler2010
Path Finder

Have a silly yet relevant question. I deal with some very inquisitive customers when architecting, deploying and configuring splunk. One question has come up that I can't quite answer...

FYI, I'm well aware how Splunk Index Clustering works, where the configs are, how to deploy, what happens when a node goes down etc etc...

However I have not been able to answer the question of, where does the Master node store the bucket (primary,copy,raw,etc) status, information for the entire cluster? That goes for both single site and multi site

I know in server.conf you specify the cluster parameters and configs, but where is the actual bucket information. It has to be stored somewhere considering a SH participating in an indexer cluster checks with the master node first WHERE it should replicate its config bundle to for searching.

Thanks!

0 Karma
1 Solution

lguinn2
Legend

The Cluster Master has an in-memory database of the status of all buckets in the cluster. However, this information is actually stored as part of the name of each bucket. Whenever the CM is restarted, it rebuilds its in-memory database by querying the indexers for their bucket information. I am sure there are lots of optimizations, etc. but this is the bare-bones "how it works."

If you want some good insights into indexer clustering, take a look at the presentations from .conf2016 - you can find them at http://conf.splunk.com. (Navigate to the session replays for 2016.) There are recordings and slides for at least 3 great presentations on clustering. Skip the intro talks and look for the ones with words like "internals", "performance" and "debugging." I'd link them directly here, but the .conf site doesn't work that way...

View solution in original post

jcspigler2010
Path Finder

Thanks guys both fantastic answers!

0 Karma

lguinn2
Legend

The Cluster Master has an in-memory database of the status of all buckets in the cluster. However, this information is actually stored as part of the name of each bucket. Whenever the CM is restarted, it rebuilds its in-memory database by querying the indexers for their bucket information. I am sure there are lots of optimizations, etc. but this is the bare-bones "how it works."

If you want some good insights into indexer clustering, take a look at the presentations from .conf2016 - you can find them at http://conf.splunk.com. (Navigate to the session replays for 2016.) There are recordings and slides for at least 3 great presentations on clustering. Skip the intro talks and look for the ones with words like "internals", "performance" and "debugging." I'd link them directly here, but the .conf site doesn't work that way...

View solution in original post

somesoni2
Revered Legend

IMO it's stored somewhere in memory or storage internal to Splunk instance. It's definitely not in any conf files. The basis of this is that when we master node fails and we migrate to a new cluster master, we only restore server.conf (which contains cluster configuration) and master-apps directory. Rest all will be collected again by new master.

0 Karma
Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.