I have 2 sites configured with a multisite cluster.
Site 1: Indexer, manager, independent search head.
Site 2: Indexer, independent search head.
I am looking for a way to have all users' searches and lookups synchronized to both search heads, and it looks like a search head cluster is the way to go.
I am having a little trouble grasping the cluster+affinity and how it applies to a search cluster.
My bandwidth between sites is quite low, and I do NOT want a site1 searchhead searching site2 and vice versa.
If I leave affinity active and run a search at site1, does the search take place ONLY on the site1 searchhead against the site1 peer?
Or is the search distributed across the 2 sites, even though it is only using the site1 index as the index to search?
I have no need for a 3rd search head. I don't have dedicated hardware, I don't need more resources, etc.
But, I know that the search cluster needs 3 searchheads.
Can I add a second searchhead at site1 to be the third, but leave it severely underpowered? A small dual core VM whose only purpose is to provide a 3rd member.
If I do this and perform a search at site1, will it attempt to search using both the strong and weak search head and slow me down?
What it comes down to is that I do not need the full "cluster", I just need to keep my user data replicated. It seems that the cluster is the best way to accomplish this, even though it adds complexity.
I believe you can configure search affinity as you described. However, a search would then only see data in site1 if it executed in site1, and only data in site2 if it executed in site2.
An SHC is much more involved than what it sounds like you’re willing to sign up for.
I recommend that you write an rsync script that syncs the following:
Everything in local folders, plus the local.meta files, under $SPLUNK_HOME/etc/apps
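To make that selection concrete, here is a minimal Python sketch that lists the files the rule above would pick up. The function name `syncable_files` is hypothetical, and it assumes the apps directory is reachable as an ordinary filesystem path. It only picks files; it does not copy anything.

```python
from pathlib import Path


def syncable_files(apps_dir):
    """Yield paths, relative to apps_dir, of every file that sits inside a
    directory named 'local' or is itself named 'local.meta'."""
    apps_dir = Path(apps_dir)
    for path in sorted(apps_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(apps_dir)
        in_local_dir = "local" in rel.parts[:-1]  # any parent folder called 'local'
        if in_local_dir or rel.name == "local.meta":
            yield rel
```

Calling `list(syncable_files(r"C:\Program Files\Splunk\etc\apps"))` (path shown only as an example) would give the relative paths to hand to whatever copy tool you settle on.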
It’s up to you how you want to do the diff. I recommend SSH keys and an rsync script that checks the file modification dates on both servers and copies the newest over the oldest. Set that up in cron to run every x minutes and you’re good to go.
At the end of your script you can hit the debug/refresh endpoint, which is as close to restarting Splunk as you can get without actually restarting it.
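The "newest wins" idea can be sketched in a few lines of Python, which also runs on Windows. This is only an illustration, not a tested sync tool: `sync_newest` and its `tolerance` parameter are hypothetical names, and it assumes both trees are reachable as filesystem paths (for example over a mounted share) rather than over SSH.

```python
import shutil
from pathlib import Path


def sync_newest(dir_a, dir_b, rel_paths, tolerance=1.0):
    """For each relative path, copy the more recently modified file over the
    older (or missing) one. Returns a list of (relative path, direction)."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    actions = []
    for rel in rel_paths:
        a, b = dir_a / rel, dir_b / rel
        a_time = a.stat().st_mtime if a.exists() else None
        b_time = b.stat().st_mtime if b.exists() else None
        if a_time is None and b_time is None:
            continue  # present on neither side, nothing to do
        if b_time is None or (a_time is not None and a_time - b_time > tolerance):
            src, dst, direction = a, b, "a->b"
        elif a_time is None or b_time - a_time > tolerance:
            src, dst, direction = b, a, "b->a"
        else:
            continue  # timestamps match within tolerance
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 keeps the modification time
        actions.append((str(rel), direction))
    return actions
```

Two limits to keep in mind: a file deleted on one side gets copied back from the other, and if the same file is edited on both servers between runs, the older edit is lost.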
At this point, adding something like rsync and a Windows equivalent of SSH is a bit more involved. Maybe DFS. I will have to look into that.
It looks like search head pooling was more of what I needed, but that seems to be gone.
How about the hardware differences on a 3rd search node?
You also need a search head cluster deployer, and possibly a network load balancer, to fully enable SHC.
It’s not OK to use lower specs, although it is possible. Support would tell you to upgrade the hardware before they’d help much. You’d also have to make sure the underpowered search head never runs scheduled searches by making it what is known as an ad hoc search head, so that the captain doesn’t assign scheduled searches to it.
See this document:
Furthermore, you really need at least 4 SHs plus the SHC deployer if you want true HA. I know that’s not what you’re looking for, but it’s important to understand how the SHC uses Raft to elect a captain. The election requires an odd number of votes in order to determine a winner. If you have 3 and the captain goes down, only 2 are left to vote, so no election occurs. That’s when scheduled searches are no longer guaranteed to execute. So when using Splunk as a “tier 0” application where alerting is extremely important, you need 4 SHs so that at any given time the captain can crash and a new captain can be elected.
Now there’s even more discussion to be had if you want to consider HA with DR and zero MTR: you’d need 4 SHC members in each location so that your main DC can go down and searches are still good.
We have few users (fewer than 10), and no critical alerting is needed. Scheduling is quite minimal, and availability/load balancing is not a huge concern. I am far more concerned about getting and storing data than I am about searching it, and my indexers are a much higher priority. Some downtime while a search head is rebuilt is not going to hurt.
That would leave me with:
The deployer I plan on running in tandem with my deployment server, as the load is fairly light.
The 2 primary searchheads (1 per site) will be the powerful dedicated hardware.
The 3rd searchhead will be a placeholder only.
And all of this so that I can share searches between 2 servers.
I will look into rsync a bit more, as it seems much more in line with the minimal functionality I need. Unfortunately the native Windows robocopy doesn't have two-way sync capability, so it looks like third-party software will be necessary if I go that route.
Writing your own solution would require a bit of experience, but not too much time.
Maybe someone has already done it and written a blog post! Best of luck! Let us know what you end up with.