Using a Heavy Forwarder vs Standalone Indexer at Remote Site for a Clustered Main Environment

davisb221 · ‎08-28-2020

We currently use a single-site cluster for our main environment. We need to receive data from a remote site into our cluster but still maintain searchability if the connection between the main site and the remote site is severed (intentionally or not). The remote site has fewer than 50 assets and may ingest as little as 3 GB a day.

Converting our single-site cluster to a multi-site cluster is not possible.

Should we use a heavy forwarder in a "store and forward" configuration at the remote site to forward data to the cluster? I believe we would then have some search capability at the remote site if the connection fails.

Or

Should we put a single Indexer at the remote site and have it connected as a non-clustered search peer of our single search head that exists in our main environment? In this configuration we should also have local search capability if the connection fails.
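For reference, a minimal sketch of how the remote indexer might be listed as a non-clustered search peer on the main-site search head. The hostname is a placeholder, not a value from our environment, and this assumes the default management port 8089. As far as I know, a peer added by editing the file directly also needs the search head's trust key distributed to it by hand, whereas adding the peer through Settings or the CLI handles authentication for you.

```ini
# distsearch.conf on the main-site search head
# (remote-idx.example.com is a hypothetical hostname)
[distributedSearch]
servers = https://remote-idx.example.com:8089
```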

I'm leaning toward the HF route, or may even end up going with a non-distributed instance that gets backed up daily.

What do you think?

isoutamo · ‎08-28-2020

Hi

If you really need to search at the remote site even when the connection is down, then the best option is a multisite cluster (even though you said that you cannot do that). If you put a single indexer there, remember that the data it ingests counts toward your license (unlike with a multisite cluster), because you must "replicate" the data to both this indexer and your main indexer.

My proposal is to set up an HF (or several) at the secondary site that just forwards data to the primary site (perhaps with persistent queues defined). Of course, that means you cannot search if the connection is lost. Actually, the situation is the same even if you set up additional indexers at the remote site, because searches are run by the main site's SH, which cannot reach that IDX when the connection between the sites is lost.

Basically this is a business decision: how important are those searches? If they are mandatory, then set up a multisite cluster or replicate all of this site's data to separate indexers. In that case you must also set up an additional SH at the secondary site (I don't like the idea of using that IDX / "HF" as an SH too) to run queries when the connection is down.
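As a rough sketch of the persistent queue idea, assuming the HF receives data on a network input: the port and sizes below are placeholders to adjust, not recommendations. Persistent queues apply to network and scripted inputs; they are not available for file monitor inputs.

```ini
# inputs.conf on the remote-site HF
# (port 5514 and the sizes are hypothetical values)
[tcp://5514]
# in-memory queue that sits in front of the persistent queue
queueSize = 1MB
# on-disk buffer used once the in-memory queue is full
persistentQueueSize = 10GB
```

Size the on-disk queue for the longest outage you want to ride out. At roughly 3 GB a day, 10GB would cover about three days of buffered data.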

r. Ismo

davisb221 · ‎08-31-2020

Ingesting data twice isn't a good option for us either, so a store-and-forward configuration on an HF is now out of the question. I could see the individual IDX (non-clustered search peer) at the remote site working. If the main site SH goes down, we can perform local searches on the remote standalone IDX.

This site also has separate AD, DNS, etc. so it is a completely isolated environment. Maybe a single-stack non-distributed instance would work best here and if we decide we need HA we can stand up a distributed instance.

gcusello · ‎09-01-2020

Hi @davisb221,

Just one detail: you're talking about a separate, non-clustered indexer. This means that you don't send the data to the cluster, so it isn't available for searches on the cluster.

If you instead want the data on both the separate indexer and the cluster, you have to index the data twice (as with store and forward on an HF).

The only way to avoid indexing twice is to extend the cluster.

Ciao.

Giuseppe

gcusello · ‎08-28-2020

Hi @davisb221,

You can configure your Heavy Forwarder to store a local copy of the data for searches, but these logs are counted twice in license consumption.

You can do it in the GUI: [Settings -- Forwarding and Receiving -- Forwarding Defaults].
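The same thing can be sketched in outputs.conf on the HF. The group name and indexer hostnames below are placeholders, and 9997 is assumed as the receiving port. The `indexAndForward` setting goes in the top-level `[tcpout]` stanza.

```ini
# outputs.conf on the remote-site HF
# (main_site_indexers and the example.com hosts are hypothetical names)
[tcpout]
defaultGroup = main_site_indexers
# keep a local, searchable copy in addition to forwarding
indexAndForward = true

[tcpout:main_site_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
```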

Ciao.

Giuseppe
