Getting Data In

Should I use a heavy forwarder or an indexer cluster for my particular scenario?

Path Finder

Hi,

I'm hoping for some advice as I'm trying to understand the best way to configure Splunk components in the scenario below.

I have two Datacentres (DC) that operate as Active / Passive. Datacentre A (DCA) will be the active DC running all services and within it I will have a few hundred Windows machines with Universal Forwarders installed.

My current plan is to create an Indexer cluster consisting of two Indexers; not to share load but allow increased processing. There will then be a single standalone Search Head and a single cluster Master instance giving me a total of 4 separate machines in DCA.
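
For reference, a minimal sketch of how this layout might look in server.conf. The hostname, the replication port 9887, the shared secret placeholder, and the replication and search factors of 2 are all assumptions for illustration, not values from my environment. Newer Splunk releases rename master/slave to manager/peer, so check the spec for the version in use.

```ini
# server.conf on the cluster master (hypothetical host cm.dca.example.com)
[clustering]
mode = master
# Assumed values: with two indexers, each bucket is kept on both peers.
replication_factor = 2
search_factor = 2
pass4SymmKey = <shared-secret>

# server.conf on each of the two indexers (cluster peers)
# Port the peers use to replicate buckets to each other (arbitrary choice).
[replication_port://9887]

[clustering]
mode = slave
# 8089 is the default management port.
master_uri = https://cm.dca.example.com:8089
pass4SymmKey = <shared-secret>
```

The standalone Search Head would point at the same master with mode = searchhead in its own [clustering] stanza.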

I understand this is the first way to start scaling out, so in the future it would be easy to add more Indexers or move to a Search Head cluster if required. I think given the volume I am expecting to process I would be following a Splunk 'Small Enterprise' deployment.

The first bit I am unclear on is around forwarding from this cluster. If I wanted the Indexer cluster in DCA to forward data on to a 3rd party SOC, for example, is that possible? I think where I'm getting confused is having read that an Indexer that forwards is actually a 'Heavy Forwarder', not an Indexer. Can an Indexer cluster forward too?

If this is possible, it answers my second question. I want to mirror the DCA setup in a branch office that might have a poor link. If the link went down, could the Splunk Indexer cluster be configured to continue processing data locally and forward it onto DCA when it was back online?

Originally, I was thinking I would just use a Heavy Forwarder in a branch office, but that was because it seemed to me like Indexer clusters could not forward data.

I'm just not sure if I need a Heavy Forwarder or an Indexer cluster for this setup. I assume you can't cluster Heavy Forwarders so there would be processing constraints there?

Many thanks!

M


Super Champion

It might help if you post a diagram of both your options so we can understand them fully. In the meantime, a few tips:
- I would never run an active-passive setup for Splunk. Instead, I would run active-active across DCA and DCB and put Indexer clusters in both with a replication factor of 2.
- You should NOT need any more indexers, just normal Universal Forwarders or Heavy Forwarders to collect data and forward it to the Indexer clusters in DCA and DCB.
- Indexers can forward data to a 3rd party, but make sure your hardware has the capacity to cope with the extra work.
- Be careful when forwarding to a 3rd party. I would just send syslog out from your cluster to a location and NOT send directly. If you send directly, you have to worry about the 3rd party's uptime and so on. (In summary, move away from depending on the 3rd party.)
- For your branch network, I would use just normal Universal Forwarders and forward to your DCA/DCB cluster. If the link goes down, the UFs pause and start sending again once the link is back.
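
As a rough sketch of the last tip, an outputs.conf on a branch-office Universal Forwarder might look like the following. The hostnames, the group name dc_indexers, and the 50MB queue size are placeholders chosen for illustration. Port 9997 is only the conventional receiving port and assumes the indexers have a matching [splunktcp://9997] input.

```ini
# outputs.conf on a branch-office Universal Forwarder
[tcpout]
defaultGroup = dc_indexers

[tcpout:dc_indexers]
# The forwarder load-balances across every indexer listed here.
server = idx1.dca.example.com:9997, idx2.dca.example.com:9997
# Indexer acknowledgment: data that is not acknowledged is resent,
# which helps avoid loss when the link drops mid-transfer.
useACK = true
# Output queue used to hold data while the indexers are unreachable.
maxQueueSize = 50MB
```

When the queue fills, the forwarder stops reading its inputs. For monitored files it should pick up where it left off once the link returns, but inputs that cannot be paused (for example UDP) may lose data during a long outage.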


Path Finder

Thanks koshyk,

To clarify, DCA will contain an active set of Splunk services that are replicated to the secondary. Without going into details, the whole solution will just come up and work in DC2 in the event of failure.

I suppose what I'm interested in is whether Splunk Enterprise Indexer clusters can forward, and what the benefit of doing that would be over having Heavy Forwarders configured. I understand a Heavy Forwarder is a Splunk Enterprise instance that can do everything except distributed search. By that I understand the built-in Search Head could not search other Indexers elsewhere, presumably because a Heavy Forwarder is designed to run as an 'all in one' component. Is that correct?

Can you cluster Heavy Forwarders?

Noted on the 3rd party dependency. I would implement a syslog server in this case.
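
A minimal sketch of what that syslog output might look like in outputs.conf, assuming it runs on a full Splunk Enterprise instance such as a Heavy Forwarder (a Universal Forwarder cannot send syslog). The host syslog.example.com and the group name soc_syslog are hypothetical.

```ini
# outputs.conf on a heavy forwarder
[syslog]
defaultGroup = soc_syslog

[syslog:soc_syslog]
# Local syslog server that sits between Splunk and the 3rd party.
server = syslog.example.com:514
# udp is the default; tcp is also accepted.
type = udp
```

The syslog server would then relay to the SOC, so an outage on the 3rd party side does not back up into Splunk.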

Thanks for your help,
M
