Universal Forwarder vs Heavy Forwarder

ansif
Motivator

Hi All,

Is there any recent test, conf discussion, or doc about the 2016 Splunk blog post linked below?

https://www.splunk.com/en_us/blog/tips-and-tricks/universal-or-heavy-that-is-the-question.html

Is it still 6 times lower with UF?

woodcock
Esteemed Legend

Yes, the answer is still completely correct. Do not use an intermediate forwarder tier at all. Always go straight to the Indexers from UF.

Richfez
SplunkTrust

"6 times lower" was a specific answer with a specific data set, but yes, that answer is still generally correct.

Indeed, as time moves forward, there are even fewer reasons to use an HF over a UF. You can keep searching for more references on that if you want, but I think that information is so easy to find now that I'd rather not put information that may go stale in here.

All in all, the argument is a very strong one - if you do not need an HF for any of the reasons outlined in the blog you mention, then do not use an HF but instead use the UF.

Happy Splunking!
-Rich

ansif
Motivator

Is it OK to have n servers send directly to the indexers rather than through a heavy forwarder?

I'd like to know in terms of communication: with 6k servers worldwide sending data to indexers in one DC, is it OK to use UFs instead of HFs?

Richfez
SplunkTrust

Generally, yes.

In fact, read that blog again - it's not just "OK" to use the UF, but it's actually better to use the UF. Better in almost all cases.

More data to support this:
The first bullet here in "5 splunk myths".
The second paragraph here for UF vs. HF for collecting SFTP'ed Cisco CUCM data
And that's just what I found in a few seconds on a search engine. I'm pretty sure there are talks at conf.splunk.com on this topic too, and I know absolutely that there are a lot of talks where this is mentioned, even if it's not the talk's "topic".

For your particular question, it is better for data distribution to the indexers to have 6k UFs talking directly to the indexers than 6k UFs -> 4 HFs -> 2 IDX. Two points here:

1) That's the minimum "recommended" number of HFs for that layer, and it isn't really enough IMO. The N*2 ratio of HFs to IDX only really holds for larger counts, like a dozen indexers and 2 dozen HFs. With these little installs of two indexers, the chance of all 4 HFs pointing at one single indexer is fairly high, and bad things may happen!

2) I wonder if you aren't going to have indexing problems regardless of how you arrange those 6k clients against only 2 IDX, with or without HFs, just because of general load. So I suggest rolling out in a staged manner and periodically confirming you don't need to add indexers. I think you may be fine, but just take it slow, ok?
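To put a rough number on the first point, here is a back-of-the-envelope sketch in Python. It assumes each HF independently picks one indexer uniformly at random at a given moment, which is a simplification of how forwarder load balancing really behaves, and the function name is made up for illustration.

```python
from fractions import Fraction


def prob_all_on_one_indexer(num_indexers: int, num_hfs: int) -> Fraction:
    """Chance that every HF is sending to the same indexer at one moment.

    Assumption (not Splunk's actual algorithm): each HF independently
    picks one indexer uniformly at random.
    """
    if num_indexers < 1 or num_hfs < 1:
        raise ValueError("need at least one indexer and one HF")
    # Choose which indexer receives everything (num_indexers ways),
    # then every HF must pick that indexer: (1 / num_indexers) ** num_hfs.
    return num_indexers * Fraction(1, num_indexers) ** num_hfs


# Small install from the question vs. the larger layout mentioned above.
for idx, hfs in [(2, 4), (12, 24)]:
    p = prob_all_on_one_indexer(idx, hfs)
    print(f"{idx} indexers, {hfs} HFs: {float(p):.3g}")
# Under this model, 2 indexers and 4 HFs gives exactly 1/8.
```

Under that simplified model, the 2 IDX / 4 HF layout has a 1 in 8 chance of everything landing on one indexer at any given moment, while with a dozen indexers and two dozen HFs the same event is vanishingly unlikely.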

It's also better for the performance of the end clients to run the UF on them. If your plan was to keep UFs on the clients and hop through an HF layer, then the end clients are the same either way, but you've introduced a bottleneck in that HF layer. In that case, also see the next point.

Better for resource utilization - why add 4 or 8 HFs for little to no reason? Use the money instead to buy another indexer or two! Your data and users will both appreciate it!

Better for manageability. Better for security. Also better for reliability - if one UF goes out, that's one thing, but if one of those concentrating HFs goes down, suddenly you have problems with a capital P.

The caveats, or "when I may need an HF":

You may require an HF for certain types of data (like DB Connect), for certain types of complex per-event routing of data to different indexer clusters, or in the rare case where you can drop a lot of the data at the HF via regex. Even at a 2:1 drop ratio (getting rid of half the events in individual files) it's not generally worth it. And this isn't about collecting only some Windows event codes and not others, or reading only certain log files - the UF can do those fine. We're only talking about dropping individual events out of individual files using REGEX to filter to nullQueue (and a few similar cases).
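To make "dropping individual events out of individual files" concrete, here is a rough Python sketch of the idea. This is not Splunk configuration: in Splunk the filtering is done at parse time by a regex transform that routes matching events to nullQueue. The function name and the sample log lines are made up for illustration.

```python
import re


def drop_matching_events(events, drop_pattern):
    """Keep only events that do NOT match drop_pattern; the rest are discarded."""
    drop_re = re.compile(drop_pattern)
    return [event for event in events if not drop_re.search(event)]


# Hypothetical events, all read from the same file.
sample_events = [
    "2024-01-01 12:00:00 DEBUG cache warmed",
    "2024-01-01 12:00:01 ERROR disk full",
    "2024-01-01 12:00:02 DEBUG heartbeat",
    "2024-01-01 12:00:03 INFO user login",
]

# Drop every DEBUG event; everything else is kept for indexing.
kept = drop_matching_events(sample_events, r"\bDEBUG\b")
print(f"kept {len(kept)} of {len(sample_events)} events")
```

Half the sample events are dropped here, which is the 2:1 ratio mentioned above. The point is that the decision is made per event by inspecting its content, which is different from choosing which files or event codes to collect in the first place.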

If none of those apply, then use the UF. The exceptions may not come up at all in your environment, but if they do, just stand up a specific HF for that particular need (like DB Connect, which we've always run in a VM as sort of a tertiary HF box doing nothing but DB Connect). If you implement the UF and can't figure out how to do what you need, ask in a new question here in Answers. If we can't solve your problem with a UF, then maybe that one is a candidate for an HF.

So, use the UF in all cases where you can. Only use the HF if you actually have to use the HF. And if you don't know, try the UF first.

I hope that's clear. I know I sound like a broken record, but HFs were the thing to do about 8 years ago. Ever since then they have been slowly losing prominence, and they are no longer recommended except in those specific cases. Folks are hanging on to the idea that HFs are better, but outside the few cases I mentioned where they're required, they typically are not.

-Rich
