I am currently architecting our potential future Splunk deployment and I would like to implement Heavy Forwarders to take advantage of indexing and filtering prior to hitting the Indexers.
The idea is [Universal Forwarder] --> [Heavy Forwarder(s)] --> Indexers
We will have about 3,000 devices reporting to Splunk. Is there a recommended number of Heavy Forwarders per number of devices? Will just one suffice? What kind of hardware specs are we talking about for the Heavy Forwarders?
I've searched Splunkbase but couldn't find any info on this.
The reason you haven't found any info on this is that most people don't do it this way. IMO, the added expense and complexity don't pay off. BTW, I think you meant "parsing and filtering" on the HF.
Unless you are going to filter out 70% or more of the events, I don't think it is worthwhile to put in an extra tier using HF. Be careful that you don't create a single point of failure for your system.
As for the specs of a heavy forwarder, I would probably configure it like an indexer, but without the disk I/O requirement. The ratio of UF to HF probably depends on the rate of data flow. I would not expect a HF to process more than 100 GB a day of data in general.
Update: What @dwaddle said.
I would not expect a HF to process more than 100 GB a day of data in general.
What would that number be for a UF? We have a layer of two heavy forwarders receiving data before it hits the indexers, and we index about 1 TB a day. That 100 GB number, along with @dwaddle's suggestion, makes me think I should change this setup immediately.
The number is probably higher for a UF, but remember that you need to remove the default 256 KBps throughput limit on the UF (maxKBps in limits.conf)! And yes, following Duane's suggestions, you should evaluate your current setup carefully.
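For reference, here is a minimal sketch of that override, assuming it is deployed in limits.conf on each universal forwarder. The value 0 means "no limit"; whether to go unlimited or pick a specific ceiling is a judgment call for your network, so treat the number as a placeholder.

```
# limits.conf on the universal forwarder
# The UF default is maxKBps = 256 (kilobytes per second).
[thruput]
# 0 removes the cap entirely; a positive integer sets a new ceiling in KBps.
maxKBps = 0
```

The forwarder needs a restart for the change to take effect.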
[PS. Always take Duane's suggestions 😄 ]
- Noted 😄
I changed the throughput limit on my dedicated universal forwarder layer to about 20000 KBps. I realized this the hard way and now have to restructure my intermediate forwarder layer. I may remove it completely and just rely on indexer discovery.
Here was my thought process - Heavy > Light --> more logs can be handled
I haz the dumb.
You want "at least" as many HFs as you have indexers. Consider this thought exercise:
You have 3000 UFs and 6 indexers. On average, each indexer will be taking in connections from 500 UFs at (let's say) 500 EPS (1 event / sec / UF). Your indexing load is equally distributed across your indexers, and come search time your search will run with maximum parallel performance.
Now, let's put just two HFs in between. The HFs are each aggregating data from 1500 UFs (again on average) and passing it on to one indexer at a time. Now two indexers will each be indexing 1500 EPS, while the other 4 are indexing nothing. Assuming that tripling the EPS does not hammer the indexer entirely (and it very well might), you have certainly increased its load substantially. Also, you've upset the balance of data across indexers, which could make search response more erratic.
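The arithmetic in this thought exercise can be sketched in a few lines of Python. This is a simplified model, not a simulation of Splunk's load balancing: it assumes every sender targets exactly one indexer at any instant and that senders spread evenly across indexers. The function name and parameters are hypothetical, chosen only for illustration.

```python
def instantaneous_load(num_ufs, num_indexers, num_hfs=0, eps_per_uf=1):
    """Return events/sec arriving at each indexer at one instant.

    Assumption: each sender (UF or HF) talks to exactly one indexer at a
    time, and senders are spread round-robin across the indexers.
    """
    load = [0] * num_indexers
    if num_hfs == 0:
        # UFs connect straight to the indexers.
        senders = [eps_per_uf] * num_ufs
    else:
        # Each HF aggregates an equal share of the UFs
        # (any remainder from uneven division is ignored in this sketch).
        senders = [(num_ufs * eps_per_uf) // num_hfs] * num_hfs
    for i, eps in enumerate(senders):
        load[i % num_indexers] += eps
    return load


print(instantaneous_load(3000, 6))             # [500, 500, 500, 500, 500, 500]
print(instantaneous_load(3000, 6, num_hfs=2))  # [1500, 1500, 0, 0, 0, 0]
print(instantaneous_load(3000, 6, num_hfs=6))  # [500, 500, 500, 500, 500, 500]
```

The last call shows why "at least as many HFs as indexers" is the floor: only then can every indexer be receiving data at the same moment.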
Usually, this type of configuration is not necessary. If you're going to go down this route I would suggest a consult with ProServ to see what current best practices here are.
Whatever people say, we do UF -> UF (intermediate layer) -> Indexer. The core reason is that we don't want to expose the indexers to the outside world, for security reasons. Also, in our intermediate layer we increased the number of parallel ingestion pipelines to match the indexer count, thus simulating load balancing.
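As a rough sketch of what that pipeline setting might look like, assuming six indexers and that the intermediate forwarders have the CPU headroom to run that many pipeline sets (both are assumptions, not details from our actual configuration):

```
# server.conf on each intermediate forwarder
[general]
# One pipeline set per indexer; 6 is a placeholder for your indexer count.
parallelIngestionPipelines = 6
```

Each pipeline set consumes additional CPU and memory, so the value is worth testing against your hardware before rolling it out.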
Also, if you look closely at a clustered system, the indexing load is minuscule compared to concurrent searches and data model queries. So it is a question of optimising "indexing" versus "exposing indexers to every network context".
This seems (to me) to be of dubious actual security benefit. The code bases run by a UF and by an indexer are so substantially similar that this is like putting a second glass window in front of a glass window in order to keep rocks out.
In terms of your hypothetical risk profile of someone using a TCP input on a Splunk to buffer overflow the process (for example) and gain RCE against Splunk, you've not done much to reduce that risk. And, if an attacker can RCE your intermediate forwarder layer, the odds are incredibly good that they can use that same RCE against your indexers, just staged off of your IF layer.
But, if this makes sense in your security policy and it works for you, then great. My advice would be to make sure your intermediate layer is "wide enough" to remove any performance concerns.
Not always. There are less secure network contexts, outside internet-facing services, highly secure devices, etc. Not all of them should connect to the same indexer layer. We have a separate intermediate forwarder layer in each network context, thus completely isolating the indexers from network complexity.