Hello Experts
Please do not route me to Splunk PS or Partner help. I want to do this myself, but with help from you experts.
I have 1 HQ, 2 large main branches, and 100+ small branches. I want visibility across all the sites. What is the best design approach for this type of network? Data ingestion is approximately 200 GB/day in total across all sites (HQ + main sites + 100 branches).
Thanks
Hi @adamgibs,
I suppose you're asking about network segmentation and the Deployment Server (DS).
In this case, it depends on your network segmentation and security policies.
My suggestion is to put two HFs in each of your big branches to concentrate all the logs from its subnetworks; that way you have to open fewer firewall routes between subnetworks.
For the minor branches, you could open routes to the HFs in one of the other networks, which keeps the network segmentation in place.
As for the DS role: if your HFs have fewer than 50 clients in their subnetworks, you could use one of them as the DS; otherwise you should use a dedicated system.
You can configure one DS as the primary DS that sends apps to secondary DSs, each managing a subset of the clients, but it isn't an easy configuration and you need PS or at least an expert Splunk Architect to design and configure it!
I understand that you don't want to hear about PS or Partners, but log and security management isn't a joke and requires expert professionals. This is probably your managers' position; if they don't want to use them, they must at least train you or one of your colleagues on Splunk architecture, which (I repeat) isn't a joke.
Also, using a Partner is probably the cheaper solution for your company!
Ciao.
Giuseppe
A quick side note - you don't need an HF just to "concentrate" data from multiple UFs. Another "central" UF instance will work just fine. You can perfectly well route data UF->UF->HF/idx.
@adamgibs - Generally, deciding on the architecture depends on a lot of factors,
like data ingestion size, data restriction policies, whether you need disaster recovery, whether you need high availability, and so on.
Regardless, I'm assuming here:
* You don't need disaster recovery
* You need high availability at the indexer level but not at the search head level.
* You don't have any data restriction policies at the hardware level. Though you can create index level restrictions in Splunk.
SH | ES (if required) | DMC/License-Master
IDX | IDX | IDX | CM
big-site1: HF1, UF, UF, ...
big-site2: HF2, UF, UF, ...
Other sites: HF (only if necessary), UF, UF, UF, ...
(Assume arrows for data flowing from HFs/UFs -> IDXs -> SHs. I'm too lazy to draw diagrams.😀)
* The number of indexers depends on the search load
* Networking depends on your environment-level restrictions
Again, designing an architecture has a lot of ifs and buts, but this would be the simplest form.
--------
I hope this helps, if it does consider upvoting/accepting-answer!!!
Dears
If I divide the 100 small branches into 50 + 50, that means I would pick 2 small branches to host an HF each and have the other 98 branches send their logs to these 2 HFs in 2 separate branches.
OR
All 100 branches send to an HF at HQ, which will consume a lot of bandwidth on the WAN links. What is the alternative to this? I can't deploy more HFs, as that would increase the cost of hardware as well as licenses.
As Rick mentioned:
Another "central" UF instance will work just fine. You can perfectly well route data UF->UF->HF/idx.
I hadn't heard about a central UF. Can you share a link where I can read more about it and about what a central UF would give us? As far as I know, an HF requires an additional license; does a central UF require one as well?
I'm reading the link below for hints, in addition to the advice from you experts:
https://www.splunk.com/pdfs/technical-briefs/splunk-validated-architectures.pdf
thanks
Well, routing data through a UF is not the recommended solution (the recommended one is simply to send everything directly to the indexers, and that has its pros). But in some cases (like sending events from a site with very limited connectivity) it can be an answer to some limitations. You simply define a splunktcp input on your intermediate UF and receive events from the source UF. As simple as that.
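To make the UF->UF->idx chain concrete, here is a minimal sketch of the three pieces of configuration involved. The host names (`intermediate-uf.example.com`, `idx1.example.com`, `idx2.example.com`), the group names and port 9997 are placeholders invented for illustration, not values from this thread, so adjust them to your environment.

```ini
# File: inputs.conf on the intermediate UF
# Listen for data coming from other Splunk forwarders.
[splunktcp://9997]
disabled = 0

# File: outputs.conf on the intermediate UF
# Pass everything that was received on to the indexers.
[tcpout]
defaultGroup = indexers

[tcpout:indexers]
server = idx1.example.com:9997, idx2.example.com:9997

# File: outputs.conf on each source (branch) UF
# Send to the intermediate UF instead of directly to the indexers.
[tcpout]
defaultGroup = intermediate

[tcpout:intermediate]
server = intermediate-uf.example.com:9997
```

Note that with a single entry in the source UF's `server` list, that intermediate UF is the only path out of the branch.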
For an HF you don't need any additional license, meaning that you don't pay anything beyond the license you already purchased. With an on-prem installation you can just set it up as part of your local environment; it is then accounted for in the license master but doesn't use any licensing quota, since it's not indexing any events. With an HF installed for use with Splunk Cloud you just use a separate type of license, which allows forwarding but does not allow local indexing. But the HF license does not involve additional costs.
@adamgibs - I would agree with @PickleRick.
PS: Consider a complex architecture only if you have business requirements or hardware/network restrictions that call for it.
Dears
So if the intermediate HF or UF fails, the source UF will not know about it and will keep sending traffic to that UF or HF's IP address, whereas if the UF sends logs directly to the indexers we can specify redundancy, so that if one indexer fails the UF sends to another. Please correct me if I'm wrong.
You simply define a splunktcp input on your intermediate UF and receive events from the source UF. As simple as that.
Can you elaborate on the lines above? As I understand it, you are suggesting building an intermediate UF that receives events from the other UFs, so should I build the intermediate UF at HQ?
As Vatsal mentioned
So in my case I need to save bandwidth; Rick, wouldn't you recommend building an HF?
Awaiting your replies.
thanks
SPOF means that you have a single point which - in case of failure - renders a significant part of your infrastructure, not just this component, inoperable. With a setup of a single "aggregating forwarder", regardless of whether that is a UF or an HF, you have a SPOF, since an outage of that single forwarder means that you're not receiving logs from a significant part of your environment.
In general - the preferred approach is to send events directly from UFs to indexers. And the official advice will always be "don't use HFs unless you need to filter out a significant part of your events". There are some setups (business limitations, network limitations and so on) in which you'd need an intermediate forwarder. Sometimes a UF will be sufficient (when you're only forwarding data from other UFs); sometimes you'll need an HF (if you use apps with modular inputs). There are many factors that come into play here.
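As an illustration of the "filter out a significant part of your events" case, here is a hedged sketch of event filtering on an HF. It assumes a hypothetical log file `/var/log/app/app.log` in which DEBUG lines are not wanted; the path, the transform name and the regex are all made up for the example.

```ini
# File: props.conf on the HF
# Apply the transform below to events from this (hypothetical) source.
[source::/var/log/app/app.log]
TRANSFORMS-drop_debug = drop_debug_events

# File: transforms.conf on the HF
[drop_debug_events]
# Events matching this regex are routed to the nullQueue, i.e. discarded.
REGEX = \bDEBUG\b
DEST_KEY = queue
FORMAT = nullQueue
```

This kind of per-event filtering happens at parsing time, which a UF does not do for ordinary unstructured log data, and that is the usual reason for accepting an HF in the path.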
So one question is whether to do any "forwarder aggregation" or not and, if so, whether to use just one forwarder or maybe two for HA. Another is whether to use a UF, an HF, or maybe both.
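A rough sketch of the "two for HA" option: if both intermediate forwarders are listed in the same output group, the source UF load-balances between them and can keep sending to the remaining one when the other is unreachable. The host names and the group name below are hypothetical.

```ini
# File: outputs.conf on each source UF
[tcpout]
defaultGroup = intermediate_pair

[tcpout:intermediate_pair]
# Two receivers in one group: the forwarder load-balances across them.
server = int-fwd1.example.com:9997, int-fwd2.example.com:9997
# Switch to another receiver every 30 seconds (30 is also the default).
autoLBFrequency = 30
```

The same pattern applies when a UF sends directly to several indexers: list all of them in `server` and the forwarder spreads the data across whichever are reachable.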
But then again - these are all customised options and do not fit straight into any of the official Splunk-approved architectures. So either you do them on your own and take the risk that it won't work properly, or you go to your Partner/PS, who will assess your requirements, point out the possible dangers of this or that approach, and help you minimize them, taking your detailed environment into account. That's what Partners/PS are for 🙂
Single-point-of-failure
Why intermediate HF is not recommended, and if required use UF.
Dears
Your post made this very clear, and thank you for your patience in explaining it to me. I'm trying to learn as much as I can from your hints, so I went through these links:
https://docs.splunk.com/Documentation/Forwarder/8.2.5/Forwarder/Configureloadbalancing
https://docs.splunk.com/Documentation/Splunk/8.2.6/Admin/Outputsconf#outputs.conf.example
@Rick & Vatsal
Please correct me if I'm wrong.
@adamgibs - Just remember, an intermediate (central) UF is not a requirement from the Splunk side and is generally not recommended, as described previously.
Involve an intermediate UF only if there is a business/network/security requirement for it.