Hi there,
I'm currently running a UF and an HF on one box, which is also a syslog collector. The HF needs to be there because some feeds require apps to pre-process data. This isn't the ideal intermediate layer, but it has to be this way for now. I have 2 questions around this:
1. Would it be better to remove the UF and just send all events (even the feeds that don't need pre-processing) through the HF, or should I use the UF for all other feeds and run both forwarders on the same server?
2. What would be the ideal server spec if parsing 2 TB of data per day through the HF tier?
Thanks!
Hi,
Since you are already leaning toward an intermediate forwarder (IMF) layer, it is much better to install a universal forwarder on each syslog server, have those UFs forward their data to the HFs for parsing, and then have the HFs pass it on to the indexers to be written to disk. With that in mind, here are a few resources that should help:
https://docs.splunk.com/Documentation/Splunk/8.0.4/Forwarding/Forwarderdeploymenttopologies
https://www.splunk.com/en_us/blog/tips-and-tricks/using-syslog-ng-with-splunk.html
1. It depends on your use case. Running both concurrently on one box can lead to resource contention, especially if a lot of parsing is expected on that one HF (poor load balancing, etc.).
Instead, if possible, separate them as explained above. Have a UF on each syslog server send to the HFs for parsing; those HFs can also parse data from other UFs. Or do it the other way round, whichever fits your use case.
2. Please have a look here: https://docs.splunk.com/Documentation/Splunk/8.0.4/Capacity/Referencehardware
We have it layered so that load is spread across quite a few HFs that do all the parsing before we index any data. So 2 TB is not much at all if you're distributing that load, and you can get away with mid-range or standard-spec servers in the tier your IMFs send to. However, if you are sending all that data to only 1 or 2 HFs (depending on the UF-to-HF ratio), you would definitely need to consider much beefier, high-spec servers.
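To put rough numbers on that, here is a small back-of-the-envelope sketch in Python. It assumes decimal units (1 TB = 10^12 bytes), a perfectly even spread across the HFs, and a flat ingest rate over 24 hours. Real traffic has peaks, so treat the results as averages only, not sizing guidance. The function name and the HF counts are just made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def per_hf_throughput_mb_s(daily_tb: float, hf_count: int) -> float:
    """Average MB/s each HF would handle.

    Assumes decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes) and an
    even distribution of data across all HFs. Peaks are not modelled.
    """
    if hf_count < 1:
        raise ValueError("hf_count must be at least 1")
    total_mb_per_s = daily_tb * 1e12 / SECONDS_PER_DAY / 1e6
    return total_mb_per_s / hf_count


if __name__ == "__main__":
    # Hypothetical HF counts, purely to show how the load divides.
    for hf_count in (1, 2, 4, 8):
        rate = per_hf_throughput_mb_s(2.0, hf_count)
        print(f"{hf_count} HF(s): about {rate:.1f} MB/s each on average")
```

The average per-HF rate falls in direct proportion to the number of HFs, which is why the spec you need depends so much on how many forwarders share the load.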
Hope this helps!
thanks!