
Splunk Stream: How to configure the app to spread load across all indexers?

hemendralodhi
Contributor

Hello,

We are trying to set up the Splunk Stream app for NetFlow capture, but we are unsure how data is distributed to the indexers.

Because we expect a very large volume of data, and because of the limitations of the wire data modular input for Splunk Stream, we plan the following setup:

1) Install an independent Stream forwarder on a Linux machine.
2) Configure HTTP Event Collector (HEC) on the indexers to receive the data sent by the Stream forwarder.
3) Configure the flow collector in the streamfwd.conf file.

As per the documentation, the best practice for scaling flow ingestion is to use an independent Stream forwarder together with Nginx or another load balancer to distribute load across the indexer cluster (http://docs.splunk.com/Documentation/StreamApp/7.0.1/DeployStreamApp/ConfigureFlowcollector).

But since we are configuring HTTP Event Collector and distributing the [http://streamfwd] stanza to all the indexers (per the doc - http://docs.splunk.com/Documentation/StreamApp/7.0.1/DeployStreamApp/InstallStreamForwarderonindepen...), won't that already distribute load across all indexers? As I understand it, the token stanza will be present on every indexer in the cluster, so where and why is Nginx required?

Thanks
Hemendra

vshcherbakov_sp
Splunk Employee

Hello @hemendralodhi,
A load balancer is needed to scale event ingestion beyond what a single HEC endpoint can process. I don't have exact performance numbers yet, but as a ballpark, a single independent Stream forwarder instance can collect, process, and push up to roughly several TB/day of NetFlow data to the indexers. A single HEC-enabled indexer can't handle that load, which is why we recommend a load-balanced architecture.
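
As a rough illustration of the load-balanced layout, here is a minimal Python sketch that renders an Nginx fragment (for the `http` context of nginx.conf) spreading HEC traffic over several indexers. The host names, the `splunk_hec` upstream name, and the `render_nginx_hec_upstream` helper are hypothetical; it assumes HEC listens on port 8088 with SSL enabled on each indexer, and it relies on Nginx's default round-robin balancing. TLS certificates and health checks are left out.

```python
def render_nginx_hec_upstream(indexers, hec_port=8088, listen_port=8088,
                              scheme="https"):
    """Return an nginx fragment that load-balances HEC traffic.

    indexers: list of indexer host names (hypothetical in this sketch).
    scheme:   "https" assumes HEC has SSL enabled on the indexers.
    """
    if not indexers:
        raise ValueError("at least one indexer is required")

    # One "server" line per indexer; nginx round-robins across them by default.
    server_lines = "\n".join(
        f"    server {host}:{hec_port};" for host in indexers
    )
    return (
        "upstream splunk_hec {\n"
        f"{server_lines}\n"
        "}\n"
        "server {\n"
        f"    listen {listen_port};\n"
        "    location / {\n"
        f"        proxy_pass {scheme}://splunk_hec;\n"
        "    }\n"
        "}\n"
    )


if __name__ == "__main__":
    # Hypothetical indexer names, for illustration only.
    print(render_nginx_hec_upstream(["idx1.example.com", "idx2.example.com"]))
```

The Stream forwarder would then be pointed at the Nginx listener rather than at any individual indexer.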

hemendralodhi
Contributor

Thanks vshcherbakov for the response.

I am still trying to understand how the Stream forwarder sends data to the indexers. Since the Stream app will be installed on the search head, I believe we also have to install the app on the indexers and enable HEC there. Is that correct?

So data forwarding is based on the [http://streamfwd] stanza: to distribute load across all indexers, we have to copy the same stanza to every indexer and deploy Nginx. Stream forwarder --------> Nginx -------> indexer cluster. How will the Stream forwarder talk to Nginx?

Thanks
Hemendra

vshcherbakov_sp
Splunk Employee

The Stream forwarder takes its configuration from the search head and from its local config files.

You can configure the HEC endpoints in the Stream app under Configuration -> Distributed Forwarder Management -> "Edit Forwarder Group": uncheck the HEC autoconfig option first, then enter the Nginx endpoint in the Endpoint URLs text box.

The Stream app only needs to be installed on the search head, but the HEC config file it creates there should be replicated to the indexers so that they all have the same HEC token and settings.
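
To see why every indexer needs the same token, here is a small Python sketch of a manual smoke test against the load-balanced endpoint. It only builds the request; the Nginx URL, the token value, and the `build_hec_request` helper are hypothetical placeholders. The `/services/collector/event` path and the `Authorization: Splunk <token>` header are the standard HEC conventions, used here for a hand-made test event rather than as a description of what the Stream forwarder sends internally.

```python
import json
import urllib.request


def build_hec_request(endpoint_url, token, event, sourcetype=None):
    """Build (but do not send) a POST request for the HEC event endpoint.

    endpoint_url: base URL of the load balancer (hypothetical in this sketch).
    token:        the HEC token shared by all indexers.
    """
    url = endpoint_url.rstrip("/") + "/services/collector/event"
    payload = {"event": event}
    if sourcetype:
        payload["sourcetype"] = sourcetype

    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The load balancer may hand this request to any indexer, so the
            # same token must be valid on all of them.
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder endpoint and token, for illustration only.
    req = build_hec_request(
        "https://hec-lb.example.com:8088",
        "00000000-0000-0000-0000-000000000000",
        {"message": "hec smoke test"},
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

If the token were present on only some indexers, requests routed to the others would be rejected, which is why the HEC config is replicated to all of them.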

hemendralodhi
Contributor

Thanks again. It is very helpful. We will configure it accordingly.

aaraneta_splunk
Splunk Employee

@hemendralodhi - Did the answer provided by vshcherbakov help provide a working solution to your question? If yes, please don't forget to resolve this post by clicking "Accept". If no, please leave a comment with more feedback. Thanks!
