Splunk Stream: How to configure the app to spread load across all indexers?

hemendralodhi
Contributor

Hello,

We are trying to set up the Splunk Stream app for NetFlow capture, but we are unsure how the data will be distributed to the indexers.

Because we expect a very large volume of data, and because of the limitations of the Wire Data modular input for Splunk Stream, we are planning the following setup:

1) Install an independent Stream forwarder on a Linux machine.
2) Configure the HTTP Event Collector (HEC) on the indexers to receive the data sent by the Stream forwarder.
3) Configure the flow collector in the streamfwd.conf file.

According to the documentation, the best practice for scaling flow ingestion is to use an independent Stream forwarder and to use Nginx or another load balancer to distribute the load across the indexer cluster (http://docs.splunk.com/Documentation/StreamApp/7.0.1/DeployStreamApp/ConfigureFlowcollector).

However, since we are configuring the HTTP Event Collector and distributing the [http://streamfwd] stanza to all the indexers (per the documentation: http://docs.splunk.com/Documentation/StreamApp/7.0.1/DeployStreamApp/InstallStreamForwarderonindepen...), won't that already distribute the load across all indexers? My understanding is that the token stanza will be present on every indexer in the cluster, so where and why is Nginx required?

Thanks
Hemendra

vshcherbakov_sp
Splunk Employee

Hello @hemendralodhi,
A load balancer is needed to scale event ingestion beyond what a single HEC endpoint can process. I don't have exact performance numbers (yet), but as a ballpark figure, a single instance of the independent Stream forwarder can collect, process, and push up to roughly several TB/day of NetFlow data to the indexers. A single HEC-enabled indexer can't handle that load, which is why we recommend a load-balanced architecture.
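
To make the reasoning concrete, here is a toy Python sketch of what a round-robin balancer does with incoming events. It is only an illustration, not how Nginx or Stream is implemented: the indexer names and the event count are made up, and real balancers may use other strategies.

```python
from collections import Counter
from itertools import cycle


def distribute(num_events, indexers):
    """Assign events to indexers in round-robin order and count the load on each."""
    targets = cycle(indexers)
    return Counter(next(targets) for _ in range(num_events))


# Hypothetical indexer names, for illustration only.
indexers = ["idx1", "idx2", "idx3"]

# Without a balancer, everything lands on one HEC endpoint.
single = distribute(9000, indexers[:1])

# With round-robin balancing, each indexer receives an equal share.
balanced = distribute(9000, indexers)

print("single endpoint:", dict(single))
print("load balanced:  ", dict(balanced))
```

The point is simply that the per-indexer load drops as HEC endpoints are added behind the balancer.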

hemendralodhi
Contributor

Thanks vshcherbakov for the response.

I am still trying to understand how the Stream forwarder sends data to the indexers. Since the Stream app will be installed on the search head, I believe we also have to install the app on the indexers and enable HEC there. Is that right?

So data forwarding is based on the [http://streamfwd] stanza. To distribute the load across all indexers, we have to copy the same stanza to every indexer and deploy Nginx: Stream forwarder --------> Nginx -------> indexer cluster. How will the Stream forwarder talk to Nginx?

Thanks
Hemendra

vshcherbakov_sp
Splunk Employee

The Stream forwarder takes its configuration from the search head (SH) and from local config files.

You can configure the HEC endpoints in the Stream app under Configuration -> Distributed Forwarder Management -> "Edit Forwarder Group": enter the Nginx endpoint in the Endpoint Urls text box (you'll need to uncheck the HEC autoconfig option first).

The Stream app only needs to be installed on the SH, but the HEC config file it creates there should be replicated to the indexers so that they all have the same HEC token, etc.
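
If you want to check by hand that the Nginx endpoint and the shared token work, a minimal Python sketch like the one below builds a test event for HEC. The Stream forwarder does this for you in normal operation. The host name, port, and token shown are placeholders I made up; the `/services/collector/event` path and the `Authorization: Splunk <token>` header are the standard HEC conventions.

```python
import json
import urllib.request


def build_hec_request(endpoint_url, token, event):
    """Build (but do not send) an HTTP POST carrying one event to a HEC endpoint."""
    url = endpoint_url.rstrip("/") + "/services/collector/event"
    body = json.dumps({"event": event}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: substitute your own Nginx endpoint and HEC token.
req = build_hec_request(
    "https://nginx-lb.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"msg": "hec test"},
)
print(req.get_method(), req.full_url)

# To actually send the event, uncomment the next line:
# urllib.request.urlopen(req, timeout=5)
```

Sending the same request several times through Nginx and then searching for the test event should show it arriving on different indexers, provided the token has been replicated to all of them.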

hemendralodhi
Contributor

Thanks again. It is very helpful. We will configure it accordingly.

aaraneta_splunk
Splunk Employee

@hemendralodhi - Did the answer provided by vshcherbakov help provide a working solution to your question? If yes, please don't forget to resolve this post by clicking "Accept". If no, please leave a comment with more feedback. Thanks!
