Hi,
We have a proxy server where multiple log files get uploaded, averaging about 15 million events per day. Currently the server is processing approximately 3 million events per hour (4 cores, 8 GB memory, VMware VM).
Is there a way to improve performance/multithread the forwarder?
We've tried enabling parallelIngestionPipelines = 2 in server.conf (this made the universal forwarder very unhappy; error message below).
We confirmed that [thruput] maxKBps = 0 in limits.conf
Neither CPU nor memory is the bottleneck. We are using Splunk Cloud, and our Internet circuit is not saturated either.
Any other thoughts?
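For reference, this is roughly what we have in place (server.conf sits under /opt/splunkforwarder/etc/system/local/ per the error below; the limits.conf shown is just illustrative of the setting, wherever it actually lives):

# limits.conf
[thruput]
maxKBps = 0

# server.conf
[general]
parallelIngestionPipelines = 2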
Error message
Checking conf files for problems...
Invalid key in stanza [general] in /opt/splunkforwarder/etc/system/local/server.conf, line 19: parallelIngestionPipelines (value: 2)
Your indexes and inputs configurations are not internally consistent. For more information, run 'splunk btool check --debug'
Here is what I run through on the forwarders (assuming your indexing layer is healthy and keeping the pipeline moving).
First, as you stated, you need to deal with thruput. Just make sure you edit it in the right limits.conf (either in an app, $SPLUNK_HOME/etc/system/local/, or the SplunkUniversalForwarder app) and confirm with btool:
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool limits list thruput --debug
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/limits.conf [thruput]
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/limits.conf maxKBps = 2048
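If the cap is still coming from the default app like above, a local override is one way to lift it (a sketch; 0 means unlimited, or pick a concrete KB/s value if you need to protect the network):

# /opt/splunkforwarder/etc/system/local/limits.conf
[thruput]
maxKBps = 0

Re-run the btool command above afterwards; system/local takes precedence over the app's default, so it should now show your value.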
Then I usually give the parsingQueue some extra breathing room in server.conf, being careful to consider the host machine resources:
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool server list queue=parsingQueue --debug
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/server.conf [queue=parsingQueue]
/opt/splunkforwarder/etc/system/default/server.conf cntr_1_lookback_time = 60s
/opt/splunkforwarder/etc/system/default/server.conf cntr_2_lookback_time = 600s
/opt/splunkforwarder/etc/system/default/server.conf cntr_3_lookback_time = 900s
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/server.conf maxSize = 10MB
/opt/splunkforwarder/etc/system/default/server.conf sampling_interval = 1s
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$
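A sketch of what that bump can look like (20MB is only an illustration; the UF default above is 10MB, so raise it gradually and keep an eye on memory):

# /opt/splunkforwarder/etc/system/local/server.conf
[queue=parsingQueue]
maxSize = 20MB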
While tweaking these settings I keep a close eye on metrics.log, either by grepping /opt/splunkforwarder/var/log/splunk/metrics.log or by searching from the search head:
index=_internal source=*metrics.log host="n00b-splkufwd-01" group=tcpout_connections
index=_internal source=*metrics.log host="n00b-splkufwd-01" blocked=true
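A rough on-box equivalent of those searches, if you would rather grep than use the search head:

grep "group=tcpout_connections" /opt/splunkforwarder/var/log/splunk/metrics.log | tail -20
grep "blocked=true" /opt/splunkforwarder/var/log/splunk/metrics.log | tail -20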
I also keep an eye on how much bandwidth can actually be moved on the wire, which plays a role here, i.e. how much data a single TCP connection to one of your indexers can move. You may want to baseline some single-socket TCP connections just to find out what you are working with.
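One way to get that baseline is a single-stream test between the forwarder and something on the same network path as your indexers (a sketch assuming iperf3 is available and you can stand up a listener; iperf3 does not talk to Splunk itself):

# on a test receiver near the indexer/Cloud path
iperf3 -s

# on the forwarder: one TCP stream for 30 seconds
iperf3 -c <test-receiver> -P 1 -t 30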
Then, if I still need more juice (and the host machine can afford it), I add the extra pipeline:
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool server list --debug | grep parallelIngestionPipelines
/opt/splunkforwarder/etc/system/local/server.conf parallelIngestionPipelines = 2
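In conf-file terms that is just the one key in the [general] stanza of server.conf, the same stanza the original error message points at:

# /opt/splunkforwarder/etc/system/local/server.conf
[general]
parallelIngestionPipelines = 2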
Also, if you monitor many files, be sure to raise max_fd (for example, to 300) in limits.conf, and make sure the ulimits for the user running Splunk are tuned up as well.
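A sketch of both pieces, using the splunker user from the prompts above (values are illustrative; max_fd lives in the [inputproc] stanza of limits.conf):

# /opt/splunkforwarder/etc/system/local/limits.conf
[inputproc]
max_fd = 300

# OS side: check and raise the open-file limit for the Splunk user
ulimit -n

# /etc/security/limits.conf (example values)
splunker soft nofile 65536
splunker hard nofile 65536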
See the note about file systems here:
http://docs.splunk.com/Documentation/Splunk/6.5.2/Installation/Systemrequirements
The attribute parallelIngestionPipelines was introduced in Splunk 6.3. What version of the Splunk Universal Forwarder do you have?
Also, try some Linux TCP tuning settings; I've seen them improve my HF performance a lot.
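A sketch of the kind of kernel settings meant (values are illustrative; test against your own RTT to Splunk Cloud before rolling them out):

# /etc/sysctl.d/90-splunk-tcp.conf
# bigger socket buffers help a single TCP stream over a higher-latency path
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# window scaling is normally on already, but confirm it
net.ipv4.tcp_window_scaling = 1

# apply without a reboot
sudo sysctl --system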
We are running version 6.5 of the forwarder.
Line 19 seems low in the file... can you show us the server.conf?
I configured it like this with no issues on 6.4.1:
[general]
serverName = n00b-splkufwd-01
pass4SymmKey = <redacted>
parallelIngestionPipelines = 2