Getting Data In

How can we improve universal forwarder performance?

dbcase
Motivator

Hi,

We have a proxy server where multiple log files get uploaded. The average is about 15 million events per day. Currently the server is processing approximately 3 million events per hour (server: 4 cores, 8 GB memory, VMware VM).

Is there a way to improve performance/multithread the forwarder?

We've tried enabling parallelIngestionPipelines = 2 in server.conf. This made the universal forwarder very unhappy (error message below).

We confirmed that [thruput] maxKBps = 0 is set in limits.conf.

Neither CPU nor memory is the bottleneck. We are using Splunk Cloud, and our Internet circuit is not saturated either.

Any other thoughts?

Error message

Checking conf files for problems...
    Invalid key in stanza [general] in /opt/splunkforwarder/etc/system/local/server.conf, line 19: parallelIngestionPipelines  (value:  2)
    Your indexes and inputs configurations are not internally consistent. For more information, run 'splunk btool check --debug'

mattymo
Splunk Employee

Here is what I run through on the forwarders (assuming your indexing layer is healthy and keeping the pipeline moving):

First, as you stated, you need to deal with thruput. Just make sure you edit it in the right limits.conf (in an app, in $SPLUNK_HOME/etc/system/local/, or in the SplunkUniversalForwarder app) and confirm the effective value with btool:

splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool limits list thruput --debug
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/limits.conf [thruput]
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/limits.conf maxKBps = 2048
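For reference, a minimal sketch of the override itself. The file location is an assumption (an app's local/ directory would work the same way); the stanza and key are the ones shown in the btool output above, and a value of 0 removes the thruput cap.

```ini
# $SPLUNK_HOME/etc/system/local/limits.conf
# (assumed location; an app's local/ directory also works)
[thruput]
# 0 = no limit on forwarder throughput
maxKBps = 0
```

After restarting the forwarder, the same btool command should list the file the value is actually being read from, which is a quick way to catch an override sitting in the wrong place.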

Then I usually give the parsingQueue some extra breathing room in server.conf, being careful to consider the host machine resources:

splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool server  list queue=parsingQueue --debug
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/server.conf [queue=parsingQueue]
/opt/splunkforwarder/etc/system/default/server.conf                        cntr_1_lookback_time = 60s
/opt/splunkforwarder/etc/system/default/server.conf                        cntr_2_lookback_time = 600s
/opt/splunkforwarder/etc/system/default/server.conf                        cntr_3_lookback_time = 900s
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/server.conf maxSize = 10MB
/opt/splunkforwarder/etc/system/default/server.conf                        sampling_interval = 1s
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ 
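A sketch of what the larger queue could look like in a local server.conf. The 32MB figure is purely illustrative and not a recommendation from this thread; pick a size the host's memory can absorb.

```ini
# $SPLUNK_HOME/etc/system/local/server.conf (assumed location)
[queue=parsingQueue]
# Illustrative value only; size it against the memory the host can spare
maxSize = 32MB
```

Keep in mind that each ingestion pipeline gets its own set of queues, so the memory cost is multiplied if a second pipeline is added later.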

While tweaking these settings I keep a close eye on metrics.log, by either grepping /opt/splunkforwarder/var/log/splunk/metrics.log or just using the search head to search:

index=_internal source=*metrics.log  host="n00b-splkufwd-01" group=tcpout_connections

index=_internal source=*metrics.log  host="n00b-splkufwd-01" blocked=true

I also watch how much bandwidth actually moves on the wire, which plays a role here, i.e. how much data a single TCP connection to one of your indexers can carry. You may want to baseline some single-socket TCP connections just to find out what you are working with.

Then, if I still need more juice (and the host machine can afford it), I add the extra pipeline:

splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool server list --debug | grep parallelIngestionPipelines
/opt/splunkforwarder/etc/system/local/server.conf                          parallelIngestionPipelines = 2

Also, if you monitor many files, be sure to raise max_fd to 300 in limits.conf, and make sure the ulimits for the user running Splunk are raised as well.
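A sketch of the file-descriptor setting, assuming a local limits.conf. In the limits.conf spec the key is spelled max_fd and lives under the [inputproc] stanza, with a default of 100.

```ini
# $SPLUNK_HOME/etc/system/local/limits.conf (assumed location)
[inputproc]
# Default is 100; 300 is the value suggested above
max_fd = 300
```

To check the OS side, running ulimit -n as the user that owns the splunkd process shows the current open-file limit, which should sit comfortably above max_fd.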

See the note about file systems here:

http://docs.splunk.com/Documentation/Splunk/6.5.2/Installation/Systemrequirements

- MattyMo

somesoni2
Revered Legend

The attribute parallelIngestionPipelines was introduced in Splunk 6.3. What version of the Splunk Universal Forwarder do you have?

Also, try some of the Linux TCP tuning settings. I've seen them improve my heavy forwarder (HF) performance a lot.

http://www.linux-admins.net/2010/09/linux-tcp-tuning.html

dbcase
Motivator

We are running version 6.5 of the forwarder.

mattymo
Splunk Employee

Line 19 seems low in the file. Can you show us the server.conf?

I configured it like this with no issue on 6.4.1:

[general]
serverName = n00b-splkufwd-01
pass4SymmKey = <redacted>

parallelIngestionPipelines = 2
- MattyMo