Hi,
We have a proxy server where multiple log files get uploaded, averaging about 15 million events per day. Currently the server is processing approximately 3 million events per hour (4 cores, 8 GB memory, VMware VM).
Is there a way to improve performance/multithread the forwarder?
We've tried enabling parallelIngestionPipelines = 2 in server.conf (this made the universal forwarder very unhappy; error message below).
We confirmed that [thruput] maxKBps = 0 in limits.conf
Neither CPU nor memory is the bottleneck. We are using Splunk Cloud, and our Internet circuit is not saturated either.
Any other thoughts?
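For reference, this is roughly what we have in place (server.conf sits under /opt/splunkforwarder/etc/system/local/ per the error below; the limits.conf shown is just illustrative of the setting, wherever it actually lives):

# limits.conf
[thruput]
maxKBps = 0

# server.conf
[general]
parallelIngestionPipelines = 2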
Error message
Checking conf files for problems...
Invalid key in stanza [general] in /opt/splunkforwarder/etc/system/local/server.conf, line 19: parallelIngestionPipelines (value: 2)
Your indexes and inputs configurations are not internally consistent. For more information, run 'splunk btool check --debug'
Here is what I run through on the forwarders (assuming your indexing layer is healthy and keeping the pipeline moving).
First, as you stated, you need to deal with thruput. Just make sure you edit it in the right limits.conf (either in an app, $SPLUNK_HOME/etc/system/local/, or the SplunkUniversalForwarder app) and confirm with btool:
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool limits list thruput --debug
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/limits.conf [thruput]
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/limits.conf maxKBps = 2048
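If the cap is still coming from the default app like above, a local override is one way to lift it (a sketch; 0 means unlimited, or pick a concrete KB/s value if you need to protect the network):

# /opt/splunkforwarder/etc/system/local/limits.conf
[thruput]
maxKBps = 0

Re-run the btool command above afterwards; system/local takes precedence over the app's default, so it should now show your value.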
Then I usually give the parsingQueue some extra breathing room in server.conf, being careful to consider the host machine resources:
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool server list queue=parsingQueue --debug
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/server.conf [queue=parsingQueue]
/opt/splunkforwarder/etc/system/default/server.conf cntr_1_lookback_time = 60s
/opt/splunkforwarder/etc/system/default/server.conf cntr_2_lookback_time = 600s
/opt/splunkforwarder/etc/system/default/server.conf cntr_3_lookback_time = 900s
/opt/splunkforwarder/etc/apps/SplunkUniversalForwarder/default/server.conf maxSize = 10MB
/opt/splunkforwarder/etc/system/default/server.conf sampling_interval = 1s
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$
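A sketch of what that bump can look like (20MB is only an illustration; the UF default above is 10MB, so raise it gradually and keep an eye on memory):

# /opt/splunkforwarder/etc/system/local/server.conf
[queue=parsingQueue]
maxSize = 20MB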
While tweaking these settings I keep a close eye on metrics.log, either by grepping /opt/splunkforwarder/var/log/splunk/metrics.log or by searching from the search head:
index=_internal source=*metrics.log host="n00b-splkufwd-01" group=tcpout_connections
index=_internal source=*metrics.log host="n00b-splkufwd-01" blocked=true
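A rough on-box equivalent of those searches, if you would rather grep than use the search head:

grep "group=tcpout_connections" /opt/splunkforwarder/var/log/splunk/metrics.log | tail -20
grep "blocked=true" /opt/splunkforwarder/var/log/splunk/metrics.log | tail -20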
I also keep an eye on how much bandwidth can actually be moved on the wire, which plays a role here, i.e. how much data a single TCP connection to one of your indexers can move. You may want to baseline some single-socket TCP connections just to find out what you are working with.
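One way to get that baseline is a single-stream test between the forwarder and something on the same network path as your indexers (a sketch assuming iperf3 is available and you can stand up a listener; iperf3 does not talk to Splunk itself):

# on a test receiver near the indexer/Cloud path
iperf3 -s

# on the forwarder: one TCP stream for 30 seconds
iperf3 -c <test-receiver> -P 1 -t 30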
Then, if I still need more juice (and the host machine can afford it), I add the extra pipeline:
splunker@n00b-splkufwd-01:/opt/splunkforwarder/bin$ ./splunk btool server list --debug | grep parallelIngestionPipelines
/opt/splunkforwarder/etc/system/local/server.conf parallelIngestionPipelines = 2
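In conf-file terms that is just the one key in the [general] stanza of server.conf, the same stanza the original error message points at:

# /opt/splunkforwarder/etc/system/local/server.conf
[general]
parallelIngestionPipelines = 2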
Also, if you monitor many files, be sure to raise max_fd (for example, to 300) in limits.conf, and make sure the ulimits for the user running Splunk are tuned up as well.
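A sketch of both pieces, using the splunker user from the prompts above (values are illustrative; max_fd lives in the [inputproc] stanza of limits.conf):

# /opt/splunkforwarder/etc/system/local/limits.conf
[inputproc]
max_fd = 300

# OS side: check and raise the open-file limit for the Splunk user
ulimit -n

# /etc/security/limits.conf (example values)
splunker soft nofile 65536
splunker hard nofile 65536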
See the note about file systems here:
http://docs.splunk.com/Documentation/Splunk/6.5.2/Installation/Systemrequirements
The attribute parallelIngestionPipelines was introduced in Splunk 6.3. What version of the Splunk Universal Forwarder do you have?
Also, try some Linux TCP tuning settings; I've seen them improve my HF performance a lot.
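A sketch of the kind of kernel settings meant (values are illustrative; test against your own RTT to Splunk Cloud before rolling them out):

# /etc/sysctl.d/90-splunk-tcp.conf
# bigger socket buffers help a single TCP stream over a higher-latency path
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# window scaling is normally on already, but confirm it
net.ipv4.tcp_window_scaling = 1

# apply without a reboot
sudo sysctl --system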
We are running version 6.5 of the forwarder.
Line 19 seems low in the file... can you show us the server.conf?
I configured it like this with no issues on 6.4.1:
[general]
serverName = n00b-splkufwd-01
pass4SymmKey = <redacted>
parallelIngestionPipelines = 2