Constant memory growth on a Universal Forwarder (UF) with an ever-increasing number of channels.
Once the third-party receiver is restarted, the UF re-sends a lot of duplicate data and frees the channels.
This issue applies only to a UF with sendCookedData=false.
You may also want to check General forwarder memory growth.
Do you see a trend where the channel count gradually increases and one of the tcpout groups is set to sendCookedData=false?
metrics.log entries that point to channel growth:
INFO Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12950
INFO Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12955
INFO Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12959
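To check for the trend described above without reading the log by eye, a small script can pull the `current_size` values out of metrics.log and test whether they only go up. This is a minimal sketch, not a Splunk tool: the function names `channel_sizes` and `is_steadily_growing` and the `min_samples` threshold are hypothetical, and the regular expression assumes the line format shown in the sample entries.

```python
import re

# Matches the pipelineinputchannel lines shown above; other metrics lines are ignored.
CHANNEL_RE = re.compile(
    r"group=map,.*\bname=pipelineinputchannel,.*\bcurrent_size=(\d+)"
)


def channel_sizes(lines):
    """Return current_size values, in log order, from pipelineinputchannel entries."""
    sizes = []
    for line in lines:
        match = CHANNEL_RE.search(line)
        if match:
            sizes.append(int(match.group(1)))
    return sizes


def is_steadily_growing(sizes, min_samples=3):
    """True if there are enough samples, the size never drops, and it ends higher."""
    if len(sizes) < min_samples:
        return False
    never_drops = all(b >= a for a, b in zip(sizes, sizes[1:]))
    return never_drops and sizes[-1] > sizes[0]


# The three sample entries from this post.
sample = [
    "INFO Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12950",
    "INFO Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12955",
    "INFO Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12959",
]
sizes = channel_sizes(sample)
print(sizes, is_steadily_growing(sizes))
```

To run it against a real forwarder, pass an open file handle for metrics.log to `channel_sizes` instead of the `sample` list. A steady climb over hours, rather than a rise and fall, is the pattern to look for.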
It's a known issue with the universal forwarder: some sources (monitor/udp/tcp) may emit the EOF marker late, which leaves the UF unable to free channels. The same configuration (3rd party forwarding) on an HF does not have this issue.
One way to test whether you are hitting the issue is to restart the 3rd party receiver. If UF memory drops immediately, apply the following workaround.
Until the issue is fixed by a new patch, set the following config for the 3rd party tcpout group only.
forceTimebasedAutoLB=true
This setting forces connections to close, which allows channels to be consolidated. Memory is thus reclaimed every `autoLBFrequency` interval.
Note: For all 8.2.x and older releases, forceTimebasedAutoLB works only if the total number of distinct, valid 3rd party target <ip address:port> combinations is > 1. If there is only one receiver, the forceTimebasedAutoLB setting is a no-op. Please don't add a dummy/non-existent <ip address:port> combination.
If your 3rd party receiver is on the same box as the UF, you should be able to get > 1 receivers by adding `127.0.0.1` to the `server` list.
[tcpout:thirdpartytcpout]
server=127.0.0.1:<target port>, <ip address of UF host>:<target port>
sendCookedData=false
For 9.x UFs, if the connectionsPerTarget setting is `auto` or > 1, forceTimebasedAutoLB=true works for single-receiver tcpout groups.
connectionsPerTarget = [<integer>|auto]
* The maximum number of allowed outbound connections for each target IP address as resolved by DNS on the machine.
* A value of "auto" or < 1 means splunkd configures a value for connections for each target IP address. Depending on the number of IP addresses that DNS resolves, splunkd sets 'connectionsPerTarget' as follows:
  * If the number of resolved target IP addresses is greater than or equal to 10, 'connectionsPerTarget' gets a value of 1.
  * If the number of resolved target IP addresses is greater than 5 and less than 10, 'connectionsPerTarget' gets a value of 2.
  * If the number of resolved target IP addresses is greater than 3 and less than or equal to 5, 'connectionsPerTarget' gets a value of 3.
  * If the number of resolved target IP addresses is less than or equal to 3, 'connectionsPerTarget' gets a value of 4.
* Default: auto
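The `auto` rules quoted above can be written out as a small lookup to see what value a given tcpout group would end up with. This is only an illustration of the quoted text, not splunkd's actual code: the function name `auto_connections_per_target` is hypothetical, and the third rule is read as covering 4 and 5 resolved addresses.

```python
def auto_connections_per_target(resolved_ips):
    """Value of connectionsPerTarget under 'auto', per the rules quoted above.

    resolved_ips is the number of target IP addresses that DNS resolves.
    """
    if resolved_ips < 1:
        raise ValueError("at least one resolved target IP address is required")
    if resolved_ips >= 10:
        return 1
    if resolved_ips > 5:   # 6 to 9 addresses
        return 2
    if resolved_ips > 3:   # 4 or 5 addresses (assumed reading of the third rule)
        return 3
    return 4               # 1 to 3 addresses


for count in (1, 3, 4, 5, 6, 9, 10):
    print(count, auto_connections_per_target(count))
```

Under this reading, a group with a single receiver that resolves to one IP address gets 4 connections with `auto`, which would be consistent with forceTimebasedAutoLB=true taking effect for single-receiver groups on 9.x.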
Reference: If your 3rd party receiver is on the same box as the UF, you should be able to get > 1 receivers by adding `127.0.0.1` to the `server` list.
Question: If the 3rd party receiver is on the same box as the UF and the server list already has an entry for 127.0.0.1, as in
[tcpout:todisk]
server=127.0.0.1:10010
is the suggestion to add one more entry for IP 127.0.0.1 with a dummy port?
No, do not use a dummy/invalid IP address or port.
If the receiver is on the same localhost, one of the following applies:
If `server` already has 127.0.0.1, then as per the answer add the UF IP address.
If `server` already has the UF IP address, then as per the answer add 127.0.0.1.
[tcpout:thirdpartytcpout]
server=127.0.0.1:<target port>, <ip address of UF host>:<target port>
The issue is fixed in 8.2.5 and above.