
Constant memory growth with Universal Forwarder UDP/TCP inputs and third-party forwarding enabled.

hrawat_splunk
Splunk Employee

Constant memory growth on the Universal Forwarder with ever-increasing channels.

 

1 Solution

hrawat_splunk
Splunk Employee

This issue applies only to the UF when sendCookedData=false.

You may want to check General forwarder memory growth.

Do you see a trend where channels are gradually increasing and one of the tcpout groups is set to sendCookedData=false?

Metrics.log entries pointing to channel growth:

INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12950
INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12955
INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12959
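If you want to chart that trend rather than eyeball individual lines, a search along these lines should work, assuming the UF forwards its _internal index to your indexers (<uf_host> is a placeholder for your forwarder's host name):

index=_internal host=<uf_host> source=*metrics.log* group=map name=pipelineinputchannel
| timechart span=10m max(current_size) AS pipeline_input_channels

A line that only ever climbs and never steps back down is the signature of channels not being freed.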


It's a known issue with the universal forwarder where some sources (monitor/udp/tcp) may emit the EOF marker late. This leaves the UF unable to free up channels. However, the same config (3rd party forwarding) on a HF is not an issue.
One way to test whether you are hitting the issue is to restart the 3rd party receiver. If UF memory drops immediately, apply the following workaround.
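If you'd rather watch splunkd memory directly while you restart the receiver, a resource-usage search such as the following is one option (a sketch that assumes the UF's introspection data reaches your indexers; <uf_host> is a placeholder):

index=_internal host=<uf_host> source=*resource_usage.log* component=PerProcess data.process=splunkd
| timechart span=5m max(data.mem_used) AS splunkd_mem_used

A sharp drop immediately after the receiver restart, rather than a gradual decline, points at the stuck-channel behavior described above.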

 
Until the issue is fixed in a new patch, use the following workaround.

Set the following config for the 3rd party tcpout group only.

forceTimebasedAutoLB=true

This setting force-closes connections and allows the channels to be consolidated, so memory is reclaimed every `autoLBFrequency` interval.
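Put together, the workaround stanza could look like the sketch below; the group name, target address, and the explicit autoLBFrequency value are placeholders to adapt to your environment:

[tcpout:thirdpartytcpout]
server = <receiver ip>:<receiver port>
sendCookedData = false
forceTimebasedAutoLB = true
autoLBFrequency = 30

Leave the tcpout group that sends cooked data to your Splunk indexers untouched; the workaround is only needed on the group with sendCookedData=false.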

Note: For all 8.2.x and older releases, forceTimebasedAutoLB works only if the total number of distinct, valid 3rd party target <ip address:port> combinations is greater than 1. If there is only one receiver, the forceTimebasedAutoLB setting is a no-op. Please don't add a dummy/non-existent <ip address:port> combination.

If your 3rd party receiver is on the same box as the UF, then you should be able to get more than one receiver by adding `127.0.0.1` to the `server` list.

[tcpout:thirdpartytcpout]
server=127.0.0.1:<target port>, <ip address of UF host>:<target port>
sendCookedData=false

For 9.x UFs, if the connectionsPerTarget setting is set to `auto` or a value greater than 1, then forceTimebasedAutoLB=true works even for single-receiver tcpout groups (see the example after the spec excerpt below).

connectionsPerTarget = [<integer>|auto]
* The maximum number of allowed outbound connections for each target IP address
  as resolved by DNS on the machine.
* A value of "auto" or < 1 means splunkd configures a value for connections for each
  target IP address. Depending on the number of IP addresses that DNS resolves,
  splunkd sets 'connectionsPerTarget' as follows:
  * If the number of resolved target IP addresses is greater than or equal to 10,
    'connectionsPerTarget' gets a value of 1.
  * If the number of resolved target IP addresses is greater than 5
    and less than 10, 'connectionsPerTarget' gets a value of 2.
  * If the number of resolved target IP addresses is greater than 3
    and less than or equal to 5, 'connectionsPerTarget' gets a value of 3.
  * If the number of resolved target IP addresses is less than or equal to 3,
    'connectionsPerTarget' gets a value of 4.
* Default: auto
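For example, on a 9.x UF pointing at a single 3rd party receiver, a stanza along these lines should let the time-based rotation (and the channel cleanup that comes with it) kick in; the group name and address are placeholders:

[tcpout:thirdpartytcpout]
server = <receiver ip>:<receiver port>
sendCookedData = false
forceTimebasedAutoLB = true
connectionsPerTarget = 2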

 


ravis_splunk
Splunk Employee

Reference: "If your 3rd party receiver is on the same box as the UF, then you should be able to get more than one receiver by adding `127.0.0.1` to the `server` list."

Question: If the 3rd party receiver is on the same box as the UF and the server list already has an entry for 127.0.0.1, as in

[tcpout:todisk]
server=127.0.0.1:10010

Then is the suggestion to add one more entry for IP 127.0.0.1 and a dummy port?

 


hrawat_splunk
Splunk Employee

No, there is no dummy/invalid IP address or port to use.

If the receiver is on the same localhost, one of the following applies:

If `server` already has 127.0.0.1, then as per the answer add the UF IP address.
If `server` already has the UF IP address, then as per the answer add 127.0.0.1.

[tcpout:thirdpartytcpout]
server=127.0.0.1:<target port>, <ip address of UF host>:<target port>

  
