Constant memory growth with Universal Forwarder UDP/TCP inputs and third-party forwarding enabled

hrawat_splunk
Splunk Employee

Universal Forwarder memory grows constantly as the number of channels keeps increasing.

Once the third-party receiver is restarted, the UF re-sends a lot of duplicate data and frees the channels.


hrawat_splunk
Splunk Employee

This issue applies only to a universal forwarder (UF) where sendCookedData=false.

You may also want to check "General forwarder memory growth".

Do you see a trend where channels gradually increase and one of the tcpout groups is set to sendCookedData=false?

metrics.log entries showing channel growth:
INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12950

INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12955

INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12959
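If you want to check for this trend without reading the log by eye, a small script along these lines may help. It is only a sketch: the function names and the "never shrinks and ends higher" heuristic are my own assumptions, not anything Splunk ships, and the sample lines are the three entries quoted above.

```python
import re

# Matches the pipelineinputchannel map metric and captures the pipe number and current_size.
CHANNEL_RE = re.compile(
    r"group=map,\s*ingest_pipe=(\d+),\s*name=pipelineinputchannel,\s*current_size=(\d+)"
)

def channel_sizes(lines, pipe=0):
    """Return the current_size values, in log order, for one ingest pipeline."""
    sizes = []
    for line in lines:
        match = CHANNEL_RE.search(line)
        if match and int(match.group(1)) == pipe:
            sizes.append(int(match.group(2)))
    return sizes

def looks_like_steady_growth(sizes):
    """Heuristic (an assumption, not a Splunk rule): never shrinks and ends higher than it started."""
    if len(sizes) < 2:
        return False
    never_shrinks = all(later >= earlier for earlier, later in zip(sizes, sizes[1:]))
    return never_shrinks and sizes[-1] > sizes[0]

# The three metrics.log entries quoted in this post.
sample = [
    "INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12950",
    "INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12955",
    "INFO  Metrics - group=map, ingest_pipe=0, name=pipelineinputchannel, current_size=12959",
]

sizes = channel_sizes(sample)
print(sizes, looks_like_steady_growth(sizes))
```

To run it against a real forwarder, pass the lines of its metrics.log instead of `sample`. A longer time window gives a more trustworthy answer than three data points.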


This is a known issue with the universal forwarder: some sources (monitor/udp/tcp) may emit the EOF marker late, which leaves the UF unable to free channels. The same configuration (third-party forwarding) on a heavy forwarder (HF) is not affected.
One way to test whether you are hitting the issue is to restart the third-party receiver. If UF memory drops immediately, apply the following workaround.

Until a patch fixes the issue, use the following workaround.

Set the following only in the third-party tcpout group:

forceTimebasedAutoLB=true

This setting forces the UF to close connections periodically, which lets it consolidate channels, so memory is reclaimed at every `autoLBFrequency` interval.

Note: For all 8.2.x and older releases, forceTimebasedAutoLB works only if the tcpout group has more than one distinct, valid third-party target <ip address:port> combination. With only one receiver, the forceTimebasedAutoLB setting is a no-op. Please don't add a dummy or nonexistent <ip address:port> combination.

If your third-party receiver is on the same host as the UF, you should be able to get more than one receiver by adding `127.0.0.1` to the `server` list.

[tcpout:thirdpartytcpout]
server=127.0.0.1:<target port>, <ip address of UF host>:<target port>
sendCookedData=false
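
To sanity-check a stanza against the two conditions above (forceTimebasedAutoLB enabled, and more than one distinct target on 8.2.x and older), something like the following could be used. Treat it as a rough sketch: `workaround_findings` is a hypothetical helper, the port 10010 and the address 10.0.0.5 are made-up placeholders for the target port and the UF host IP, and it only recognizes the literal values `true` and `false`, which is a simplification.

```python
import configparser

def workaround_findings(conf_text, stanza):
    """Return a list of gaps in the workaround for one tcpout stanza (empty list = looks fine)."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep setting names case-sensitive, as written in the .conf file
    parser.read_string(conf_text)
    group = parser[stanza]

    # The issue in this post only applies to groups that send uncooked data.
    if group.get("sendCookedData", "true").strip().lower() != "false":
        return []

    findings = []
    if group.get("forceTimebasedAutoLB", "false").strip().lower() != "true":
        findings.append("forceTimebasedAutoLB is not true")

    servers = {s.strip() for s in group.get("server", "").split(",") if s.strip()}
    if len(servers) < 2:
        findings.append("fewer than two distinct server entries (matters on 8.2.x and older)")
    return findings

# Hypothetical example values: 10010 stands in for <target port>, 10.0.0.5 for the UF host IP.
SAMPLE = """
[tcpout:thirdpartytcpout]
server=127.0.0.1:10010, 10.0.0.5:10010
sendCookedData=false
forceTimebasedAutoLB=true
"""

print(workaround_findings(SAMPLE, "tcpout:thirdpartytcpout"))
```

The check cannot tell whether the listed targets are real receivers, so it does not replace the advice against dummy entries.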

For 9.x UFs, forceTimebasedAutoLB=true also works for single-receiver tcpout groups if the connectionsPerTarget setting is `auto` or greater than 1.

connectionsPerTarget = [<integer>|auto]
* The maximum number of allowed outbound connections for each target IP address
  as resolved by DNS on the machine.
* A value of "auto" or < 1 means splunkd configures a value for connections for each
  target IP address. Depending on the number of IP addresses that DNS resolves,
  splunkd sets 'connectionsPerTarget' as follows:
  * If the number of resolved target IP addresses is greater than or equal to 10,
    'connectionsPerTarget' gets a value of 1.
  * If the number of resolved target IP addresses is greater than 5
    and less than 10, 'connectionsPerTarget' gets a value of 2.
  * If the number of resolved target IP addresses is greater than 3
    and less than or equal to 5, 'connectionsPerTarget' gets a value of 3.
  * If the number of resolved target IP addresses is less than or equal to 3,
    'connectionsPerTarget' gets a value of 4.
* Default: auto
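
The `auto` rules above can be written out as a small function, which makes it easier to see what a given target count resolves to. This is only an illustration of the quoted spec text, not splunkd's actual code: the function name is hypothetical, and it reads the third rule as "greater than 3 and less than or equal to 5", the only reading consistent with the other three rules.

```python
def auto_connections_per_target(resolved_ips: int) -> int:
    """Value that 'auto' would pick, per the spec text quoted above (illustrative only)."""
    if resolved_ips >= 10:
        return 1
    if resolved_ips > 5:   # 6 through 9
        return 2
    if resolved_ips > 3:   # 4 or 5 (assumed reading of the third rule)
        return 3
    return 4               # 3 or fewer

for count in (1, 3, 4, 5, 6, 9, 10):
    print(count, auto_connections_per_target(count))
```

Under this reading, a single-receiver group left at the default `auto` ends up with a value greater than 1, which matches the statement that forceTimebasedAutoLB=true works for single-receiver groups on 9.x.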

ravis_splunk
Splunk Employee

Reference: the suggestion that, if the third-party receiver is on the same host as the UF, you can get more than one receiver by adding `127.0.0.1` to the `server` list.

Question: If the third-party receiver is on the same host as the UF and the server list already has an entry for 127.0.0.1, as in

[tcpout:todisk]
server=127.0.0.1:10010

is the suggestion then to add one more entry for 127.0.0.1 with a dummy port?


hrawat_splunk
Splunk Employee

No, do not use a dummy or invalid IP address or port.

If the receiver is on the same host as the UF, one of the following applies:

If `server` already has 127.0.0.1, add the UF's IP address, as described in the answer.
If `server` already has the UF's IP address, add 127.0.0.1, as described in the answer.

[tcpout:thirdpartytcpout]
server=127.0.0.1:<target port>, <ip address of UF host>:<target port>

  


hrawat_splunk
Splunk Employee

The issue is fixed in 8.2.5 and later.
