Installation

Splunk 5.0 consuming all memory. Is it possible to downgrade from 5.0 to 4.3.4, or is it time to whip out the backup?

larryrosen
Explorer

Since our upgrade to 5.0, Splunk is no longer functional. It consumes all memory on the server until you can't even RDP to the box (Windows) or open the web interface. Short of restoring the last known good backup, are there any options to go back to the working 4.3.4?

1 Solution

hexx
Splunk Employee

Hello,

We have determined that in Splunk 5.0, active UDP inputs cause the main splunkd process to leak memory. The rate of this memory leak appears to be proportional to the rate of data that is being received on the UDP input(s). For that reason, it is possible for a very active UDP input to cause splunkd to eventually exhaust all available memory on the host.

The bug that references this behavior is SPL-58075 and has been added to the list of known issues for Splunk 5.0.

We are actively working towards the release of a fix to this issue in the next few days.

In the meantime, there are four possible work-arounds that we can propose:

  • Install a 4.3.4 universal forwarder on the same machine that is currently receiving the UDP traffic and migrate the UDP inputs to that instance. The universal forwarder should be configured to send all incoming data to the indexer on the same host.

  • Where possible, customers should switch from sending data via UDP to sending it via TCP, as this helps reduce potential data loss and is more in line with best practices for sending network data to Splunk. Be advised that this input does not perform syslogd event augmentation (e.g., timestamp and hostname prepending).

  • Schedule a restart of the impacted instance(s) at regular intervals to prevent memory exhaustion. This solution is not strongly recommended as it can introduce data loss and kick users out of the system.

  • Disable UDP inputs. This solution is not recommended since all data from the sending host(s) will be lost.
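For the first two work-arounds above, the configuration changes are small. The following is a hedged sketch, not taken from this thread: the port numbers, paths, and group name are illustrative examples, and the indexer must already be configured to receive on the forwarding port.

```
# inputs.conf on the 4.3.4 universal forwarder (example UDP port 514)
[udp://514]
sourcetype = syslog

# outputs.conf on the same forwarder, sending everything to the
# indexer running on the same host (9997 is a conventional
# receiving port -- enable receiving on the indexer first)
[tcpout]
defaultGroup = local_indexer

[tcpout:local_indexer]
server = 127.0.0.1:9997

# inputs.conf on the indexer, if instead switching senders from
# UDP to TCP: the udp:// stanza simply becomes a tcp:// stanza
[tcp://514]
sourcetype = syslog
```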


gkanapathy
Splunk Employee
Splunk Employee

Another option would be to enable UDP listening via a syslog agent on the server.

Configure a syslog agent (e.g., sysklogd, rsyslog, syslog-ng) that writes the data to a local file, then have the Splunk instance monitor the file rather than listen on UDP. You would also want to set up rotation and deletion of the syslog files. This option may be more straightforward on Unix-type systems than on Windows.
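A minimal sketch of that approach, assuming rsyslog on the Splunk host; the port and file path are illustrative, not values from this thread:

```
# /etc/rsyslog.conf -- listen on UDP 514 and write events to a local file
$ModLoad imudp
$UDPServerRun 514
*.* /var/log/remote-syslog.log

# inputs.conf on the Splunk instance -- monitor the file instead of
# listening on UDP directly, avoiding the leaking splunkd UDP input
[monitor:///var/log/remote-syslog.log]
sourcetype = syslog
```

Pair this with a logrotate policy on /var/log/remote-syslog.log so the file does not grow without bound.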

kphillipson
Path Finder

Awesome Hexx,
I have been using a scheduled task that restarts Splunk every 2 hours. I'm going to use your suggestion of using a forwarder instead. Next time I'll wait until the X.1 release of a major version comes out.


kphillipson
Path Finder

Hummmm, I'm having the same issue. Splunkd.exe will grow to >20GB in a few hours. Splunk is running on Windows 2008 R2 on a UCS physical blade, standalone. Just upgraded to 5.0 last Thursday. I have a support case open. I'll share what I find out.


larryrosen
Explorer

Windows 2008 R2, all on one server. I can't get to the logs until our Windows team reboots the server so I can get in and stop the splunkd service from eating all the memory and hanging the box.

tskinnerivsec
Contributor

Can you describe your splunk deployment? Is everything installed on one server? Do you have a distributed deployment? Are you running splunk on linux or windows servers? Do you have a sample of the splunkd logs after the upgrade showing specific errors?
