Linear memory growth on any Splunk instance configured to receive data on splunktcpin, tcpin and udpin ports.
The following config in server.conf fixes the memory growth:
[prometheus]
disabled = true
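For anyone who has to roll the workaround out to several instances, here is a minimal sketch of a helper that adds the stanza above to the text of a server.conf file. The function name `disable_prometheus` is made up for illustration, and it assumes a simple stanza/key layout; it is not an official Splunk tool.

```python
import re


def disable_prometheus(conf_text: str) -> str:
    """Return server.conf text with '[prometheus] disabled = true' applied.

    Hypothetical helper, not part of Splunk. Assumes plain
    '[stanza]' headers and 'key = value' lines.
    """
    out = []
    in_stanza = False  # True while we are inside [prometheus]
    seen = False       # True once a [prometheus] header was found

    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_stanza = stripped == "[prometheus]"
            out.append(line)
            if in_stanza:
                seen = True
                # Write the setting directly under the header.
                out.append("disabled = true")
            continue
        if in_stanza and re.match(r"disabled\s*=", stripped):
            # Drop any old value; it was replaced under the header.
            continue
        out.append(line)

    if not seen:
        # No [prometheus] stanza yet: append one at the end.
        if out and out[-1].strip():
            out.append("")
        out.extend(["[prometheus]", "disabled = true"])

    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(disable_prometheus("[general]\nserverName = hf01\n"), end="")
```

The result would normally go into a local copy such as $SPLUNK_HOME/etc/system/local/server.conf (that path is an assumption about your layout, not something stated in this thread) rather than the default file, and splunkd most likely needs a restart to pick it up.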
Although it is not documented, in the 9.3.x/9.2.x/9.1.x etc/system/default/server.conf you will find:
[prometheus]
disabled = true
I can confirm that it fixed the memory leak that we saw on our upgraded HF server.
This is not fixed in the newly released 9.4.1.
But what does prometheus do in Splunk? Is it some new function that was added to the 9.x server and was set to disabled? I cannot find any info in the server.conf docs.
It's support for https://prometheus.io/ that was added almost 4 years ago (8.2.0), but it has been disabled since 8.2.1 due to memory explosion.
4 years?????
And nothing has been done to fix it. This part should be removed from the code, then.
Here you see the memory on one of our HFs that was upgraded from 9.3.2 to 9.4. When all memory is used up, it runs for a few more hours and then dies. We reported this issue just a few days after 9.4.0 was released, and only got the fix just now.
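If you want a rough idea of how long an affected instance has left, the linear growth described in this thread can be extrapolated from a few memory readings. This is only a sketch: the function name, the sample format and the numbers are invented for illustration, and it assumes growth stays linear.

```python
def hours_until_exhaustion(samples, total_mb):
    """Estimate hours left until memory is full, assuming linear growth.

    samples:  list of (hours_since_first_reading, used_mb) pairs,
              oldest first (hypothetical format).
    total_mb: total memory available on the host.
    Returns None if memory usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must be taken at different times")

    # Least-squares slope: memory growth in MB per hour.
    slope = sum((t - mean_t) * (m - mean_m) for t, m in samples) / var_t
    if slope <= 0:
        return None

    _, last_used = samples[-1]
    return (total_mb - last_used) / slope


if __name__ == "__main__":
    # Made-up readings: 500 MB/hour growth on a host with 8000 MB.
    readings = [(0, 1000), (1, 1500), (2, 2000)]
    print(hours_until_exhaustion(readings, 8000))
```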
I noticed that https://docs.splunk.com/Documentation/Splunk/9.4.0/Admin/Serverconf does not mention prometheus.
Is this an undocumented feature that is getting disabled to prevent a memory leak issue?
Although it is not documented, in the 9.3.x/9.2.x/9.1.x etc/system/default/server.conf you will find:
[prometheus]
disabled = true
Is this issue likely to be fixed in an upcoming version release?
It's not fixed in upcoming releases.
However, the fix (whenever it becomes part of a release) will be the same as the workaround:
[prometheus]
disabled = true
I do not see any fix for this in the just-released 9.4.2, which came out months after the issue was discovered in 9.4.0.
There is a setting in the next beta for Splunk, so maybe it will come in 9.4.3.
It is also strange that this setting is not mentioned in the latest documentation:
https://docs.splunk.com/Documentation/Splunk/latest/Admin/Serverconf
[prometheus]
disabled = true