We have a situation where a Splunk forwarder abruptly dies on one of our servers about once a day.
Upon further investigation, we found the following in /var/log/messages:
root@intelsat [/var/log]# cat messages | grep splunkd
Jul 12 06:25:01 intelsat kernel: [30182] 0 30182 706988 480205 1 0 0 splunkd
Jul 12 06:25:01 intelsat kernel: [30183] 0 30183 13200 92 1 -17 -1000 splunkd
Jul 12 06:25:01 intelsat kernel: Out of memory: Kill process 30182 (splunkd) score 260 or sacrifice child
Jul 12 06:25:01 intelsat kernel: Killed process 30182 (splunkd) total-vm:2827952kB, anon-rss:1920820kB, file-rss:0kB
Jul 12 06:25:01 intelsat kernel: splunkd invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
Jul 12 06:25:01 intelsat kernel: splunkd cpuset=/ mems_allowed=0
Jul 12 06:25:01 intelsat kernel: Pid: 30201, comm: splunkd Not tainted 3.2.13-grsec-xxxx-grs-ipv6-64 #1
Jul 12 06:25:01 intelsat kernel: [30201] 0 30182 706988 480567 3 0 0 splunkd
Jul 12 06:25:01 intelsat kernel: [30183] 0 30183 13200 92 1 -17 -1000 splunkd
Jul 12 06:25:02 intelsat kernel: [30210] 0 30182 706988 480543 3 0 0 splunkd
Jul 12 06:25:02 intelsat kernel: [30183] 0 30183 13200 92 1 -17 -1000 splunkd
As I understand it, this is happening because the Splunk forwarder requests more memory than is available on the system? The log shows splunkd at roughly 2.8 GB total-vm and 1.9 GB anon-rss before the kernel's OOM killer picked it as the victim.
Is it possible to configure the forwarder with some form of "safe" memory allocation strategy to prevent this from happening? Something along the lines of the sketch below is what I have in mind.
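For example (a rough, untested sketch of settings under $SPLUNK_HOME/etc/system/local/ on the forwarder; the specific values are guesses on my part, and these only limit how much data the forwarder buffers and sends rather than capping splunkd's memory directly):

# $SPLUNK_HOME/etc/system/local/limits.conf -- throttle forwarder throughput
[thruput]
maxKBps = 256

# $SPLUNK_HOME/etc/system/local/outputs.conf -- cap the in-memory output queue
[tcpout]
maxQueueSize = 512KB

Or would an OS-level cap (e.g. a ulimit on splunkd's virtual memory) be the more appropriate way to do this?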
Ideally, I'd also like to configure the forwarder to auto-restart after such a kill... something like the watchdog sketched below?
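A cron-based watchdog is what I'm picturing (assuming the default /opt/splunkforwarder install path, which may differ on this host; untested):

# crontab entry on the forwarder host: start splunkd if it is not running
*/5 * * * * /opt/splunkforwarder/bin/splunk status > /dev/null 2>&1 || /opt/splunkforwarder/bin/splunk start --accept-license --answer-yes --no-prompt

Or is there a built-in/recommended way to have the forwarder recover on its own?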