Hi,
This morning, 2 of the 3 search heads in our cluster went down. When I checked, the splunkd process had been killed by the OS with an 'out of memory' message in /var/log/messages, even though the system had plenty of free memory at the time the process was killed (around 38% free). I did not find any error messages in splunkd.log. Can anyone please tell me how to find the root cause of this issue and fix it? This has already happened 3-4 times.
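For reference, this is roughly how I found the kill message (standard syslog location assumed; adjust the log path for your distribution):

    # Pull the kernel OOM killer report around the time splunkd died
    grep -i -B 5 -A 20 'out of memory' /var/log/messages

    # The same report usually shows up in the kernel ring buffer as well
    dmesg | grep -i -B 10 'killed process'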
Thanks in advance.
Thanks,
Shashank Soni.
Transparent Huge Pages (THP) being enabled is the #1 reason for poor Splunk RAM management. Run a health check from your Monitoring Console (MC) and see whether everything is set up correctly. This will check your ulimits, too.
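If you want to check by hand, on most Linux distributions you can inspect and disable THP like this (the sysfs path varies by distribution; older RHEL/CentOS uses /sys/kernel/mm/redhat_transparent_hugepage instead):

    # The value in brackets is the active setting; [never] means THP is disabled
    cat /sys/kernel/mm/transparent_hugepage/enabled
    cat /sys/kernel/mm/transparent_hugepage/defrag

    # Disable THP until the next reboot (run as root); persist the change
    # through your boot configuration or a tuned profile
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag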
Just wanted to add this link describing THP's impact on memory to woodcock's answer.
http://docs.splunk.com/Documentation/Splunk/7.1.1/ReleaseNotes/SplunkandTHP
Hope this helps.
Have you checked your ulimits?
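A quick way to see the limits that actually apply to the running splunkd process, rather than to your login shell (install path /opt/splunk is an assumption; adjust to yours):

    # Limits in effect for the oldest (parent) splunkd process
    cat /proc/$(pgrep -o splunkd)/limits

    # splunkd also logs its effective ulimits at startup
    grep -i ulimit /opt/splunk/var/log/splunk/splunkd.log | tail -20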
From what I understand, restrictive ulimits can trigger the OOM killer too. Upvoting.