New 6.0.x/6.1.x installation: both the Indexer and the Search Head show latency and are not performing as expected!
CPU/memory/IOPS requirements are all met. What's wrong?
Read this
http://docs.splunk.com/Documentation/Splunk/6.1.1/ReleaseNotes/SplunkandTHP
Check if your OS is affected:
cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
Then fix on OS level:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
Validate on OS level:
cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
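Please note that the exact path may vary from distro to distro (on some Red Hat systems it is /sys/kernel/mm/redhat_transparent_hugepage/).
The echo above does not survive a reboot, so the setting has to be reapplied at each boot of the box.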
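For anyone who wants to check both files in one go, here is a minimal sketch. The function names thp_active and thp_report are made up for this example, and the optional directory argument exists only so the function can be pointed at a different path (for example the Red Hat one).

```shell
#!/bin/sh
# Print the active THP value, i.e. the token shown in square brackets.
# Takes the file content as an argument so it can be tried on sample strings.
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z+]*\)\].*/\1/p'
}

# Report the active value of each THP control file that is readable.
# The directory defaults to the path used in the answer above.
thp_report() {
    dir="${1:-/sys/kernel/mm/transparent_hugepage}"
    for name in enabled defrag; do
        if [ -r "$dir/$name" ]; then
            echo "$name=$(thp_active "$(cat "$dir/$name")")"
        fi
    done
}

thp_report
```

If it prints enabled=always, the box is affected.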
After a restart of Splunk, you can validate the setting as seen by the splunkd process by checking $SPLUNK_HOME/var/log/splunk/splunkd.log
bad
06-17-2014 12:08:05.854 +0200 INFO ulimit - Linux transparent hugetables support, enabled="always" defrag="always"
good
06-17-2014 14:03:05.991 +0200 INFO ulimit - Linux transparent hugetables support, enabled="never" defrag="never"
You should then notice a substantial increase in performance.
This applies to Search Heads and Indexers running version 6.x or later, on Linux only.
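To pull the latest of those lines out of the log without scrolling, something like the following should do. This is only a sketch: last_thp_line is a made-up helper name, and /opt/splunk is an assumed fallback in case SPLUNK_HOME is not set in your shell.

```shell
#!/bin/sh
# Print the most recent THP line that splunkd wrote to the given log file.
last_thp_line() {
    grep 'Linux transparent hugetables support' "$1" | tail -n 1
}

# /opt/splunk is an assumed default; adjust it if Splunk lives elsewhere.
log="${SPLUNK_HOME:-/opt/splunk}/var/log/splunk/splunkd.log"
if [ -r "$log" ]; then
    last_thp_line "$log"
fi
```

Run it after restarting Splunk, since the line is only written at startup.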
If you disable THP at boot time by setting transparent_hugepage=never in grub.conf, then
"Linux transparent hugetables support, enabled="never" defrag="always"" in splunkd.log is also a GOOD condition.
Please see https://access.redhat.com/solutions/46111
and the note quoted below. Splunk should reword its log message.
NOTE: Some third party application install scripts check value of above files and complain even if THP is disabled at boot time using transparent_hugepage=never, this is due to the fact when THP is disabled at boot time, the value of /sys/kernel/mm/redhat_transparent_hugepage/defrag will not be changed, however this is expected and system will never go in THP defragmentation code path when it is disabled at boot and THP defrag need not to be disabled separately.
What if the log says Linux transparent hugetables support, enabled="always" defrag="never"? Would that still be all right (somewhere between bad and good)?
Nope. The correct configuration is never/never
Sorry, that's not always correct, dwaddle. Splunk is monitoring this incorrectly for cases where hugepages have been disabled since boot. The issue is addressed in the RedHat doc itself (https://access.redhat.com/solutions/46111 "To disable THP at run time", 2nd Note). If hugepages are disabled in grub.conf at boot time (as per the RedHat instructions), then defrag is not changed and so may still show "always". In fact hugepages defrag is never used if hugepages have been disabled since boot time. In the common case where hugepages are disabled at boot, the only condition you need to alert on is enabled != "never".
Hey Glenn, agreed. If THP is disabled, then defrag on or off is meaningless. Splunk's health check code is misreporting the combination of enabled=never, defrag=always as an issue when it is not, and there is a bug opened to resolve that. Good catch!
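A rough sketch of the check described in the two comments above, where only the enabled value decides the outcome and defrag is ignored. The function name classify_thp_line is invented for this example, and this is not how Splunk's own health check is implemented.

```shell
#!/bin/sh
# Classify a splunkd.log THP line as "ok" or "bad".
# Only enabled="..." is examined; defrag is deliberately ignored.
classify_thp_line() {
    enabled=$(printf '%s\n' "$1" | sed -n 's/.*enabled="\([^"]*\)".*/\1/p')
    if [ "$enabled" = "never" ]; then
        echo ok
    else
        echo bad
    fi
}

classify_thp_line 'INFO ulimit - Linux transparent hugetables support, enabled="never" defrag="always"'
```

With this reading, enabled="always" defrag="never" still counts as bad, while enabled="never" defrag="always" counts as ok.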
Because rc.local runs these commands after Splunk has already started, I put them in an init.d script instead, so that Splunk logs the current status when it starts after a reboot.
Actually I think I see what's missing, in addition to:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
we also need to run:
echo never > /sys/kernel/mm/transparent_hugepage/defrag
Also, I have to add these commands to /etc/rc.local so that the settings persist across reboots.
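The two echo commands can be wrapped in a small helper that skips any file that is missing or not writable, which makes it safer to reuse across distros. This is a sketch only: disable_thp is a made-up name, and where you call it from (rc.local or an init script that runs before Splunk) depends on your setup.

```shell
#!/bin/sh
# Write "never" to the THP control files under the given directory.
# Files that are absent or not writable are skipped rather than treated as errors.
disable_thp() {
    dir="$1"
    for name in enabled defrag; do
        if [ -w "$dir/$name" ]; then
            echo never > "$dir/$name"
        fi
    done
}
```

Called as root, for example as disable_thp /sys/kernel/mm/transparent_hugepage, it has the same effect as the two echo commands above.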
What does defrag="always" mean? After disabling THP that value still remains as "always"
10-13-2014 21:51:56.613 +0000 INFO ulimit - Linux transparent hugetables support, enabled="never" defrag="always"