I have heard that Transparent Huge Pages (THP) can be problematic for certain applications. How can this impact Splunk, and what do I need to do about it?
Some Linux distros have been shipping with THP enabled by default.
See the Splunk documentation here for the effects this can have.
The Red Hat info here explains one method of disabling THP (using grub.conf) and also provides ways to validate that it is disabled.
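For illustration, the grub.conf method comes down to appending transparent_hugepage=never to the kernel boot arguments. A hedged sketch follows; the file location and the rest of the kernel line will differ on your system, and the placeholders in angle brackets are not real values:
# Excerpt from /boot/grub/grub.conf -- illustrative only; keep your existing arguments and just append the parameter
kernel /vmlinuz-<your-kernel-version> ro root=<your-root-device> rhgb quiet transparent_hugepage=never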
I like to follow this procedure:
(Each Sys Admin can come up with their own way to pull this off)
I run these two commands on all of my Splunk servers running CentOS/Red Hat 6.x or later.
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
No need to restart Splunk.
Then, to make these changes persistent across reboots, I add this to the bottom of my /etc/rc.local:
#disable THP at boot time
if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/redhat_transparent_hugepage/defrag; then
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
fi
I then validate that I am not running things like ktune or tuned (these could override the settings above):
chkconfig --list |grep tune
ktune 0:off 1:off 2:off 3:off 4:off 5:off 6:off
tuned 0:off 1:off 2:off 3:off 4:off 5:off 6:off
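On newer systemd-based releases (RHEL/CentOS 7 and later), chkconfig is largely superseded; a rough equivalent check, assuming tuned ships as a systemd unit on your system, would be:
systemctl is-enabled tuned
systemctl disable tuned    # only if you decide tuned should not be managing these settings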
To validate that THP is disabled, I run the three commands below, or any variant you choose from here.
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
always madvise [never]
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
always madvise [never]
egrep 'trans|thp' /proc/vmstat (for this command, I validate that none of these counters are still increasing)
nr_anon_transparent_hugepages 2
thp_fault_alloc 12793
thp_fault_fallback 18
thp_collapse_alloc 70
thp_collapse_alloc_failed 0
thp_split 2974
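To confirm these counters really are static, one option (assuming the watch utility is installed) is to re-run the grep periodically and highlight anything that changes:
# Refreshes every 2 seconds; -d highlights values that changed since the last refresh
watch -d "egrep 'trans|thp' /proc/vmstat"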
One thing to keep in mind: on startup, Splunk logs whether THP is enabled or disabled in $SPLUNK_HOME/var/log/splunk/splunkd.log.
grep hugetables /opt/splunk/var/log/splunk/splunkd.log
11-18-2014 08:19:42.052 -0600 INFO ulimit - Linux transparent hugetables support, enabled="never" defrag="never"
One caveat with this log entry: because on my system /etc/rc.d/rc3.d/S90splunk executes before /etc/rc.d/rc3.d/S99local, the splunkd.log entry after a reboot will report THP as enabled. Subsequent Splunk restarts will reflect the correct state.
In summary:
You can run the following search from the Monitoring Console (splunk_monitoring_console app context) to check the effective THP state:
| rest /services/server/sysinfo splunk_server_group=dmc_group_indexer | table splunk_server version numberOfCores, numberOfVirtualCores physicalMemoryMB ulimits.open_files transparent_hugepages.effective_state | sort +splunk_server
I've never had a grub.conf overwritten by a kernel update in 20 years of using Red Hat-derived distros. Maybe other ones don't behave well (I refuse to use Ubuntu). I believe any modifications are probably happening through ktune or tuned, neither of which I use.
It would definitely be better if Splunk took care of the configs as part of the install, at least for people who aren't so adept at creating their own configs. Or just don't start up until the underlying configs are correct. Why compromise the performance of Splunk for things that are known like THP and open file settings?
There's no need to set defrag when THP is off, no matter how you turned THP off, since there's nothing to defrag.
We ran into severe performance degradation and deduced it was memory page handling. An alternative that worked well on our hardware was setting vm.zone_reclaim_mode = 1.
This was set to 0, causing memory allocation to occur across nodes in our NUMA architecture. Setting it to 1 forces the kernel to reclaim memory assigned to the local node first. Given the transient nature of Splunk search processes, the impact was stunning.
Server load averages had regularly been spiraling out of control, many times over 100, on 12-core / 24-hyperthread indexers and search heads. The symptoms were the same: servers became completely unresponsive for minutes at a time. The root cause may be the same -- poor cross-node memory management. We set the parameter, and within a few minutes the load was 2-3 and very hard to drive over 5-6. End-user search performance was orders of magnitude faster.
This was on Red Hat Enterprise Linux Server release 6.5. My investigation shows this setting to be CPU specific, and some users have reported worse performance with it, but in our case it was dramatically improved.
Another option to consider when facing similar performance issues.
Hi mwk1000,
Mind if I ask you what hardware you were running the indexers on when you had to set this?
This problem is generic to all Linux kernels for the last several years.
It becomes really catastrophic as memory gets close to exhausted (e.g., over 80% in use and rising), because of the thrashing to defragment pages. However, the work to simply map in transparent huge pages for relatively short-lived search processes is always a net loss.
It's a no-brainer to always turn these off in all Linux installs.
Where did you set this config? I'm seeing similar issues on one of my indexers.
A name like vm.zone_reclaim_mode is a sysctl value; there will be an equivalent virtual file under /proc with slightly different naming.
The existence of these values can be kernel-version dependent, but the system referred to here is RHEL 6.5, so it will typically be available there and on newer systems.
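For reference, a minimal sketch of setting it both at runtime and persistently, using the standard sysctl locations (verify the paths against your distro):
# Apply immediately; the equivalent virtual file is /proc/sys/vm/zone_reclaim_mode
sysctl -w vm.zone_reclaim_mode=1
# Persist across reboots
echo "vm.zone_reclaim_mode = 1" >> /etc/sysctl.conf
# Verify
cat /proc/sys/vm/zone_reclaim_mode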
Regarding the splunkd.log entries related to THP:
If you are setting transparent_hugepage=never in grub.conf (disabling THP at boot time), then
"Linux transparent hugetables support, enabled="never" **defrag="always**""
in splunkd.log is also a GOOD condition.
Please see - https://access.redhat.com/solutions/46111
and the section below. Splunk must reword their logging message in splunkd.log.
NOTE: Some third-party application install scripts check the value of the above files and complain even when THP is disabled at boot time with transparent_hugepage=never. This is because, when THP is disabled at boot, the value of /sys/kernel/mm/redhat_transparent_hugepage/defrag is not changed. This is expected: the system will never enter the THP defragmentation code path when THP is disabled at boot, so THP defrag does not need to be disabled separately.
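A quick way to confirm the boot-time parameter actually took effect is to check the kernel command line:
# Should list transparent_hugepage=never among the boot arguments
cat /proc/cmdline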
Ryan's page on using systemd for this:
https://www.rfaircloth.com/2017/04/28/using-systemd-squash-thp-start-splunk-enterprise/
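For anyone who wants the general shape without clicking through, a minimal sketch of such a unit follows; the unit name, THP path, and Splunk service name are illustrative, so adjust for your distro and installation (see Ryan's post for a fuller treatment):
# /etc/systemd/system/disable-thp.service -- illustrative name
[Unit]
Description=Disable Transparent Huge Pages
Before=splunk.service

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled; echo never > /sys/kernel/mm/transparent_hugepage/defrag'

[Install]
WantedBy=multi-user.target
Then enable it with systemctl daemon-reload followed by systemctl enable disable-thp.service.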
Interesting @anilyelmar, if you submit a P4 ticket, Splunk Support can make your suggestion an enhancement request.
I submitted P4.
By the way,
My customized installation of 6.5.3 is no longer logging anything related to THP in splunkd.log, whereas vanilla 6.5.3 does. Do you know which config controls this logging? I will have to check and correct it.
I cannot easily compare my customized 6.5.3 config with the vanilla 6.5.3 config.
thanks
/sys/kernel/mm/redhat_transparent_hugepage/defrag does NOT need to be set to never if /sys/kernel/mm/redhat_transparent_hugepage/enabled is set to never. The system will not try to defrag something that is already off. This is documented in the RedHat guide to turning off THP.
Also, doing it via grub will survive kernel upgrades. By default, all options are carried over to the new kernel lines in grub.conf.
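On RHEL-family systems, one way to push the parameter onto every installed kernel entry (so it carries over on upgrade) is grubby, assuming it is available on your release:
# Appends the argument to all kernel entries in the boot configuration
grubby --update-kernel=ALL --args="transparent_hugepage=never"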
I am on RHEL 7. I've got a discrepancy between what I see in Linux and what I see in Splunk. If I understand the Linux verification steps, I see that Transparent Huge Pages are not enabled. However, this search in Splunk, plus messages in splunkd.log, indicate they are enabled. Here is the search (from @Stefan):
| rest splunk_server=local /services/server/info
| join type=outer splunk_server [rest splunk_server=local /services/server/sysinfo | fields splunk_server transparent_hugepages.*]
| eval transparent_hugepages.effective_state = if(isnotnull('transparent_hugepages.effective_state'), 'transparent_hugepages.effective_state', "unknown")
| eval transparent_hugepages.enabled = case(len('transparent_hugepages.enabled') > 0, 'transparent_hugepages.enabled', 'transparent_hugepages.effective_state' == "ok" AND (isnull('transparent_hugepages.enabled') OR len('transparent_hugepages.enabled') = 0), "feature not available", 'transparent_hugepages.effective_state' == "unknown" AND isnull('transparent_hugepages.enabled'), "unknown")
| eval transparent_hugepages.defrag = case(len('transparent_hugepages.defrag') > 0, 'transparent_hugepages.defrag', 'transparent_hugepages.effective_state' == "ok" AND (isnull('transparent_hugepages.defrag') OR len('transparent_hugepages.defrag') = 0), "feature not available", 'transparent_hugepages.effective_state' == "unknown" AND isnull('transparent_hugepages.defrag'), "unknown")
| eval severity_level = case('transparent_hugepages.effective_state' == "unavailable", -1, 'transparent_hugepages.effective_state' == "ok", 0, 'transparent_hugepages.effective_state' == "unknown", 1, 'transparent_hugepages.effective_state' == "bad", 2)
| fields transparent_hugepages.enabled transparent_hugepages.defrag transparent_hugepages.effective_state severity_level
| fields - _timediff
And here are the search results:
transparent_hugepages.enabled transparent_hugepages.defrag transparent_hugepages.effective_state severity_level
always always bad 2
And, here is what I see in ~/var/log/splunk/splunkd.log:
04-25-2017 16:40:54.899 -0700 INFO ulimit - Linux transparent hugepage support, enabled="always" defrag="always"
04-25-2017 16:40:54.899 -0700 WARN ulimit - This configuration of transparent hugepages is known to cause serious runtime problems with Splunk. Typical symptoms include generally reduced performance and catastrophic breakdown in system responsiveness under high memory pressure. Please fix by setting the values for transparent huge pages to "madvise" or preferably "never" via sysctl, kernel boot parameters, or other method recommended by your Linux distribution.
But here are the places we are told to look to see if they are disabled (RHEL 7):
# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
# cat /sys/kernel/mm/transparent_hugepage/defrag
[always] madvise never
and here is the vmstat grep:
# egrep 'trans|thp' /proc/vmstat
nr_anon_transparent_hugepages 740
thp_fault_alloc 2752847
thp_fault_fallback 655381
thp_collapse_alloc 83345
thp_collapse_alloc_failed 600
thp_split 748006
thp_zero_page_alloc 3
thp_zero_page_alloc_failed 0
Why this discrepancy?
How interesting. If you don't get much action in terms of answers on this post, it might do well as a new thread...
Well, it turns out I was reading the output of those two cat commands wrong. The bracketed word is the current state, so [always] madvise never means ALWAYS. So Splunk was right!
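If you want a script-friendly way to pull out just the bracketed (active) value, something like this works on most systems (the path shown is the RHEL 7 one):
# Prints only the active setting, e.g. "always" or "never"
grep -o '\[.*\]' /sys/kernel/mm/transparent_hugepage/enabled | tr -d '[]'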
On Splunk 6.5.2 the grep of splunkd.log is different:
grep "transparent hugepage" /opt/splunk/var/log/splunk/splunkd.log
What it looks like if Splunk thinks THP is enabled:
04-25-2017 16:40:54.899 -0700 INFO ulimit - Linux transparent hugepage support, enabled="always" defrag="always"
04-25-2017 16:40:54.899 -0700 WARN ulimit - This configuration of transparent hugepages is known to cause serious runtime problems with Splunk. Typical symptoms include generally reduced performance and catastrophic breakdown in system responsiveness under high memory pressure. Please fix by setting the values for transparent huge pages to "madvise" or preferably "never" via sysctl, kernel boot parameters, or other method recommended by your Linux distribution.
Adding yet another version, just with nifty loops and hopefully addressing variants in paths. Perfect for your /etc/rc.local edits... hopefully.
#SPLUNK: disable THP at boot time
THP=`find /sys/kernel/mm/ -name transparent_hugepage -type d | tail -n 1`
for SETTING in "enabled" "defrag";do
if test -f ${THP}/${SETTING}; then
echo never > ${THP}/${SETTING}
fi
done
This needs to be quoted:
echo "never > ${THP}/${SETTING}"
The rc.local snippet above has been updated, as transparent_hugepage is a symlink on some RHEL 6.x systems but a directory on RHEL 7.x.
#SPLUNK: disable THP at boot time
THP=`find /sys/kernel/mm/ -name transparent_hugepage \( -type l -o -type d \)| tail -n 1`
for SETTING in "enabled" "defrag";do
if test -f ${THP}/${SETTING}; then
echo never > ${THP}/${SETTING}
fi
done
I like it.