Performance impacts of Spectre/Meltdown mitigation

Path Finder

Does anyone have figures of the performance impact of CVE-2017-5754, CVE-2017-5753 and CVE-2017-5715 (Spectre/Meltdown) patches on Splunk?

Communicator

Is Splunk planning to publish any official documentation on the performance impact of the Spectre/Meltdown patches, or to provide any mitigation/remediation recommendations, given how significant the impact might be? We run RHEL kernels at our organization and are deeply concerned by some of the performance impacts reported in this thread, so any information would be highly appreciated.

Contributor

Splunk cannot "mitigate" or "remediate" the impact of kernel patches that, essentially, restrict CPU optimizations such as speculative execution. These flaws are inherent to the architecture of the CPU.

Your organization must determine, for itself, whether the potential risk is sufficient to warrant applying the patches. To perform risk analysis, you need to understand the nature of the risk, and then examine how likely those risks are to occur in your environment.

Most organizations do not seem to bother with risk analysis of system vulnerabilities. Meltdown/Spectre is definitely a case where it should be performed.

Communicator

I've heard you can expect 20-50 percent performance impact depending on search load if you're not running as root. This has much to do with the way the patch blocks user access to L3 cache.

Ultra Champion

Hi @davpx, can you shed any more light on the "not running as root" part of your comment? I hadn't seen this noted previously and am looking for more info.

Contributor

Adding a link to Red Hat's article "Speculative Execution Exploit Performance Impacts - Describing the performance impacts to security patches for CVE-2017-5754 CVE-2017-5753 and CVE-2017-5715":

https://access.redhat.com/articles/3307751

Path Finder

That's the document that sent me looking for exact figures. As far as I understand, Splunk workloads fall into the "Modest" and "Measurable" impact categories.

SplunkTrust

I would assume it would have a significant impact, due to the high disk I/O and memory caching, but I'm no kernel engineer.

I plan to patch one of our indexers this week, and I hope to report my findings.

EDIT:
I have seen a significant impact on performance when the remediation is enabled. Red Hat has an article on how to disable it:
https://access.redhat.com/articles/3311301

I wrote a tuned profile for Splunk that can run a script to disable that remediation:
https://github.com/jewnix/tuned-splunk

Contributor

I've asked the community on Slack/IRC to weigh in with concrete data pre-/post-mitigation.

Path Finder

I started a Google Drive folder where people can put the data from their testing.
It includes pre-patch data for part of our deployment. Anyone is free to deposit their own data.

https://drive.google.com/drive/folders/1LegN7VuOA9y8VHY5D7XjARhvVpBaciz1?usp=sharing

Explorer

Any updates yet? We are scheduled for patching and are wondering whether anyone has figures on indexer performance hits.

SplunkTrust

I saw around a 50% increase in system load.

Some people in the industry I spoke to do not recommend running production servers with the patch enabled, especially databases, file servers, or any other I/O-intensive workload.

You can patch the system, and disable the feature by using tuned or the equivalent for your platform.
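
Before and after toggling anything, it helps to confirm what the kernel actually reports. Below is a minimal Python sketch, not an official Splunk or Red Hat tool. It assumes a kernel that exposes `/sys/devices/system/cpu/vulnerabilities` (upstream 4.15+ or a vendor backport) and, on RHEL, the debugfs tunables `pti_enabled`, `ibrs_enabled` and `ibpb_enabled` under `/sys/kernel/debug/x86`, which need root and a mounted debugfs. The helper name `read_status` is made up for this example.

```python
from pathlib import Path

# Upstream sysfs directory reporting per-vulnerability status (assumed to be present).
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")
# RHEL-specific debugfs tunables (assumption: RHEL kernel, root, debugfs mounted).
RHEL_DEBUG_DIR = Path("/sys/kernel/debug/x86")
RHEL_TUNABLES = ("pti_enabled", "ibrs_enabled", "ibpb_enabled")


def read_status(directory, names=None):
    """Return {file name: stripped contents} for the readable files in `directory`.

    If `names` is given, only those files are tried; otherwise every entry is tried.
    Missing or unreadable files are skipped instead of raising.
    """
    directory = Path(directory)
    status = {}
    try:
        candidates = [directory / n for n in names] if names else sorted(directory.iterdir())
    except OSError:
        # Directory missing or not accessible (for example, debugfs without root).
        return status
    for path in candidates:
        try:
            status[path.name] = path.read_text().strip()
        except OSError:
            # File missing, a subdirectory, or insufficient privileges.
            continue
    return status


if __name__ == "__main__":
    for name, value in read_status(VULN_DIR).items():
        print(f"{name}: {value}")
    for name, value in read_status(RHEL_DEBUG_DIR, RHEL_TUNABLES).items():
        print(f"{name}: {value}")
```

Recording this output alongside your pre- and post-patch numbers makes it clear which mitigation state each measurement was taken in.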

Splunk Employee

we all about #facts and #proof here 😉

Let's see those pre- and post-patch signals, y'all!
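
For anyone comparing their own numbers, here is a small Python sketch of the arithmetic behind a figure like "X% increase in system load". The sample values are hypothetical placeholders, not measurements from this thread, and `percent_change` is a made-up helper name.

```python
from statistics import mean


def percent_change(before, after):
    """Relative change of the mean of `after` versus the mean of `before`, in percent."""
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline mean is zero; percent change is undefined")
    return (mean(after) - baseline) / baseline * 100.0


# Hypothetical 1-minute load averages sampled under a comparable search/indexing load.
pre_patch = [4.0, 4.2, 3.8]
post_patch = [6.0, 6.3, 5.7]

print(f"load change: {percent_change(pre_patch, post_patch):+.1f}%")
```

The comparison only means something if both sample sets were taken under a similar workload, so note the ingest rate and concurrent search count with each set.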
