Splunk Enterprise

Why am I getting Splunk Health Alerts after upgrading to 8.2?

Stefanie
Builder

I upgraded from 7.2 to 8.0 and then from 8.0 to 8.2.

After the upgrade to our distributed deployment, I am getting bombarded with email Health Alerts.

"sum_top3_cpu_percs__max_last_3m"  is red due to the following: "Sum of 3 highest per-cpu iowaits reached red threshold of 15"

"avg_cpu__max_perc_last_3m" is red due to the following: "System iowait reached red threshold of 3"

"single_cpu__max_perc_last_3m" is red due to the following: "Maximum per-cpu iowait reached red threshold of 10"

 

I was getting them on my indexers yesterday, but this morning they seem to be coming from our Enterprise Security search head, our deployment server, and our regular search head.

 

I am unable to disable these alerts due to company policy.

 

What can I do to either a.) resolve this cpu/iowait issue or b.) change the alert settings?

I don't notice a difference in performance. I'm just curious as to what's causing this CPU usage spike?

It seems to me that, taking "avg_cpu__max_perc_last_3m" as an example, I will be alerted whenever CPU usage goes above 3%. Is that right?


pellegrini
Path Finder

You can change the thresholds on each enterprise instance. Most of what is described here is locally configured on each instance. See Answers https://community.splunk.com/t5/Splunk-Enterprise/Where-do-I-configure-the-health-conf-so-that-I-can...

 

 


emallinger
Communicator

Hi there,

I had a similar issue with the IOWait check being far too sensitive, and I was not able to deactivate it.

I got the same answer about SPL-213405.

Wait and see!

Ema


gjanders
SplunkTrust

You can change them via health.conf https://docs.splunk.com/Documentation/Splunk/latest/Admin/Healthconf
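As a rough sketch of what that could look like, the stanza below raises the iowait thresholds in a local health.conf on an instance that is raising the alert. The indicator names and the current red values (3, 10, 15) are taken from the alert messages in the question. The `[feature:iowait]` stanza name, the file path, the yellow keys, and every new value are assumptions on my part, so check them against the health.conf spec for your version before using them.

```ini
# Assumed location: $SPLUNK_HOME/etc/system/local/health.conf
# Assumed stanza name for the IOWait feature.
[feature:iowait]
# System-wide iowait: the alert text reports a red threshold of 3.
# The values below are illustrative, not recommendations.
indicator:avg_cpu__max_perc_last_3m:yellow = 10
indicator:avg_cpu__max_perc_last_3m:red = 20
# Maximum per-CPU iowait: the alert text reports a red threshold of 10.
indicator:single_cpu__max_perc_last_3m:yellow = 20
indicator:single_cpu__max_perc_last_3m:red = 40
# Sum of the 3 highest per-CPU iowaits: the alert text reports a red threshold of 15.
indicator:sum_top3_cpu_percs__max_last_3m:yellow = 30
indicator:sum_top3_cpu_percs__max_last_3m:red = 60
```

Pick values that reflect the normal iowait levels on your own hardware. I would expect a splunkd restart to be needed before edits made directly to the file take effect.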

 

I think many, myself included, have found the iowait check too sensitive in 8.2.

rkantamaneni
Engager

For more details on this issue, go to the following Splunk Answer: https://community.splunk.com/t5/Splunk-Enterprise/Cannot-Disable-Health-Report-Features-in-8-2-2/m-p...

From @nunoaragao:

Splunk has now updated its documentation on disabling health report features.
It states in a box:

"If distributed health reporting is enabled for your deployment, disabling a feature on the local instance will not be reflected in the health report."

It seems the workaround for disabling a feature in +8.2 has just become a feature. The old behavior in +8.1, in which you could disable a single feature regardless of distributed health reporting, has been "improved".

My cases with Splunk Support were #2733102 and #2737559. SPL-213405 is Splunk's internal JIRA for tracking this issue. It may or may not show up later in Splunk's release notes as a known issue. It is still being investigated. If you deal with Support, you can ask them to link your case to it.

My issue is that the docs say "You can disable any feature (...) for example, if you want to exclude a feature's status from the health report". So we expect to be able to disable a specific feature (e.g. Buckets) without having to disable distributed_health_reporter, which would also disable or hide a lot of other features in a typical topology with search head clusters and clustered indexers. In other words, we want to tell the search head to gray out an indexer peer feature even if that peer is reporting health.
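To make the distinction concrete, here is a minimal sketch of the two health.conf settings being contrasted, as I understand the spec. The `disabled` key on a feature stanza and the `[distributed_health_reporter]` stanza are assumptions to verify for your version; `feature:buckets` is only the example feature from the paragraph above.

```ini
# Option 1 (assumed syntax): disable a single feature on the local instance.
# Per the documentation note quoted above, this is not reflected in the
# health report while distributed health reporting is enabled.
[feature:buckets]
disabled = 1

# Option 2 (assumed syntax): disable distributed health reporting on the
# search head. This affects the status of many peer features, not just one.
[distributed_health_reporter]
disabled = 1
```

The complaint tracked under SPL-213405 is that only the second option currently has a visible effect in a distributed deployment.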
