Splunk Enterprise

Where do I configure the health.conf so that I can disable the IOWaits alert?

BlueSocket
Communicator

Dear All,

I have a Search Head, Deployment Server, Monitoring Console, a Cluster Manager, an Indexer Cluster and two unclustered Indexers.

On the Monitoring Console, I get alerts about the IOWaits being high on the two unclustered indexers and this has been happening only since we upgraded to 8.2.5.

There is no evidence of any issues, other than this alert in SplunkWeb and I want to disable it. I am using the following KB article:

https://docs.splunk.com/Documentation/Splunk/8.2.5/Admin/Healthconf

On the Monitoring Console server, I have put the following into the etc\apps\search\local\health.conf file:

[feature:iowait]
alert:sum_top3_cpu_percs__max_last_3m.disabled = 1

However, I am still getting the alert appearing in SplunkWeb on the Monitoring Console server.

Why is this? Am I configuring health.conf on the wrong server or in the wrong folder? When I run splunk cmd btool health list, I see the configuration there, but Splunk is not doing as it is told! If I am doing the wrong thing, can someone point me to some documentation that explains what I should be doing?

Thanks in advance! 


pellegrini
Path Finder

The Health feature has caused some confusion regarding local vs. distributed config. I have investigated this, and it is very flexible to configure even though the docs are not very clear about it. So far I have not found any Answers post that could not be solved using standard config.

If you configure this locally on the Monitoring Console (DMC), as you described, the threshold will only apply to the DMC host itself. There is no distributed threshold. You need to configure the threshold on each and every Splunk Enterprise instance (e.g. your standalone indexers). You can either do this in Splunk Web under the Settings menu on each instance and click Save, or toggle the Status between Disable and Enable. This takes effect immediately and does not require a restart.

If you instead configure health.conf on each instance, for example to disable iowait, put this in health.conf:

[feature:iowait]
disabled = 1

And then you need to do a reload e.g. http://<your_splunk>:<splunk_port>/debug/refresh

 

If you have an indexer cluster and distribute the change by applying a cluster bundle, this will trigger a restart of the peers (tested on version 8.2.4).

Hope this solves your issue.


pellegrini
Path Finder

In addition you can use these searches to benchmark iowait performance over time, so you can set relevant thresholds for your environment. Just replace the hostname:

CPU IOwait average

index=_internal  source="*/splunk/var/log/splunk/health.log"   
feature=IOWait component=PeriodicHealthReporter node_type=indicator
indicator=avg_cpu__max_perc_last_3m host=ind0*
| timechart span=30s max(measured_value) min(due_to_threshold_value) by host

CPU IOwait single CPU

index=_internal  source="*/splunk/var/log/splunk/health.log"   
feature=IOWait component=PeriodicHealthReporter node_type=indicator
indicator=single_cpu__max_perc_last_3m host=ind0*
| timechart span=300s max(measured_value) min(due_to_threshold_value) by host

CPU IOwait top3 CPU

index=_internal  source="*/splunk/var/log/splunk/health.log"  
feature=IOWait component=PeriodicHealthReporter node_type=indicator
indicator=sum_top3_cpu_percs__max_last_3m host=ind0*
| timechart span=30s max(measured_value) min(due_to_threshold_value) by host
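
As a rough sketch of the next step: once the searches above give you a baseline, the thresholds could be raised in health.conf on each indexer instead of disabling the feature. This assumes the indicator:<indicator name>:<color> setting format described in health.conf.spec. The numbers below are placeholders for illustration only, not recommendations, so substitute values from your own measurements.

```ini
# health.conf on each indexer (placeholder values, adjust to your own baseline)
[feature:iowait]
# Yellow/red thresholds for the iowait percentage averaged across CPUs
indicator:avg_cpu__max_perc_last_3m:yellow = 10
indicator:avg_cpu__max_perc_last_3m:red = 20
# Yellow/red thresholds for the single busiest CPU
indicator:single_cpu__max_perc_last_3m:yellow = 10
indicator:single_cpu__max_perc_last_3m:red = 20
# Yellow/red thresholds for the sum of the three busiest CPUs
indicator:sum_top3_cpu_percs__max_last_3m:yellow = 15
indicator:sum_top3_cpu_percs__max_last_3m:red = 30
```

As with the disabled = 1 example, the instance would need a reload or restart before the new thresholds take effect.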

 


richgalloway
SplunkTrust

Did you restart the MC server after changing the config file?

Have you tried making the same health.conf change on the indexers?


BlueSocket
Communicator

Yes, I restarted the MC several times after the changes to the configurations.

 

No, I have not edited the health.conf on the Indexers. They are quite difficult to restart at the moment and I was hoping that someone would have a KB or document that could help (or know definitively) before I went there.
