Disk I/O error monitoring on Linux Servers

shugup2923
Path Finder

I am looking to monitor disk I/O errors. Is there any way to do this?

Currently we filter disk-related hardware error messages into the format below and redirect the output to a Splunk-readable log file. We then monitor that log file for the string "I/O error".

The command we use to convert the hardware messages into a Splunk-readable format:

# dmesg -L -T | grep -iE "I/O error" | tr -d '[' | awk -F']' '{print $1 "," $2}'

Thu Oct  1 00:01:00 2020, blk_update_request: I/O error, dev fd0, sector 0

Fri Oct  2 00:01:00 2020, blk_update_request: I/O error, dev fd0, sector 0

Fri Oct  2 00:01:00 2020, blk_update_request: I/O error, dev fd0, sector 0
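
For reference, the formatting stage of that pipeline can be written as a small function that only reads standard input, which makes it easy to try against saved dmesg output from any server. This is a minimal sketch, not a tested replacement: the function name format_io_errors and the log path in the usage comment are hypothetical, and it assumes the input lines look like the output of dmesg -T.

```shell
#!/bin/sh
# Sketch only: format_io_errors is a hypothetical helper name.
# Input (stdin): lines in "dmesg -T" style, for example
#   [Thu Oct  1 00:01:00 2020] blk_update_request: I/O error, dev fd0, sector 0
# Output (stdout): "timestamp, message" for lines that mention "I/O error".
format_io_errors() {
    # Keep only I/O error lines, then turn the leading "[timestamp] "
    # into "timestamp, ". Lines without a bracketed prefix pass through as-is.
    grep -i 'I/O error' |
        sed 's/^\[\([^]]*\)] */\1, /'
}

# Possible use (the log path is a placeholder, and reading the kernel
# ring buffer may require root):
#   dmesg -T | format_io_errors >> /path/to/splunk_readable.log
```

Because the function does not call dmesg itself, the same filter can be fed from a file copied off a server where the live command behaves differently.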

But this is not a feasible way to monitor, because this command doesn't work on all Linux versions. Is there a default app available to monitor disk I/O errors?

shugup2923
Path Finder

Thu Oct  1 00:01:00 2020, blk_update_request: I/O error, dev fd0, sector 0

Fri Oct  2 00:01:00 2020, blk_update_request: I/O error, dev fd0, sector 0

Fri Oct  2 00:01:00 2020, blk_update_request: I/O error, dev fd0, sector 0

This is the output. Currently my script works only on Red Hat 7 and CentOS 7; I need a common way to monitor all versions.

ITWhisperer
SplunkTrust

Which versions of Linux are you using?

shugup2923
Path Finder

Red Hat 7, CentOS 7, Red Hat 6, CentOS 6, SUSE Linux 12, Ubuntu 16, Amazon Linux AMI

ITWhisperer
SplunkTrust

Does the output of dmesg look the same on all systems?

Does grep work the same way with those options on all systems?

You need to look at what your script is doing at each stage to figure out what is different on the failing systems compared to the working systems.
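
As a rough illustration of checking the stages one at a time, the sketch below first probes whether dmesg accepts -T and falls back to plain dmesg if it does not, then filters with a plain case-insensitive grep. It assumes the failures come from dmesg rejecting the -T or -L options on some releases, which has not been confirmed in this thread. The function names and the DMESG_CMD variable are hypothetical and exist only so the command can be swapped out when testing.

```shell
#!/bin/sh
# Sketch only. DMESG_CMD is a hypothetical override used for testing;
# by default the real dmesg is called.
DMESG_CMD="${DMESG_CMD:-dmesg}"

# Stage 1: read the kernel log.
# Assumption: some dmesg builds reject -T, so probe for it first.
read_kernel_log() {
    if "$DMESG_CMD" -T >/dev/null 2>&1; then
        # Human-readable timestamps are available.
        "$DMESG_CMD" -T
    else
        # Fallback: raw output; any timestamps are seconds since boot.
        "$DMESG_CMD"
    fi
}

# Stage 2: filter. The pattern has no extended-regex syntax,
# so -i alone is enough and -E can be dropped.
io_errors() {
    read_kernel_log | grep -i 'I/O error'
}
```

Running read_kernel_log and io_errors separately on a working and a failing server shows which stage behaves differently. Note that in the fallback case the timestamp format changes, so any downstream parsing would need to handle both forms.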

shugup2923
Path Finder

The dmesg command itself doesn't work on all flavours of Linux.

ITWhisperer
SplunkTrust

Do you have an example of the output from dmesg from each of these?
