Splunk for Linux not grabbing the right fields

kkalmbach
Path Finder

I have several Linux boxes that are being monitored by Splunk, and a few of them are having problems running the "cpu.sh" script.

The problem is that on some of the boxes, there is no "steal" column in the sar output:


uname -a
Linux hostname 2.6.9-89.35.1.ELsmp #1 SMP Tue Jan 4 22:30:58 EST 2011 i686 i686 i386 GNU/Linux

$ sar -P ALL 1 1
Linux 2.6.9-89.35.1.ELsmp (fralnxnmsapp5) 11/10/2011

02:21:05 PM CPU %user %nice %system %iowait %idle
02:21:06 PM all 0.00 0.00 0.00 0.00 100.00
02:21:06 PM 0 0.00 0.00 0.00 0.00 100.00
02:21:06 PM 1 0.00 0.00 0.00 0.00 100.00
02:21:06 PM 2 0.00 0.00 0.00 0.00 100.00
02:21:06 PM 3 0.00 0.00 0.00 0.00 100.00

Average: CPU %user %nice %system %iowait %idle
Average: all 0.00 0.00 0.00 0.00 100.00
Average: 0 0.00 0.00 0.00 0.00 100.00
Average: 1 0.00 0.00 0.00 0.00 100.00
Average: 2 0.00 0.00 0.00 0.00 100.00
Average: 3 0.00 0.00 0.00 0.00 100.00

One more piece of info: the version of sar on those boxes is pretty old:


$ sar -V
sysstat version 5.0.5
(C) Sebastien Godard
Usage: sar [ options... ] [ [ ] ]

Splunk's cpu script assumes the "steal" column is there. As a result, all the values end up in the wrong column.

Is there a fix for this, or should I change the cpu script myself?

Thanks
-Kevin

araitz
Splunk Employee

Wow, that is a really old version of sar! I don't think in our testing we encountered a version quite that old.

Assuming you are using the latest version of the Unix/Linux app, on line 30 of cpu.sh, it seems like you should change this:

 FORMAT='{cpu=$(NF-6); pctUser=$(NF-5); pctNice=$(NF-4); pctSystem=$(NF-3); pctIowait=$(NF-2); pctIdle=$NF}'

To this:

 FORMAT='{cpu=$(NF-5); pctUser=$(NF-4); pctNice=$(NF-3); pctSystem=$(NF-2); pctIowait=$(NF-1); pctIdle=$NF}'

This is purely an eyeball, so your mileage may vary. I will make a note to triage this issue for the next maintenance release of the app, though I can't guarantee it will make the cut.
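
One quick way to sanity-check the offsets is to run both mappings against a data line copied from the sar output in the question. This is only a sketch: the function names and the print statements are made up for illustration and are not part of cpu.sh.

```shell
# Data line from the question's sysstat 5.0.5 output: 8 fields, no %steal.
line='02:21:06 PM all 0.00 0.00 0.00 0.00 100.00'

# Original offsets, which expect a %steal column between %iowait and %idle.
old_mapping() {
    echo "$line" | awk '{print "cpu=" $(NF-6), "pctUser=" $(NF-5), "pctIdle=" $NF}'
}

# Adjusted offsets for output that has no %steal column.
new_mapping() {
    echo "$line" | awk '{print "cpu=" $(NF-5), "pctUser=" $(NF-4), "pctIowait=" $(NF-1), "pctIdle=" $NF}'
}

old_mapping   # cpu=PM pctUser=all pctIdle=100.00
new_mapping   # cpu=all pctUser=0.00 pctIowait=0.00 pctIdle=100.00
```

With the original offsets the "PM" token lands in cpu and the CPU name lands in pctUser, which matches the shifted columns described in the question.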

araitz
Splunk Employee

The second option seems viable, but the first option seems super easy to implement. The app is open sourced under the Apache license, and it should be up on Splunk's github account pretty soon, so feel free to contribute your fix when that day comes.

kkalmbach
Path Finder

Thanks,
That's close to what I was thinking, but since we deploy the same app to several forwarders, I was thinking about something like:


if (NF == 8)
FORMAT='{cpu=$(NF-5); pctUser=$(NF-4); pctNice=$(NF-3); pctSystem=$(NF-2); pctIowait=$(NF-1); pctIdle=$NF}'
else
(what is currently there)
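
A rough sketch of that field-count check, done per line inside awk so the same script could go to every forwarder. The function name parse_cpu and the plain space-separated output are assumptions for illustration, not how the app's cpu.sh is actually laid out. It also assumes a 12-hour clock: with a 24-hour time format there is no AM/PM field, so a line that does include %steal would also have 8 fields and the test would pick the wrong offsets.

```shell
# Hypothetical helper, not the actual cpu.sh: pick offsets per line by field count.
# Prints: cpu pctUser pctNice pctSystem pctIowait pctIdle
parse_cpu() {
    awk '
        # Skip blank lines, the banner, header rows and the Average block.
        NF < 8 || /CPU/ || /^Average/ || /^Linux/ { next }
        # 8 fields: time, AM/PM, CPU, %user, %nice, %system, %iowait, %idle.
        NF == 8 { print $(NF-5), $(NF-4), $(NF-3), $(NF-2), $(NF-1), $NF; next }
        # Otherwise assume %steal sits between %iowait and %idle.
        { print $(NF-6), $(NF-5), $(NF-4), $(NF-3), $(NF-2), $NF }
    '
}

# Usage: sar -P ALL 1 1 | parse_cpu
```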

The other thing I thought about was to just grab the header from the output (change % to pct) and use that for a header.
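
A minimal sketch of the header-driven idea, assuming the header row is the only line with a standalone "CPU" token and that everything from that column onward is data. The function name parse_cpu_by_header and the name=value output are made up for illustration. Besides swapping % for pct, it capitalizes the next letter so the names match the pctUser style used in the FORMAT line.

```shell
# Hypothetical sketch, not the actual cpu.sh: take field names from the sar header row.
parse_cpu_by_header() {
    awk '
        /^Average/ || NF == 0 { next }       # ignore the summary block and blank lines
        / CPU / {                            # header row: remember each column name
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU") { first = i; name[i] = "cpu" }
                else if ($i ~ /^%/) name[i] = "pct" toupper(substr($i, 2, 1)) substr($i, 3)
                else name[i] = $i
            }
            next
        }
        first && NF >= first {               # data row seen after a header
            out = ""; sep = ""
            for (i = first; i <= NF; i++) { out = out sep name[i] "=" $i; sep = " " }
            print out
        }
    '
}

# Usage: sar -P ALL 1 1 | parse_cpu_by_header
```

Because the names come from whatever columns sar prints, a box that does report %steal would simply gain a pctSteal field without any change to the script.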

For now I'll probably go with the first option, but do you see a problem with the second?
