Hi. I am on Splunk 6.4.1 with the Unix Add-on (5.2.3). Has anyone had similar issues, or found a fix, for getting the following OS data working on AIX servers?
1 . vmstat.sh -- Splunk logs only the header row, and running the script by hand fails with an awk error.
SPLUNK LOG
memTotalMB memFreeMB memUsedMB memFreePct memUsedPct pgPageOut swapUsedPct pgSwapOut cSwitches interrupts forks processes threads loadAvg1mi waitThreads interrupts_PS pgPageIn_PS pgPageOut_PS
MANUALLY RAN
[server:splunk]/opt/splunkforwarder/etc/apps/Splunk_TA_nix/bin> ./vmstat.sh
memTotalMB memFreeMB memUsedMB memFreePct memUsedPct pgPageOut swapUsedPct pgSwapOut cSwitches interrupts forks processes threads loadAvg1mi waitThreads interrupts_PS pgPageIn_PS pgPageOut_PS
awk: The field -11 cannot be less than 0.
The input line number is 5.
The source line number is 1.
2 . ps.sh / top.sh -- I am trying to get CPU usage per process, and ps is not cutting it. The event below reports splunkd at 0.8, but it is actually running at around 20% on the server according to top/topas. Has anyone tried collecting this data from the top/topas command on AIX?
SPLUNK LOG
19136670 - 0.8 1-19:52:10 1.0 370340 347348 - A 9-20:11:23 splunkd --nodaemon_-p_8089__internal_exec_splunkd
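A rough cross-check of that event, assuming the fourth column (1-19:52:10) is cumulative CPU time and the tenth (9-20:11:23) is elapsed time, both in ps's [[dd-]hh:]mm:ss format. The helper names to_seconds and avg_cpu_pct are made up for this sketch. Dividing CPU time by elapsed time gives the lifetime average on a single CPU:

```shell
# Convert a ps-style [[dd-]hh:]mm:ss duration to seconds.
# (Hypothetical helper, not part of Splunk_TA_nix.)
to_seconds() {
  echo "$1" | awk '{
    days = 0; t = $0
    # Split off the optional "dd-" prefix.
    if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
    # Fold the remaining hh:mm:ss (or mm:ss) parts into seconds.
    n = split(t, p, ":")
    secs = 0
    for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
    print days * 86400 + secs
  }'
}

# Lifetime average CPU use, as a percentage of one CPU.
# Assumption: $1 is cumulative CPU time and $2 is elapsed time.
avg_cpu_pct() {
  cpu=$(to_seconds "$1")
  elapsed=$(to_seconds "$2")
  awk -v c="$cpu" -v e="$elapsed" 'BEGIN { printf "%.1f\n", 100 * c / e }'
}

# Values taken from the splunkd event above.
avg_cpu_pct 1-19:52:10 9-20:11:23
```

This prints 18.6, which is close to the roughly 20% seen in top/topas. If that column reading is right, the 0.8 reported by ps may be the same lifetime average further divided by the number of logical CPUs, which is worth checking against the AIX ps documentation.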
Here are the three issues I discovered while running on AIX 7.1:
cpu.sh
sar -P ALL 1 1
vmstat.sh
elif [ "x$KERNEL" = "xAIX" ] ; then
assertHaveCommand uptime
assertHaveCommand ps
assertHaveCommand vmstat
assertHaveCommandGivenPath /usr/sbin/swap
assertHaveCommandGivenPath /usr/bin/svmon
CMD='eval uptime ; ps -e | wc -l ; ps -em | wc -l ; /usr/sbin/swap -s ; vmstat 1 1 | tail -1 ; vmstat -s ; svmon; '
PARSE_0='NR==1 {loadAvg1mi=0+$(NF-2)} NR==2 {processes=$1} NR==3 {threads=$1-processes }'
# ps -em includes each process along with its threads (at least one), so the process count must be subtracted to count threads #
PARSE_1='(NR==4) {swapUsed=0+$(NF-5); swapFree=0+$(NF-1)} (NR==5) {pgPageIn_PS=0+$(NF-11); pgPageOut_PS=0+$(NF-10)}'
PARSE_2='/^memory / {memTotalMB=$2 / 256 ; memFreeMB=$4 / 256}'
PARSE_3='/paging space page outs$/ {pgPageOut=$1 ; pgSwapOut="?" }'
# no pgSwapOut parameter and can't be monitored in AIX (by Jacky Ho, Systex)
PARSE_4='/cpu context switches$/ {cSwitches=$1} /device interrupts$/ {interrupts=$1 ; forks="?" }'
PARSE_5='/^CPU_COUNT/ {cpuCount=$2}'
MASSAGE="$PARSE_0 $PARSE_1 $PARSE_2 $PARSE_3 $PARSE_4 $PARSE_5 $DERIVE"
This is the change I made in vmstat.sh:
CMD='eval uptime ; ps -e | wc -l ; ps -em | wc -l ; /usr/sbin/swap -s ; vmstat 1 1; vmstat -s ; svmon; `dirname $0`/hardware.sh;'
-->
CMD='eval uptime ; ps -e | wc -l ; ps -em | wc -l ; /usr/sbin/swap -s ; vmstat 1 1 | tail -1 ; vmstat -s ; svmon; '
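The reason: plain vmstat 1 1 prints header lines before the data row, so awk's input line 5 is not the data row. The error above (field -11) means NF was 0 on that line, so $(NF-11) went negative. With tail -1 only the data row is passed on: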
$ vmstat 1 1 | tail -1
1 0 981011 134358 0 0 0 0 0 0 12 1184 388 1 1 98 0
$ vmstat 1 1
System Configuration: lcpu=2 mem=14080MB
kthr memory page faults cpu
r b avm fre re pi po fr sr cy in sy cs us sy id wa
3 0 982403 132966 0 0 0 0 0 0 81 23250 1299 18 17 64 0
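To see what the PARSE_1 field offsets pick up once only the data row is left, here is a small sketch that replays the vmstat output pasted above through tail and awk. sample_vmstat and extract_paging are names invented for this example, and the sample text is just the lines from this post, so the real output on your server may have extra blank or separator lines:

```shell
# Canned copy of the `vmstat 1 1` output from this post (hypothetical helper).
sample_vmstat() {
  cat <<'EOF'
System Configuration: lcpu=2 mem=14080MB
kthr memory page faults cpu
r b avm fre re pi po fr sr cy in sy cs us sy id wa
3 0 982403 132966 0 0 0 0 0 0 81 23250 1299 18 17 64 0
EOF
}

# Keep only the last line (the data row), then read the same offsets
# PARSE_1 uses: $(NF-11) is the pi column and $(NF-10) is the po column
# when the row has 17 fields.
extract_paging() {
  tail -n 1 | awk '{ printf "pgPageIn_PS=%d pgPageOut_PS=%d\n", $(NF-11), $(NF-10) }'
}

sample_vmstat | extract_paging
```

With the sample row this prints pgPageIn_PS=0 pgPageOut_PS=0. Because the offsets are counted from the end of the line, they only make sense on the 17-field data row, which is why the header lines have to be dropped first.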
This might be known issue ADDON-14093. Please open a support ticket so Support can confirm whether this is configuration-related or caused by ADDON-14093. Be sure to include your AIX version, because I vaguely recall this has to do with changes in newer AIX versions. Also include a link to this post in case new details are shared here.
Since this doesn't seem to be going anywhere, I'm logging a support ticket.
Can you share the solution if you received one from Splunk, or share the ticket number for further follow-up?
I logged case #561504. I wasn't too pleased when it was re-classed as an enhancement request.
I'm not holding my breath that we'll see this addressed anytime soon ; (
Thanks for your quick reply. Let me try to reach the Splunk representative.