All Apps and Add-ons
Highlighted

ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Path Finder

I am getting the following in splunkd.log for ps_sos.ps1:


 D:\Splunk\etc\apps\sos\bin\powershell\ps_sos.ps1 : Exception of type 'Microsoft
 .PowerShell.Commands.WriteErrorException' was thrown.
 At line:1 char:2
 + . <<<<  'D:\Splunk\etc\apps\sos\bin\\powershell\ps_sos.ps1'
     + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorExcep 
    tion
     + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorExceptio 
    n,ps_sos.ps1

I am running 6.1.1 (free license) and SOS 3.2. Any ideas on what's going on and how to fix it????

Below in comments I tried to put in the record dumps but ran out of room. Here they are now...

05/13/14 11:51:26.000 AM


PID pctCPU RSZKB VSZKB ELAPSED COMMAND ARGS
1980 0 137468 307072 0-0:42:12 splunkd service
3652 0 24288 90056 0-0:0:30 splunkd search--id=rt1400007053.24--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
3908 22556 88008 0-0:0:30 splunkd System.Object[]
4064 22572 88008 0-0:0:30 splunkd System.Object[]
4944 24284 90056 0-0:0:30 splunkd System.Object[]
5324 0 16288 81500 0-0:41:7 splunkd instrument-resource-usage
6676 0 24176 90056 0-0:0:21 splunkd search--id=rt1400007064.25--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
7180 24456 90056 0-0:0:21 splunkd System.Object[]
7580 61 24448 91080 0-0:0:3 splunkd search--id=1400007082.30--maxbuckets=300--ttl=600--maxout=500000--maxtime=0--lookups=1--reducefreq=10_--rf=*
2456 0 70816 150620 0-0:42:12 splunkweb

05/13/14 11:51:21.000 AM


PID pctCPU RSZKB VSZKB ELAPSED COMMAND ARGS
1980 0 136340 307072 0-0:42:7 splunkd service
3652 0 24288 90056 0-0:0:25 splunkd search--id=rt1400007053.24--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
3908 0 22428 88008 0-0:0:2 splunkd search--id=rt1400007076.28--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
4064 0 22564 88008 0-0:0:3 splunkd search--id=rt1400007076.29--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
4944 0 22284 88008 0-0:0:5 splunkd search--id=rt1400007074.26--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
5324 0 16288 81500 0-0:41:3 splunkd instrument-resource-usage
6676 0 24176 90056 0-0:0:16 splunkd search--id=rt1400007064.25--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
7180 0 22448 88008 0-0:0:3 splunkd search--id=rt1400007076.27--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
2456 0 70816 150620 0-0:42:6 splunkweb

Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Splunk Employee
Splunk Employee

Did these errors start after you installed S.o.S 3.2? Were there none before?

How often do you see these errors - do they occur every time (5s) the script runs?

Finally, what output do you get if you attempt to run the script manually from the command line?

%SPLUNK_HOME%\bin\splunk cmd %SPLUNK_HOME%\etc\apps\sos\bin\sospowershell.cmd ps_sos.ps1

0 Karma
Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Path Finder

I have been running 3.2 for a while and I am unsure if they happened before splunk 6.1 and 6.1.1.

These errors do NOT happen every 5s. An example of the time stamps are:
05-13-2014 11:15:03.862
05-13-2014 11:17:09.409
05-13-2014 11:17:09.615
05-13-2014 11:17:24.872
05-13-2014 11:18:19.241
05-13-2014 11:19:04.387
05-13-2014 11:26:49.459
and now it is 11:48 and no more entries! Strange hu!!

When I run the script at the command line, I get the output I would expect from a ps in Linux.

I can also look with a Splunk search (source="ps_sos.ps1") and I can see the data coming in ever 5s.

0 Karma
Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Splunk Employee
Splunk Employee

This is good to know. Now, if you don't mind digging a bit further: Do you see any records in the sos index for the times when the script reports an error? Or are there gaps? If there are records there, can you show a sample event produced during a run of ps_sos.ps1 that also yielded an error?

0 Karma
Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Path Finder

Okay - lets see if I got this right.

The log shows 2 failures at 11:51:24.906 and 11:51:25.243

Looking in the records I see: (Let me know if this is what you wanted...)

0 Karma
Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Splunk Employee
Splunk Employee

The text of the event appears to not have carried over in your comment. Perhaps you can insert it in your question?

0 Karma
Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Path Finder

How do I add something bigger than the small comment size given???

0 Karma
Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Path Finder

I added it to the base question. Is that what you wanted???

You notice they are a little off the previous and following records that are timestamped at:
7 5/13/14 11:51:30.000 AM
10 5/13/14 11:51:15.000 AM

0 Karma
Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Splunk Employee
Splunk Employee

Yes indeed! Thank you, I'll take a look.

0 Karma
Highlighted

Re: ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

Splunk Employee
Splunk Employee

UPDATE: This defect is now fixed as of S.o.S 3.2.1 / S.o.S TA for Windows 2.3.4


Thank you for digging for the information I requested.

This is a resurgence of a bug I thought I had squashed in version 3.2 (internal reference: SUP-812). This new iteration is on file as product defect SOS-11.

The issue is that the ps_sos.ps1 scripted input first collects a list of PIDs to report on, and then walks through that list one by one to collect additional information - most notably the ARGS and pctCPU fields.

Sometimes, the process ends between the time when we establish the list and the time when we get to the actual process for further introspection.

Note that we see the string System.Object[] as a value for ARGS for 4 processes in your output:


3908 22556 88008 0-0:0:30 splunkd System.Object[]
4064 22572 88008 0-0:0:30 splunkd System.Object[]
4944 24284 90056 0-0:0:30 splunkd System.Object[]
7180 24456 90056 0-0:0:21 splunkd System.Object[]

From the previous event, we know that these PIDs correspond to ad-hoc real-time searches:


3908 0 22428 88008 0-0:0:2 splunkd search--id=rt1400007076.28--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
4064 0 22564 88008 0-0:0:3 splunkd search--id=rt1400007076.29--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
4944 0 22284 88008 0-0:0:5 splunkd search--id=rt1400007074.26--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*
7180 0 22448 88008 0-0:0:3 splunkd search--id=rt1400007076.27--maxbuckets=300--ttl=600--maxout=100000--maxtime=0--lookups=1--reducefreq=10--rf=*

It seems likely that these processes ended between the time that ps_sos.ps1 established a list of relevant processes, and the time it prodded each of these processes in turn to retrieve ARGS and pctCPU.

In fixing this bug, we will evaluate process entries before printing them and do one of two things:
* Do not print out any entries for which we failed to retrieve ARGS or pctCPU because the process disappeared mid-execution
* Keep the script output as it is, but deal with the error more gracefully by suppressing it, avoiding the pollution of splunkd.log

The bottom line is: We will fix it in a future version of the S.o.S app, but for the time being you can ignore these errors in splunkd.log and consider them benign.

View solution in original post