All Apps and Add-ons

ps_sos.ps1 Microsoft.PowerShell.Commands.WriteErrorException

dlpco
Path Finder

I am getting the following in splunkd.log for ps_sos.ps1:


 D:\Splunk\etc\apps\sos\bin\powershell\ps_sos.ps1 : Exception of type 'Microsoft
 .PowerShell.Commands.WriteErrorException' was thrown.
 At line:1 char:2
 + . <<<<  'D:\Splunk\etc\apps\sos\bin\\powershell\ps_sos.ps1'
     + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorExcep 
    tion
     + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorExceptio 
    n,ps_sos.ps1

I am running 6.1.1 (free license) and SOS 3.2. Any ideas on what's going on and how to fix it????

Below in comments I tried to put in the record dumps but ran out of room. Here they are now...

05/13/14 11:51:26.000 AM


PID pctCPU RSZ_KB VSZ_KB ELAPSED COMMAND ARGS
1980 0 137468 307072 0-0:42:12 splunkd service
3652 0 24288 90056 0-0:0:30 splunkd search_--id=rt_1400007053.24_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
3908 22556 88008 0-0:0:30 splunkd System.Object[]
4064 22572 88008 0-0:0:30 splunkd System.Object[]
4944 24284 90056 0-0:0:30 splunkd System.Object[]
5324 0 16288 81500 0-0:41:7 splunkd instrument-resource-usage
6676 0 24176 90056 0-0:0:21 splunkd search_--id=rt_1400007064.25_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
7180 24456 90056 0-0:0:21 splunkd System.Object[]
7580 61 24448 91080 0-0:0:3 splunkd search_--id=1400007082.30_--maxbuckets=300_--ttl=600_--maxout=500000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
2456 0 70816 150620 0-0:42:12 splunkweb

05/13/14 11:51:21.000 AM


PID pctCPU RSZ_KB VSZ_KB ELAPSED COMMAND ARGS
1980 0 136340 307072 0-0:42:7 splunkd service
3652 0 24288 90056 0-0:0:25 splunkd search_--id=rt_1400007053.24_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
3908 0 22428 88008 0-0:0:2 splunkd search_--id=rt_1400007076.28_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
4064 0 22564 88008 0-0:0:3 splunkd search_--id=rt_1400007076.29_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
4944 0 22284 88008 0-0:0:5 splunkd search_--id=rt_1400007074.26_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
5324 0 16288 81500 0-0:41:3 splunkd instrument-resource-usage
6676 0 24176 90056 0-0:0:16 splunkd search_--id=rt_1400007064.25_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
7180 0 22448 88008 0-0:0:3 splunkd search_--id=rt_1400007076.27_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
2456 0 70816 150620 0-0:42:6 splunkweb

1 Solution

hexx
Splunk Employee
Splunk Employee

UPDATE: This defect is now fixed as of S.o.S 3.2.1 / S.o.S TA for Windows 2.3.4


Thank you for digging for the information I requested.

This is a resurgence of a bug I thought I had squashed in version 3.2 (internal reference: SUP-812). This new iteration is on file as product defect SOS-11.

The issue is that the ps_sos.ps1 scripted input first collects a list of PIDs to report on, and then walks through that list one by one to collect additional information - most notably the ARGS and pctCPU fields.

Sometimes, the process ends between the time when we establish the list and the time when we get to the actual process for further introspection.

Note that we see the string System.Object[] as a value for ARGS for 4 processes in your output:


3908 22556 88008 0-0:0:30 splunkd System.Object[]
4064 22572 88008 0-0:0:30 splunkd System.Object[]
4944 24284 90056 0-0:0:30 splunkd System.Object[]
7180 24456 90056 0-0:0:21 splunkd System.Object[]

From the previous event, we know that these PIDs correspond to ad-hoc real-time searches:


3908 0 22428 88008 0-0:0:2 splunkd search_--id=rt_1400007076.28_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
4064 0 22564 88008 0-0:0:3 splunkd search_--id=rt_1400007076.29_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
4944 0 22284 88008 0-0:0:5 splunkd search_--id=rt_1400007074.26_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
7180 0 22448 88008 0-0:0:3 splunkd search_--id=rt_1400007076.27_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*

It seems likely that these processes ended between the time that ps_sos.ps1 established a list of relevant processes, and the time it prodded each of these processes in turn to retrieve ARGS and pctCPU.

In fixing this bug, we will evaluate process entries before printing them and do one of two things:
* Do not print out any entries for which we failed to retrieve ARGS or pctCPU because the process disappeared mid-execution
* Keep the script output as it is, but deal with the error more gracefully by suppressing it, avoiding the pollution of splunkd.log

The bottom line is: We will fix it in a future version of the S.o.S app, but for the time being you can ignore these errors in splunkd.log and consider them benign.

View solution in original post

hexx
Splunk Employee
Splunk Employee

UPDATE: This defect is now fixed as of S.o.S 3.2.1 / S.o.S TA for Windows 2.3.4


Thank you for digging for the information I requested.

This is a resurgence of a bug I thought I had squashed in version 3.2 (internal reference: SUP-812). This new iteration is on file as product defect SOS-11.

The issue is that the ps_sos.ps1 scripted input first collects a list of PIDs to report on, and then walks through that list one by one to collect additional information - most notably the ARGS and pctCPU fields.

Sometimes, the process ends between the time when we establish the list and the time when we get to the actual process for further introspection.

Note that we see the string System.Object[] as a value for ARGS for 4 processes in your output:


3908 22556 88008 0-0:0:30 splunkd System.Object[]
4064 22572 88008 0-0:0:30 splunkd System.Object[]
4944 24284 90056 0-0:0:30 splunkd System.Object[]
7180 24456 90056 0-0:0:21 splunkd System.Object[]

From the previous event, we know that these PIDs correspond to ad-hoc real-time searches:


3908 0 22428 88008 0-0:0:2 splunkd search_--id=rt_1400007076.28_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
4064 0 22564 88008 0-0:0:3 splunkd search_--id=rt_1400007076.29_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
4944 0 22284 88008 0-0:0:5 splunkd search_--id=rt_1400007074.26_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*
7180 0 22448 88008 0-0:0:3 splunkd search_--id=rt_1400007076.27_--maxbuckets=300_--ttl=600_--maxout=100000_--maxtime=0_--lookups=1_--reduce_freq=10_--rf=*

It seems likely that these processes ended between the time that ps_sos.ps1 established a list of relevant processes, and the time it prodded each of these processes in turn to retrieve ARGS and pctCPU.

In fixing this bug, we will evaluate process entries before printing them and do one of two things:
* Do not print out any entries for which we failed to retrieve ARGS or pctCPU because the process disappeared mid-execution
* Keep the script output as it is, but deal with the error more gracefully by suppressing it, avoiding the pollution of splunkd.log

The bottom line is: We will fix it in a future version of the S.o.S app, but for the time being you can ignore these errors in splunkd.log and consider them benign.

halr9000
Motivator

marked as answered, cool?

0 Karma

hexx
Splunk Employee
Splunk Employee

We just released S.o.S 3.2, so it won't be for a few weeks at the very least, possibly a few months.

0 Karma

brooklynotss
Path Finder

I just updated to S.o.S 3.2.1 but am still seeing these errors pretty often...

E:\Program
Files\Splunk\etc\apps\TA-sos_win\bin\powershell\ps_sos.ps1
: Exception of type
'Microsoft.PowerShell.Commands.WriteErrorException'
was thrown. At line:1 char:1
+ . 'E:\Program Files\Splunk\etc\apps\TA-sos_win\bin\powershell\ps_sos.ps1'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Write-Error],
WriteErrorException
+ FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,ps_sos.ps1

0 Karma

dlpco
Path Finder

Sounds like a plan. I noticed that most things have 0%cpu but the elapsed seem to be correct.

Seems to be a good answer - you have any idea when the update will come out?

0 Karma

hexx
Splunk Employee
Splunk Employee

Yes indeed! Thank you, I'll take a look.

0 Karma

dlpco
Path Finder

I added it to the base question. Is that what you wanted???

You notice they are a little off the previous and following records that are timestamped at:
7 5/13/14 11:51:30.000 AM
10 5/13/14 11:51:15.000 AM

0 Karma

dlpco
Path Finder

How do I add something bigger than the small comment size given???

0 Karma

hexx
Splunk Employee
Splunk Employee

The text of the event appears to not have carried over in your comment. Perhaps you can insert it in your question?

0 Karma

dlpco
Path Finder

Okay - lets see if I got this right.

The log shows 2 failures at 11:51:24.906 and 11:51:25.243

Looking in the records I see: (Let me know if this is what you wanted...)

0 Karma

hexx
Splunk Employee
Splunk Employee

This is good to know. Now, if you don't mind digging a bit further: Do you see any records in the sos index for the times when the script reports an error? Or are there gaps? If there are records there, can you show a sample event produced during a run of ps_sos.ps1 that also yielded an error?

0 Karma

dlpco
Path Finder

I have been running 3.2 for a while and I am unsure if they happened before splunk 6.1 and 6.1.1.

These errors do NOT happen every 5s. An example of the time stamps are:
05-13-2014 11:15:03.862
05-13-2014 11:17:09.409
05-13-2014 11:17:09.615
05-13-2014 11:17:24.872
05-13-2014 11:18:19.241
05-13-2014 11:19:04.387
05-13-2014 11:26:49.459
and now it is 11:48 and no more entries! Strange hu!!

When I run the script at the command line, I get the output I would expect from a ps in Linux.

I can also look with a Splunk search (source="ps_sos.ps1") and I can see the data coming in ever 5s.

0 Karma

hexx
Splunk Employee
Splunk Employee

Did these errors start after you installed S.o.S 3.2? Were there none before?

How often do you see these errors - do they occur every time (5s) the script runs?

Finally, what output do you get if you attempt to run the script manually from the command line?

%SPLUNK_HOME%\bin\splunk cmd %SPLUNK_HOME%\etc\apps\sos\bin\sospowershell.cmd ps_sos.ps1

0 Karma
Get Updates on the Splunk Community!

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...