Multiple gzip broken pipe errors

MasterDuke
Engager

I am seeing many errors like the below:

  • {timestamp} INFO ArchiveProcessor - handling file=/path/to/file.gz
  • {timestamp} INFO ArchiveProcessor - reading path=/path/to/file.gz (seek=0 len={some number that is actually equal to the length of the file on disk})
  • {timestamp} ERROR ArchiveContext - from archive='/path/to/file.gz': gzip: stdout: Broken pipe
  • {timestamp} INFO ArchiveProcessor - Finished processing file '/path/to/file.gz', removing from stats

These files are created by IBM InfoSphere Streams, using the 'gzip' compression option of its FileSink operator. They are on an NFS mount. The odd thing is that I don't get these errors from all of the files, but definitely from most. If I decompress the files with regular gunzip, I get no errors or warnings, even in verbose mode, and they decompress just fine.

What is causing all these errors?
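
For anyone who wants to repeat the "decompresses fine outside Splunk" check in a script, here is a minimal sketch using Python's standard `gzip` module. The function name `gzip_stream_is_complete` and the temporary test files are made up for illustration. This only checks that a file decompresses to the end; it says nothing about how ArchiveProcessor reads the file.

```python
import gzip
import os
import tempfile
import zlib


def gzip_stream_is_complete(path, chunk_size=1 << 20):
    """Decompress `path` to the end; return (ok, decompressed_byte_count)."""
    total = 0
    try:
        with gzip.open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except (OSError, EOFError, zlib.error):
        # Bad header, truncated stream, or corrupt deflate data.
        return False, total
    return True, total


# Self-contained demo on throwaway files (hypothetical data, not real logs).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.gz")
    with gzip.open(good, "wb") as fh:
        fh.write(b"line\n" * 1000)

    # Simulate a partially written file by keeping only the first half.
    bad = os.path.join(tmp, "bad.gz")
    with open(good, "rb") as src, open(bad, "wb") as dst:
        data = src.read()
        dst.write(data[: len(data) // 2])

    good_result = gzip_stream_is_complete(good)
    bad_result = gzip_stream_is_complete(bad)

print(good_result, bad_result[0])
```

If a file passes this check at rest but still triggers the error during indexing, that points away from the file contents and toward how or when it is being read.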

sjalexander
Path Finder

Since you ruled out gzip itself as the culprit, this looks to me like a pipeline problem rather than a gzip failure, which would put the trouble in ArchiveProcessor.

In other words, ArchiveProcessor may not be handling the pipeline correctly (possibly a bug). There are at least two other possibly related issues on this site:

http://answers.splunk.com/answers/57272/large-data-archives-zip-being-corrupted-on-indexing.html
http://answers.splunk.com/answers/132045/error-archiveprocessor-with-zip-files.html

These refer to zip files, but I wonder whether they share a common cause, specifically pipeline handling in the ArchiveProcessor logic.

As an aside, I'm not sure your issue is related, but these are food for thought:

https://blog.nelhage.com/2010/02/a-very-subtle-bug/
http://bugs.python.org/issue1652
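
To illustrate what "Broken pipe" means at the OS level, here is a minimal sketch, assuming a POSIX system: a writer gets EPIPE when the reading end of its pipe has already been closed. The helper name `write_after_reader_closes` is made up for this example. It only shows the general mechanism, in which a reader goes away before the writer is done; it does not show what Splunk actually does internally.

```python
import os


def write_after_reader_closes(payload):
    """Write to a pipe whose read end is already closed and report the outcome."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the consumer goes away before reading anything
    try:
        os.write(write_fd, payload)
        return "ok"
    except BrokenPipeError:
        # CPython ignores SIGPIPE at startup, so the failed write surfaces
        # as EPIPE (BrokenPipeError) instead of killing the process.
        return "broken pipe"
    finally:
        os.close(write_fd)


result = write_after_reader_closes(b"decompressed data\n")
print(result)
```

A tool like gzip writing to stdout would hit the same condition if whatever reads its output stops early. Whether it then prints an error or exits silently depends on how SIGPIPE is set in that process, which is the subject of the two links above.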

muebel
SplunkTrust

What version of Splunk are you running?

MasterDuke
Engager

I've tried several different 6.x.y versions; I'm on 6.2.1 now, I believe.

wsnyder2
Path Finder

I am seeing these too. Any updates?

09-24-2015 05:49:35.834 -0400 INFO ArchiveProcessor - handling file=/var/cdnlog/cdn.n1.paychexinc.com_20150924003358_112706.log.gz
09-24-2015 05:49:35.834 -0400 INFO ArchiveProcessor - reading path=/var/cdnlog/cdn.n1.paychexinc.com_20150924003358_112706.log.gz (seek=0 len=863894)
09-24-2015 05:49:36.295 -0400 ERROR ArchiveContext - From archive='/var/cdnlog/cdn.n1.paychexinc.com_20150924003358_112706.log.gz': gzip: stdout: Broken pipe
09-24-2015 05:49:37.667 -0400 INFO ArchiveProcessor - Finished processing file '/var/cdnlog/cdn.n1.paychexinc.com_20150924003358_112706.log.gz', removing from stats
09-24-2015 05:49:37.667 -0400 INFO ArchiveProcessor - handling file=/var/cdnlog/cdn.paychexinc.com_20150924020241_122706.log.gz
09-24-2015 05:49:37.668 -0400 INFO ArchiveProcessor - reading path=/var/cdnlog/cdn.paychexinc.com_20150924020241_122706.log.gz (seek=0 len=53513669)
09-24-2015 05:49:37.668 -0400 WARN TcpOutputProc - The event is missing source information. Event :
09-24-2015 05:49:38.041 -0400 ERROR ArchiveContext - From archive='/var/cdnlog/cdn.paychexinc.com_20150924020241_122706.log.gz': gzip: stdout: Broken pipe

atorrrr
Engager

Also dealing with this. Please let me know if you find more info.

pj_elia
Engager

I'm having the same issue. Looking for a solution now.

ericlarsen
Path Finder

I'm getting the same error when trying to ingest .gz files into Splunk. Please let me know if you found a resolution.
