How to resolve error "[hadoop] Error reading compressed journal while streaming" when attempting to search Hadoop AWS storage?

campbellj1977
Explorer

We are getting the following errors when attempting to search our Hadoop archive in AWS S3 storage:

[hadoop] Error reading compressed journal while streaming: gzip data truncated, provider=StdinGzDataProvider

[hadoop] Exception - java.io.FileNotFoundException: No such file or directory: s3a://heroku-splunk/archives/main/archive_v3/main/008DA7C6-EACA-486D-8E49-994E14249C23/1479168000_1477785600/1478476800_1478390400/db_1478425271_1478420579_238/journal.gz

Can anyone point me in a good direction to start investigating?

kpawar_splunk
Splunk Employee

The "Error reading compressed journal while streaming: gzip data truncated, provider=StdinGzDataProvider" error occurs because one or more archived journal.gz files are corrupted.

If Splunk crashes or shuts down uncleanly (power loss, hardware failure, OS failure, etc.), some buckets can be left in a bad state where not all data is searchable. If a bucket is corrupted locally on the indexer, the archived copy of that bucket will also be corrupted.
Local Splunk buckets can be repaired by following these instructions: http://docs.splunk.com/Documentation/Splunk/6.5.0/Indexer/Bucketissues

Currently there is no way to repair a corrupted journal.gz that has already been archived.
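
Although an archived journal cannot be repaired, it can help to confirm which archives are affected. The following is only a rough sketch, not a Splunk tool: it assumes the journal.gz has been copied from S3 to local disk and that it is a plain gzip stream readable by Python's standard gzip module. The function name check_gzip_stream is made up for illustration.

```python
import gzip
import zlib


def check_gzip_stream(path, chunk_size=1 << 20):
    """Decompress a gzip file end to end and report whether it is intact.

    Returns (ok, bytes_read, error). bytes_read counts the decompressed
    bytes returned before any failure, so treat it as a lower bound.
    """
    total = 0
    try:
        with gzip.open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except (EOFError, OSError, zlib.error) as exc:
        # EOFError: stream ended before the end-of-stream marker (truncated).
        # OSError: bad header or CRC mismatch, or the file could not be opened.
        # zlib.error: invalid compressed data inside the stream.
        return False, total, f"{type(exc).__name__}: {exc}"
    return True, total, None
```

Calling check_gzip_stream("journal.gz") on a local copy returns ok=False with an EOFError message when the file is truncated, which corresponds to the "gzip data truncated" condition in the error above.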

There is a fix that lets a search read data from a corrupted journal up to the point where the corruption begins. An error message is logged in search.log indicating that the particular journal is corrupted. With this fix, the search also continues past the corrupt journal instead of stopping at it.
The fix is available in maintenance releases 6.3.9 onward, 6.4.6 onward, and 6.5.3 onward, and will be included in the next major release. You need to upgrade your search head (SH) to apply the fix.
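
To check a search head's version against the releases listed above, something like the sketch below could be used. It is only an illustration: the function name has_corrupt_journal_fix is hypothetical, and it assumes that every release line after 6.5 contains the fix (based on the statement that the next major release will include it) and that lines before 6.3 do not.

```python
# Minimum maintenance release per release line, as listed in the answer above.
MIN_FIXED = {(6, 3): 9, (6, 4): 6, (6, 5): 3}


def has_corrupt_journal_fix(version: str) -> bool:
    """Return True if a version string such as "6.4.6" should include the fix."""
    parts = [int(p) for p in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # pad "7.0" to (7, 0, 0)
    major, minor, patch = parts
    line = (major, minor)
    if line in MIN_FIXED:
        return patch >= MIN_FIXED[line]
    # Assumption: release lines after 6.5 include the fix, earlier ones do not.
    return line > (6, 5)
```

For example, has_corrupt_journal_fix("6.5.0") is False, so a search head on that version would need to move to 6.5.3 or later.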
