
Hunk Reports an Error with Apache Hadoop

jgreenleaf
Explorer

When running queries against a VIX backed by an instance of Apache Hadoop, my searches raise the following errors:

01-08-2014 15:34:24.386 ERROR ChunkedOutputStreamReader - Invalid header line="OpenJDK 64-Bit Server VM warning: You have loaded library /opt/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now."

01-08-2014 15:34:24.386 ERROR ChunkedOutputStreamReader - Invalid header line="It's highly
recommended that you fix the library with 'execstack -c ', or link it with '-z noexecstack'."

01-08-2014 15:34:24.386 ERROR SearchOperator:stdin - Cannot consume data with unset stream_type

01-08-2014 15:34:24.386 ERROR ResultProvider - Error in 'SearchOperator:stdin': Cannot consume data with unset stream_type

An error is also reported in the web UI.

1 Solution

Ledion_Bitincka
Splunk Employee

The root cause of the problem here is that the JVM prints those warnings to stdout, which Hunk uses as a communication channel with the search process, thereby breaking the communication protocol. You should be able to disable the warnings by adding the following Java option to vix.env.HADOOP_CLIENT_OPTS in the provider:

-XX:-PrintWarnings

This should disable JVM warnings altogether; alternatively, you can redirect them to stderr instead:

-XX:+DisplayVMOutputToStderr
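For reference, a minimal sketch of where this option lives, assuming the provider is defined in indexes.conf (the provider name and paths below are illustrative, not from the thread):

```ini
# indexes.conf -- provider stanza (name and paths are examples only)
[provider:my-hadoop-provider]
vix.env.HADOOP_HOME = /opt/hadoop-2.2.0
vix.env.JAVA_HOME = /opt/java
vix.env.HADOOP_CLIENT_OPTS = -XX:-PrintWarnings
```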


jmagiera_splunk
Splunk Employee

I had a similar error with Java 1.7.0 installed on Hunk; after a downgrade to Java 1.6.0, Hunk works. I use Hortonworks as the Hadoop platform.


thesteve
Path Finder

While the above answers are both correct, I wanted to leave something a little more complete since I too felt the pain of dealing with this error.

You can disable the error message or redirect it to stderr, but that only moves the error out of your way and doesn't address the root problem: the Hadoop binary distribution does not include 64-bit native libraries. They must be compiled from source.

You can build your own distribution that includes native libraries using the following steps:

1) Install developer tools and dependencies:

1a) From repositories:

apt-get install gcc g++ make maven cmake zlib1g zlib1g-dev

For Red Hat environments, you can use the equivalent yum line:

yum install gcc gcc-c++ make maven cmake zlib zlib-devel

There may be other dependencies or slightly different package names depending on what you already have installed and which OS you are running. If so, googleable errors will pop up during the rest of the process.
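Before starting, it can save time to confirm the tools are actually on your PATH. Here is a small helper (hypothetical, not from the thread) that reports which of the listed tools are missing:

```shell
# Reports which of the named tools are missing from PATH, so you can
# catch gaps before the build fails halfway through.
check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  echo "$missing"
}

# the tools the build steps below rely on (`mvn` is the Maven binary)
check_tools gcc g++ make mvn cmake
```

An empty result means everything was found; otherwise the missing tool names are printed.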

1b) Protocol Buffers From Source:

mkdir /tmp/protobuf
cd /tmp/protobuf
wget http://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
tar -xvzf ./protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure --prefix=/usr
make
sudo make install
cd java
mvn install
mvn package
sudo ldconfig
cd /tmp
rm -rf protobuf
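After installing, it is worth verifying that the protoc on your PATH actually meets the "at least 2.5" requirement before kicking off the long Hadoop build. The helper below is a sketch (not from the thread) that compares dotted version strings, assuming GNU `sort -V` is available:

```shell
# succeeds when $1 >= $2, using GNU `sort -V` for version ordering
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

installed=""
command -v protoc >/dev/null 2>&1 && installed=$(protoc --version | awk '{print $2}')
if [ -n "$installed" ] && version_ge "$installed" 2.5.0; then
  echo "protoc $installed is new enough"
else
  echo "protoc is missing or older than 2.5.0; build it as in step 1b"
fi
```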

2) download hadoop source:

mkdir /tmp/hadoop-build
cd /tmp/hadoop-build
wget http://apache.petsads.us/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz
tar -xvzf ./hadoop-2.2.0-src.tar.gz
cd hadoop-2.2.0-src

3) Edit the hadoop-auth pom file.

vi hadoop-common-project/hadoop-auth/pom.xml

add the following dependency:

<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <scope>test</scope>
</dependency>

If you search for "org.mortbay.jetty", you should see an existing dependency that looks very similar; add this new dependency directly above or below it.

3) Compile it:

export Platform=x64 
cd /tmp/hadoop-build/hadoop-2.2.0-src
mvn clean install -DskipTests
cd hadoop-mapreduce-project
mvn package -Pdist,native -DskipTests=true -Dtar 
cd /tmp/hadoop-build/hadoop-2.2.0-src
mvn package -Pdist,native -DskipTests=true -Dtar 

4) Copy your natively compiled distribution somewhere to be saved:

cp /tmp/hadoop-build/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0.tar.gz /my/distribution/share/hadoop-2.2.0.tar.gz

5) Delete the build files (once you are satisfied that everything is working properly):

cd /tmp
rm -rf hadoop-build

Now any fresh installation based on this build will include native 64-bit libraries. You can set up a new instance of Hadoop locally, or you can simply overwrite the files in the $HADOOP_INSTALL/lib/native directory with those in your hadoop-2.2.0.tar.gz file.
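To confirm that the library you are about to install actually matches your machine's word size (the stock 32-bit libhadoop.so.1.0.0 is what triggers the stack-guard warning on a 64-bit host), you can inspect it with the `file` utility. A sketch, using /bin/sh as a stand-in target since the real path depends on your install:

```shell
# Point LIB at the library you want to check, e.g.
# $HADOOP_INSTALL/lib/native/libhadoop.so.1.0.0 (using /bin/sh as a stand-in here).
LIB=${LIB:-/bin/sh}
uname -m                # x86_64 means you want a 64-bit library
if command -v file >/dev/null 2>&1; then
  file -L "$LIB"        # look for "ELF 64-bit" vs "ELF 32-bit" in the output
else
  echo "the 'file' utility is not installed"
fi
```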

thesteve
Path Finder

Step 1b is only needed if you don't already have Protocol Buffers installed. The Hadoop README specifies that it needs at least 2.5 (my system had 2.4).

I just realized I have 2 step 3s. I couldn't build the native libraries without editing the auth POM. (It's updated in the next release).

I wasn't too sure about copying over just libhadoop.so.1.0.0, so on my first system I simply overwrote all the files in the native directory as well; it tested fine. On subsequent systems I used the generated release tarball.


jgreenleaf
Explorer

I didn't have to do steps 1b or 3, by the way. Also, I just copied over the recompiled native library, as that's all you really need.


jgreenleaf
Explorer

This is a message printed by Apache Hadoop because its binary distribution compiles libhadoop.so for 32-bit systems, while my system was 64-bit. This is fine from Hadoop's standpoint ( http://stackoverflow.com/questions/19943766/hadoop-unable-to-load-native-hadoop-library-for-your-pla... ); however, Splunk treats it as a critical error, so it has to be resolved to get Hunk to work properly. The solution is to get the source distribution of Apache Hadoop and recompile it yourself (which is a huge pain; the Java build conventions are the worst).
