Hi,
When I search for events from the virtual index, I start to receive events but the query only finishes partially and displays this message:
ChunkedOutputStreamReader: Invalid transport header line="194.xxx.xx.185 194.xxx.xx.185 - [15/Nov/2016:11:22:38 +0100] "GET /stat/ebanking/min/css/img_s24/menu/menu-stin.png HTTP/1.1" 200 150 "https://www.mycompany.cz/stat/ebanking/min/css/global_s24.css?v=37_prod.25" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36" "17850" TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 0"
If I query the same virtual index with no filter, the job finishes OK.
Any advice?
Regards,
1) Is the Hadoop version on the client (the Splunk search head) the same as the Hadoop version on the server?
2) Can you share your configuration from /opt/splunk/etc/apps/search/local/indexes.conf ?
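For the version check, a minimal sketch (assuming the Hadoop client binaries are on the PATH of both machines) is:

```shell
# Run on the Splunk search head (the Hadoop client):
hadoop version

# Run on a cluster node (e.g. the NameNode or ResourceManager host):
hadoop version

# The first line of output on both machines should match.
```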
Hi,
Here is the conf file:
[provider:pruebas]
vix.command.arg.3 = $SPLUNK_HOME/bin/jars/SplunkMR-hy2.jar
vix.env.HADOOP_HOME = /usr/lib/hadoop
vix.env.HUNK_THIRDPARTY_JARS = $SPLUNK_HOME/bin/jars/thirdparty/common/avro-1.7.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/avro-mapred-1.7.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/commons-compress-1.10.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/commons-io-2.4.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/libfb303-0.9.2.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/parquet-hive-bundle-1.6.0.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/snappy-java-1.1.1.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-exec-1.2.1.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-metastore-1.2.1.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-serde-1.2.1.jar
vix.env.JAVA_HOME = /etc/alternatives/jre_1.8.0_openjdk
vix.family = hadoop
vix.fs.default.name = hdfs://172.20.1.xxx:8020
vix.mapreduce.framework.name = yarn
vix.output.buckets.max.network.bandwidth = 0
vix.splunk.home.hdfs = /tmp/
vix.env.HADOOP_HEAPSIZE = 1024
vix.mapred.child.java.opts = -server -Xmx2048m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
#vix.mapred.job.map.memory.mb = 2048
#vix.mapred.job.reduce.memory.mb = 2048
vix.mapreduce.map.java.opts = -server -Xmx2048m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
vix.mapreduce.map.memory.mb = 1024
vix.mapreduce.reduce.java.opts = -server -Xmx2048m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
vix.mapreduce.reduce.memory.mb = 2048
vix.yarn.resourcemanager.address = 172.20.1.xxx:8032
vix.yarn.resourcemanager.scheduler.address = 172.20.1.xxx:8030
vix.mapreduce.jobhistory.address = 172.20.1.xxx:10020
vix.mapred.job.reduce.memory.mb = 1024
Thanks in advance
The only thing that does not look right is the flag vix.splunk.home.hdfs = /tmp/
Are you sure /tmp/ actually exists on HDFS? Also, please confirm that the version of Hadoop on the client is the same as on the server.
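To verify, a quick check from the search head might look like this (run it as the same user Splunk runs as):

```shell
# Does /tmp/ exist on HDFS, and what are its permissions?
hdfs dfs -ls -d /tmp/

# Can this user actually write there? Create and then remove a test file.
hdfs dfs -touchz /tmp/hunk-write-test
hdfs dfs -rm /tmp/hunk-write-test
```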
Hi rdagan
I checked, and the SH has the same version, 5.8.2. I also checked that /tmp/ exists and has write permissions; I can see the internal folders that Hunk creates.
Any other ideas?
Thanks in advance for the help
My recommendation would be to:
1) Use Cloudera Manager or Ambari and add the Splunk Search Head as a new Edge Node. That should remove all the MapReduce errors you are seeing.
** Right now it looks like you are able to see a few events, but then the job crashes. The first few events you are seeing are streamed back from Hadoop without using MapReduce. Splunk calls this feature Mixed Mode.
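If you want to isolate whether only the MapReduce phase is failing, Mixed Mode can be toggled per provider in indexes.conf (this is my recollection of the standard Hunk setting name, so please verify it against your version's documentation):

```ini
[provider:pruebas]
# false = MapReduce only (no streaming preview); true = default mixed mode
vix.splunk.search.mixedmode = false
```

With streaming disabled, a search that returns nothing at all confirms the problem is in the MapReduce phase rather than in the virtual index itself.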
2) From the command line, make sure you are able to run MapReduce jobs:
*. Prep: Make sure to run this test as the same user, and in the same queue, as Splunk
*. Generate ~1GB of data (TeraGen rows are 100 bytes each, so 10000000 rows is about 1GB):
yarn jar /usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen 10000000 /root/splunkmr/teraInput
*. Run TeraSort to sort the generated data:
yarn jar /usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort /root/splunkmr/teraInput /root/splunkmr/teraOutput
*. Run TeraValidate:
yarn jar /usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduce-examples.jar teravalidate -D mapred.reduce.tasks=8 /root/splunkmr/teraOutput /root/splunkmr/teraValidate