Splunk Search

Why does my Hunk search partially completes then displays message "ChunkedOutputStreamReader: Invalid transport header line"?

jmallorquin
Builder

Hi,

When I search for events from the virtual index, I start to receive events but the query only finishes partially and displays this message:

ChunkedOutputStreamReader: Invalid transport header line="194.xxx.xx.185 194.xxx.xx.185 - [15/Nov/2016:11:22:38 +0100] "GET /stat/ebanking/min/css/img_s24/menu/menu-stin.png HTTP/1.1" 200 150 "https://www.mycompany.cz/stat/ebanking/min/css/global_s24.css?v=37_prod.25" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36" "17850" TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 0"

If i query the same virtual index but with no filter, the job finish ok.

Any advice?

Regards,

0 Karma

rdagan_splunk
Splunk Employee
Splunk Employee

1) Is the version of your Hadoop on the Client - Splunk Search Head - the same version as your Hadoop on the server?
2) Can you share your configurations from /opt/splunk/etc/apps/search/local/indexes.conf ?

0 Karma

jmallorquin
Builder

Hi,

Here is the conf file:

[provider:pruebas]
vix.command.arg.3 = $SPLUNK_HOME/bin/jars/SplunkMR-hy2.jar
vix.env.HADOOP_HOME = /usr/lib/hadoop
vix.env.HUNK_THIRDPARTY_JARS = $SPLUNK_HOME/bin/jars/thirdparty/common/avro-1.7.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/avro-mapred-1.7.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/commons-compress-1.10.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/commons-io-2.4.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/libfb303-0.9.2.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/parquet-hive-bundle-1.6.0.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/snappy-java-1.1.1.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-exec-1.2.1.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-metastore-1.2.1.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-serde-1.2.1.jar
vix.env.JAVA_HOME = /etc/alternatives/jre_1.8.0_openjdk
vix.family = hadoop
vix.fs.default.name = hdfs://172.20.1.xxx:8020
vix.mapreduce.framework.name = yarn
vix.output.buckets.max.network.bandwidth = 0
vix.splunk.home.hdfs = /tmp/
vix.env.HADOOP_HEAPSIZE = 1024
vix.mapred.child.java.opts = -server -Xmx2048m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
#vix.mapred.job.map.memory.mb = 2048
#vix.mapred.job.reduce.memory.mb = 2048
vix.mapreduce.map.java.opts = -server -Xmx2048m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
vix.mapreduce.map.memory.mb = 1024
vix.mapreduce.reduce.java.opts = -server -Xmx2048m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
vix.mapreduce.reduce.memory.mb = 2048
vix.yarn.resourcemanager.address = 172.20.1.xxx:8032
vix.yarn.resourcemanager.scheduler.address = 172.20.1.xxx:8030
vix.mapreduce.jobhistory.address = 172.20.1.xxx:10020
vix.mapred.job.reduce.memory.mb = 1024

Thanks in advance

0 Karma

rdagan_splunk
Splunk Employee
Splunk Employee

The only thing that does not look right is the flag vix.splunk.home.hdfs = /tmp/

Are you sure /tmp/ actually exists on HDFS? Also, please confirm that the verison of Hadoop on the Client is the same as on the Server?

0 Karma

jmallorquin
Builder

Hi rdagan

I checkd and the SH have the same versión 5.8.2 . Also checked that /tmp/ has write permisos and exists, I can see the internal folders that Hunk create.

Other idea?

Thanks in advance for the help

0 Karma

rdagan_splunk
Splunk Employee
Splunk Employee

My recommendation will be to:
1) Use Cloudera Manager or Ambari and add the Splunk Search Head as a new Edge Node. That should remove all the MapReduce errors you are seeing.

** Right now it looks like you are able to see few events, but then the Job crash. The first few events you are seeing are Streaming back from Hadoop and Not using MapReduce. Splunk calls this feature Mix Mode.

2) From the Command line make sure you are able to run MapReduce Jobs
*. Prep: Make sure running this test under the same user, same queue as Splunk

*. Generate 1GB of data
yarn jar
/usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduceexamples.jar
teragen 1000000000 /root/splunkmr/teraInput

*. Run Terasort to sort the generated data
yarn jar
/usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduce-examples.jar
terasort /root/spllunkmr/teraInput /root/splunkmr/teraOutput

*. Run TeraValidae
yarn jar
/usr/hdp/2.3.0.0-2557/hadoop-mapreduce/hadoop-mapreduce-examples.jar
teravalidate -D mapred.reduce.tasks=8 /root/splunkmr/teraOutput
/root/splunkmr/teraValidate

0 Karma
Get Updates on the Splunk Community!

Federated Search for Amazon S3 | Key Use Cases to Streamline Compliance Workflows

Modern business operations are supported by data compliance. As regulations evolve, organizations must ...

New Dates, New City: Save the Date for .conf25!

Wake up, babe! New .conf25 dates AND location just dropped!! That's right, this year, .conf25 is taking place ...

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud

Introduction to Splunk Observability Cloud - Building a Resilient Hybrid Cloud  In today’s fast-paced digital ...