Hi There,
I'm also experiencing the same issue. I'd like to not create a new question since I'd like to have an answer for this problem.
A couple of observations:
If I do a straight search of an index (e.g: index=name). My MapReduce job returns fine, all results are posted.
If I do a search with a filter (e.g: index=name field=value), I will receive this error 100% of the time.
Checking the search and task tracker logs, it seems like the map job completes (hits 100%), and doesn't return properly afterwards. I'm not sure if there is a reduce portion of the MR job, but if there is, it never starts.
I'm running a kerberos secured hortonworks hadoop cluster - HDP 2.2.4.2-2 and running jobs via Yarn 2.x
Here are my logs below:
Interface Errors:
[provider] Error while running external process, return_code=255. See search.log for more info
[provider] IOException - Error while waiting for MapReduce job to complete, job_id=[!http://:8088/proxy/application_1440205050298_7902/ job_1440205050298_7902], state=FAILED, reason=Task failed task_1440205050298_7902_m_000003
search.log:
09-23-2015 18:47:09.007 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 0.00%, Reduce: 0.00%
09-23-2015 18:47:09.540 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:09.843 INFO DispatchThread - Generating results preview took 1 ms
09-23-2015 18:47:10.505 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:10.786 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 0.00%, Reduce: 0.00%
09-23-2015 18:47:11.463 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:11.759 INFO DispatchThread - Generating results preview took 1 ms
09-23-2015 18:47:12.449 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:12.753 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 0.00%, Reduce: 0.00%
09-23-2015 18:47:13.474 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:13.791 INFO DispatchThread - Generating results preview took 1 ms
09-23-2015 18:47:14.517 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:14.835 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 0.00%, Reduce: 0.00%
09-23-2015 18:47:15.741 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:16.146 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 11.97%, Reduce: 0.00%
09-23-2015 18:47:16.150 INFO DispatchThread - Generating results preview took 1 ms
09-23-2015 18:47:17.031 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:18.104 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:18.402 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 11.97%, Reduce: 0.00%
09-23-2015 18:47:18.402 INFO DispatchThread - Generating results preview took 1 ms
09-23-2015 18:47:19.368 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:20.080 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 18.38%, Reduce: 0.00%
09-23-2015 18:47:22.089 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 25.27%, Reduce: 0.00%
09-23-2015 18:47:24.100 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 25.27%, Reduce: 0.00%
09-23-2015 18:47:26.112 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 16.70%, Reduce: 0.00%
09-23-2015 18:47:28.125 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 0.00%, Reduce: 0.00%
09-23-2015 18:47:29.808 INFO DispatchThread - Generating results preview took 1 ms
09-23-2015 18:47:30.137 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 0.00%, Reduce: 0.00%
09-23-2015 18:47:32.146 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 100.00%, Reduce: 0.00%
09-23-2015 18:47:34.158 INFO ERP.provider - JobSubmitter - JobId=job_1440205050298_7902, Map: 100.00%, Reduce: 0.00%
09-23-2015 18:47:34.359 INFO ERP.provider - JobSubmitter - job.nameSPLK_<hunk-server>_1443059205.319_0, Successful tasks: 0
09-23-2015 18:47:34.359 INFO ERP.provider - JobSubmitter - job.nameSPLK_<hunk-server>_1443059205.319_0, Failed or killed tasks: 4
09-23-2015 18:47:34.420 INFO ERP.provider - JobSubmitter - time spent writing map errors, elapsed_ms=60
09-23-2015 18:47:34.498 INFO ERP.provider - FileBasedHeartBeat - stopping hearbeat thread, path=/user/hunk/dispatch/1443059205.319, interval=1000, _startIndex=0, _index=43
09-23-2015 18:47:34.499 WARN PreviewWriter - - data_source="/data/file/path/part-r-00000", data_host="<hunk-node>", data_sourcetype="none"
09-23-2015 18:47:34.503 ERROR ERP.provider - SplunkMR - Error while waiting for MapReduce job to complete, job_id=[!http://<namenode>:8088/proxy/application_1440205050298_7902/ job_1440205050298_7902], state=FAILED, reason=Task failed task_1440205050298_7902_m_000003
09-23-2015 18:47:34.503 ERROR ERP.provider - Job failed as tasks failed. failedMaps:1 failedReduces:0
09-23-2015 18:47:34.503 ERROR ERP.provider - java.io.IOException: Error while waiting for MapReduce job to complete, job_id=[!http://<namenode>:8088/proxy/application_1440205050298_7902/ job_1440205050298_7902], state=FAILED, reason=Task failed task_1440205050298_7902_m_000003
09-23-2015 18:47:34.503 ERROR ERP.provider - Job failed as tasks failed. failedMaps:1 failedReduces:0
09-23-2015 18:47:34.503 ERROR ERP.provider - at com.splunk.mr.JobSubmitter.waitForCurrentJobToComplete(JobSubmitter.java:493)
09-23-2015 18:47:34.503 ERROR ERP.provider - at com.splunk.mr.JobSubmitter.waitForCompletion(JobSubmitter.java:204)
09-23-2015 18:47:34.503 ERROR ERP.provider - at com.splunk.mr.SplunkMR$SearchHandler.executeMapReduce(SplunkMR.java:1208)
09-23-2015 18:47:34.503 ERROR ERP.provider - at com.splunk.mr.SplunkMR$SearchHandler.executeImpl(SplunkMR.java:936)
09-23-2015 18:47:34.503 ERROR ERP.provider - at com.splunk.mr.SplunkMR$SearchHandler.execute(SplunkMR.java:779)
09-23-2015 18:47:34.503 ERROR ERP.provider - at com.splunk.mr.SplunkMR.runImpl(SplunkMR.java:1526)
09-23-2015 18:47:34.503 ERROR ERP.provider - at com.splunk.mr.SplunkMR.run(SplunkMR.java:1308)
09-23-2015 18:47:34.503 ERROR ERP.provider - at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
09-23-2015 18:47:34.503 ERROR ERP.provider - at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
09-23-2015 18:47:34.503 ERROR ERP.provider - at com.splunk.mr.SplunkMR.main(SplunkMR.java:1554)
09-23-2015 18:47:34.874 ERROR ERP.provider - Error while invoking command: /usr/hdp/2.2.4.2-2/hadoop/bin/hadoop com.splunk.mr.SplunkMR - Return code: 255
09-23-2015 18:47:34.875 WARN PreviewWriter - - data_source="/data/file/path/part-r-00000", data_host="<hunk-host>", data_sourcetype="none"
09-23-2015 18:47:34.875 INFO SearchOperator:stdin - StdinDataProvider::finish called
09-23-2015 18:47:34.875 INFO AggregatorMiningProcessor - Got done message for: source::/data/file/path/part-r-00000|host::<hunk-host>||2
09-23-2015 18:47:35.360 WARN SearchOperator:kv - buildRegexList provided empty conf key, ignoring.
09-23-2015 18:47:35.552 INFO ERPSearchResultCollector - ERP peer=provider is done reading search results.
Task Tracker Log - Specifically the mapper described by the interface error (task_1440205050298_7902_m_000003)
2015-09-23 18:47:08,081 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2015-09-23 18:47:08,175 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-09-23 18:47:08,176 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2015-09-23 18:47:08,196 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2015-09-23 18:47:08,196 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1440205050298_7902, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@1680748d)
2015-09-23 18:47:08,273 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:provider, Ident: (HDFS_DELEGATION_TOKEN token 47861 for hunk)
2015-09-23 18:47:08,274 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: TIMELINE_DELEGATION_TOKEN, Service: 192.168.1.10:8188, Ident: (owner=hunk, renewer=yarn, realUser=, issueDate=1443059213325, maxDate=1443664013325, sequenceNumber=39, masterKeyId=2)
2015-09-23 18:47:08,389 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2015-09-23 18:47:08,622 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: <data-node-loc>/usercache/hunk/appcache/application_1440205050298_7902,<data-node-loc>/usercache/hunk/appcache/application_1440205050298_7902,<data-node-loc>/usercache/hunk/appcache/application_1440205050298_7902,<data-node-loc>/usercache/hunk/appcache/application_1440205050298_7902,<data-node-loc>/usercache/hunk/appcache/application_1440205050298_7902,<data-node-loc>/usercache/hunk/appcache/application_1440205050298_7902,<data-node-loc>/usercache/hunk/appcache/application_1440205050298_7902
2015-09-23 18:47:09,118 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2015-09-23 18:47:09,664 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2015-09-23 18:47:09,975 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: /data/file/path/to/_SUCCESS:0+0
2015-09-23 18:47:10,191 INFO [main] com.splunk.mr.input.VixTimeSpecifier: using timezone=null, _tz.id="<time-zone>", name="<time-zone>" for regex=/data/file/path/to(20\d\d)/(\d\d), format=yyyyMM
2015-09-23 18:47:10,191 INFO [main] com.splunk.mr.input.VixTimeSpecifier: using timezone=null, _tz.id="<time-zone>", name="<time-zone>" for regex=/data/file/path/to(20\d\d)/(\d\d), format=yyyyMM
2015-09-23 18:47:10,209 WARN [main] com.splunk.mr.SplunkBaseMapper: Could not create preprocessor object, will try the next one ... class=com.splunk.mr.input.SplunkJournalRecordReader, message=File path does not match regex to use this record reader, class=com.splunk.mr.input.SplunkJournalRecordReader, name=splunkjournal, path=hdfs://data/file/path/to/_SUCCESS, regex=/journal\.gz$.
2015-09-23 18:47:10,218 WARN [main] com.splunk.mr.SplunkBaseMapper: Could not create preprocessor object, will try the next one ... class=com.splunk.mr.input.ValueAvroRecordReader, message=File path does not match regex to use this record reader, class=com.splunk.mr.input.ValueAvroRecordReader, name=avro, path=hdfs://data/file/path/to/_SUCCESS, regex=\.avro$.
2015-09-23 18:47:10,225 WARN [main] com.splunk.mr.SplunkBaseMapper: Could not create preprocessor object, will try the next one ... class=com.splunk.mr.input.SimpleCSVRecordReader, message=File path does not match regex to use this record reader, class=com.splunk.mr.input.SimpleCSVRecordReader, name=csv, path=hdfs://data/file/path/to/_SUCCESS, regex=\.([tc]sv)(?:\.(?:gz|bz2|snappy))?$.
2015-09-23 18:47:10,230 WARN [main] com.splunk.mr.SplunkBaseMapper: Could not create preprocessor object, will try the next one ... class=com.splunk.mr.input.SequenceFileRecordReader, message=File path does not match regex to use this record reader, class=com.splunk.mr.input.SequenceFileRecordReader, name=sequence, path=hdfs://data/file/path/to/_SUCCESS, regex=\.seq$.
2015-09-23 18:47:10,230 INFO [main] com.splunk.mr.JobSubmitterInputFormat: using class=com.splunk.mr.input.SplunkLineRecordReader to process split=/data/file/path/to/_SUCCESS:0+0
2015-09-23 18:47:10,249 INFO [main] com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
2015-09-23 18:47:10,252 INFO [main] com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 66217595dd210805b8cf223b2d1b3b6f77fda073]
2015-09-23 18:47:10,328 INFO [main] com.splunk.mr.SplunkSearchMapper: CONF_DN_HOME is set to /tmp/splunk/<hunk-server>/
2015-09-23 18:47:10,341 INFO [main] com.splunk.mr.SplunkSearchMapper: Ensuring Hunk is setup correctly took elapsed_ms=14
2015-09-23 18:47:10,497 INFO [main] com.splunk.mr.SplunkSearchMapper: _splunkProcess.getStdin()=java.lang.UNIXProcess$ProcessPipeOutputStream@267fd96e
2015-09-23 18:47:10,504 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
2015-09-23 18:47:10,535 INFO [main] com.splunk.mr.SplunkBaseMapper: Pushed records=0, bytes=0 to search process ...
2015-09-23 18:47:10,948 INFO [main] com.splunk.mr.SplunkSearchMapper: Processed records=0, bytes=0, return_code=134, elapsed_ms=588
2015-09-23 18:47:11,147 ERROR [main] com.splunk.mr.SplunkSearchMapper: Mapper exception=java.io.IOException: Search process failed to exit cleanly. Return code=134.
2015-09-23 18:47:11,318 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Search process failed to exit cleanly. Return code=134.
at com.splunk.mr.SplunkSearchMapper.runImpl(SplunkSearchMapper.java:325)
at com.splunk.mr.SplunkBaseMapper.run(SplunkBaseMapper.java:169)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
2015-09-23 18:47:11,323 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
Any help that anyone can provide would be appreciated.
... View more