My provider configuration in indexes.conf looks like this:
[provider:analytics-emr]
vix.env.HADOOP_HOME = /opt/hadoop-2.2.0
vix.env.JAVA_HOME = /usr/lib/jvm/java-7-oracle-1.7.0.45/jre
vix.family = hadoop
vix.fs.default.name = xxx
vix.mapreduce.framework.name = yarn
vix.yarn.resourcemanager.address = xxx:8032
vix.yarn.resourcemanager.scheduler.address = xxx:8030
vix.splunk.home.hdfs = /hunk-dir
vix.splunk.setup.package = /opt/splunk_packages/hunk-6.1.1.tgz
According to the Web UI, this provider is using MR v1. How do I configure it to use MR v2?
I wanted to do this without going through the UI, mainly so that I can launch a new Hunk node programmatically.
After some messing around, I found that if I set the following
vix.command.arg.3 = $SPLUNK_HOME/bin/jars/SplunkMR-s6.0-hy2.0.jar
Hunk knows that I want to use YARN. There is probably another value for MR v2.
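Since the goal is to set this up without the UI, here is a minimal sketch of how the provider stanza could be generated from a script. The helper name render_provider_stanza is hypothetical, and the settings are just the ones from my configuration above plus the vix.command.arg.3 value I found. It only builds the stanza text; writing it into the right indexes.conf and getting Hunk to pick it up are not covered here.

```python
def render_provider_stanza(name, settings):
    """Return an indexes.conf provider stanza as text.

    Hypothetical helper: it only formats key/value pairs and does not
    validate them against what Hunk actually accepts.
    """
    lines = ["[provider:%s]" % name]
    for key, value in settings.items():
        lines.append("%s = %s" % (key, value))
    return "\n".join(lines) + "\n"


# Values copied from the configuration in this post; "xxx" placeholders
# are left as-is and must be replaced with real hosts.
settings = {
    "vix.family": "hadoop",
    "vix.fs.default.name": "xxx",
    "vix.mapreduce.framework.name": "yarn",
    "vix.yarn.resourcemanager.address": "xxx:8032",
    "vix.yarn.resourcemanager.scheduler.address": "xxx:8030",
    "vix.command.arg.3": "$SPLUNK_HOME/bin/jars/SplunkMR-s6.0-hy2.0.jar",
}

stanza = render_provider_stanza("analytics-emr", settings)
print(stanza)
```

The printed text can then be appended to indexes.conf by whatever provisioning tool launches the node.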
You can select the Hadoop version in the UI under "Provider".
You also need to add these settings to your indexes.conf:
vix.mapred.job.tracker = jobtracker.hadoop.splunk.com:8021
vix.fs.default.name = hdfs://hdfs.hadoop.splunk.com:8020
vix.splunk.home.datanode = /
For more details, see:
http://docs.splunk.com/Documentation/Hunk/6.1.1/Hunk/Setupavirtualindex