
Hunk timing out when connecting to Hive Metastore

Path Finder

I have Hunk working against my Kerberized Hadoop cluster, doing "normal" searches on data by reading HDFS and running MR jobs.

Now I am trying to get it to talk to Hive tables. I believe Hunk talks to the metastore to identify the details of the table I am querying, then goes to the underlying files directly and launches an MR job (if necessary). It does not actually talk JDBC or ODBC to HiveServer2. (This is a shame in my opinion, as Tez is a lot faster than MR.)
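As I understand it, that metadata lookup is roughly equivalent to the following sketch using the standard Hive metastore Java client (the endpoint is the one from my indexes.conf below; the exact calls Hunk makes internally are my assumption):

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Table;

public class MetastoreLookup {
    public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        // Same endpoint as vix.hive.metastore.uris in indexes.conf
        conf.setVar(HiveConf.ConfVars.METASTOREURIS,
                "thrift://bruathdp003.MYDOMAIN.local:9083");
        // On a Kerberized metastore the client must also enable SASL
        // (hive.metastore.sasl.enabled) and set the metastore principal.
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        Table t = client.getTable("default", "demo_order_history");
        // The table's storage location is what Hunk then reads directly
        // when it generates splits for the MR job.
        System.out.println(t.getSd().getLocation());
        client.close();
    }
}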

However, for me Hunk is timing out when connecting to the metastore, and the metastore then reports a message it could not understand. The error on the server is identical to the one you get when Nagios opens a connection and then disconnects, so I am worried that Hunk is not talking Thrift properly to the metastore...

Here is what I see in the web UI:
[myuathiveprovider] Error while running external process, returncode=255. See search.log for more info
[myuathiveprovider] RuntimeException - Failed to get Hive metadata for table [demo_order_history] and db [default]:java.net.SocketTimeoutException: Read timed out

Diving into search.log, I see this:

07-02-2015 18:05:36.915 DEBUG ERP.myuathiveprovider - method: "KERBEROS"
07-02-2015 18:05:36.915 DEBUG ERP.myuathiveprovider - mechanism: "GSSAPI"
07-02-2015 18:05:36.915 DEBUG ERP.myuathiveprovider - protocol: "rm"
07-02-2015 18:05:36.915 DEBUG ERP.myuathiveprovider - serverId: "bruathdp002.MYDOMAIN.local"
07-02-2015 18:05:36.915 DEBUG ERP.myuathiveprovider - }
07-02-2015 18:05:36.917 DEBUG ERP.myuathiveprovider - SaslRpcClient - Received SASL message state: CHALLENGE
07-02-2015 18:05:36.917 DEBUG ERP.myuathiveprovider - token: "f\006\t*\206H\206\367\022\001\002\002\002\000oW0U\240\003\002\001\005\241\003\002\001\017\242I0G\240\003\002\001\027\242@\004>\336yJ\202\335;[V&\335\\\246\031\314u\234\317\214\036\361\244\356\030\355@\252\313<\310\270\315\037p+\a\0357\270b\020#\037\332\\0\v\353G\036\306\226\277!\201\207r\020\320\351\274Q\256"
07-02-2015 18:05:36.918 DEBUG ERP.myuathiveprovider - SaslRpcClient - Sending sasl message state: RESPONSE
07-02-2015 18:05:36.918 DEBUG ERP.myuathiveprovider - token: ""
07-02-2015 18:05:36.919 DEBUG ERP.myuathiveprovider - SaslRpcClient - Received SASL message state: CHALLENGE
07-02-2015 18:05:36.919 DEBUG ERP.myuathiveprovider - token: "
0\006\t\206H\206\367\022\001\002\002\002\001\021\000\377\377\377\377\207\261\312\030$\000(\030\257\312.\267u\242\257\365\314\317\3778f\217\343\001\001\000\000\001"
07-02-2015 18:05:36.920 DEBUG ERP.myuathiveprovider - SaslRpcClient - Sending sasl message state: RESPONSE
07-02-2015 18:05:36.920 DEBUG ERP.myuathiveprovider - token: "
0\006\t
\206H\206\367\022\001\002\002\002\001\021\000\377\377\377\377\223\345q\203J\036E\376\252\342^?\327\025\267\274\247\017\201\331\016\217\323\3104\001\001\000\000\001"
07-02-2015 18:05:36.923 DEBUG ERP.myuathiveprovider - SaslRpcClient - Received SASL message state: SUCCESS
07-02-2015 18:05:36.925 DEBUG ERP.myuathiveprovider - ProtobufRpcEngine$Invoker - Call: getClusterMetrics took 30ms
07-02-2015 18:05:36.942 DEBUG ERP.myuathiveprovider - ProtobufRpcEngine$Invoker - Call: getClusterNodes took 2ms
07-02-2015 18:05:36.960 WARN ERP.myuathiveprovider - ResourceMgrDelegate - getBlacklistedTrackers - Not implemented yet
07-02-2015 18:05:36.960 INFO ERP.myuathiveprovider - ClusterInfoLogger - Hadoop cluster spec: provider=myuathiveprovider, tasktrackers=5, mapinuse=1, mapslots=50, reduceinuse=1, reduceslots=10
07-02-2015 18:05:37.677 INFO ERP.myuathiveprovider - HiveMetaStoreClient - Connected to metastore.
07-02-2015 18:05:42.967 INFO DispatchThread - Generating results preview took 1 ms
07-02-2015 18:05:45.894 DEBUG ERP.myuathiveprovider - SearchController - stopOnMaxHadoopNodesExceedingLicense ...
07-02-2015 18:05:45.920 WARN ERP.myuathiveprovider - SearchController - Could not parse enforcemaxnodes=null
07-02-2015 18:05:52.971 INFO DispatchThread - Generating results preview took 1 ms
07-02-2015 18:05:55.920 DEBUG ERP.myuathiveprovider - SearchController - stopOnMaxHadoopNodesExceedingLicense ...
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - SplunkMR - Failed to get Hive metadata for table [demo_order_history] and db [default]:java.net.SocketTimeoutException: Read timed out
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - java.lang.RuntimeException: Failed to get Hive metadata for table [demo_order_history] and db [default]:java.net.SocketTimeoutException: Read timed out
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.datasource.hive.HiveMetaStoreUtils.getTable(HiveMetaStoreUtils.java:72)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.HiveSplitGenerator.getTableSchema(HiveSplitGenerator.java:88)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.HiveSplitGenerator.sendSplitToAcceptor(HiveSplitGenerator.java:67)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.FileSplitGenerator.generateSplits(FileSplitGenerator.java:68)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.VirtualIndex$FileSplitter.accept(VirtualIndex.java:1418)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.VirtualIndex$FileSplitter.accept(VirtualIndex.java:1396)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.VirtualIndex$VIXPathSpecifier.addStatus(VirtualIndex.java:576)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.VirtualIndex$VIXPathSpecifier.listStatus(VirtualIndex.java:609)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.VirtualIndex$Splitter.generateSplits(VirtualIndex.java:1566)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.VirtualIndex.generateSplits(VirtualIndex.java:1485)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.VirtualIndex.generateSplits(VirtualIndex.java:1437)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.input.VixSplitGenerator.generateSplits(VixSplitGenerator.java:55)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.SplunkMR$SearchHandler.streamData(SplunkMR.java:674)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.SplunkMR$SearchHandler.executeImpl(SplunkMR.java:936)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.SplunkMR$SearchHandler.execute(SplunkMR.java:771)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.PrivilegedCommandHandlerAction.run(PrivilegedCommandHandlerAction.java:51)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at com.splunk.mr.PrivilegedCommandHandlerAction.run(PrivilegedCommandHandlerAction.java:17)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at java.security.AccessController.doPrivileged(Native Method)
07-02-2015 18:05:57.738 ERROR ERP.myuathiveprovider - at javax.security.auth.Subject.doAs(Subject.java:415)

This tells me that it was able to complete the Kerberos authentication, but then failed about 20 seconds later (from second 37 to second 57).

As far as I can see it is not a firewall issue, because a connection to the relevant port is made.
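For reference, a probe along these lines (host and port taken from vix.hive.metastore.uris below) is enough to show the TCP connection being established:

import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    public static void main(String[] args) throws Exception {
        // Plain TCP connect with a 5 second timeout; if this succeeds,
        // the port is open and any later timeout happens inside the protocol.
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress("bruathdp003.MYDOMAIN.local", 9083), 5000);
            System.out.println("connected: " + s.isConnected());
        }
    }
}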

Here is some of my indexes.conf:

[provider:myuathiveprovider]
vix.family = hadoop
vix.env.JAVA_HOME = /opt/projects/java/jdk7latest64
vix.env.HADOOP_HOME = /usr/hdp/current/hadoop-client
vix.mapred.job.tracker = bruathdp002.MYDOMAIN.local:50030

vix.splunk.setup.package = /opt/hunk/hunk-6.2.1-249325-linux-2.6-x86_64.tgz

vix.fs.default.name = hdfs://uatcluster:8020
vix.splunk.home.hdfs = /user/splunk
vix.splunk.home.datanode = /home/hunk/$SPLUNK_SERVER_NAME/
vix.hadoop.security.authentication = kerberos
vix.hadoop.security.authorization = true
vix.java.security.krb5.conf = /etc/krb5.conf
vix.kerberos.principal = splunk@MYDOMAIN.LOCAL
vix.kerberos.keytab = /etc/security/keytabs/splunk.service.keytab
vix.dfs.namenode.kerberos.principal = nn/_HOST@MYDOMAIN.LOCAL
vix.mapred.job.reuse.jvm.num.tasks = 1
vix.yarn.resourcemanager.principal = rm/_HOST@MYDOMAIN.LOCAL
vix.yarn.nodemanager.principal = yarn/_HOST@MYDOMAIN.LOCAL
vix.hive.metastore.sasl.enabled = false
vix.hive.metastore.kerberos.principal = hive/_HOST@MYDOMAIN.LOCAL
vix.command.arg.3 = $SPLUNK_HOME/bin/jars/SplunkMR-s6.0-hy2.0.jar
vix.mapreduce.framework.name = yarn
vix.yarn.resourcemanager.address = bruathdp002.MYDOMAIN.local:8050
vix.yarn.resourcemanager.scheduler.address = bruathdp002.MYDOMAIN.local:8030
vix.splunk.search.column.filter = false
vix.splunk.search.debug = 1
vix.output.buckets.max.network.bandwidth = 0
vix.splunk.impersonation = 1
vix.splunk.search.splitter = HiveSplitGenerator
vix.hive.metastore.uris = thrift://bruathdp003.MYDOMAIN.local:9083

[hive01]
vix.description = hive data
vix.input.1.path = /data/orderhistory
vix.provider = myuathiveprovider
vix.input.1.splitter.hive.dbname = default
vix.input.1.splitter.hive.tablename = demo_order_history

The underlying file is, I believe, a CSV text file. At some point in the future I want to use ORC.


Re: Hunk timing out when connecting to Hive Metastore

Splunk Employee

Regarding Hive JDBC, see this blog post, which shows how to connect Splunk DB Connect to Hive: http://blogs.splunk.com/2015/02/25/splunk-db-connect-cloudera-hive-jdbc-connector/

Regarding the error:
1) I assume vix.input.1.path = /data/orderhistory should be vix.input.1.path = /data/orderhistory/... (the trailing /... tells Hunk to match files recursively under that directory).
2) Are you able to connect to this DB and table using the Hive command line on the same node as Hunk, as the Hunk user? (See the example after this list.)
3) I suspect that you do not need these flags:
vix.splunk.impersonation = 1
vix.mapred.job.tracker = bruathdp002.MYDOMAIN.local:50030
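For the check in 2), something like this (keytab path and principal taken from your indexes.conf above) should confirm metastore access from the Hunk node as the Hunk user:

kinit -kt /etc/security/keytabs/splunk.service.keytab splunk@MYDOMAIN.LOCAL
hive -e "DESCRIBE default.demo_order_history"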


Re: Hunk timing out when connecting to Hive Metastore

Splunk Employee

You need to enable SASL:

vix.hive.metastore.sasl.enabled = true

Your metastore is Kerberized, so it expects a SASL handshake on every connection. With SASL disabled, Hunk sends plain Thrift, which the metastore cannot parse - that matches both the "message it could not understand" on the server and the client-side read timeout.
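With SASL on, the metastore principal already in your provider stanza takes effect for the handshake:

vix.hive.metastore.kerberos.principal = hive/_HOST@MYDOMAIN.LOCAL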


Re: Hunk timing out when connecting to Hive Metastore

Path Finder

That seems to be the problem. Thanks.

I have gotten further and now have a new error, but that can be solved separately or put in a new question.

I wish I knew how to find out all these settings.
