I have used Hunk + Hive. It works okay on highly partitioned data, but we have one particular database that is not highly partitioned and is very slow to search with Hunk + Hive.
I am instead trying Splunk DB Connect + our company's JDBC driver for Hive, and I have run into an authentication problem. Our authentication is Kerberos; I have run kinit beforehand and can see with klist that we have valid credentials.
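For reference, the check I mean is just the standard one (the principal shown here is illustrative):
kinit myuser@MYDOMAIN.COM
klist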
The rpc.log says:
Caused by: java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://myhiveserver:50515/mydbname;sasl.qop=auth;principal=hive/myhiveserver@mydomain: GSS initiate failed
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:238)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:175)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:101)
at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:316)
at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:518)
... 33 more
Caused by: org.apache.thrift.transport.TTransportException: GSS initiate failed
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:211)
... 39 more
I tested with beeline using the following and was successful (note the anon anon placeholders; with Kerberos the username and password arguments are ignored):
!connect jdbc:hive2://hiveserver:50515/default;sasl.qop=auth;principal=hive/hiveserver@mydomain anon anon org.apache.hive.jdbc.HiveDriver
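The same test also works non-interactively with beeline's -u flag (quoting the URL because of the semicolons):
beeline -u "jdbc:hive2://hiveserver:50515/default;sasl.qop=auth;principal=hive/hiveserver@mydomain"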
Note: I did have to copy a somewhat random collection of jar files (hive-exec.jar, hive-common.jar, and others) into /opt/splunk/etc/apps/splunk_app_db_connect/bin/lib just to get as far as the GSS error.
1) What do others do for Kerberos with Splunk DB Connect?
2) In the identities file, what do I put for the username and password if the authentication really lives in a keytab?
So the real trick for me with the JDBC driver and Hive was to have these entries in the inputs.conf file. Without the -Djavax line, Kerberos did not work:
[rpcstart://default]
javahome = <path to my java>
useSSL = 0
options = -Djavax.security.auth.useSubjectCredsOnly=false
And for my user and Kerberos password: in identities.conf I simply omitted the password.
[mydb]
use_win_auth = 0
username = user@Mydomain.com
disabled = 0
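If you need a keytab rather than a kinit ticket cache: my understanding is that with -Djavax.security.auth.useSubjectCredsOnly=false, JGSS falls back to a JAAS login entry, so a jaas.conf along these lines should work (a sketch only; the keytab path and principal are placeholders for your own, and some JDKs look up the entry name com.sun.security.jgss.krb5.initiate instead):
com.sun.security.jgss.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytabs/myuser.keytab"
  principal="myuser@MYDOMAIN.COM"
  storeKey=true
  doNotPrompt=true;
};
Then extend the options line to point the JVM at it:
options = -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/opt/splunk/etc/jaas.conf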
Hi, were you able to get Kerberos authentication working with Splunk DB Connect? I am stuck on a similar issue, using DB Connect 3.2. I have already set up db_connection_types.conf and added the Cloudera driver file HiveJDBC41.jar under $SPLUNK_HOME/etc/apps/splunk_app_db_connect/drivers.
Hi @burwell,
I'm having a really bad time trying to configure Splunk to use a Kerberos keytab to authenticate my Impala connection. I'm able to get it working with a temporary Kerberos ticket (kinit), but for our use case we need to use a keytab file. You mentioned keytabs in a few of your comments; did you get it working with a keytab, or are you only using a cached ticket?
Adam
Hi @606866581, yes, we use keytabs for Hunk everywhere and have not had an issue.
In my provider I have these settings (amongst others):
vix.hadoop.security.authentication = kerberos
vix.java.security.krb5.realm = xxmyhost.mydomain.COM
vix.hadoop.security.authorization = true
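For the keytab itself, if I remember right the provider also takes these settings (the principal and path here are placeholders for your own):
vix.kerberos.principal = myuser@MYDOMAIN.COM
vix.kerberos.keytab = /etc/security/keytabs/myuser.keytab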
Another place to possibly ask this question is the hadoop splunk-usergroup Slack channel.
Since this is an untested driver, have you tried creating your own connection type?
$SPLUNK_HOME/etc/apps/splunk_app_db_connect/local/db_connection_types.conf
[myhive2]
displayName = myhive2
serviceClass = com.splunk.dbx2.DefaultDBX2JDBC
jdbcDriverClass = org.apache.hive.jdbc.HiveDriver
jdbcUrlFormat = jdbc:hive2://<host>:<port>/<database>;sasl.qop=auth;principal=hive/hiveserver@mydomain
Then in the UI you use some placeholder parameters.
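A matching connection stanza might then look like this (a sketch; the stanza name, host, and identity are placeholders, and identity refers to a stanza in identities.conf):
$SPLUNK_HOME/etc/apps/splunk_app_db_connect/local/db_connections.conf
[myhive_conn]
connection_type = myhive2
host = hiveserver
port = 50515
database = default
identity = mydb
disabled = 0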
For Kerberos, see the example under /opt/splunk/etc/apps/splunk_app_db_connect/default/db_connection_types.conf (the generic_mssql_kerberos stanza).
Also, this link is useful: http://docs.splunk.com/Documentation/DBX/2.4.0/DeployDBX/Installdatabasedrivers#Install_the_SQL_Serv...
As for the identities file, I can only assume: for Username, enter the username of your Kerberos account, and for Password, the password of your Kerberos account.
Thanks, Raanan.
About the identities file: since the keytab holds the encrypted key rather than a plaintext password, I wasn't sure what to enter there.