Hunk Complex Search Stops Due to Kerberos

benoitleroux
Explorer

Using Hunk, a simple search such as index=myindex retrieves all the expected results. But as soon as I add anything else (e.g. sourcetype=mysourcetype, or something like | stats count by user), the search stops at some point after more than 400,000 events with an error like:

[myprovider] JobStartException - Failed to start MapReduce job. Please consult search.log for more information. Message: [ Failed to start MapReduce job, name=SPLK_myclient_XXXXXXXX.XX_X ] and [ Failed on local exception: java.io.IOException: Couldn't setup connection for myuser/myclient@myrealm to myuser/hadoopnamenode@myrealm; Host Details : local host is: "myclient/myclientIP"; destination host is: "hadoopNameNode":9001; ]

I tried the fix related to the ephemeral port needed to send confirmation that the MapReduce job has finished, by shutting down the firewalls on both the Hadoop and client machines, but the same error occurred.

search.log shows several entries like:

09-09-2014 17:35:31.620 DEBUG ERP.idoop11 - Client$Connection$1 - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]

But it continues to parse the log files until it crashes.

Any clue where to look?

Thanks,


rdagan_splunk
Splunk Employee

Happy to hear things are working


rdagan_splunk
Splunk Employee

The reason you do not see the error when you run just ' index=xyz ' is that you are not running MapReduce (MR) jobs. You are only running Splunk streaming on Hadoop.
To run MR jobs, the search needs a pipe after the base search (' index=xyz | ... ') and must run in Smart Mode.


rdagan_splunk
Splunk Employee

Before you can use Hunk, you need:
1) Hadoop libraries
2) Hadoop libraries of exactly the same version as the Hadoop server. Important: otherwise your MR jobs will fail
3) Java
4) A user that can install Hunk and that exists on HDFS under /user/. Important: otherwise your MR jobs will fail

Once you have the above working, we can talk about Kerberos.

1) Make sure the Hadoop client node (Hunk) has a keytab and a fully working Kerberos setup (kinit, kadmin, etc.) before you configure Hunk to use it

2) Find the file /etc/krb5.conf

3) From that file you can find many of the values that Hunk needs in order to use Kerberos:

vix.java.security.krb5.kdc =
vix.java.security.krb5.realm =
vix.kerberos.principal =
vix.kerberos.keytab =
vix.hadoop.security.authentication =
vix.hadoop.security.authorization =
vix.dfs.namenode.kerberos.principal =
vix.mapreduce.jobtracker.kerberos.principal =
vix.hadoop.security.auth_to_local =
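
As a rough illustration of step 3, the sketch below pulls the default realm and its first KDC out of krb5.conf text, which correspond to vix.java.security.krb5.realm and vix.java.security.krb5.kdc. The sample file contents (MYREALM, kdc.example.com) and the helper name krb5_values are assumptions made for the example, and the parsing is deliberately simple: it does not handle nested braces or include directives.

```python
import re

# Hypothetical krb5.conf contents; on a real node, read /etc/krb5.conf instead.
SAMPLE_KRB5_CONF = """
[libdefaults]
    default_realm = MYREALM

[realms]
    MYREALM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }
"""


def krb5_values(text):
    """Return (default_realm, first kdc of that realm) found in krb5.conf text."""
    realm_match = re.search(r"^\s*default_realm\s*=\s*(\S+)", text, re.MULTILINE)
    realm = realm_match.group(1) if realm_match else None
    kdc = None
    if realm:
        # Find the realm's own block: "REALM = { ... }"
        block = re.search(re.escape(realm) + r"\s*=\s*\{(.*?)\}", text, re.DOTALL)
        if block:
            kdc_match = re.search(r"^\s*kdc\s*=\s*(\S+)", block.group(1), re.MULTILINE)
            kdc = kdc_match.group(1) if kdc_match else None
    return realm, kdc


realm, kdc = krb5_values(SAMPLE_KRB5_CONF)
print(f"vix.java.security.krb5.realm = {realm}")
print(f"vix.java.security.krb5.kdc = {kdc}")
```

The remaining keys (principals, keytab path, auth_to_local rules) depend on how the cluster was set up and cannot be read from krb5.conf alone.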

benoitleroux
Explorer

Thanks, I was able to fix the provider configuration. The only error I made was in the key

vix.mapreduce.jobtracker.kerberos.principal

where I had set MyUser/_HOST@myrealm rather than mapred/_HOST@myrealm. MyUser is the account running the Hunk client; it is not defined on Hadoop, even though it has credentials to access it.

The following keys are not needed, since I use the Kerberos conf file:

vix.java.security.krb5.kdc
vix.java.security.krb5.realm
vix.hadoop.security.auth_to_local
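
For anyone hitting the same UNKNOWN_SERVER error, here is a minimal sketch of a sanity check for this specific mistake: it flags a provider stanza where the JobTracker principal reuses the client user's name. The stanza contents, the keytab path, and the function names are hypothetical, and the check is only a heuristic. The correct service principal (mapred in this thread) depends on the cluster.

```python
# Hypothetical provider stanza reproducing the mistake described above.
BROKEN_STANZA = """
[provider:myprovider]
vix.kerberos.principal = myuser/myclient@myrealm
vix.kerberos.keytab = /path/to/myuser.keytab
vix.mapreduce.jobtracker.kerberos.principal = MyUser/_HOST@myrealm
"""

FIXED_STANZA = BROKEN_STANZA.replace("MyUser/_HOST", "mapred/_HOST")


def parse_settings(stanza):
    """Parse 'key = value' lines, skipping blanks, comments and stanza headers."""
    settings = {}
    for line in stanza.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def primary(principal):
    """Return the first component of a Kerberos principal (before '/' or '@')."""
    return principal.split("@")[0].split("/")[0]


def check_jobtracker_principal(settings):
    """Return warnings if the JobTracker principal looks like the client user."""
    client = settings.get("vix.kerberos.principal", "")
    jobtracker = settings.get("vix.mapreduce.jobtracker.kerberos.principal", "")
    warnings = []
    if not jobtracker:
        warnings.append("vix.mapreduce.jobtracker.kerberos.principal is not set")
    elif client and primary(client).lower() == primary(jobtracker).lower():
        warnings.append(
            "JobTracker principal uses the client user name; "
            "it should be the Hadoop service principal"
        )
    return warnings


print(check_jobtracker_principal(parse_settings(BROKEN_STANZA)))
print(check_jobtracker_principal(parse_settings(FIXED_STANZA)))
```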


anandhim
Path Finder

Hi rdagan, as Benoit mentioned above, the search works fine when run against just the index, but shows this error whenever any additional search parameter is added. If the configs you mentioned were not in place, wouldn't the index-only search fail with the same error and show no results?
