
What are the requirements from Cloudera CDH for setting up Splunk Hadoop Data Roll?

Communicator

What does the CDH team need to provide to set up Hadoop Data Roll?

  1. Do Splunk hosts need Cloudera SCM agents?
  2. Should Splunk hosts be added to the CDH cluster so that the CDH client parcels get installed on them? Note: I manually copied all the config files from the CDH edge node to the Splunk hosts.

Re: what are the requirements from Cloudera CDH for setting up Splunk Hadoop Data Roll

Splunk Employee

1) The Splunk search heads and indexers do not need a Cloudera Manager agent, since they are just Hadoop clients. They do need the Hadoop binaries and Java, so distribute both to all of the Splunk nodes.
Here is the link to the documentation: http://docs.splunk.com/Documentation/Splunk/latest/Indexer/ArchivingindexestoHadoop
2) Using Cloudera Manager to generate and install the Hadoop client on all the Splunk servers (indexers and search heads) could make your life much easier from a Hadoop management point of view. However, it is not required.
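
As a rough sketch of how to confirm that prerequisite on each node, the shell functions below check that `java`, `hadoop`, and `hdfs` resolve on the PATH of whichever user runs them. The function names `have_cmd` and `check_hadoop_client` are made up for this illustration, and the check only shows that the commands are present, not that the client configuration is correct.

```shell
#!/bin/sh
# Report whether a required client command is on PATH for the current user.
# (have_cmd is a hypothetical helper name, not part of Splunk or Hadoop.)
have_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "OK       $1 -> $(command -v "$1")"
        return 0
    fi
    echo "MISSING  $1"
    return 1
}

# Check the pieces named in the answer: Java plus the Hadoop client binaries.
# Prints one line per command and a final count; returns non-zero if any
# command is missing.
check_hadoop_client() {
    missing=0
    for cmd in java hadoop hdfs; do
        have_cmd "$cmd" || missing=$((missing + 1))
    done
    echo "missing=$missing"
    [ "$missing" -eq 0 ]
}
```

To use it, source the file and run `check_hadoop_client` as the OS user that runs Splunk on each search head and indexer, on the assumption that this is the user whose environment the Hadoop client will be launched from.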


Re: what are the requirements from Cloudera CDH for setting up Splunk Hadoop Data Roll

Communicator

Hi @rdagan,

I followed the same steps: I installed the Hadoop binaries and copied the configs onto the Splunk hosts.
But I'm unable to run simple hadoop commands on the Splunk hosts. I get the error below.

bash-4.1$ hdfs dfs -ls /
17/12/07 03:04:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.ExceptionInInitializerError
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2138)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2103)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2197)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2223)
at org.apache.hadoop.security.Groups.<init>(Groups.java:99)
at org.apache.hadoop.security.Groups.<init>(Groups.java:95)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:420)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:284)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:806)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:776)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:649)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2874)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2866)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2729)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:385)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:184)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102)
at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
Caused by: java.lang.RuntimeException: Bailing out since native library couldn't be loaded
at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:46)
... 30 more

What could be the reason?


Re: what are the requirements from Cloudera CDH for setting up Splunk Hadoop Data Roll

Splunk Employee

I would recommend that, at least for the Splunk search head, you get your Hadoop team to set up a full Hadoop client environment. That will eliminate many configuration issues.

Once that is done, you will know the right configuration to apply to all the indexers.
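
The `Caused by` line in the trace posted earlier in this thread shows `JniBasedUnixGroupsMapping` giving up because the native Hadoop library was not loaded. One possible explanation, which is an assumption rather than something confirmed in this thread, is that the copied `core-site.xml` sets `hadoop.security.group.mapping` to that JNI-only class while the manually installed binaries have no native `libhadoop`. A minimal sketch for checking which mapping a config file asks for follows; the helper names `get_hadoop_prop` and `explain_group_mapping` are hypothetical, and the awk parsing assumes each `<name>` appears before its `<value>`, as in typical Hadoop config files.

```shell
#!/bin/sh
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = property name, $2 = path to the XML file.
# Prints nothing if the property is not present in that file.
get_hadoop_prop() {
    awk -v want="$1" '
        /<name>/ {
            # Remember the most recent property name seen.
            name = $0
            sub(/.*<name>[ \t]*/, "", name)
            sub(/[ \t]*<\/name>.*/, "", name)
        }
        /<value>/ && name == want {
            # First value following the wanted name: print it and stop.
            val = $0
            sub(/.*<value>[ \t]*/, "", val)
            sub(/[ \t]*<\/value>.*/, "", val)
            print val
            exit
        }
    ' "$2"
}

# Describe what a group-mapping class name implies for native libraries.
explain_group_mapping() {
    case "$1" in
        *.JniBasedUnixGroupsMapping)
            echo "JNI only: needs the native libhadoop library" ;;
        *.JniBasedUnixGroupsMappingWithFallback)
            echo "JNI with shell fallback" ;;
        *.ShellBasedUnixGroupsMapping)
            echo "shell based: no native library needed" ;;
        "")
            echo "not set in this file: Hadoop default applies" ;;
        *)
            echo "other mapping: $1" ;;
    esac
}
```

For example, `explain_group_mapping "$(get_hadoop_prop hadoop.security.group.mapping /etc/hadoop/conf/core-site.xml)"` summarizes the setting; the path is an assumption, so point it at wherever the configs were copied. Running `hadoop checknative -a` on the same host may also show whether the native library is found there, and comparing both results against the full client environment on the search head should point to what differs.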
