
What are the requirements from Cloudera CDH for setting up Splunk Hadoop Data Roll?

saranya_fmr
Communicator

What are the requirements from the CDH Team to set up Hadoop data roll?

  1. Do the Splunk hosts need Cloudera SCM agents?
  2. Should the Splunk hosts be added to the CDH cluster so that the CDH client parcels are installed on them? Note: I manually copied all the config files from the CDH edge node to the Splunk hosts.

rdagan_splunk
Splunk Employee

1) The Splunk Search Head and Indexers do not need the Cloudera Manager agent, since they are just Hadoop clients. You do need to distribute the Hadoop binaries and Java to all of the Splunk nodes.
Here is the link to the documentation: http://docs.splunk.com/Documentation/Splunk/latest/Indexer/ArchivingindexestoHadoop
2) Using Cloudera Manager to generate and install the Hadoop client on all the Splunk servers (Indexers and Search Heads) could make your life much easier from a Hadoop management point of view. However, it is not required.
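
The two prerequisites named above (Java and the Hadoop client binaries on every Splunk node) can be checked with a small script before configuring anything else. This is only a minimal sketch: the function name `missing_prerequisites` is hypothetical, and the check for the `JAVA_HOME` and `HADOOP_HOME` environment variables is an assumption of this example, since your setup may point to those locations some other way.

```python
import os
import shutil


def missing_prerequisites(which=shutil.which, env=os.environ):
    """Return a list of human-readable problems found on this host."""
    problems = []
    # Java and the Hadoop client commands should be resolvable on PATH.
    for command in ("java", "hadoop", "hdfs"):
        if which(command) is None:
            problems.append(f"'{command}' not found on PATH")
    # Assumption of this sketch: both home directories are exported
    # as environment variables for the user that runs Splunk.
    for variable in ("JAVA_HOME", "HADOOP_HOME"):
        if not env.get(variable):
            problems.append(f"{variable} is not set")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("Java and Hadoop client binaries look present.")
```

Running it as the same OS user that runs Splunk, on each Search Head and Indexer, gives a quick view of which nodes still lack the client software.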

saranya_fmr
Communicator

Hi @rdagan,

I followed the same steps: I installed the Hadoop binaries and copied the configs onto the Splunk hosts.
But I'm unable to run simple Hadoop commands on the Splunk hosts. I get the error below.

bash-4.1$ hdfs dfs -ls /
17/12/07 03:04:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.ExceptionInInitializerError
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2138)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2103)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2197)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2223)
at org.apache.hadoop.security.Groups.<init>(Groups.java:99)
at org.apache.hadoop.security.Groups.<init>(Groups.java:95)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:420)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:284)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:806)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:776)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:649)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2874)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2866)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2729)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:385)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:184)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102)
at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
Caused by: java.lang.RuntimeException: Bailing out since native library couldn't be loaded
at org.apache.hadoop.security.JniBasedUnixGroupsMapping.<clinit>(JniBasedUnixGroupsMapping.java:46)
... 30 more

What could be the reason?


rdagan_splunk
Splunk Employee

I would recommend that, at least for the Splunk Search Head, you get your Hadoop team to set up a full Hadoop client environment. That will eliminate many configuration issues.

Once that is done, you will know the right configuration to apply to all the Indexers.
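
One configuration detail worth comparing between the edge node and the Splunk hosts, as a possible (not confirmed in this thread) explanation for the stack trace above: the trace ends in `JniBasedUnixGroupsMapping`, which needs the native Hadoop library, and the class used for group lookups is selected by the `hadoop.security.group.mapping` property in `core-site.xml`. The sketch below only reads that property. The function names are hypothetical, and it assumes you pass in the contents of whichever `core-site.xml` your client actually uses.

```python
import xml.etree.ElementTree as ET

GROUP_MAPPING_KEY = "hadoop.security.group.mapping"


def read_hadoop_property(core_site_xml, name):
    """Return the value of property `name` from core-site.xml text, or None."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


def needs_native_library(core_site_xml):
    """True when the group mapping is the JNI-only class seen in the stack trace."""
    value = read_hadoop_property(core_site_xml, GROUP_MAPPING_KEY)
    return value is not None and value.endswith(".JniBasedUnixGroupsMapping")
```

If `needs_native_library` returns true on a host where `hadoop checknative -a` reports that the native `hadoop` library is not loaded, that mismatch is a reasonable first thing to raise with the Hadoop team. A full client install, as recommended above, would normally bring the native libraries with it.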
