I'm configuring my cluster on the latest version of Hadoop Connect, following the video on the application's Splunkbase page: http://www.splunk.com/view/SP-CAAAHBZ
Even though the Hadoop version I'm using is the same as the one used in the video, I'm getting an error when trying to save the cluster configuration.
After filling in all the cluster information I get: "Failed to get remote Hadoop version (namenode=headnode, port=50070): 'Version' keyword is not found."
I'm running on CentOS 5.5.
Is there any known reason for this?
Thanks,
Ramón Pin
Hi everyone. This problem finally seems to be resolved. Our Hadoop machines are not listed in our DNS; we use /etc/hosts to assign names to them. It seems the application issues a DNS request for the machine name instead of getting it from /etc/hosts, while all the Hadoop commands and processes use /etc/hosts normally. We configured the Hadoop URL using the headnode's IP and it registered the cluster.
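For what it's worth, a quick way to see what the system resolver returns for a hostname is the sketch below. It assumes the hostname "headnode" from this thread; gethostbyname goes through the OS resolver (which honours /etc/hosts via nsswitch), so a name that resolves here but fails inside an application suggests that application is querying DNS directly.

```python
import socket

def resolve(host):
    """Ask the system resolver; gethostbyname honours /etc/hosts via nsswitch."""
    try:
        return socket.gethostbyname(host)
    except socket.gaierror:
        return None

# "headnode" is the hostname from this thread; on a box where it exists
# only in /etc/hosts, a library doing raw DNS lookups would miss it even
# though this call succeeds.
print(resolve("headnode"))
```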
Can you please file a support case and include a diag so we can take a look at the log files as well?
We'll do it as soon as we can. Thank you for your support.
Hadoop Connect tries to find the version of the cluster via JMX, with a fallback mechanism. What does this URL return in your environment:
http://[namenode-host]:50070/jmx?qry=*adoop:service=NameNode,name=NameNodeInfo
{
  "beans" : [ {
    "name" : "Hadoop:service=NameNode,name=NameNodeInfo",
    "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",
    "Threads" : 27,
    "HostName" : "headnode",
    "Used" : 362116714496,
    "Version" : "1.0.3, r1335192",
    "Total" : 570697924608,
    "UpgradeFinalized" : true,
    "Free" : 175745486848,
    "Safemode" : "",
    "NonDfsUsedSpace" : 32835723264,
    "PercentUsed" : 63.451557,
    "PercentRemaining" : 30.794834,
    "TotalBlocks" : 4495,
    "TotalFiles" : 7345,
    ...}
I truncated the result to fit the comment size limit.
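Just to illustrate the check Hadoop Connect appears to be doing: the JMX servlet returns a JSON document with a "beans" array, and the error message in the question suggests the app fails when it cannot find a "Version" field in it. A minimal sketch of that lookup (the sample string is a trimmed-down version of the output shown above):

```python
import json

def extract_version(jmx_json):
    """Pull the 'Version' field from the first bean that carries one."""
    data = json.loads(jmx_json)
    for bean in data.get("beans", []):
        if "Version" in bean:
            return bean["Version"]
    # This mirrors the error reported in the question.
    raise ValueError("'Version' keyword is not found")

# Trimmed sample of the NameNodeInfo bean output posted above:
sample = ('{"beans":[{"name":"Hadoop:service=NameNode,name=NameNodeInfo",'
          '"Version":"1.0.3, r1335192"}]}')
print(extract_version(sample))  # 1.0.3, r1335192
```

If the URL returns valid JSON with a "Version" key (as the output above does), the failure is more likely in reaching the URL than in parsing it.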
I tried with nc from Splunk's machine:
$ nc -vz headnode 50070
Connection to headnode 50070 port [tcp/*] succeeded!
Also tried HDFS access as described in the video tutorial:
$ /hadoop-1.0.3/bin/hadoop dfs -ls /
drwxr-xr-x - hadoop supergroup 0 2012-05-21 13:29 /_distcp_logs_y8txnu
drwxr-xr-x - hadoop supergroup 0 2012-07-25 09:00 /benchmarks
drwxr-xr-x - hadoop supergroup 0 2013-03-11 16:05 /user
Have you verified you have connectivity to that host on that port? No firewall issues? Setup is generally pretty straightforward, and I've not seen that error before, so I'm guessing this is a basic environment issue like connectivity. By verified connectivity, I mean using telnet or nc to confirm you can actually open a TCP connection to that host and port.
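If telnet or nc aren't handy on the Splunk host, the same TCP-level check can be sketched in a few lines of Python (host and port here are the "headnode"/50070 values from this thread):

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds -- what nc -vz checks."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Values from this thread:
print(can_connect("headnode", 50070))
```

A True here only proves the port is reachable; the "'Version' keyword is not found" error can still occur if the hostname resolves differently for the application, as the accepted answer above found.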