Hadoop Client Node Configuration

soujanyabargavi
New Member

Assume there is a Hadoop cluster of 20 machines: 18 are slaves, machine 19 is the NameNode, and machine 20 is the JobTracker.

Now, I know that the Hadoop software has to be installed on all 20 machines.

But my question is: which machine is involved in loading a file xyz.txt into the Hadoop cluster? Is that client machine a separate machine? Do we need to install the Hadoop software on that client machine as well? How does the client machine identify the Hadoop cluster?

rdagan_splunk
Splunk Employee

You are correct. A client machine is needed to load the file, and the Hadoop libraries must be installed on that client node.
The client node identifies the Hadoop cluster by the NameNode's IP address and port. These days the JobTracker and TaskTracker are no longer used (YARN replaced them), so for submitting jobs you will also need the YARN ResourceManager's IP address and port.
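As a rough illustration of how a client node is pointed at the cluster, the sketch below generates the two client-side configuration files that would carry those addresses. The property names `fs.defaultFS` (core-site.xml) and `yarn.resourcemanager.address` (yarn-site.xml) are standard Hadoop settings; the hostnames and ports are assumptions chosen for the example (8020 and 8032 are common defaults, but your cluster may differ), and the helper `build_site_xml` is hypothetical, not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Render a dict of Hadoop properties as a *-site.xml document string.

    Hypothetical helper for illustration; not part of any Hadoop library.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Assumed addresses: replace with your own NameNode and ResourceManager.
NAMENODE = "namenode.example.com:8020"
RESOURCE_MANAGER = "resourcemanager.example.com:8032"

# core-site.xml tells the client which NameNode serves the file system.
core_site = build_site_xml({"fs.defaultFS": "hdfs://" + NAMENODE})

# yarn-site.xml tells the client where to submit jobs.
yarn_site = build_site_xml({"yarn.resourcemanager.address": RESOURCE_MANAGER})

print(core_site)
print(yarn_site)
```

With files like these in the client's Hadoop configuration directory, a command such as `hdfs dfs -put xyz.txt <target directory>` run on the client should contact the NameNode named in `fs.defaultFS`. In practice, copying the cluster's own configuration files to the client is usually simpler than writing them by hand.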
