How to Set Up a Hadoop Cluster

Akash Pandey
Oct 11, 2020

What Is a Hadoop Cluster and Big Data?

Set Up a Hadoop Cluster

Set Up the NameNode on AWS Cloud and the DataNode on a Local VM

How to Launch an Instance on AWS Cloud?

On the NameNode:

To set up a Hadoop cluster, we need two pieces of software: the JDK, because Hadoop is written in Java, and Hadoop itself. I downloaded both packages on my base OS and transferred them to the EC2 instance using WinSCP.

Next, I copied both packages to the root user's account.

Then install the JDK and Hadoop using rpm:

rpm -ivh jdk-8u171-linux-x64.rpm    # install the JDK

rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force    # install Hadoop
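Before moving on, it can be worth confirming that both packages ended up on the PATH. The snippet below is only a sketch of such a check; the helper name check_tool is my own and is not part of the JDK or Hadoop.

```shell
# check_tool is a hypothetical helper: it reports whether a command is on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: NOT found"
        return 1
    fi
}

# Print the installed versions only when the tools are present.
if check_tool java; then
    java -version
fi
if check_tool hadoop; then
    hadoop version
fi
```

If either line prints "NOT found", the matching rpm install most likely did not complete.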

Create a directory and then format it so it can be used as the NameNode's storage.

Then move to the Hadoop configuration directory:

cd /etc/hadoop/

In the hdfs-site.xml file, we have to add one property:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>dfs.name.dir</name>
<value>/nn</value>
</property>
</configuration>

In core-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:9001</value>
</property>
</configuration>
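The article mentions creating and formatting the NameNode directory but does not show the commands. A minimal sketch follows, assuming the directory is /nn as set in dfs.name.dir above. The function name prepare_namenode_dir is hypothetical. The format step is run only after the config files are written, because it reads dfs.name.dir from them.

```shell
# prepare_namenode_dir is a hypothetical helper name.
# $1 is the directory named in dfs.name.dir (/nn in this article).
prepare_namenode_dir() {
    nn_dir="$1"
    mkdir -p "$nn_dir" || return 1
    if command -v hadoop >/dev/null 2>&1; then
        # Writes fresh NameNode metadata into the configured dfs.name.dir.
        # It may ask for confirmation, and it erases any metadata already there.
        hadoop namenode -format
    else
        echo "hadoop not found on PATH; skipping format" >&2
    fi
}
```

On the NameNode this would be called as root with prepare_namenode_dir /nn. Formatting is a one-time step; repeating it on a cluster that already holds data discards the existing file system metadata.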

To start the NameNode:

  • hadoop-daemon.sh start namenode

To check whether it has started, use the jps command.
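jps lists running Java processes as a process ID followed by a class name, so the check can be scripted. This is a sketch only; has_daemon is a name I made up, and it matches the class name exactly so that a SecondaryNameNode entry is not mistaken for the NameNode.

```shell
# has_daemon is a hypothetical helper: it reads jps-style output
# ("<pid> <ClassName>" per line) on stdin and succeeds if the class is listed.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Run the check only where jps (shipped with the JDK) is available.
if command -v jps >/dev/null 2>&1; then
    if jps | has_daemon NameNode; then
        echo "NameNode is running"
    else
        echo "NameNode is not running"
    fi
fi
```

The same helper works on the DataNode machine by passing DataNode instead of NameNode.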

At this point, no DataNode is connected, so we have to configure a DataNode and attach it to the NameNode.

On the DataNode, we again have to install the JDK and Hadoop.

On the DataNode:

First, make a directory:

  • mkdir /datanode

Then write the properties in hdfs-site.xml and core-site.xml on the DataNode, as shown in the figure.
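For readers without the figure, here is a sketch of what the two DataNode files would typically contain in Hadoop 1.x. It assumes dfs.data.dir points at the /datanode directory created above and fs.default.name points at the NameNode on port 9001. The helper name write_datanode_conf and the address used in the usage note are placeholders of mine, not values from the article.

```shell
# write_datanode_conf is a hypothetical helper.
#   $1 = Hadoop config directory (/etc/hadoop in this article)
#   $2 = address of the NameNode (placeholder; use your instance's public IP)
write_datanode_conf() {
    conf_dir="$1"
    namenode_host="$2"

    # dfs.data.dir tells the DataNode where to store its blocks.
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>dfs.data.dir</name>
<value>/datanode</value>
</property>
</configuration>
EOF

    # fs.default.name points the DataNode at the NameNode.
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://$namenode_host:9001</value>
</property>
</configuration>
EOF
}
```

A call such as write_datanode_conf /etc/hadoop 203.0.113.10 would write both files; 203.0.113.10 is a documentation-only address and stands in for the real public IP of the EC2 instance.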

Then start the DataNode:

  • hadoop-daemon.sh start datanode

Now, if we check again, the DataNode is connected to the NameNode. We can verify this with the following command:

  • hadoop dfsadmin -report
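To script this check instead of reading the report by eye, the DataNode count can be pulled out of the output. The sketch below assumes the Hadoop 1.x report contains a line of the form "Datanodes available: 1 (1 total, 0 dead)"; other versions word it differently, so treat the pattern as an assumption. The name live_datanodes is hypothetical.

```shell
# live_datanodes is a hypothetical helper: it reads report text on stdin and
# prints the number after "Datanodes available:" (assumed Hadoop 1.x wording).
# It prints 0 when no such line is present.
live_datanodes() {
    awk '/^Datanodes available:/ { print $3; found = 1; exit }
         END { if (!found) print 0 }'
}

# Query the cluster only where the hadoop command is installed.
if command -v hadoop >/dev/null 2>&1; then
    echo "Live DataNodes: $(hadoop dfsadmin -report | live_datanodes)"
fi
```

A result of 0 right after starting the DataNode is not unusual; it can take a few seconds for the DataNode to register with the NameNode.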

Now the NameNode and the DataNode are connected. Similarly, we can attach many DataNodes to the NameNode.

Thank you :)
