Hadoop, Docker, and AWS Cloud Automation Using Python

Akash Pandey
11 min read · Nov 7, 2020
Automation

What does the world need?

The answer is automation.

Here’s my article on automation using Python. I created a menu program that can automate Hadoop, Docker, LVM, some AWS Cloud services, prediction from a previous data set, and more. Anyone can use this menu program without knowing the actual Linux commands needed to set up a Hadoop cluster, launch a Docker container, or automate the AWS cloud.

I have implemented the following:

Hadoop

  • Run any Linux command locally and remotely
  • Configure a web server on the local OS, a remote OS, and AWS Cloud
  • Configure and start the NameNode on the local OS and AWS Cloud
  • Configure and start the DataNode on the local OS and AWS Cloud
  • Create a volume group
  • Create a logical volume
  • Contribute a limited amount of storage to the Hadoop cluster
  • Attach more hard disks to the volume group dynamically
  • Increase partition size dynamically

AWS Cloud

  • Create and delete key pairs
  • Create and delete security groups
  • Add ingress rules to an existing security group
  • Launch an EC2 instance
  • Create an EBS volume
  • Attach the EBS volume to an EC2 instance
  • Configure a web server
  • Create a static partition and mount the /var/www/html folder on the EBS volume
  • Create an S3 bucket accessible to the public
  • Put an object into the S3 bucket, accessible by everyone
  • Create a CloudFront distribution with S3 as the origin
  • Delete an object inside the S3 bucket
  • Delete the S3 bucket
  • Stop, start, and terminate an EC2 instance

Docker

  • Pull an image from Docker Hub
  • Launch a container
  • Show the number of running containers and Docker images on the OS
  • Inspect a Docker container
  • Remove Docker images from the OS
  • Start and stop Docker containers
  • Delete a single container or all containers
  • Configure a web server inside a Docker container

Machine Learning

  • Predict output from a data set

Topics covered: What is big data? What is Hadoop? Launching an EC2 instance on AWS Cloud manually, setting up a Hadoop cluster on AWS Cloud, and providing a specific amount of storage to the Hadoop cluster.

Hadoop

Print Date :-
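
The simplest menu option just shells out to the date command; a one-line sketch:

```python
import subprocess

subprocess.run(["date"])   # print the current date and time of the local OS
```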

Configure Web Server on Local OS :-

Code :-
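
A minimal sketch of what this menu option could look like, assuming a RHEL-style local OS with yum and systemctl available and the script running as root:

```python
import subprocess

def configure_local_webserver(page="<h1>Configured from the menu program</h1>"):
    # Install Apache httpd, drop a test page in the document root, then start the service
    subprocess.run(["yum", "install", "httpd", "-y"], check=True)
    with open("/var/www/html/index.html", "w") as f:
        f.write(page)
    subprocess.run(["systemctl", "enable", "--now", "httpd"], check=True)

configure_local_webserver()
```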

Configure NameNode & DataNode on AWS Cloud (setting up a Hadoop cluster on the cloud using Python)

Initially, Hadoop and Java are not installed inside the EC2 instance.

Configuring the NameNode on the cloud

Code :-

core_site() :-
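
A sketch of what a core_site() helper could look like: it writes core-site.xml so every node knows the NameNode address. The file path assumes a Hadoop 1.x install with its configuration under /etc/hadoop; the port and path are assumptions, not taken from the article.

```python
def core_site(namenode_ip, port=9001, path="/etc/hadoop/core-site.xml"):
    # Tell Hadoop where the NameNode lives
    xml = f"""<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://{namenode_ip}:{port}</value>
  </property>
</configuration>"""
    with open(path, "w") as f:
        f.write(xml)
```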

hdfs_site() :-
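
Similarly, a hypothetical hdfs_site() helper writes hdfs-site.xml; the role parameter switches between dfs.name.dir on the NameNode and dfs.data.dir on a DataNode:

```python
def hdfs_site(storage_dir, role="name", path="/etc/hadoop/hdfs-site.xml"):
    # role is "name" on the NameNode and "data" on a DataNode
    xml = f"""<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.{role}.dir</name>
    <value>{storage_dir}</value>
  </property>
</configuration>"""
    with open(path, "w") as f:
        f.write(xml)
```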

Name-node Code :-
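
Putting the two helpers together, a sketch of the NameNode sequence; in the menu program these commands run on the EC2 instance over SSH, and the directory name and commands assume Hadoop 1.x:

```python
import os

def configure_namenode(nn_dir="/nn"):
    os.makedirs(nn_dir, exist_ok=True)
    core_site("0.0.0.0")                  # the NameNode listens on all interfaces
    hdfs_site(nn_dir, role="name")
    os.system("echo Y | hadoop namenode -format")
    os.system("hadoop-daemon.sh start namenode")
    os.system("jps")                      # verify the NameNode JVM is running
```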

Configuring the DataNode on the cloud

Here, the NameNode is already started, and I am configuring and starting the DataNode.

Now the Hadoop cluster is set up.

Code :-
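
A matching sketch for the DataNode, reusing the core_site()/hdfs_site() helpers from above; the NameNode's public IP is passed in so the DataNode can join the cluster:

```python
import os

def configure_datanode(namenode_ip, dn_dir="/dn"):
    os.makedirs(dn_dir, exist_ok=True)
    core_site(namenode_ip)                # point this node at the real NameNode
    hdfs_site(dn_dir, role="data")
    os.system("hadoop-daemon.sh start datanode")
    os.system("hadoop dfsadmin -report")  # confirm the node joined the cluster
```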

Creation of a volume group

Code :-
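
A minimal sketch of the volume-group step, assuming two attached disks (device names are placeholders):

```python
import subprocess

def create_volume_group(vg_name, *disks):
    subprocess.run(["pvcreate", *disks], check=True)       # initialise physical volumes
    subprocess.run(["vgcreate", vg_name, *disks], check=True)
    subprocess.run(["vgdisplay", vg_name], check=True)      # show the new VG

create_volume_group("myvg", "/dev/sdb", "/dev/sdc")
```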

Creation of Logical Volume

Code :-
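
A sketch of creating, formatting, and mounting the logical volume; the names are illustrative, and the 33G matches the size mentioned in the next step:

```python
import subprocess

def create_logical_volume(vg_name, lv_name, size, mount_point):
    subprocess.run(["lvcreate", "--size", size, "--name", lv_name, vg_name], check=True)
    device = f"/dev/{vg_name}/{lv_name}"
    subprocess.run(["mkfs.ext4", device], check=True)       # format the new LV
    subprocess.run(["mkdir", "-p", mount_point], check=True)
    subprocess.run(["mount", device, mount_point], check=True)

create_logical_volume("myvg", "mylv", "33G", "/dn")
```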

Increase Partition Size dynamically

Previously, the logical volume size was 33 GB. Now I have added 10 GB more storage to the LV, so the size of the logical volume becomes 43 GB.

Code :-
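
A sketch of the online resize: lvextend grows the LV by 10 GB and resize2fs grows the ext4 filesystem to fill it, without unmounting:

```python
import subprocess

def extend_logical_volume(vg_name, lv_name, extra="+10G"):
    device = f"/dev/{vg_name}/{lv_name}"
    subprocess.run(["lvextend", "--size", extra, device], check=True)
    subprocess.run(["resize2fs", device], check=True)    # grow the filesystem in place

extend_logical_volume("myvg", "mylv")
```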

Configured and started a DataNode on the local OS providing a specific volume, i.e. logical volume storage

Here I provided logical volume storage to the DataNode, so we can dynamically increase the storage attached to the Hadoop cluster whenever we need.

Now two DataNodes are configured.

Code :-
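
This is essentially the DataNode sketch from earlier pointed at the LV mount point, so the storage this node contributes to the cluster is exactly the logical volume:

```python
# reuses core_site()/hdfs_site() from the sketches above; "/dn" is the LV mount point
import os

def start_datanode_on_lv(namenode_ip, lv_mount="/dn"):
    core_site(namenode_ip)
    hdfs_site(lv_mount, role="data")      # the DataNode stores its blocks on the LV
    os.system("hadoop-daemon.sh start datanode")
```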

Configured WebServer on Remote OS

Initially, no httpd software is installed on the remote server.

Installing httpd Software remotely

WebServer Configured

Code :-
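
One way to run the same httpd setup on a remote OS is over SSH with paramiko (the original may simply call ssh through os.system; the host and credentials below are placeholders):

```python
import paramiko

def configure_remote_webserver(host, user="root", password="redhat"):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username=user, password=password)
    for cmd in ("yum install httpd -y",
                "echo 'configured remotely' > /var/www/html/index.html",
                "systemctl enable --now httpd"):
        _, stdout, stderr = ssh.exec_command(cmd)
        print(stdout.read().decode(), stderr.read().decode())
    ssh.close()

configure_remote_webserver("192.168.1.10")
```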

AWS Cloud

Creation of Key Pair

Here, one key pair is already created.

Key created.

Code :-
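
A minimal boto3 sketch: create_key_pair returns the private key material, which is saved locally so the instance can later be reached over SSH (the region and key name are assumptions):

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

def create_key_pair(name="mykey"):
    resp = ec2.create_key_pair(KeyName=name)
    with open(f"{name}.pem", "w") as f:
        f.write(resp["KeyMaterial"])      # save the private key for later SSH logins

create_key_pair()
```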

Deletion of Key Pair :-

Key deleted.

Code :-
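
Deletion is a single call on the same boto3 EC2 client:

```python
ec2.delete_key_pair(KeyName="mykey")      # removes only the public key stored in AWS
```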

Creation of Security Group :-

When we create a security group, by default no rules are attached to it.

So, I attach ingress rules to the SG. Here I added two rules, which allow ports 80 and 22 for the HTTP and SSH protocols respectively.

Code :-
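
A sketch of both steps with boto3: create the group, then authorize ports 80 and 22 from anywhere. The group name and description are placeholders, and the default VPC is assumed:

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

def create_security_group(name="menu-sg"):
    sg = ec2.create_security_group(GroupName=name,
                                   Description="SG created from the menu program")
    ec2.authorize_security_group_ingress(
        GroupId=sg["GroupId"],
        IpPermissions=[
            {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,   # HTTP
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,   # SSH
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        ])
    return sg["GroupId"]
```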

Launched Ec2 Instance :-

Instance launched

Code :-
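
A sketch of the launch, reusing the key pair and security group created above; the AMI ID is a placeholder, not the one used in the article:

```python
import boto3

ec2 = boto3.resource("ec2", region_name="ap-south-1")

instances = ec2.create_instances(
    ImageId="ami-0e306788ff2473ccb",      # example Amazon Linux 2 AMI, replace as needed
    InstanceType="t2.micro",
    MinCount=1, MaxCount=1,
    KeyName="mykey",
    SecurityGroups=["menu-sg"],
)
print("Launched:", instances[0].id)
```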

Creation of EBS Volume :-

After we create the EBS volume, its state is "available", so we have to attach this EBS volume to the EC2 instance.

Attached EBS Volume to Ec2 Instance

Volume attached.

Here the volume is attached to the EC2 instance but is not yet providing its storage to it. In order to use this extra EBS volume, we have to partition the disk, format the partition, and finally mount it.

Code :-
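
A sketch covering both the creation and the attachment; the availability zone must match the instance, and /dev/xvdf is the device name used for partitioning later:

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

def create_and_attach_volume(instance_id, size_gb=10, az="ap-south-1a"):
    vol = ec2.create_volume(Size=size_gb, AvailabilityZone=az, VolumeType="gp2")
    ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])
    ec2.attach_volume(VolumeId=vol["VolumeId"],
                      InstanceId=instance_id,
                      Device="/dev/xvdf")
    return vol["VolumeId"]
```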

Configuring WebServer on AWS cloud

Web server configured.

Code :-
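
This is essentially the remote web-server step again, this time over SSH into the instance using the .pem key created earlier; the IP and key path are placeholders:

```python
import os

ip, key = "13.233.0.1", "mykey.pem"       # public IP of the instance (placeholder)
for cmd in ("sudo yum install httpd -y",
            "sudo systemctl enable --now httpd"):
    os.system(f'ssh -i {key} -o StrictHostKeyChecking=no ec2-user@{ip} "{cmd}"')
```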

Created a static partition and mounted the /var/www/html folder on the EBS volume

Now the EBS volume is partitioned, the partition is formatted, and the /dev/xvdf3 partition is mounted on /var/www/html.

Code :-
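
A sketch of the partition-format-mount sequence run on the instance; fdisk is interactive, so its keystrokes are fed through stdin. This creates a single /dev/xvdf1, whereas the article's run ended up using /dev/xvdf3:

```python
import subprocess

# one new primary partition covering the whole EBS disk
subprocess.run(["fdisk", "/dev/xvdf"], input="n\np\n\n\n\nw\n", text=True)
subprocess.run(["mkfs.ext4", "/dev/xvdf1"], check=True)          # format the partition
subprocess.run(["mount", "/dev/xvdf1", "/var/www/html"], check=True)
```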

Creation of S3 Bucket :-

Here the bucket is empty, as we haven't put any object inside it yet.

Putting an object inside the S3 bucket, accessible publicly :-

Code :-
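
A boto3 sketch of both steps; the bucket name matches the one shown later (whoakss3), while the file name and region are assumptions:

```python
import boto3

s3 = boto3.client("s3", region_name="ap-south-1")
bucket = "whoakss3"

# LocationConstraint is required for any region other than us-east-1
s3.create_bucket(Bucket=bucket,
                 CreateBucketConfiguration={"LocationConstraint": "ap-south-1"})

# upload an image and make it readable by everyone
s3.upload_file("photo.jpg", bucket, "photo.jpg",
               ExtraArgs={"ACL": "public-read"})
```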

Creation of Cloudfront distribution providing S3 Origin :-

Now we can access the S3 bucket object using the CloudFront domain name, which gives lower latency and higher security.

Code :-
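
A minimal create_distribution sketch with the S3 bucket as origin; CloudFront needs a fairly verbose DistributionConfig even for the simplest case, and the values here are illustrative:

```python
import time
import boto3

cf = boto3.client("cloudfront")
origin = "whoakss3.s3.amazonaws.com"      # the bucket's S3 domain name

dist = cf.create_distribution(DistributionConfig={
    "CallerReference": str(time.time()),  # any unique string
    "Comment": "created from the menu program",
    "Enabled": True,
    "Origins": {"Quantity": 1, "Items": [{
        "Id": "s3-origin",
        "DomainName": origin,
        "S3OriginConfig": {"OriginAccessIdentity": ""},
    }]},
    "DefaultCacheBehavior": {
        "TargetOriginId": "s3-origin",
        "ViewerProtocolPolicy": "redirect-to-https",
        "ForwardedValues": {"QueryString": False,
                            "Cookies": {"Forward": "none"}},
        "TrustedSigners": {"Enabled": False, "Quantity": 0},
        "MinTTL": 0,
    },
})
print(dist["Distribution"]["DomainName"])  # use this domain instead of the S3 URL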

Deletion of Object from S3 Bucket :-

Image removed from the S3 bucket.

Code :-
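
Using the same S3 client, removing the object is one call (the key is the placeholder name used above):

```python
s3.delete_object(Bucket="whoakss3", Key="photo.jpg")
```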

Deletion of S3 bucket :-

The whoakss3 S3 bucket is deleted.

Code :-
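
Once the bucket is empty it can be deleted:

```python
s3.delete_bucket(Bucket="whoakss3")       # fails unless the bucket is empty
```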

Terminate Ec2 Instance :-

Stop Ec2 Instance :-

Code :-
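
All three lifecycle operations are single boto3 calls on the instance ID (the ID below is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")
instance_id = "i-0123456789abcdef0"

ec2.stop_instances(InstanceIds=[instance_id])
ec2.start_instances(InstanceIds=[instance_id])
ec2.terminate_instances(InstanceIds=[instance_id])
```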

Predict Values from the previous Data Set :-

To predict any future values, we have to install some libraries using the pip3 command. Here I downloaded Anaconda and installed some Python libraries such as pandas, scikit-learn, and joblib.

Code :-
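
The article doesn't show the data set, so this sketch assumes a simple one-feature regression that is trained, saved, and reloaded with joblib; the column names and file names are placeholders:

```python
import pandas as pd
import joblib
from sklearn.linear_model import LinearRegression

data = pd.read_csv("dataset.csv")         # previous data set (placeholder file)
X = data[["experience"]]                  # feature column (assumed)
y = data["salary"]                        # target column (assumed)

model = LinearRegression().fit(X, y)
joblib.dump(model, "model.pkl")           # save so the menu can reuse it later

model = joblib.load("model.pkl")
print(model.predict([[5]]))               # predict the output for a new input value
```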

Configure a web server running inside a Docker container :-

Code :-
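
A sketch of the Docker option, driving the docker CLI from Python: pull a CentOS image, start a container with a published port, then install and start httpd inside it (the container name and port mapping are placeholders):

```python
import subprocess

subprocess.run(["docker", "pull", "centos:7"], check=True)
subprocess.run(["docker", "run", "-dit", "--name", "webos",
                "-p", "8080:80", "centos:7"], check=True)

# systemctl is not available inside a plain container, so start httpd directly
for cmd in (["yum", "install", "httpd", "-y"], ["/usr/sbin/httpd"]):
    subprocess.run(["docker", "exec", "webos", *cmd], check=True)
```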

All the things mentioned above are automated here using Python code, though one can also use Terraform for AWS Cloud and Ansible, a DevOps tool, for configuration management.

Github :-


Akash Pandey

I am a Computer Science undergraduate seeking opportunities to work in a challenging environment.