
Cloud Computing 2021-22

Academic Year: 2021-22 Programme: MBATECH-IT


Year: 3rd Semester: VI
Name of Student: Deepak Chaudhary Batch: A1
Roll No: I010 Date of experiment: 06/12/2021
Faculty: Prof. Rejo Mathew Signature with Date:

Experiment 1: Hadoop Distribution (Cloudera)

Aim: Study and install a private cloud on the Cloudera platform

Learning Outcomes:
After completion of this experiment, the student should be able to:
1. Understand private cloud deployment
2. Understand Hadoop distributions and the need for them
3. Install and configure Cloudera private cloud

Theory:
Apache Hadoop is a collection of open-source software utilities that facilitates using a network
of many computers to solve problems involving massive amounts of data and computation. It
provides a software framework for distributed storage and processing of big data using the
MapReduce programming model. Cloudera, Inc. is a US-based software company that provides
a software platform for data engineering, data warehousing, machine learning and analytics that
runs in the cloud or on premises. Cloudera started as a hybrid open-source Apache Hadoop
distribution, CDH (Cloudera Distribution Including Apache Hadoop), that targeted
enterprise-class deployments of that technology.

Hadoop is a software framework for distributed processing of large datasets across large
clusters of computers:
➢ Large datasets: terabytes or petabytes of data
➢ Large clusters: hundreds or thousands of nodes
Hadoop is an open-source implementation of Google's MapReduce. It is based on a simple
programming model, MapReduce, and on a simple data model into which any data will fit.


MapReduce

Job Tracker is the master node (runs with the NameNode):
➢ Receives the user’s job
➢ Decides on how many tasks will run (number of mappers)
➢ Decides on where to run each mapper (concept of locality)
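
The Job Tracker's placement decision listed above can be illustrated with a small sketch. This is a toy model, not Hadoop's actual scheduler: it assumes one map task per input split, and the function name assign_mappers, the split names, and the node names are all hypothetical. The real Job Tracker also takes rack awareness and heartbeats into account, which are left out here.

```python
from typing import Dict, List


def assign_mappers(split_locations: Dict[str, List[str]],
                   free_slots: Dict[str, int]) -> Dict[str, str]:
    """Toy locality-aware placement (hypothetical helper, not a Hadoop API).

    split_locations maps each input split to the nodes that store its data.
    free_slots maps each node to the number of map tasks it can still accept.
    Assumption: one map task is created per input split.
    """
    slots = dict(free_slots)  # work on a copy so the caller's dict is untouched
    assignment: Dict[str, str] = {}
    for split, holders in split_locations.items():
        # Prefer a node that already holds the data (data locality).
        local = [node for node in holders if slots.get(node, 0) > 0]
        if local:
            node = local[0]
        else:
            # Otherwise fall back to the node with the most free slots.
            candidates = [n for n, free in slots.items() if free > 0]
            if not candidates:
                raise RuntimeError("no free map slots left")
            node = max(candidates, key=lambda n: slots[n])
        assignment[split] = node
        slots[node] -= 1
    return assignment


if __name__ == "__main__":
    placement = assign_mappers(
        {"split0": ["node1", "node2"], "split1": ["node1"], "split2": ["node3"]},
        {"node1": 1, "node2": 1, "node3": 0, "node4": 1},
    )
    for split, node in placement.items():
        print(split, "->", node)
```

In this example split0 runs on a node that stores its data, while split1 and split2 have to run remotely because their local nodes have no free slot.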


Task Tracker is the slave node (runs on each DataNode):
➢ Receives the task from Job Tracker
➢ Runs the task until completion (either map or reduce task)
➢ Always in communication with the Job Tracker reporting progress

Mappers and Reducers are the user's code (user-provided functions). They only need to obey
the key-value pair interface.

Mappers:
➢ Consume <key, value> pairs
➢ Produce <key, value> pairs

Reducers:
➢ Consume <key, <list of values>>
➢ Produce <key, value>

Shuffling and Sorting:
➢ Hidden phase between mappers and reducers
➢ Groups the values for identical keys from all mappers, sorts them by key, and passes each
group to one reducer in the form of <key, <list of values>>
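
The mapper, shuffle, and reducer roles described above can be sketched as a word count that runs in a single Python process. This is only an illustration of the key-value interface, not Hadoop code: the function names mapper, shuffle, reducer, and run_job are hypothetical, and a real cluster would run these phases on different nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def mapper(offset: int, line: str) -> Iterator[Tuple[str, int]]:
    """Consume a <key, value> pair (line offset, line text); emit <word, 1> pairs."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key and sort the keys, as the hidden shuffle/sort phase does."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reducer(key: str, values: List[int]) -> Tuple[str, int]:
    """Consume <key, <list of values>>; produce a single <key, value> pair."""
    return key, sum(values)


def run_job(lines: List[str]) -> List[Tuple[str, int]]:
    """Run map, shuffle, and reduce in sequence inside one process."""
    mapped = [pair for offset, line in enumerate(lines)
              for pair in mapper(offset, line)]
    return [reducer(key, values) for key, values in shuffle(mapped).items()]


if __name__ == "__main__":
    for word, count in run_job(["big data big cluster", "data node"]):
        print(word, count)
```

Only mapper and reducer correspond to code a user would supply; shuffle stands in for the phase the framework performs on the user's behalf.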


Highlights of Cloudera
1. Control costs and manage resources
– Auto-scale
– Auto-suspend
2. Easy provisioning and support for multiple types of workloads
3. Consistent security and data governance across applications and datasets
4. Enables enterprise IT staff to quickly respond to business demands

For Cloudera

1. Download the Cloudera VM from
https://downloads.cloudera.com/demo_vm/virtualbox/cloudera-quickstart-vm-5.4.2-0-virtualbox.zip.
The VM is over 4 GB, so it will take some time to download.

2. Unzip the Cloudera VM:

On Mac systems: Double-click cloudera-quickstart-vm-5.4.2-0-virtualbox.zip

On Windows systems: Right-click cloudera-quickstart-vm-5.4.2-0-virtualbox.zip and select
“Extract All…”

3. Start VirtualBox.

4. Begin importing. Import the VM by going to File -> Import Appliance

5. Click the Folder icon.

6. Select the cloudera-quickstart-vm-5.4.2-0-virtualbox.ovf from the folder where you
unzipped the VirtualBox VM and click Open.

7. Click Continue to proceed.

8. Click Import.


9. The virtual machine image will be imported. This can take several minutes.

10. Launch Cloudera VM. When the importing is finished, the quickstart-vm-5.4.2-0 VM will
appear on the left in the VirtualBox window. Select it and click the Start button to launch the
VM.

11. Cloudera VM booting. It will take several minutes for the Virtual Machine to start. The
booting process takes a long time since many Hadoop tools are started.


12. The Cloudera VM desktop. Once the booting process is complete, the desktop will
appear with a browser.
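
Steps 2 and 6 above (unzipping the archive and locating the .ovf file to import) can also be done from a script instead of the file manager. The following is a minimal sketch using only the Python standard library; the function name extract_vm and the destination folder are assumptions made for illustration, and the integrity check reads the whole archive, so it will be slow on a download of over 4 GB.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_vm(archive: Path, dest: Path) -> List[Path]:
    """Extract a VM archive and return the .ovf files found inside it.

    extract_vm is a hypothetical helper name, not part of any Cloudera tool.
    """
    with zipfile.ZipFile(archive) as zf:
        # testzip() returns the name of the first corrupt member, or None.
        bad_member = zf.testzip()
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member in archive: {bad_member}")
        zf.extractall(dest)  # creates dest if it does not exist
    # The .ovf file is what VirtualBox asks for in File -> Import Appliance.
    return sorted(dest.rglob("*.ovf"))


if __name__ == "__main__":
    # Archive name taken from step 1; the "cloudera-vm" folder is an assumption.
    archive_path = Path("cloudera-quickstart-vm-5.4.2-0-virtualbox.zip")
    if archive_path.exists():
        for ovf in extract_vm(archive_path, Path("cloudera-vm")):
            print("Import this file in VirtualBox:", ovf)
    else:
        print("Archive not found; download it first (step 1).")
```

The returned path is the file to select in step 6.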

Workaround Procedure:

1. Go to the following link:
https://www.cloudera.com/about/training/courses/cdp-private-cloud-fundamentals.html#?classType=ondemand
2. Register by entering your basic information.
3. Go through all the modules, along with the videos provided, about the Cloudera platform.
4. Complete the installation based on the requirements and the demo.

Download Location: https://downloads.cloudera.com/demo_vm/virtualbox/cloudera-quickstart-vm-5.12.0-0-virtualbox.zip
Download Location: https://downloads.cloudera.com/demo_vm/vmware/cloudera-quickstart-vm-5.12.0-0-vmware.zip
Download Location: https://downloads.cloudera.com/demo_vm/kvm/cloudera-quickstart-vm-5.12.0-0-kvm.zip
Download Location: https://downloads.cloudera.com/demo_vm/docker/cloudera-quickstart-vm-5.12.0-0-beta-docker.tar.gz

Screenshots: -


Questions:
1. How is private cloud different from public cloud in Cloudera?

Answer)


2. List the advantages of using a Hadoop distribution.

Answer)


3. Why is managing resources at the platform level better than at the application level?
How is it done in Cloudera?

Answer)


4. What are containers in the cloud? How do they help in Cloudera?

Answer)


5. How are upgrades and the elimination of bottlenecks achieved in Cloudera?

Answer)

Conclusion: From this experiment, I was able to understand and learn private cloud deployment,
understand the concept of a Hadoop distribution and its significance, and install and configure
Cloudera private cloud.
