
ETHIOPIA TECHNICAL UNIVERSITY

Emerging Technology (EMTE)

Assignment

Jan, 2022
Review of:

Clustered Computing and Hadoop Ecosystem

Introduction
One of the challenges in dealing with big data is that a single computer
cannot handle it, since big data demands large storage capacity and complex
computation. This problem is solved by using a computer cluster.

Cluster Computing
Cluster computing uses a collection of tightly or loosely connected computers
that work together so that they act as a single entity. The resources of these
computers are pooled so that the cluster appears as one computer that is more
powerful than any individual machine.

The benefits of clustered computing:

a. Resource pooling: a cluster combines the three basic resources (storage,
CPU and memory) of each computer so that they work as a single entity.
b. High availability: a cluster provides continuous availability. Since all
hosts in the cluster have access to the same shared storage, the work of a
failed host can fail over to another host without any downtime. Clusters can
also provide varying levels of fault tolerance.
c. Easy scalability: a cluster can easily be scaled horizontally by adding
more machines to the group.
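
The resource pooling and horizontal scaling described above can be sketched in a few lines of Python. This is only an illustration: the Node and Cluster classes, the node names and the resource figures are all hypothetical and do not correspond to any real cluster manager.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One machine in the cluster (all figures are made-up examples)."""
    name: str
    cpu_cores: int
    memory_gb: int
    storage_tb: int


@dataclass
class Cluster:
    """Hypothetical cluster that presents its nodes as one pooled resource."""
    nodes: list = field(default_factory=list)

    def add_node(self, node):
        # Horizontal scaling: capacity grows by adding another machine.
        self.nodes.append(node)

    def pooled(self):
        # Resource pooling: sum storage, CPU and memory over all nodes.
        return {
            "cpu_cores": sum(n.cpu_cores for n in self.nodes),
            "memory_gb": sum(n.memory_gb for n in self.nodes),
            "storage_tb": sum(n.storage_tb for n in self.nodes),
        }


cluster = Cluster()
cluster.add_node(Node("node-1", cpu_cores=8, memory_gb=32, storage_tb=4))
cluster.add_node(Node("node-2", cpu_cores=8, memory_gb=32, storage_tb=4))
before = cluster.pooled()

# Scale out by one larger machine and look at the pool again.
cluster.add_node(Node("node-3", cpu_cores=16, memory_gb=64, storage_tb=8))
after = cluster.pooled()

print("two nodes:  ", before)
print("three nodes:", after)
```

A real cluster cannot simply add up its resources in this way, because the work must also be divided and coordinated across the nodes, which is the management problem discussed next.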

However, using a cluster requires a solution for managing cluster membership,
coordinating resource sharing and scheduling the actual work on individual
nodes. This challenge is solved by using Hadoop and its ecosystem.

Hadoop

Hadoop is an open source framework that allows for the distributed processing
of large data sets across clusters of computers using simple programming
models.
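
The best-known of these programming models is MapReduce. The sketch below imitates its three steps (map, shuffle, reduce) for a word count inside a single Python process. It does not use the Hadoop API. The function names and the sample input are hypothetical, and in a real cluster the map and reduce steps would run on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one line of input.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by their key (the word).
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: combine the values of one key into a single count.
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count([
    "big data needs big clusters",
    "clusters pool resources",
])
print(counts)
```

The programmer writes only the map and reduce functions. Splitting the input, moving data between machines and retrying failed tasks are left to the framework.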
Hadoop has four basic characteristics:
a. Economical: it can use ordinary computers for the cluster.
b. Reliable: data is copied and stored on different machines of the cluster,
so it is not lost due to hardware problems on some computers.
c. Scalable: it is easily scalable both horizontally and vertically.
d. Flexible: it can handle any type of data and keep it for later use.
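
The reliability characteristic above can be illustrated with a small replication sketch. It assumes a simple round-robin placement and a replication factor of 3. The round-robin rule is a simplification chosen for illustration, not the placement policy Hadoop actually uses, and the block and node names are hypothetical.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (round-robin)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for i, block in enumerate(block_ids):
        # Consecutive positions in the node list hold the copies of a block.
        placement[block] = [
            nodes[(i + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one copy on a live node."""
    return [
        block
        for block, holders in placement.items()
        if any(holder not in failed for holder in holders)
    ]


nodes = ["node-1", "node-2", "node-3", "node-4"]
placement = place_replicas(["blk-0", "blk-1", "blk-2"], nodes, replication=3)

# Simulate hardware problems on two machines at the same time.
survivors = readable_blocks(placement, failed={"node-1", "node-2"})

print(placement)
print("still readable:", survivors)
```

With three copies of every block, each block in this example keeps at least one copy on a working machine even when two of the four machines fail.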

Hadoop Ecosystem
The Hadoop ecosystem is a framework that evolved from four core components:
data management, data access, data processing and data storage. The Hadoop
ecosystem elements at the various stages of data processing are summarized as
follows.
