Chapter-1 Introduction to DS, Issues and Architecture

Chapter-1
Introduction to Distributed Systems

Why should we learn about distributed
systems?
1. Distributed systems are very common today, and designing a distributed
system is more complicated than designing a standalone system.
2. Most of the online applications that we use on a daily basis are distributed
in some shape or form.
3. Examples include the World Wide Web (WWW), Google, Amazon, P2P file
sharing systems, volunteer computing, grid and cluster computing, cloud
computing, etc.
What is a distributed system?
Definition: Two definitions can be given.
1. A system that has multiple connected CPUs.
2. A collection of multiple machines that appears to its users as a single
coherent system.
What are the advantages?
1. Resource sharing:
• Distributed systems enable communication over the network and
resource sharing across machines (e.g. a process on one machine can
access files stored on a different machine).
2. Economic:
• Distributed systems lead to better economics in terms of price and
performance.
• It is usually more cost effective to buy multiple inexpensive small
machines and share the resources across those machines than buying
a single large machine.
3. Reliability:
• Distributed systems have better reliability compared to centralized
systems.
• When one machine in a distributed system fails, there are other
machines to take over its task and the whole system can still function.
• It is also possible to achieve better reliability with a distributed system
by replicating data on multiple machines.
4. Scalability:
• As the number of machines in a distributed system increases, all of the
resources on those machines can be utilized which leads to performance
scaling up.
• It is usually hard to achieve linear scalability due to various bottlenecks.
5. Incremental growth:
• If an application becomes more popular and more users use the application,
more machines can be added to its cluster to grow its capacity on demand.
• This is an important reason why the cloud computing paradigm is so popular
today.
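As a rough illustration of the scalability point above (linear scaling is hard to achieve), the sketch below estimates the speedup when part of the work is stuck behind a bottleneck and cannot be spread across machines. The model used here (Amdahl's law) and the 10% serial fraction are assumptions chosen for illustration; they are not figures from these notes.

```python
def speedup(machines: int, serial_fraction: float) -> float:
    """Estimated speedup under Amdahl's law.

    serial_fraction is the share of the work that cannot be distributed.
    It is an assumed value that stands in for the bottlenecks in a system.
    """
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / machines)


# Speedup for a few cluster sizes, assuming 10% of the work is serialized.
table = {n: round(speedup(n, serial_fraction=0.10), 2) for n in (1, 2, 10, 100)}
print(table)
```

Under this assumption the speedup can never reach 10, no matter how many machines are added, which is why removing bottlenecks matters more than adding hardware.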
Issues and Architectures
Scalability Problems and Techniques
It is often hard to distribute everything you have in the system. There are
three common types of bottleneck that prevent the system from scaling
up:
1. Centralized services
2. Centralized data
3. Centralized algorithms
Centralized services
• This simply means that the whole application is centralized, i.e., the
application runs on a single server.
• In this case, the processing capacity of the server will become a
bottleneck.
• The solution is to replicate the service on multiple machines but this
will also make the system design more complicated.
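A minimal sketch of replicating a service, assuming a simple round-robin policy: requests are spread over several copies of the service so that no single server carries the whole load. The class names and the machine labels (`server-a` and so on) are hypothetical.

```python
from itertools import cycle


class Replica:
    """One copy of the service; `name` is a hypothetical machine label."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.handled = 0

    def handle(self, request: str) -> str:
        self.handled += 1
        return f"{self.name} handled {request}"


class RoundRobinDispatcher:
    """Spreads requests evenly over the replicas, one after another."""

    def __init__(self, replicas: list) -> None:
        self._next = cycle(replicas)

    def dispatch(self, request: str) -> str:
        return next(self._next).handle(request)


replicas = [Replica("server-a"), Replica("server-b"), Replica("server-c")]
dispatcher = RoundRobinDispatcher(replicas)
responses = [dispatcher.dispatch(f"req-{i}") for i in range(6)]
print([replica.handled for replica in replicas])
```

The dispatcher is itself an extra component that has to be designed, deployed, and kept available, which is one way the replicated design becomes more complicated.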
Centralized data
• This means that the code may be distributed, but the data are stored
in one centralized place (e.g. one file or one database).
• In this case access to the data will become a bottleneck.
• Caching frequently-used data or replicating data at multiple locations
may solve the bottleneck but new problems will emerge such as data
consistency.
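The trade-off between caching and consistency can be sketched as follows. This is an illustrative in-process model, not a real database client: `Database` stands in for the centralized data store, and all names are hypothetical.

```python
class Database:
    """Stands in for the single centralized data store."""

    def __init__(self) -> None:
        self._rows = {"price": 100}
        self.reads = 0

    def read(self, key: str) -> int:
        self.reads += 1
        return self._rows[key]

    def write(self, key: str, value: int) -> None:
        self._rows[key] = value


class CachingClient:
    """Keeps a local copy of each value it has read."""

    def __init__(self, db: Database) -> None:
        self._db = db
        self._cache = {}

    def read(self, key: str) -> int:
        if key not in self._cache:          # cache miss: go to the database
            self._cache[key] = self._db.read(key)
        return self._cache[key]             # cache hit: no database access

    def invalidate(self, key: str) -> None:
        self._cache.pop(key, None)


db = Database()
client = CachingClient(db)
first = client.read("price")     # miss, fetched from the database
second = client.read("price")    # hit, served from the local copy
db.write("price", 120)           # another component updates the data
stale = client.read("price")     # the cache still holds the old value
client.invalidate("price")
fresh = client.read("price")     # refetched after invalidation
```

The cache removes load from the database (only two database reads for four client reads), but between the write and the invalidation the client sees an outdated value. Deciding when and how to invalidate is the consistency problem.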
Centralized algorithms
• This means that the algorithms used in the code make centralized
assumptions, e.g. doing routing based on complete information about the
whole system.
Architectures
1. Layered Architectures
2. Object-Based Style
3. Event-Based Architecture
4. Shared Data Space
1. Layered Architectures
• A layered architecture looks like a stack, as seen in the figure above.
• The system is partitioned into a sequence of layers, and each layer can
communicate only with the layer directly above or below it.
• For example, layer i can communicate with layer i+1 and layer i−1 but
not the others (e.g. layer i+2).
• This is the main restriction of a layered design.
• The layered architecture is especially common in web applications
where this architecture is divided across the client and the server.
• Common instances of these systems are multitiered architectures and
the network stack.
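The layering restriction can be sketched with a toy network stack in which each layer holds a reference only to the layer directly below it. The layer names and the bracketed "headers" are illustrative assumptions, not a real protocol.

```python
class Layer:
    """A layer that can only talk to the layer directly below it."""

    def __init__(self, name: str, below: "Layer | None" = None) -> None:
        self.name = name
        self._below = below   # the only other layer this one knows about

    def send(self, payload: str) -> str:
        wrapped = f"[{self.name}]{payload}"    # add this layer's header
        if self._below is None:
            return wrapped                     # bottom layer: put on the wire
        return self._below.send(wrapped)       # hand down exactly one level


link = Layer("link")
network = Layer("network", below=link)
transport = Layer("transport", below=network)
application = Layer("application", below=transport)

frame = application.send("hello")
print(frame)
```

Because the application layer has no reference to the network or link layers, it cannot skip a level. Any layer can be replaced as long as its neighbour still sees the same `send` interface.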
2. Object-Based Style
• In this architecture, each component corresponds to an object.
• Unlike in standard object-oriented programming, objects can be distributed
across multiple machines.
• As shown in the figure above, the system can have many objects.
• Each object has its own state and exposes its own interface,
which other objects can use.
• All objects can communicate with any other object without
restriction, making this a “generalized” version of the layered design.
• Components interact with each other via remote procedure calls.
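A simplified sketch of how a remote procedure call on a distributed object might work. Everything runs in one process here: `ObjectHost` plays the machine that holds the object, `Proxy` plays the client-side stub, and JSON strings stand in for network messages. All names are hypothetical, and a real system would use an RPC framework and an actual network transport.

```python
import json


class Counter:
    """A 'remote' object with its own state and a small interface."""

    def __init__(self) -> None:
        self._value = 0

    def add(self, amount: int) -> int:
        self._value += amount
        return self._value


class ObjectHost:
    """Plays the machine that hosts objects and executes incoming calls."""

    def __init__(self) -> None:
        self._objects = {}

    def register(self, object_id: str, obj: object) -> None:
        self._objects[object_id] = obj

    def handle(self, message: str) -> str:
        call = json.loads(message)
        method = getattr(self._objects[call["object"]], call["method"])
        return json.dumps({"result": method(*call["args"])})


class Proxy:
    """Client-side stand-in: turns method calls into messages."""

    def __init__(self, host: ObjectHost, object_id: str) -> None:
        self._host = host
        self._object_id = object_id

    def __getattr__(self, method_name: str):
        def remote_call(*args):
            request = json.dumps(
                {"object": self._object_id, "method": method_name, "args": args}
            )
            # In a real system this string would travel over the network.
            reply = self._host.handle(request)
            return json.loads(reply)["result"]

        return remote_call


host = ObjectHost()
host.register("counter-1", Counter())
counter = Proxy(host, "counter-1")
counter.add(5)
total = counter.add(2)
```

The caller writes `counter.add(2)` as if the object were local. The state lives only on the hosting side, and the caller sees nothing but the interface.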
3. Event-Based Architecture
• An event-based architecture has many components that communicate using a publisher-
subscriber (pubsub) model via an event bus instead of through direct communication.
• In this architecture, a component that sends an event to the event bus is a publisher, and
a component that subscribes to certain types of events on the event bus is a subscriber.
• Each component will work asynchronously.
• After a component sends information by publishing an event, the event bus then checks
for subscriptions matching the recipient information enclosed in the newly published
event.
• If one or more matching subscriptions is found, the event bus will deliver the data to the
appropriate component(s).
• There are many kinds of event buses, e.g., memory-based or disk-based.
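The publish/subscribe flow described above can be sketched with a small memory-based event bus. This is an illustrative sketch: the event type names are hypothetical, and delivery is synchronous for brevity, whereas a real bus would typically queue events so that components run asynchronously.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """In-memory event bus: publishers and subscribers never meet directly."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)   # event type -> handlers

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, data: dict) -> int:
        """Deliver data to every matching subscription; return how many."""
        handlers = self._subscribers.get(event_type, [])
        for handler in handlers:
            handler(data)
        return len(handlers)


bus = EventBus()
billing_log, shipping_log = [], []
bus.subscribe("order.created", billing_log.append)
bus.subscribe("order.created", shipping_log.append)

delivered = bus.publish("order.created", {"order_id": 42})
ignored = bus.publish("order.cancelled", {"order_id": 42})
```

The publisher only knows the bus and the event type. It does not know how many subscribers exist, or whether there are any at all, as with the `order.cancelled` event.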
4. Shared Data Space
• The shared data space architecture has a shared data space which is like a physical
bulletin board.
• A component posts information and some component may come along later and
retrieve the information.
• Unlike in the event-based architecture, data posted in the shared data space have
no specific information about the recipient.
• Posted data can stay in the shared data space for a while until some component
actively retrieves it.
• The components in the data-space architecture are loosely coupled in space and
time: the data that is published is not addressed to anyone in particular, and it
may not be received in real time.
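A minimal sketch of a shared data space, assuming a tuple-based bulletin board where a reader retrieves data by pattern rather than by sender. The class, the tuple layout, and the use of `None` as a wildcard are assumptions made for illustration.

```python
class SharedDataSpace:
    """A bulletin board of tuples; None in a pattern matches any value."""

    def __init__(self) -> None:
        self._items = []

    def post(self, item: tuple) -> None:
        self._items.append(item)          # no recipient is named

    def take(self, pattern: tuple):
        """Remove and return the first tuple matching pattern, else None."""
        for index, item in enumerate(self._items):
            if len(item) == len(pattern) and all(
                want is None or want == have for want, have in zip(pattern, item)
            ):
                return self._items.pop(index)
        return None


space = SharedDataSpace()
space.post(("temperature", "room-1", 21.5))
space.post(("temperature", "room-2", 19.0))

# Some other component, at some later time:
reading = space.take(("temperature", "room-2", None))
missing = space.take(("humidity", None, None))
```

The poster and the reader never reference each other (decoupled in space), and the reader may come along long after the data was posted (decoupled in time).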
Types of Distributed Systems
• Client/Server Systems
• Peer-to-Peer Systems
• Middleware
• Three-tier
• N-tier
Client/Server Systems
• This is the most basic form of distributed system. The client sends an input to
the server, and the server replies with an output: the client wants a task
performed, and the server allocates resources for the task, performs it, and sends
the result back as a response. The same pattern can be applied between servers,
where one server acts as a client of another.
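The request/response exchange can be sketched with a pair of connected sockets. This is a toy example under stated assumptions: the "task" is simply uppercasing the input, the server handles a single request, and `socket.socketpair()` is used so that the sketch runs without opening a network port.

```python
import socket
import threading


def serve_once(conn: socket.socket) -> None:
    """Server side: read one request, perform the task, send the result."""
    with conn:
        data = conn.recv(1024).decode()
        conn.sendall(data.upper().encode())   # the 'task' is uppercasing


def send_request(conn: socket.socket, text: str) -> str:
    """Client side: send the input and wait for the server's output."""
    with conn:
        conn.sendall(text.encode())
        return conn.recv(1024).decode()


# socketpair() gives two connected endpoints without needing a network port.
client_end, server_end = socket.socketpair()
server = threading.Thread(target=serve_once, args=(server_end,))
server.start()
reply = send_request(client_end, "hello server")
server.join()
print(reply)
```

In a real deployment the server would listen on an address, accept many connections, and run on a different machine from its clients.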
Peer-to-Peer Systems
• In this system, each node executes its tasks in its own local memory and
shares data with the other nodes over the network. Network applications
use a peer-to-peer system to coordinate processors that communicate with each
other but maintain independent memory.
Middleware
• Middleware is software that sits between two different applications and
provides services to both.
Three-tier
• A three-tier system uses a distinct layer and server for each program
function. It contains a presentation layer, an application layer, and a
data layer. The application logic lives in the middle tier, between the
client-facing presentation tier and the data tier where the data is stored.
The three-tier system is most commonly used in web or online applications.
N-tier
• It is also known as a multitier distributed system. As the name
suggests, this system may contain any number of tiers, each handling
one function, in the same way as the three-tier system. The N-tier system
is commonly used in web applications and data systems.
Applications & Characteristics of
Distributed Systems
Applications of Distributed Systems
• In networking, Ethernet LANs are among the best-known examples of distributed
systems. Computers send messages to other computers and systems using local IP
addresses.
• In telecommunications, telephone and cellular networks are distributed
networks with a broad spread of base stations. They continue to grow in
complexity as distributed networks.
• In real-time systems, the systems are distributed globally. Many major industries
use these systems: companies like Uber and Lyft use dispatch systems, many
major airlines use flight control systems, and e-commerce websites and
logistics companies use real-time tracking systems.
Characteristics of Distributed Systems
The key characteristics of distributed systems are:
• Transparency
• Heterogeneity
• Openness
• Scalability
• Fault Tolerance
• Concurrency
• Efficiency
Transparency
• Transparency is one of the essential characteristics of a distributed system. It is the notion that
the user interacts with a single whole rather than a cluster of cooperating elements.
• A system capable of presenting itself as a whole to the user is called transparent. Transparency is divided
into the eight sub-characteristics listed in the following table.
Transparency    Description
Access          Hide differences in data representation and in how an object is accessed.
Location        Hide where an object is located.
Relocation      Hide that an object may be moved to another location while it is in use.
Migration       Hide that an object may move to another location.
Replication     Hide that an object is replicated.
Concurrency     Hide that an object may be shared by several concurrent users.
Failure         Hide the failure and recovery of a resource.
Persistence     Hide whether an object is in memory or on disk.
Heterogeneity
• Heterogeneity refers to the system's ability to operate across a variety of hardware and software
components. A middleware software layer helps handle this heterogeneity. The goal of the
middleware is to interpret programming calls in a uniform way so that the distributed processing
gets completed regardless of the underlying platform.
Openness
• Another important characteristic of a distributed system is openness. A
distributed system's openness is determined by how easily an existing
system can be extended or improved.
• In order to make a distributed system open:
• The interfaces of the components should be well defined and precise.
• The interfaces of the components should be standardized.
• Integration of new components with existing ones must be effortless.
Scalability
• In terms of effectiveness, scalability is one of the significant characteristics of distributed systems.
• It refers to the ability of the system to handle growth as the number of users increases.
• Scalability is accomplished by adding more computer systems to the existing network.
• Centralized components limit the scalability of a distributed system. If part of the system is
centralized, more and more nodes will try to communicate with it as the system grows, which
results in a bottleneck at that point in the system.
Fault Tolerance
• A distributed system is prone to failures, in part because its many computers
run hardware of differing ages. The ability of a system
to handle these failures is called fault tolerance. Fault tolerance is achieved by:
• Recovery: Systems and processes have a stored backup, which takes over when the
system fails.
• Redundancy: Components are duplicated so that, when one fails, another takes over
and the system keeps behaving in a predictable and controlled way.
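A minimal sketch of the idea behind fault tolerance, assuming data is replicated on several machines and a reader simply falls back to the next copy when one is down. The class names, the `alive` flag, and the sample record are hypothetical; real systems detect failures through timeouts or health checks.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot serve the request."""


class Replica:
    def __init__(self, name: str, data: dict, alive: bool = True) -> None:
        self.name = name
        self.data = dict(data)    # each replica holds its own copy
        self.alive = alive

    def get(self, key: str):
        if not self.alive:
            raise ReplicaDown(self.name)
        return self.data[key]


def read_with_failover(replicas: list, key: str):
    """Try each replica in turn; return (value, name of the replica used)."""
    for replica in replicas:
        try:
            return replica.get(key), replica.name
        except ReplicaDown:
            continue              # fall through to the next copy
    raise ReplicaDown("all replicas failed")


records = {"user:1": "Ada"}
cluster = [Replica("primary", records, alive=False), Replica("backup", records)]
value, served_by = read_with_failover(cluster, "user:1")
```

The caller gets its answer even though the first machine has failed. The read only fails when every copy is unavailable.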
Concurrency
• Concurrency is the system's capability to let multiple activities access and use
shared resources at the same time.
• In distributed systems, the concurrent execution of activities takes place in
different components running on numerous machines.
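Concurrent access to a shared resource needs coordination. The sketch below uses threads in a single process as a stand-in for components on different machines, and a lock as the coordination mechanism. Both are simplifying assumptions, since a distributed system would need a distributed form of locking or another coordination protocol.

```python
import threading


class SharedCounter:
    """A shared resource that several activities update at the same time."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:          # only one activity updates at a time
            self.value += 1


def worker(counter: SharedCounter, times: int) -> None:
    for _ in range(times):
        counter.increment()


counter = SharedCounter()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(counter.value)
```

With the lock in place, every update is applied. Without it, two activities could read the same old value and one of the increments could be lost.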
Efficiency
• Efficiency refers to the capability of the system to use its resources effectively to
execute the given tasks. The system's design, the workload handled by the
system, and the hardware and software resources used are some critical factors
affecting the system's efficiency.
• Some of the common ways to improve the efficiency of the system are:
• Optimizing the design of the system. This minimizes the amount of
communication and coordination required between the different components,
reducing any extra power consumption.
• Carefully balancing the workload of the system. This balance avoids any
component overload and ensures that the system can make the most efficient
use of its resources.
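Workload balancing can be sketched with a simple greedy policy: each task goes to whichever worker currently has the least load. The policy, the function name, and the task costs are assumptions for illustration. Real schedulers also account for data locality, priorities, and changing load.

```python
import heapq


def assign_tasks(task_costs: list, worker_count: int) -> list:
    """Greedy balancing: give each task to the currently least-loaded worker."""
    heap = [(0, worker) for worker in range(worker_count)]   # (load, worker id)
    assignments = [[] for _ in range(worker_count)]
    # Placing the largest tasks first tends to give a more even result.
    for cost in sorted(task_costs, reverse=True):
        load, worker = heapq.heappop(heap)
        assignments[worker].append(cost)
        heapq.heappush(heap, (load + cost, worker))
    return assignments


plan = assign_tasks([7, 3, 5, 2, 4, 6], worker_count=3)
loads = [sum(tasks) for tasks in plan]
print(plan, loads)
```

For this input every worker ends up with the same total load, so no component is overloaded while another sits idle.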
Disadvantages of Distributed Systems
• Software for distributed systems is harder to develop, and less of it is
available, than for centralized systems.
• As the data is distributed, security is a primary concern, because the data
may be accessible from many points.
• If there is a delay in the network, the user may face difficulty
accessing data.
• A distributed system's database is intricate and challenging to
manage.
• Network overloading is also a challenge faced in distributed systems.
This happens when all the nodes send data at once.
