CS3551 DC - Int - I - Answer Key 7.9.23
CS3551- DISTRIBUTED COMPUTING Prepared by Er. [Link] MOHIDEEN, ASST. PROF. /IT, AMSCE Page 1
2 What are the two major types of Distributed Systems? CO1 BTL-1
a) Client–Server Systems: the client–server model can also be applied with multiple servers.
b) Peer-to-Peer Systems: every node can act as both a client and a server.
7 How is a FIFO execution implemented? CO2 BTL-1
To implement a FIFO logical channel over a non-FIFO channel, a separate numbering scheme is used to sequence the messages.
The sender assigns a sequence number and appends a connection_id to each message before transmitting it.
The receiver buffers the incoming messages, arranges them according to the sender's sequence numbers, and accepts only the "next" expected message per sequence.
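The numbering scheme described above can be sketched as a small receiver class; the class and field names here are illustrative, not from any particular library.

```python
class FIFOReceiver:
    """Delivers messages from a non-FIFO channel in sender order.

    A minimal sketch of the sequence-numbering scheme: messages are
    buffered per connection_id and released only in consecutive order.
    """

    def __init__(self):
        self.next_seq = {}   # connection_id -> next expected sequence number
        self.buffer = {}     # connection_id -> {seq: message}

    def receive(self, connection_id, seq, message):
        """Buffer an incoming message; return the messages that can
        now be delivered in FIFO order (possibly none)."""
        self.buffer.setdefault(connection_id, {})[seq] = message
        expected = self.next_seq.setdefault(connection_id, 0)
        delivered = []
        # Release consecutive messages starting from the expected number.
        while expected in self.buffer[connection_id]:
            delivered.append(self.buffer[connection_id].pop(expected))
            expected += 1
        self.next_seq[connection_id] = expected
        return delivered
```

For example, if message 1 arrives before message 0, it is held back until message 0 arrives, and then both are delivered together.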
8 Differentiate between synchronous and asynchronous communication. CO2 BTL-1
Synchronous form of communication:
• The sending and receiving processes synchronize at every message.
• In this case, both send and receive are blocking operations.
• In synchronous transmission, the time interval of transmission is constant.
Asynchronous form of communication:
• The use of the send operation is non-blocking, in that the sending process is allowed to proceed as soon as the message has been copied to a local buffer, and the transmission of the message proceeds in parallel with the sending process.
• The receive operation can have blocking and non-blocking variants.
• In asynchronous transmission, the time interval of transmission is not constant; it is random.
9 Define Logical time and logical clocks.
Logical time
Lamport proposed a model of logical time that can be used to provide an ordering among the events at processes running in different computers in a distributed system. CO2 BTL-1
Logical time allows the order in which the messages are
presented to be inferred without recourse to clocks.
Logical clocks
Lamport invented a simple mechanism by which the happened
before ordering can be captured numerically, called a logical clock.
A Lamport logical clock is a monotonically increasing software counter, whose value need bear no particular relationship to any physical clock.
Each process pi keeps its own logical clock, Li, which it uses to apply so-called Lamport timestamps to events.
We denote the timestamp of event e at pi by Li(e) , and by L(e)
we denote the timestamp of event e at whatever process it
occurred at.
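The standard Lamport clock rules (increment before each local event; on receipt, take the maximum of the local and piggybacked values, then increment) can be sketched as follows; this is a minimal illustration, not tied to any messaging library.

```python
class LamportClock:
    """Lamport's logical clock: a monotonically increasing counter."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # LC1: before timestamping each local event, increment the counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event: increment, then piggyback the timestamp.
        return self.tick()

    def receive(self, msg_time):
        # LC2: on receiving (m, t), set L := max(L, t), then increment.
        self.time = max(self.time, msg_time) + 1
        return self.time
```

If process p1 sends a message with timestamp 1, the receiving process p2 (still at 0) timestamps the receive event 2, preserving the happened-before order.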
10 What are the two phases in obtaining a global snapshot? CO2 BTL-1
First, locally recording the snapshot at every process.
Second, distributing the resultant global snapshot to all the initiators.
PART B
11(a) List and describe the components of a Distributed System. (7) CO1 BTL-1 & BTL-2
11(b) Define distributed systems. What are the significant issues and challenges of distributed systems? CO1 BTL-1
12(a) i) Discuss the different trends in distributed systems. (10) CO1 BTL-6
ii) Describe ANY TWO examples of distributed systems. (4) CO1 BTL-5
12(b) Describe the need of Distributed System with following basis:
a) Requirements (5) CO1 BTL-2
b) Major Characteristics (4)
c) Reasons (4)
13(a) i) What are Message Passing System (MPS) and Shared Memory System (SMS)? Describe both in terms of communication in Distributed Computing requirements. (8) CO1 BTL-1 & BTL-2
ii) Compare Message Passing System (MPS) and Shared Memory System (SMS). (5) CO1 BTL-2
13(b) i) Identify and explain the basic properties of scalar time. (7) CO2 BTL-2
ii) List and explain the basic properties of vector time. (6) CO2 BTL-1
14(a) What are the different ways of synchronizing physical clocks? Explain Physical clock synchronization algorithm. (13) CO2 BTL-1 & BTL-2
ii) Briefly explain the Vector Clock with rules that define this
vector clock. (5)
15(a) Explain in detail the three major message ordering paradigms. Give an example for each one of them. (13) CO2 BTL-1
15(b) What is group communication? What are the Key areas of CO2 BTL-1
applications of group communication? Explain the
programming model for group communication. (13)
PART – C
CO2 BTL-1
ii) Explain global states and consistent cuts with example. (8) CO2 BTL-1
PART – B( 13 Marks) - ANSWER KEY
a) Devices or Systems:
The devices or systems in a distributed system have their own processing capabilities
and may also store and manage their own data.
b) Network:
The network connects the devices or systems in the distributed system, allowing them to
communicate and exchange data.
c) Resource Management:
Distributed systems often have some type of resource management system in place to
allocate and manage shared resources such as computing power, storage, and
networking.
The architecture of a Distributed Computing System is typically a Peer-to-Peer Architecture, where
devices or systems can act as both clients and servers and communicate directly with each other.
The three basic components of a distributed system include
a) primary system controller,
b) system data store, and
c) database.
1. Primary system controller
The primary system controller is the only controller in a distributed system and keeps track of everything.
It’s also responsible for controlling the dispatch and management of server requests throughout the
system.
The executive and mailbox services are installed automatically on the primary system controller.
In a non-clustered environment, optional components consist of a user interface and secondary
controllers.
a) Secondary controller
The secondary controller is a process controller or a communications controller.
It’s responsible for regulating the flow of server processing requests and managing the system’s
translation load.
It also governs communication between the system and VANs or trading partners.
b) User-interface client
The user interface client is an additional element in the system that provides users with important
system information.
This is not a part of the clustered environment, and it does not operate on the same machines as
the controller.
It provides functions that are necessary to monitor and control the system.
2. System data store
Each system has only one data store for all shared data.
The data store is usually on the disk vault, whether clustered or not.
For non-clustered systems, this can be on one machine or distributed across several devices, but
all of these computers must have access to this datastore.
3. Database
In a distributed system, a relational database stores all data. Once the data store locates the data, it
shares it among multiple users.
Relational databases can be found in all data systems and allow multiple users to use the same
information simultaneously.
***********************
ii) Illustrate with necessary diagram how distributed components are loosely coupled. (6)
ANSWER KEY:
The loosely coupled system contains distributed memory. On the other hand, a tightly coupled system
has a shared memory.
The loosely coupled system contains a low data rate. On the other hand, the tightly coupled system
contains a high data rate.
These are the systems in which data is stored and processed on many machines which are
connected by some network.
To put it more simply, distributed systems are a collection of several separate (individual) systems which communicate (through a LAN or WAN) and cooperate with each other (using some software) in order to provide users access to the various resources that the system maintains.
One important point to note about distributed systems is that they are loosely coupled, i.e., hardware and software may communicate with each other but they need not depend upon each other.
Thus the loosely coupled components in a Distributed System provide major support for performance.
a) Cloud Computing: Cloud Computing systems are a type of distributed computing system that are
used to deliver resources such as computing power, storage, and networking over the Internet.
b) Peer-to-Peer Networks: Peer-to-Peer Networks are a type of distributed computing system that is
used to share resources such as files and computing power among users.
c) Distributed Architectures: Many modern computing systems, such as microservices architectures,
use distributed architectures to distribute processing and data storage across multiple devices or
systems.
A distributed computation consists of a set of processes that cooperate to achieve a common goal.
A main characteristic of these computations is that the processes do not already share a common
global memory and that they communicate only by exchanging messages over a communication
network.
*****************************
2) Define distributed systems. What are the significant issues and challenges of the distributed
systems? NOV/DEC 2017, APRIL/MAY 2018
ANSWER KEY
Designing a distributed system is neither easy nor straightforward.
A number of challenges need to be overcome in order to get the ideal system.
1. Heterogeneity:
The Internet enables users to access services and run applications over a heterogeneous collection
of computers and networks.
Heterogeneity (that is, variety and difference) applies to all of the following:
i) Hardware devices: computers, tablets, mobile phones, embedded devices, etc.
ii) Operating systems: MS Windows, Linux, Mac OS, Unix, etc.
iii) Network: Local network, the Internet, wireless network, satellite links, etc.
iv) Programming languages: Java, C/C++, Python, PHP, etc.
v) Different roles of software developers, designers, system managers
Different programming languages use different representations for characters and data structures
such as arrays and records.
These differences must be addressed if programs written in different languages are to be able to
communicate with one another.
Programs written by different developers cannot communicate with one another unless they use
common standards, for example, for network communication and the representation of primitive
data items and data structures in messages.
For this to happen, standards need to be agreed and adopted – as have the Internet protocols.
Middleware :
The term middleware applies to a software layer that provides a programming abstraction as
well as masking the heterogeneity of the underlying networks, hardware, operating
systems and programming languages.
Most middleware is implemented over the Internet protocols, which themselves mask the differences of the underlying networks, but all middleware deals with the differences in operating systems and hardware.
The term mobile code is used to refer to program code that can be transferred from one computer to
another and run at the destination – Java applets are an example. Code suitable for running on one
computer is not necessarily suitable for running on another because executable programs are normally
specific both to the instruction set and to the host operating system.
2. Transparency:
Transparency is defined as the concealment from the user and the application programmer of the
separation of components in a distributed system, so that the system is perceived as a whole rather than as
a collection of independent components.
In other words, distributed systems designers must hide the complexity of the systems as much as they
can.
Migration: Hide that a resource may move to another location.
Relocation: Hide that a resource may be moved to another location while in use.
Replication: Hide that a resource may be copied in several places.
Concurrency: Hide that a resource may be shared by several competitive users.
Failure: Hide the failure and recovery of a resource.
Persistence: Hide whether a (software) resource is in memory or on a disk.
3. Openness
The openness of a computer system is the characteristic that determines whether the system can
be extended and re-implemented in various ways.
The openness of distributed systems is determined primarily by the degree to which new
resource-sharing services can be added and be made available for use by a variety of client
programs.
If the well-defined interfaces for a system are published, it is easier for developers to add new
features or replace sub-systems in the future.
Example: Twitter and Facebook have APIs that allow developers to develop their own software interactively.
4. Concurrency
Both services and applications provide resources that can be shared by clients in a distributed
system. There is therefore a possibility that several clients will attempt to access a shared resource
at the same time.
For example, a data structure that records bids for an auction may be accessed very frequently
when it gets close to the deadline time.
For an object to be safe in a concurrent environment, its operations must be synchronized in such
a way that its data remains consistent.
This can be achieved by standard techniques such as semaphores, which are used in most
operating systems.
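The semaphore technique mentioned above can be sketched for the auction example; the names here (bids, highest, place_bid) are hypothetical illustrations, using a binary semaphore to keep the shared bid record consistent under concurrent access.

```python
import threading

# Hypothetical shared auction record, protected by a binary semaphore
# so that concurrent bidders leave the data in a consistent state.
bids = []
highest = 0
lock = threading.Semaphore(1)

def place_bid(amount):
    """Record a bid; the semaphore ensures only one thread updates
    the shared data at a time."""
    global highest
    lock.acquire()
    try:
        bids.append(amount)
        if amount > highest:
            highest = amount
    finally:
        lock.release()

# Many "clients" bidding concurrently, as threads standing in for
# distributed client requests.
threads = [threading.Thread(target=place_bid, args=(i,)) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the semaphore, the read-then-update of `highest` could interleave between threads and lose a bid.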
5. Security
Many of the information resources that are made available and maintained in distributed systems
have a high intrinsic value to their users.
Their security is therefore of considerable importance.
Security for information resources has three components:
• confidentiality (protection against disclosure to unauthorized individuals);
• integrity (protection against alteration or corruption);
• availability for the authorized (protection against interference with the means to access the resources).
6. Scalability
i) Size
Number of users and resources to be processed. Problem associated is overloading
ii) Geography
Distance between users and resources. Problem associated is communication reliability
iii) Administration
As the size of distributed systems increases, many of the system needs to be controlled. Problem
associated is administrative mess
7. Failure Handling
When faults occur in hardware or software, programs may produce incorrect results or may stop
before they have completed the intended computation.
The handling of failures is particularly difficult.
**************************
4. Mobile & ubiquitous computing – Small and portable devices are possible to be used within
distributed systems
• E.g. laptop computers, handheld devices, wearable devices, devices embedded in appliances –
Mobile computing: portability of the devices and the ability to connect to networks in different
places –
Ubiquitous computing: small computing devices that are available everywhere and are easily attached to networks
5. Portable & handheld devices in a distributed system.
8. Cloud computing:
Cloud computing: distributed computing utility.
A cloud is a set of internet-based application, storage and computing services sufficient to
support most users’ needs.
Clouds are implemented on cluster computers to provide the appropriate scale and performance required by such services
9) A cluster computer: a set of interconnected computers that cooperate closely to provide a single
integrated high-performance computing capability –
A blade server: a computer server that has been designed to minimize the use of physical
space and energy
10. Grid Computing – a form of cloud computing – authorized users share processing power, memory and data storage – used to support scientific applications
*****************
ii) Enlighten ANY TWO examples of distributed systems. MAY/JUNE 2016 (4)
ANSWER KEY
a) Web search
Web search has emerged as a major growth industry in the last decade, with recent figures
indicating that the global number of searches has risen to over 10 billion per calendar month.
The task of a web search engine is to index the entire contents of the World Wide Web,
encompassing a wide range of information styles including web pages, multimedia sources and
(scanned) books.
This is a very complex task, as current estimates state that the Web consists of over 63 billion pages and one trillion unique web addresses.
and Flickr;
the emergence of social networking through services such as Facebook and MySpace.
e) Healthcare
The growth of health informatics as a discipline with its emphasis on online electronic patient
records and related issues of privacy;
the increasing role of telemedicine in supporting remote diagnosis or more advanced services
such as remote surgery (including collaborative working between healthcare teams);
the increasing application of networking and embedded systems technology in assisted living, for example for monitoring the elderly in their own homes.
f) Education
The emergence of e-learning through for example web-based tools such as virtual learning
environments;
associated support for distance learning; support for collaborative or community-based learning.
g) Transport and logistics
i) GPS in route finding systems and more general traffic management systems;
ii) the modern car itself as an example of a complex distributed system (also applies to other forms of transport such as aircraft);
iii) the development of web-based map services such as MapQuest, Google Maps and Google Earth.
*********************
4) Describe the need of Distributed System with following basis:
a) Requirements (5)
b) Major Characteristics (4)
c) Reasons (4)
ANSWER KEY:
The motivation for using a distributed system is some or all of the following
requirements:
1) Inherently distributed computations
In many applications, such as money transfer in banking or reaching consensus among parties that are geographically distant, the computation is inherently distributed.
2) Resource sharing
The resources such as peripherals, complete data sets in databases, special libraries, as well as
data (variable/files) cannot be fully replicated at all the sites because it is often neither practical
nor cost-effective.
Further, they cannot be placed at a single site because access to that site might prove to be a
bottleneck.
Therefore, such resources are typically distributed across the system.
For example, distributed databases such as DB2 partition the data sets across several servers, in
addition to replicating them at a few sites for rapid access as well as reliability.
3. Access to geographically remote data and resources
In many scenarios, the data cannot be replicated at every site participating in the distributed execution because it may be too large or too sensitive to be replicated.
For example, payroll data within a multinational corporation is both too large and too sensitive to
be replicated at every branch office/site.
It is therefore stored at a central server which can be queried by branch offices.
Similarly, special resources such as supercomputers exist only in certain locations, and to access
such supercomputers, users need to log in remotely.
Advances in the design of resource-constrained mobile devices as well as in the wireless
technology with which these devices communicate have given further impetus to the importance
of distributed protocols and middleware.
4. Enhanced reliability
A distributed system has the inherent potential to provide increased reliability because of the
possibility of replicating resources and executions, as well as the reality that geographically
distributed resources are not likely to crash/malfunction at the same time under normal
circumstances.
• integrity, i.e., the value/state of the resource should be correct, in the face of concurrent access from
multiple processors, as per the semantics expected by the application;
• fault-tolerance, i.e., the ability to recover from system failures, where such failures may be defined to
occur in one of many failure models
5. Increased performance/cost ratio
By resource sharing and accessing geographically remote data and resources, the performance/cost ratio is increased.
Although higher throughput has not necessarily been the main objective behind using a
distributed system, nevertheless, any task can be partitioned across the various computers in the
distributed system.
Such a configuration provides a better performance/cost ratio than using special parallel
machines.
This is particularly true of the NOW configuration.
6. Scalability
As the processors are usually connected by a wide-area network, adding more processors does not pose a
direct bottleneck for the communication network.
Heterogeneous processors may be easily added into the system without affecting the performance,
as long as those processors are running the same middleware algorithms.
Similarly, existing processors may be easily replaced by other processors.
A main characteristic of these computations is that the processes do not already share a common global
memory and that they communicate only by exchanging messages over a communication network.
Major characteristics of a distributed system:
i) Multiple Devices or Systems: Processing and data storage are distributed across multiple devices or systems.
ii) Peer-to-Peer Architecture: Devices or systems in a distributed system can act as both clients and
servers, as they can both request and provide services to other devices or systems in the network.
iii) Shared Resources: Resources such as computing power, storage, and networking are shared
among the devices or systems in the network.
Horizontal Scaling: Scaling a distributed computing system typically involves adding more devices or
systems to the network to increase processing and storage capacity. This can be done through hardware
upgrades or by adding additional devices or systems to the network.
Reasons for building a Distributed System
A distributed system contains multiple nodes that are physically separate but linked together using
the network.
All the nodes in this system communicate with each other and handle processes in tandem.
Each of these nodes contains a small part of the distributed operating system software.
a) It is inherently distributed :
For example, sending a message from your mobile phone to your friend’s phone.
c) For better performance
Get data from a nearby node rather than one halfway round the world.
For example, a huge amount of data may not fit into a single machine.
a) Resource sharing
b) Computation speedup
c) Reliability
d) Communication.
Generally, one host at one site, the server, has a resource that another host at another site, the
client (or user), would like to use. A general structure of a distributed system is shown in Figure
**************************
5) i) What are Message Passing System(MPS) and Shared Memory System (SMS)?
(8)
ANSWER KEY:
A set of processes that communicate with one another by sending and receiving messages over a communication channel is called a message-passing system.
Message passing in distributed systems involves communication between nodes to coordinate
actions, exchange data, and propagate information.
The pattern of connections provided by the channels is described by the topology of the system.
The collection of the channels is called a network.
This allows multiple processes to read and write data to the message queue without being connected to each other.
Messages are stored on the queue until their recipient retrieves them.
Message queues are quite useful for inter process communication and are used by most operating
systems.
In this model, data is shared by sending and receiving messages between co-operating processes, using system calls.
Message Passing is particularly useful in a distributed environment where the communicating
processes may reside on different, network connected, systems.
Message passing architectures are usually easier to implement but are also usually slower than
shared memory architectures.
Message-Passing Communication: for data transfer
Tasks exchange data through communications by sending and receiving explicit messages.
Data transfer usually requires cooperative operations to be performed by each process.
For example, a send operation must have a matching receive operation.
Message passing is a flexible and scalable method for inter-node communication in distributed
systems.
It enables nodes to exchange information, coordinate activities, and share data without relying on
shared memory or direct method invocations.
Models like synchronous and asynchronous message passing offer different synchronization and
communication semantics to suit system requirements.
Synchronous message passing ensures sender and receiver synchronization, while asynchronous
message passing allows concurrent execution and non-blocking communication.
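The asynchronous model described above can be sketched with two threads standing in for distributed processes and a queue standing in for the channel; the names are illustrative, and a real system would use sockets or a messaging library.

```python
import queue
import threading

# The channel: a FIFO queue into which the sender deposits messages.
channel = queue.Queue()

def sender():
    # Asynchronous (non-blocking) send: enqueue the message into the
    # local buffer and continue immediately.
    channel.put({"type": "data", "payload": 42})

def receiver(result):
    # Blocking receive: wait until a message arrives on the channel.
    msg = channel.get()
    result.append(msg["payload"])

result = []
t_recv = threading.Thread(target=receiver, args=(result,))
t_send = threading.Thread(target=sender)
t_recv.start()
t_send.start()
t_send.join()
t_recv.join()
```

Note that the receiver can start before the sender; with a blocking receive it simply waits, which is exactly the synchronization semantics the text describes.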
ii) Compare Message Passing System (MPS) and Shared Memory System (SMS). (5)
ANSWER KEY:
Message Passing System: All remote data accesses are explicit, and therefore the programmer is always aware of whether a particular operation is in-process or involves the expense of communication.
Shared Memory System: Remote data accesses may or may not involve communication by the underlying runtime support, so the programmer is not always aware of this cost.
***************************
ANSWER KEY:
Scalar Time is a simple implementation that uses local clocks and two rules to ensure
correct order.
Other methods, such as using timestamps, can also be used but must obey causality to
work properly.
By using logical clocks, processes in a distributed system can work together in an
organized and efficient way.
Time domain is the set of non-negative integers.
The logical local clock of a process pi and its local view of the global time are squashed into one integer variable Ci.
Scalar Clocks provide a naive solution to the problem, giving an eventually consistent
ordering of events.
i) Consistency Property
Scalar clocks satisfy the monotonicity and hence the consistency property:
for two events ei and ej, ei → ej ⇒ C(ei) < C(ej).
The main problem in totally ordering events is that two or more events at different processes may have identical timestamps.
For example, in Figure 3.1, the third event of process P1 and the second event of process P2 have identical scalar timestamps.
ii) List and explain the basic properties of vector time. (6)
ANSWER KEY:
In the system of vector clocks, the time domain is represented by a set of n-dimensional
non-negative integer vectors.
An Example of Vector Clocks
a) Isomorphism
If events in a distributed system are timestamped using a system of vector clocks, we have the
following property.
If two events x and y have timestamps vh and vk, respectively, then
x → y ⇔ vh < vk
x ∥ y ⇔ vh ∥ vk
Thus, there is an isomorphism between the set of partially ordered events produced by a
distributed computation and their vector timestamps.
b) Strong Consistency
The system of vector clocks is strongly consistent; thus, by examining the vector timestamp of
two events, we can determine if the events are causally related.
However, Charron-Bost showed that the dimension of vector clocks cannot be less than n, the total number of processes in the distributed computation, for this property to hold.
c) Event Counting
If d=1 (in rule R1), then the i th component of vector clock at process pi ,vti [i ], denotes the
number of events that have occurred at pi until that instant.
If an event e has timestamp vh, then vh[j] denotes the number of events executed by process pj that causally precede e. Clearly, Σj vh[j] − 1 represents the total number of events that causally precede e in the distributed computation.
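The vector clock rules R1 and R2 (with d = 1) and the comparison that decides causal order can be sketched as follows; the class shape is illustrative.

```python
class VectorClock:
    """Vector clock for process i out of n processes (d = 1 in rule R1)."""

    def __init__(self, n, i):
        self.v = [0] * n   # vt_i, one component per process
        self.i = i         # index of this process

    def tick(self):
        # R1: before each event, increment own component by d = 1.
        self.v[self.i] += 1
        return list(self.v)

    def receive(self, other):
        # R2: take the component-wise maximum with the piggybacked
        # timestamp, then apply R1 for the receive event itself.
        self.v = [max(a, b) for a, b in zip(self.v, other)]
        return self.tick()

def happened_before(vh, vk):
    """x -> y iff vh < vk: every component <=, at least one strictly less."""
    return all(a <= b for a, b in zip(vh, vk)) and vh != vk
```

Two events whose timestamps are incomparable under happened_before in either direction are concurrent, which is exactly the isomorphism property stated above.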
**********************
2) What are the different ways of synchronizing physical clocks? Explain Physical clock
synchronization algorithm. (13)
ANSWER KEY:
Clock Synchronization
Synchronized clocks incorporate technology allowing the clocks to receive time data and
display accurate and uniform time throughout a facility.
Synchronized clocks are used in many types of facilities for the purpose of having all
clocks in the facility show the exact same time.
If two systems do not interact with each other then there is no need of synchronization.
So, what usually matters is that processes agree on the order in which events occur rather
than the time at which they occurred.
Logical clocks do not need the exact time, so absolute time is not a constraint for logical clocks. Logical clocks are concerned only with the order in which messages are delivered, not with the times at which the events occurred.
Using interrupts, a computer generally updates a software clock. The more frequent the interrupts, the higher the overhead, but the greater the accuracy.
The most common logical clock synchronization algorithm for distributed systems is Lamport's algorithm.
Physical clock synchronization algorithms can be classified into two categories:
1. centralized and
2. distributed.
Centralized algorithms have one node with a real-time receiver, called the time server node.
The clock time of this node is regarded as correct and used as the reference time.
The goal of this algorithm is to keep the clocks of all other nodes synchronized with time server
node.
i. Cristian's Algorithm
In this method each node periodically sends a request message to the time server. When the time server receives the message, it responds with a message containing T, the current time of the server node.
Assume the clock time of the client is T0 when it sends the request and T1 when it receives the reply. T0 and T1 are measured using the same clock, so the best estimate of the one-way propagation time is (T1 − T0)/2.
When the reply is received at the client's node, its clock is readjusted to T + (T1 − T0)/2. There can be unpredictable variation in the message propagation time between the nodes, hence a single measurement of (T1 − T0)/2 is not always a good value to add to T for calculating the current time.
For this, several measurements of T1 − T0 are made, and measurements that exceed some threshold value are treated as unreliable and discarded. Of the remaining measurements, the minimum value is considered the most accurate, and half of it is added to T.
Advantage: It assumes that no additional information is available.
Disadvantage: It restricts the number of measurements for estimating the value.
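The calculation in Cristian's algorithm can be written as a one-line adjustment; the function name and example timestamps below are illustrative.

```python
def cristian_adjust(send_ts, recv_ts, server_time):
    """Cristian's algorithm: estimate the correct local time.

    send_ts (T0) and recv_ts (T1) are client-clock timestamps for the
    request and the reply; server_time (T) is the time carried in the
    server's reply. The estimated one-way delay is half the round trip,
    so the client sets its clock to T + (T1 - T0)/2.
    """
    round_trip = recv_ts - send_ts
    return server_time + round_trip / 2.0

# Example: request sent at T0 = 10.0 s, reply received at T1 = 10.4 s,
# server reported T = 100.0 s; the estimated propagation delay is 0.2 s,
# so the client's clock is set to approximately 100.2 s.
```

In practice, several round trips would be measured and the smallest round-trip time used, as described above.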
ii. Berkeley Algorithm
This is an active time server approach in which the time server periodically broadcasts its clock time and the other nodes receive the message to correct their own clocks.
In this algorithm the time server periodically sends a message to all the computers in the group. When this message is received, each computer sends back its own clock value to the time server. The time server has prior knowledge of the approximate time required for propagation of a message, which it uses to readjust the reported clock values. It then takes a fault-tolerant average of the clock values of all the computers. The calculated average is the current time to which all clocks should be readjusted.
The time server readjusts its own clock to this value and, instead of sending the current time to the other computers, it sends each computer the amount of time it needs for readjustment. This can be a positive or negative value and is calculated based on the knowledge the time server has about message propagation.
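A minimal sketch of the averaging step, assuming a plain average stands in for the fault-tolerant average described above (function and key names are illustrative):

```python
def berkeley_adjustments(server_clock, client_clocks, propagation=0.0):
    """Compute the signed correction each machine must apply.

    The server polls every machine, compensates each reported value
    by the known propagation delay, averages all clocks (its own
    included), and returns per-machine corrections rather than the
    current time itself."""
    compensated = [c + propagation for c in client_clocks]
    all_clocks = [server_clock] + compensated
    avg = sum(all_clocks) / len(all_clocks)
    corrections = {"server": avg - server_clock}
    for i, c in enumerate(compensated):
        corrections[f"client{i}"] = avg - c   # may be negative
    return corrections
```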
iii. Global averaging distributed algorithms
In this approach the clock process at each node broadcasts its local clock time in the form of a “resync” message at the beginning of every fixed-length resynchronization interval. This is done when its local time equals T0 + iR for some integer i, where T0 is a fixed time agreed on by all nodes and R is a system parameter that depends on the total number of nodes in the system.
After broadcasting its clock value, the clock process of a node waits for a time T that is determined by the algorithm.
During this waiting period the clock process collects the resync messages and records the time at which each message is received, from which it estimates the skew once the waiting is done. It then computes a fault-tolerant average of the estimated skews and uses it to correct the clocks.
The global averaging algorithms do not scale, as they need a network that supports a broadcast facility, and a lot of message traffic is generated.
Localized averaging algorithms overcome these drawbacks: the nodes in the distributed system are logically arranged in a pattern such as a ring.
Each node exchanges its clock time with its neighbors and then sets its clock time to the average of its own clock time and those of its neighbors.
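The neighbor-averaging step can be sketched as follows, assuming a logical ring arrangement (the function name is illustrative):

```python
def ring_average_round(clocks):
    """One round of localized averaging on a logical ring.

    Each node exchanges its clock with its two ring neighbours and
    sets its clock to the average of its own value and its
    neighbours' values. Repeated rounds drive the clocks together."""
    n = len(clocks)
    return [
        (clocks[(i - 1) % n] + clocks[i] + clocks[(i + 1) % n]) / 3.0
        for i in range(n)
    ]
```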
************************
3) i) Explain Logical time and logical clocks. Describe the different types of logical clocks
used in a distributed computing environment. (8)
ii) Briefly explain the Vector Clock with rules that define this vector clock. (5)
ANSWER KEY:
Logical time
The logical time in distributed systems is used to maintain the consistent ordering of
events.
The concept of causality, i.e. the causal precedence relationship, is fundamental for
distributed systems. Usually, it is tracked using physical time, but physical clocks are hard
to maintain in a distributed system, so logical clocks are used instead.
What makes logical clocks different is that they are designed to maintain information about the order of events rather than to track the same notion of time as physical clocks.
Logical Clock
Logical Clocks refer to implementing a protocol on all machines within your distributed
system, so that the machines are able to maintain consistent ordering of events within
some virtual timespan.
A logical clock is a mechanism for capturing chronological and causal relationships in a distributed system.
Distributed systems may have no physically synchronous global clock, so a logical clock
allows global ordering on events from different processes in such systems.
The first implementation, the Lamport timestamps, was proposed by Leslie Lamport in
1978.
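Lamport's rules can be sketched as a small clock object (a minimal illustration of the commonly stated rules; class and method names are assumptions):

```python
class LamportClock:
    """Lamport timestamps (1978): increment before every local event;
    a send carries the new value; a receive takes the maximum of the
    local clock and the message timestamp, then increments."""
    def __init__(self):
        self.time = 0

    def tick(self):
        """Internal event: advance the local clock."""
        self.time += 1
        return self.time

    def send(self):
        """Send event: tick, then stamp the outgoing message."""
        return self.tick()

    def receive(self, msg_time):
        """Receive event: adopt max(local, message time), then tick."""
        self.time = max(self.time, msg_time)
        return self.tick()
```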
Example:
When we go out, we make a plan of which place to visit first, which second, and so on. We do not go to the second place first and then the first place; we follow the order planned beforehand.
In a similar way, the operations on our computers should be performed one by one, in an organized way.
**********************
4) Explain in detail the three major message ordering paradigms. Give an example for
each one of them. (13)
ANSWER KEY:
i) non-FIFO
ii) FIFO
iii) causal order
iv) synchronous order
Fig. 3.1 Hierarchy of execution classes (a) Venn diagram (b) Example execution
This is because synchronous order offers the most simplicity due to the restricted number
of possibilities, whereas non-FIFO order offers the greatest difficulties because it admits
a much larger set of possibilities that the developer and verifier need to account for.
(a) An A-execution that is not a FIFO execution (b) An A-execution that is also a FIFO
execution
(ii) FIFO executions
A FIFO execution is an A-execution in which, for all (s, r) and (s1, r1) ∈ T, (s ∼ s1 and r ∼ r1 and s ≺ s1) ⟹ r ≺ r1.
On any logical link in the system, messages are necessarily delivered in the order in which they are sent.
To implement FIFO over a non-FIFO link, a {seq_num, conn_id} pair is attached to each message, and the receiver uses a buffer to order the incoming messages.
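The sequence-number scheme can be sketched as a simple receiver (an illustrative simplification; names are assumptions):

```python
class FifoReceiver:
    """Deliver messages in FIFO order over a non-FIFO channel.

    The sender stamps each message with (conn_id, seq_num); the
    receiver buffers out-of-order arrivals and releases only the
    'next' expected sequence number per connection."""
    def __init__(self):
        self.expected = {}   # conn_id -> next seq number to deliver
        self.buffer = {}     # conn_id -> {seq_num: payload}

    def arrive(self, conn_id, seq_num, payload):
        """Accept one (possibly out-of-order) arrival; return the
        payloads that become deliverable, in sequence order."""
        self.buffer.setdefault(conn_id, {})[seq_num] = payload
        nxt = self.expected.get(conn_id, 0)
        delivered = []
        while nxt in self.buffer[conn_id]:
            delivered.append(self.buffer[conn_id].pop(nxt))
            nxt += 1
        self.expected[conn_id] = nxt
        return delivered
```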
(iii) Causally ordered (CO) executions
If two send events s and s1 are related by the causality ordering (not physical time ordering), then a causally ordered execution requires that their corresponding receive events r and r1 occur in the same order at all common destinations.
If s and s1 are not related by causality, then CO is vacuously satisfied.
Examples
Fig. 3.2 Illustration of causally ordered executions.
(a) Violates CO because s1 ≺ s3 and at the common destination P1, we have r3 ≺ r1.
(b) Satisfies CO. Only s1 and s2 are related by causality, but the destinations of the
corresponding messages are different.
(c) Satisfies CO. No send events are related by causality.
(d) Satisfies CO. s2 and s1 are related by causality but the destinations of the
corresponding messages are different; similarly for s2 and s3.
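The CO condition can be checked mechanically on a finished execution; the sketch below uses the data of Fig. 3.2(a) as its test case (the data representation is an assumption for illustration):

```python
def co_violations(causal_pairs, deliveries):
    """Find causal-order violations in a recorded execution.

    causal_pairs: set of (m_a, m_b) meaning send(m_a) causally
                  precedes send(m_b).
    deliveries:   dict mapping each process to the list of messages
                  it delivered, in delivery order.
    A pair violates CO if both messages reach a common destination
    and the causally later message is delivered first there.
    Unrelated sends impose no constraint (CO vacuously holds)."""
    bad = []
    for (a, b) in causal_pairs:
        for proc, order in deliveries.items():
            if a in order and b in order and order.index(b) < order.index(a):
                bad.append((a, b, proc))
    return bad
```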
Example:
An execution (E, ≺) is an empty-interval (EI) execution if for each pair of events (s, r)
∈ T, the open interval set {x ∈ E | s ≺ x ≺ r} in the partial order is empty.
Example:
This holds for all messages in the execution; hence, the execution is EI.
Example
Causality in a synchronous execution
The synchronous causality relation << on E is the smallest transitive relation that satisfies the following:
S1: If x occurs before y at the same process, then x << y.
S2: If (s, r) ∈ T, then for all x ∈ E, [(x << s ⟺ x << r) and (s << x ⟺ r << x)].
S3: If x << y and y << z, then x << z.
************************
5) What is group communication? What are the Key areas of applications of group
communication? Explain the programming model for group communication. (13)
ANSWER KEY
Group communication
This technique is mainly used to address the problem of a high workload on the host system and of redundant information from processes in the system. Multicasting can significantly decrease the time taken for message handling.
iii) Unicast communication: process P1 communicates with only process P3.
This occurs when the host process communicates with a single process in the distributed system at a time, although the same information may be passed to multiple processes. This works best for two communicating processes, since the sender only has to address one specific process.
An object group is a collection of objects that process the same set of invocations
concurrently.
Client objects invoke operations on a single, local object, which acts as a proxy for the
group.
The proxy uses a group communication system to send the invocations to the members of
the object group.
Object parameters and results are marshalled as in RMI and the associated calls are
dispatched automatically to the right destination objects/methods.
A group is said to be closed if only members of the group may multicast to it. A process in a
closed group delivers to itself any message that it multicasts to the group.
Key areas of applications of group communication
Group Communication is an important building block for distributed systems, with key
areas of applications including:
The reliable dissemination of information to potentially large numbers of clients.
Support for collaborative applications.
Group communication issues only one multicast operation to send a message to each of a
group of processes instead of issuing multiple send operations to individual processes.
The use of a single multicast operation instead of multiple send operations enables the
implementation to be efficient in its utilization of bandwidth.
The implementation can also minimize the total time taken to deliver the message to all
destinations, as compared with transmitting it separately and serially.
The ordering property of messages is responsible for controlling the sequence in which messages are delivered. Some types of message ordering:
i) No order: messages are sent to the group without concern for ordering.
ii) FIFO ordering: messages are delivered in the order in which they were sent.
iii) Causal ordering: a message that is sent after another message has been received is delivered after that message.
iv) Total ordering: all group members receive all messages in the same order.
i) FIFO ordering: If a process sends one message before another, it will be delivered in this
order at all processes in the group.
ii) Causal ordering: If a message happens before another message in the distributed system
this so-called causal relationship will be preserved in the delivery of the associated
messages at all processes.
iii) Total ordering: In total ordering, if a message is delivered before another message at one
process, then the same order will be preserved at all processes.
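Total ordering is often implemented with a sequencer process, sketched below under that assumption (one common approach, not the only one; names are illustrative):

```python
class Sequencer:
    """Every multicast is routed through the sequencer, which assigns
    a global sequence number; because all members deliver messages in
    sequence-number order, they all see the same total order."""
    def __init__(self):
        self.next_seq = 0

    def assign(self, msg):
        """Stamp one message with the next global sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        return (seq, msg)

def deliver_in_total_order(stamped):
    """Each member sorts by sequence number before delivering."""
    return [msg for _, msg in sorted(stamped)]
```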
b) Non-FIFO (N-FIFO): a channel acts like a set into which the sender process adds messages
and from which the receiver removes messages in random order.
c) Causal Ordering (CO): it follows Lamport’s causality (happens-before) relation. The relation
between the three models is given by CO ⊂ FIFO ⊂ N-FIFO.
**********************
1) i) List and describe ANY SIX major design issues of Distributed System. (6)
ANSWER KEY:
DESIGN ISSUES
Primary issues in the design of the distributed systems included providing access to remote data in the
face of failures, file system design, and directory structure design.
Below we describe the important design issues and challenges after categorizing them as
(i) having a greater component related to systems design and operating systems design, or
(ii) having a greater component related to algorithm design, or
(iii) emerging from recent technology advances and/or driven by new applications.
For example, the current practice of distributed computing follows the client–server architecture
to a large degree, whereas that receives scant attention in the theoretical distributed algorithms
community.
First,
an overwhelming number of applications outside the scientific computing community of users of
distributed systems are business applications for which simple models are adequate.
Second,
the state of the practice is largely controlled by industry standards, which do not necessarily
choose the “technically best” solution.
The fundamental issue in the design and implementation of a DSM system is data consistency, which may be violated by simultaneous (concurrent) access.
To solve this problem, DSM systems utilize synchronization primitives such as semaphores, event counts, and so on.
Performance is an important issue and challenge of a distributed software system (DSS). To minimize the constraints, and thus the challenges, the problems must be discussed and solutions provided.
In distributed software systems, different task scheduling algorithms have been developed.
These algorithms should be evaluated on the available task evaluation parameters for a specific task graph, which ultimately represents the DSS.
The best-performing algorithm should then be adopted. This approach minimizes the challenges of a DSS.
The following are some of the major design issues of distributed systems:
i) Heterogeneity:
Heterogeneity applies to the network, computer hardware, operating systems, and
implementations by different developers.
A key component of the heterogeneous distributed system client-server environment is
middleware.
Middleware is a set of services that enables applications and end-user to interact with each
other across a heterogeneous distributed system.
ii) Openness:
The openness of the distributed system is determined primarily by the degree to which new
resource-sharing services can be made available to the users.
Open systems are characterized by the fact that their key interfaces are published.
It is based on a uniform communication mechanism and published interface for access to shared
resources.
It can be constructed from heterogeneous hardware and software.
iii) Scalability:
The system should remain efficient even with a significant increase in the number of users
and resources connected.
It shouldn’t matter whether a program has 10 or 100 nodes; performance should not vary.
A distributed system’s scaling requires consideration of a number of elements, including size,
geography, and management.
iv) Security:
The security of an information system has three components: confidentiality, integrity, and
availability.
Encryption protects shared resources and keeps sensitive information secret when
transmitted.
v) Failure Handling:
When faults occur in hardware or in a software program, it may produce incorrect results
or may stop before completing the intended computation, so corrective measures should
be implemented to handle such cases.
Failure handling is difficult in distributed systems because failure is partial, i.e., some
components fail while others continue to function.
vi) Concurrency:
There is a possibility that several clients will attempt to access a shared resource at the same
time. Multiple users make requests on the same resources, i.e. read, write, and update.
Each resource must be safe in a concurrent environment. Any object that represents a shared
resource in a distributed system must ensure that it operates correctly in a concurrent
environment.
vii) Transparency:
Transparency ensures that the distributed system is perceived as a single entity by users
and application programmers rather than as a collection of cooperating autonomous systems.
The user should be unaware of where the services are located, and the transfer from a local
machine to a remote one should be transparent.
The performance of distributed systems is affected by broadcast/multicast processing, and a
delivery procedure that completes the processing in minimum time needs to be developed.
*************
ANSWER KEY:
7 ii) CHALLENGES
Distributed system design starts by addressing these challenges from a system-building
perspective.
a) Security,
b) Maintaining consistency of data in every system,
c) Network Latency between systems,
d) Resource Allocation, or
e) Proper node balancing across multiple nodes
iv) dynamic reconfiguration with nodes as well as objects joining and leaving the network randomly;
v) replication strategies to expedite object search;
vi) tradeoffs between object size latency and table sizes;
vii) anonymity,
viii) privacy, and
ix) security.
ii) Processes
Some of the issues involved are:
i) management of processes and threads at clients/servers; code migration; and
ii) the design of software and mobile agents.
iii) Naming
Devising easy to use and robust schemes for names, identifiers, and addresses is essential for locating
resources and processes in a transparent and scalable manner.
Naming in mobile systems provides additional challenges because naming cannot easily be tied to
any static geographical topology.
iv) Synchronization
Mechanisms for synchronization or coordination among the processes are essential.
Mutual exclusion is the classical example of synchronization, but many other forms of
synchronization, such as leader election, are also needed.
In addition, synchronizing physical clocks, and devising logical clocks that capture the essence of
the passage of time, as well as global state recording algorithms, all require different forms of
synchronization.
viii) Security
Distributed systems security involves various aspects of cryptography, secure channels, access
control, key management – generation and distribution, authorization, and secure group
management.
ix) Applications Programming Interface (API) and transparency
The API for communication and other specialized services is important for the ease of use and
wider adoption of the distributed systems services by non-technical users.
ii) Location transparency : makes the locations of resources transparent to the users.
iv) Replication transparency: does not let the user become aware of any replication.
v) Concurrency transparency: deals with masking the concurrent use of shared resources for the user.
vi) Failure transparency: refers to the system being reliable and fault-tolerant.
The interleaving model and partial order model are two widely adopted models of distributed
system executions.
They have proved to be particularly useful for operational reasoning and the design of distributed
algorithms.
The input/output automata model and the TLA (temporal logic of actions) are two other examples
of models that provide different degrees of infrastructure for reasoning more formally with and
proving the correctness of distributed programs.
The processes must be allowed to execute concurrently, except when they need to synchronize to
exchange information, i.e., communicate about shared data.
Synchronization is essential for the distributed processes to overcome the limited observation of
the system state from the viewpoint of any one process.
The synchronization mechanisms can also be viewed as resource management and concurrency
management mechanisms to streamline the behavior of the processes that would otherwise act
independently.
Designing mechanisms to achieve these design and verification goals is a challenge.
vii) Performance
Although high throughput is not the primary goal of using a distributed system, achieving good
performance is important. In large distributed systems, network latency (propagation and
transmission times) and access to shared resources can lead to large delays which must be
minimized.
The user-perceived turn-around time is very important.
The following are some example issues that arise in determining the performance:
• Metrics
Appropriate metrics must be defined or identified for measuring the performance of theoretical
distributed algorithms, as well as for implementations of such algorithms.
The former would involve various complexity measures on the metrics, whereas the latter would
involve various system and statistical metrics.
• Measurement methods/tools
As a real distributed system is a complex entity and has to deal with all the difficulties that arise
in measuring performance over a WAN/the Internet, appropriate methodologies and tools must be
developed for measuring the performance metrics.
*******************
16) b) i) Describe in detail the model of Distributed Computing. (7)
ANSWER KEY:
The actions are atomic and the actions of a process are modeled as three types of events:
i) internal events,
ii) message send events, and
iii) message receive events.
Let e_i^x denote the xth event at process p_i.
For a message m, let send(m) and rec(m) denote its send and receive events, respectively.
The events at a process are linearly ordered by their order of occurrence.
The execution of process p_i produces a sequence of events e_i^1, e_i^2, ..., and is denoted by H_i, where
H_i = (h_i, →_i)
h_i is the set of events produced by p_i, and →_i is the binary relation defining the linear order on these events. The relation →_msg indicates the dependency that exists due to message passing between two events.
Occurrence of events
The occurrence of events changes the states of the respective processes and channels, thus
causing transitions in the global system state.
A send event changes the state of the process that sends the message and the state of the
channel on which the message is sent.
A receive event changes the state of the process that receives the message and the state of the
channel on which the message is received.
Fig: Space time distribution of distributed systems
This term was coined by Lamport. Happens-before defines a partial order of events in a
distributed system; some events cannot be placed in the order. We say A → B if A happens
before B. A → B is defined using the following rules:
Local ordering: A and B occur on the same process and A occurs before B.
Consider two events c and d; if both c → d and d → c are false, they are not causally related, and
c and d are said to be concurrent events, denoted c || d.
Fig 1 shows the communication of messages m1 and m2 between three processes p1, p2 and p3.
a, b, c, d, e and f are events.
It can be inferred from the diagram that a → b; c → d; e → f; b → c; d → f; a → d; a → f; b → d; b → f.
Also a || e and c || e are concurrent events.
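Causal precedence and concurrency can be tested mechanically using vector timestamps, sketched below (vector clocks are one standard mechanism for this; the list representation is an assumption):

```python
def happened_before(va, vb):
    """Vector-timestamp test for Lamport's happens-before:
    a -> b iff V(a) <= V(b) componentwise and V(a) != V(b)."""
    return all(x <= y for x, y in zip(va, vb)) and va != vb

def concurrent(va, vb):
    """c || d iff neither c -> d nor d -> c."""
    return not happened_before(va, vb) and not happened_before(vb, va)
```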
************************
PART – C ( 15 MARKS)
16) ii) Explain global states and consistent cuts with example. (8)
ANSWER KEY
Global state
The global state of a distributed system is a collection of the local states of its
components, namely, the processes and the communication channels.
The state of a process at any time is defined by the contents of processor registers,
stacks, local memory, etc. and depends on the local context of the distributed application.
The state of a channel is given by the set of messages in transit in the channel.
Example:
Let LS_i^x denote the state of process p_i after the occurrence of event e_i^x and before the
event e_i^(x+1).
A global state GS1 consisting of local states {LS_1^1, LS_2^3, LS_3^3, LS_4^2} is
inconsistent because the state of p2 has recorded the receipt of message m12, whereas
the state of p1 has not recorded its send.
Cuts of a Distributed Computation
In the space–time diagram of a distributed computation, a zigzag line joining one arbitrary
point on each process line is termed a cut in the computation.
Such a line slices the space–time diagram, and thus the set of events in the distributed
computation, into a PAST and a FUTURE.
The PAST contains all the events to the left of the cut and the FUTURE contains all the
events to the right of the cut.
For a cut C, let PAST(C) and FUTURE(C) denote the set of events in the PAST and
FUTURE of C, respectively.
Every cut corresponds to a global state and every global state can be graphically
represented as a cut in the computation’s space–time diagram.
A cut C = {c1, c2, c3, …} is consistent if for all sites there are no events ei and ej such that
(ei --> ej) and (ej --> cj) and (ei -/-> ci).
A consistent global state corresponds to a cut in which every message received in the
PAST of the cut was sent in the PAST of that cut. Such a cut is known as a consistent cut.
A consistent cut obeys causality and any run of the Chandy-Lamport Global Snapshot
algorithm creates a consistent cut.
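The consistent-cut property can be checked mechanically, sketched below under an assumed representation of events and messages:

```python
def is_consistent_cut(past_events, messages):
    """Check the consistent-cut property: every message whose receive
    lies in the PAST of the cut must also have its send in the PAST.

    past_events: set of events in the PAST of the cut.
    messages:    iterable of (send_event, receive_event) pairs."""
    return all(s in past_events for (s, r) in messages if r in past_events)
```

A message whose send is in the PAST but whose receive is in the FUTURE is simply in transit across the cut and does not violate consistency, which the check above reflects.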
****************************