
UNIT I INTRODUCTION

Introduction: Definition – Relation to computer system components – Motivation – Relation to parallel systems – Message-passing systems versus shared memory systems – Primitives for distributed communication – Synchronous versus asynchronous executions – Design issues and challenges. A model of distributed computations: A distributed program – A model of distributed executions – Models of communication networks – Global state – Cuts – Past and future cones of an event – Models of process communications. Logical Time: A framework for a system of logical clocks – Scalar time – Vector time – Physical clock synchronization: NTP.

Definition

A collection of independent computers that appears to the users of the system as a single coherent
computer.

Features of distributed systems

• No common physical clock
• No shared memory
• Geographical separation
• Autonomy and heterogeneity

Differences between centralized and distributed systems

Centralized systems:
• Have non-autonomous components.
• Are built using homogeneous components.
• Have a single point of control and a single point of failure.

Distributed systems:
• Have autonomous components.
• Are built using heterogeneous components.
• Have multiple points of control and multiple points of failure.

Explain Relation to computer system components

Relation to computer system components

• A typical distributed system is shown in Figure 1.1.
• Each computer has a memory-processing unit, and the computers are connected by a communication network.
• Figure 1.2 shows the relationships of the software components that run on each of the computers and use the local operating system and network protocol stack for functioning.
• The distributed software is also termed middleware.
• A distributed execution is the execution of processes across the distributed system. An execution is also sometimes termed a computation or a run.
• Middleware is the distributed software that drives the distributed system while providing transparency of heterogeneity at the platform level.
• Examples of middleware standards:
1. The Object Management Group's (OMG) Common Object Request Broker Architecture (CORBA)
2. Remote Procedure Call (RPC)
3. Message Passing Interface (MPI)

Motivation for Distributed Systems

• Inherently distributed computations
• Resource sharing
• Access to geographically remote data and resources
• Enhanced reliability
• Increased performance/cost ratio
• Scalability
Explain in detail about Parallel Systems with a neat example

Relation to Parallel Systems

A parallel system is a system in which multiple processors have direct access to a shared memory that forms a common address space.

Characteristics of parallel systems

Multiprocessor System

• A multiprocessor system is a parallel system.
• The multiple processors have direct access to a shared memory that forms a common address space.
• A multiprocessor is a set of processors connected by a communication network.
• There are two standard architectures for parallel systems:
(a) Uniform memory access (UMA) multiprocessors
(b) Non-uniform memory access (NUMA) multiprocessors

Omega network

• An Omega network is a network configuration used in parallel computing architectures.
• It is a multistage interconnection network: the outputs of each stage are connected to the inputs of the next stage.
• A multistage Omega network is formed from 2×2 switches; each 2×2 switch can route data arriving on either of its two input wires to either output.
• An Omega network connecting n processors with n memory units has n/2 switching elements of size 2×2 arranged in log₂n stages (a sketch of the wiring follows).
Butterfly network

• Unlike the Omega network, the interconnection pattern between a pair of adjacent stages of a butterfly network depends not only on n but also on the stage number s.
• In the recursive expression, there are M = n/2 switches per stage, and the stages are numbered s ∈ [0, log₂n − 1]; a sketch of one common wiring convention follows.
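One common formulation (a hedged sketch under this assumption, not necessarily the exact expression intended in the source text) connects switch x in stage s to switch x in stage s+1 (straight edge) and to the switch whose label differs from x in one bit position determined by s (cross edge):

```python
import math

def butterfly_neighbors(x, s, n):
    """Switches reachable in stage s+1 from switch x in stage s.
    Assumed convention (textbook conventions vary): a straight edge
    to switch x, and a cross edge to the switch whose label differs
    from x in bit position (log2(n/2) - 1 - s)."""
    m = n // 2                        # M = n/2 switches per stage
    bit = int(math.log2(m)) - 1 - s   # label bit flipped by the cross edge
    return x, x ^ (1 << bit)

# Example: n = 8 processors, M = 4 switches per stage
for s in range(2):                    # link levels between adjacent stages
    for x in range(4):
        straight, cross = butterfly_neighbors(x, s, 8)
        print(f"stage {s}: switch {x} -> {straight} (straight), {cross} (cross)")
```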

Multicomputer parallel System

A multicomputer parallel system is a parallel system in which the multiple processors do not have direct access to shared memory. The memory of the multiple processors may or may not form a common address space. Such computers usually do not have a common clock.
Figure 5(a) shows a wrap-around 4×4 mesh. A k×k wrap-around mesh contains k² processors, and the maximum path length between any two processors is 2⌊k/2⌋ (4 for the 4×4 mesh). Routing can be done along the Manhattan grid. Figure 5(b) shows a four-dimensional hypercube. A k-dimensional hypercube has 2^k processor-and-memory units. Each such unit is a node in the hypercube and has a unique k-bit label. Each of the k dimensions is associated with a bit position in the label.
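Because adjacent hypercube nodes differ in exactly one label bit, a message can be routed by fixing one differing bit at a time. A minimal Python sketch (the function name and node labels are illustrative):

```python
def hypercube_route(src, dst, k):
    """Return one shortest path from node src to node dst in a
    k-dimensional hypercube by fixing differing label bits one at a
    time (dimension-order routing). Nodes are k-bit integer labels."""
    path = [src]
    node = src
    for bit in range(k):
        if (node ^ dst) & (1 << bit):  # labels differ in this dimension
            node ^= 1 << bit           # traverse the link along that dimension
            path.append(node)
    return path

# Example: 4-dimensional hypercube (16 nodes), route 0b0000 -> 0b1011
print(hypercube_route(0b0000, 0b1011, 4))  # [0, 1, 3, 11]
```

The path length equals the Hamming distance between the two labels, which is why it is a shortest path.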

Array processor

• An array processor uses multiple synchronized arithmetic logic units to achieve spatial parallelism. It is also called a synchronous multiprocessor.
• A multiprocessor system consists of several processors, some of which may be I/O processors, a common fast-access local data store, and a common slow-access main store.
• These components are interconnected by a common bus that carries data and control information.

Explain Flynn’s Taxonomy in detail

Flynn’s taxonomy
• Flynn's taxonomy is a classification of parallel computer architectures based on the number of concurrent instruction streams and data streams available in the architecture.
• Based on the number of instruction streams and data streams, Flynn's taxonomy defines the following categories:
• Single instruction, single data stream (SISD)
• Multiple instruction, single data stream (MISD)
• Single instruction, multiple data stream (SIMD)
• Multiple instruction, multiple data stream (MIMD)

Single instruction, single data stream (SISD)

• An SISD computing system is a uniprocessor machine capable of executing a single instruction stream operating on a single data stream.
• In SISD, machine instructions are processed in a sequential manner.

Multiple instruction, single data stream (MISD)

An MISD system is a multiprocessor machine capable of executing different instructions on all of its CPUs while operating on the same data set.

Single instruction, multiple data stream (SIMD)

An SIMD system is a multiprocessor machine capable of executing the same instruction on all of its CPUs while operating on different data streams.

Multiple instruction, multiple data stream (MIMD)

An MIMD system is a multiprocessor machine capable of executing multiple instructions on multiple data sets. Each processor in the MIMD model has separate instruction and data streams.

Define the terms coupling, parallelism, concurrency, and granularity

Coupling

The degree of coupling among a set of modules, whether hardware or software, is measured in terms of
the interdependency and binding and/or homogeneity among the modules.

Parallelism or speedup of a program on a specific system

This is a measure of the relative speedup of a specific program on a given machine. The speedup depends on the number of processors and the mapping of the code to the processors. It is expressed as the ratio of the time T(1) taken with a single processor to the time T(n) taken with n processors.
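As a worked example (with illustrative numbers, not from the source):

```latex
S(n) = \frac{T(1)}{T(n)}, \qquad
\text{e.g. } T(1) = 100\,\text{s},\; T(4) = 40\,\text{s}
\;\Rightarrow\; S(4) = \frac{100}{40} = 2.5
```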

Concurrency of a program
The parallelism/concurrency in a parallel/distributed program can be measured by the ratio of the number
of local (non-communication and non-shared memory access) operations to the total number of
operations, including the communication or shared memory access operations.

Granularity of a program

The ratio of the amount of computation to the amount of communication within the parallel/distributed
program is termed as granularity.

Explain message-passing systems versus shared memory systems.

Message passing systems:

• Message passing allows multiple processes to read from and write to a message queue without being directly connected to each other.
• Messages are stored in the queue until their recipient retrieves them.
• Message queues are quite useful for interprocess communication and are used by most operating systems.

Shared memory systems:

• Shared memory is memory that can be simultaneously accessed by multiple processes, so that the processes can communicate with each other.
• Semaphores and monitors are common synchronization mechanisms on shared memory systems; a short sketch contrasting the two models follows.
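A minimal illustration of the two models, using Python's standard multiprocessing module (the function and variable names are illustrative):

```python
from multiprocessing import Process, Queue, Value

def producer(q):
    q.put("hello")                 # message passing: enqueue a message

def consumer(q):
    print(q.get())                 # blocks until a message is available

def incrementer(counter):
    with counter.get_lock():       # shared memory: synchronize via a lock
        counter.value += 1

if __name__ == "__main__":
    # Message passing: the processes communicate only through the queue.
    q = Queue()
    ps = [Process(target=producer, args=(q,)),
          Process(target=consumer, args=(q,))]
    for p in ps: p.start()
    for p in ps: p.join()

    # Shared memory: the processes communicate through a shared counter.
    counter = Value("i", 0)        # shared integer with an associated lock
    ws = [Process(target=incrementer, args=(counter,)) for _ in range(4)]
    for w in ws: w.start()
    for w in ws: w.join()
    print(counter.value)           # 4
```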
Explain primitives for distributed communication.

• The message send and message receive communication primitives are denoted Send() and Receive(), respectively.
• A Send primitive has two parameters: the destination, and the buffer in the user space that holds the data to be sent.
• The Receive primitive also has two parameters: the source from which the data is to be received, and the user buffer into which the data is to be received.

There are two ways of sending data when the Send primitive is invoked:

Buffered: The standard option copies the data from the user buffer to the kernel buffer. The data later gets
copied from the kernel buffer onto the network.

Unbuffered: The data gets copied directly from the user buffer onto the network.

The sender and receiver can be blocking or non-blocking. Three combinations are commonly used:

1. Blocking send, blocking receive
2. Non-blocking send, blocking receive
3. Non-blocking send, non-blocking receive

1. Blocking send, blocking receive

Both the sender and the receiver are blocked until the message is delivered.

2. Non-blocking send, blocking receive

The sender may continue on; the receiver is blocked until the requested message arrives.

3. Non-blocking send, non-blocking receive

The sending process sends the message and resumes its operation; the receiving process proceeds without waiting and picks up the message when it becomes available. A socket-based sketch of blocking versus non-blocking receives follows.
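For a concrete feel, a receive can be made blocking or non-blocking on the same API. A sketch using Python sockets (the address and buffer size are illustrative):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 9999))

# Blocking receive: the call does not return until data arrives.
sock.setblocking(True)
# data, addr = sock.recvfrom(4096)   # would wait here indefinitely

# Non-blocking receive: the call returns immediately; if no data is
# queued it raises BlockingIOError instead of waiting.
sock.setblocking(False)
try:
    data, addr = sock.recvfrom(4096)
except BlockingIOError:
    data = None  # nothing has arrived yet; the process can do other work
```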


Explain the design issues and challenges of distributed system.

Design issues and challenges of distributed systems are as follows:

• Heterogeneity
• Openness
• Security
• Scalability
• Failure handling
• Concurrency
• Transparency

Heterogeneity

Heterogeneity refers to the diversity of a distributed system in terms of hardware, software platforms, networks, etc.

Hardware devices: computers, tablets, mobile phones, embedded devices, etc.
Operating systems: MS Windows, Linux, Mac OS, Unix, etc.
Networks: local networks, the Internet, wireless networks, satellite links, etc.
Programming languages: Java, C/C++, Python, PHP, etc.
Middleware: the term middleware applies to a software layer that provides a programming abstraction as well as masking the heterogeneity of the underlying platform.

Heterogeneity and mobile code : The term mobile code is used to refer to program code that can be
transferred from one computer to another and run at the destination – Java applets are an example.

Openness

• Distributed systems can be easily extended and modified.
• New systems can be integrated with existing systems.
• Openness is concerned with the extension and improvement of distributed systems.
• Openness and flexibility are among the most important features of distributed systems:
1. Every service is easily accessible to every client.
2. It is easy to implement, install, and debug new services.
3. Users can write and install their own services.
Security

• Security becomes more important in distributed systems.
• Security for information resources has three components:
Confidentiality: protection against disclosure to unauthorized individuals.
Integrity: protection against alteration or corruption.
Availability: protection against interference with the means to access the resources.
• Security challenges that are not yet fully met:
Denial of service attacks: attempts to make a computer or network resource unavailable to its intended users.
Security of mobile code: mobile code needs to be handled with care.

Scalability
• A system is described as scalable if it remains effective when there is a significant increase in the number of resources and the number of users.
• The design of scalable distributed systems presents the following challenges:
o Controlling the cost of resources.
o Controlling the performance loss.
o Avoiding performance bottlenecks.
Scalability has three dimensions:
Size: the number of users and resources to be processed.
Geography: the distance between users and resources.
Administration: as the size of a distributed system increases, many of its components need to be controlled.

Failure Handling
• Failure handling is difficult in distributed systems because failures are partial: some components fail while others continue to function.
• Techniques for dealing with failures:
Detecting failures: not all failures can be detected, but some can. For example, corrupted data in a file can be detected using a checksum.
Masking failures: failures are hidden or made less severe. For example, messages are retransmitted.
Tolerating failures: in the Internet, clients can be designed to tolerate failures.
Recovering from failures: if a server crashes, it rolls back to a previous state.
Redundancy: services can be made to tolerate failures. For example, a database can be replicated on several servers.

Concurrency
• Distributed systems are usually multi-user environments.
• Several clients may access a shared resource at the same time.

Transparency:
• Transparency is the concealment of the details of the distributed system from the user.
• Distributed system designers must hide the details of the distributed system's resources.
• Some of the transparencies in distributed systems are:

Access: hide how a resource is accessed.
Location: hide where a resource is located.
Migration: hide that a resource may move to another location.
Relocation: hide that a resource may be moved to another location while in use.
Replication: hide that a resource may be copied in several places.
Concurrency: hide that a resource may be shared by several competing users.
Failure: hide the failure and recovery of a resource.

Brief about the challenges of Distributed systems from a system perspective.

1. Communication

This task involves designing suitable communication mechanisms among the various processes in the network.
Examples: RPC, RMI

2. Processes
The main challenges involved are: process and thread management at both client and server environments,
migration of code between systems, design of software and mobile agents.

3. Naming
Devising easy to use and robust schemes for names, identifiers, and addresses is essential for locating
resources and processes in a transparent and scalable manner.

4. Synchronization
Mutual exclusion, leader election, deploying physical clocks, global state recording are some
synchronization mechanisms.

5. Data storage and access schemes

Designing file systems for easy and efficient data storage, with implicit accessing mechanisms, is essential for distributed operation.

6. Consistency and replication

The notion of distributed systems goes hand in hand with replication of data, which provides a high degree of scalability. Replicas must be handled with care, since data consistency is a prime issue.

7. Fault tolerance
• Fault tolerance requires maintaining fail-proof links, nodes, and processes.
• Some common fault-tolerance techniques are resilience, reliable communication, distributed commit, checkpointing and recovery, agreement and consensus, failure detection, and self-stabilization.

8. Security
Cryptography, secure channels, access control, key management (generation and distribution), authorization, and secure group management are some of the security measures imposed on distributed systems.

9. Applications Programming Interface (API) and transparency

The API for the distributed system's services should hide internal details and preserve the transparency goals listed earlier: access, location, migration, relocation, replication, concurrency, and failure transparency.

10. Scalability and modularity

The algorithms, data, and services must be as distributed as possible. Techniques such as replication, caching and cache management, and asynchronous processing help to achieve scalability.

Explain the Model of distributed computations.

A Distributed Program

• A distributed program is composed of a set of asynchronous processes that communicate by message passing over a communication network.
• Each process may run on a different processor.
• The processes do not share a global memory.
• Process execution and message transfer are asynchronous: a process that sends a message does not wait for the delivery of the message to be complete.

A model of distributed executions

• The execution of a process consists of a sequential execution of its actions.
• The actions of a process are modeled as three types of events: internal events, message send events, and message receive events.
• A send event changes the state of the process that sends the message and the state of the channel on which the message is sent.
• A receive event changes the state of the process that receives the message and the state of the channel on which the message is received.

The distributed execution is depicted by a space–time diagram. Figure shows the space–time diagram of a
distributed execution involving three processes. A horizontal line represents the progress of the process; a
dot indicates an event; a slant arrow indicates a message transfer. The execution of an event takes a finite
amount of time. In this figure, for process p1, the second event is a message send event, the third event is
an internal event, and the fourth event is a message receive event.

Causal precedence relation

Causal message ordering is a partial ordering of messages in a distributed computing environment. It is the
delivery of messages to a process in the order in which they were transmitted to that process.

Happens-Before Relation

We say A → B if A happens before B. The relation A → B is defined using the following rules (a sketch of checking the relation follows the list):

• Local ordering: A and B occur on the same process and A occurs before B.
• Messages: send(m) → receive(m) for any message m.
• Transitivity: if A → B and B → C, then A → C.
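These rules can be checked mechanically: model events as nodes, the local-order and message rules as edges, and test reachability for transitivity. A minimal sketch (the event names and encoding are illustrative):

```python
# Each event is a node; edges encode the two base rules of "happens
# before". Reachability then gives the full (transitive) relation.
edges = {
    "p1_send": ["p1_internal", "p2_recv"],  # local order + message edge
    "p1_internal": [],
    "p2_recv": ["p2_send"],                 # local order on process p2
    "p2_send": [],
}

def happens_before(a, b, graph=edges):
    """True if event a causally precedes event b (a -> b)."""
    stack, seen = [a], set()
    while stack:
        e = stack.pop()
        for nxt in graph.get(e, []):
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

print(happens_before("p1_send", "p2_send"))      # True (via p2_recv)
print(happens_before("p1_internal", "p2_recv"))  # False: concurrent events
```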

Write short notes on Models of communication networks


The three main types of communication network models in distributed systems are:

• FIFO (first-in, first-out)
• Non-FIFO (N-FIFO)
• Causal ordering (CO)

FIFO (first-in, first-out)

Each channel acts as a FIFO message queue, and message ordering is preserved by the channel.

Non-FIFO (N-FIFO)

A channel acts like a set: the sender process adds messages to the set and the receiver removes messages from it in random order.

Causal Ordering (CO)

The “causal ordering” model is based on Lamport’s “happens before” relation. A system that supports the causal ordering model satisfies the following property: if send(m1) → send(m2), then every process that receives both messages receives m1 before m2.

Here A → B means that A happens before B, defined by the rules:

• Local ordering: A and B occur on the same process and A occurs before B.
• Messages: send(m) → receive(m) for any message m.

Write short notes on models of Process Communication

There are two basic models of process communication:

• Synchronous
• Asynchronous

Synchronous

• The sender process blocks until the message has been received by the receiver process.
• The sender process resumes only after the receiver process has accepted the message.
• The sender and receiver processes must synchronize to exchange a message.

Asynchronous

• This is non-blocking communication: the sender and the receiver do not synchronize to exchange a message.
• The sender process does not wait for the message to be delivered to the receiver process.

Explain Logical clock in detail. (or) Explain Scalar time and Vector time in detail.

Logical clock

Logical clocks are based on capturing chronological and causal relationships of processes and ordering
events based on these relationships.

Three types of logical clocks are maintained in distributed systems:

• Scalar clocks
• Vector clocks
• Matrix clocks

Scalar clock

• Scalar time was proposed by Lamport to order all the events in a distributed system.
• A Lamport logical clock is an incrementing counter maintained in each process.
• When a process receives a message, it resynchronizes its logical clock with the sender's clock, maintaining the causal relationship.

Lamport's algorithm is governed by the following rules (see the sketch after the list):

• All process counters start with the value 0.
• A process increments its counter for each event (internal event, message send, message receive) in that process.
• When a process sends a message, it includes its incremented counter value with the message.
• On receiving a message, the recipient's counter is updated as
max(receiver_counter, message_timestamp) + 1
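A minimal Python sketch of these rules (the class and method names are illustrative):

```python
class LamportClock:
    """Lamport scalar clock: one integer counter per process."""

    def __init__(self):
        self.counter = 0                     # rule 1: start at 0

    def internal_event(self):
        self.counter += 1                    # rule 2: tick on every event

    def send(self):
        self.counter += 1                    # tick, then piggyback the value
        return self.counter                  # timestamp carried by the message

    def receive(self, message_timestamp):
        # rule 4: merge with the sender's timestamp, then tick
        self.counter = max(self.counter, message_timestamp) + 1

# Example: p1 sends a message to p2
p1, p2 = LamportClock(), LamportClock()
ts = p1.send()        # p1.counter == 1
p2.internal_event()   # p2.counter == 1
p2.receive(ts)        # p2.counter == max(1, 1) + 1 == 2
```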

Basic properties of scalar time:

1. Consistency property:
Scalar clocks satisfy monotonicity: if event ei causally precedes event ej (ei → ej), then C(ei) < C(ej).

2. Total ordering:

Scalar clocks can be used to totally order the events in a distributed system, but different events may receive identical timestamps. Hence a tie-breaking mechanism is needed to order such events. The tie is broken by:

• linearly ordering the process identifiers;
• giving higher priority to the process with the lower identifier value.

3. Event counting

If the increment value is always 1, scalar time counts events: a timestamp of h for event e indicates that at least h − 1 events occurred before e.

Vector clock

• Vector clocks use a vector of counters instead of a single integer counter.
• The vector clock of a system with N processes is a vector of N counters, one counter per process.

Vector counters follow these update rules (see the sketch after the list):

• Initially, all counters are zero.
• Each time a process experiences an event, it increments its own counter in the vector by one.
• Each time a process sends a message, it includes a copy of its own (incremented) vector in the message.
• Each time a process receives a message, it increments its own counter in the vector by one and updates each element of its vector by taking the maximum of its own entry and the corresponding entry in the received vector.
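A minimal Python sketch of these rules (the class and method names are illustrative):

```python
class VectorClock:
    """Vector clock for a system of n processes; pid is this process's index."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n                    # rule 1: all counters start at 0

    def tick(self):
        self.vc[self.pid] += 1               # rule 2: increment own entry

    def send(self):
        self.tick()
        return list(self.vc)                 # rule 3: piggyback a copy

    def receive(self, received_vc):
        # rule 4: component-wise maximum, then increment own entry
        self.vc = [max(a, b) for a, b in zip(self.vc, received_vc)]
        self.vc[self.pid] += 1

# Example with two processes
p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
msg = p0.send()      # p0.vc == [1, 0]
p1.receive(msg)      # p1.vc == [1, 1]
```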
Basic Properties of Vector Clock

1. Isomorphism:
• If events in a distributed system are timestamped using a system of vector clocks, we have the following property: if two events x and y have timestamps vh and vk respectively, then x → y if and only if vh < vk, and x and y are concurrent if and only if vh and vk are incomparable.

2. Strong consistency

• The system of vector clocks is strongly consistent: by examining the vector timestamps of two events, we can determine whether the events are causally related.

3. Event counting

• If the increment value is always 1, the ith entry of a vector clock counts the number of events executed by process pi that are known at that clock.

Explain efficient implementations of vector clocks

1. Singhal–Kshemkalyani’s differential technique


1. In this technique, when a process pi sends a message to a process pj, it piggybacks only those entries of its vector clock that have changed since the last message sent to pj.
2. The technique works as follows: if entries i1, i2, ..., in1 of the vector clock at pi have changed to v1, v2, ..., vn1 respectively since the last message sent to pj, then process pi piggybacks a compressed timestamp of the form
{(i1, v1), (i2, v2), ..., (in1, vn1)}
on the next message to pj.
3. When pj receives this message, it updates its vector clock as follows (a sketch follows the list):
vtj[ik] = max(vtj[ik], vk) for k = 1, 2, ..., n1
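A hedged sketch of the differential technique (the bookkeeping names such as last_sent are illustrative, not from the source):

```python
class DifferentialVC:
    """Vector clock that piggybacks only the entries changed since the
    last message to each destination (Singhal-Kshemkalyani)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n
        self.last_sent = {}                  # dest -> copy of vc at last send

    def send(self, dest):
        self.vc[self.pid] += 1
        prev = self.last_sent.get(dest, [0] * len(self.vc))
        # piggyback only the entries that differ from what dest last saw
        diff = [(k, v) for k, (v, old) in enumerate(zip(self.vc, prev)) if v != old]
        self.last_sent[dest] = list(self.vc)
        return diff                          # compressed timestamp {(ik, vk)}

    def receive(self, diff):
        for k, v in diff:
            self.vc[k] = max(self.vc[k], v)  # vtj[ik] = max(vtj[ik], vk)
        self.vc[self.pid] += 1
```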
2. Fowler–Zwaenepoel’s direct-dependency technique

The Fowler–Zwaenepoel direct-dependency technique reduces the size of messages by transmitting only a scalar value in each message. Each process pi maintains a dependency vector Di:

1. Whenever an event occurs at pi: Di[i] := Di[i] + 1.
2. When a process pi sends a message to process pj, it piggybacks the updated value of Di[i] in the message.
3. When pi receives a message from pj with piggybacked value d, pi updates its dependency vector as follows: Di[j] := max{Di[j], d}.
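A minimal sketch of these rules (the class name is illustrative; note that only direct dependencies are recorded, so transitive dependencies must be reconstructed offline):

```python
class DependencyVector:
    """Fowler-Zwaenepoel direct dependencies: only a scalar travels
    with each message."""

    def __init__(self, pid, n):
        self.pid = pid
        self.d = [0] * n

    def event(self):
        self.d[self.pid] += 1                        # rule 1

    def send(self):
        self.event()
        return self.d[self.pid]                      # rule 2: one scalar

    def receive(self, sender, value):
        self.d[sender] = max(self.d[sender], value)  # rule 3
        self.event()                                 # the receive is an event
```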

How NTP is adopted in physical time synchronization? (or) Discuss about NTP.

Clock synchronization

Clock synchronization is the process of ensuring that physically distributed processors have a common
notion of time.
Basic terminologies

Time: The time of a clock in a machine p is given by the function Cp(t),where Cp(t)= t for a perfect clock.

Frequency: Frequency is the rate at which a clock progresses.

Offset: Clock offset is the difference between the time reported by a clock and the real time.

Skew: clock skew is defined as the difference between the times on two clocks.

Drift rate: the clock drift rate is the difference in precision between a reference clock and a physical clock.

Clocking Inaccuracies

• Physical clocks are synchronized to an accurate real-time standard like UTC (Coordinated Universal Time).
• Due to clock inaccuracy, a timer (clock) is said to be working within its specification if
1 − ρ ≤ dC/dt ≤ 1 + ρ,
where ρ is the maximum drift rate specified by the manufacturer.

Offset and delay estimation method

• The Network Time Protocol (NTP) is widely used for clock synchronization on the Internet.
• NTP is designed as a hierarchical tree of time servers.
• The primary server at the root synchronizes with UTC.
• The next level contains secondary servers, which act as a backup to the primary server.
• At the lowest level is the synchronization subnet, which has the clients.

Offset and delay estimation between processes from the same server

• The source node cannot accurately estimate the local time on the target node, due to varying message or network delays between the nodes.
• The figure shows how NTP timestamps are numbered and exchanged between peers A and B.
• Let T1, T2, T3, T4 be the values of the four most recent timestamps.
• Assume that clocks A and B are stable and running at the same speed.
• Let a = T1 − T3 and b = T2 − T4.
• If the network delay difference from A to B and from B to A, called the differential delay, is small, the clock offset θ and round-trip delay δ of B relative to A at time T4 are approximately given by

Clock offset θ = ((T1 − T3) + (T2 − T4)) / 2 = (a + b) / 2
Round-trip delay δ = (T1 − T3) − (T2 − T4) = a − b
Offset and delay estimation between processes from different servers

• A pair of servers in symmetric mode exchange pairs of timing messages (a computational sketch follows the list).
• Using the four timestamps Ti−3, Ti−2, Ti−1, Ti of the i-th message pair, the offset Oi can be estimated as:
Oi = ((Ti−2 − Ti−3) + (Ti−1 − Ti)) / 2
• The round-trip delay is estimated as:
Di = (Ti − Ti−3) − (Ti−1 − Ti−2)
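A minimal sketch of the offset/delay computation, using the standard NTP request/reply labeling (which may differ from the T1..T4 numbering in the figure above; the timestamps here are illustrative numbers, not measured values):

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Estimate clock offset (theta) and round-trip delay (delta) from
    the four timestamps of one NTP message exchange:
      t1: request sent (sender's clock)    t2: request received (peer's clock)
      t3: reply sent (peer's clock)        t4: reply received (sender's clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Illustrative exchange: the peer's clock runs 5 units ahead and each
# network hop takes 2 units.
theta, delta = ntp_offset_delay(t1=100, t2=107, t3=108, t4=105)
print(theta, delta)  # 5.0 4
```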

*************************************
