
Mrs. Sameera Begum (CSE Dept., DCET)

DECCAN COLLEGE OF ENGINEERING AND TECHNOLOGY


Dar-Us-Salam, Hyderabad

Department of Computer Science & Engineering

SUBJECT: ADVANCED OPERATING SYSTEMS-(PE 732 CS)

B.E (CSE) VII SEM(AICTE)-2023-2024

UNIT-I
Architectures of Distributed Systems:
• System Architecture Types
• Issues in Distributed Systems
Theoretical Foundations:
• Introduction
• Limitations of Distributed Systems
• Lamport’s Logical Clocks
• Vector Clocks
• Causal Ordering of Messages
• Global State Recording Algorithm
• Cuts of a Distributed Computation
• Termination Detection


UNIT I

Introduction:

The term distributed system is used to describe a system with the following characteristics:

• Several computers that do not share memory or a clock.

• The computers communicate with each other by exchanging messages over a communication network.

• Local and remote resources. (The resources owned and controlled by a computer are said to be local, while the resources owned and controlled by other computers, and those that can only be accessed through the network, are said to be remote.)

Advantages over traditional time-sharing systems:

Resource sharing. Since a computer can request a service from another computer
by sending an appropriate request to it over the communication network, hardware
and software resources can be shared among computers. For example, a printer, a
compiler, a text processor, or a database at a computer can be shared with remote
computers.
Enhanced performance. A distributed computing system is capable of providing
rapid response time and higher system throughput. This ability is mainly due to the
fact that many tasks can be concurrently executed at different computers.
Moreover, distributed systems can employ a load distributing technique to improve
response time. In load distributing, tasks at heavily loaded computers are


transferred to lightly loaded computers, thereby reducing the time tasks wait before
receiving service.
Improved reliability and availability. A distributed computing system provides
improved reliability and availability because a few components of the system can
fail without affecting the availability of the rest of the system. Also, through the
replication of data (e.g., files and directories) and services, distributed systems can
be made fault tolerant. Services are processes that provide functionality (e.g., a file
service provides file system management; a mail service provides an electronic
mail facility).
Modular expandability. Distributed computing systems are inherently amenable
to modular expansion because new hardware and software resources can be easily
added without replacing the existing resources.

System Architecture Types

Distributed systems are classified into three broad categories:

• Minicomputer model


In this model, the distributed system consists of several minicomputers (e.g., VAXs).

Each computer supports multiple users and provides access to remote resources.

The ratio of no. of processors to users is less than one.

• Workstation model

In this model, the distributed system consists of a number of workstations (up to several thousand).

Each user has a workstation at his disposal, where all of the user’s work is performed. With the help of a distributed file system, a user can access data regardless of its location.

The ratio of no. of processors to the no. of users is equal to one.

• Processor pool model

This model allocates one or more processors to each user according to need.

Once the task is completed, the processors are returned to the pool.

The ratio of no. of processors to users is normally greater than one.

Distributed Operating System

Definition:

Operating System – A program that manages the resources of a computer system and provides users with a friendly interface to the system.

Distributed OS (DOS) – Extends the concepts of resource management and user interface to a distributed system consisting of several autonomous computers connected through a communication network.

A DOS appears to its users as a centralized operating system for a single machine, but it runs on multiple independent computers.

It maintains TRANSPARENCY in the way the multiple processors communicate with each other and in the remote access of available resources.

Issues in Distributed Operating Systems

1. Global Knowledge
2. Naming
3. Scalability
4. Compatibility
5. Process Synchronization


6. Resource Management
7. Security
8. Structuring
9. Client-Server Model

• Global Knowledge

In the case of shared memory computer systems, the up-to-date state of all the
processes and resources, in other words, the global (entire) state of the system,
is completely and accurately known. Hence, the potentially problematic issues
that arise in the design of these systems are well understood and efficient
solutions to them exist.
In distributed computing systems, these same issues take on new dimensions and their solutions become much more complex, for the following reason: due to the unavailability of a global memory and a global clock, and due to unpredictable message delays, it is practically impossible for a computer to collect up-to-date information about the global state of the distributed computing system.

Therefore, a fundamental problem in the design of a distributed operating system is to determine efficient techniques to implement decentralized system-wide control, where a computer does not know the current and complete status of the global state.
• Naming

Names are used to refer to objects. Objects include computers, printers, services, files, and users. An example of a service is a name service.


A name service maps a logical name to a physical address. The mapping makes use of a table lookup or an algorithm. In the table-lookup implementation, tables (also known as directories) that store names and their physical addresses are used for mapping names to their addresses.

In distributed systems, these directories are either replicated or partitioned.

Demerits of replicated directories:

More storage is needed.

Synchronization requirements must be met when updates are done.

With partitioned directories, updates may be a problem, and resolving a name to its (IP) address can be difficult.
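The two directory organizations above can be sketched in a few lines. This is a minimal illustration (Python; all class and entry names are my own, not from the notes): a replicated directory keeps a full name-to-address table at every site, while a partitioned directory assigns each name to one site by hashing it.

```python
# Replicated vs. partitioned directories for a name service (sketch).

class ReplicatedDirectory:
    def __init__(self, n_sites, entries):
        # Every site stores the full table: more storage, and every
        # update must be applied synchronously at all copies.
        self.copies = [dict(entries) for _ in range(n_sites)]

    def resolve(self, site, name):
        return self.copies[site].get(name)   # purely local lookup

    def update(self, name, addr):
        for copy in self.copies:             # synchronized update
            copy[name] = addr

class PartitionedDirectory:
    def __init__(self, n_sites, entries):
        self.parts = [dict() for _ in range(n_sites)]
        for name, addr in entries.items():
            self.parts[hash(name) % n_sites][name] = addr

    def resolve(self, name):
        # Resolution may require contacting a remote site that owns
        # the partition for this name.
        return self.parts[hash(name) % len(self.parts)].get(name)

d = ReplicatedDirectory(3, {"printer1": "10.0.0.7"})
assert d.resolve(2, "printer1") == "10.0.0.7"
p = PartitionedDirectory(3, {"printer1": "10.0.0.7"})
assert p.resolve("printer1") == "10.0.0.7"
```

The sketch shows the trade-off stated above: the replicated directory answers every lookup locally but pays for storage and synchronized updates, while the partitioned one updates a single copy but may need a remote lookup.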

• Scalability

The techniques used in designing a system should not result in system unavailability, and performance should not degrade as the system grows.

For example, broadcast-based protocols work well for small systems (systems having a small number of computers) but not for large systems.

System requirements should (ideally) increase linearly with the number of computers. This includes the overhead of message exchange in the algorithms used for file system updates, directory management, etc.

• Compatibility

The three different levels of compatibility that exist in distributed systems are:


Binary level: All processors execute the same binary instruction set, even though the processors may differ in performance and in input/output. The system is then said to exhibit binary-level compatibility.

Execution level: This level of compatibility is said to exist in a distributed system if the same source code can be compiled and executed properly on any computer in the system.

Protocol level: This is the least restrictive form of compatibility. It achieves interoperability by requiring all system components to support a common set of protocols.

• Process Synchronization

The synchronization of processes in distributed systems is difficult because of the unavailability of shared memory. The distributed operating system has to synchronize processes running at different computers when they try to concurrently access a shared resource such as a file directory. It is necessary that a shared resource be accessed by a single process at a time; this is known as the mutual exclusion problem. Concurrent access to a single resource by several uncoordinated processes must be serialized to avoid deadlocks.
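As a toy illustration of serializing access without shared memory (this is my own sketch, not an algorithm from the notes), a central coordinator can grant access to one process at a time and queue the rest; real distributed mutual exclusion algorithms replace the coordinator with message-based protocols.

```python
# Centralized coordinator serializing access to one shared resource.
from collections import deque

class Coordinator:
    def __init__(self):
        self.token_free = True
        self.waiting = deque()        # pending requests, served FIFO

    def request(self, pid):
        """A process asks to enter its critical section."""
        if self.token_free:
            self.token_free = False
            return True               # granted immediately
        self.waiting.append(pid)
        return False                  # must wait for a grant

    def release(self):
        """The holder leaves; pass the grant on, or free the token."""
        if self.waiting:
            return self.waiting.popleft()  # next process to enter
        self.token_free = True
        return None

c = Coordinator()
assert c.request("P1") is True        # P1 enters the critical section
assert c.request("P2") is False       # P2 is queued
assert c.release() == "P2"            # the grant passes to P2
```

Access to the resource is serialized because at most one grant is outstanding at any time, which is exactly the mutual exclusion requirement stated above.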

• Resource Management

Resource management in a distributed operating system is concerned with making both local and remote resources available to users in an effective manner.

The specific location of a resource must be hidden from the user.


Resources are made available in the following ways:

Data Migration: The data is brought by the distributed operating system to the location of the computation that needs access to it. The data may be a file or the contents of physical memory.

Main issues: consistency and minimization of delays.

Computation Migration: The computation is moved to the location where it is required.

For example, in distributed scheduling, one computer may require another computer’s status. It is more efficient and safe to find this information at the remote computer and send the required information back, rather than to transfer the private data structures of the operating system.

RPC (remote procedure call) is mainly used for migrating computation and for providing communication between computers.

• Distributed Scheduling:

In distributed scheduling, processes are transferred from one computer to another by the distributed operating system.

A process that originated at one computer is executed at a different computer.

This improves the performance of the overall system.


• Security

There are two issues in security:

Authentication: It is the process of guaranteeing that an entity is what it claims to be.

Authorization: It is the process of deciding what privileges an entity has and making only these privileges available.

• Structuring

The structure of an operating system defines how the various parts of the operating system are organized.

Monolithic kernel: One big monolithic kernel provides all OS services. This structure is not suitable for a DOS, since the services are scattered over many machines.

Collective kernel: In the collective kernel structure, an operating system is structured as a collection of processes that are largely independent of each other; OS services are implemented as independent processes.

The nucleus of the operating system, also referred to as the microkernel, supports the interaction (through messages) between the processes providing the system services.

Object-oriented operating system: Services are implemented as objects, and an operating system structured using objects is known as an object-oriented operating system.

Object types: process, directory, file, …

Operations on the objects: the encapsulated data can be manipulated through these operations.



Theoretical Foundations – Logical Clock

The inherent limitations of distributed systems are caused by the lack of a common memory and of a system-wide common clock that can be shared by all the processes.

Limitations of Distributed Systems

1. Absence of a Global Clock


2. Absence of Shared Memory

(Figure: a distributed system with two sites.)


LAMPORT’S LOGICAL CLOCKS

Lamport proposed a scheme to order the events in a distributed system using logical clocks. The execution of a process is characterized by a sequence of events. Depending on the application, the execution of a procedure could be one event, or the execution of an instruction could be one event. When processes exchange messages, sending a message constitutes one event and receiving a message constitutes another event.

A relation that orders events based on the behavior of the underlying computation is defined as follows.

Ordering of Events

Happened-before relationship: The happened-before relation captures the causal dependencies between events, i.e., whether two events are causally related or not. The relation → is defined as follows.

For two events a and b, a → b if

• a and b are events in the same process and a occurred before b;

• a is the send event of a message m and b is the corresponding receive event at the destination process;

• if a → b and b → c, then a → c, i.e., the “→” relation is transitive.

Causally Related Events: Event a causally affects event b if a → b. In other words, a → b means that a is a potential cause of b.

Concurrent Events: Two distinct events a and b are said to be concurrent (denoted by a || b) if neither a → b nor b → a holds.


Example: In the space-time diagram below, e11, e12, e13, and e14 are events in process P1, and e21, e22, e23, and e24 are events in process P2. The arrows represent message transfers between the processes. For example, the arrow from e12 to e23 corresponds to a message sent from process P1 to process P2: e12 is the event of sending the message at P1, and e23 is the event of receiving the same message at P2. In Fig. 5.2, we see that e22 → e13 and e13 → e14, and therefore e22 → e14. In other words, event e22 causally affects event e14. Note that whenever a → b holds for two events a and b, there exists a path from a to b which moves only forward along the time axis in the space-time diagram. Events e21 and e11 are concurrent, even though e11 appears to have occurred before e21 in real (global) time for a global observer.

Logical clocks

In order to realize the relation →, Lamport introduced the following system of logical clocks. There is a clock Ci at each process Pi in the system.

The clock Ci can be thought of as a function that assigns a number Ci(a) to any event a at Pi; the number Ci(a) is called the timestamp of event a.


The numbers assigned by the system of clocks have no relation to physical time,
and hence the name logical clocks

The timestamp of an event is the value of the clock when it occurs.

CONDITIONS SATISFIED BY THE SYSTEM OF CLOCKS

Clock condition: For any two events a and b, if a → b, then C(a) < C(b).

The clock condition is satisfied if the following implementation rules are obeyed:

[IR1] Clock Ci is incremented between any two successive events in process Pi:

Ci := Ci + d (d > 0)

[IR2] If event a is the sending of message m by process Pi, then message m is assigned the timestamp tm = Ci(a). On receiving the same message m, process Pj updates its clock:

Cj := max(Cj, tm + d) (d > 0)


Example

Both clock values Cp1 and Cp2 are assumed to be zero initially, and d is assumed to be 1. e11 is an internal event in process P1 which causes Cp1 to be incremented to 1 due to IR1. Similarly, e21 and e22 are two events in P2 resulting in Cp2 = 2 due to IR1. e16 is a message send event in P1 which increments Cp1 to 6 due to IR1, and the message is assigned the timestamp 6. The event e25, corresponding to the receive event of this message, increments the clock Cp2 to 7 (max(4+1, 6+1)) due to rules IR1 and IR2. Similarly, e24 is a send event in P2, and the message is assigned the timestamp 4. The event e17, corresponding to the receive event of this message, increments the clock Cp1 to 7 (max(6+1, 4+1)) due to rules IR1 and IR2.
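The example above can be reproduced with a few lines of code. This is a minimal sketch (Python; the class and method names are my own) of rules IR1 and IR2 with d = 1; on a receive, applying IR1 and then IR2 amounts to max(Cj, tm) + 1, which matches the max(4+1, 6+1) computation above.

```python
# Lamport's logical clock: rules IR1 and IR2 with d = 1 (sketch).

class LamportClock:
    def __init__(self):
        self.time = 0              # C_i, initially 0

    def tick(self):
        """IR1: increment the clock between successive local events."""
        self.time += 1
        return self.time

    def send(self):
        """A send is an event: apply IR1, then stamp the message."""
        return self.tick()         # timestamp t_m carried by the message

    def receive(self, tm):
        """Receive event: IR1 plus IR2, i.e. C_j := max(C_j, t_m) + 1."""
        self.time = max(self.time, tm) + 1
        return self.time

# Replay the example: e11..e15 at P1, then the e16 send; e21..e24 at
# P2 (e24 is the send stamped 4), then the two receives e25 and e17.
p1, p2 = LamportClock(), LamportClock()
for _ in range(5):
    p1.tick()                      # e11..e15 -> Cp1 = 5
t16 = p1.send()                    # e16 -> Cp1 = 6, timestamp 6
for _ in range(3):
    p2.tick()                      # e21..e23 -> Cp2 = 3
t24 = p2.send()                    # e24 -> Cp2 = 4, timestamp 4
assert p2.receive(t16) == 7        # e25: max(4, 6) + 1 = 7
assert p1.receive(t24) == 7        # e17: max(6, 4) + 1 = 7
```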

VIRTUAL TIME. Lamport's system of logical clocks implements an approximation to global/physical time, which is referred to as virtual time. Virtual time advances along with the progression of events and is therefore discrete. If no events occur in the system, virtual time stops, unlike physical time, which progresses continuously. Therefore, waiting for a virtual time instant to pass is risky, as it may never arrive.

Limitation of Lamport’s Clock

a → b implies C(a) < C(b)

BUT

C(a) < C(b) does not imply a → b (when a and b are events in different processes).

So Lamport’s clocks are not a true characterization of causality.


Vector Clocks

Let n be the number of processes in the distributed system. Each process Pi is equipped with a clock Ci, which is a vector of n components. Clock Ci can be thought of as a function that assigns a vector Ci(a) to any event a.

Implementation rules:

[IR1] Clock Ci is incremented between any two successive events in process Pi

Ci[i]:= Ci[i]+d


[IR2] If event a is the sending of message m by process Pi, then message m is assigned the timestamp tm = Ci(a). On receiving the same message m, process Pj updates Cj as follows:

Cj[k] := max(Cj[k], tm[k]) for all k

For events a and b with vector timestamps ta and tb,

• ta = tb iff for all i, ta[i] = tb[i] Equal

• ta ≠ tb iff for some i, ta[i] ≠ tb[i] Not Equal

• ta ≤ tb iff for all i, ta[i] ≤ tb[i] Less than or equal

• ta < tb iff (ta ≤ tb and ta ≠ tb) less than

• ta || tb iff ¬(ta < tb) and ¬(tb < ta) Concurrent

• a → b iff ta < tb

• Events a and b are causally related iff ta < tb or tb < ta, else they are
concurrent

• Note that this is still not a total order
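The rules and comparisons above can be sketched directly. This is a minimal illustration (Python; the helper names are my own); as with Lamport clocks, a receive is itself an event, so IR1 is applied before the component-wise max of IR2.

```python
# Vector clocks: IR1, IR2, and the comparison relations (sketch).

def leq(ta, tb):                  # ta <= tb iff every component is <=
    return all(x <= y for x, y in zip(ta, tb))

def lt(ta, tb):                   # ta < tb iff ta <= tb and ta != tb
    return leq(ta, tb) and ta != tb

def concurrent(ta, tb):           # a || b iff neither ta < tb nor tb < ta
    return not lt(ta, tb) and not lt(tb, ta)

class VectorClock:
    def __init__(self, i, n, d=1):
        self.i, self.d = i, d
        self.c = [0] * n          # C_i: one component per process

    def tick(self):
        """IR1: increment own component between successive events."""
        self.c[self.i] += self.d
        return list(self.c)

    def send(self):
        """A send is an event: apply IR1 and stamp the message."""
        return self.tick()        # timestamp t_m = C_i(a)

    def receive(self, tm):
        """IR2: C_j[k] := max(C_j[k], t_m[k]) for all k (after IR1)."""
        self.tick()
        self.c = [max(x, y) for x, y in zip(self.c, tm)]
        return list(self.c)

p1, p2 = VectorClock(0, 2), VectorClock(1, 2)
a = p1.send()                     # event a at P1: [1, 0]
c = p2.tick()                     # event c at P2: [0, 1]
b = p2.receive(a)                 # receive of a's message: [1, 2]
assert lt(a, b)                   # a -> b: timestamps order them
assert concurrent(a, c)           # a || c: incomparable timestamps
```

Unlike Lamport clocks, here ta < tb holds if and only if a → b, so the vector timestamps fully capture causality.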

CAUSAL ORDERING OF MESSAGES

The causal ordering of messages deals with the notion of maintaining the same causal relationship that holds among the “message send” events in the corresponding “message receive” events.

• If Send(M1) → Send(M2), then every recipient of both messages M1 and M2 must receive M1 before M2 (where Send(M) is the event of sending message M).


Causal Ordering of Messages

The figure above shows a violation of the causal ordering of messages in a distributed system:

Send(M1) → Send(M2), but M2 is delivered before M1 to process P3.

There are two protocols that make use of vector clocks for causal ordering:

Birman–Schiper–Stephenson protocol

Schiper–Eggli–Sandoz protocol
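The notes only name the two protocols; as a rough illustration of the underlying idea (this is my own sketch of the Birman–Schiper–Stephenson delivery test for broadcasts, not the full protocol), a receiver buffers a message until its vector timestamp shows that every causally earlier message has already been delivered.

```python
# BSS-style delivery condition (sketch). Process j, which has
# delivered delivered[k] broadcasts from each process k, may deliver
# a broadcast from process i stamped tm iff
#   tm[i] == delivered[i] + 1      (the next message from i), and
#   tm[k] <= delivered[k] for k != i  (no missing causal past).

def deliverable(tm, delivered, i):
    return tm[i] == delivered[i] + 1 and all(
        tm[k] <= delivered[k] for k in range(len(tm)) if k != i)

# Scenario of the figure: P1 broadcasts M1, P2 (after receiving M1)
# broadcasts M2, and M2 reaches P3 first.
delivered = [0, 0, 0]              # P3 has delivered nothing yet
m1 = ([1, 0, 0], 0)                # (timestamp, sender) of M1 from P1
m2 = ([1, 1, 0], 1)                # M2 from P2, causally after M1

assert not deliverable(m2[0], delivered, m2[1])  # M2 is buffered
assert deliverable(m1[0], delivered, m1[1])      # M1 goes first
```

Buffering M2 until M1 arrives is exactly what repairs the violation shown in the figure.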


CAPTURING GLOBAL STATE

Global State Collection

Issues:

– Need to capture both node and channel states

– system cannot be stopped

– no global clock

Some notations:

– LSi : local state of process i

– send(mij) : send event of message mij from process i to process j

– rec(mij) : receive event of message mij

– time(x) : time at which state x was recorded

– time(send(m)) : time at which send(m) occurred



For a message mij sent by Si to Sj, we say that

send(mij) ∈ LSi iff time(send(mij)) < time(LSi)

rec(mij) ∈ LSj iff time(rec(mij)) < time(LSj)

transit(LSi, LSj) = { mij | send(mij) ∈ LSi and rec(mij) ∉ LSj }

inconsistent(LSi, LSj) = { mij | send(mij) ∉ LSi and rec(mij) ∈ LSj }

Global state: collection of local states

GS = {LS1, LS2,…, LSn}

GS is consistent iff

for all i, j, 1 ≤ i, j ≤ n,

inconsistent(LSi, LSj) = ∅

GS is transitless iff

for all i, j, 1 ≤ i, j ≤ n,

transit(LSi, LSj) = ∅

Strongly consistent global state. A global state is strongly consistent if it is consistent and transitless. In a strongly consistent state, not only are the send events of all the recorded receive events recorded, but the receive events of all the recorded send events are also recorded. Thus, a strongly consistent state corresponds to a consistent global state in which all channels are empty. In the figure, the global state {LS11, LS21, LS31} is a strongly consistent global state.
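The definitions above translate directly into code. This is a minimal sketch (Python; the data layout is my own): each recorded local state is modeled as the sets of send and receive events recorded at that site, and each message mij is tagged with its sender i and receiver j.

```python
# transit(), inconsistent(), and the consistency tests (sketch).

def transit(GS, msgs):
    # messages whose send is recorded but whose receive is not
    return {m for m, (i, j) in msgs.items()
            if m in GS[i]["sent"] and m not in GS[j]["recd"]}

def inconsistent(GS, msgs):
    # messages whose receive is recorded but whose send is not
    return {m for m, (i, j) in msgs.items()
            if m not in GS[i]["sent"] and m in GS[j]["recd"]}

def consistent(GS, msgs):
    return not inconsistent(GS, msgs)

def strongly_consistent(GS, msgs):
    # consistent and transitless: every channel is empty
    return consistent(GS, msgs) and not transit(GS, msgs)

msgs = {"m12": (1, 2)}                     # m12 goes from S1 to S2
GS = {1: {"sent": {"m12"}, "recd": set()},
      2: {"sent": set(), "recd": {"m12"}}}
assert strongly_consistent(GS, msgs)       # send and receive recorded

GS2 = {1: {"sent": set(), "recd": set()},  # receive without its send
       2: {"sent": set(), "recd": {"m12"}}}
assert not consistent(GS2, msgs)           # orphan receive
```

GS2 shows the inconsistency the definitions rule out: a recorded receive of m12 whose send was never recorded.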


Chandy-Lamport’s Algorithm

• Uses special marker messages.

• One process acts as initiator, starts the state collection by following the
marker sending rule below.

• Marker sending rule for process P:

– P records its state; then for each outgoing channel C from P on which
a marker has not been sent already, P sends a marker along C before
any further message is sent on C

• Marker Receiving Rule for process Q: when Q receives a marker along a channel C:

– If Q has not recorded its state then Q records the state of C as empty;
Q then follows the marker sending rule

– If Q has already recorded its state, it records the state of C as the


sequence of messages received along C after Q’s state was recorded
and before Q received the marker along C

Points to Note:

• Markers sent on a channel separate the messages sent on the channel before the sender recorded its state from the messages sent after the sender recorded its state

• The state collected may not be any state that actually happened in reality,
rather a state that “could have” happened

• Requires FIFO channels

• The network should be strongly connected (the algorithm obviously also works for connected, undirected networks)

• Message complexity is O(|E|), where |E| = the number of links
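The marker rules can be demonstrated with a tiny simulation. This is my own sketch (Python), simplified to two processes P and Q connected by a pair of FIFO channels modeled as deques; the "local state" is just a count of computation messages handled, and each process has only one incoming channel, so the channel bookkeeping is trivial.

```python
# Chandy-Lamport snapshot on two processes with FIFO channels (sketch).
from collections import deque

MARKER = "MARKER"

class Proc:
    def __init__(self):
        self.state = 0             # local state: messages handled so far
        self.recorded = None       # recorded local state, if any
        self.chan_state = []       # recorded state of the incoming channel

    def snapshot(self, out):
        """Marker sending rule: record the local state, then send the
        marker on the outgoing channel before any further message."""
        self.recorded = self.state
        out.append(MARKER)

    def on_receive(self, msg, out):
        """Marker receiving rule (single incoming channel)."""
        if msg == MARKER:
            if self.recorded is None:
                self.snapshot(out)          # channel state stays empty
            # else: recording of this channel ends here
        else:
            self.state += 1
            if self.recorded is not None:
                self.chan_state.append(msg)  # message was in transit

p, q = Proc(), Proc()
pq, qp = deque(), deque()          # FIFO channels P->Q and Q->P

pq.append("m1")                    # P sends m1 ...
p.snapshot(pq)                     # ... then initiates the snapshot
qp.append("m2")                    # Q sends m2 before seeing a marker

while pq:
    q.on_receive(pq.popleft(), qp)
while qp:
    p.on_receive(qp.popleft(), pq)

# Recorded global state: P at 0, Q at 1, channel Q->P holds m2.
assert (p.recorded, q.recorded, p.chan_state) == (0, 1, ["m2"])
```

Note how m2, sent before Q recorded its state but received after P recorded its own, is captured as a channel state rather than lost, which is exactly what makes the recorded state consistent.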

CUTS OF A DISTRIBUTED COMPUTATION

A cut is a graphical representation of a global state. A consistent cut is a graphical representation of a consistent global state.

CUT: A cut of a distributed computation is a set C = {c1, c2, c3, …, cn}, where ci is the cut event at site Si. Graphically, a cut is a zig-zag line that connects the corresponding cut events in the space-time diagram.

In the given example, events c1,c2,c3 and c4 form a cut.

If a cut event ci at site Si records Si’s local state at that instant, then clearly the cut denotes a global state of the system.

CONSISTENT CUT: Let ek denote an event at site Sk. A cut C = {c1, c2, c3, …, cn} is a consistent cut iff no two cut events are causally related, that is,

¬(ci → cj), where ci ∈ C and cj ∈ C.

That is, a cut is a consistent cut if every message that was received before a cut event was sent before the cut event at the sender site in the cut.

For example, in the figure below, the cut is not consistent because the message sent by S2 is received before c3, but the corresponding send did not occur before event c2. That is, e → e′, e′ → c3, and ¬(e → c2).


Termination Detection

A distributed computation generally consists of a set of cooperating processes which communicate with each other by exchanging messages. In the case of a distributed computation, it is important to know when the computation has terminated.

A process may either be in an active state or an idle state. Only active processes can send messages. An active process may become idle at any time. An idle process can become active on receiving a computation message. Computation messages are those that are related to the underlying computation being performed by the cooperating processes. A computation is said to have terminated if and only if all the processes are idle and there are no messages in transit. The messages sent by the termination detection algorithm are referred to as control messages.

Basic Idea

One of the cooperating processes monitors the computation and is called the
controlling agent. Initially all processes are idle, the controlling agent’s weight
equals 1, and the weight of the rest of the processes is zero. The computation starts
when the controlling agent sends a computation message to one of the processes.
Any time a process sends a message, the process’s weight is split between itself
and the process receiving the message

The weight received along with a message is added to the weight of the process.
Thus, the algorithm assigns a weight W (0 < W < 1) to each active process
(including the controlling agent) and to each message in transit.


On finishing the computation, a process sends its weight to the controlling agent,
which adds the received weight to its own weight. When the weight of the
controlling agent is once again equal to 1, it concludes that the computation has
terminated.

NOTATIONS. The following notations are used in the algorithm:

• B(DW) = computation message sent as a part of the computation, where DW is the weight assigned to it.
• C(DW) = control message sent from a process to the controlling agent, where DW is the weight assigned to it.

Huang’s Termination Detection Algorithm

Rule 1. The controlling agent or an active process having weight W may send a computation message to a process P by doing:
Derive W1 and W2 such that W1 + W2 = W, W1 > 0, W2 > 0;
W := W1;
Send B(W2) to P;

Rule 2. On receipt of B(DW), a process P having weight W does:
W := W + DW;
If P is idle, P becomes active;

Rule 3. An active process having weight W may become idle at any time by doing:
Send C(W) to the controlling agent;
W := 0;
(The process becomes idle);


Rule 4. On receiving C(DW), the controlling agent having weight W takes the following actions:
W := W + DW;
If W = 1, conclude that the computation has terminated.
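The four rules can be exercised with a small simulation. This is my own sketch (Python); exact rational weights (Fraction) are used instead of floats so that the W = 1 test in Rule 4 is not broken by rounding, and the message exchange is modeled as direct function calls.

```python
# Huang's weight-throwing termination detection (simulation sketch).
from fractions import Fraction

class Process:
    def __init__(self):
        self.weight = Fraction(0)
        self.active = False

controller = Process()
controller.weight = Fraction(1)            # agent starts with weight 1
procs = [Process() for _ in range(2)]

def send_computation(src, dst):
    """Rule 1: split W into W1 + W2; keep W1, ship W2 with B(W2).
    Rule 2 at the receiver: add DW and become active."""
    half = src.weight / 2                  # any split with both parts > 0
    src.weight -= half
    dst.weight += half
    dst.active = True

def become_idle(p):
    """Rule 3: send C(W) to the controlling agent and drop to 0.
    Rule 4 at the agent: W := W + DW."""
    controller.weight += p.weight
    p.weight = Fraction(0)
    p.active = False

send_computation(controller, procs[0])     # the computation starts
send_computation(procs[0], procs[1])       # P0 spawns work at P1
become_idle(procs[0])
assert controller.weight < 1               # P1 still active: no termination
become_idle(procs[1])
assert controller.weight == 1              # all weight returned: terminated
```

The invariant behind the algorithm is visible in the trace: the weights of the controlling agent, the active processes, and the messages in transit always sum to 1, so the agent's weight returns to 1 exactly when everything else has gone idle.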

