(PCCO7020T)
Teaching Scheme Examination Scheme
• Prerequisite: Java Programming, Operating systems, Computer
Network.
• Course Objectives:
To provide students with contemporary knowledge in distributed
systems.
To equip students with skills to analyze and design distributed
applications.
To build the skills needed to measure the performance of distributed
synchronization algorithms.
Course Outcomes
CO | Course Outcome Description | Bloom's Level
CO3 | Evaluate the various techniques used for clock synchronization and mutual exclusion. | L5 Evaluate
CO5 | Explain the concepts of file systems like NFS, AFS. | L2 Understand
Course Contents
Unit-I 04 Hrs.
Introduction to Distributed Systems
• Characterization of Distributed Systems: Issues, Goals, and Types of
distributed systems, Distributed System Models, Hardware concepts,
Software Concept.
• Middleware: Services offered by middleware, Client Server model.
CO1: Describe the basic elements and concepts related to distributed system technologies. (L2 Understand)
Unit-II 08 Hrs.
Communication
• Layered Protocols, Interprocess communication (IPC): MPI, Remote
Procedure Call (RPC), Remote Object Invocation, Remote Method
Invocation (RMI)
• Message Oriented Communication, Stream Oriented Communication,
Group Communication.
Unit-III 08 Hrs.
Synchronization
• Exclusion, Distributed Mutual Exclusion-Classification of mutual
Exclusion Algorithm, Requirements of Mutual Exclusion Algorithms,
Performance measure.
• Non Token based Algorithms: Lamport Algorithm, Ricart–Agrawala’s
Algorithm, Maekawa’s Algorithm.
• Token Based Algorithms: Suzuki-Kasami’s Broadcast Algorithm,
Singhal’s Heuristic Algorithm, Raymond’s Tree based Algorithm,
Comparative Performance Analysis.
CO3: Evaluate the various techniques used for clock synchronization and mutual exclusion. (L5 Evaluate)
Unit-IV
Resource and Process Management
• Desirable Features of global Scheduling algorithm, Task assignment
approach, Load balancing approach, load sharing approach,
Classification of Static and Dynamic Load Balancing algorithms,
Comparison of LBA.
• Introduction to process management, process migration, Threads.
Unit-V 07 Hrs.
Consistency, Replication and Fault Tolerance
• Introduction to replication and consistency, Data-Centric and Client-
Centric Consistency Models, Replica Management.
• Fault Tolerance: Introduction, Process resilience, Reliable client-server
and group communication, Recovery.
Unit-VI 07 Hrs.
Distributed File Systems
• Introduction and features of DFS, File models, File Accessing models,
File-Caching Schemes, File Replication.
• Case Study: Distributed File Systems (DFS), Network File System
(NFS), Andrew File System (AFS).
• Trends in Distributed Computing: Edge Computing, Cloud Computing,
Fog Computing.
CO5: Explain the concepts of file systems like NFS, AFS. (L2 Understand)
• Text Books:
1. Andrew S. Tanenbaum and Maarten Van Steen, “Distributed
Systems: Principles and Paradigms”, 2nd Edition, Pearson Education,
2017.
2. George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed
Systems: Concepts and Design”, 5th Edition, Pearson Education,
2011.
• Reference Books:
1. Andrew S. Tanenbaum and Maarten Van Steen, “Distributed
Systems: Principles and Paradigms”, 3rd Edition, Pearson Education,
2017.
2. M. L. Liu, “Distributed Computing Principles and Applications”, 2nd
Edition, Pearson Addison Wesley, 2004.
Evaluation Scheme:
Theory:
• Continuous Assessment (A):
Subject teacher will declare Teacher Assessment criteria at the start of the semester.
Characterization of Distributed Systems
Computer architectures consisting of interconnected, multiple processors are basically of
two types:
Tightly Coupled Systems
Loosely Coupled Systems
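Tightly Coupled Systems:
In these systems, there is a single system-wide primary memory that is shared by all the processors.
Loosely Coupled Systems:
In these systems, the processors do not share memory and each processor has its own local
memory [Fig. b].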
If a processor writes the value 100 to the memory location x, this write operation will only
change the contents of its local memory and will not affect the contents of the memory of any
other processor.
In these systems, all physical communication between the processors is done by passing
messages across the network that interconnects the processors.
Loosely Coupled Systems are referred to as distributed computing systems, or simply
distributed systems.
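The behaviour of a loosely coupled system can be sketched in Python. This is only an illustration under stated assumptions: the `Node` class, its `local_memory` dictionary and its `inbox` queue are hypothetical names, and an in-process queue stands in for the interconnecting network.

```python
from queue import Queue


class Node:
    """A processor with private memory; the inbox stands in for its network link."""

    def __init__(self, name):
        self.name = name
        self.local_memory = {}   # visible only to this node
        self.inbox = Queue()     # messages delivered by the "network"

    def write(self, location, value):
        # A write changes this node's local memory only.
        self.local_memory[location] = value

    def send(self, other, location, value):
        # The only way to affect another node is to send it a message.
        other.inbox.put((self.name, location, value))

    def process_messages(self):
        # Apply every pending message to the local memory.
        while not self.inbox.empty():
            _sender, location, value = self.inbox.get()
            self.local_memory[location] = value


a, b = Node("A"), Node("B")
a.write("x", 100)
value_at_b_before = b.local_memory.get("x")   # B is unaffected by A's write
a.send(b, "x", 100)
b.process_messages()
value_at_b_after = b.local_memory.get("x")    # B sees the value only after the message
```

Writing 100 to x on node A leaves node B untouched; B learns the value only when a message carrying it is delivered and handled.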
Examples of distributed systems
Finance and commerce
• e-Commerce - Amazon and eBay
• Payment technologies - PayPal
Issues:
• Issues related to the design of Distributed System:
1. Transparency
2. Flexibility
3. Reliability
4. Performance
5. Scalability
6. Security
7. Fault Tolerance
1. Transparency
• It means to hide from users the fact that processes and resources are physically
distributed across the network.
• Types of Transparency:
Access Transparency
- It hides the heterogeneous nature of machines from the users and
allows access to resources whether they are local or remote.
- Machines across the network may use different data representations and different
operating systems.
- The machines may use different naming conventions for files.
- All these differences must be hidden from the users and applications.
Location Transparency
- It implies that users can access any resource irrespective of its location.
- Naming: Location transparency can be achieved by assigning logical names to
resources.
- Users should not be able to identify the location of resources such as printers, CPUs, files,
etc.
Relocation Transparency:
It hides the movement of a resource when it changes its location to another node
while it is in use.
Other processes that are interacting with this process are not aware of this
movement.
This function is handled by the process migration mechanism of Distributed
Operating System.
Replication Transparency:
Replication increases availability and improves performance by providing
access at a nearby location.
Replication supports fault tolerance.
With replication transparency, nodes of a system can keep multiple copies of
a resource, such as files and databases, without the users noticing.
Concurrency Transparency:
In a D.S., the same resource can be accessed by multiple users.
The system is concurrency-transparent if the users are unaware of each other.
To keep the system in a consistent state, a resource is locked while it is being accessed by a
particular user.
It is unlocked once that user is done, making it available to the other users in
sequence.
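The locking idea can be sketched with Python threads standing in for users. This is a single-machine illustration only: `SharedResource` and `use` are hypothetical names, and a real D.S. would need a distributed locking or mutual exclusion protocol instead of `threading.Lock`.

```python
import threading


class SharedResource:
    """A counter that several users update; the lock serializes access."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def use(self, times):
        for _ in range(times):
            # The resource is locked while one user accesses it ...
            with self._lock:
                self.value += 1
            # ... and unlocked here, so the next user can proceed.


resource = SharedResource()
users = [threading.Thread(target=resource.use, args=(10_000,)) for _ in range(4)]
for user in users:
    user.start()
for user in users:
    user.join()
final_value = resource.value   # 4 users x 10,000 updates each
```

Each user sees the resource as if it were the only one using it, and no update is lost.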
Failure Transparency:
It hides the failure and recovery of a resource.
It refers to how errors and subsequent recoveries of nodes within the system are
invisible to users and applications.
A D.S. is failure-transparent if the users can’t notice that a resource has failed and
the system recovers from the failure.
Persistence Transparency:
It implies that users need not be aware that the resource is in volatile memory or
on the disk.
e.g.: Database servers copy the object’s state from disk to memory, perform the operation
and copy the modified object back to the disk.
Types of Transparencies (Summary)
2. Flexibility
• The D.S. design may need modifications in its initial stages.
• To achieve this flexibility, it must be decided whether to use a Monolithic kernel or a
Microkernel on each machine.
• The kernel, as the central control, provides basic system facilities.
• The kernel, executing in supervisor mode, has the privilege to access the machine’s
physical resources.
• A process executing in user mode has restricted access to resources.
• The major functions of the kernel are: memory management, process
management and resource management.
• The kernel is placed above the hardware and below the application layer.
Fig: Monolithic Vs Microkernel
• Monolithic kernels use the ‘kernel does it all’ approach, with all
functionalities provided by the kernel irrespective of whether all
machines use them or not.
• Microkernels use a ‘minimalist’, modular approach, with other
services made accessible as needed.
3. Reliability
A user prefers a distributed system because, with multiple processors available, the
system stays usable when one of them crashes.
On failure, a backup is available.
Data reliability means data is available without any errors.
In case of data replication, all copies should be consistent.
4. Performance
It implies that an application should run in a distributed environment just as if it were
running on a single processor.
The performance of a D.S. is measured by the metrics: response time,
throughput, system utilization & amount of network capacity used.
Performance can be optimized by actions like:
- Batching messages together
- Data caching
- Computation grain size: only large computations with a low interaction rate & less
data transfer should be done as remote computations.
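Why batching helps can be seen from a simple cost model, sketched below. The numbers are assumptions chosen for illustration, not measurements: a fixed overhead of 5000 microseconds per network send and 1 microsecond per byte. `transfer_time_us` is a hypothetical helper.

```python
def transfer_time_us(num_messages, bytes_per_message, batch_size,
                     overhead_us=5000, us_per_byte=1):
    """Time to send all messages when `batch_size` of them share one network send."""
    num_sends = -(-num_messages // batch_size)   # ceiling division
    payload_us = num_messages * bytes_per_message * us_per_byte
    return num_sends * overhead_us + payload_us


# 1000 messages of 100 bytes each
unbatched = transfer_time_us(1000, 100, batch_size=1)    # 1000 sends
batched = transfer_time_us(1000, 100, batch_size=50)     # 20 sends
```

Under these assumed costs the payload time is the same in both cases; batching only reduces how often the fixed per-send overhead is paid.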
5. Scalability
Scalability in D.S. has some specific issues:
Concept | Example | Issues
Centralized Services | A single server for all users | With an increased number of users, the server faces manageability issues.
Centralized Data | A single on-line telephone book | May lead to communication failures & bottlenecks.
Centralized Algorithms | A single algorithm doing routing based on available information | An overloaded situation may occur.
Fig: Hide Communication Latencies
Fig: Hide Distribution
• Hide replication:
• Scaling up often leads to degradation of performance.
• So the replication of components across the D.S. is used as the solution for it.
• This increases availability & assists load balancing.
• But it also incurs consistency problems.
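The consistency problem that comes with replication can be sketched as follows. `ReplicatedValue` is a hypothetical toy class: a write reaches one replica first, and the copies disagree until the update is propagated.

```python
class ReplicatedValue:
    """One value copied to several replicas; a write reaches one replica at a time."""

    def __init__(self, replica_count, initial):
        self.replicas = [initial] * replica_count

    def write(self, replica, value):
        self.replicas[replica] = value          # only this copy is updated

    def propagate(self, source):
        # Copy the value held by `source` to every replica.
        self.replicas = [self.replicas[source]] * len(self.replicas)

    def is_consistent(self):
        return len(set(self.replicas)) == 1


stock = ReplicatedValue(replica_count=3, initial=10)
stock.write(0, 9)
consistent_before = stock.is_consistent()   # replicas 1 and 2 still hold the old value
stock.propagate(source=0)
consistent_after = stock.is_consistent()    # every replica now holds the new value
```

Between the write and the propagation, a user reading from replica 1 or 2 would see stale data.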
6. Security
In D.S., user’s information needs to be secure.
Security here has three aspects:
Confidentiality: protection against unauthorized access
Integrity: protection of data against corruption
Availability: protection against failure & always being accessible
Encoding and encryption techniques support the security of a D.S.
7. Fault Tolerance
Ideally D.S. are designed to mask failures.
In a system with multiple servers, if any server breaks down, the other server
takes up the load.
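Masking a server failure can be sketched as a simple failover loop. All names here (`ServerDown`, `make_server`, `call_with_failover`) are hypothetical, and the servers are plain functions rather than real network services.

```python
class ServerDown(Exception):
    """Raised by a server that has crashed."""


def make_server(name, alive=True):
    def handle(request):
        if not alive:
            raise ServerDown(name)
        return f"{name} handled {request}"
    return handle


def call_with_failover(servers, request):
    """Try each server in turn; the caller never sees an individual failure."""
    for server in servers:
        try:
            return server(request)
        except ServerDown:
            continue   # mask the failure and let the next server take the load
    raise RuntimeError("all servers are down")


servers = [make_server("s1", alive=False), make_server("s2")]
reply = call_with_failover(servers, "read /data")
```

The client gets a reply from s2 and does not learn that s1 was down.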
Goals
• Four important goals should be met while building a distributed system.
• A distributed system should:
1. Make resources easily accessible;
2. Reasonably hide the fact that resources are distributed across a
network;
3. Be open;
4. Be scalable.
1. Making Resources Accessible
• The main goal of a distributed system is to make it easy for the users (and
applications) to access remote resources, and to share them in a controlled and
efficient way.
• Resources can be printers, computers, storage facilities, data, files, Web pages,
and networks etc.
• It makes economic sense to share costly resources such as supercomputers,
high-performance storage systems, image setters and other expensive peripherals.
• Connecting users and resources also makes it easier to collaborate and
exchange information - groupware.
• As connectivity and sharing increase, security is becoming increasingly
important.
• A related problem is that increased connectivity can also lead to unwanted
communication (e.g. spam mail). As a solution we need special information
filters that select incoming messages based on their content.
2. Distribution Transparency
• An important goal of a distributed system is to hide the fact that its processes and
resources are physically distributed across multiple computers.
• A distributed system that is able to present itself to users and applications as if it
were only a single computer system is said to be transparent.
• Types of Transparencies
3. Openness
• An open distributed system is a system that offers services according to standard
rules that describe the syntax and semantics of those services.
• In DS, services are generally specified through interfaces, which are often
described in an Interface Definition Language (IDL).
• IDL specifies precisely the names of the functions, types of the parameters,
return values, possible exceptions that can be raised, and so on.
• What the services actually do (their semantics) is usually specified informally,
by means of natural language.
• It should be easy to configure an open distributed system out of different
components, and to add or replace components: the system is extensible.
• To achieve flexibility in open distributed systems, it is crucial that the system is
organized as a collection of relatively small and easily replaceable or adaptable
components.
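The idea of specifying a service through an interface can be sketched in Python. Python has no IDL of its own, so an abstract base class is used here as a stand-in; `FileService`, `InMemoryFileService` and `FileNotFound` are hypothetical names invented for this example.

```python
from abc import ABC, abstractmethod


class FileNotFound(Exception):
    """An exception the interface declares it may raise."""


class FileService(ABC):
    """Interface only: function names, parameter types, return types, exceptions."""

    @abstractmethod
    def read(self, path: str, offset: int, length: int) -> bytes:
        """Return up to `length` bytes from `path`; raise FileNotFound if absent."""


class InMemoryFileService(FileService):
    """One replaceable implementation; clients depend only on FileService."""

    def __init__(self, files: dict):
        self._files = files

    def read(self, path: str, offset: int, length: int) -> bytes:
        if path not in self._files:
            raise FileNotFound(path)
        return self._files[path][offset:offset + length]


service: FileService = InMemoryFileService({"/notes.txt": b"open systems"})
chunk = service.read("/notes.txt", 0, 4)
```

Because clients are written against `FileService` only, the implementation behind it can be replaced by another component without changing them.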
4. Scalability
• Scalability in D.S. has some specific issues:
Concept | Example | Issues
Centralized Services | A single server for all users | With an increased number of users, the server faces manageability issues.
Centralized Data | A single on-line telephone book | May lead to communication failures & bottlenecks.
Centralized Algorithms | A single algorithm doing routing based on available information | An overloaded situation may occur.
Fig: Hide Communication Latencies
Distributed System Models
1. Minicomputer
2. Workstation
3. Workstation-Server
4. Processor-Pool
5. Hybrid
1. Minicomputer Model
2. Workstation Model
Advantages : As compared to the workstation model, the workstation-server model has several
advantages.
4. Processor-Pool Model
Each processor in the pool has its own memory to load and run a system
program or an application program of the distributed computing system
As shown in the figure, in the pure processor-pool model, the processors
in the pool have no terminals attached directly to them.
• Users access the system from terminals that are attached to the
network via special devices (small diskless workstations or graphic
terminals).
Advantages:
In this model, a user does not log onto a particular machine but to the system as
a whole.
• This is in contrast to other models in which each user has a home machine
(e.g. a workstation or minicomputer).
As compared to the workstation-server model, the processor-pool model allows
better utilization of the available processing power of a distributed computing
system.
• This is because the entire processing power of the system is available for use
by the currently logged-on users.
The processor-pool model provides greater flexibility than the workstation-server
model in the sense that the system's services can be easily expanded.
Disadvantage:
Unsuitable for high-performance interactive applications because of the slow
speed of communication between the computer running the application program and the
user's interactive terminal.
5. Hybrid Model
Out of the four models described above, the workstation-server model is the
most widely used model for building distributed computing systems, as a large
number of computer users only perform simple interactive tasks such as editing
jobs, sending electronic mail, and executing small programs.
In a working environment that has groups of users who often perform jobs
needing massive computation, the processor-pool model is more attractive and
suitable.
The hybrid model is based on the workstation-server model but with the addition
of a pool of processors.
• The processors in the pool can be allocated dynamically for computations that
are too large for workstations or that require several computers concurrently for
efficient execution.
But the hybrid model is more expensive to implement than the workstation-
server model or the processor-pool model.
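Dynamic allocation from a processor pool can be sketched as below. `ProcessorPool` and `run_job` are hypothetical names, and the rule "jobs needing more than one processor go to the pool" is an assumption made to keep the example small.

```python
class ProcessorPool:
    """Hands out processors on demand and takes them back when a job ends."""

    def __init__(self, size):
        self.free = list(range(size))   # ids of idle processors

    def allocate(self, count):
        if count > len(self.free):
            raise RuntimeError("not enough idle processors")
        granted, self.free = self.free[:count], self.free[count:]
        return granted

    def release(self, processors):
        self.free.extend(processors)


def run_job(pool, processors_needed):
    """Small jobs stay on the workstation; large ones borrow from the pool."""
    if processors_needed <= 1:
        return "workstation"
    granted = pool.allocate(processors_needed)
    try:
        return f"pool processors {granted}"
    finally:
        pool.release(granted)           # processors return to the pool after the job


pool = ProcessorPool(size=8)
small = run_job(pool, 1)
large = run_job(pool, 3)
idle_after = len(pool.free)
```

Interactive work never touches the pool, while a large job holds pool processors only for as long as it runs.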
Types of Distributed Systems
• This discussion of the types of DS distinguishes between distributed
computing systems, distributed information systems, and distributed embedded
systems.
1. Distributed Computing Systems
• It is an important class of distributed systems used for high-performance
computing tasks.
• It can be roughly divided into two subgroups: cluster computing systems and grid computing systems.
• Cluster Computing Systems
Each cluster consists of a collection of compute nodes that are controlled and accessed
by means of a single master node.
A characteristic feature of cluster computing is its homogeneity.
• Grid Computing Systems
Grid computing systems have a high degree of heterogeneity: no assumptions are
made concerning hardware, OS, networks, administrative domains, security
policies, etc.
Resources from different organizations are brought together to allow the
collaboration of a group of people or institutions.
Such a collaboration is realized in the form of a virtual organization.
The people belonging to the same virtual organization have access rights to the
resources (super computers, storage facilities, databases, sensors etc) that are
provided to that organization.
• Enterprise Application Integration
• It is the process of bringing different enterprise/business applications together.
• The databases and workflows associated with enterprise/business applications
ensure that the business uses information consistently and that changes in data
done by one business application are reflected correctly in another’s.
• In particular, application components should be able to communicate directly with
each other and not merely by means of the request/reply behavior that was
supported by transaction processing systems.
• The main idea was that existing applications could directly exchange information,
as shown in Fig.
Distributed Pervasive Systems
• A distributed pervasive system is part of our surroundings.
• An important feature is the general lack of human administrative control.
• Pervasive applications have the following three requirements:
3. Recognize sharing as the default: The devices generally join the system in
order to access (and possibly provide) information to easily read, store,
manage, and share information.
Hardware Concept in D.S.
• Two Types:
1. Multiprocessor System
i. Private Memory Architecture
ii. Shared Memory Architecture
2. Multicomputer System
1. Multiprocessor System
It is a computer system in which two or more central processing units (CPUs)
share full access to a common RAM.
2. Multicomputer System:
It is a computer system with multiple processors that are connected together to
solve a problem. Each processor has its own memory and it is accessible by that
particular processor and those processors can communicate with each other via an
interconnection.
• The Operating System is the interface between the user and the hardware.
1. Distributed Operating System
• A distributed operating system is one that looks to its users like an ordinary
centralized operating system but runs on multiple, independent central processing
units (CPUs).
• It hides the existence of multiple computers and provides a single-system image
to its users. (It makes a collection of networked machines act as a virtual
uniprocessor)
• In this, there is a single system-wide operating system and each computer of the
distributed computing system runs a part of this global operating system.
(The processes and several resources are managed globally).
• Fault tolerance capability is usually very high as compared to that of a network
operating system.
• This OS has a scalability problem as it supports only a limited number of computers with
shared resources.
• Examples: AIX , Solaris , Mach/OS, LOCUS
• Applications: Automated Banking System, Railway Reservation System.
2. Network Operating Systems
• The users view the DS as a collection of distinct machines connected by a
communication subsystem. (The users are aware of the fact that multiple
computers are being used).
• Each computer of the Distributed Computing System has its own local operating
system. There is essentially no coordination at all among the computers.
• It provides little or no fault tolerance capability.
• It fails to provide single coherent view of the system.
• Examples: Microsoft Windows Server 2008, UNIX, Linux, Mac OS X
3. Middleware Operating Systems
• As a distributed OS lacks scalability & a Network OS fails to provide a
single coherent view, a new layer is introduced to address both; it is called the Middleware OS.
• Middleware is an intermediate layer of software between the application and the
network.
• It is used in distributed systems to provide common services, such as
authentication, authorization, compilation for best performance on particular
architectures, input/output translation, and error handling.
• It has common set of services for local applications & independent set of
services for remote applications.
• It supports heterogeneity where user gets freedom to write the application using
any supported language under any platform.
• e.g.: Java Remote Method Invocation (RMI)
Services Offered by Middleware
• Naming Service: Used to locate objects/interfaces
• Persistence Service: used to store distributed objects permanently on to data
store.
• Messaging Service: used to send or receive messages in terms of request response
primitive
• Querying Service: used to query on distributed objects
• Concurrency Service: used to share and access the resources concurrently
• Security Service: used to provide security to the various shared resources.
Client Server Model
• Client-Server Addressing:
Client should know server address for communicating with it.
• Machine Addressing:
• The client sends the address as a part of the message, which is extracted by the server.
• If multiple processes are running on the server, the process ID should be sent as a part of the message.
• Process Addressing:
• In this, the message is sent to processes and not to machines.
• Two-part name: one part is the machine ID & the other is the process ID.
• The client uses the machine ID to locate the correct machine & the server uses the process ID
to locate the process on that machine.
• Name Server Addressing:
• An extra machine is used to map ASCII level names to machine address.
• The process addressing techniques are as follows:
• Hardwire the machine number into client code.
• Processes pick random address & the machine address is located by broadcast method.
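Name server addressing combined with a two-part (machine ID, process ID) address can be sketched as follows. The tables `NAME_SERVER` and `MACHINES`, the service names and the IDs are all made up for illustration; dictionaries stand in for the name server and for the machines on the network.

```python
# Name server: maps ASCII-level service names to (machine id, process id).
NAME_SERVER = {
    "file-service": ("machine-7", 42),
    "print-service": ("machine-3", 17),
}

# Each machine maps its local process ids to the handler running there.
MACHINES = {
    "machine-7": {42: lambda msg: f"file-service got {msg}"},
    "machine-3": {17: lambda msg: f"print-service got {msg}"},
}


def send(service_name, message):
    """Resolve a name, then deliver using the two-part (machine, process) address."""
    machine_id, process_id = NAME_SERVER[service_name]   # name server lookup
    machine = MACHINES[machine_id]                       # locate the machine
    process = machine[process_id]                        # locate the process on it
    return process(message)


reply = send("file-service", "open report.txt")
```

The client code contains only the name "file-service"; if the service moves to another machine, only the name server entry has to change.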
Fig: Addressing Techniques
• Client-Server Implementation