
DISTRIBUTED SYSTEMS

Introduction
Dr. P. Amudha
Associate Professor
What is an Operating System
An operating system is:

A collection of software components that


• Provides useful abstractions and
• Manages resources to
• Support application programs, and
• Provide an interface for users and programs
Operating System Functions
An operating system’s main functions are to:

• Schedule processes & multiplex CPU


• Provide mechanisms for IPC and
synchronization
• Manage main memory
• Manage other resources
• Provide convenient persistent storage (files)
• Maintain system integrity, handle failures
• Enforce security policies (e.g., access control)
• Give users and processes an interface
Introduction
• We have smartphones, computers everywhere,
remote storage, massive online games, apps, tons of
data…
• The future of computing is distributed
• Technological innovation will have a profound impact
on the future of jobs

6 Aug 2021
Definition of a Distributed System (1)
A distributed system is (Tanenbaum):

A collection of independent computers that appears to its users as a single
coherent system.

A distributed system is (Lamport):

One in which the failure of a computer you didn't even know existed can
render your own computer unusable.
Definitions
– A system in which hardware and software
components of networked computers
communicate and coordinate their activity
only by passing messages.
– A computing platform built with many
computers that:
– Operate concurrently;
– Are physically distributed (and have their own failure modes);
– Are linked by a network;
– Have independent clocks.

Consequences
• Concurrent execution of processes:
– Non-determinism, race conditions, synchronisation, deadlocks

• No global clock
– Coordination is done by message exchange
– No single global notion of the correct time
• No global state
– No process has knowledge of the current global state of the
system.
• Units may fail independently
– Network faults may isolate computers that are still running
– System failures may not be immediately known

Why do we have distributed
systems?
• People are distributed but need to work together…
• Hardware needs to be physically close to people
(who are distributed)…
• Information is distributed but needs to be shared
(trustworthily)…
• Hardware can be shared (increases computing power
by doing work in parallel; more efficient resource
utilisation)…

Examples of distributed
systems…
• Intranets, the Internet, WWW, email, …
• DNS (Domain Name System)
– Hierarchical distributed database
• Distributed supercomputers, Grid/Cloud computing
• Electronic banking
• Airline reservation systems
• Peer-to-peer networks
• Sensor networks
• Mobile and Pervasive Computing

Evolution
• Parallel computing was a hot topic in the 70s and 80s (the
vision had existed since the 1920s).
– Cluster computers started dominating in the 1990s.
• Early distributed systems:
– Airline reservation systems
– Banking systems
• The real proliferation came with developments in network
technology and the WWW (early 90s)

Definition of a Distributed System (2)

Figure 1-1. A distributed system organized as middleware. The
middleware layer extends over multiple machines, and offers
each application the same interface.
Role of Middleware (MW)
• In some early research systems: MW tried to provide
the illusion that a collection of separate machines
was a single computer.
– E.g. NOW project: GLUNIX middleware
• Today:
– clustering software allows independent computers to work
together closely
– MW also supports seamless access to remote services, but doesn’t try
to look like a general-purpose OS
Middleware Examples
• CORBA (Common Object Request Broker Architecture)
• DCOM (Distributed Component Object Model) – being
replaced by .NET
• Sun’s ONC RPC (Remote Procedure Call)
• RMI (Remote Method Invocation)
• SOAP (Simple Object Access Protocol)
Middleware Examples
• All of the previous examples support communication across a
network:
• They provide protocols that allow a program running on one
kind of computer, using one kind of operating system, to call a
program running on another computer with a different operating
system
– The communicating programs must be running the same middleware.
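The calling mechanism described above can be sketched roughly as follows. This is a minimal illustration, not how CORBA, DCOM, ONC RPC, RMI or SOAP are actually implemented: the names `PROCEDURES`, `server_skeleton` and `ClientStub` are hypothetical, JSON stands in for the wire format, and the "network" is a direct function call.

```python
import json

# Server side: the procedures that remote callers may invoke.
# (Hypothetical example procedures, not part of any real middleware.)
PROCEDURES = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def server_skeleton(request_bytes):
    """Unmarshal a request, run the named procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

class ClientStub:
    """Makes a remote procedure look like a local method call."""

    def __init__(self, transport):
        # transport: any callable that carries request bytes to the server
        # and returns the reply bytes. A real system would use a socket.
        self._transport = transport

    def call(self, proc, *args):
        request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
        reply = self._transport(request)
        return json.loads(reply.decode("utf-8"))["result"]

# In this sketch the "network" is just a direct function call.
stub = ClientStub(transport=server_skeleton)
print(stub.call("add", 2, 3))
print(stub.call("upper", "hello"))
```

Because both sides agree only on the byte format of requests and replies, the client and server could run on different machines and operating systems, which is why both ends must run the same middleware.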
Challenges
• Performance
• Concurrency
• Failures
• Scalability
• System updates/growth
• Heterogeneity
• Openness
• Multiplicity of ownership, authority
• Security
• Quality of service/user experience
• Transparency
• Debugging
Architecture of Distributed System

• A distributed operating system runs on multiple independent computers connected
through a communication network, but appears to its users as a single virtual
machine. Each computer runs its own OS.

• Sharing of resources is the main motivation for distributed systems.

• Resources may be managed by servers and accessed by clients, or they may be
encapsulated as objects and accessed by client objects.
• Each computer node has its own memory.
Characteristics of Distributed Systems
 
A distributed system has the following characteristics:
• It consists of several independent computers connected through a
communication network.
• The computers communicate with each other by exchanging messages
over that network.
• Each computer has its own memory and clock, and runs its own operating
system.
• Each computer has its own resources, called local resources.
• Remote resources are accessed through the network.
Motivation
The prime motivation of distributed systems is to share resources.

A resource is an entity that can be usefully shared among users.

Any hardware or software entity is a resource.

Resources are managed by a service.

A service is managed by one or more servers, which provide access to a set of resources
to clients via a set of well-defined operations (an interface).

The motivation behind the development of distributed systems was:

• Users' desire for computational power at low cost.
• The need for people working in a group to communicate with each other.
• Sharing of information (data).
• Sharing of expensive computer resources.


 
Advantages of Distributed Systems
 
Resource sharing: Because the connected computers can communicate,
resources can be shared among them.

Enhanced performance: Many tasks can be executed concurrently on different
computers, and distributing the load among computers can further improve
response time.

Improved reliability and availability: If a few computers fail, others are
still available, so the system continues to operate.

Modular expandability: New hardware and software resources can be added
without replacing the existing resources.
 
 
Inherent Limitations of Distributed Systems
 
• The lack of common memory and of a system-wide common clock is an inherent
problem in distributed systems.
• Without shared memory, up-to-date information about the state of the
system is not available to every process via a simple memory lookup.
• The state information must therefore be collected through communication.
• In the absence of global time, it becomes difficult to talk about the temporal
order of events.
• The combination of unpredictable communication delays and the lack of
global time in a distributed system makes it difficult to know how up-to-date
collected state information really is.
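A small simulation can make the last point concrete. It is only a sketch under assumed values: the load timeline `true_load`, the tick-based notion of time and the function `collected_view` are all hypothetical.

```python
# Hypothetical load of one node at each tick of "true" time.
# No process in a real distributed system can observe this timeline directly.
true_load = [3, 5, 9, 4, 1, 7]

def collected_view(now, delay):
    """Load value the collector holds at time `now`.

    The node sends its load every tick and each report needs `delay`
    ticks to arrive, so the newest report available was sent at
    time now - delay. Returns None if no report has arrived yet.
    """
    sent_at = now - delay
    if sent_at < 0:
        return None
    return true_load[sent_at]

now = 5
for delay in (0, 2, 4):
    print(f"delay={delay}: collector sees {collected_view(now, delay)}, "
          f"actual load is {true_load[now]}")
```

With a delay of 2 or 4 ticks the collector holds a value that no longer matches the node's actual load, and because the delay is unpredictable it cannot tell how old that value is.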
GOALS
• Connecting users and Resources

• Transparency

• Openness

• Scalability
Connecting users and Resources

• Make it easy for users to access remote resources and share them with
other users.
• Resources include, e.g., printers, computers, data, files, web pages,
networks, and storage facilities.
• Security is important, because many systems provide little
protection against intrusion on communication.
• For example, passwords are often sent in clear text through the network,
in the hope that the server is trustworthy.
Distribution Transparency
• Software hides some of the details of the
distribution of system resources.
– Makes the system more user friendly.
• A distributed system that appears to its users
& applications to be a single computer system
is said to be transparent.
– Users & apps should be able to access remote resources in the
same way they access local resources.
• Transparency has several dimensions.
Transparency in a Distributed System
• Transparency – A distributed system that is able to present itself to users
and applications as if it were only a single computer is said to be transparent.

Figure 1-2. Different forms of transparency in a distributed system


Other forms:
Persistence – Hide whether a resource is in volatile memory or on disk
Parallelism – Hide the number of nodes working on a task
Size – Hide the number of components in the system
Revision – Hide changes in software/hardware versions
Openness
• An open distributed system is a system that offers services
according to standard rules that describe the syntax and semantics of those services.

The syntax of those services is specified through interfaces, which are
described in an Interface Definition Language (IDL).

Interoperability, portability and flexibility are important.

Interoperability – two implementations of systems or components
from different manufacturers can work together, each relying on the other's
services.

Portability – an application developed for one system can be
executed without modification on a different distributed system.
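The role of an interface can be illustrated with a short sketch. Python is not an IDL, so this is only an analogy: the abstract class `NameService` plays the part of the interface definition, and `VendorATable`, `VendorBList` and the address used are hypothetical.

```python
from abc import ABC, abstractmethod

class NameService(ABC):
    """Hypothetical interface: fixes operation names, parameters and results."""

    @abstractmethod
    def lookup(self, name: str) -> str:
        """Return the address registered for `name`."""

class VendorATable(NameService):
    """One implementation, backed by a dictionary."""

    def __init__(self):
        self._table = {"printer": "192.0.2.7"}

    def lookup(self, name: str) -> str:
        return self._table[name]

class VendorBList(NameService):
    """An independent implementation, backed by a list of pairs."""

    def __init__(self):
        self._entries = [("printer", "192.0.2.7")]

    def lookup(self, name: str) -> str:
        for key, address in self._entries:
            if key == name:
                return address
        raise KeyError(name)

def client(service: NameService) -> str:
    # The client is written against the interface only.
    return service.lookup("printer")

print(client(VendorATable()) == client(VendorBList()))
```

The client code runs unchanged against either implementation, which is the kind of portability that a well-specified interface is meant to allow.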
Scalability
The scalability of a system can be measured along three different
dimensions:

• Size – more users and resources can easily be added.

• Geography – the system remains usable when users and resources are far apart.

• Administration – the system remains easy to manage even if it spans many
independent administrative organizations.


Scalability Problems
• When a system needs to scale, different types of problems have to be solved.
• When more users or resources have to be supported, we are confronted with the limitations of centralized services, data, and algorithms.
• E.g., if a service is implemented by a single server running on a specific machine, the service is centralized.
• That server becomes a bottleneck as the number of users grows.

Figure 1-3. Examples of scalability limitations.


Contd.,

• Sometimes using a single server is unavoidable.

• E.g., a service that maintains confidential information such as medical records, bank
accounts, or personal loans.

• Such data may have to be kept on a single server in a secured room.

• Copying the server to several locations would make it more vulnerable to security
attacks.

• Centralized data is just as bad as centralized services.

• E.g., tracking the telephone numbers and addresses of 50 million people.

• Storage is not the problem: a 2.5-gigabyte disk (about 50 bytes per person) can hold the data.

• But a single database would saturate all the communication lines into and out of it.


Contd.,
• If DNS were maintained in a single table, the Internet would be hard to use.

• DNS maintains information on millions of computers and is an
essential service for locating web servers.

• If each request to resolve a URL had to be forwarded to the same DNS server,
no one could use the Web.

• The same holds for centralized algorithms: in a large distributed system, a huge number of messages
have to be routed over many lines.

• In theory, optimal routes could be computed by collecting information about the loads on all machines and running
a graph theory algorithm on it.

• But collecting and transporting all that information would again overload the network,
which must be avoided.

• Hence decentralized algorithms must be used.


Scalability Problems
Characteristics of decentralized algorithms:
• No machine has complete information about the system
state.
• Machines make decisions based only on local information.
• Failure of one machine does not ruin the algorithm.
• There is no implicit assumption that a global clock exists
(less obvious, but important).
E.g., a rule such as "at 12:00:00 all machines note the size of their output queue"
will fail, because it is impossible to get all the clocks
exactly synchronized.
Algorithms should take the lack of exact clock
synchronization into account.
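The 12:00:00 example can be worked through with assumed numbers. In this sketch the machine names and the clock offsets in `clock_offset` are hypothetical; each local clock is modelled as true time plus a fixed offset.

```python
# Hypothetical clock offsets in seconds: local clock = true time + offset.
clock_offset = {"A": 0.0, "B": 2.0, "C": -1.5}

def true_time_of_sample(machine, local_deadline):
    """True time at which `machine` believes `local_deadline` has arrived."""
    return local_deadline - clock_offset[machine]

deadline = 43200.0  # 12:00:00 expressed as seconds since midnight
samples = {m: true_time_of_sample(m, deadline) for m in clock_offset}
spread = max(samples.values()) - min(samples.values())

print(samples)
print(f"samples are spread over {spread} seconds of true time")
```

Each machine records its queue size at a different real moment, so the combined result is not a consistent snapshot of the system at any single instant.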
Scalability Examples
Distributed systems are necessary for:
• Web search
• Financial transactions
• Multiplayer games
• DNS
• Travel reservation systems
• Utility infrastructure (e.g., power grid)
• Embedded systems (e.g., cars)
• Sensor networks
Contd.,
• Geographical scalability has its own problems.

• One reason it is hard to scale existing distributed systems that were designed for LANs is that they are based
on synchronous communication.

• A party requesting a service (the client) blocks until a reply is sent back. This works well in a
LAN, where communication between two machines takes at most a few
hundred microseconds and is highly reliable.

• In a WAN, IPC may take hundreds of milliseconds and, moreover,
communication is unreliable.
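A back-of-the-envelope calculation shows why blocking communication that is tolerable in a LAN becomes a problem in a WAN. The request count and the two round-trip times below are assumed values, chosen only to match "a few hundred microseconds" and "hundreds of milliseconds".

```python
def total_wait_us(requests, round_trip_us):
    """Total blocking time for `requests` synchronous calls made one after another."""
    return requests * round_trip_us

REQUESTS = 100               # assumed number of sequential requests
LAN_ROUND_TRIP_US = 300      # assumed: a few hundred microseconds
WAN_ROUND_TRIP_US = 200_000  # assumed: hundreds of milliseconds

lan = total_wait_us(REQUESTS, LAN_ROUND_TRIP_US)
wan = total_wait_us(REQUESTS, WAN_ROUND_TRIP_US)

print(f"LAN: {lan / 1000} ms blocked")
print(f"WAN: {wan / 1_000_000} s blocked")
```

Under these assumptions the same application spends 30 ms waiting in the LAN but 20 s in the WAN, before any lost messages are taken into account.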
Contd.,
Scalability - Administrative
•Different domains may have different policies about resource
usage, management, security, etc.
• Trust often stops at administrative boundaries
– Requires protection from malicious attacks
Scalability Techniques
Three techniques for scaling:
1. Hiding communication latencies
2. Distribution
3. Replication
Hiding communication latencies
• Applicable to geographical scalability.
• Try to avoid waiting for responses to remote service requests.
• Construct the requesting application so that it uses only asynchronous communication (e.g.,
after requesting a service from a remote machine, do other useful work on the requester's side
as an alternative to waiting for the reply).
• When the reply comes in, the application is interrupted and a special handler is called to complete the
previously issued request.
• But there are many applications that cannot make effective use of asynchronous
communication.
• E.g., in an interactive application, a user who sends a request will have nothing better to
do than wait for the reply.
• In such cases it is better to reduce the overall communication.
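The asynchronous style can be sketched with Python's `asyncio`. This is an illustration under assumptions: `remote_service` is a hypothetical stand-in whose `asyncio.sleep` models network latency, and the reply is collected with `await` rather than through an interrupt-driven handler.

```python
import asyncio

async def remote_service(latency_s):
    """Stand-in for a remote request; the sleep models network latency."""
    await asyncio.sleep(latency_s)
    return "reply"

async def requester():
    events = []
    # Issue the request without waiting for the answer.
    pending = asyncio.create_task(remote_service(0.05))
    await asyncio.sleep(0)  # yield once so the request actually starts
    events.append("request issued")

    # Do other useful work on the requester's side in the meantime.
    checksum = sum(range(10_000))
    events.append("local work done")

    # Only now collect the reply and handle it.
    reply = await pending
    events.append("handled " + reply)
    return events

print(asyncio.run(requester()))
```

The local work finishes while the request is still in flight, so the latency is overlapped with useful computation instead of being spent idle.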
Scaling Techniques (1)

Figure 1-4. The difference between letting (a) a server or (b) a client check forms as they
are being filled.

• Move part of the computation from the server to the client.

• When accessing a database through forms, filling in a form can be done by sending a separate message for
each field to the server and waiting for an acknowledgment.
• Normally the server checks for syntactic errors before accepting an entry.
• Instead, ship the code for filling in the form to the client, and have the client return a completed
form.
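The saving in messages can be counted in a small sketch. The three-field form, the syntax rules in `RULES` and both function names are hypothetical, and only request messages are counted (acknowledgments are ignored).

```python
import re

# Hypothetical per-field syntax rules for a three-field form.
RULES = {
    "name": re.compile(r"[A-Za-z ]+"),
    "zip": re.compile(r"\d{5}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+"),
}

def field_ok(field, value):
    return RULES[field].fullmatch(value) is not None

def server_side_checking(form):
    """One message per field goes to the server, which checks and acknowledges it."""
    messages = 0
    for field, value in form.items():
        messages += 1
        if not field_ok(field, value):
            return messages, False
    return messages, True

def client_side_checking(form):
    """The client checks every field itself and sends only the completed form."""
    if not all(field_ok(f, v) for f, v in form.items()):
        return 0, False  # nothing is sent until the form is valid
    return 1, True

form = {"name": "Ada Lovelace", "zip": "12345", "email": "ada@example.org"}
print(server_side_checking(form))
print(client_side_checking(form))
```

The number of messages in the server-side variant grows with the number of fields, while the client-side variant needs a single message regardless of form size.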
Scaling Techniques (2) Distribution

Figure 1-5. An example of dividing the DNS name space into zones.

• Distribution splits a component into smaller parts and spreads those parts across the system. (E.g., the DNS name space is
hierarchically organized into a tree of domains, which are divided into non-overlapping zones.)
• Names in each zone are handled by a single name server.
• Resolving a name returns the network address of the associated host.
• E.g., to resolve the name nl.vu.cs.flits, it is first passed to the server of zone Z1.
• That server returns the address of the server for zone Z2, to which the rest of the name, vu.cs.flits, can be handed.

• The server for Z2 returns the address of the server for zone Z3, which is capable of handling the last
part of the name and returns the address of the associated host.
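The resolution steps for nl.vu.cs.flits can be traced in code. This is a sketch, not real DNS: each zone server is modelled as a dictionary in the hypothetical table `ZONES`, and the host address is a made-up documentation address.

```python
# Each zone server maps part of a name either to the next zone's server
# or to a host address. Zone contents and the address are hypothetical.
ZONES = {
    "Z1": {"nl": ("zone", "Z2")},
    "Z2": {"vu": ("zone", "Z3")},
    "Z3": {"cs.flits": ("host", "192.0.2.10")},
}

def resolve(name, zone="Z1"):
    """Resolve `name` step by step; return (address, zones visited)."""
    visited = []
    remaining = name
    while True:
        visited.append(zone)
        table = ZONES[zone]
        if remaining in table:
            # This server can handle the whole remaining name.
            key, rest = remaining, ""
        else:
            # Otherwise it handles only the first label.
            key, _, rest = remaining.partition(".")
        kind, value = table[key]
        remaining = rest
        if kind == "host":
            return value, visited
        zone = value  # continue at the server of the next zone

print(resolve("nl.vu.cs.flits"))
```

No single server holds the whole name space: each one answers only for its own zone and refers the rest of the name onward, which is what spreads the load.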
Scaling Technique 3 - Replication
• It is generally a good idea to replicate components across a distributed system.

• Replication increases availability and also helps to balance the load
between components, leading to better performance.

• Caching is a special form of replication: it makes a copy of a
resource so that clients can access it, generally close to the client.

• Disadvantage: with multiple copies, modifying one copy
makes that copy different from the others.

• Hence caching and replication lead to consistency problems.

• Updates must be propagated to the other copies as well.

• If two updates happen concurrently, every copy must apply them
in the same order.
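Why the order of updates matters can be seen with two operations that do not commute. The starting value and the two update functions below are hypothetical examples.

```python
def deposit_100(balance):
    return balance + 100

def double(balance):
    return balance * 2

def apply_updates(initial, updates):
    """Apply the updates to one replica, in the order given."""
    value = initial
    for update in updates:
        value = update(value)
    return value

# Two replicas receive the same two concurrent updates in different orders.
replica_1 = apply_updates(1000, [deposit_100, double])
replica_2 = apply_updates(1000, [double, deposit_100])
print(replica_1, replica_2)

# With an agreed order, both replicas end in the same state.
agreed_order = [deposit_100, double]
print(apply_updates(1000, agreed_order) == apply_updates(1000, agreed_order))
```

The replicas diverge (2200 versus 2100) even though both received every update, so propagating updates is not enough without agreement on their order.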
Pitfalls when Developing
Distributed Systems
False assumptions made by first-time developers:
• The network is reliable.
• The network is secure.
• The network is homogeneous.
• The topology does not change.
• Latency is zero.
• Bandwidth is infinite.
• Transport cost is zero.
• There is one administrator.
