Introduction to Distributed Systems and Principles of Distributed Systems

1 Introduction to Distributed System
1.1 Introduction
1.1.1 Goal of Distributed System
1.1.2 Types of Distributed System
1.1.3 Characterization of DS
1.2 Architecture
1.3 Naming
1.4 Process
1.5 Communication
1.6 Synchronization
1.7 Fault Tolerance
Faizur Rashid (PhD)
1.1. Introduction and Definition
• before the mid-80s, computers were
 very expensive (hundreds of thousands or even millions of dollars)
 very slow (a few thousand instructions per second)
 not connected among themselves
• after the mid-80s: two major developments
 cheap and powerful microprocessor-based computers appeared
 computer networks
 LANs at speeds ranging from 10 to 1000 Mbps (now even 10 Gbps)
 WANs at speeds ranging from 64 Kbps to gigabits/sec
• Consequence
 it became feasible to use a large network of computers to
work on the same application; this is in contrast to the
old centralized systems where there was a single
computer with its peripherals
• Definition of a Distributed System
 distributed system: a collection of independent computers that
appears to its users as a single coherent system, i.e., a single computer
(Tanenbaum & Van Steen)
 this definition has two aspects:
1. hardware: autonomous machines
2. software: a single system view for the users
Another Definition
• A distributed system is a system designed to support the
development of applications and services which can
exploit a physical architecture consisting of multiple,
autonomous processing elements that do not share
primary memory but cooperate by sending
asynchronous messages over a communication network
(Blair & Stefani)
• Why Distributed?
• Resource and Data Sharing
 printers, databases, multimedia servers, ...
• Availability, Reliability
 the loss of some instances can be hidden
• Scalability, Extensibility
 the system grows with demand (e.g., extra servers)
• Performance
 huge power (CPU, memory, ...) available
• Inherent distribution, communication
 organizational distribution, e-mail, video
• Characteristics of Distributed Systems
• differences between the computers and the ways they
communicate are hidden from users
• users and applications can interact with a distributed
system in a consistent and uniform way regardless of
location
• distributed systems should be easy to expand and scale
• a distributed system is normally continuously available,
even if there may be partial failures
1.1.1. Goals of a Distributed System
• to support heterogeneous computers and networks and to provide a
single-system view, a distributed system is often organized by
means of a layer of software called middleware that extends over
multiple machines
• a distributed system organized as middleware; note that
the middleware layer extends over multiple machines
and offers each application the same interface
• Note: most diagrams in these slides are taken from the textbook
• a distributed system should
 easily connect users with resources (printers, computers, storage
facilities, data, files, Web pages, ...)
 Some of the reasons
– economics: sharing resources such as printers and high-speed
computers
– to collaborate and exchange information
– groupware: software for collaborative editing, teleconferencing, etc.
– e-commerce: buying and selling goods
 be transparent: hide the fact that the resources and processes are
distributed across multiple computers
 be open
 be scalable
 Transparency in a Distributed System
 a distributed system that is able to present itself to users and applications as if it were only a
single computer system is said to be transparent
• Different forms of transparency in a distributed system
(transparency: description)
• Access: hide differences in data representation (endianness, file
naming, ...) and how a resource is accessed
• Location: hide where a resource is physically located; e.g., where is
http://www.prenhall.com/index.html? (naming)
• Migration: hide that a resource may move to another location
• Relocation: hide that a resource may be moved to another
location while in use; e.g., mobile users using their
wireless laptops while moving from place to place
• Replication: hide that a resource is replicated (for availability
and performance); all replicas have the same name
• Concurrency: hide that a resource may be shared by several
competing users; a resource must be left in a
consistent state, e.g., through locking
• Failure: hide the failure and recovery of a resource
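The endianness issue mentioned under access transparency can be made concrete with a small sketch. The value below is an arbitrary example, not taken from the slides; the point is only that the same four bytes decode to different numbers depending on the byte order assumed by the receiver, which is what middleware has to hide.

```python
import struct

value = 0x12345678  # arbitrary 32-bit integer chosen for illustration

big = struct.pack(">I", value)     # big-endian (network) byte order
little = struct.pack("<I", value)  # little-endian byte order

# Decoding with the wrong byte order silently yields a different number.
right = struct.unpack(">I", big)[0]
wrong = struct.unpack("<I", big)[0]

print(big.hex(), little.hex(), hex(right), hex(wrong))
```

A common design choice is to agree on one byte order for all messages, so that each machine converts only at its own boundary.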
• Openness in a Distributed System
• a distributed system should be open
• we need well-defined interfaces
• Interoperability
 components of different origin can communicate.
• portability
 components work on different platforms
• another goal of an open distributed system is that it should be
flexible and extensible: easy to configure the system out of different
components, and easy to add new components or replace existing ones
• an open distributed system is a system that offers services according
to standard rules that describe the syntax and semantics of those
services
• Scalability in Distributed Systems
• a distributed system should be scalable; there are three dimensions
 size: adding more users and resources to the system
 geographically: users and resources may be far apart
 administratively: should be easy to manage even if it spans many
administrative organizations
• Examples of scalability limitations (concept: example)
• Centralized services: a single server for all users, mostly for
security reasons
• Centralized data: a single on-line telephone book
• Centralized algorithms: doing routing based on complete
information
• Scaling Techniques: how to solve scaling problems
 the problem is mainly performance, and arises as a result of
limitations in the capacity of servers and networks (for
geographical scalability with high latency and mostly unreliable
links)
 three possible solutions: hiding communication latencies,
distribution, and replication
 How can scaling problems be solved?
 How can the performance of a scaling technique (latency) be improved?
1.1.2. Types of Distributed System
Three types: distributed computing systems, distributed information
systems, and distributed pervasive/embedded systems
1. Distributed Computing Systems
- used for high-performance computing tasks
- two types: cluster computing and grid computing
- Cluster Computing
 a collection of similar workstations or PCs (homogeneous),
closely connected by means of a high-speed LAN
 each node runs the same operating system
 used for parallel programming in which a single compute intensive
program is run in parallel on multiple machines
• an example of a cluster computing system
• a master node runs a middleware (containing libraries for parallel
programs) and controls other compute nodes;
• it allocates tasks
• provides an interface to users etc.
Grid Computing
• “Resource sharing and coordinated problem solving in dynamic,
multi-institutional virtual organizations” (Ian Foster)
• high degree of heterogeneity: no assumptions are made concerning
hardware, operating systems, networks, administrative domains,
security policies, etc.
• Globus is a software system for Grid Computing; read about the
Globus Alliance at http://www.globus.org/
2. Distributed Information Systems
• many networked applications
• Problem: interoperability
• at the lowest level: wrap a number of requests into a single larger
request and have it executed as a distributed transaction; all or
none of the requests would be executed
• how to let applications communicate directly with each other, i.e.,
Enterprise Application Integration (EAI)
• The Transaction Model
• the model for transactions comes from the world of business
• a supplier and a retailer negotiate on
 Price
 delivery date
 quality etc.
• until the deal is concluded they can continue negotiating
or one of them can terminate
• but once they have reached an agreement they are
bound by law to carry out their part of the deal
• transactions between processes are similar to this scenario
• e.g., assume the following banking operation
 withdraw an amount x from account 1
 deposit the amount x to account 2
• what happens if there is a problem after the first activity is carried
out?
• group the two operations into one transaction; either both are
carried out or neither is
• we need a way to roll back when a transaction is not completed
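The banking example can be sketched with Python's built-in sqlite3 module, used here only as a convenient local stand-in for a transactional store. The table name, account numbers, and balances are made up for illustration. The `with conn:` block commits if both updates succeed and rolls back if an exception is raised in between.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one all-or-nothing unit."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            (balance,) = conn.execute("SELECT balance FROM account WHERE id = ?",
                                      (src,)).fetchone()
            if balance < 0:
                # Problem after the first activity: abort the whole transaction.
                raise ValueError("insufficient funds")
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))

# Hypothetical accounts 1 and 2 with made-up starting balances.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # both steps are applied
failed = transfer(conn, 1, 2, 500)  # the withdrawal is undone by the rollback
print(ok, failed, balances(conn))
```

This is a single-machine sketch; a distributed transaction additionally needs a commit protocol among the participating servers.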
1.2. Architecture
1.2.1 Architectural Styles
• refers to the logical organization of distributed systems
into software components
• a component is a modular unit with well-defined
required and provided interfaces; it is replaceable
within its environment, provided that we
respect its interfaces
• a connector is a mechanism that mediates
communication, coordination, or cooperation among
components, e.g., facilities for RPC, message passing, or
streaming multimedia data
• there are various architectural styles
 Layered architectures
 Object-based architectures
 Data-centered architectures
 Event-based architectures
• Layered architectures
• components are organized in a layered fashion where a component at layer
Li is allowed to call components at the underlying layer Li-1, but not the
other way around;
• requests go down the hierarchy and results flow upward
• e.g., network layers
the layered architectural style
• Object-based architectures
• each object corresponds to a component and these components are
connected through a remote procedure call mechanism (matches the client-
server paradigm)
the object-based architectural style

• Data-centered architectures
• processes communicate through a common repository; e.g., a shared
distributed file system
• Event-based architectures
• processes communicate through the propagation of events (which can
optionally carry data), e.g., publish/subscribe systems
• processes publish events and the middleware ensures that only those
processes that subscribed to those events will receive them
• processes are loosely coupled; there is no need to refer to
each other explicitly
the event-based architectural style
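A minimal sketch of the publish/subscribe idea follows. The `EventBus` class and the event names are hypothetical and run inside one process, standing in for real middleware. Publishers and subscribers only know the bus and an event type, never each other.

```python
from collections import defaultdict

class EventBus:
    """Hypothetical in-process stand-in for publish/subscribe middleware."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # event type -> list of callbacks

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, data=None):
        # Only callbacks registered for this event type are invoked.
        for callback in self._subscribers.get(event_type, []):
            callback(data)

bus = EventBus()
received = []
bus.subscribe("temperature", lambda d: received.append(("logger", d)))
bus.subscribe("temperature", lambda d: received.append(("alarm", d)))
bus.subscribe("humidity", lambda d: received.append(("humidity", d)))

bus.publish("temperature", 31.5)  # the humidity subscriber is not notified
print(received)
```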
1.2.2. Centralized Architectures
• a server may sometimes act as a client leading to a
physically three-tiered architecture; an example is the
organization of Web sites
three tiered architecture: an example of a server acting as a client
1.3 Naming
1.3.1 Name, Identifier, Address

1.3.2 Flat Naming
1.3.3 Structured Naming
1.3.1. Introduction
• names play an important role to:
 share resources
 uniquely identify entities
 refer to locations
 etc.
• an important issue is that a name can be resolved to the entity it
refers to
• to resolve names, it is necessary to implement a naming system
• in a distributed system, the implementation of a naming system is
itself often distributed.
• Efficiency and scalability of the naming system are the main issues
1.3.1 Names, Identifiers, and Addresses
• a name in a distributed system is a string of bits or
characters that is used to refer to an entity
• an entity is anything; e.g., resources such as hosts,
printers, disks, files, objects, processes, users, Web
pages, newsgroups, mailboxes, network connections, ...
• entities can be operated on
 e.g., a resource such as a printer offers an interface
containing operations for printing a document,
requesting the status of a job, etc.
 a network connection may provide operations for
sending and receiving data, setting quality of service
parameters, etc.
• to operate on an entity, it is necessary to access it through its
access point, which is itself a (special) entity
Access point
• the name of an access point is called an address (such as an
IP address and port number, as used by the transport
layer)
• the address of the access point of an entity is also
referred to as the address of the entity
• an entity can have more than one access point (similar to
accessing an individual through different telephone
numbers)
• an entity may change its access point in the course of
time (e.g., a mobile computer getting a new IP address as
it moves)
Examples
• name of an FTP server (entity)
 URL of the FTP server
• address of the FTP server
 IP address:port number
• the address of the FTP server may change
• there are three classes of naming systems: flat naming, structured
naming, and attribute-based naming
1.3.2 Flat Naming
• a name is a sequence of characters without structure;
like human names? maybe, if it is not an Ethiopian name
• difficult to be used in a large system since it must be
centrally controlled to avoid duplication
• moreover, it does not contain any information on how to
locate the access point of its associated entity
• how are flat names resolved (or how can an entity be located
when a flat name is given)?
 name resolution: mapping a name to an address or an
address to a name is called name-address resolution
 possible solutions: simple solutions, home-based
approaches, and hierarchical approaches
1. Simple Solutions
 two solutions (for LANs only): broadcasting and multicasting,
and forwarding pointers
a. Broadcasting and Multicasting
 broadcast a message containing the identifier of an entity; only
machines that can offer an access point for the entity send a
reply
 e.g., ARP (Address Resolution Protocol) in the Internet to find the
data link address (MAC address) of a machine
 a computer that wants to reach another computer whose IP address
it knows broadcasts this IP address; the machine that owns the
address replies with its MAC address
b. Forwarding Pointers
 how to look for mobile entities
 when an entity moves from A to B, it leaves
behind a reference to its new location
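Forwarding pointers can be sketched as a chain of references that a lookup follows hop by hop. The `Location` class and the host names A, B, and C are hypothetical. The sketch also shows why long chains cost performance (one hop per move) and why a lost pointer breaks the lookup.

```python
class Location:
    """Hypothetical host that either holds the entity or points to where it went."""

    def __init__(self, name):
        self.name = name
        self.has_entity = False
        self.forward_to = None  # forwarding pointer left behind after a move

def move(src, dst):
    """Move the entity from src to dst, leaving a forwarding pointer at src."""
    src.has_entity = False
    src.forward_to = dst
    dst.has_entity = True
    dst.forward_to = None

def locate(start):
    """Follow the chain of pointers; return (location, number of hops)."""
    current, hops = start, 0
    while not current.has_entity:
        if current.forward_to is None:
            raise LookupError("broken chain at " + current.name)
        current = current.forward_to
        hops += 1
    return current, hops

a, b, c = Location("A"), Location("B"), Location("C")
a.has_entity = True
move(a, b)
move(b, c)
found, hops = locate(a)  # a client that still only knows location A
print(found.name, hops)
```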
2. Home-Based Approaches
• Broadcasting and multicasting have scalability problems;
performance and broken links are problems in forwarding pointers
• a home location keeps track of the current location of an entity;
often it is the place where an entity was created
problems:
• creates communication latency (triangle routing: correspondent,
home network, mobile)
• the home location must always exist
• the host is unreachable if the home no longer exists
(it has changed permanently)
• the solution is to register the home at a traditional name service and
let a client first look up the location of the home
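The two-step lookup described above (first find the home, then ask the home for the current address) might look like the sketch below. The `HomeAgent` class, the entity name, and the addresses are all made up for illustration, and a plain dictionary stands in for the traditional name service.

```python
class HomeAgent:
    """Hypothetical home location that records each entity's current address."""

    def __init__(self):
        self._current = {}

    def register(self, entity, address):
        self._current[entity] = address  # overwrite on every move

    def lookup(self, entity):
        return self._current[entity]

name_service = {}  # entity -> its home agent (stand-in for a name service)
home = HomeAgent()
name_service["laptop-42"] = home

home.register("laptop-42", "10.0.0.7")      # made-up initial address
home.register("laptop-42", "192.168.5.20")  # the laptop moved to another network

def locate(entity):
    # Step 1: find the home through the name service.
    # Step 2: ask the home for the entity's current address.
    return name_service[entity].lookup(entity)

print(locate("laptop-42"))
```

Because the home is itself looked up by name, it can be replaced in `name_service` if it ever has to move.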
3. Hierarchical Approaches
• a generalization of the two-tiered approach into multiple layers
• a network is divided into a collection of domains, similar to DNS
• a single top-level domain spans the entire network
• each domain can be subdivided into multiple, smaller domains
• the lowest-level domain is called a leaf domain; typically a LAN
• each domain D has an associated directory node dir(D) that keeps
track of the entities in that domain leading to a tree of directory
nodes
• the root (directory) node knows about all entities
• hierarchical organization of a location service into domains, each
having an associated directory node
• each entity is represented by a location record in the directory node
dir(D) to keep track of its whereabouts
• a location record for an entity in a leaf domain contains the entity’s
current address; all other high-level domains will have only pointers
to this address; this means the root node will store only pointers to
all entities
• an entity may have multiple addresses, for instance, if it is
replicated; a higher level domain containing the two sub domains
where the entity has addresses will have two pointers
• an example of storing information of an entity having two addresses
in different leaf domains D1 and D2
• example of a look up operation
 a client (in Domain D) would like to locate an entity E
looking up a location in a hierarchically organized location service
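The lookup can be sketched as follows: the request climbs from the client's leaf domain until it reaches a directory node that has a record for the entity, then follows pointers down to the leaf that stores the address. The `DirNode` class, the domain names, and the address string are hypothetical, and the sketch stores only one address per entity for brevity.

```python
class DirNode:
    """Hypothetical directory node dir(D) for a domain D."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.pointers = {}   # entity -> child DirNode (higher-level domains)
        self.addresses = {}  # entity -> address (leaf domains only)

def insert(leaf, entity, address):
    """Store the address at the leaf and a chain of pointers up to the root."""
    leaf.addresses[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        node.pointers[entity] = child
        child, node = node, node.parent

def lookup(start, entity):
    """Go up until some node knows the entity, then follow pointers down."""
    node, path = start, [start.name]
    while entity not in node.pointers and entity not in node.addresses:
        if node.parent is None:
            raise LookupError(entity)
        node = node.parent
        path.append(node.name)
    while entity not in node.addresses:
        node = node.pointers[entity]
        path.append(node.name)
    return node.addresses[entity], path

root = DirNode("root")
m = DirNode("M", root)
d1, d2 = DirNode("D1", m), DirNode("D2", m)
d3 = DirNode("D3", root)
insert(d1, "E", "addr-in-D1")

near = lookup(d2, "E")  # resolved without involving the root
far = lookup(d3, "E")   # has to climb to the root first
print(near, far)
```

The first call illustrates locality: a client close to the entity never contacts the root.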

1.3.3 Structured Naming
• flat names are not convenient for humans
• Name Spaces
• names are organized into a name space
 each name is made of several parts; the first may define the
nature of the organization, the second the name, the third
departments, …
 the authority to assign and control the name spaces can be
decentralized where a central authority assigns only the first two
parts
• a name space is generally organized as a labeled, directed graph
with two types of nodes
 leaf node: represents the named entity and stores information
such as its address or the state of that entity
 directory node: a special entity that has a number of outgoing
edges, each labeled with a name
• each node in a naming graph is considered as another entity with
an identifier
a general naming graph with a single root node, n0
• a directory node stores a table in which an outgoing edge is represented as
a pair (edge label, node identifier), called a directory table
• each path in a naming graph can be referred to by the sequence of labels
corresponding to the edges of the path and the first node in the path, such
as N:<label-1, label-2, ..., label-n>, where N refers to the first node in the path
• such a sequence is called a pathname
• if the first node is the root of the naming graph, it is called an
absolute path name; otherwise it is a relative path name
• instead of the path name n0:<home, steen, mbox>, we often use its
string representation /home/steen/mbox
• there may also be several paths leading to the same node, e.g., node
n5 can be represented as /keys or /home/steen/keys
• although the above naming graph is a directed acyclic graph (a node
can have more than one incoming edge but is not permitted to have
a cycle), the common way is to use a tree (hierarchical) with a single
root (as is used in file systems)
 in a tree structure, each node except the root has exactly one
incoming edge; the root has no incoming edges
 each node also has exactly one associated (absolute) path name
• Symbolic link: representing an entity by a leaf node and
instead of storing the address or state of the entity, the
node stores an absolute path name
the concept of a symbolic link explained in a naming graph

• when resolving an absolute path name that leads to such a node (e.g.,
/home/steen/keys leading to node n6), name resolution returns the path
name stored in the node (/keys), at which point it can continue with
resolving that new path name (knowing where and how to start name
resolution is called a closure mechanism)
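Path name resolution with a symbolic link can be sketched as follows. Directory nodes are modeled as dictionaries (a directory table mapping edge labels to nodes), and the node names n0, n1, n4, n5, and n6 follow the naming graph discussed above. The `SymLink` class, the leaf contents, and the `max_links` limit are assumptions added for illustration.

```python
class SymLink:
    """Leaf node that stores an absolute path name instead of data."""

    def __init__(self, target):
        self.target = target

n5 = "contents of keys"          # leaf node (made-up payload)
n6 = SymLink("/keys")            # symbolic link to n5
n4 = {"mbox": "contents of mbox", "keys": n6}
n1 = {"steen": n4}
n0 = {"home": n1, "keys": n5}    # root of the naming graph

def resolve(path, root=n0, max_links=8):
    """Resolve an absolute path name, following symbolic links."""
    node = root
    for label in path.strip("/").split("/"):
        node = node[label]       # look the label up in the directory table
        if isinstance(node, SymLink):
            if max_links == 0:
                raise RuntimeError("too many symbolic links")
            # Restart resolution at the root with the stored path name.
            node = resolve(node.target, root, max_links - 1)
    return node

print(resolve("/home/steen/mbox"))
print(resolve("/home/steen/keys"))
```

The link limit guards against links that point at each other, which would otherwise make resolution loop forever.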
1.4. Process (Threads)
• communication takes place between processes; a
process is a program in execution
• from an OS perspective, management and scheduling of
processes are important
• other important issues arise in distributed systems
 Multithreading to enhance performance by overlapping
communication and local processing
 how are clients and servers organized and server design issues
 process or code migration for enhancing performance, reducing
communication, to exploit parallelism, and to dynamically
configure clients and servers
 Software agents that perform a task through cooperation and
agent technology
• Threads and their Implementation
• how are processes and threads related?
• Process tables or PCBs are used to keep track of
processes
• there are usually many processes executing
concurrently
• processes should not interfere with each other; sharing
resources by processes is transparent
• this concurrency transparency has a high price;
allocating resources for a new process and context
switching take time
• a thread also executes independently of other
threads, but without a high degree of concurrency
transparency, thereby resulting in better performance
• threads can be used in both distributed and non-
distributed systems
• Threads in Non-distributed Systems
 a process has an address space (containing program text and
data) and a single thread of control, as well as other resources
such as open files, child processes, accounting information, etc.
(a) three processes each with one thread; (b) one process with three threads
• Threads take turns in running
• Threads allow multiple executions to take place in the same process environment,
called multithreading
• Thread Usage – Why do we need threads?
 e.g., a word processor has different parts for
 interacting with the user
 formatting the page as soon as changes are made
 timed savings (for auto recovery)
 spelling and grammar checking, etc.
1. Simplifying the programming model: many activities are going on at once, more
or less independently
2. They are easier to create and destroy than processes since they do not have any
resources attached to them
3. Performance improves by overlapping activities if there is a lot of I/O; i.e., to avoid
blocking when waiting for input or doing calculations, say in a spreadsheet
4. Real parallelism is possible in a multiprocessor system
• Thread Implementation
• threads are usually provided in the form of a thread package
• the package contains operations to create and destroy a thread; it can be
implemented in two ways
a. construct a thread library that is executed entirely in user mode (the
OS is not aware of threads)
 cheap to create and destroy threads; just allocate and free memory
 context switching can be done using a few instructions; store and reload
only CPU register values
 disadvantage: invocation of a blocking system call will block the entire
process to which the thread belongs and all other threads in that
process
b. implement them in the OS's kernel
 let the kernel be aware of threads and schedule them
 thread operations such as creation and deletion are expensive since each requires a
system call
• Threads in Distributed Systems
• Multithreaded Clients
 consider a Web browser; fetching different parts of a page can be
implemented as a separate thread, each opening its own TCP
connection to the server
 each can display the results as it gets its part of the page
 parallelism can also be achieved for replicated servers since
each thread request can be forwarded to separate replicas
• Multithreaded Servers
 servers can be constructed in three ways
a. single-threaded process
 it gets a request, examines it, and carries it out to completion before getting
the next request
 the server is idle while waiting for a disk read, i.e., system calls are blocking;
other requests cannot be handled
 by contrast, threads allow blocking system calls without blocking the entire process;
this means multiple logical connections (communications) can be
established at the same time
• Threads
• threads are more important for implementing servers e.g., a file
server
 the dispatcher thread reads incoming requests for a file operation from
clients and passes it to an idle worker thread
 the worker thread performs a blocking disk read; in which case another
thread may continue, say the dispatcher or another worker thread
a multithreaded server organized in a dispatcher/worker model
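The dispatcher/worker model can be sketched with Python's `threading` and `queue` modules. The request format, the file names, and the string that stands in for a blocking disk read are assumptions made for illustration; a real file server would read requests from the network.

```python
import queue
import threading

requests = queue.Queue()
results = {}
results_lock = threading.Lock()

def worker():
    while True:
        req = requests.get()
        if req is None:                    # sentinel: shut this worker down
            break
        req_id, filename = req
        data = "contents of " + filename   # stands in for a blocking disk read
        with results_lock:
            results[req_id] = data

def dispatcher(incoming, n_workers=3):
    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for req in incoming:                   # picked up by whichever worker is idle
        requests.put(req)
    for _ in workers:                      # one shutdown sentinel per worker
        requests.put(None)
    for w in workers:
        w.join()

dispatcher([(1, "a.txt"), (2, "b.txt"), (3, "c.txt")])
print(results)
```

While one worker is blocked on its read, the others and the dispatcher can continue, which is the benefit the model is meant to provide.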
1.4.1. Anatomy of a Client
• Two issues: user interfaces and client-side software for distribution
transparency
a. User Interfaces
 to create a convenient environment for the interaction of a human
user and a remote server;
 e.g. mobile phones with simple displays and a set of keys
 GUIs are most commonly used
the basic organization of the X Window System

• the window manager is a special application and is in charge of the
“look and feel” of the screen that is presented to users
b. Client-Side Software for Distribution Transparency
 in addition to the user interface, parts of the processing and data
level in a client-server application are executed at the client side
 an example is embedded client software for ATMs, cash
registers, etc.
 moreover, client software can also include components to
achieve distribution transparency
 e.g., replication transparency
 assume a distributed system with replicated servers; the client
proxy can send requests to each replica, and client-side software
can transparently collect all responses and pass a single return
value to the client application
transparent replication of a server using a client-side solution
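A client-side proxy of this kind might be sketched as below. The `ReplicaProxy` class and the two replica functions are hypothetical and run locally instead of over a network. Majority voting is only one possible way to reduce several replies to a single return value.

```python
from collections import Counter

class ReplicaProxy:
    """Hypothetical client-side proxy that hides a set of replicated servers."""

    def __init__(self, replicas):
        self.replicas = replicas  # callables standing in for remote servers

    def call(self, request):
        replies = []
        for replica in self.replicas:
            try:
                replies.append(replica(request))
            except ConnectionError:
                continue  # a failed replica is hidden from the application
        if not replies:
            raise ConnectionError("no replica answered")
        # Collapse all replies into one value; here, the most common reply.
        return Counter(replies).most_common(1)[0][0]

def healthy(request):
    return request.upper()

def crashed(request):
    raise ConnectionError("replica down")

proxy = ReplicaProxy([healthy, crashed, healthy])
reply = proxy.call("read x")  # the application sees one ordinary return value
print(reply)
```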

• location, migration, and relocation transparency can also be
handled using naming and client cooperation;
e.g., when a server changes location, the client software can be
informed without the user knowing
• access transparency and failure transparency in communication
(keep on trying) can also be achieved using client-side software
1.4.2. Server
General Design Issues
• How to organize servers?
• Where do clients contact a server?
• Whether and how a server can be interrupted
• Whether or not the server is stateless
a. How to organize servers?
• Iterative server
 the server itself handles the request and returns the result
• Concurrent server
 it passes a request to a separate process or thread and waits for
the next incoming request; e.g., a multithreaded server; or by
forking a new process as is done in Unix
b. Where do clients contact a server?
• using end points or ports at the machine where the server is running
where each server listens to a specific endpoint
• how do clients know the endpoint of a service?
 globally assign endpoints for well-known services; e.g., FTP is on
TCP port 21, HTTP is on TCP port 80
 for services that do not require pre-assigned endpoints, an endpoint can be
dynamically assigned by the local OS
 IANA (Internet Assigned Numbers Authority) Ranges
 IANA divided the port numbers into three ranges
• Well-known ports: assigned and controlled by IANA for standard
services, e.g., DNS uses port 53
• Registered ports: are not assigned and controlled by IANA; can only
be registered with IANA to prevent duplication, e.g., MySQL uses
port 3306
• Dynamic ports or ephemeral ports: neither controlled nor registered
by IANA
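The three ranges can be expressed as a small classification function. The numeric boundaries are not given in the slides; the sketch uses the ranges commonly cited for IANA port assignment (0 to 1023, 1024 to 49151, and 49152 to 65535), and the function name is made up.

```python
def port_range(port):
    """Classify a TCP/UDP port number into one of the three IANA ranges."""
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"

for port in (53, 3306, 50000):  # DNS, MySQL, and an arbitrary high port
    print(port, port_range(port))
```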
c. Whether and how a server can be interrupted
• for instance, a user may want to interrupt a file transfer; maybe it
was the wrong file
• let the client exit the client application; this will break the connection
to the server; the server will tear down the connection assuming
that the client had crashed
d. Whether or not the server is stateless
• a stateless server does not keep information on the state of its
clients; for instance a Web server
• soft state: a server promises to maintain state for a limited time;
e.g., to keep a client informed about updates; after the time expires,
the client has to poll
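Soft state can be sketched as a table of leases that expire. The `SoftStateServer` class, the client names, and the lease length are hypothetical. Time is passed in explicitly so that the example is deterministic; a real server would read a clock.

```python
class SoftStateServer:
    """Hypothetical server that remembers a client only until its lease expires."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self._expires = {}  # client id -> time at which the state is dropped

    def subscribe(self, client, now):
        self._expires[client] = now + self.lease_seconds

    def clients_to_notify(self, now):
        # Forget expired clients; they have to poll (or subscribe again).
        self._expires = {c: t for c, t in self._expires.items() if t > now}
        return sorted(self._expires)

server = SoftStateServer(lease_seconds=10)
server.subscribe("a", now=0)
server.subscribe("b", now=5)

at_8 = server.clients_to_notify(now=8)    # both leases still valid
at_12 = server.clients_to_notify(now=12)  # the lease of "a" has expired
print(at_8, at_12)
```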
the general organization of a three-tiered server cluster
• Distributed Servers
 the problem with a server cluster is when the logical switch
(single access point) fails making the cluster unavailable
 hence, several access points can be provided where the
addresses are publicly available leading to a distributed server
 e.g., the DNS can return several addresses for the same host
name
1.4.3 Code Migration
• so far, communication was concerned with passing data
• we may also pass programs, even while they are running and across
heterogeneous systems
• code migration also involves moving data as well: when a program
migrates while running, its status, pending signals, and other
environment variables such as the stack and the program counter
also have to be moved
• Reasons for Migrating Code
• to improve performance; move processes from heavily-loaded to
lightly-loaded machines (load balancing)
• to reduce communication: move a client application that performs
many database operations to a server if the database resides on the
server; then send only results to the client
• to exploit parallelism (for nonparallel programs): e.g., copies of a
mobile program (called a mobile agent, or a crawler in
search engines) moving from site to site searching the Web
1.5 Communication
• Interprocess communication is at the heart of all
distributed systems
• communication in distributed systems is based on
message passing as offered by the underlying network
which is harder as opposed to using shared memory
• modern distributed systems consist of thousands of
processes scattered across an unreliable network such
as the Internet
• unless the primitive communication facilities of the
network are replaced by more advanced ones,
development of large scale Distributed Systems
becomes extremely difficult
1.5.1 Network Protocols and Standards
• why communication in distributed systems? because there is no
shared memory
• two communicating processes must agree on the syntax and
semantics of messages
• a protocol is a set of rules that governs data communications
• a protocol defines what is communicated, how it is communicated,
and when it is communicated
• for instance, for one computer to send a message to another
computer, the first computer must perform the following general
steps (highly simplified)
 break the data into small sections called
packets (message, datagram, packet, frame)
 add addressing information to the packets identifying
the source and destination computers
 deliver the data to the network interface card (NIC) for
transmission over the network
• the receiving computer must perform the same steps, but in reverse
order
 accept the data from the NIC
 remove the addressing information that was added by the
transmitting computer
 reassemble the packets of data into the original message
• the key elements of a protocol are syntax, semantics, and timing
• syntax: refers to the structure or format of the data
• semantics: refers to the meaning of each section of bits
• timing: refers to when data should be sent and how fast they can be
sent
functions of protocols
• each device must perform the same steps the same way so that the
data will arrive and reassemble properly; if one device uses a
protocol with different steps, the two devices will not be
able to communicate with each other
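The send and receive steps can be sketched as a pair of functions that must agree on the packet format. The dictionary layout, the host names, and the packet size are made up for illustration. The sequence number is an assumption added so that packets arriving out of order can still be reassembled.

```python
import random

def packetize(message, src, dst, size=4):
    """Split bytes into packets carrying source, destination, and sequence number."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [{"src": src, "dst": dst, "seq": n, "data": chunk}
            for n, chunk in enumerate(chunks)]

def reassemble(packets):
    """Strip the headers and rebuild the message, tolerating out-of-order arrival."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["data"] for p in ordered)

packets = packetize(b"hello distributed world", "host-a", "host-b")
random.shuffle(packets)  # the network may deliver packets in any order
message = reassemble(packets)
print(message)
```

If the receiver expected a different field name or chunk layout, reassembly would fail, which is the sense in which both devices must follow the same protocol.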
Network (Reference) Models
Layers and Services
• within a single machine, each layer uses the services immediately
below it and provides services for the layer immediately above it
• between machines, layer x on one machine communicates with layer
x on another machine
Two important network models or architectures
 The ISO OSI (Open Systems Interconnection) Reference Model
 The TCP/IP Reference Model
a. The OSI Reference Model
• consists of 7 layers
• Open: to connect open systems, i.e., systems that are open for
communication with other systems
• Physical: Physical characteristics of the media
• Data Link: Reliable data delivery across the link
• Network: Managing connections across the network or routing
• Transport: End-to-end connection and reliability (handles lost
packets); TCP (connection-oriented), UDP (connectionless), etc.
• Session: Managing sessions between applications (dialog control
and synchronization); rarely supported
• Presentation: Data presentation to applications; concerned with the
syntax and semantics of the information transmitted
• Application: Network services to applications; contains protocols
that are commonly needed by users; FTP, HTTP, SMTP, ...
• Physical, Data Link, and Network are the media (lower) layers;
Transport through Application are the host (upper) layers
b. The TCP/IP Reference Model
• TCP/IP: Transmission Control Protocol/Internet Protocol
• used by ARPANET and its successor the Internet
• design goals
 the ability to connect multiple networks (internetworking) in a
seamless way
 the network should be able to survive loss of subnet hardware,
i.e., the connection must remain intact as long as the source and
destination machines are properly functioning
 flexible architecture to accommodate requirements of different
applications -ranging from transferring files to real-time speech
transmission
• has 4 (or 5 depending on how you see it) layers: Application,
Transport, Internet (Internetwork), Host-to-network (some split it into
Physical and Data Link)
• OSI and TCP/IP Layers Correspondence
Middleware Protocols
• a middleware is an application that contains general-purpose
protocols to provide services
• example of middleware services
 authentication and authorization services
 distributed transactions (commit protocols; locking mechanisms)
 middleware communication protocols (calling a procedure or
invoking an object remotely, synchronizing streams for real-time
data, multicast services)
• hence an adapted reference model for networked communications is
required
1.5.2 Remote Procedure Call
• the first distributed systems were based on explicit message
exchange between processes through the use of explicit send and
receive procedures; but these do not allow access transparency
• in 1984, Birrell and Nelson introduced a different way of handling
communication: RPC
• it allows a program to call a procedure located on another machine
• simple and elegant, but there are implementation problems
• the calling and called procedures run in different address spaces
• parameters and results have to be exchanged;
• what if the machines are not identical?
• what happens if one or both machines crash?
1.5.2 Remote Procedure Call
Steps of a Remote Procedure Call

1. Client procedure calls client stub in the normal way
2. Client stub builds a message and calls the local OS (packing parameters into
a message is called parameter marshaling)
3. Client's OS sends the message to the remote OS
4. Remote OS gives the message to the server stub
5. Server stub unpacks the parameters and calls the server
6. Server does the work and returns the result to the stub
7. Server stub packs it in a message and calls the local OS
8. Server's OS sends the message to the client's OS
9. Client's OS gives the message to the client stub
10. Stub unpacks the result and returns to client
• server machine vs server process; client machine vs client process
• hence, for the client remote services are accessed by making ordinary
(local) procedure calls; not by calling send and receive
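The ten steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both "machines" live in one process, the function `transport` stands in for the two operating systems and the network, JSON is an arbitrary choice of marshaling format, and all names (`add`, `client_stub_add`, `server_stub`, `PROCEDURES`) are hypothetical.

```python
import json

# ---- server side ----
def add(a, b):
    """The actual server procedure (hypothetical example)."""
    return a + b

PROCEDURES = {"add": add}  # assumed lookup table: procedure name -> function

def server_stub(request_bytes):
    """Steps 5-7: unpack the parameters, call the server, pack the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

# ---- stand-in for both operating systems and the network ----
def transport(request_bytes):
    """Steps 3-4 and 8-9: deliver the request and bring back the reply."""
    return server_stub(request_bytes)

# ---- client side ----
def client_stub_add(a, b):
    """Steps 2 and 10: marshal the parameters, send, unmarshal the result."""
    request_bytes = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply_bytes = transport(request_bytes)
    return json.loads(reply_bytes.decode("utf-8"))["result"]

# Step 1: the client calls the stub like an ordinary local procedure.
total = client_stub_add(2, 3)
print(total)
```

The client code never calls send or receive itself; that detail is hidden inside the stub, which is the access transparency that RPC aims for.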
1.5.3 Types of Communication
Message-Oriented Communication (JAVA)
• RPCs and RMIs are not adequate for all distributed system
applications
• providing access transparency is good, but the semantics of RPCs and
RMIs are not adequate for all applications
• example problems
• they assume that the receiving side is running at the time of
communication
• a client is blocked until its request has been processed
• messaging is the solution
• communication can be
 persistent or transient
 asynchronous or synchronous
• persistent: a message that has been submitted for transmission is
stored by the communication system as long as it takes to deliver it
to the receiver
 e.g., e-mail delivery, snail mail delivery
• transient: a message that has been submitted for transmission is
stored by the communication system only as long as the sending
and receiving applications are executing
• asynchronous: a sender continues immediately after it has
submitted its message for transmission
• synchronous: the sender is blocked until its message is
stored in a local buffer at the receiving host or delivered to the
receiver
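A minimal sketch of the asynchronous/synchronous distinction, assuming an in-memory queue as the "communication system". The class `Mailbox` and its methods are hypothetical names, and the storage here only lasts as long as the Python process; a truly persistent system would keep messages on stable storage.

```python
import queue
import threading

class Mailbox:
    """Hypothetical stand-in for the communication system's message store."""

    def __init__(self):
        self._messages = queue.Queue()

    def send_async(self, msg):
        # Asynchronous: the sender continues as soon as the message is submitted.
        self._messages.put((msg, None))

    def send_sync(self, msg, timeout=5.0):
        # Synchronous: the sender blocks until the receiver has taken delivery.
        delivered = threading.Event()
        self._messages.put((msg, delivered))
        return delivered.wait(timeout)

    def receive(self):
        msg, delivered = self._messages.get()
        if delivered is not None:
            delivered.set()  # unblock a synchronous sender
        return msg

box = Mailbox()
box.send_async("hello")  # no receiver is running yet; the message is stored

received = []

def receiver_task():
    received.append(box.receive())
    received.append(box.receive())

receiver = threading.Thread(target=receiver_task)
receiver.start()
ok = box.send_sync("world")  # blocks until the receiver picks it up
receiver.join()
print(received, ok)
```

The first message is accepted while the receiver is not running, which is the situation RPC cannot handle; the second call shows the blocking behaviour of a synchronous send.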
Stream Oriented Communication
• until now, we focused on exchanging independent and complete
units of information
• time has no effect on correctness; a system can be slow or fast
• however, there are communications where time has a critical role
Multimedia
• media
 storage, transmission, interchange, presentation, representation
and perception of different data types
 text, graphics, images, voice, audio, video, animation, ...
 movie: video + audio + …
• multimedia: handling of a variety of representation media
• end user pull
 information overload and starvation
• technology push
 emerging technology to integrate media
The Challenge
• new applications
 multimedia will be pervasive in a few years (as graphics is)
• continuous delivery
 e.g., 30 frames/s (NTSC), 25 frames/s (PAL) for video
 guaranteed Quality of Service, which requires admission control
• storage and transmission
 e.g., a 2-hour uncompressed HDTV (1920×1080) movie: 1.12 TB
(1920 × 1080 × 3 × 25 × 60 × 60 × 2 bytes)
 videos are extremely large, even after compression (actually
encoding)
• search
 can we look at 100…videos to find the proper one?
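The 1.12 TB storage figure for the uncompressed HDTV movie can be checked with a short calculation. The reading of the factors is an assumption: 3 is taken as bytes per pixel (24-bit colour), 25 as frames per second (PAL), and 60 × 60 × 2 as the number of seconds in two hours; TB is taken as 10^12 bytes.

```python
width, height = 1920, 1080       # HDTV resolution in pixels
bytes_per_pixel = 3              # assumed: 24-bit colour
frames_per_second = 25           # PAL frame rate
duration_seconds = 60 * 60 * 2   # two hours

bytes_per_frame = width * height * bytes_per_pixel
total_bytes = bytes_per_frame * frames_per_second * duration_seconds
total_tb = total_bytes / 10**12  # decimal terabytes

print(f"{bytes_per_frame} bytes per frame")
print(f"{total_bytes} bytes = {total_tb:.2f} TB")
```

At these assumed values a single frame is already about 6.2 MB, so the sustained data rate is roughly 155 MB/s, which is why continuous delivery needs guaranteed Quality of Service.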
1.6 Synchronization
1.6.1 Clock Synchronization
1.6.2 Logical Clock
Faizur Rashid (Dr.)
1.6. Synchronization
• apart from communication, how do processes cooperate
and synchronize with one another?
• cooperation is partly supported by naming; it allows
processes to at least share resources (entities)
• synchronization deals with how to ensure that processes
do not simultaneously access a shared resource; they
should cooperate in granting each other temporary
exclusive access
• it also deals with how events can be ordered, such as two
processes sending messages to each other
1.6. Clock Synchronization
• in centralized systems, time can be unambiguously
decided by a system call
• e.g., process A at time t1 gets the time, say tA, and
process B at time t2, where t1 < t2, gets the time, say tB;
then tA is always less than (possibly equal to but never
greater than) tB
• achieving agreement on time in distributed systems is
difficult
• e.g., consider the make program on a UNIX machine
• a large program is usually split up into several source
files
• make compiles only source files for which the time of
their last update was later than the existing object file
1.6. Clock Synchronization
when each machine has its own clock, an event that occurred after
another event may nevertheless be assigned an earlier time
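The make problem can be illustrated with a small sketch. The clock values and the helper names (`local_time`, `needs_rebuild`) are hypothetical; the assumption is that the object file is stamped by the compiling machine while the source file is stamped by an editing machine whose clock runs 2 ticks behind.

```python
def local_time(true_time, skew):
    """Timestamp a machine would record, given its clock skew (assumed model)."""
    return true_time + skew

def needs_rebuild(source_mtime, object_mtime):
    """make's rule: recompile only if the source is newer than the object file."""
    return source_mtime > object_mtime

# The object file is created at true time 2144 on the compiling machine (no skew).
object_stamp = local_time(2144, 0)

# The source is edited LATER, at true time 2145, on a machine running 2 ticks slow.
source_stamp = local_time(2145, -2)

print(source_stamp, object_stamp, needs_rebuild(source_stamp, object_stamp))
```

Although the edit happened after the compilation, the source carries the earlier timestamp, so make would wrongly skip recompiling it.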
1.7. Fault Tolerance
1.7. Fault Tolerance (Challenges)
• Heterogeneity: The Internet enables users to access services and run
applications over a heterogeneous collection of computers and networks.
Heterogeneity (that is, variety and difference) applies to all of the following:
• networks;
• computer hardware;
• operating systems;
• programming languages;
• implementations by different developers.
1.7. Challenges
• Although the Internet consists of many different sorts of network, their
differences are masked by the fact that all of the computers attached to them
use the Internet protocols to communicate with one another.
• For example, a computer attached to an Ethernet has an implementation of
the Internet protocols over the Ethernet, whereas a computer on a different
sort of network will need an implementation of the Internet protocols for that
network.
• Data types such as integers may be represented in different ways on
different sorts of hardware – for example, there are two alternatives for the
byte ordering of integers. These differences in representation must be dealt
with if messages are to be exchanged between programs running on
different hardware.
• Different programming languages use different representations for
characters and data structures such as arrays and records. These
differences must be addressed if programs written in different languages are
to be able to communicate with one another.
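The two byte orderings for integers can be seen with Python's standard `struct` module. This is a small sketch; the value 1 and the 4-byte integer size are arbitrary choices for illustration.

```python
import struct

value = 1

big_endian = struct.pack(">i", value)     # most significant byte first
little_endian = struct.pack("<i", value)  # least significant byte first

print(big_endian.hex())     # 00000001
print(little_endian.hex())  # 01000000

# A receiver that assumes the wrong byte order reads a different number.
misread = struct.unpack(">i", little_endian)[0]
print(misread)              # 16777216

# Agreeing on one order for messages avoids the problem.
correct = struct.unpack("<i", little_endian)[0]
print(correct)              # 1
```

This is why communicating programs must agree on an external representation for data before exchanging messages between different kinds of hardware.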
1.7. Challenges
Openness: The openness of a computer system is the characteristic that
determines whether the system can be extended and reimplemented in
various ways. The openness of distributed systems is determined primarily
by the degree to which new resource-sharing services can be added and be
made available for use by a variety of client programs.
•Openness cannot be achieved unless the specification and documentation
of the key software interfaces of the components of a system are made
available to software developers.
Security: Many of the information resources that are made available and
maintained in distributed systems have a high intrinsic value to their users.
•Their security is therefore of considerable importance. Security for
information resources has three components: confidentiality (protection
against disclosure to unauthorized individuals), integrity (protection against
alteration or corruption), and availability (protection against interference with
the means to access the resources).
1.7. Challenges
•In a distributed system, clients send requests to access data managed by
servers, which involves sending information in messages over a network.
For example:
1.A doctor might request access to hospital patient data or send additions to
that data.
2.In electronic commerce and banking, users send their credit card numbers
across the Internet.
•In both examples, the challenge is to send sensitive information in a
message over a network in a secure manner.
Scalability: Distributed systems operate effectively and efficiently at many
different scales, ranging from a small intranet to the Internet.
•A system is described as scalable if it will remain effective when there is a
significant increase in the number of resources and the number of users.
1.7. Challenges
•The design of scalable distributed systems presents the following
challenges:
 Controlling the cost of physical resources:
 Controlling the performance loss:
 Preventing software resources running out:
 Avoiding performance bottlenecks:
Failure handling: Failures in a distributed system are partial – that is,
some components fail while others continue to function. Therefore the
handling of failures is particularly difficult.
 Detecting failures: Some failures can be detected. For example, checksums can be
used to detect corrupted data in a message or a file.
 Masking failures: Some failures that have been detected can be hidden or made
less severe. Two examples of hiding failures:
 1. Messages can be retransmitted when they fail to arrive.
 2. File data can be written to a pair of disks so that if one is corrupted, the other
may still be correct.
1.7. Challenges
•Tolerating failures: Most of the services in the Internet do exhibit failures –
it would not be practical for them to attempt to detect and hide all of the
failures that might occur in such a large network with so many components.
• Recovery from failures: Recovery involves the design of software so that the state
of permanent data can be recovered or ‘rolled back’ after a server has crashed.
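Detecting and masking failures can be combined in a short sketch: a checksum detects a corrupted message, and retransmission hides the failure from the caller. The assumptions here are that CRC-32 (from Python's `zlib`) is the checksum, that a corrupted message is treated like one that failed to arrive, and that the channel is a plain function; the names `make_packet`, `check_packet`, `send_with_retransmit` and `flaky_channel` are hypothetical.

```python
import zlib

def make_packet(payload: bytes) -> bytes:
    """Prefix the payload with its 4-byte CRC-32 checksum."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def check_packet(packet: bytes):
    """Return the payload if the checksum matches, otherwise None."""
    checksum = int.from_bytes(packet[:4], "big")
    payload = packet[4:]
    return payload if zlib.crc32(payload) == checksum else None

def send_with_retransmit(payload, channel, max_attempts=3):
    """Resend until an uncorrupted copy gets through (masking the failure)."""
    for attempt in range(1, max_attempts + 1):
        received = check_packet(channel(make_packet(payload)))
        if received is not None:
            return received, attempt
    raise RuntimeError("message could not be delivered")

def flaky_channel():
    """Hypothetical channel that flips one bit in the first transmission only."""
    calls = {"count": 0}
    def channel(packet):
        calls["count"] += 1
        if calls["count"] == 1:
            return packet[:-1] + bytes([packet[-1] ^ 0x01])
        return packet
    return channel

data, attempts = send_with_retransmit(b"hello", flaky_channel())
print(data, attempts)
```

The caller only sees the correct data; the corrupted first attempt is detected by the checksum and hidden by the retransmission.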
Concurrency: The process that manages a shared resource could take
one client request at a time. But that approach limits throughput. Therefore
services and applications generally allow multiple client requests to be
processed concurrently.
•For example, if two concurrent bids at an auction are ‘Smith: $122’ and
‘Jones: $111’, and the corresponding operations are interleaved without any
control, then they might get stored as ‘Smith: $111’ and ‘Jones: $122’.
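One common way to control the interleaving in the auction example is mutual exclusion. The sketch below is an assumption about how such a service could be structured, not a description of any particular system: each bid is written in two steps (name, then amount), and a lock makes the two steps one indivisible operation. The class `AuctionLog` is a hypothetical name.

```python
import threading

class AuctionLog:
    """Hypothetical shared resource that stores bids as 'name: $amount'."""

    def __init__(self):
        self._lock = threading.Lock()
        self._name = None     # shared fields written in two separate steps
        self._amount = None
        self.records = []

    def place_bid(self, name, amount):
        # The lock ensures no other bid can interleave between the two writes.
        with self._lock:
            self._name = name
            self._amount = amount
            self.records.append(f"{self._name}: ${self._amount}")

log = AuctionLog()
bids = [("Smith", 122), ("Jones", 111)]
threads = [threading.Thread(target=log.place_bid, args=bid) for bid in bids]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(log.records))
```

The two requests are still processed concurrently, but each stored record always pairs a bidder with that bidder's own amount; only the order of the records can vary between runs.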
•Transparency
Assignment #1
Q1. Use the World Wide Web as an example to illustrate the
concept of resource sharing, client and server. What are
the advantages and disadvantages of HTML, URLs
and HTTP as core technologies for information browsing? Are
any of these technologies suitable as a basis for client-server
computing in general?
Q2. Describe the trends, focus, and challenges in the context of the
World Wide Web.
Q3. List the three main software components that may fail
when a client process invokes a method in a server object,
giving an example of a failure in each case. Suggest how
the components can be made to tolerate one another’s
failures.
Quiz #1
• Q1. Tick (✓) all types of wireless technologies.
i) WiMax ii) WiFi iii) Bluetooth
iv) Hotspot v) Television & Radio broadcasting
vi) Satellite communication
vii) Infrared Communication viii) GPS ix) RFID
• Q2. A separate algorithm is needed to handle the locking of data in which type of transaction?
• Separate transaction
• Distributed transaction
• Flat Transaction
• Top level Transaction
• Q3. Tick all the properties that ACID stands for in distributed transactions.
i) Atomic ii) Serializable iii) Consistent
iv) Durable v) Isolated vi) Correlate
vii) Dynamic viii) Automatic
• Q4. Which Properties are
