Introduction
2.1 Architectural Styles
the logical organization of distributed systems into
software components
a component is a modular unit with well-defined required
and provided interfaces that is replaceable within its
environment
a connector is a mechanism that mediates
communication, coordination, or cooperation among
components, e.g., facilities for RPC, message passing, or
streaming data
there are various architectural styles
Layered architectures
Object-based architectures
Data-centered architectures
Event-based architectures
Layered architectures
components are organized in a layered fashion where a
component at layer L_i is allowed to call components at
the underlying layer L_{i-1}, but not the other way around;
e.g., network layers
Data-centered architectures
processes communicate through a common repository;
e.g., a shared distributed file system
Event-based architectures
processes communicate through the propagation of events
publish/subscribe systems
processes publish events and the middleware ensures that
only those processes that subscribed to those events will
receive them
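The publish/subscribe idea can be sketched in a few lines. This is only a toy, in-process stand-in for the middleware; the class name EventBus and the event types "stock" and "weather" are made up for illustration.

```python
class EventBus:
    """Toy stand-in for the middleware: routes each event only to its subscribers."""

    def __init__(self):
        self._subscribers = {}  # event type -> list of callbacks

    def subscribe(self, event_type, callback):
        self._subscribers.setdefault(event_type, []).append(callback)

    def publish(self, event_type, payload):
        # Deliver only to the processes that subscribed to this event type.
        for callback in self._subscribers.get(event_type, []):
            callback(payload)


bus = EventBus()
stock_log, weather_log = [], []
bus.subscribe("stock", stock_log.append)      # subscriber 1
bus.subscribe("weather", weather_log.append)  # subscriber 2
bus.publish("stock", "ACME up 2%")            # only subscriber 1 receives this
```

The publisher never names a receiver; the coupling is through the event type only.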
2.2 System Architectures
the actual realization of a distributed system: which software
components are instantiated, how they interact, and where they are
placed; i.e., how processes are organized in a system
2.2.1 Centralized Architectures
thinking in terms of clients requesting services from
servers
Application Layering
no clear distinction between a client and a server; for
instance a server for a distributed database may act as a
client when it forwards requests to different file servers
three levels exist
the user-interface level: implemented by clients and
contains all that is required by a client; usually
through GUIs, but not necessarily
the processing level: contains the applications
the data level: contains the programs that maintain
the actual data dealt with
e.g., the general organization of an Internet search engine into
three different layers
Client-Server Architectures
how to physically distribute a client-server application
across several machines
Multitiered Architectures
Two-tiered architecture: alternative client-server organizations
(a) put only terminal-dependent part of the user interface on the
client machine and let the applications remotely control the
presentation
(b) put the entire user-interface software on the client side
(c) move part of the application to the client, e.g. checking
correctness in filling forms
(d) and (e) are for powerful client machines (more popular)
three-tiered architecture: an example of a server acting as a client
an example is the organization of Web sites
2.2.2 Decentralized Architectures
vertical distribution: the ones discussed so far where the
different tiers correspond directly with the logical
organization of applications; place logically different
components on different machines
horizontal distribution: physically split up the client or the
server into logically equivalent parts
an example is a peer-to-peer system where processes
are equal and hence each process acts as a client and a
server at the same time (servent)
read about the different approaches of peer-to-peer
architecture - pages 44 - 51 and about Architectures
versus Middleware - pages 54 - 66
another example is the horizontal distribution of a Web
service
Chapter 3 - Processes
Introduction
3.1 Threads and their Implementation
threads can be used in both distributed and nondistributed
systems
Threads in Nondistributed Systems
a process has an address space (containing program text
and data) and a single thread of control, as well as other
resources such as open files, child processes, accounting
information, etc.
[Figure: (a) three processes, each with one thread; (b) one process with three threads]
each thread has its own program counter, registers, stack, and
state; but all threads of a process share address space, global
variables and other resources such as open files, etc.
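A small sketch of the sharing described above, using Python's threading module. The names counter, lock, and worker are illustrative only. Each thread keeps its own local variable on its own stack, while the global variable is visible to all threads of the process.

```python
import threading

counter = 0                  # global variable: shared by all threads of the process
lock = threading.Lock()      # shared data must be protected when updated


def worker(increments):
    global counter
    local_total = 0          # local variable: lives on this thread's own stack
    for _ in range(increments):
        local_total += 1
    with lock:               # update the shared variable safely
        counter += local_total


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```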
Threads take turns in running
Threads allow multiple executions to take place in the same
process environment, called multithreading
Thread Usage – Why do we need threads?
e.g., a word processor has different parts; parts for
interacting with the user
formatting the page as soon as changes are made
timed savings (for auto recovery)
spelling and grammar checking, etc.
1. Simplifying the programming model: since many activities are
going on at once
2. They are easier to create and destroy than processes since
they do not have any resources attached to them
3. Performance improves by overlapping activities if there is too
much I/O; i.e., to avoid blocking when waiting for input or
doing calculations, say in a spreadsheet
4. Real parallelism is possible in a multiprocessor system
having finer granularity in terms of multiple threads per process
rather than processes provides better performance and makes it
easier to build distributed applications
in nondistributed systems, threads can be used with shared data
instead of processes to avoid context switching overhead in
interprocess communication (IPC)
Multithreaded Servers
servers can be constructed in three ways
a. single-threaded process
it gets a request, examines it, carries it out to completion
are blocking
b. threads
threads are more important for implementing servers
e.g., a file server
the dispatcher thread reads incoming requests for a file
operation from clients and passes it to an idle worker thread
the worker thread performs a blocking disk read; in which
case another thread may continue, say the dispatcher or
another worker thread
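The dispatcher/worker organization can be sketched as follows. This is a simplified model, not a real file server: the blocking disk read is simulated with a short sleep, and the file names, the None shutdown sentinel, and the results dictionary are assumptions made for the example.

```python
import queue
import threading
import time

requests = queue.Queue()     # hand-off point between dispatcher and workers
results = {}
results_lock = threading.Lock()


def worker():
    while True:
        request = requests.get()      # an idle worker waits here for work
        if request is None:           # sentinel (assumed convention): shut down
            break
        time.sleep(0.01)              # stands in for a blocking disk read
        with results_lock:
            results[request] = f"contents of {request}"


def dispatcher(incoming):
    # Read incoming requests and pass each one to the pool of workers.
    for request in incoming:
        requests.put(request)


workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()
dispatcher(["a.txt", "b.txt", "c.txt", "d.txt"])
for _ in workers:
    requests.put(None)
for w in workers:
    w.join()
```

While one worker is blocked in its simulated read, the other workers keep taking requests from the queue, which is the point of the multithreaded design.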
Model                     Characteristics
Single-threaded process   No parallelism, blocking system calls
Threads                   Parallelism, blocking system calls (thread only)
Finite-state machine      Parallelism, nonblocking system calls
three ways to construct a server
3.2 Anatomy of Clients
Two issues: user interfaces and client-side software for
distribution transparency
a. User Interfaces
to create a convenient environment for the interaction of a
the basic organization of the X Window System
b. Client-Side Software for Distribution Transparency
in addition to the user interface, parts of the processing and
registers, etc.
moreover, client software can also include components to
transparent replication of a server using a client-side solution
3.3 Servers and design issues
3.3.1 General Design Issues
How to organize servers?
Where do clients contact a server?
Whether and how a server can be interrupted
Whether or not the server is stateless
Concurrent server
it passes a request to a separate process or thread and
b. Where do clients contact a server?
using endpoints or ports at the machine where the server is
running where each server listens to a specific endpoint
how do clients know the endpoint of a service?
globally assign endpoints for well-known services; e.g. FTP
Client-to-server binding using a daemon
ii. use a superserver (as in UNIX) that listens to all endpoints and
then forks a process to take care of the request; this is instead of
having a lot of servers running simultaneously and most of them
idle
c. Whether and how a server can be interrupted
for instance, a user may want to interrupt a file transfer;
maybe it was the wrong file
let the client exit the client application; this will break the
connection to the server; the server will tear down the
connection assuming that the client had crashed
or
let the client send out-of-band data, i.e., data to be processed by
the server before any other data from the client; the server
may listen on a separate control endpoint, or the data may be sent
on the same connection as urgent data, as in TCP
d. Whether or not the server is stateless
a stateless server does not keep information on the state of its
clients; for instance a Web server
soft state: a server promises to maintain state for a limited
time; e.g., to keep a client informed about updates; after the
time expires, the client has to poll
a stateful server maintains information about its clients; for
instance a file server that allows a client to keep a local copy
of a file and can make update operations
the general organization of a three-tiered server cluster
Distributed Servers
the problem with a server cluster is when the logical switch
(single access point) fails making the cluster unavailable
hence, several access points can be provided where the
addresses are publicly available leading to a distributed server
e.g., the DNS can return several addresses for the same host
name
3.4 Code Migration
so far, communication was concerned with passing data
we may pass programs, even while running and in
heterogeneous systems
code migration also involves moving data as well: when a
program migrates while running, its status, pending signals, and
other environment variables such as the stack and the program
counter also have to be moved
Reasons for Migrating Code
to improve performance; move processes from heavily-loaded to
lightly-loaded machines (load balancing)
to reduce communication: move a client application that performs
many database operations to a server if the database resides on
the server; then send only results to the client
to exploit parallelism (for nonparallel programs): e.g., copies of a
mobile program (a crawler as is called in search engines) moving
from site to site searching the Web
to have flexibility by dynamically configuring distributed systems:
instead of having a multitiered client-server application deciding
in advance which parts of a program are to be run where
Strong Mobility
transfer code and execution segments; helps to migrate a
process in execution
can also be supported by remote cloning; having an exact
copy of the original process and running on a different
machine; executed in parallel to the original process; UNIX
does this by forking a child process
migration can be
sender-initiated: the machine where the code resides or is
currently running; e.g., uploading programs to a server; may
need authentication or that the client is a registered one
receiver-initiated: by the target machine; e.g., Java Applets;
easier to implement
Summary of models of code migration
Migration and Local Resources
how to migrate the resource segment
not always possible to move a resource; e.g., a reference to TCP
port held by a process to communicate with other processes
Types of Process-to-Resource Bindings
Binding by identifier (the strongest): a resource is referred to by its
identifier; e.g., a URL to refer to a Web page or an FTP server
referred to by its Internet (IP) address
Binding by value (weaker): when only the value of a resource is
needed; in this case another resource can provide the same
value; e.g., standard libraries of programming languages such as
C or Java which are normally locally available, but their location
in the file system may vary from site to site
Binding by type (weakest): a process needs a resource of a
specific type; reference to local devices, such as monitors,
printers, ...
in migrating code, the above bindings cannot change, but the
references to resources can
how can a reference be changed? depends whether the resource
can be moved along with the code, i.e., resource-to-machine
binding
Types of Resource-to-Machine Bindings
Unattached Resources: can be easily moved with the migrating
program (such as data files associated with the program)
Fastened Resources: such as local databases and complete
Web sites; moving or copying may be possible, but very costly
Fixed Resources: intimately bound to a specific machine or
environment such as local devices and cannot be moved
we have nine combinations to consider
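                              Resource-to-machine binding
Process-to-resource binding   Unattached        Fastened          Fixed
By identifier                 MV (or GR)        GR (or MV)        GR
By value                      CP (or MV, GR)    GR (or CP)        GR
By type                       RB (or GR, CP)    RB (or GR, CP)    RB (or GR)
GR: establish a global systemwide reference; MV: move the resource;
CP: copy the value of the resource; RB: rebind the process to a locally
available resource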
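The nine combinations can be encoded as a lookup table. The sketch below assumes the usual textbook reading of the abbreviations (GR = global reference, MV = move, CP = copy, RB = rebind) and treats the first entry of each cell as the preferred action; the names ACTIONS, OPTIONS, and preferred_action are illustrative.

```python
# Assumed expansion of the abbreviations used in the table.
ACTIONS = {
    "GR": "establish a global systemwide reference",
    "MV": "move the resource",
    "CP": "copy the value of the resource",
    "RB": "rebind the process to a locally available resource",
}

# (process-to-resource binding, resource-to-machine binding) -> options,
# listed in the same order as in the table (first entry = preferred).
OPTIONS = {
    ("identifier", "unattached"): ["MV", "GR"],
    ("identifier", "fastened"):   ["GR", "MV"],
    ("identifier", "fixed"):      ["GR"],
    ("value", "unattached"):      ["CP", "MV", "GR"],
    ("value", "fastened"):        ["GR", "CP"],
    ("value", "fixed"):           ["GR"],
    ("type", "unattached"):       ["RB", "GR", "CP"],
    ("type", "fastened"):         ["RB", "GR", "CP"],
    ("type", "fixed"):            ["RB", "GR"],
}


def preferred_action(process_binding, machine_binding):
    """Return (code, description) of the first option for this combination."""
    code = OPTIONS[(process_binding, machine_binding)][0]
    return code, ACTIONS[code]


code, description = preferred_action("value", "unattached")
```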
Migration in Heterogeneous Systems
distributed systems are constructed on a heterogeneous
collection of platforms, each with its own OS and machine
architecture
heterogeneity problems are similar to those of portability
easier in some languages
for scripting languages the source code is interpreted
for Java an intermediary code is generated by the compiler for
a virtual machine
in weak mobility
since there is no runtime information, compile the source code
for each potential platform
in strong mobility
difficult to transfer the execution segment since there may be
platform-dependent information such as register values; Read
the book about possible solutions
Chapter 4 - Communication
Introduction
interprocess communication is at the heart of all distributed
systems
communication in distributed systems is based on message
passing as offered by the underlying network as opposed to
using shared memory
modern distributed systems consist of thousands of
processes scattered across an unreliable network such as
the Internet
unless the primitive communication facilities of the network
are replaced by more advanced ones, development of large
scale Distributed Systems becomes extremely difficult
Objectives of the Chapter
review of how processes communicate in a network (the
rules or the protocols) and their structures
introduce the four widely used communication models for
distributed systems:
Remote Procedure Call (RPC)
Remote Method Invocation (RMI)
Message-Oriented Middleware (MOM)
Streams
A. Layered Protocols
two computers, possibly from different manufacturers, must
be able to talk to each other
for such a communication, there has to be a standard
The ISO OSI (Open Systems Interconnection) Reference
Model is one of such standards - 7 layers
TCP/IP protocol suite is the other; has 4 or 5 layers
OSI
Open – to connect open systems or systems that are open
for communication with other open systems using standard
rules that govern the format, contents, and meaning of the
messages sent and received
these rules are called protocols
two types of protocols: connection-oriented and
connectionless
layers, interfaces, and protocols in the OSI model
Media (lower) Layers
Physical: Physical characteristics of the media
Data Link: Reliable data delivery across the link
Network: Managing connections across the network or routing
Host (upper) Layers
Transport: End-to-end connection and reliability (handles lost
packets); TCP (connection-oriented), UDP (connectionless), etc.
Session: Managing sessions between applications (dialog control
and synchronization); rarely supported
Presentation: Data presentation to applications; concerned with the
syntax and semantics of the information transmitted
Application: Network services to applications; contains protocols
that are commonly needed by users; FTP, HTTP, SMTP, ...
Transport Protocols: Client-Server TCP
WWW
Middleware Protocols
a middleware is an application that contains general-purpose
Conventional Procedure Call, i.e., on a single machine
e.g. count = read (fd, buf, bytes); a C like statement, where
fd is an integer indicating a file
buf is an array of characters into which data are read
bytes is the number of bytes to be read
original message on the Pentium
(the numbers in boxes indicate the address of each
byte)
the message after receipt on the SPARC; wrong integer (2^24 + 2^26 = 83886080),
but correct string
one approach is to invert the bytes of each word after
receipt
the message after being inverted (correct integer but wrong string)
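The byte-order problem can be reproduced with Python's struct module. The concrete message (the integer 5 followed by the 4-byte string "JILL") is an assumed example chosen to match the wrong value quoted above; little-endian stands in for the Pentium and big-endian for the SPARC.

```python
import struct

# Sender (little-endian): a 32-bit integer 5 followed by a 4-byte string.
message = struct.pack("<i", 5) + b"JILL"

# Receiver (big-endian) interprets the bytes exactly as they arrived:
# the integer is wrong, the string is fine.
received_int = struct.unpack(">i", message[:4])[0]
received_str = message[4:8]

# Attempted fix: invert the bytes of every 4-byte word after receipt.
# Now the integer is right but the string is reversed.
inverted = message[:4][::-1] + message[4:8][::-1]
inverted_int = struct.unpack(">i", inverted[:4])[0]
inverted_str = inverted[4:8]
```

Blind inversion cannot work in general because the receiver has to know which bytes belong to an integer and which to a string.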
2. Passing Reference Parameters
assume the parameter is a pointer to an array
copy the array into the message and send it to the server
the server stub can then call the server with a pointer to this
array
the server then makes any changes to the array and sends it
back to the client stub which copies it to the client
this is in effect call-by-copy/restore
optimization of the method
one of the copy operations can be eliminated if the stub
knows whether the parameter is input or output to the
server
if it is an input to the server (e.g., in a call to write), it need
not be copied back
if it is an output, it need not be sent over in the first place;
only send the size
the above procedure can handle pointers to simple arrays
and structures, but it is difficult to generalize to an arbitrary
data structure
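Call-by-copy/restore can be imitated in one process. In this sketch the network is replaced by a direct function call and JSON is used as the message format; both are simplifying assumptions, and the names server_fill, server_stub, and client_stub are illustrative.

```python
import json


def server_fill(buf):
    """Server procedure: changes the array it was handed (here: doubles each element)."""
    for i in range(len(buf)):
        buf[i] *= 2


def server_stub(request_bytes):
    # Unpack the copied array, call the server with a reference to that copy,
    # then send the (possibly modified) array back.
    array = json.loads(request_bytes)
    server_fill(array)
    return json.dumps(array).encode()


def client_stub(array):
    # "Copy": put the whole array into the request message.
    reply = server_stub(json.dumps(array).encode())
    # "Restore": overwrite the caller's array with the returned contents.
    array[:] = json.loads(reply)


data = [1, 2, 3]
client_stub(data)   # the caller sees the change as if it had passed a pointer
```

If the stub knew the array was input-only, the restore step could be skipped; if it were output-only, only its size would need to be sent.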
Parameter Specification and Stub Generation
the caller and the callee need to use the same protocol
(format of messages) and the same steps; with such rules the
client and server stubs can assemble, communicate, and
interpret messages correctly
consider the following example; the procedure foobar has 3
Extended RPC Models
to solve some of the shortcomings of the original model
a. Doors
the original RPC model assumes that the caller and the
callee can communicate only by means of passing
messages over a network; what if they are colocated on
the same machine?
a door is a generic name for a procedure in the address
space of a server process that can be called by a process
colocated with the server
support from the local OS is required
1. the server process registers a door before it can be called
(door_create) and a name is attached to it
2. a client calls a door by a system call (door_call) including
all parameters
3. results are returned by the system call door_return
b. Asynchronous RPC
if there is no need to block the client until it gets a reply
two cases
1. if there is no result to be returned
e.g., adding entries in a database, ...
the server immediately sends an ack promising that it
will carry out the request
the client can now proceed without blocking
DCE (Distributed Computing Environment) RPC
a middleware and an example RPC system developed by
OSF (Open Software Foundation), now The Open Group
it is designed to execute as a layer of abstraction between
existing OSs and distributed applications
the Open Group sells the source code and vendors integrate
it into their systems
it uses the client-server programming model and
communication is by means of RPCs
services
distributed file service: a worldwide file system that
provides a transparent way of accessing files
directory service: to keep track of the location of all
resources in the system (machines, printers, data,
servers, ...); a process can ask for a resource without
knowing its location
security service: for protecting resources; access is only
through authorization
distributed time service: to maintain clocks on different
machines synchronized (clock synchronization is covered
in Chapter 5)
Steps in writing a Client and a Server in DCE RPC
the system consists of languages, libraries, daemons,
utility programs, ... for writing clients and servers
IDL (Interface Definition Language) is the interface
language - the glue that holds everything together
it contains type definitions, constant declarations
and what the procedures do (only their syntax)
Uuidgen generates a prototype IDL file with a globally unique interface
identifier
the IDL file is edited (filling the names of procedures and parameters) and
the IDL compiler is called to generate 3 files
the application writer writes the client and server code, which is then
compiled and linked together with the stubs
Binding a Client to a Server in DCE RPC
for a client to call a server, the server must be registered (1
& 2)
the registration allows the client to locate the server and
bind to it
the DCE daemon maintains a table (server, endpoint) and the
protocols the server uses
the directory server maintains the locations of all resources
in the system (machines, servers, data, ...)
two steps for server location
locate the server’s machine (3)
locate the server process on that machine (which has
what is called an endpoint or port) (4)
4.2 Remote Object (Method) Invocation (RMI)
resulted from object-based technology that has proven its
value in developing nondistributed applications
it is an expansion of the RPC mechanisms
it enhances distribution transparency because an object
hides its internals from the outside world by
means of a well-defined interface
Distributed Objects
an object encapsulates data, called the state, and the
operations on those data, called methods
methods are made available through interfaces
the state of an object can be manipulated only by invoking
methods
this allows an interface to be placed on one machine while
the object itself resides on another machine; such an
organization is referred to as a distributed object
the state of an object is not distributed, only the interfaces
are; such objects are also referred to as remote objects
the implementation of an object’s interface is called a proxy
(analogous to a client stub in RPC systems)
it is loaded into the client’s address space when a client
binds to a distributed object
tasks: a proxy marshals method invocation into messages
and unmarshals reply messages to return the result of the
method invocation to the client
a server stub, called a skeleton, unmarshals messages and
marshals replies
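A minimal sketch of the proxy/skeleton pair, assuming JSON as the wire format and a direct function call in place of the network. Counter, CounterProxy, Skeleton, and the message layout are all hypothetical names chosen for the example.

```python
import json


class Counter:
    """The remote object: its state lives only on the server side."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


class Skeleton:
    """Server-side stub: unmarshals requests, invokes the object, marshals replies."""

    def __init__(self, obj):
        self._obj = obj

    def handle(self, request_bytes):
        request = json.loads(request_bytes)
        method = getattr(self._obj, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode()


class CounterProxy:
    """Client-side proxy: same interface as Counter, but holds no state."""

    def __init__(self, transport):
        self._transport = transport   # callable that carries bytes to the server

    def add(self, n):
        request = json.dumps({"method": "add", "args": [n]}).encode()
        reply = self._transport(request)
        return json.loads(reply)["result"]


skeleton = Skeleton(Counter())
proxy = CounterProxy(transport=skeleton.handle)
first = proxy.add(5)
second = proxy.add(2)
```

The client only ever touches the proxy; the state stays with the single Counter instance behind the skeleton.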
common organization of a remote object with client-side proxy
Binding a Client to an Object
a process must first bind to an object before invoking its
Distr_object* obj_ref;        // Declare a systemwide object reference
obj_ref = …;                  // Initialize the reference to a distributed object
obj_ref->do_something();      // Implicitly bind and invoke a method
(a)
Distr_object obj_ref;         // Declare a systemwide object reference
Local_object* obj_ptr;        // Declare a pointer to local objects
obj_ref = …;                  // Initialize the reference to a distributed object
obj_ptr = bind(obj_ref);      // Explicitly bind and obtain a pointer to the local proxy
obj_ptr->do_something();      // Invoke a method on the local proxy
(b)
a) an example with implicit binding using only global references
b) an example with explicit binding using global and local references
Parameter Passing
there are two situations when invoking a method with
object reference as parameter; is the object local or
remote to the client?
remote object: copy and pass the reference of the object
as a value parameter; this means the object is passed by
reference
local object: a copy of the object is passed; this means the
object is passed by value
the situation when passing an object by reference or by value
two examples:
DCE Remote Objects
Java RMI
4.3 Message Oriented Communication
RPCs and RMIs are not adequate for all distributed system
applications
the provision of access transparency may be good but
they have semantics that is not adequate for all
applications
example problems
they assume that the receiving side is running at the
time of communication
a client is blocked until its request has been processed
Persistence and Synchronicity in Communication
assume the communication system is organized as a
computer network shown below
Persistent Transient
Asynchronous
persistent asynchronous communication
persistent synchronous communication
transient asynchronous communication
receipt-based transient synchronous communication
delivery-based transient synchronous communication at message delivery:
the sender is blocked until the message is delivered to the receiver
for further processing; e.g., asynchronous RPC
response-based transient synchronous communication: the strongest form;
the sender is blocked until it receives a reply message from the receiver
Message-Oriented Transient Communication
many applications are built on top of the simple message-
oriented model offered by the transport layer
standardizing the interface of the transport layer by
providing a set of primitives allows programmers to use
messaging protocols
they also allow porting applications
Berkeley Sockets
an example is the socket interface as used in Berkeley
UNIX
a socket is a communication endpoint to which an
application can write data that are to be sent over the
network, and from which incoming data can be read
Primitive   Executed by   Meaning
Socket      both          Create a new communication endpoint; also reserve resources to send and receive messages
Bind        servers       Attach a local address to a socket; e.g., IP address with a known port number
Listen      servers       Announce willingness to accept connections; for connection-oriented communication
Accept      servers       Block caller until a connection request arrives
Connect     clients       Actively attempt to establish a connection; the client is blocked until connection is set up
Send        both          Send some data over the connection
Receive     both          Receive some data over the connection
Close       both          Release the connection
socket primitives for TCP/IP
connection-oriented communication pattern using sockets
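The connection-oriented pattern can be tried on a single machine with Python's socket module, which exposes the same primitives. The loopback address, the use of port 0 (let the OS pick a free port), and the upper-casing "service" are assumptions made to keep the sketch self-contained.

```python
import socket
import threading


def serve_once(server_sock):
    conn, _addr = server_sock.accept()   # accept: block until a connection request arrives
    with conn:                           # leaving this block closes the connection
        data = conn.recv(1024)           # receive
        conn.sendall(data.upper())       # send


server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket
server_sock.bind(("127.0.0.1", 0))       # bind; port 0 = any free port (assumption)
server_sock.listen(1)                    # listen
port = server_sock.getsockname()[1]

server_thread = threading.Thread(target=serve_once, args=(server_sock,))
server_thread.start()

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:  # socket
    client.connect(("127.0.0.1", port))  # connect: blocks until the connection is set up
    client.sendall(b"hello")             # send
    reply = client.recv(1024)            # receive

server_thread.join()
server_sock.close()                      # close
```

The server side runs socket, bind, listen, accept; the client side runs socket, connect; both sides then send, receive, and close.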
The Message-Passing Interface (MPI)
sockets were designed to communicate across networks
using general-purpose protocol stacks such as TCP/IP
they were not designed for proprietary protocols developed
for high-speed interconnection networks; of course
portability will suffer
MPI is designed for parallel applications and tailored for
transient communication
MPI assumes communication takes place within a known
group of processes, where each group is assigned an
identifier (groupID)
each process within a group is also assigned an identifier
(processID)
a (groupID, processID) identifies the source or destination of
a message, and is used instead of a transport-level address
Primitive      Meaning
MPI_bsend      Append outgoing message to a local send buffer; to support transient asynchronous communication
MPI_send       Send a message and wait until copied to local or remote buffer (to support receipt-based transient synchronous communication)
MPI_ssend      Send a message and wait until receipt starts (to support delivery-based transient synchronous communication)
MPI_sendrecv   Send a message and wait for reply (to support response-based transient synchronous communication)
MPI_isend      Pass reference to outgoing message, and continue (a variant of MPI_send)
MPI_issend     Pass reference to outgoing message, and wait until receipt starts (a variant of MPI_ssend)
MPI_recv       Receive a message; block if there are none
MPI_irecv      Check if there is an incoming message, but do not block
Message-Oriented Persistent Communication
there are message-oriented middleware services, called
message-queuing systems or Message-Oriented Middleware
(MOM)
they support persistent asynchronous communication
they have intermediate-term storage capacity for messages,
without requiring the sender or the receiver to be active
during message transmission
unlike Berkeley sockets and MPI, message transfer may take
minutes instead of seconds or milliseconds
Message-Queuing Model
applications communicate by inserting messages in
specific queues
it permits loosely-coupled communication
the sender may or may not be running; similarly the
receiver may or may not be running, giving four possible
combinations
four combinations for loosely-coupled communications using queues
Primitive   Meaning
Put         Append a message to a specified queue; by the sender and is nonblocking
Get         Block until the specified queue is nonempty, and remove the first message
Poll        Check a specified queue for messages, and remove the first. Never block
Notify      Install a handler to be called when a message is put into the specified queue; usually a daemon
basic interface to a queue in a message-queuing system
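The four primitives map naturally onto Python's queue module. The sketch below is in-memory only, so it does not model persistence or remote queues; the class name MessageQueue and the choice to call the handler synchronously inside put are assumptions.

```python
import queue


class MessageQueue:
    def __init__(self):
        self._q = queue.Queue()       # unbounded FIFO
        self._handler = None

    def put(self, message):
        self._q.put(message)          # never blocks on an unbounded queue
        if self._handler is not None:
            self._handler(self)       # notify: run the installed handler

    def get(self):
        return self._q.get()          # blocks while the queue is empty

    def poll(self):
        try:
            return self._q.get_nowait()   # remove the first message if there is one
        except queue.Empty:
            return None                   # never block

    def notify(self, handler):
        self._handler = handler


mq = MessageQueue()
empty_poll = mq.poll()                # nothing queued yet
mq.put("m1")
mq.put("m2")
first = mq.get()
second = mq.poll()

seen = []
mq.notify(lambda q: seen.append(q.poll()))
mq.put("m3")                          # triggers the handler
```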
General Architecture of a Message-Queuing System
messages can be put only into queues that are local to the
sender (same machine or on a nearby machine on a LAN)
such a queue is called the source queue
messages can also be read only from local queues
a message put into a local queue must contain the
specification of the destination queue; hence a message-
queuing system must maintain a mapping of queues to network
locations; like in DNS
the general organization of a message-queuing system with routers
Message Brokers
how can applications understand the messages they receive?
each receiver cannot be made to understand message formats
of new applications
hence, in a message-queuing system conversions are
handled by message brokers
a message broker converts incoming messages to a format
that can be understood by the destination application based on
a set of rules
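A rule-based conversion of this kind might look as follows. The format names "orders-v1" and "billing-v2", the field names, and the single rule are invented purely for illustration.

```python
def orders_to_billing(msg):
    """One conversion rule: rename fields and change the unit of the amount."""
    return {"customer": msg["cust_id"], "amount_cents": round(msg["total"] * 100)}


# Rule table: (source format, destination format) -> conversion function.
RULES = {("orders-v1", "billing-v2"): orders_to_billing}


def broker(message, source_format, destination_format):
    """Convert a message into the format the destination application understands."""
    if source_format == destination_format:
        return message
    convert = RULES[(source_format, destination_format)]
    return convert(message)


converted = broker({"cust_id": 42, "total": 12.5}, "orders-v1", "billing-v2")
```

Adding a new application then means adding rules to the broker rather than changing every receiver.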
The Challenge
new applications
multimedia will be pervasive in 10 years (as graphics)
storage and transmission
e.g., 2 hours uncompressed HDTV (1920 × 1080) movie:
1.1 TB (1920 × 1080 × 3 × 25 × 60 × 60 × 2 bytes)
videos are extremely large, even compressed
continuous delivery
e.g., 30 frames/s (NTSC), 25 frames/s (PAL) for video
guaranteed Quality of Service
admission control
search
can we look at 100… videos to find the proper one?
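The storage figure for the uncompressed HDTV movie above can be checked with a few lines of arithmetic. The reading of the factors (3 bytes per pixel, 25 frames per second, 2 hours of 60 × 60 seconds) and the use of decimal terabytes are assumptions about what the formula intends.

```python
width, height = 1920, 1080
bytes_per_pixel = 3           # assumed: 24-bit colour
frames_per_second = 25        # assumed: the "25" in the formula
seconds = 2 * 60 * 60         # 2 hours

bytes_per_second = width * height * bytes_per_pixel * frames_per_second
total_bytes = bytes_per_second * seconds
total_tb = total_bytes / 10**12   # decimal terabytes
```

Under these assumptions the stream needs about 155.5 MB per second, which is why even compressed video remains large.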
Types of Media
two types
discrete media: text, executable code, graphics, images;
temporal relationships between data items are not
fundamental to correctly interpret the data
continuous media: video, audio, animation; temporal
relationships between data items are fundamental to
correctly interpret the data
a data stream is a sequence of data units and can be applied
to discrete as well as continuous media
stream-oriented communication provides facilities for the
exchange of time-dependent information (continuous media)
such as audio and video streams
timing in transmission modes
asynchronous transmission mode: data items are transmitted
one after the other, but no timing constraints; e.g. text transfer
synchronous transmission mode: a maximum end-to-end
delay defined for each data unit; it is possible that data can be
transmitted faster than the maximum delay, but not slower
isochronous transmission mode: maximum and minimum
end-to-end delay are defined; also called bounded delay jitter;
applicable for distributed multimedia systems
a continuous data stream can be simple or complex
simple stream: consists of a single sequence of data; e.g.,
mono audio, video only
complex stream: consists of several related simple streams
that must be synchronized; e.g., stereo audio, video
consisting of audio and video (may also contain subtitles,
translation to other languages, ...)
movie as a set of simple streams
a stream can be considered as a virtual connection between a
source and a sink
the source or the sink could be a process or a device
Quality of Service (QoS)
QoS requirements describe what is needed from the
underlying distributed system and network to ensure
acceptable delivery; e.g. viewing experience of a user
for continuous data, the concerns are
timeliness: data must be delivered in time
exceeded
quality of perception; highly subjective
QoS Dimensions
timeliness dimensions
latency (maximum delay between consecutive frames)
presentation)
jitter (delay variance)
volume dimensions
throughput in frames/sec or bits/sec or bytes/sec
reliability dimensions
MTBF (Mean Time Between Failure) of disks
QoS Requirements
deterministic
precise values or ranges
probabilistic
probability of the required QoS
stochastic distributions
e.g., frame arrival should follow normal distribution with
QoS Management
can be static or dynamic
Static QoS Management Functions
specification
Dynamic QoS Management Functions
monitoring
notices deviation from QoS level
at a certain level of granularity (e.g., every 100 ms)
policing
detect participants not keeping themselves to the contract
e.g., source sends faster than negotiated (e.g., 25 fps)
maintenance
sustaining the negotiated QoS
e.g., the system requires more resources
renegotiation
client tries to adapt; maybe it can accept a lower QoS
QoS requirements can be specified using flow specification
containing bandwidth requirements, transmission rates,
delays, ...
e.g. by Partridge (1992)
it uses the token bucket algorithm which specifies how the
stream will shape its network traffic (in fact the leaky
bucket, as used in networking)
the idea is to shape bursty traffic into fixed-rate traffic by
the input rate may vary, but the output rate remains
constant
the principle of a token bucket algorithm
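A discrete-time sketch of the token bucket idea: tokens arrive at a fixed rate, and one data unit may leave per token. This is a simplified model rather than the flow specification itself; the parameters rate and capacity and the per-tick accounting are assumptions.

```python
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per tick
        self.capacity = capacity    # maximum number of tokens the bucket can hold
        self.tokens = 0

    def tick(self, backlog):
        """Advance one time unit; return how many queued data units may be sent."""
        self.tokens = min(self.capacity, self.tokens + self.rate)
        sent = min(backlog, self.tokens)
        self.tokens -= sent
        return sent


def shape(arrivals, rate, capacity):
    """Feed a sequence of per-tick arrivals through the bucket; return per-tick output."""
    bucket = TokenBucket(rate, capacity)
    backlog, output = 0, []
    for arriving in arrivals:
        backlog += arriving
        sent = bucket.tick(backlog)
        backlog -= sent
        output.append(sent)
    return output


# A burst of 5 units is smoothed to one unit per tick when capacity is 1.
shaped = shape([5, 0, 0, 0, 0, 0], rate=1, capacity=1)
# With capacity 3, tokens saved during idle ticks let a bounded burst through.
bursty = shape([0, 0, 0, 4, 0, 0], rate=1, capacity=3)
```

With capacity 1 the output rate stays constant while the input varies, which is the leaky-bucket behaviour; a larger capacity trades that strictness for a limited burst allowance.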
Specifying QoS
the basic organization of RSVP for resource reservation in a distributed system
Stream Synchronization
how to maintain temporal relations between streams, e.g., lip
synchronization
two approaches
1. explicitly by operating on the data units of simple
streams; the responsibility of the application