
Distributed Systems

( CN-6123 )

Presented by
Dr. M. Prasad, Ph.D, MISTE,
Professor,
Master of Science in Computer Science and Networking Program,
School of Computing and Informatics,
College of Engineering and Technology,
Dilla University, Dilla, Ethiopia.
CHAPTER – 3

Processes
Chapter 3 Syllabus

Processes :

Threads and their Implementation,

Anatomy of Clients,

Servers and Design Issues,

Code Migration.
Introduction
• Process -> A program in execution.
• Thread -> A path of execution within a process.
• A process can contain multiple threads, which enables parallelism.
• A thread is also known as a lightweight process.
• A thread executes its own piece of code, independently from other threads.
• A thread shares some information with its peer threads, such as the code segment, data segment, and open files.
• Ex.:
o Browser - Multiple tabs can run as different threads.
o MS Word - One thread formats the text, another processes input, another runs the spell checker, etc.
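The sharing between peer threads can be illustrated with a minimal sketch, assuming Python's standard threading module. The names shared_log and worker are hypothetical and chosen only for illustration.

```python
import threading

shared_log = []  # one list in the process's data, visible to every thread

def worker(name):
    # Each thread executes this code independently,
    # but all of them append to the same shared list.
    shared_log.append(name)

threads = [threading.Thread(target=worker, args=(f"tab-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_log))
```

The order in which the threads append is not fixed, which is why the result is sorted before printing.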
Introduction . . .
Advantages of threads:
1. Responsiveness
2. Faster context switch
3. Effective utilization of multiprocessor system
4. Resource sharing
5. Communication
6. Enhanced throughput of the system
Threads Implementation

• Threads are implemented and used in two kinds of systems:

Non-Distributed Systems

Distributed Systems
Threads Implementation – Non Distributed
• Two types of threading models
o Single Threading - A process has an address space (containing program text and data) and a single thread of control.
o Multi Threading - Threads allow multiple executions to take place in the same process environment. Ex.: Word Processor.
Threads Implementation – Non Distributed . . .
• Advantages of Multi Threading
- Simplifies the programming model.
- Threads are easier to create and destroy than processes.
- Performance improves.
- Real parallelism is possible in a multiprocessor system.
• Multithreading is useful in large applications that are developed as a collection of cooperating programs, each executed by a separate process.
• In a UNIX environment, cooperation between programs is implemented by Inter-Process Communication (IPC).
• The major drawback of all IPC mechanisms is that communication requires extensive context switching.
Threads Implementation – Non Distributed . . .
• Solution: Use threads within a single process that communicate through shared data.
• Threads are provided in the form of a thread package.
• The package contains
o Operations to create and destroy threads.
o Operations on synchronization variables such as mutexes and condition variables.
• Two approaches to constructing a thread package:
o User Mode
o OS Kernel Mode
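The operations a thread package offers can be sketched with Python's standard threading module, used here only as a stand-in for a generic thread package. The variable and function names (counter, increment, consumer, producer) are illustrative assumptions.

```python
import threading

counter = 0
counter_lock = threading.Lock()        # mutex protecting `counter`

def increment(times):
    global counter
    for _ in range(times):
        with counter_lock:             # lock and unlock around the critical section
            counter += 1

items = []
items_ready = threading.Condition()    # condition variable (with its own lock)
received = []

def consumer():
    with items_ready:
        while not items:               # re-check the condition after every wakeup
            items_ready.wait()
        received.append(items.pop())

def producer():
    with items_ready:
        items.append("request")
        items_ready.notify()           # wake up one waiting thread

# Thread creation ...
workers = [threading.Thread(target=increment, args=(1000,)) for _ in range(4)]
workers += [threading.Thread(target=consumer), threading.Thread(target=producer)]
for t in workers:
    t.start()
# ... and waiting for thread termination.
for t in workers:
    t.join()
```

Without the mutex, the four incrementing threads could interleave their read-modify-write steps and lose updates.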
Threads Implementation – Non Distributed . . .
User Mode
+ ::
• Cheap to create and destroy threads.
• Just allocate and free memory.
• Context switching can be done using a few instructions.
• Store and reload only CPU register values.
- ::
• Invocation of a blocking system call blocks the entire process, and thus all threads in it.
OS Kernel Mode
- :: Thread operations are expensive because each one requires a system call.
Solution: Light-Weight Process (LWP), a hybrid of user-level and kernel-level threads.
Threads Implementation – Non Distributed . . .
• An LWP runs in the context of a single (heavy-weight) process, and there can be several LWPs per process.
• In addition, the system offers a user-level thread package that provides operations for creating and destroying threads and for thread synchronization (mutexes and condition variables).
• The thread package can be shared by multiple LWPs.
Threads Implementation – Distributed
• Threads are attractive in DS because they allow blocking system calls without blocking the entire process.
• This makes it easier to maintain multiple logical connections at the same time, as in multithreaded clients and servers.

Multithreaded Clients:
• Hide communication latencies, which helps achieve distribution transparency.
o Ex.: Consider a Web browser; fetching the different parts of a page can be implemented as separate threads, each opening its own TCP/IP connection to the server (or to separate, replicated servers), and each displaying its results as soon as its part of the page arrives.
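The browser example can be sketched as follows. This is a simulation under stated assumptions: fetch is a hypothetical stand-in that sleeps instead of opening a real TCP/IP connection, and the part names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(part):
    # Stand-in for connecting to a server and downloading one part of a page.
    time.sleep(0.05)                   # simulated network latency
    return f"<{part}>"

parts = ["html", "image-1", "image-2", "stylesheet"]

# One thread per part: while one thread is blocked waiting for its reply,
# the other threads continue, so the latencies overlap.
with ThreadPoolExecutor(max_workers=len(parts)) as pool:
    results = list(pool.map(fetch, parts))

print(results)
```

Because the four waits overlap, the total time is close to one latency rather than four.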
Threads Implementation – Distributed . . .
Multithreaded Servers:
• The main benefits of multithreading at the server side in distributed systems are that it
o Simplifies server code.
o Makes servers easier to develop.
o Exploits parallelism.
o Attains high performance.
• Servers can be constructed in three ways.
a. Single-threaded process
b. Threads
c. Finite-state machine
Threads Implementation – Distributed . . .
Multithreaded Servers : Single-threaded Process
• It gets a request, examines it, and carries it out to completion before getting the next request.
• The server is idle while waiting for a disk read, i.e., system calls are blocking.
Multithreaded Servers : Threads
• Important for implementing servers.
• Ex.: A file server
o The Dispatcher Thread reads incoming requests for a file operation from clients and passes each one to an idle worker thread.
o The Worker Thread performs a blocking disk read; while it is blocked, another thread may continue, say the dispatcher or another worker.
Multithreaded Servers : Finite-State Machine
• Used if threads are not available.
• It gets a request, examines it, and tries to fulfill the request from the cache; otherwise it sends a request to the file system. Instead of blocking, it records the state of the current request and proceeds to the next request.

Model                     Characteristics
Single-threaded process   No parallelism, blocking system calls
Threads                   Parallelism, blocking system calls (thread only)
Finite-state machine      Parallelism, non-blocking system calls
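The finite-state machine model can be sketched as a single-threaded event loop. This is a simulation under assumptions: the cache and disk are plain dictionaries, a disk read only schedules a later reply event, and all names are hypothetical.

```python
from collections import deque

cache = {"a.txt": "cached A"}
disk = {"b.txt": "disk B", "c.txt": "disk C"}

pending = {}             # request id -> recorded state of a request awaiting the disk
disk_replies = deque()   # completed disk reads, delivered later as events
log = []                 # (request id, reply) in the order replies were sent

def start_disk_read(req_id, name):
    # Non-blocking: only schedules the read; the reply arrives as a later event.
    disk_replies.append((req_id, disk[name]))

def handle_request(req_id, name):
    if name in cache:
        log.append((req_id, cache[name]))          # served from the cache
    else:
        pending[req_id] = {"file": name, "state": "WAITING_FOR_DISK"}
        start_disk_read(req_id, name)              # do not block; move on

def handle_disk_reply(req_id, data):
    state = pending.pop(req_id)                    # restore the recorded state
    cache[state["file"]] = data
    log.append((req_id, data))

events = deque([("request", 1, "b.txt"), ("request", 2, "a.txt"), ("request", 3, "c.txt")])
while events or disk_replies:
    if events:
        _, req_id, name = events.popleft()
        handle_request(req_id, name)
    else:
        handle_disk_reply(*disk_replies.popleft())
```

Request 2 is answered before request 1 because the server did not block on the disk read for request 1.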
Anatomy of Clients

• Two issues

o Networked user interfaces.

o Client-side software for distribution transparency.
Anatomy of Clients – Net. User Interface
• The goal is to create a convenient environment for the interaction of a human user with a remote server.

• Two ways of interaction can be supported.

1. For each remote service, the client machine has a separate counterpart that can contact the service over the network.

Ex.: Mobile phones with simple displays and a set of keys.

2. Provide direct access to remote services by offering only a convenient user interface, such as a GUI.

Ex.: The X Window System


Anatomy of Clients – Client S/W for DT
• In addition to the user interface, parts of the processing and data level in a client-server application are executed at the client side.
Ex.: Embedded client software for ATMs, cash registers, etc.
• Client software can also include components to achieve distribution transparency.
Ex.: Replication Transparency - In a distributed system with replicated servers, the client proxy can send a request to each replica; client-side software then transparently collects all responses and passes a single return value to the client application.
Anatomy of Servers
• A server is a process implementing a specific service on behalf of a collection of clients.

• Each server waits for an incoming request from a client and subsequently ensures that the request is taken care of, after which it waits for the next incoming request.

• Two important concepts

o Design Issues

o Server Clusters
Anatomy of Servers – Design Issues

General Design Issues

• How to organize servers?

• Where do clients contact a server?

• Whether and how a server can be interrupted?

• Whether or not the server is stateless?


Anatomy of Servers – Design Issues . . .
How to organize servers?

• Iterative server

o The server itself handles the request and returns the result.

• Concurrent server

o It passes a request to a separate process or thread and immediately waits for the next incoming request.

Ex.: A multithreaded server, or a server that forks a new process for each request, as is done in UNIX.
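A concurrent server can be sketched with Python's standard socketserver module, which starts one thread per incoming connection. The handler name EchoHandler and the upper-casing "service" are illustrative assumptions, and the test client runs in the same program only to keep the sketch self-contained.

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Runs in a separate thread for every connection.
        line = self.rfile.readline()
        self.wfile.write(line.upper())

# Port 0 asks the local OS to assign a free endpoint dynamically.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler)
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

# A client contacting the server at its endpoint.
with socket.create_connection((host, port)) as conn:
    conn.sendall(b"hello\n")
    reply = conn.makefile("rb").readline()

server.shutdown()
server.server_close()
print(reply)
```

Replacing socketserver.ThreadingTCPServer with socketserver.TCPServer would give an iterative server that handles one request at a time.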


Anatomy of Servers – Design Issues . . .

Where do clients contact a server?

• Clients use endpoints (ports) at the machine where the server is running; each server listens to a specific endpoint.
• How do clients know the endpoint of a service?
o Endpoints are globally assigned for well-known services.
Ex.: FTP is on TCP port 21, HTTP is on TCP port 80.
o For services that do not require preassigned endpoints, an endpoint can be dynamically assigned by the local OS.
• IANA (Internet Assigned Numbers Authority) Ranges
o IANA has divided the port numbers into three ranges:
Anatomy of Servers – Design Issues . . .

Where do clients contact a server? . . .

• Well-known ports (0 to 1023): Assigned and controlled by IANA for standard services.
Ex.: DNS uses port 53.
• Registered ports (1024 to 49151): Not assigned and controlled by IANA; they can only be registered with IANA to prevent duplication.
Ex.: MySQL uses port 3306.
• Dynamic ports or ephemeral ports (49152 to 65535): Neither controlled nor registered by IANA.
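The three IANA ranges can be expressed as a small helper. The function name iana_port_range is hypothetical; the boundaries used are the standard ones (0 to 1023, 1024 to 49151, 49152 to 65535).

```python
def iana_port_range(port):
    """Return the IANA range that a TCP/UDP port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError(f"not a valid port number: {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"

for p in (53, 3306, 50000):
    print(p, iana_port_range(p))
```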
Anatomy of Servers – Design Issues . . .

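Where do clients contact a server? . . .

• How can the client learn a dynamically assigned endpoint?
There are two approaches by which a client can find a server's endpoint.
a. Special Daemon - Runs on each server machine, listens on a well-known endpoint itself, and keeps track of the current endpoint of each service.
b. Superserver - Listens on the endpoints of several services and starts the appropriate server when a request arrives (e.g., inetd in UNIX).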
Anatomy of Servers – Design Issues . . .

Whether and how a server can be interrupted?

• For instance, a user may want to interrupt a file transfer, maybe because it was the wrong file.

• Let the user exit the client application - This breaks the connection to the server.

• Let the client send out-of-band data.

o Send it on the same connection as urgent data, as is possible in TCP.


Anatomy of Servers – Design Issues . . .

Whether or not the server is stateless

• Stateless server - It does not keep information on the state of its clients.
Ex.: A Web server
• Soft state - A server promises to maintain state for a limited time.
Ex.: Keeping a client informed about updates; after the time expires, the client has to poll.
• Stateful server - It maintains information about its clients.
Ex.: A file server that allows a client to keep a local copy of a file and to perform update operations on it.
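Soft state can be sketched as a subscription that expires after a fixed lease. The class SoftStateServer, its methods, and the explicit now parameter (used instead of a real clock so the behaviour is easy to follow) are assumptions made for illustration.

```python
class SoftStateServer:
    """Keeps per-client state only for a limited time (a lease)."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.subscribers = {}          # client id -> time at which the lease expires

    def subscribe(self, client, now):
        self.subscribers[client] = now + self.lease_seconds

    def clients_to_notify(self, now):
        # Drop expired leases; those clients must poll from now on.
        self.subscribers = {c: t for c, t in self.subscribers.items() if t > now}
        return sorted(self.subscribers)

server = SoftStateServer(lease_seconds=10)
server.subscribe("c1", now=0)          # lease runs until time 10
server.subscribe("c2", now=5)          # lease runs until time 15
print(server.clients_to_notify(now=8))
print(server.clients_to_notify(now=12))
```

A stateless server would keep no subscribers table at all, while a stateful server would keep entries until they are explicitly deleted.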
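Anatomy of Servers – Server Clusters
• A collection of machines connected through a network (normally a LAN with high bandwidth and low latency), where each machine runs one or more servers.
• It is logically organized into three tiers: a (logical) switch that routes client requests, application/compute servers, and distributed file/database systems.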
Code Migration
• So far, communication was concerned with passing data.

• We may also pass programs, even while they are running and across heterogeneous systems.

• Code migration involves moving data as well: when a program migrates while running, its status, pending signals, and other parts of its environment, such as the stack and the program counter, also have to be moved.


Code Migration - Reasons

• To improve performance - Load balancing.

• To reduce communication.

• To exploit parallelism - Without having to write parallel programs.

• To gain flexibility by dynamically configuring distributed systems.
Code Migration - Models

• A process consists of three segments:

o Code segment - The set of instructions.

o Resource segment - References to external resources such as files, printers, ...

o Execution segment - Stores the current execution state of a process, such as private data, the stack, and the program counter.


Code Migration – Models…
Code migration can provide:
• Weak Mobility
o Transfers only the code segment and maybe some initialization data; in this case a program always starts from its initial state. Ex.: Java Applets
o Execution can be by the target process (in its own address space, as with Java Applets) or by a separate process.
• Strong Mobility
o Transfers the code and execution segments; this makes it possible to migrate a process while it is executing.
o Can also be supported by remote cloning: creating an exact copy of the original process and running it on a different machine.
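Weak mobility can be sketched by shipping source text plus initialization data and starting it from scratch at the receiver. This is an illustration only: the message is a plain tuple rather than a real network transfer, the names code_segment, init_data, and execute_migrated are hypothetical, and executing received code with exec is unsafe without sandboxing or authentication.

```python
# Sender side: only the code segment (as source text) and some initialization data.
code_segment = """
def run(init):
    total = 0
    for value in init["values"]:
        total += value
    return total
"""
init_data = {"values": [1, 2, 3]}
message = (code_segment, init_data)    # what would travel over the network

# Receiver side: the program always starts from its initial state.
def execute_migrated(message):
    code, init = message
    namespace = {}                     # fresh namespace: no execution segment arrives
    exec(code, namespace)              # load the code segment (trusted code assumed)
    return namespace["run"](init)

result = execute_migrated(message)
print(result)
```

Strong mobility would additionally have to transfer the stack and program counter, which plain Python does not expose for this purpose.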
Code Migration – Models…

A further distinction can be made between

o Sender-initiated: Migration is initiated by the machine where the code resides or is currently running.

Ex.: Uploading programs to a server; may require authentication or that the client is a registered one.

o Receiver-initiated: Migration is initiated by the target machine.

Ex.: Java Applets; easier to implement.


Code Migration – Process – Resource Binding
• Binding by identifier (the strongest): A resource is referred to by its identifier.
Ex.: A URL that refers to a Web page, or an FTP server referred to by its IP address.
• Binding by value (weaker): Only the value of a resource is needed; in this case another resource can provide the same value.
Ex.: Standard libraries of programming languages such as C or Java, which are normally locally available, but whose location in the file system may vary from site to site.
• Binding by type (weakest): A process needs a resource of a specific type.
Ex.: References to local devices, such as monitors, printers, ...
Code Migration – Resource – Machine Binding

• Unattached Resources - Can be easily moved with the migrating program (such as data files associated with the program).

• Fastened Resources - Such as local databases and complete Web sites; moving or copying may be possible, but is very costly.

• Fixed Resources - Intimately bound to a specific machine or environment, such as local devices, and cannot be moved.
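The process-to-resource and resource-to-machine bindings can be combined to decide what to do with a resource when code migrates. The sketch below encodes one preferred action per combination, following the classification commonly given in distributed systems textbooks; the mapping is not stated on these slides and is an assumption here, and alternative actions are often acceptable.

```python
# Actions:
#   MV = move the resource          CP = copy the value of the resource
#   GR = establish a global (system-wide) reference
#   RB = rebind the process to a locally available resource of the same type
ACTIONS = {
    ("identifier", "unattached"): "MV",
    ("identifier", "fastened"):   "GR",
    ("identifier", "fixed"):      "GR",
    ("value",      "unattached"): "CP",
    ("value",      "fastened"):   "GR",
    ("value",      "fixed"):      "GR",
    ("type",       "unattached"): "RB",
    ("type",       "fastened"):   "RB",
    ("type",       "fixed"):      "RB",
}

def migration_action(process_binding, machine_binding):
    """Look up the preferred action for one resource of a migrating process."""
    return ACTIONS[(process_binding, machine_binding)]

# A printer is bound by type and fixed to its machine: rebind to a local printer.
print(migration_action("type", "fixed"))
```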


Code Migration – Heterogeneous Systems
• Distributed systems are constructed on a heterogeneous collection of platforms, each with its own OS and machine architecture.
• Heterogeneity problems are similar to those of portability.
• Migration is easier in some languages:
o For scripting languages, the source code is interpreted.
o For Java, the compiler generates intermediate code for a virtual machine.
• In weak mobility - Since no runtime information is transferred, it suffices to compile the source code for each potential platform.
• In strong mobility - It is difficult to transfer the execution segment, since it may contain platform-dependent information such as register values.
Code Migration – Heterogeneous Systems . . .
There are, in principle, three ways to handle the migration of a virtual machine's memory image:

• Pushing memory pages to the new machine and resending the ones that are later modified during the migration process.

• Stopping the current virtual machine, migrating memory, and starting the new virtual machine.

• Letting the new virtual machine pull in new pages as needed, that is, letting processes start on the new virtual machine immediately and copying memory pages on demand.
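The first approach (pushing pages and resending modified ones), followed by a brief final stop to send whatever is still modified, can be simulated as below. This is a toy model under assumptions: memory is a dictionary from page number to contents, and writes_per_round is a hypothetical list giving the pages the still-running virtual machine modifies during each round.

```python
def precopy_migrate(memory, writes_per_round):
    """Copy `memory` to a new machine while the virtual machine keeps running.

    memory:            page number -> contents on the source machine
    writes_per_round:  list of {page: new contents} applied during each round
    Returns the target memory and the number of pages sent in each round.
    """
    target = {}
    dirty = set(memory)                # first round: every page must be pushed
    sent_per_round = []
    for writes in writes_per_round:
        for page in dirty:
            target[page] = memory[page]        # push pages to the new machine
        sent_per_round.append(len(dirty))
        memory.update(writes)                  # the VM modifies pages meanwhile
        dirty = set(writes)                    # modified pages must be resent
        if not dirty:
            break
    # Stop the virtual machine and send the pages that are still dirty.
    for page in dirty:
        target[page] = memory[page]
    sent_per_round.append(len(dirty))
    return target, sent_per_round

source = {0: "a", 1: "b", 2: "c"}
target, sent = precopy_migrate(source, [{1: "b2"}, {}])
print(target, sent)
```

The fewer pages remain dirty at the end, the shorter the virtual machine has to be stopped for the final copy.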
Dr. M. Prasad, Ph.D, MISTE,
Professor,
Master of Science in Computer Science and Networking Program,
School of Computing and Informatics,
E-mail : prasads.maddula@du.edu.et
Phone No : 0934845087
