
Processes

CS403/534
Distributed Systems
Erkay Savas
Sabanci University

1
Overview
• Processes
• Introduction to threads
• Threads in distributed systems
• Process (code) migration
• Software agents

2
Processes
• A process can be defined as a program in
execution
• Each process has a separate address space
• More than one process can be active at a given
time
• The OS facilitates sharing of computer resources
by many processes concurrently in a secure way
• To give the illusion that many processes are
executing concurrently, the CPU switches
between processes
• Concept of virtual processors
• Process table
3
Threads: Finer Granularity
• Switching between processes may be too
expensive
– CPU context (register values, program counter, stack
pointer, etc.) must be saved
– Registers of MMU must be modified
– Invalidate TLB
– etc.
• A thread is similar to a process; however:
– Multiple threads can share the same address space
– Thread context often consists of only CPU context
and thread management information (e.g. the state of
a thread (runnable, suspended, blocked, etc.))
4
Protection & Threads
• In order to protect the system resources from
competing processes and the processes from
each other, the OS takes extensive measures
which are expensive
• With threads, such measures are taken only when
strictly necessary
– Threads within a process share the same address
space
• Therefore, application developers have to be
extremely careful since there is no automatic
protection mechanism
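A minimal sketch of what "no automatic protection" means in practice, assuming POSIX threads. The names `counter`, `counter_lock`, and `run_shared_counter` are made up for this illustration. All threads update one variable in the shared address space, and the only protection is the mutex the programmer adds explicitly.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

/* Shared by all threads: they live in the same address space. */
static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: add INCREMENTS to the shared counter, one step at a time. */
static void *add_to_counter(void *unused)
{
    (void)unused;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);   /* protection is explicit */
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Start the threads, wait for them, and return the final counter value. */
long run_shared_counter(void)
{
    pthread_t tid[NUM_THREADS];

    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, add_to_counter, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

Without the lock and unlock calls the increments of different threads can interleave, and the final value may fall short of NUM_THREADS * INCREMENTS.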
5
Threads
• Thread switching can be done entirely
independently of the OS
• Creating and destroying threads is much
cheaper than doing so for processes (no OS
help)
– One study reveals that creating a thread is more than
ten times faster than creating a process.
• It is much less time consuming to switch
between threads
• Multithreaded applications can lead to
performance gains
• Coding multithreaded applications requires
skill
6
Threads & Non-distributed Systems
• Single processor systems: Execution cannot
proceed if a single thread of control blocks
– While waiting on user’s input, the single thread cannot
process user’s previous input
• Multiprocessor systems: In order to exploit the
inherent concurrency of a program, each thread
is assigned to a different CPU with a different
subtask.
• Large applications
– a collection of cooperating programs, each to be
executed by a separate process
– Cooperating processes communicate through IPC
mechanisms
7
Thread Implementation
Something to think about:
Where do you think we have to implement threads?
1. User-level thread implementations
• Run-time library linked to the applications.
• Kernel has no knowledge of user-level threads.
2. In the kernel
• Many kernels provide native support for
multithreaded processes, including Windows NT, Solaris,
etc.
3. Lightweight processes (kernel level threads)
• A hybrid approach (Solaris 2)
8
Threads in Distributed Systems
• Multithreaded clients: Main issue is hiding
network latency
• Example - Multithreaded Web client:
– Web browser scans an incoming HTML page, and finds
that other parts need to be fetched
– Each part is fetched by a separate thread, each doing
a blocking HTTP request
– As more data come in, the browser displays them
– Meanwhile, the user can browse through the document
– Each connection is handled by a different thread
– In the case of replicated web servers, accessing
different parts of a web page on different servers
may even result in a real speedup
9
Client-Side Software
• Essence: Often focused on providing distribution
transparency
– Access transparency: client-side stubs for RPCs and
RMIs
– Location/migration/relocation transparency: let client-
side software keep track of actual location of the
server
– Replication transparency: multiple invocations handled
by client stub to replicated servers, a single return
value to client
– Failure transparency: can often be placed only at
client-side (client tries to mask server and
communication failures)
– Concurrency and persistence transparency are often
handled at server side
10
Replication Transparency
• A possible approach to transparent replication of
a remote object using a client-side solution

[Figure: the proxy replicates the invocation request to replica 1, replica 2, and replica 3.]

11
Multithreaded Servers (1)
A multithreaded file server organized in a
dispatcher/worker model.

[Figure: a request coming in from the network passes through the OS to the dispatcher thread, which runs alongside several worker threads inside the server process.]

12
Multithreaded Servers (2)
• Main issues
1. Improve performance:
– Starting a thread to handle an incoming request is much
cheaper than starting a new process
– Having a single-threaded server prohibits simply scaling
the server to a multiprocessor system
– As with clients, hides network latency by reacting to the next
request while the previous one is being processed
2. Better structure:
– Most servers have high I/O demands. Using simple,
well-understood blocking calls simplifies overall
structure
– Multithreaded programs tend to be smaller and easier
to understand due to simplified flow of control
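The dispatcher/worker organization can be sketched with POSIX threads as follows. This is an illustration under assumptions, not the file server from the slides: requests are plain integers, "handling" a request just adds it to a running sum, and the names `enqueue`, `dequeue`, `dispatch_requests`, and the `STOP` sentinel are invented here.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 3
#define QUEUE_SIZE  8
#define STOP        (-1)   /* hypothetical sentinel telling a worker to exit */

static int queue[QUEUE_SIZE];
static int head = 0, tail = 0, count = 0;
static long handled_sum = 0;               /* stands in for "work done" */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;

/* Dispatcher side: queue one request for the next free worker. */
static void enqueue(int request)
{
    pthread_mutex_lock(&lock);
    while (count == QUEUE_SIZE)
        pthread_cond_wait(&not_full, &lock);
    queue[tail] = request;
    tail = (tail + 1) % QUEUE_SIZE;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

/* Worker side: block until a request is available, then take it. */
static int dequeue(void)
{
    pthread_mutex_lock(&lock);
    while (count == 0)
        pthread_cond_wait(&not_empty, &lock);
    int request = queue[head];
    head = (head + 1) % QUEUE_SIZE;
    count--;
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&lock);
    return request;
}

static void *worker(void *unused)
{
    (void)unused;
    for (;;) {
        int request = dequeue();           /* simple blocking call */
        if (request == STOP)
            return NULL;
        pthread_mutex_lock(&lock);
        handled_sum += request;            /* "handle" the request */
        pthread_mutex_unlock(&lock);
    }
}

/* Dispatcher: accept requests 1..n, then shut the workers down. */
long dispatch_requests(int n)
{
    pthread_t tid[NUM_WORKERS];

    handled_sum = 0;
    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int r = 1; r <= n; r++)
        enqueue(r);
    for (int i = 0; i < NUM_WORKERS; i++)
        enqueue(STOP);                     /* one sentinel per worker */
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(tid[i], NULL);
    return handled_sum;
}
```

Each worker is written as a straight-line loop around a blocking call, which is the "better structure" argument: the concurrency lives in the queue, not in the request-handling code.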
13
Servers: Basic Models
– A server is a process that waits for incoming service
requests at a specific transport-layer address
(endpoint or port).
– In practice, there is a fixed mapping between a port
and a service for well-known services:
Service    Port  Description
ftp-data   20    File transfer [default data]
ftp        21    File transfer [control]
telnet     23    Telnet
-          24    Any private mail system
smtp       25    Simple mail transfer
login      49    Login host protocol
sunrpc     111   SUN RPC (portmapper)
courier    530   Xerox RPC

– Services that do not have a pre-assigned endpoint can be
located by contacting a daemon that listens on a
well-known port.
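The fixed mapping between service and port can be sketched as a lookup table. The entries come from the slide's table (with the mail service spelled smtp); the function name `lookup_port` and the -1 return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct service_entry {
    const char *name;
    int port;
};

/* Named entries from the slide's table of well-known services. */
static const struct service_entry well_known[] = {
    { "ftp-data", 20 },
    { "ftp",      21 },
    { "telnet",   23 },
    { "smtp",     25 },
    { "login",    49 },
    { "sunrpc",  111 },
    { "courier", 530 },
};

/* Return the port for a well-known service, or -1 if the service has no
 * pre-assigned endpoint and must be located through a daemon instead. */
int lookup_port(const char *service)
{
    assert(service != NULL);
    for (size_t i = 0; i < sizeof well_known / sizeof well_known[0]; i++) {
        if (strcmp(well_known[i].name, service) == 0)
            return well_known[i].port;
    }
    return -1;
}
```

A client that gets -1 back would fall back to the second scheme on the slide: ask the daemon on its own well-known port where the service currently listens.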
14
Servers: Design Issues
a) Client-to-server binding using a daemon as in DCE
b) Client-to-server binding using a superserver as in UNIX

15
Server Types
• Iterative Server:
– The server itself handles all the processing a request requires
• Concurrent server:
– The server listens to the specified endpoint, passes the incoming
requests to another process or thread
– In many UNIX systems, the server forks a new process for each
incoming request
• Superserver:
– A superserver listens to several well-known ports and provides
several independent services. In practice, when a service request
comes in, it starts a process to handle the request (UNIX inetd)
– This prevents many processes from doing nothing but waiting for client
requests in a passive mode
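The difference between an iterative and a fork-per-request concurrent server can be sketched without any networking, assuming a POSIX system. Requests are small integers, the made-up handler `handle` doubles them, and each child reports its result through its exit status. A real server would reply over the client's connection instead.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical request handler: the "service" doubles a small number. */
static int handle(int request)
{
    return 2 * request;
}

/* Iterative server: the server itself processes each request in turn. */
int serve_iteratively(const int *requests, int n)
{
    int total = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += handle(requests[i]);
    return total;
}

/* Concurrent server: fork one child per request, then collect the results.
 * Results travel back in the exit status, so they must fit in 0..255.
 * Returns -1 if a child could not be created or collected. */
int serve_concurrently(const int *requests, int n)
{
    int total = 0;
    int started = 0;
    int failed = 0;

    assert(n >= 0);
    for (int i = 0; i < n && !failed; i++) {
        pid_t pid = fork();
        if (pid < 0)
            failed = 1;
        else if (pid == 0)
            _exit(handle(requests[i]));   /* child: handle one request */
        else
            started++;
    }
    for (int i = 0; i < started; i++) {
        int status = 0;
        if (wait(&status) < 0)
            failed = 1;
        else if (WIFEXITED(status))
            total += WEXITSTATUS(status);
    }
    return failed ? -1 : total;
}
```

Both functions compute the same answer; the concurrent version pays for a process creation per request, which is the cost the multithreaded-server slides argue threads reduce.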

16
Servers and State
• Stateless servers: Never keep information on a
client’s state after having handled a request
– Cannot anticipate the client’s behavior
– Do not keep track of clients
– Do not promise to invalidate a client’s cache
– Example: Web server
• Consequences:
– Clients and servers are completely independent
– Possible loss of performance because, e.g., a server
can’t anticipate client behavior (think of prefetching
file blocks)

17
Stateful Servers
• Stateful servers: Keep track of the state of
clients:
– Know which data a client has cached, and allow clients to
keep local copies of shared data
– For example, (client, file) entries are kept in the
server
– Drawback: When the server crashes, it must recover its
entire state just before the crash (inconsistency)
– The server keeps a record of a client’s behavior.
– Example: cookie
• Observation: The performance of stateful
servers can be extremely high, provided clients
are allowed to keep local copies.
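A minimal sketch of the (client, file) table a stateful server might keep. The fixed-size array and the function names `record_cached`, `has_cached`, `clients_to_invalidate`, and `simulate_crash` are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ENTRIES 64   /* hypothetical table capacity */

struct cache_entry {
    long client_id;
    long file_id;
};

static struct cache_entry table[MAX_ENTRIES];
static int num_entries = 0;

/* Does the server know that this client holds a local copy of the file? */
bool has_cached(long client_id, long file_id)
{
    for (int i = 0; i < num_entries; i++)
        if (table[i].client_id == client_id && table[i].file_id == file_id)
            return true;
    return false;
}

/* Record a (client, file) entry; returns false when the table is full. */
bool record_cached(long client_id, long file_id)
{
    if (has_cached(client_id, file_id))
        return true;
    if (num_entries == MAX_ENTRIES)
        return false;
    table[num_entries].client_id = client_id;
    table[num_entries].file_id = file_id;
    num_entries++;
    return true;
}

/* On a write to file_id, collect the clients whose copies must be
 * invalidated. Returns how many client ids were stored in out[]. */
int clients_to_invalidate(long file_id, long out[], int max_out)
{
    int n = 0;
    assert(out != 0 && max_out >= 0);
    for (int i = 0; i < num_entries && n < max_out; i++)
        if (table[i].file_id == file_id)
            out[n++] = table[i].client_id;
    return n;
}

/* A crash loses the table: this is the state that has to be recovered. */
void simulate_crash(void)
{
    num_entries = 0;
}
```

A stateless server has no such table, so it cannot promise to invalidate a client's cache, but it also has nothing to recover after a crash.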
18
Object Servers
• An object server acts as a place where objects
live
– Objects provide the services requested by clients
– An object server provides only the means to invoke local
objects
– Objects may be invoked in different ways such as
creating a thread for each object
– A thread per object automatically protects the object
from concurrent access
– We can also use one thread per invocation request; this
time, we need to protect object against concurrent
access
19
Object Adapters (1)
• Activation policy determines the activation
method of an object
– Activation: bringing an object into server’s address
space + invoking
• We can group objects per policy
• Object Adapter (wrapper) : a piece of software
implementing a specific activation policy
– An incoming request is dispatched to an appropriate
object adapter which resides within a server
– It can extract the object reference from an
invocation request, dispatch the request to the
referenced object (more specifically skeleton of an
object) with a specific activation policy
20
Activation Policy: Example
[Figure: Organization of an object server supporting different activation policies. Inside the server, object skeletons sit on top of object adapters, which sit on top of a request demultiplexer and the local OS.]
21
Object Adapter: Example
• An object adapter needs a generic way of
invoking methods
• Actual invocation is done by skeleton
• Example: An adapter implements the policy of
single thread per object
• To interact with skeletons, the adapter expects
that each skeleton implements the operation:

invoke (unsigned in_size, char in_args[],
        unsigned *out_size, char *out_args[])

22
Object Adapter: Header File
/* Definitions needed by caller of adapter and adapter */
#define TRUE 1
#define MAX_DATA 65536

/* Definition of general message format */
typedef struct message {
    long source;      /* sender's identity */
    long object_id;   /* identifier for the requested object */
    long method_id;   /* identifier for the requested method */
    unsigned size;    /* total bytes in list of parameters */
    char *data;       /* parameters as sequence of bytes */
} message;

/* General definition of operation to be called at skeleton of object */
typedef void (*METHOD_CALL)(unsigned, char*, unsigned*, char**);

/* Functions to be called by the server */
long register_object (METHOD_CALL call);   /* register an object */
void unregister_object (long object_id);   /* unregister an object */
void invoke_adapter (message *request);    /* call the adapter */

The header.h file.
23
Object Adapter: thread.h
typedef struct thread THREAD; /* hidden definition of a thread */

/* Create a thread by giving a pointer to a function that defines */
/* the actual behavior of the thread, along with a thread identifier */
THREAD *CREATE_THREAD (void (*body)(long tid), long tid);

/* Calling get_msg blocks the thread until a message has been put */
/* into its associated buffer. Putting a message in a thread's */
/* buffer is a nonblocking operation. */
void get_msg (unsigned *size, char **data);
void put_msg (THREAD *receiver, unsigned size, char **data);

The thread.h file used by the adapter for using threads.

24
Object Adapter: main.h (1)
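#include <stdlib.h> /* malloc, free, NULL */
#include <string.h> /* memcpy */
#include "header.h"
#include "thread.h"

#define MAX_OBJECTS 100
#define ANY -1

METHOD_CALL invoke[MAX_OBJECTS]; /* skeleton entry point per object */
THREAD *root; /* multiplexor thread */
THREAD *thread[MAX_OBJECTS]; /* one thread per object */

void thread_per_object(long object_id)
{
...
}
...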

25
Object Adapter: main.h (2)
...
void thread_per_object(long object_id) {
    message *req, *res;
    unsigned size;
    char *results;

    while(TRUE){
        get_msg(&size, (char **) &req); /* block for invocation request */

        /* pass the request to the skeleton, which allocates results */
        (invoke[object_id])(req->size, req->data, &size, &results);

        res = malloc(sizeof(message)+size); /* create response message */
        res->object_id = object_id;
        res->method_id = req->method_id;
        res->size = size;
        res->data = (char *) (res + 1); /* results stored right after the header */
        memcpy(res->data, results, size);
        put_msg(root, sizeof(res), (char **) &res); /* pass response to root thread */
        free(req); free(results);
    }
}
void invoke_adapter(message *request) {
    put_msg(thread[request->object_id], sizeof(request), (char **) &request);
}
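The demultiplexing step of the adapter can be tried out in isolation with a single-threaded sketch. It reuses the METHOD_CALL signature from the header file but leaves out the thread package. The `dispatch` function and the `reverse_skeleton` example object are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_OBJECTS 100

/* Same skeleton signature as METHOD_CALL in the header file. */
typedef void (*METHOD_CALL)(unsigned, char *, unsigned *, char **);

static METHOD_CALL invoke[MAX_OBJECTS];
static long next_id = 0;

/* Register a skeleton; returns its object id, or -1 if the table is full. */
long register_object(METHOD_CALL call)
{
    assert(call != NULL);
    if (next_id == MAX_OBJECTS)
        return -1;
    invoke[next_id] = call;
    return next_id++;
}

/* Demultiplex a request to the referenced object's skeleton. Returns 0 on
 * success, -1 for an unknown object id. The caller frees *out_args. */
int dispatch(long object_id, unsigned in_size, char *in_args,
             unsigned *out_size, char **out_args)
{
    if (object_id < 0 || object_id >= next_id)
        return -1;
    invoke[object_id](in_size, in_args, out_size, out_args);
    return 0;
}

/* Hypothetical skeleton: the "object" returns its input bytes reversed. */
void reverse_skeleton(unsigned in_size, char *in_args,
                      unsigned *out_size, char **out_args)
{
    char *buf = malloc(in_size ? in_size : 1);
    assert(buf != NULL);
    for (unsigned i = 0; i < in_size; i++)
        buf[i] = in_args[in_size - 1 - i];
    *out_size = in_size;
    *out_args = buf;
}
```

In the thread-per-object policy the call to `invoke[object_id]` happens inside the object's own thread, so requests for one object are serialized automatically.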
26
Code Migration
• Approaches to code migration
• Migration and local resources
• Migration in heterogeneous systems

27
Reasons for Migrating Code
• Performance:
– Load balancing: Moving processes from heavily-loaded
to lightly-loaded machines
– Decreasing the communication overhead: Process data
close to where the data reside
• Ship client code to server side or vice versa.
– Exploiting parallelism via the use of copies of small
mobile codes
• Flexibility:
– Dynamically downloading client side software, for
example, to implement certain proprietary protocols
used to communicate with servers.
28
Downloading Client-side Software
The principle of dynamically configuring a client to
communicate with a server.

29
Models for Code Migration
• Process components
– Code segment: contains the actual code
– Execution segment: the current execution state of a
process (stack, PC, etc.)
– Resource segment: contains references to external
resources (files, devices, other processes, etc)
• Weak mobility:
– Move only the code segment and some initialization data
– Simple if the code is portable (Java applets)
• Strong mobility:
– Move code + execution segments
– Migration: move the entire process
– Cloning: start a clone, set it in the same execution state
(forking a child process on a remote machine)
30
Types of Migration
• Sender-initiated migration
– Uploading programs to a compute server
– Sending search programs across Internet to a Web
database
– Security is always a problem
– Sender must be registered with the receiver in
advance and authenticate itself later.
• Receiver-initiated migration
– Downloading a Java applet
– Can be done anonymously
– Much easier to implement

31
Types of Execution
• Execute at target process
– no need for a separate process
– Java applets are executed in the browser’s address
space.
– Drawback: security problems
• Execute at a separate process
– a separate process is started
– naturally, more costly
– IPC is required for communication between the target
process and the newly created process.

32
Taxonomy of Migration
Mobility mechanism
• Weak mobility
– Sender-initiated mobility: execute at target process, or execute in separate process
– Receiver-initiated mobility: execute at target process, or execute in separate process
• Strong mobility
– Sender-initiated mobility: migrate process, or clone process
– Receiver-initiated mobility: migrate process, or clone process

Alternatives for code migration.


33
Migration & Local Resources
• Problem:
– Local resources that the migrating process uses may or
may not be available at the target side
• Resource Types (resource-to-machine binding):
– Fixed: the resource cannot be migrated, such as local
hardware (local device, port number)
– Fastened: the resource can, in principle, be migrated
but only at a high cost (local database, complete Web
sites)
– Unattached: the resource can easily be moved along
with the process (data files associated with a process)

34
Process-to-Resource Binding
• The relation between process and resource is
also important
• By identifier
– The process requires precisely the referenced
resource (e.g. a specific Web site by its URL,
reference to an endpoint)
• By value
– The process requires the value of a resource (e.g.
standard Java or C libraries)
• By type
– The process requires that only a type of a resource be
available (e.g. a color monitor, a printer)

35
Migration and Local Resources
                      Resource-to-machine binding
Process-to-resource
binding               Unattached       Fastened         Fixed
By identifier         MV (or GR)       GR (or MV)       GR
By value              CP (or MV, GR)   GR (or CP)       GR
By type               RB (or GR, CP)   RB (or GR, CP)   RB (or GR)

Actions to be taken with respect to the references to local
resources when migrating code to another machine:
MV: move the resource
GR: supply a global reference
CP: copy the value of the resource
RB: rebind to a locally available resource
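The table can be read as a two-key lookup. The sketch below encodes only the first-choice action of each cell; the alternatives in parentheses are left out. The enum and function names are invented for this example.

```c
#include <assert.h>

enum process_binding { BY_IDENTIFIER, BY_VALUE, BY_TYPE };
enum machine_binding { UNATTACHED, FASTENED, FIXED };
enum action { MV, GR, CP, RB };

/* First-choice action per table cell: rows are process-to-resource
 * bindings, columns are resource-to-machine bindings. */
static const enum action preferred[3][3] = {
    /*                    UNATTACHED FASTENED FIXED */
    /* BY_IDENTIFIER */ { MV,        GR,      GR },
    /* BY_VALUE      */ { CP,        GR,      GR },
    /* BY_TYPE       */ { RB,        RB,      RB },
};

/* Look up what to do with a resource reference when the process migrates. */
enum action preferred_action(enum process_binding p, enum machine_binding m)
{
    assert((int)p >= 0 && (int)p <= 2);
    assert((int)m >= 0 && (int)m <= 2);
    return preferred[p][m];
}
```

For example, a reference by value to an unattached resource yields CP, while anything bound by identifier to a fixed resource can only be kept through a global reference (GR).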
36
Migration in Heterogeneous Systems
• Weak mobility case:
– No runtime information is transferred
– Compile the source code for each potential
target machine.
– Abstract virtual machine solution
• Strong mobility case:
– Strong mobility in heterogeneous systems is hard to
implement
– The definition of process/thread context (execution
segment) is highly dependent on local hardware,
operating system and runtime systems
– Allow migration only at specific “transferable” points,
– For example, just before a subroutine call.
37
Migration Stack
• The runtime system maintains its own copy of the
program stack in a machine-independent way in
a migration stack.
• Every time a subroutine is called, the runtime
system marshals the data that has been pushed
onto the stack since the last call.
– local variables and parameters for the newly called
subroutine.
• The marshaled data is pushed into the migration
stack
– identifier for the called subroutine, return address (as
a label) etc are also pushed onto the migration stack
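One possible shape for such a migration stack, as a sketch under assumptions: frames hold only fixed-width integers (a stand-in for properly marshaled data), return addresses are replaced by symbolic labels, and the names `mig_frame`, `mig_stack`, `mig_push`, and `mig_pop` are made up here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_FRAMES 32
#define MAX_WORDS  8     /* hypothetical cap on marshaled words per frame */

/* One marshaled frame: fixed-width fields only, no raw machine addresses. */
struct mig_frame {
    uint32_t proc_id;            /* identifier of the called subroutine */
    uint32_t return_label;       /* symbolic label, not a return address */
    uint32_t num_words;
    int32_t  words[MAX_WORDS];   /* marshaled parameters and locals */
};

struct mig_stack {
    struct mig_frame frames[MAX_FRAMES];
    int depth;
};

/* Called on subroutine entry. Returns 0 on success, -1 on overflow. */
int mig_push(struct mig_stack *s, uint32_t proc_id, uint32_t return_label,
             const int32_t *words, uint32_t num_words)
{
    assert(s != NULL);
    if (s->depth == MAX_FRAMES || num_words > MAX_WORDS)
        return -1;
    struct mig_frame *f = &s->frames[s->depth++];
    f->proc_id = proc_id;
    f->return_label = return_label;
    f->num_words = num_words;
    if (num_words > 0)
        memcpy(f->words, words, num_words * sizeof words[0]);
    return 0;
}

/* Called on subroutine exit. Returns the label to resume at, or -1 if empty. */
int64_t mig_pop(struct mig_stack *s)
{
    assert(s != NULL);
    if (s->depth == 0)
        return -1;
    return s->frames[--s->depth].return_label;
}
```

Because a frame contains a procedure identifier and a label instead of code addresses, the target machine can rebuild its own native stack from it, whatever its architecture.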
38
Migration in Heterogeneous Systems (3)
[Figure: when procedure A calls procedure B, each push onto the program stack (parameter values, return address, local variables) is matched by marshaled entries on the migration stack (procedure identifier, parameter values, return label, local variables).]

39
When the Code Migrates
• The runtime system marshals a part of the execution
segment of the program
– leaving out the machine-specific data.
• Migration stack and the execution segment are
transferred to the destination
• Target machine loads the appropriate code
segment
– i.e. binaries for its machine architecture and OS
• A new runtime stack is constructed from the
migration stack
– Also execution segment
• The compiler has to generate the code to update
the migration stack whenever a subroutine is
entered or exited
40
Reading Assignment

• D’Agents (formerly known as Agent Tcl), pages 168-173
of the textbook
• Agent system that supports
1. sender-initiated weak mobility
2. sender-initiated strong mobility by process migration
3. sender-initiated strong mobility by process cloning

41
Software Agents
• Definition: An autonomous process (computational entity)
capable of performing a task, in collaboration with users
and other, possibly remote, agents
• An agent can act on its own
• Collaborative agent: collaborates with others in a
multiagent system
• Mobile agent: can move between machines
• Interface agent: assists users at user-interface level
– Learning capability of an agent
• Information agent: manages information from physically
different sources
– Ordering, filtering, collating, organizing and so on

42
Properties of Agents
Property       Description                                            Common to all agents?
Autonomous     Can act on its own                                     Yes
Reactive       Responds in a timely way to changes in its environment Yes
Proactive      Initiates actions that affect its environment          Yes
Communicative  Can exchange information with users and other agents   Yes
Continuous     Has a relatively long lifespan                         No
Mobile         Can migrate from one site to another                   No
Adaptive       Capable of learning                                    No

Some important properties by which different types of
agents can be distinguished.
43
Agent Technology
• Goal: Identify generally-used components of
agents in distributed systems, and incorporate
them into, for example, middleware.
• Foundation for Intelligent Physical Agents (FIPA)
is developing a general model for software agents
• Agent platform provides basic services
– Management: keeps track of where the agents on this
platform are (mapping agent ID to port), and creating,
deleting, locating agents.
– Directory: Mapping of agent attributes to agent IDs
(yellow pages)
– ACC: Agent Communication Channel, used to
communicate with other agent platforms
44
Agent Technology
The general model of an agent platform (adapted from
[fipa98-mgt]).

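[Figure: an agent (agent program plus the agent's endpoint) runs on an agent platform made up of a management component, a directory service, and an ACC. Agents on the same platform use intra-platform communication; the ACC handles inter-platform communication.]

45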
Communication
• For the interoperability between platforms,
communication between ACCs on different
platforms follows Internet Inter-ORB Protocol
(IIOP)
• Communication between agents takes place by
means of Agent Communication Language (ACL).
– ACL is an application-level communication protocol
– ACL messages consist of a header and the actual
content
– Header contains the purpose of the message
– purpose could be to request the receiver to provide
specific services, to respond to a previously sent
request, etc
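The header/content split of an ACL message can be pictured with a small struct. This is only an illustration of the idea, not the FIPA message format: the two purposes, the numeric agent ids, and the helper `acl_reply` are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical purposes, modeled on the two examples named on the slide. */
enum acl_purpose { ACL_REQUEST, ACL_RESPONSE };

struct acl_message {
    enum acl_purpose purpose;   /* header: why the message was sent */
    long sender;                /* header: hypothetical agent ids */
    long receiver;
    const char *content;        /* actual content, opaque to the platform */
};

/* Build the reply to a request: swap sender and receiver, set the purpose. */
struct acl_message acl_reply(const struct acl_message *request,
                             const char *content)
{
    assert(request != NULL && request->purpose == ACL_REQUEST);
    struct acl_message reply = {
        .purpose  = ACL_RESPONSE,
        .sender   = request->receiver,
        .receiver = request->sender,
        .content  = content,
    };
    return reply;
}
```

The receiving agent can decide how to treat a message from the header alone and only then interpret the content.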
46
