1

Rajkumar Buyya
School of Computer Science and Software Engineering
Monash Technology
Melbourne, Australia
Email: rajkumar@ieee.org
URL: http://www.dgs.monash.edu.au/~rajkumar
Concurrent Programming with Threads
2
Objectives
+ Explain parallel computing, from architecture and OS to programming paradigms and applications
+ Explain the multithreading paradigm and all aspects of how to use it in an application
+ Cover all basic MT concepts
+ Explore issues related to MT
+ Contrast Solaris, POSIX, and Java threads
+ Look at the APIs in detail
+ Examine some Solaris, POSIX, and Java code examples
+ Debate on: MPP and Cluster Computing
3
Agenda
+ Overview of Computing
+ Operating Systems Issues
+ Threads Basics
+ Multithreading with Solaris and POSIX threads
+ Multithreading in Java
+ Distributed Computing
+ Grand Challenges
+ Solaris, POSIX, and Java example code
4
Computing Elements
[Figure: layered view of a multi-processor computing system: applications and programming paradigms sit on a threads interface, which sits on a microkernel (operating system) over the hardware of processes, processors, and threads.]
5
Two Eras of Computing
[Figure: timeline from 1940 to 2030 showing the Sequential Era and the Parallel Era, each progressing through Architectures, Compilers, Applications, and P.S.Es (problem solving environments), and each moving from R & D through commercialization to commodity.]
6
History of Parallel Processing
+ PP can be traced to a tablet dated around 100 BC.
  - The tablet has 3 calculating positions.
  - Infer that multiple positions were used for reliability and/or speed.
7
Motivating Factors
+ Just as we learned to fly not by constructing a machine that flaps its wings like birds, but by applying the aerodynamic principles demonstrated by nature, we modeled PP after the workings of biological species.
8
Motivating Factors
+ The aggregate speed with which complex calculations are carried out by neurons, even though an individual neuron's response is slow (milliseconds), demonstrates the feasibility of PP.
9
Why Parallel Processing?
+ Computation requirements are ever increasing: visualization, distributed databases, simulations, scientific prediction (earthquakes), etc.
+ Sequential architectures are reaching physical limitations (speed of light, thermodynamics).
10
Technical Computing
Solving technology problems using computer modeling, simulation and analysis
+ Life Sciences
+ Mechanical Design & Analysis (CAD/CAM)
+ Aerospace
+ Geographic Information Systems
11
Computational Power Improvement
[Figure: C.P.I. versus number of processors (1, 2, ...): a multiprocessor keeps improving with processor count while a uniprocessor stays flat.]
12
Computational Power Improvement
[Figure: growth versus age (5, 10, 15, ...), contrasting vertical growth with horizontal growth.]
13
Why Parallel Processing?
+ The technology of PP is mature and can be exploited commercially; there is significant R & D work on the development of tools and environments.
+ Significant developments in networking technology are paving the way for heterogeneous computing.
14
Why Parallel Processing?
+ Hardware improvements such as pipelining and superscalar execution are not scalable and require sophisticated compiler technology.
+ Vector processing works well only for certain kinds of problems.
15
Parallel Program has & needs ...
+ Multiple "processes" active simultaneously solving a given problem, generally on multiple processors.
+ Communication and synchronization of its processes (forms the core of parallel programming efforts).
16
Processing Elements
Architecture
17
Processing Elements
+ Simple classification by Flynn (number of instruction and data streams):
  - SISD - conventional
  - SIMD - data parallel, vector computing
  - MISD - systolic arrays
  - MIMD - very general, multiple approaches
+ Current focus is on the MIMD model, using general-purpose processors (no shared memory).
18
SISD : A Conventional Computer
[Figure: a stream of instructions and a data input feed a single processor, which produces the data output.]
+ Speed is limited by the rate at which the computer can transfer information internally.
Ex: PC, Macintosh, Workstations
19
The MISD Architecture
[Figure: a single data input stream feeds processors A, B, and C, each driven by its own instruction stream (A, B, C), producing one data output stream.]
+ More of an intellectual exercise than a practical configuration. Few have been built, and none are commercially available.
20
SIMD Architecture
[Figure: one instruction stream drives processors A, B, and C, each with its own data input stream (A, B, C) and data output stream (A, B, C).]
Ci <= Ai * Bi
Ex: CRAY vector-processing machines, Thinking Machines CM*
21
MIMD Architecture
[Figure: processors A, B, and C each receive their own instruction stream and data input stream and produce their own data output stream.]
+ Unlike SISD and MISD, an MIMD computer works asynchronously.
+ Shared memory (tightly coupled) MIMD
+ Distributed memory (loosely coupled) MIMD
22
Shared Memory MIMD machine
[Figure: processors A, B, and C connected through memory buses to a global memory system.]
+ Comm: the source PE writes data to global memory and the destination PE retrieves it.
+ Easy to build; conventional OSes for SISD machines can easily be ported.
+ Limitation: reliability and expandability. A memory component or any processor failure affects the whole system.
+ Increasing the number of processors leads to memory contention.
Ex.: Silicon Graphics supercomputers ...
23
Distributed Memory MIMD
[Figure: processors A, B, and C, each with its own memory system (A, B, C), connected through IPC channels.]
+ Communication: IPC over a high-speed network.
+ The network can be configured as a tree, mesh, cube, etc.
+ Unlike shared-memory MIMD:
  - easily/readily expandable
  - highly reliable (a CPU failure does not affect the whole system)
24
Laws of caution.....
+ The speed of a computer is proportional to the square of its cost:
    speed = cost^2, i.e. cost = sqrt(speed)
+ The speedup of a parallel computer increases as the logarithm of the number of processors:
    Speedup = log2(no. of processors), i.e. S = log2(P)
25
Caution....
+ Very fast developments in PP and related areas have blurred concept boundaries, causing a lot of terminological confusion: concurrent computing/programming, parallel computing/processing, multiprocessing, distributed computing, etc.
26
It’s hard to imagine a field
that changes as rapidly as
computing.
27
Computer Science is an Immature Science.
(lack of standard taxonomy, terminologies)
Caution....
28
Caution....
+ There are no strict delimiters for contributors to the area of parallel processing: computer architecture, OS, HLLs, databases, computer networks, all have a role to play.
+ This makes it a Hot Topic of Research.
29
Parallel Programming
Paradigms
Multithreading
Task level parallelism
30
Serial Vs. Parallel
[Figure: a queue of customers served by a single COUNTER versus the same queue split across COUNTER 1 and COUNTER 2.]
31
High Performance Computing
function1( )
{
  //......function stuff
}
function2( )
{
  //......function stuff
}

Serial Machine
+ Single CPU
  function1( );
  function2( );
  Time: add(t1, t2)

Parallel Machine: MPP
+ massively parallel system containing thousands of CPUs
  function1( ) || function2( )
  Time: max(t1, t2)
32
Single and Multithreaded Processes
[Figure: a single-threaded process with one instruction stream versus a multithreaded process whose multiple threads of execution (multiple instruction streams) share a common address space.]
33
OS: Multi-Processing, Multi-Threaded
[Figure: multiple applications running over threaded libraries and multithreaded I/O on several CPUs.]
+ Better response times in multiple-application environments
+ Higher throughput for parallelizable applications
34
Multi-threading, continued...
Multi-threaded OS enables parallel, scalable I/O
[Figure: several applications issue I/O requests through the OS kernel to multiple CPUs.]
Multiple, independent I/O requests can be satisfied simultaneously because all the major disk, tape, and network drivers have been multi-threaded, allowing any given driver to run on multiple CPUs simultaneously.
35
Basic Process Model
[Figure: two processes, each with its own TEXT, DATA, and STACK segments, communicating through shared memory maintained by the kernel: shared memory segments, pipes, open files, or mmap'd files.]
36
What are Threads?
+ A thread is a piece of code that can execute concurrently with other threads.
+ It is a schedulable entity on a processor, with:
  - Local state
  - Global/shared state
  - PC
  - Hard/software context
[Figure: a running thread object holding its hardware context: registers, status word, and program counter.]
37
Threaded Process Model
[Figure: threads within a process, each with its own THREAD STACK but sharing THREAD DATA, THREAD TEXT, and SHARED MEMORY.]
+ Independent executables
+ All threads are part of one process, hence communication is easier and simpler.
38
Levels of Parallelism

Code granularity                    Code item
Large grain (task level)            Program
Medium grain (control level)        Function (thread)
Fine grain (data level)             Loop
Very fine grain (multiple issue)    With hardware

[Figure: task-level parallelism across Task i-1, Task i, Task i+1; control-level parallelism across func1( ), func2( ), func3( ); data-level parallelism across loop iterations a(0)=.., a(1)=.., a(2)=..; and multiple-issue parallelism across individual operations (+, x, load).]
39
Simple Thread Example
void *func ( )
{
    /* define local data */
    - - - - - - - - - - -
    - - - - - - - - - - -   /* function code */
    - - - - - - - - - - -
    thr_exit(exit_value);
}

main ( )
{
    thread_t tid;
    int exit_value;
    - - - - - - - - - - -
    thread_create (0, 0, func, NULL, &tid);
    - - - - - - - - - - -
    thread_join (tid, 0, &exit_value);
    - - - - - - - - - - -
}
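For comparison, a minimal runnable POSIX-threads version of the same pattern (a sketch, not from the original slides; the exit value is returned through pthread_exit/pthread_join rather than a separate out-parameter):

#include <pthread.h>
#include <stdio.h>

/* Thread body: do some work and hand back an exit value. */
static void *func(void *arg)
{
    long result = 42;                 /* stand-in for real work */
    pthread_exit((void *)result);     /* same effect as returning from func */
}

int main(void)
{
    pthread_t tid;
    void *exit_value;

    pthread_create(&tid, NULL, func, NULL);  /* start the thread */
    pthread_join(tid, &exit_value);          /* wait for it and collect its value */
    printf("thread returned %ld\n", (long)exit_value);
    return 0;
}

Compile with cc -o simple simple.c -lpthread (flags vary by platform).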
40
Few Popular Thread Models
+ POSIX, ISO/IEEE standard
+ Mach C threads, CMU
+ SunOS LWP threads, Sun Microsystems
+ PARAS CORE threads, C-DAC
+ Java threads, Sun Microsystems
+ Chorus threads, Paris
+ OS/2 threads, IBM
+ Windows NT/95 threads, Microsoft
41
Multithreading - Uniprocessors
+ Concurrency Vs Parallelism
+ Concurrency: the number of simultaneous execution units is greater than the number of CPUs.
[Figure: processes P1, P2, and P3 time-sliced over time on a single CPU.]
42
Multithreading - Multiprocessors
+ Concurrency Vs Parallelism
+ Parallelism: the number of executing processes equals the number of CPUs.
[Figure: processes P1, P2, and P3 each running on its own CPU over time.]
43
Computational Model
+ Parallel execution due to:
  - Concurrency of threads on virtual processors
  - Concurrency of threads on physical processors
+ True parallelism: threads : processor map = 1:1
[Figure: user-level threads are mapped by the user-level scheduler onto virtual processors, which the kernel-level scheduler maps onto physical processors.]
44
General Architecture of Thread Model
+ Hides the details of the machine architecture.
+ Maps user threads to kernel threads.
+ Process VM is shared; a state change in VM made by one thread is visible to the others.
45
Process Parallelism
int add (int a, int b, int & result)
// function stuff
int sub(int a, int b, int & result)
// function stuff

pthread_t t1, t2;
pthread_create(&t1, add, a, b, &r1);
pthread_create(&t2, sub, c, d, &r2);
pthread_par(2, t1, t2);

MISD and MIMD Processing
[Figure: two instruction streams (add, sub) are applied by two processors to their own data sets (a, b, r1 and c, d, r2).]
46
Data Parallelism
sort( int *array, int count)
//......
//......

pthread_t thread1, thread2;
pthread_create(&thread1, sort, array, N/2);
pthread_create(&thread2, sort, &array[N/2], N/2);
pthread_par(2, thread1, thread2);

SIMD Processing
[Figure: one instruction stream (sort) is applied by two processors to the two halves of the data, d0 ... dn/2 and dn/2+1 ... dn.]
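A minimal runnable POSIX-threads sketch of the same idea (illustrative names such as slice and sort_slice are not from the slides; each thread sorts one half with qsort, and a final merge step, omitted here, would be needed for a fully sorted array):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define N 8

struct slice { int *base; size_t count; };   /* describes one half of the array */

static int cmp(const void *a, const void *b)
{
    return (*(const int *)a > *(const int *)b) - (*(const int *)a < *(const int *)b);
}

/* Thread body: sort one slice of the array. */
static void *sort_slice(void *arg)
{
    struct slice *s = arg;
    qsort(s->base, s->count, sizeof(int), cmp);
    return NULL;
}

int main(void)
{
    int array[N] = { 7, 3, 5, 1, 8, 2, 6, 4 };
    struct slice halves[2] = { { array, N / 2 }, { array + N / 2, N - N / 2 } };
    pthread_t t[2];

    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, sort_slice, &halves[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);     /* stands in for the slide's pthread_par() */

    for (int i = 0; i < N; i++)       /* each half is now sorted */
        printf("%d ", array[i]);
    printf("\n");
    return 0;
}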
47
Process and Threaded Models

Purpose                            Threads Model                          Process Model
Creation of a new thread           thr_create( )                          fork ( )
Start execution of a new thread    (thr_create() builds the new thread    exec( )
                                   and starts the execution)
Wait for completion of thread      thr_join()                             wait( )
Exit and destroy the thread        thr_exit()                             exit( )
48
Code Comparison
Segment (Process)
main ( )
{
  fork ( );
  fork ( );
  fork ( );
}

Segment (Thread)
main()
{
  thread_create(0,0,func,0,0);
  thread_create(0,0,func,0,0);
  thread_create(0,0,func,0,0);
}
49
[Figure: a process with two independent threads, a Printing Thread and an Editing Thread.]
50
Independent Threads
printing()
{
  - - - - - - - - - - - -
}
editing()
{
  - - - - - - - - - - - -
}
main()
{
  - - - - - - - - - - - -
  id1 = thread_create(printing);
  id2 = thread_create(editing);
  thread_run(id1, id2);
  - - - - - - - - - - - -
}
51
Cooperative threads - File Copy
reader()                          writer()
{                                 {
  - - - - - - - - - -               - - - - - - - - - -
  lock(buff[i]);                    lock(buff[i]);
  read(src, buff[i]);               write(dst, buff[i]);
  unlock(buff[i]);                  unlock(buff[i]);
  - - - - - - - - - -               - - - - - - - - - -
}                                 }
[Figure: the reader and writer threads share buffers buff[0] and buff[1].]
Cooperative Parallel Synchronized Threads
52
RPC Call
[Figure: a client issues RPC(func) across the network; the server executes func() { /* Body */ } and returns the result.]
53
Multithreaded Server
[Figure: client processes send requests through a message-passing facility to a server process, which dispatches each request to one of its server threads; clients and server run in user mode over the kernel.]
54
Multithreaded Compiler
[Figure: a preprocessor thread feeds a compiler thread, turning source code into object code.]
55
Thread Programming
models
1. The boss/worker model
2. The peer model
3. A thread pipeline
56
The boss/worker model
[Figure: main() acts as the boss, reading an input stream and dispatching work to worker threads taskX, taskY, and taskZ, which use program resources such as files, databases, disks, and special devices.]
57
Example
main() /* the boss */
{
  forever {
    get a request;
    switch( request )
      case X: pthread_create(....,taskX);
      case Y: pthread_create(....,taskY);
      ....
  }
}
taskX() /* worker */
{
  perform the task, sync if accessing shared resources
}
taskY() /* worker */
{
  perform the task, sync if accessing shared resources
}
....
-- The runtime overhead of creating threads can be avoided with a thread pool:
   the boss thread creates all worker threads at program initialization, and each
   worker thread immediately suspends itself to wait for a wakeup call from the boss.
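A minimal runnable sketch of the boss/worker idea (an assumption-laden toy: a "request" is just an integer handed to each worker; the request queue, shared-resource synchronization, and the thread-pool variant are omitted):

#include <pthread.h>
#include <stdio.h>

/* Worker: handle one request, then exit. */
static void *taskX(void *arg)
{
    int request = *(int *)arg;
    printf("worker handling request %d\n", request);
    return NULL;
}

int main(void)                       /* the boss */
{
    pthread_t workers[3];
    int requests[3] = { 1, 2, 3 };

    for (int i = 0; i < 3; i++)      /* "get a request" and dispatch it */
        pthread_create(&workers[i], NULL, taskX, &requests[i]);

    for (int i = 0; i < 3; i++)      /* a real boss would loop forever instead */
        pthread_join(workers[i], NULL);
    return 0;
}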
58
The peer model
[Figure: peer threads taskX, taskY, and taskZ each take their own (static) input and work directly on the program's resources: files, databases, disks, and special devices.]
59
Example
main()
{
  pthread_create(....,thread1...task1);
  pthread_create(....,thread2...task2);
  ....
  signal all workers to start
  wait for all workers to finish
  do any cleanup
}
task1() /* worker */
{
  wait for start
  perform the task, sync if accessing shared resources
}
task2() /* worker */
{
  wait for start
  perform the task, sync if accessing shared resources
}
60
A thread pipeline
[Figure: an input stream flows through filter threads Stage 1, Stage 2, and Stage 3; each stage has access to resources such as files, databases, disks, and special devices.]
61
Example
main()
{
pthread_create(....,stage1);
pthread_create(....,stage2);
....
wait for all pipeline threads to finish
do any cleanup
}
stage1() {
get next input for the program
do stage 1 processing of the input
pass result to next thread in pipeline
}
stage2(){
get input from previous thread in pipeline
do stage 2 processing of the input
pass result to next thread in pipeline
}
stageN()
{
get input from previous thread in pipeline
do stage N processing of the input
pass result to program output.
}
62
Multithreaded Matrix Multiply...
[Figure: C = A x B]
C[1,1] = A[1,1]*B[1,1] + A[1,2]*B[2,1] + ...
....
C[m,n] = sum of the products of corresponding elements in row m of A and column n of B.
Each resultant element can be computed independently.
63
Multithreaded Matrix Multiply
typedef struct {
    int id; int size;
    int row, column;
    matrix *MA, *MB, *MC;
} matrix_work_order_t;

main()
{
    int size = ARRAY_SIZE, row, column;
    matrix_t MA, MB, MC;
    matrix_work_order_t *work_orderp;
    pthread_t peer[size*size];
    ...
    /* process matrix, by row, column */
    for( row = 0; row < size; row++ )
        for( column = 0; column < size; column++ )
        {
            id = column + row * ARRAY_SIZE;
            work_orderp = malloc( sizeof(matrix_work_order_t) );
            /* initialize all members of work_orderp */
            pthread_create( &peer[id], NULL, peer_mult, work_orderp );
        }
    /* wait for all peers to exit */
    for( i = 0; i < size*size; i++ )
        pthread_join( peer[i], NULL );
}
64
Multithreaded Server...
void main( int argc, char *argv[] )
{
int server_socket, client_socket, clilen;
struct sockaddr_in serv_addr, cli_addr;
int one, port_id;
#ifdef _POSIX_THREADS
pthread_t service_thr;
#endif
port_id = 4000; /* default port_id */
if( (server_socket = socket( AF_INET, SOCK_STREAM, 0 )) < 0 )
{
printf("Error: Unable to open socket in parmon server.\n");
exit( 1 );
}
memset( (char*) &serv_addr, 0, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
serv_addr.sin_port = htons( port_id );
setsockopt(server_socket, SOL_SOCKET, SO_REUSEADDR, (char *)&one, sizeof(one));
65
Multithreaded Server...
if( bind( server_socket, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0 )
{
printf( "Error: Unable to bind socket in parmon server->%d\n",errno );
exit( 1 );
}
listen( server_socket, 5);
while( 1 )
{
clilen = sizeof(cli_addr);
client_socket = accept( server_socket, (struct sockaddr *)&cli_addr, &clilen );
if( client_socket < 0 )
{ printf( "connection to client failed in server.\n" ); continue;
}
#ifdef POSIX_THREADS
pthread_create( &service_thr, NULL, service_dispatch, client_socket);
#else
thr_create( NULL, 0, service_dispatch, client_socket, THR_DETACHED, &service_thr);
#endif
}
}
66
Multithreaded Server
// Service function -- Thread Function
void *service_dispatch(int client_socket)
{
  ... Get USER Request
  if( readline( client_socket, command, 100 ) > 0 )
  {
    ... IDENTIFY USER REQUEST
    ... Do NECESSARY Processing
    ... Send Results to Client
  }
  ... CLOSE Connection and Terminate THREAD
close( client_socket );
#ifdef POSIX_THREADS
pthread_exit( (void *)0);
#endif
}
67
The Value of MT
+ Program structure
+ Parallelism
+ Throughput
+ Responsiveness
+ System resource usage
+ Distributed objects
+ Single source across platforms (POSIX)
+ Single binary for any number of CPUs
68
To thread or not to thread
+ Improve efficiency on uniprocessor systems
+ Use multiprocessor hardware
+ Improve throughput
+ Simple to implement asynchronous I/O
+ Leverage special features of the OS
69
To thread or not to thread
+ If all operations are CPU intensive, multithreading will not get you very far.
+ Thread creation is very cheap, but it is not free: a thread that runs only five lines of code is not worth creating.
70
DOS - The Minimal OS
[Figure: user space holds the user code, global data, stack and stack pointer, and program counter; kernel space holds the DOS code and DOS data, directly over the hardware.]
71
Multitasking OSs
[Figure: the UNIX process structure: a process in user space running over kernel space and the hardware (UNIX, VMS, MVS, NT, OS/2, etc.).]
72
Multitasking Systems
[Figure: processes P1-P4 running over the kernel and hardware; each process is completely independent.]
73
Multithreaded Process
[Figure: a process whose user code and global data are shared by threads T1, T2, and T3, each with its own stack pointer and program counter; kernel state and address space are shared.]
74
Kernel Structures
Traditional UNIX process structure: process ID; UID, GID, EUID, EGID, CWD; file descriptors; signal dispatch table; memory map; priority; signal mask; registers; kernel stack; CPU state.
Solaris 2 process structure: process ID; UID, GID, EUID, EGID, CWD; file descriptors; signal dispatch table; memory map; plus one or more LWPs (LWP 1, LWP 2), each holding its own priority, signal mask, registers, kernel stack, and CPU state.
75
Scheduling Design Options
M:1      HP-UNIX
1:1      DEC, NT, OS/2, AIX, IRIX
M:M
2-level
76
SunOS Two-Level Thread Model
[Figure: user threads in Proc 1 through Proc 5 are multiplexed onto LWPs, which map to kernel threads scheduled by the kernel onto the hardware processors; Proc 1 is a traditional single-threaded process.]
77
Thread Life Cycle
POSIX:                               Solaris:
main()                               main()
{ ...                                { ...
    pthread_create( func, arg);          thr_create( ..func.., arg.. );
    ...                                  ...
}                                    }

void * func()
{
    ....
}

[Figure: T1 calls pthread_create(...func...) to start T2; T2 ends with pthread_exit().]
78
Waiting for a Thread to Exit
POSIX:                               Solaris:
main()                               main()
{ ...                                { ...
    pthread_join(T2);                    thr_join( T2, &val_ptr );
    ...                                  ...
}                                    }

void * func()
{
    ....
}

[Figure: T1 blocks in pthread_join() until T2 calls pthread_exit().]
79
Scheduling States: Simplified View of Thread State Transitions
[Figure: a thread moves between RUNNABLE and ACTIVE (preempt), from ACTIVE to SLEEPING (sleep) and back to RUNNABLE (wakeup), and from any of these states to STOPPED (stop) and back to RUNNABLE (continue).]
80
Preemption
The process of rudely interrupting a thread and
forcing it to relinquish its LWP (or CPU) to
another.
CPU2 cannot change CPU3’s registers directly. It
can only issue a hardware interrupt to CPU3. It is
up to CPU3’s interrupt handler to look at CPU2’s
request and decide what to do.
Higher priority threads always preempt lower
priority threads.
Preemption != Time slicing
All of the libraries are preemptive
81
EXIT Vs. THREAD_EXIT
The normal C function exit() always causes the
process to exit. That means all of the process -- All
the threads.
The thread exit functions:
UI : thr_exit()
POSIX : pthread_exit()
OS/2 : DosExitThread() and _endthread()
NT : ExitThread() and endthread()
all cause only the calling thread to exit, leaving the
process intact and all of the other threads running.
(If no other threads are running, then exit() will be
called.)
82
Cancellation
Cancellation is the means by which a thread can tell another thread that it should exit.
POSIX:                      OS/2:                      Windows NT:
main()                      main()                     main()
{...                        {...                       {...
  pthread_cancel(T1);         DosKillThread(T1);         TerminateThread(T1);
}                           }                          }
There is no special relation between the killer of a thread and the victim. (UI threads must "roll their own" using signals.)
[Figure: T2 issues pthread_cancel() against T1, which then exits as if it had called pthread_exit.]
83
Cancellation State and Type
+ State
  - PTHREAD_CANCEL_DISABLE (cannot be cancelled)
  - PTHREAD_CANCEL_ENABLE (can be cancelled; must consider type)
+ Type
  - PTHREAD_CANCEL_ASYNCHRONOUS (at any time whatsoever; not generally used)
  - PTHREAD_CANCEL_DEFERRED (only at cancellation points)
(Only POSIX has state and type)
(OS/2 is effectively always "enabled asynchronous")
(NT is effectively always "enabled asynchronous")
84
Cancellation is Always
Complex!
+
It is very easy to forget a lock that’s being
held or a resource that should be freed.
+
Use this only when you absolutely require
it.
+
Be extremely meticulous in analyzing the
possible thread states.
+
Document, document, document!
85
Returning Status
+
POSIX and UI
+
A detached thread cannot be “joined”. It cannot
return status.
+
An undetached thread must be “joined”, and can
return a status.
+
OS/2
+
Any thread can be waited for
+
No thread can return status
+
No thread needs to be waited for.
+
NT
+
No threads can be waited for
+
Any thread can return status
86
Suspending a Thread
main()
{
...
thr_suspend(T1);
...
thr_continue(T1);
...
}
continue()
T2
T1
suspend()
Solaris:
* POSIX does not support thread suspension
87
Proposed Uses of Suspend/Continue
+
Garbage Collectors
+
Debuggers
+
Performance Analysers
+
Other Tools?
These all must go below the API, so they don’t
count.
+
Isolation of VM system “spooling” (?!)
+
NT Services specify that a service should be suspendable (questionable requirement?)
Be Careful
88
Do NOT Think about
Scheduling!
+
Think about Resource Availability
+
Think about Synchronization
+
Think about Priorities
Ideally, if you’re using suspend/ continue,
you’re making a mistake!
89
Synchronization
+
Webster's: "To represent or arrange events to indicate coincidence or coexistence."
+
Lewis : “To arrange events so that they
occur in a specified order.”
* Serialized access to controlled resources.
Synchronization is not just an MP issue. It
is not even strictly an MT issue!
90
Threads Synchronization:
+ On shared memory: shared variables - semaphores
+ On distributed memory:
  - within a task: semaphores
  - across tasks: by passing messages
91
Unsynchronized Shared Data is a Formula for Disaster
Thread1                                    Thread2
temp = Your->BankBalance;
dividend = temp * InterestRate;
newbalance = dividend + temp;
                                           Your->BankBalance += deposit;
Your->Dividend += dividend;
Your->BankBalance = newbalance;            /* Thread2's deposit is silently lost */
92
Atomic Actions
+
An action which must be started and completed
with no possibility of interruption.
+
A machine instruction could need to be atomic.
(not all are!)
+
A line of C code could need to be atomic. (not
all are)
+
An entire database transaction could need to
be atomic.
+
All MP machines provide at least one complex
atomic instruction, from which you can build
anything.
+
A section of code which you have forced to be
atomic is a Critical Section.
93
Critical Section
(Good Programmer!)
reader()                      writer()
{                             {
  - - - - - - - - - -           - - - - - - - - - -
  lock(DISK);                   lock(DISK);
  ...........                   ..............
  ...........                   ..............
  ...........                   unlock(DISK);
  unlock(DISK);                 - - - - - - - - - -
  - - - - - - - - - -         }
}
[Figure: threads T1 and T2 both take the DISK lock before touching the shared data.]
94
Critical Section
(Bad Programmer!)
reader()                      writer()
{                             {
  - - - - - - - - - -           - - - - - - - - - -
  lock(DISK);                   ..............
  ...........                   ..............
  ...........                   - - - - - - - - - -
  ...........                 }
  unlock(DISK);
  - - - - - - - - - -
}
[Figure: T1 locks DISK, but T2 touches the shared data without taking the lock.]
95
Lock Shared Data!
+
Globals
+
Shared data structures
+
Static variables
(really just lexically scoped global
variables)
96
Mutexes
Thread 1                                Thread 2
item = create_and_fill_item();          mutex_lock( &m );
mutex_lock( &m );                       this_item = list;
item->next = list;                      list = list->next;
list = item;                            mutex_unlock( &m );
mutex_unlock( &m );                     ..... func(this_item);

+ POSIX and UI: owner not recorded, block in priority order.
+ OS/2 and NT: owner recorded, block in FIFO order.
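A minimal runnable sketch of the same pattern with POSIX mutexes (a toy linked list; names such as node, push, and pop are illustrative, not from the slides):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct node { int value; struct node *next; };

static struct node *list = NULL;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

/* Thread 1's side: push a freshly created item onto the shared list. */
static void *push(void *arg)
{
    struct node *item = malloc(sizeof *item);
    item->value = *(int *)arg;
    pthread_mutex_lock(&m);            /* protect the list head */
    item->next = list;
    list = item;
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Thread 2's side: pop one item, then work on it outside the lock. */
static void *pop(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    struct node *this_item = list;
    if (this_item != NULL)
        list = list->next;
    pthread_mutex_unlock(&m);
    if (this_item != NULL) {
        printf("popped %d\n", this_item->value);
        free(this_item);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    int v = 7;
    pthread_create(&t1, NULL, push, &v);
    pthread_create(&t2, NULL, pop, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}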
97
Synchronization Variables in Shared Memory (Cross Process)
[Figure: threads in Process 1 and Process 2 coordinate through a synchronization variable S placed in shared memory.]
98
Synchronization
Problems
99
Deadlocks
Thread 1            Thread 2
lock( M1 );         lock( M2 );
lock( M2 );         lock( M1 );
Thread 1 is waiting for the resource (M2) locked by Thread 2, and Thread 2 is waiting for the resource (M1) locked by Thread 1.
100
Avoiding Deadlocks
+ Establish a hierarchy: always lock Mutex_1 before Mutex_2, etc.
+ Use the trylock primitives if you must violate the hierarchy.
{
    while (1)
    {   pthread_mutex_lock (&m2);
        if( EBUSY != pthread_mutex_trylock (&m1))
            break;
        else
        {   pthread_mutex_unlock (&m2);
            wait_around_or_do_something_else();
        }
    }
    do_real_work();   /* Got 'em both! */
}
+ Use LockLint or some similar static analysis program to scan your code for hierarchy violations.
101
Race Conditions
A race condition is where the results of a
program are different depending upon the
timing of the events within the program.
Some race conditions result in different
answers and are clearly bugs.
Thread 1 Thread 2
mutex_lock (&m) mutex_lock (&m)
v = v - 1; v = v * 2;
mutex_unlock (&m) mutex_unlock (&m)
--> if v = 1, the result can be 0 or 1, depending on which thread gets the chance to enter the critical region first.
102
Operating System Issues
103
Library Goals
+
Make it fast!
+
Make it MT safe!
+
Retain UNIX semantics!
104
Are Libraries Safe ?
getc() OLD implementation:
extern int getc( FILE * p )
{
  /* code to read data */
}
getc() NEW implementation:
extern int getc( FILE * p )
{
  pthread_mutex_lock(&m);
  /* code to read data */
  pthread_mutex_unlock(&m);
}
105
ERRNO
In UNIX, the distinguished variable errno is used to hold the error code for any
system calls that fail.
Clearly, should two threads both be issuing system calls around the same time, it
would not be possible to figure out which one set the value for errno.
Therefore errno is defined in the header file to be a call to thread-specific data.
This is done only when the flag _REENTRANT (UI) or _POSIX_C_SOURCE=199506L (POSIX) is passed to the compiler, allowing older, non-MT programs to continue to run.
There is the potential for problems if you use some libraries which are not reentrant.
(This is often a problem when using third party libraries.)
106
Are Libraries Safe?
+
MT-Safe This function is safe
+
MT-Hot This function is safe and fast
+
MT-Unsafe This function is not MT-safe, but
was compiled with _REENTRANT
+
Alternative Call This function is not safe, but there is a similar function that is (e.g. ctime_r())
+
MT-Illegal This function wasn’t even compiled
with _REENTRANT and therefore can only be
called from the main thread.
107
Threads Debugging
Interface
+
Debuggers
+
Data inspectors
+
Performance monitors
+
Garbage collectors
+
Coverage analyzers
Not a standard interface!
108
The APIs
109
Different Thread Specifications
Functionality                    UI Threads        POSIX Threads     NT Threads        OS/2 Threads
Design Philosophy                Base primitives   Near-base prims.  Complex prims.    Complex prims.
Scheduling Classes               Local/Global      Local/Global      Global            Global
Mutexes                          Simple            Simple            Complex           Complex
Counting Semaphores              Simple            Simple            Buildable         Buildable
R/W Locks                        Simple            Buildable         Buildable         Buildable
Condition Variables              Simple            Simple            Buildable         Buildable
Multiple-Object Synchronization  Buildable         Buildable         Complex           Complex
Thread Suspension                Yes               Impossible        Yes               Yes
Cancellation                     Buildable         Yes               Yes               Yes
Thread-Specific Data             Yes               Yes               Yes               Yes
Signal-Handling Primitives       Yes               Yes               n/a               n/a
Compiler Changes Required        No                No                Yes               No
Vendor Libraries MT-safe?        Most              Most              All?              All?
ISV Libraries MT-safe?           Some              Some              Some              Some
110
POSIX and Solaris API Differences
Common to both APIs: create, join, exit, key creation, thread-specific data, priorities, sigmask, kill, mutex variables, condition variables.
POSIX API only: thread cancellation, scheduling policies, synchronization attributes, thread attributes.
Solaris API only: suspend, continue, semaphore variables, reader/writer variables, concurrency setting, daemon threads.
111
Error Return Values
+
Many threads functions return an error
value which can be looked up in errno.h.
+
Very few threads functions set errno(check
man pages).
+
The “lack of resources” errors usually
mean that you’ve used up all your virtual
memory, and your program is likely to
crash very soon.
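For example, a common idiom is to treat the return value itself as the error code (a sketch; strerror() simply maps that code to text):

#include <pthread.h>
#include <stdio.h>
#include <string.h>

static void *worker(void *arg) { (void)arg; return NULL; }

int main(void)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, worker, NULL);
    if (err != 0) {                       /* the error code is the return value, not errno */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return 1;
    }
    pthread_join(tid, NULL);
    return 0;
}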
112
Attribute Objects
UI, OS/2, and NT all use flags and direct arguments to
indicate what the special details of the objects being created
should be. POSIX requires the use of “Attribute objects”:
thr_create(NULL, NULL, foo, NULL, THR_DETACHED);
Vs:
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_create(NULL, &attr, foo, NULL);
113
Attribute Objects
Although a bit of pain in the *** compared to passing all the
arguments directly, attribute objects allow the designers of
the threads library more latitude to add functionality without
changing the old interfaces. (If they decide they really want
to, say, pass the signal mask at creation time, they just add a
function pthread_attr_set_signal_mask() instead of adding a
new argument to pthread_create().)
There are attribute objects for:
Threads
stack size, stack base, scheduling policy, scheduling class,
scheduling scope, scheduling inheritance, detach state.
Mutexes
Cross process, priority inheritance
Condition Variables
Cross process
114
Attribute Objects
Attribute objects must be:
Allocated
Initialized
Values set (presumably)
Used
Destroyed (if they are to be free’d)
pthread_attr_t attr;
pthread_attr_init (&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_create(NULL, &attr, foo, NULL);
pthread_attr_destroy (&attr);
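Putting the full allocate/initialize/set/use/destroy sequence together, a minimal runnable sketch (the slides pass NULL for the thread ID; here a real pthread_t is used, and the sleep() at the end is only a crude way to let the detached thread run):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static void *foo(void *arg)
{
    (void)arg;
    printf("detached thread running\n");
    return NULL;                 /* cannot be joined; its resources are freed on exit */
}

int main(void)
{
    pthread_t tid;
    pthread_attr_t attr;

    pthread_attr_init(&attr);                                     /* allocate + initialize */
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);  /* set values */
    pthread_create(&tid, &attr, foo, NULL);                       /* use */
    pthread_attr_destroy(&attr);                                  /* destroy */

    sleep(1);   /* give the detached thread time to run before main exits */
    return 0;
}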
115
Thread Attribute Objects
pthread_attr_t;
    Thread attribute object type.
pthread_attr_init (pthread_attr_t *attr)
pthread_attr_destroy (pthread_attr_t *attr)
pthread_attr_getdetachstate (pthread_attr_t *attr, int *state)
pthread_attr_setdetachstate (pthread_attr_t *attr, int state)
    Can the thread be joined?
pthread_attr_getscope(pthread_attr_t *attr, int *scope)
pthread_attr_setscope(pthread_attr_t *attr, int scope)
    What is the scheduling scope?
116
Thread Attribute Objects
pthread_attr_getinheritpolicy(pthread_attr_t *attr, int *policy)
pthread_attr_setinheritpolicy(pthread_attr_t *attr, int policy)
Will the policy in the attribute object be used?
pthread_attr_getschedpolicy(pthread_attr_t *attr, int *policy)
pthread_attr_setschedpolicy(pthread_attr_t *attr, int policy)
Will the scheduling be RR, FIFO, or OTHER?
pthread_attr_getschedparam(pthread_attr_t *attr, struct sched_param *param)
pthread_attr_setschedparam(pthread_attr_t *attr, struct sched_param *param)
What will the priority be?
117
Thread Attribute Objects
pthread_attr_getinheritsched(pthread_attr_t *attr, int *inheritsched)
pthread_attr_setinheritsched(pthread_attr_t *attr, int inheritsched)
Will the policy in the attribute object be used?
pthread_attr_getstacksize(pthread_attr_t *attr, int *size)
pthread_attr_setstacksize(pthread_attr_t *attr, int size)
How big will the stack be?
pthread_attr_getstackaddr (pthread_attr_t *attr, size_t *base)
pthread_attr_setstackaddr(pthread_attr_t *attr, size_t base)
What will the stack’s base address be?
118
Mutex Attribute Objects
pthread_mutexattr_t;
mutex attribute object type
pthread_mutexattr_init(pthread_mutexattr_t *attr)
pthread_mutexattr_destroy(pthread_mutexattr_t *attr)
pthread_mutexattr_getpshared (pthread_mutexattr_t *attr, int *shared)
pthread_mutexattr_setpshared (pthread_mutexattr_t *attr, int shared)
Will the mutex be shared across processes?
119
Mutex Attribute Objects
pthread_mutexattr_getprioceiling(pthread_mutexattr_t
*attr, int *ceiling)
pthread_mutexattr_setprioceiling(pthread_mutexattr_t *attr, int ceiling)
What is the highest priority the thread owning this mutex
can acquire?
pthread_mutexattr_getprotocol (pthread_mutexattr_t
*attr, int *protocol)
pthread_mutexattr_setprotocol (pthread_mutexattr_t
*attr, int protocol)
Shall the thread owning this mutex inherit priorities from
waiting threads?
120
Condition Variable
Attribute Objects
pthread_condattr_t;
CV attribute object type
pthread_condattr_init(pthread_condattr_t * attr)
pthread_condattr_destroy(pthread_condattr_t *attr)
pthread_condattr_getpshared (pthread_condattr_t
*attr, int *shared)
pthread_condattr_setpshared (pthread_condattr_t
*attr, int shared)
Will the condition variable be shared across processes?
121
Creation and Destruction
(UI & POSIX)
int thr_create(void *stack_base, size_t stacksize,
               void *(*start_routine) (void *), void *arg,
               long flags, thread_t *thread);
void thr_exit (void *value_ptr);
int thr_join (thread_t thread, void **value_ptr);
int pthread_create (pthread_t *thread, const
pthread_attr_t *attr, void *
(*start_routine) (void *), void *arg);
void pthread_exit (void *value_ptr);
int pthread_join (pthread_t thread, void
**value_ptr);
int pthread_cancel (pthread_t thread);
122
Suspension (UI & POSIX)
int thr_suspend(thread_t target)
int thr_continue(thread_t target)
123
Changing Priority (UI &
POSIX)
int thr_setpriority(thread_t thread, int priority)
int thr_getpriority(thread_t thread, int *priority)
int pthread_getschedparam(pthread_t thread, int
*policy, struct sched param
*param)
int pthread_setschedparam(pthread_t thread, int
policy, struct sched param *param)
124
Readers / Writer Locks
(UI)
int rwlock_init (rwlock_t *rwlock, int
type, void *arg);
int rw_rdlock (rwlock_t *rwlock);
int rw_wrlock (rwlock_t *rwlock);
int rw_tryrdlock(rwlock_t *rwlock);
int rw_trywrlock (rwlock_t *rwlock);
int rw_unlock (rwlock_t *rwlock);
int rw_destroy (rwlock_t *rwlock);
125
(Counting) Semaphores
(UI & POSIX)
int sema_init (sema_t *sema,
unsigned int sema_count,
int type, void *arg)
int sema_wait (sema_t *sema)
int sema_post(sema_t *sema)
int sema_trywait (sema_t *sema)
int sema_destroy (sema_t *sema)
int sem_init (sem_t *sema, int pshared, unsigned int count)
int sem_wait (sem_t *sema)
int sem_post (sem_t *sema)
int sem_trywait (sem_t *sema)
int sem_destroy (sem_t *sema)
(POSIX semaphores are not part of pthreads. Use libposix4.so and posix4.h)
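A small runnable sketch using the POSIX counting-semaphore API listed above (on Solaris link with -lposix4, on Linux with -lpthread; the one-item producer/consumer shape is only illustrative):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t items;                 /* counts items produced but not yet consumed */
static int shared_value;

static void *producer(void *arg)
{
    (void)arg;
    shared_value = 99;              /* "produce" one item */
    sem_post(&items);               /* signal its availability */
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    sem_wait(&items);               /* block until an item is available */
    printf("consumed %d\n", shared_value);
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    sem_init(&items, 0, 0);         /* not shared across processes, initial count 0 */
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&items);
    return 0;
}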
126
Condition Variables
(UI & POSIX)
int cond_init(cond_t *cond, int type, void *arg)
int cond_wait(cond_t *cond, mutex_t *mutex)
int cond_signal(cond_t *cond)
int cond_broadcast(cond_t *cond)
int cond_timedwait(cond_t *cond, mutex_t *mutex, timestruc_t *abstime)
int cond_destroy (cond_t *cond)

int pthread_cond_init(pthread_cond_t *cond, pthread_condattr_t *attr)
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)
int pthread_cond_signal (pthread_cond_t *cond)
int pthread_cond_broadcast(pthread_cond_t *cond)
int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex, struct timespec *abstime)
int pthread_cond_destroy(pthread_cond_t *cond)
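A minimal runnable sketch of the standard wait/signal idiom with the POSIX calls above (the predicate "ready" and the while-loop re-check guard against spurious wakeups):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv   = PTHREAD_COND_INITIALIZER;
static int ready = 0;               /* the condition the waiter is waiting for */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!ready)                  /* always re-check the predicate */
        pthread_cond_wait(&cv, &lock);
    printf("condition became true\n");
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void *signaler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_signal(&cv);       /* wake one waiter */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t w, s;
    pthread_create(&w, NULL, waiter, NULL);
    pthread_create(&s, NULL, signaler, NULL);
    pthread_join(w, NULL);
    pthread_join(s, NULL);
    return 0;
}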
127
Signals (UI & POSIX)
int thr_sigsetmask(int how, const sigset_t *set, sigset_t
*oset);
int thr_kill(thread_t target_thread, int sig)
int sigwait(sigset_t *set)
int pthread_sigmask(int how, const sigset_t *set, sigset_t *oset);
int pthread_kill(pthread_t target_thread, int sig)
int sigwait(sigset_t *set, int *sig)
128
Cancellation (POSIX)
int pthread_cancel (pthread_t thread)
int pthread_cleanup_pop (int execute)
int pthread_cleanup_push (void (*function) (void *), void *arg)
int pthread_setcancelstate (int state, int *old_state)
int pthread_testcancel (void)
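A short runnable sketch of deferred cancellation with a cleanup handler (the worker sleeps in a loop, and sleep() is a cancellation point; the handler releases the mutex the worker holds so it is not left locked):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

static void cleanup(void *arg)
{
    (void)arg;
    printf("cleanup: releasing lock\n");
    pthread_mutex_unlock(&m);       /* never leave a mutex held when cancelled */
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    pthread_cleanup_push(cleanup, NULL);
    for (;;)
        sleep(1);                   /* cancellation is honoured here */
    pthread_cleanup_pop(0);         /* never reached; keeps push/pop balanced */
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    sleep(1);
    pthread_cancel(t);              /* deferred: acted on at the next cancellation point */
    pthread_join(t, NULL);
    printf("worker cancelled\n");
    return 0;
}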
129
Other APIs
thr_self(void)
thr_yield()
int pthread_atfork (void (*prepare) (void), void (*parent) (void), void (*child) (void))
pthread_equal (pthread_t t1, pthread_t t2)
pthread_once (pthread_once_t *once_control, void (*init_routine) (void))
pthread_self (void)
pthread_yield()
(Thread IDs in Solaris recycle every 2^32 threads, or about once a month if you do create/exit as fast as possible.)
130
Compiling
131
Solaris Libraries
+
Solaris has three libraries: libthread.so,
libpthread.so, libposix4.so
+
Corresponding new include files: synch.h,
thread.h, pthread.h, posix4.h
+
Bundled with all O/S releases
+
Running an MT program requires no extra
effort
+
Compiling an MT program requires only a
compiler (any compiler!)
+
Writing an MT program requires only a
compiler (but a few MT tools will come in
very handy)
132
Compiling UI under
Solaris
+
Compiling is no different than for non-MT programs
+
libthread is just another system library in /usr/lib
+
Example: %cc -o sema sema.c -lthread -D_REENTRANT
         %cc -o sema sema.c -mt
+
All multithreaded programs should be compiled
using the _REENTRANT flag
+
Applies for every module in a new application
+
If omitted, the old definitions for errno, stdio would
be used, which you don’t want
+
All MT-safe libraries should be compiled using the
_REENTRANT flag, even though they may be used
single in a threaded program.
133
Compiling POSIX under
Solaris
+ Compiling is no different than for non-MT programs
+ libpthread is just another system library in /usr/lib
+ Example: %cc -o sema sema.c -lpthread -lposix4 -D_POSIX_C_SOURCE=199506L
+ All multithreaded programs should be compiled using the
_POSIX_C_SOURCE=199506L flag
+ Applies for every module in a new application
+ If omitted, the old definitions for errno, stdio would be
used, which you don’t want
+ All MT-safe libraries should be compiled using the
_POSIX_C_SOURCE=199506L flag, even though they may
be used single in a threaded program
134
Compiling mixed
UI/POSIX under Solaris
+ If you just want to use the UI thread functions (e.g., thr_setconcurrency()):
  %cc -o sema sema.c -lthread -lpthread -lposix4 -D_REENTRANT -D_POSIX_PTHREAD_SEMANTICS
+ The same flags apply if you also want to use the UI semantics for fork(), alarms, timers, sigwait(), etc.
135
Summary
+ Threads provide a more natural programming paradigm.
+ They improve efficiency on uniprocessor systems.
+ They allow full advantage to be taken of multiprocessor hardware.
+ They improve throughput: asynchronous I/O is simple to implement.
+ They leverage special features of the OS.
+ Many applications are already multithreaded.
+ MT is not a silver bullet for all programming problems.
+ There is already a standard for multithreading: POSIX.
+ Multithreading support is already available in the form of language syntax: Java.
+ Threads allow real-world objects to be modeled (e.g., in Java).
136
Java
Multithreading in Java
137
Java - An Introduction
+
Java - The new programming language from
Sun Microsystems
+
Java -Allows anyone to publish a web page
with Java code in it
+
Java - CPU Independent language
+
Created for consumer electronics
+
Java - James , Arthur Van , and others
+
Java -The name that survived a patent
search
+
Oak -The predecessor of Java
+
Java is “C++ -- ++ “
138
Object Oriented Languages - A Comparison

Feature                  C++    Objective C   Ada        Java
Encapsulation            Yes    Yes           Yes        Yes
Inheritance              Yes    Yes           No         Yes
Multiple Inherit.        Yes    Yes           No         No
Polymorphism             Yes    Yes           Yes        Yes
Binding (Early or Late)  Both   Both          Early      Late
Concurrency              Poor   Poor          Difficult  Yes
Garbage Collection       No     Yes           No         Yes
Genericity               Yes    No            Yes        No
Class Libraries          Yes    Yes           Limited    Yes
139
Sun defines Java as:
+ Simple and Powerful
+ Safe
+ Object Oriented
+ Robust
+ Architecture Neutral and Portable
+ Interpreted and High Performance
+ Threaded
+ Dynamic
140

Java Integrates
Power of Compiled Languages
and
Flexibility of Interpreted Languages
141
Classes and Objects
+
Classes and Objects
+
Method Overloading
+
Method Overriding
+
Abstract Classes
+
Visibility modifiers
default
public
protected
private protected , private
142
Threads
+ Java has built-in thread support for multithreading
+ Synchronization
+ Thread Scheduling
+ Inter-Thread Communication:
    currentThread   start   setPriority
    yield           run     getPriority
    sleep           stop    suspend
    resume
+ Java Garbage Collector is a low-priority thread
143
Ways of Multithreading in
Java
+ Create a class that extends the Thread class
+ Create a class that implements the Runnable interface
+
1st Method: Extending the Thread class
class MyThread extends Thread
{
public void run()
{
// thread body of execution
}
}
+ Creating thread:
MyThread thr1 = new MyThread();
+ Start Execution:
thr1.start();
144
2nd method: Threads by implementing
Runnable interface
class ClassName implements Runnable
{
.....
public void run()
{
// thread body of execution
}
}
+
Creating Object:
ClassName myObject = new ClassName();
+
Creating Thread Object:
Thread thr1 = new Thread( myObject );
+
Start Execution:
thr1.start();
145
Thread Class Members...
public class java.lang.Thread extends java.lang.Object
implements java.lang.Runnable
{
// Fields
public final static int MAX_PRIORITY;
public final static int MIN_PRIORITY;
public final static int NORM_PRIORITY;
// Constructors
public Thread();
public Thread(Runnable target);
public Thread(Runnable target, String name);
public Thread(String name);
public Thread(ThreadGroup group, Runnable target);
public Thread(ThreadGroup group, Runnable target, String name);
public Thread(ThreadGroup group, String name);
// Methods
public static int activeCount();
public void checkAccess();
public int countStackFrames();
public static Thread currentThread();
public void destroy();
public static void dumpStack();
public static int enumerate(Thread tarray[]);
public final String getName();
146
...Thread Class Members.
public final int getPriority(); // 1 to 10 priority-pre-emption at mid.
public final ThreadGroup getThreadGroup();
public void interrupt();
public static boolean interrupted();
public final boolean isAlive();
public final boolean isDaemon();
public boolean isInterrupted();
public final void join();
public final void join(long millis);
public final void join(long millis, int nanos);
public final void resume();
public void run();
public final void setDaemon(boolean on);
public final void setName(String name);
public final void setPriority(int newPriority);
public static void sleep(long millis);
public static void sleep(long millis, int nanos);
public void start();
public final void stop();
public final void stop(Throwable obj);
public final void suspend();
public String toString();
public static void yield();
}
147
Manipulation of Current Thread
// CurrentThreadDemo.java
class CurrentThreadDemo {
public static void main(String arg[]) {
Thread ct = Thread.currentThread();
ct.setName( "My Thread" );
System.out.println("Current Thread : "+ct);
try {
for(int i=5; i>0; i--) {
System.out.println(" " + i);
Thread.sleep(1000);
}
}
catch(InterruptedException e) {
System.out.println("Interrupted."); }
}
}
Run:
Current Thread : Thread[My Thread,5,main]
5
4
3
2
1
148
Creating new Thread...
// ThreadDemo.java
class ThreadDemo implements Runnable
{
ThreadDemo()
{
Thread ct = Thread.currentThread();
System.out.println("Current Thread : "+ct);
Thread t = new Thread(this,"Demo Thread");
t.start();
try
{
Thread.sleep(3000);
}
catch(InterruptedException e)
{
System.out.println("Interrupted.");
}
System.out.println("Exiting main thread.");
}
149
...Creating new Thread.
public void run() {
try {
for(int i=5; i>0; i--) {
System.out.println(" " + i);
Thread.sleep(1000);
} }
catch(InterruptedException e) {
System.out.println("Child interrupted.");
}
System.out.println("Exiting child thread.");
}
public static void main(String args[]) {
new ThreadDemo();
}
}
Run:
Current Thread : Thread[main,5,main]
5
4
3
Exiting main thread.
2
1
Exiting child thread.
150
Thread Priority...
// HiLoPri.java
class Clicker implements Runnable {
int click = 0;
private Thread t;
private boolean running = true;
public Clicker(int p)
{
t = new Thread(this);
t.setPriority(p);
}
public void run()
{
while(running)
click++;
}
public void start()
{
t.start();
}
public void stop()
{
running = false;
}
}
151
...Thread Priority
class HiLoPri
{
public static void main(String args[])
{
Thread.currentThread().setPriority(Thread.MAX_PRIORITY);
Clicker Hi = new Clicker(Thread.NORM_PRIORITY+2);
Clicker Lo = new Clicker(Thread.NORM_PRIORITY-2);
Lo.start();
Hi.start();
try {
Thread.sleep(10000);
}
catch (Exception e)
{ }
Lo.stop();
Hi.stop();
System.out.println(Lo.click + " vs. " + Hi.click);
}
}
Run1: (on Solaris)
0 vs. 956228
Run2: (Window 95)
304300 vs. 4066666
152
The Java monitor model
Method 1
Method 2
Block 1
Key
Threads
Monitor (synchronised) solves race-condition problem
153
Threads Synchronisation...
// Synch.java: race-condition without synchronisation
class Callme {
// Check synchronized and unsynchronized methods
/* synchronized */ void call(String msg)
{
System.out.print("["+msg);
try {
Thread.sleep(1000);
}
catch(Exception e)
{ }
System.out.println("]");
}
}
class Caller implements Runnable
{
String msg;
Callme Target;
public Caller(Callme t, String s)
{
Target = t;
msg = s;
new Thread(this).start();
}
154
...Threads Synchronisation.
public void run() {
Target.call(msg);
}
}
class Synch {
public static void main(String args[]) {
Callme Target = new Callme();
new Caller(Target, "Hello");
new Caller(Target, "Synchronized");
new Caller(Target, "World");
}
}
Run 1: With unsynchronized call method (race condition)
[Hello[Synchronized[World]
]
]
Run 2: With synchronized call method
[Hello]
[Synchronized]
[World]
Run3: With Synchronized object
synchronized(Target)
{ Target.call(msg); }
The output is the same as Run2
155
Queue (no inter-threaded
communication)...
// pc.java: produce and consumer
class Queue
{
int n;
synchronized int get()
{
System.out.println("Got : "+n);
return n;
}
synchronized void put(int n)
{
this.n = n;
System.out.println("Put : "+n);
}
}
class Producer implements Runnable
{
Queue Q;
Producer(Queue q)
{
Q = q;
new Thread( this, "Producer").start();
}
156
Queue (no inter-threaded
communication)...
public void run()
{
int i = 0;
while(true)
Q.put(i++);
}
}
class Consumer implements Runnable
{
Queue Q;
Consumer(Queue q)
{
Q = q;
new Thread( this, "Consumer").start();
}
public void run()
{
while(true)
Q.get();
}
}
157
...Queue (no inter-threaded
communication).
class PC
{
public static void main(String[] args)
{
Queue Q = new Queue();
new Producer(Q);
new Consumer(Q);
}
}
Run:
Put: 1
Got: 1
Got: 1
Got: 1
Put: 2
Put: 3
Got: 3
^C
158
Queue (interthread communication)...
// PCnew.java: produce-consumenr with interthread communication
class Queue
{
int n;
boolean ValueSet = false;
synchronized int get()
{
try
{
if(!ValueSet)
wait();
}
catch(InterruptedException e)
{
}
System.out.println("Got : "+n);
ValueSet = false;
notify();
return n;
}
159
Queue (interthread communication)...
synchronized void put(int n)
{
try {
if(ValueSet)
wait();
}
catch(InterruptedException e)
{ }
this.n = n;
System.out.println("Put : "+n);
ValueSet = true;
notify();
}
}
class Producer implements Runnable
{
Queue Q;
Producer(Queue q)
{
Q = q;
new Thread( this, "Producer").start();
}
160
Queue (interthread communication)...
public void run()
{
int i = 0;
while(true)
Q.put(i++);
}
}
class Consumer implements Runnable
{
Queue Q;
Consumer(Queue q)
{
Q = q;
new Thread( this, "Consumer").start();
}
public void run()
{
while(true)
Q.get();
}
}
161
...Queue (interthread communication).
class PCnew
{
public static void main(String[] args)
{
Queue Q = new Queue();
new Producer(Q);
new Consumer(Q);
}
}
Run:
Put : 0
Got : 0
Put : 1
Got : 1
Put : 2
Got : 2
Put : 3
Got : 3
Put : 4
Got : 4
^C
162
Deadlock...
// DeadLock.java
class A
{
synchronized void foo(B b)
{
String name = Thread.currentThread().getName();
System.out.println(name + " entered A.foo");
try
{
Thread.sleep(1000);
}
catch(Exception e)
{
}
System.out.println(name + " trying to call B.last()");
b.last();
}
synchronized void last()
{
System.out.println("Inside A.last");
}
}
163
Deadlock...
class B
{
synchronized void bar(A a)
{
String name = Thread.currentThread().getName();
System.out.println(name + " entered B.bar");
try
{
Thread.sleep(1000);
}
catch(Exception e)
{
}
System.out.println(name + " trying to call A.last()");
a.last();
}
synchronized void last()
{
System.out.println("Inside B.last");
}
}
164
...Deadlock.
class DeadLock implements Runnable {
A a = new A();
B b = new B();
DeadLock() {
Thread.currentThread().setName("Main Thread");
new Thread(this).start();
a.foo(b);
System.out.println("Back in the main thread.");
}
public void run() {
Thread.currentThread().setName("Racing Thread");
b.bar(a);
System.out.println("Back in the other thread");
}
public static void main(String args[]) {
new DeadLock();
}
}
Run:
Main Thread entered A.foo
Racing Thread entered B.bar
Main Thread trying to call B.last()
Racing Thread trying to call A.last()
^C
165
Grand Challenges
(Is PP Practical?)
+ Need OS and compiler support to use multiprocessor machines.
+ Ideally the user would be unaware of whether the problem is running on sequential or parallel hardware; that is still a long way off.
+ With high-speed networks and improved microprocessor performance, multiple stand-alone machines can also be used as a parallel machine, a popular trend (an appealing vehicle for parallel computing).
+ Language standards have to evolve (portability).
+ Re-orientation of thinking: Sequential -> Parallel
166
Grand Challenges
(Is PP Practical?)
+ Language standards have to evolve (portability).
+ Re-orientation of thinking: Sequential -> Parallel
167
Breaking High Performance Computing Barriers
[Figure: GFLOPS achieved grows from a single processor, to shared memory, to a local parallel cluster, to a global parallel cluster.]
168
Thank You ...


Distributed Memory MIMD IPC channel Processor Processor A A Processor Processor B B Processor Processor C C IPC channel M E M B O U R S Y M E M B O U R S Y M E M B O U R S Y Memory Memory System A System A q q q Memory Memory System B System B Memory Memory System C System C Communication : IPC on High Speed Network. etc. Mesh. Cube.. Network can be configured to . Unlike Shared MIMD  easily/ readily expandable  Highly reliable (any CPU failure does not affect the whole system) 23 . Tree..

...e. cost = Speed C (speed = cost2) S  Speedup by a parallel computer increases as the logarithm of the number of processors.Laws of caution... i. of processors) P 24 . q Speed of computers is proportional to the square of their cost. S P og 2 l Speedup = log2(no.

Caution. ¢ Very fast development in PP and related area have blurred concept boundaries.. distributed computing. parallel computing/ processing. causing lot of terminological confusion : concurrent computing/ programming.. multiprocessing. etc. 25 ...

It’s hard to imagine a field that changes as rapidly as computing. 26 .

..Caution. terminologies) 27 . Computer Science is an Immature Science.. (lack of standard taxonomy.

¢ There is no strict delimiters for contributors to the area of parallel processing : CA... databases. computer networks. all have a role to play. OS. This makes it a Hot Topic of Research § 28 . HLLs.Caution..

Parallel Programming Paradigms Multithreading Task level parallelism 29 .

Parallel COUNTER COUNTER 1 COUNTER 2 Q Please 30 .Serial Vs.

function stuff } function2( ) { //... t2) 31 .High Performance Computing t1 t2 Serial Machine function1 ( ): function2 ( ): q Single CPU Time : add (t1.... t2) function1( ) { //.....function stuff } Parallel Machine : MPP function1( ) || function2 ( ) q massively parallel system containing thousands of CPUs Time : max (t1..

Single and Multithreaded Processes Single-threaded Process Multiplethreaded Threads of Process Execution Multiple instruction stream Single instruction stream Common Address Space 32 .

Multi-threaded I/O Application Application Application CPU CPU Better Response Times in Multiple Application Environments CPU CPU CPU Application CPU Higher Throughput for Parallelizeable Applications 33 .OS: Multi-Processing. Multi-Threaded Threaded Libraries.

Multi-threading. 34 . allowing any given driver to run on multiple CPUs simultaneously. independent I/O requests can be satisfied simultaneously because all the major disk.. and network drivers have been multithreaded. continued. tape. scalable I/O Application Application Application OS Kernel CPU CPU CPU Multiple.. Multi-threaded OS enables parallel.

pipes..pipes. open files open files or or mmap’d mmap’d files files STAC K DATA DATA TEXT TEXT DATA DATA TEXT TEXT processes processes Shared Memory Shared Memory maintained by kernel maintained by kernel processes processes 35 .Basic Process Model STAC K Shared Shared memory memory segments segments .

What are Threads? Ä Ä Thread is a piece of code that can execute in concurrence with other threads. It is a schedule entity on a processor Hardware Context Registers Registers Status Word Status Word Program Counter Program Counter iLocal state iGlobal/ shared state iPC iHard/Software Context Thread Object 36 Running .

THREAD THREAD DATA DATA THREAD THREAD TEXT TEXT 37 .Threaded Process Model THREAD THREAD STACK STACK SHARED SHARED MEMOR MEMOR Y Y Threads within a process q Independent executables q All threads are parts of a process hence communication easier and simpler.

.. }} aa( (00) )=. ..... )=... ... =... )=...... )=. bb( (00) )=... )=..... ..... bb( (22)=...... }} func2 ( () ) func2 {{ ... . aa( (22)=. .. bb( (11)=...Levels of Levels of Parallelism Parallelism Task i-l Task i-l Task ii Task Task i+1 Task i+1 CodeCodeGranularity Granularity Code Item Code Item Large grain Large grain (task level) (task level) Program Program Medium grain Medium grain (control level) (control level) Function (thread) Function (thread) Fine grain Fine grain (data level) (data level) Loop Loop Very fine grain Very fine grain (multiple issue) (multiple issue) With hardware 38 r r r r r r r r Task Task Control Control Data Data Multiple Issue Multiple Issue func1 ( () ) func1 {{ ... . aa( (11)=. . . }} func3 ( () ) func3 {{ ...... . =... + + x x Load Load ...

......... func ()... &exit_value)....... thread_t tid. &tid)............. int exit_value.. int exit_value...... } } main ( ) main ( ) { { thread_t tid......... &exit_value)...... NULL..... 0.thread_create (0....... ... thread_create (0...............thr_exit(exit_value).thread_join (tid......Simple Thread Example Simple Thread Example void *func ( ) void *func ( ) { { /* define local data */ /* define local data */ ...... NULL.. func ().... &tid). 0. 0........ ... 0. thread_join (tid....... thr_exit(exit_value).. ....../* function code */ ..} } 39 .../* function code */ .....

Microsoft 40 . CMU Mach C threads. ISO/IEEE standard POSIX. C-DAC PARAS CORE threads.Few Popular Thread Few Popular Thread Models Models                 POSIX. C-DAC Java-Threads. Sun Microsystems PARAS CORE threads. Sun Microsystems Chorus threads. Microsoft Windows NT/95 threads. Paris OS/2 threads. ISO/IEEE standard Mach C threads. Paris Chorus threads. IBM OS/2 threads. Sun Microsystems Sun OS LWP threads. Sun Microsystems Java-Threads. IBM Windows NT/95 threads. CMU Sun OS LWP threads.

Multithreading Multithreading Uniprocessors Uniprocessors b Concurrency b Concurrency Vs Parallelism Vs Parallelism  Concurrency  Concurrency P1 P1 P2 P2 P3 P3 CPU tim tim e e Number of Simulatneous execution units > no Number of Simulatneous execution units > no of CPUs of CPUs 41 .

Multithreading Multithreading Multiprocessors Multiprocessors Concurrency Vs Parallelism Concurrency Vs Parallelism CPU P1 P1 P2 P2 P3 P3 CPU CPU tim tim e e No of execution process = no of CPUs No of execution process = no of CPUs 42 .

Computational Model Computational Model User Level Threads User Level Threads Virtual Processors Virtual Processors Physical Processors Physical Processors User-Level Schedule (User) Kernel-Level Schedule (Kernel) Parallel Execution due to :: Parallel Execution due to 5 5 5 5 Concurrency of threads on Virtual Processors Concurrency of threads on Virtual Processors Concurrency of threads on Physical Processor Concurrency of threads on Physical Processor True Parallelism :: True Parallelism threads :: processor map = 1:1 threads processor map = 1:1 43 .

visible to other. state Process VM is shared. state change in VM by one thread change in VM by one thread visible to other.General Architecture of General Architecture of Thread Model Thread Model Hides the details of machine Hides the details of machine architecture architecture Maps User Threads to kernel Maps User Threads to kernel threads threads Process VM is shared. 44 .

& r1). a. c. t1. sub. int & result) // function stuff // function stuff int sub(int a. int & result) int add (int a. & r2). sub. t2. & r1). t2). & r2). pthread t1.d.b. t1. int & result) Processor // function stuff // function stuff Data a a b b r1 r1 c c d d r2 r2 45 IS1 pthread t1. a. add.Process Parallelism Process Parallelism int add (int a.d. int b. int b. pthread-create(&t1. add add Processor IS2 sub sub MISD and MIMD Processing MISD and MIMD Processing . t2. int b. c. pthread-create(&t2. pthread-create(&t1. pthread-par (2. pthread-create(&t2. int b. add. t2). int & result) int sub(int a.b. pthread-par (2.

. int count) sort( int *array. thread1.... thread1. sort... array. thread2.. sort. pthread-t... array. Data Processor pthread-t. ““ ““ pthread-create(& thread1.Data Parallelism Data Parallelism sort( int *array. N/2). sort. thread2)... thread1. sort. //. pthread-par(2.... array. N/2). pthread-create(& thread2. Sort Sort IS Processor do “ “ dn/2 dn2/+1 “ “ dn 46 Sort Sort SIMD Processing SIMD Processing . pthread-create(& thread2.. pthread-par(2. pthread-create(& thread1. thread1. array... N/2). thread2. N/2).. //. thread2). //.. int count) //..

Process and Threaded Process and Threaded models models Purpose Purpose Creation of a new Creation of a new thread thread Start execution of a new Start execution of a new thread thread Wait for completion of Wait for completion of thread thread Exit and destroy the Exit and destroy the thread thread Process Process Model Model fork ( ) fork ( ) exec( ) exec( ) Threads Threads Model Model thr_create( ) thr_create( ) [ thr_create() builds [ thr_create() builds the new thread and the new thread and starts the execution starts the execution thr_join() thr_join() thr_exit() thr_exit() wait( ) wait( ) exit( ) exit( ) 47 .

thread_create(0.0).0.0. ). thread_create(0.0. ).0. ). 48 . thread_create(0.0.0.0.0.Code Comparison Code Comparison Segment (Process) Segment(Thread) Segment (Process) Segment(Thread) main ( ) main ( ) { { fork ( fork ( fork ( fork ( fork ( fork ( } } main() main() { { thread_create(0. thread_create(0.func().0). thread_create(0.func().func().0).0).0. } } ). ).func().0.0).func(). ).0).0.func().0.

Printing Thread Printing Thread

Editing Editing Thread Thread

49

printing() printing() {{ -- -- -- -- -- -- -- -- -- -- -- -}} editing() editing() {{ -- -- -- -- -- -- -- -- -- -- -- -}} main() main() {{ -- -- -- -- -- -- -- -- -- -- -- -id1 == thread_create(printing); id1 thread_create(printing); id2 == thread_create(editing); id2 thread_create(editing); thread_run(id1, id2); thread_run(id1, id2); -- -- -- -- -- -- -- -- -- -- -- -}}

Independent Threads Independent Threads

50

Cooperative threads - File Cooperative threads - File Copy Copy
reader() reader() {{ -- -- -- -- -- -- -- -- --lock(buff[i]); lock(buff[i]); read(src,buff[i]); read(src,buff[i]); unlock(buff[i]); unlock(buff[i]); -- -- -- -- -- -- -- -- --}} writer() writer() {{ -- -- -- -- -- -- -- -- -- -lock(buff[i]); lock(buff[i]); write(src,buff[i]); write(src,buff[i]); unlock(buff[i]); unlock(buff[i]); -- -- -- -- -- -- -- -- -- -}}

buff[0] buff[0] buff[1] buff[1]

Cooperative Parallel Cooperative Parallel Synchronized Threads Synchronized Threads

51

.....Client Client RPC Call RPC Call rk oork tw eetw N N Server Server RPC(func) RPC(func) func() func() { { /* Body */ /* Body */ } } ...... .. 52 ....

Multithreaded Server Client Process Client Process Server Process Server Threads User Mode Kernel Mode Message Passing Facility 53 .

Multithreaded Compiler Sourc e Code Preprocess or Thread Compile r Thread Objec t Code 54 .

A thread pipeline 55 . The peer model 3.Thread Programming models 1. The boss/worker model 2.

The boss/worker model Program Workers taskX taskX Resources Files Databases Boss Input (Stream) main ( ( ) main ) taskY taskY Disks taskZ taskZ Special Devices 56 .

... switch( request ) case X: pthread_create(. case X: pthread_create(..taskX)... ..taskX)..Example main() /* the boss */ { forever { get a request. } } taskX() /* worker */ { perform the task... --Above runtime overhead of creating thread can be solved by thread pool * the boss thread creates all worker thread at program initialization and each worker thread suspends itself immediately for a wakeup call from boss 57 . sync if accessing shared resources } taskY() /* worker */ { perform the task... sync if accessing shared resources } ...

The peer model Program (static) (static) taskY taskY Input Input Workers taskX taskX Resources Files Databases Disks taskZ taskZ Special Devices 58 .

sync if accessing shared resources } task2() /* worker */ { wait for start perform the task..Example main() { pthread_create(.....thread2. sync if accessing shared resources } 59 ..task2).. ..task1)...... pthread_create(.thread1. signal all workers to start wait for all workers to finish do any cleanup } } task1() /* worker */ { wait for start perform the task...

A thread pipeline Program Input (Stream) Filter Threads Stage 11 Stage Stage 22 Stage Stage 33 Stage Resources Files Databases Disks Files Databases Disks Files Databases Disks Special Devices Special Devices Special Devices 60 .

Example
main() { pthread_create(....,stage1); pthread_create(....,stage2); .... wait for all pipeline threads to finish do any cleanup } stage1() { get next input for the program do stage 1 processing of the input pass result to next thread in pipeline } stage2(){ get input from previous thread in pipeline do stage 2 processing of the input pass result to next thread in pipeline } stageN() { get input from previous thread in pipeline do stage N processing of the input pass result to program output. }

61

Multithreaded Matrix Multiply...

X A B

= C

C[1,1] = A[1,1]*B[1,1]+A[1,2]*B[2,1].. …. C[m,n]=sum of product of corresponding elements in row of A and column of B. Each resultant element can be computed independently. 62

Multithreaded Matrix Multiply
typedef struct { int id; int size; int row, column; matrix *MA, *MB, *MC; } matrix_work_order_t; main() { int size = ARRAY_SIZE, row, column; matrix_t MA, MB,MC; matrix_work_order *work_orderp; pthread_t peer[size*zize]; ... /* process matrix, by row, column */ for( row = 0; row < size; row++ ) for( column = 0; column < size; column++) { id = column + row * ARRAY_SIZE; work_orderp = malloc( sizeof(matrix_work_order_t)); /* initialize all members if wirk_orderp */ pthread_create(peer[id], NULL, peer_mult, work_orderp); } } /* wait for all peers to exist*/ for( i =0; i < size*size;i++) pthread_join( peer[i], NULL ); }

63

exit( 1 ).s_addr = htonl(INADDR_ANY). /* default port_id */ if( (server_socket = socket( AF_INET. cli_addr. port_id. SOCK_STREAM. void main( int argc. struct sockaddr_in serv_addr.sin_addr. SO_REUSEADDR..Multithreaded Server.\n"). client_socket. sizeof(one)). setsockopt(server_socket. SOL_SOCKET. serv_addr. 0 )) < 0 ) { printf("Error: Unable to open socket in parmon server. 0. clilen. sizeof(serv_addr)). } memset( (char*) &serv_addr. serv_addr. 64 .. int one. serv_addr.sin_family = AF_INET. char *argv[] ) { int server_socket.sin_port = htons( port_id ). (char *)&one. #endif port_id = 4000. #ifdef _POSIX_THREADS pthread_t service_thr.

\n" ). service_dispatch. continue. (struct sockaddr *)&serv_addr. if( client_socket < 0 ) { printf( "connection to client failed in server. exit( 1 ). THR_DETACHED.. sizeof(serv_addr)) < 0 ) { printf( "Error: Unable to bind socket in parmon server->%d\n". #else thr_create( NULL. &clilen ). 0. } #ifdef POSIX_THREADS pthread_create( &service_thr. } listen( server_socket. if( bind( server_socket. NULL. client_socket. (struct sockaddr *)&serv_addr.. while( 1 ) { clilen = sizeof(cli_addr). client_socket). service_dispatch. &service_thr). 5).Multithreaded Server. #endif } } 65 . client_socket = accept( server_socket.errno ).

.Thread Funtion void *service_dispatch(int client_socket) { …Get USER Request if( readline( client_socket.Multithreaded Server // Service function -. #endif } 66 . 100 ) > 0 ) { …IDENTI|FY USER REQUEST ….Send Results to Server } …CLOSE Connect and Terminate THREAD close( client_socket ). command.Do NECESSARY Processing …. #ifdef POSIX_THREADS pthread_exit( (void *)0).

The Value of MT • • • • • • • • Program structure Parallelism Throughput Responsiveness System resource usage Distributed objects Single source across platforms (POSIX) Single binary for any number of CPUs 67 .

To thread or not to To thread or not to thread thread Improve efficiency on uniprocessor Improve efficiency on uniprocessor systems systems Use multiprocessor Hardware Use multiprocessor Hardware Improve Throughput Improve Throughput b b 8 8 8 8 8 8 Simple to implement Asynchronous I/O Simple to implement Asynchronous I/O 8 8 Leverage special features of the OS Leverage special features of the OS 68 .

it is not free it is not free b b 8 8 thread that has only five lines of thread that has only five lines of code would not be useful code would not be useful 69 . Thread creation is very cheap.To thread or not to To thread or not to thread thread 8 8 If all operations are CPU If all operations are CPU intensive do not go far on intensive do not go far on multithreading multithreading Thread creation is very cheap.

DOS .The Minimal OS Stack & Stack Pointer Program Counter User Code User Space Kernel Space DOS Data Global Data DOS DOS Code Hardware 70 .

OS/2 etc. MVS.) 71 .Multitasking OSs Process User Space Process Structure Kernel Space UNIX Hardware (UNIX. NT. VMS.

Multitasking Systems Processes P1 P2 P3 P4 The Kernel Hardware (Each process is completely independent) 72 .

Multithreaded Process T1’s SP T3’sPC T1’sPC T2’sPC T1’s SP T2’s SP User Code Global Data Process Structure The Kernel (Kernel state and address space are shared) 73 .

Signal Dispatch Table Memory Map Signal Dispatch Table Memory Map File Descriptors Priority Signal Mask Registers Kernel Stack File Descriptors CPU State LWP 2 LWP 1 74 .Kernel Structures Traditional UNIX Process Structure Process ID UID GID EUID EGID CWD. Solaris 2 Process Structure Process ID UID GID EUID EGID CWD.

OS/1. NT. AIX. IRIX M:M 2-level 75 .Scheduling Design Options M:1 HP-UNIX 1:1 DEC.

SunOS Two-Level Thread Model Traditional process Proc 1 User LWPs Kernel threads Kernel Hardware Processors 76 Proc 2 Proc 3 Proc 4 Proc 5 .

). } main() { thr_create( .. arg). .... ... pthread_create( func........Thread Life Cycle T1 pthread_create(. } void * func() { ... } POSIX Solaris 77 .arg.func..func.) pthread_exit() T2 main() { ..

pthread_join(T2)... . } void * func() { . } Solaris 78 .. } POSIX main() { thr_join( T2.&val_ptr).. ...Waiting for a Thread to Exit T1 pthread_join() pthread_exit() T2 main() { ....

Scheduling States: Simplified View of Thread State Transitions Stop Wakeup RUNNABLE Continue Preempt Stop STOPPED SLEEPING Stop ACTIVE Sleep 79 .

It can only issue a hardware interrupt to CPU3. CPU2 cannot change CPU3’s registers directly.Preemption The process of rudely interrupting a thread and forcing it to relinquish its LWP (or CPU) to another. Preemption ! = Time slicing All of the libraries are preemptive 80 . Higher priority threads always preempt lower priority threads. It is up to CPU3’s interrupt handler to look at CPU2’s request and decide what to do.

That means all of the process -.EXIT Vs.) 81 . then exit() will be called. (If no other threads are running. leaving the process intact and all of the other threads running. The thread exit functions: UI : thr_exit() POSIX : pthread_exit() OS/2 : DosExitThread() and _endthread() NT : ExitThread() and endthread() all cause only the calling thread to exit. THREAD_EXIT The normal C function exit() always causes the process to exit.All the threads.

.. DosKillThread(T1). {. (UI threads must “roll their own” using signals) T2 Windows NT 82 .Cancellation Cancellation is the means by which a thread can tell another thread that it should exit..... TerminateThread(T1) } } } There is no special relation between the killer of a thread and the victim. pthread_cancel (T1). (pthread exit) T1 (pthread cancel() main() main() POSIX main() OS/2 {. {.

must consider type) PTHREAD_CANCEL_ASYNCHRONOUS ever) (not generally used) PTHREAD_CANCEL_DEFERRED (Only at cancellation points) (any time what-so- b Type b b b (Only POSIX has state and type) (OS/2 is effectively always “enabled asynchronous”) (NT is effectively always “enabled asynchronous”) 83 .Cancellation State and Type b State b b PTHREAD_CANCEL_DISABLE (Cannot be cancelled) PTHREAD_CANCEL_ENABLE (Can be cancelled.

document.Cancellation is Always Complex! b b b b It is very easy to forget a lock that’s being held or a resource that should be freed. Be extremely meticulous in analyzing the possible thread states. Document. Use this only when you absolutely require it. document! 84 .

No threads can be waited for Any thread can return status b OS/2 b b b b NT b b 85 . An undetached thread must be “joined”. Any thread can be waited for No thread can return status No thread needs to be waited for. It cannot return status.Returning Status b POSIX and UI b b A detached thread cannot be “joined”. and can return a status.

. thr_suspend(T1)..Suspending a Thread T1 suspend() continue() T2 Solaris: main() { . thr_continue(T1)... .. .. } * POSIX does not support thread suspension 86 .

b Isolation of VM system “spooling” (?!) b NT Services specify that a service should b suspendable (Questionable requirement?) b Proposed Uses of Suspend/Contin ue Be Careful 87 . so they don’t count.Garbage Collectors b Debuggers b Performance Analysers b Other Tools? These all must go below the API.

Do NOT Think about Scheduling! Think about Resource Availability b Think about Synchronization b Think about Priorities b Ideally. you’re making a mistake! 88 . if you’re using suspend/ continue.

It is not even strictly an MT issue! 89 . Synchronization is not just an MP issue.” b * Serialized access to controlled resources.Synchronization b Websters: “To represent or arrange events to indicate coincidence or coexistence.” Lewis : “To arrange events so that they occur in a specified order.

   On shared memory :: On shared  semaphores memory Threads Synchronization :: Threads Synchronization shared variables -shared variables semaphores  On distributed memory ::  On distributed memory ® within a task :: semaphores ® within a task semaphores ® Across the tasks :: By passing Across the ®messages tasks By passing messages 90 .

Your->Dividend += dividend.> BankBalance. dividend = temp * InterestRate. Your- 91 .Unsynchronized Shared Data is a Formula for Disaster Thread1 Thread2 temp = Your . >BankBalance+= deposit. Your->BankBalance = newbalance. newbalance = dividend + temp.

(not all are) b An entire database transaction could need to be atomic. A section of code which you have forced to be atomic is a Critical Section. 92 .Atomic Actions b b b An action which must be started and completed with no possibility of interruption. All MP machines provide at least one complex atomic instruction. from which you can build anything. (not all are!) b A line of C code could need to be atomic. b A machine instruction could need to be atomic.

........ ....-..... .. . .--lock(DISK)..-....-...-.-.......-.... .--}} T2 writer() writer() {{ -..... -...........-.....-........-. unlock(DISK).-..-..-.-. unlock(DISK).-...-}} Shared Data 93 ..-lock(DISK).-...-.-.. unlock(DISK)..-....-.-..-.-......-.-. .... -.Critical Section Critical Section (Good Programmer!) (Good Programmer!) T1 reader() reader() {{ -........-.-..-..... lock(DISK)... .....-... ..-..-. lock(DISK).. unlock(DISK)... . ...........

...-. . ... .... unlock(DISK).-..-}} Shared Data 94 .. -.-...-.. lock(DISK)..-....-..-....-.-.. -.-.....-......... unlock(DISK).........-.-.............-...-....-................--lock(DISK).-.-..-.--}} T2 writer() writer() {{ -.Critical Section Critical Section (Bad Programmer!) (Bad Programmer!) T1 reader() reader() {{ -....-....-.....-...-.-..... . ..-. ......-. ......-.-..-.. ..-...-.......

Lock Shared Data! b b b Globals Shared data structures Static variables (really just lexically scoped global variables) 95 .

OS/2 and NT. list = item. Owner recorded. . 96 ..Mutexes Thread 1 item = create_and_fill_item(). mutex_unlock(&m).. Thread2 mutex_lock( &m ).func(this-item). item->next = list. list = list_next. this_item = list.. mutex_lock( &m ).. block in priority order. b b POSIX and UI : Owner not recorded. block in FIFO order. mutex_unlock(&m).

Synchronization Variables in Shared Memory (Cross Process) Synchronization Variable Process 1 S Shared Memory S S Process 2 S Thread 97 .

Synchronization Problems 98 .

Thread 2 lock( M2 ). lock( M1 ).Deadlocks Thread 1 lock( M1 ). lock( M2 ). Thread1 is waiting for the resource(M2) locked by Thread2 and Thread2 is waiting for the resource (M1) locked by Thread1 99 .

if( EBUSY |= pthread mutex_trylock (&m1)) break. } } do_real work(). else { pthread _mutex_unlock (&m1). { while (1) { pthread_mutex_lock (&m2). Use the trylock primitives if you must violate the hierarchy.Avoiding Deadlocks b b Establish a hierarchy : Always lock Mutex_1 before Mutex_2... wait_around_or_do_something_else(). 100 . etc../* Got `em both! */ } b Use lockllint or some similar static analysis program to scan your code for hierarchy violations.

in different Thread 1 Thread 2 mutex_lock (&m) mutex_lock (&m) v = v .Race Conditions A race condition is where the results of a program are different depending upon the timing of the events within the program.1. v = v * 2. mutex_unlock (&m) mutex_unlock (&m) --> if v = 1. the result can be 0 or 1based on which thread gets chance to enter CR first 101 . Some race conditions result answers and are clearly bugs.

Operating System Issues 102 .

Library Goals b b b Make it fast! Make it MT safe! Retain UNIX semantics! 103 .

Are Libraries Safe ? getc() OLD implementation: extern int get( FILE * p ) { /* code to read data */ } getc() NEW implementation: extern int get( FILE * p ) { pthread_mutex_lock(&m). } 104 . /* code to read data */ pthread_mutex_unlock(&m).

should two threads both be issuing system calls around the same time. nonMT programs to continue to run. (This is often a problem when using third party libraries.ERRNO In UNIX.) 105 . There is the potential for problems if you use some libraries which are not reentrant. This is done only when the flag_REENTRANT (UI) _POSIX_C_SOURCE=199506L (POSIX) is passed to the compiler. Therefore errno is defined in the header file to be a call to thread-specific data. the distinguished variable errno is used to hold the error code for any system calls that fail. allowing older. it would not be possible to figure out which one set the value for errno. Clearly.

g. but there is a similar function (e. getctime_r()) MT-Illegal This function wasn’t even compiled with _REENTRANT and therefore can only be called from the main thread. but was compiled with _REENTRANT Alternative Call This function is not safe.Are Libraries Safe? b b b b b MT-Safe This function is safe MT-Hot This function is safe and fast MT-Unsafe This function is not MT-safe. 106 .

Threads Debugging Interface b b b b b Debuggers Data inspectors Performance monitors Garbage collectors Coverage analyzers Not a standard interface! 107 .

The APIs 108 .

Different Thread Specifications Functionality UI Threads POSIX Thteads NT Threads OS/2 Threads Design Philosophy Base Near-Base Complex Complex Primitives Primitives Primitives Primitives Scheduling Classes Local/ Global Local/Global Global Global Mutexes Simple Simple Complex Complex Counting Semaphores Simple Simple Buildable Buildable R/W Locks Simple Buildable Buildable Buildable Condition VariablesSimple Simple Buildable Buildable Multiple-Object Buildable Buildable Complex Complex Synchronization Thread Suspension Yes Impossible Yes Yes Cancellation Buildable Yes Yes Yes Thread-Specific Data Yes Yes Yes Yes Signal-Handling Primitives Yes Yes n/a n/a Compiler Changes Required No No Yes No Vendor Libraries MT-safe? Moat Most All? All? ISV Libraries MT-safe? Some Some Some Some 109 .

POSIX and Solaris API Differences POSIX API Solaris API continue thread cancellation scheduling policies sync attributes thread attributes join suspend exit key creation semaphore vars priorities sigmask create thread specific data concurrency setting mutex vars kill reader/ writer vars condition vars daemon threads 110 .

Error Return Values b b b Many threads functions return an error value which can be looked up in errno.h. The “lack of resources” errors usually mean that you’ve used up all your virtual memory. and your program is likely to crash very soon. Very few threads functions set errno(check man pages). 111 .

pthread_create(NULL. and NT all use flags and direct arguments to indicate what the special details of the objects being created should be.Attribute Objects UI. &attr. pthread_attr_init(&attr). NULL. Vs: pthread_attr_t attr. OS/2. POSIX requires the use of “Attribute objects”: thr_create(NULL.PTHREAD_CREATE_DETACH ED). NULL). 112 . pthread_attr_setdetachstate(&attr. foo. THR_DETACHED). NULL. foo.

scheduling policy. stack base. scheduling inheritance. (If they decide they really want to. attribute objects allow the designers of the threads library more latitude to add functionality without changing the old interfaces. priority inheritance Condition Variables Cross process 113 .Attribute Objects Although a bit of pain in the *** compared to passing all the arguments directly. they just add a function pthread_attr_set_signal_mask() instead of adding a new argument to pthread_create(). pass the signal mask at creation time.) There are attribute objects for: Threads stack size. detach state. scheduling class. Mutexes Cross process. scheduling scope. say.

pthread_attr_setdetachstate(&attr. foo.Attribute Objects Attribute objects must be: Allocated Initialized Values set (presumably) Used Destroyed (if they are to be free’d) pthread_attr_t attr. PTHREAD_CREATE_DETACHED)’ pthread_create(NULL. pthread_attr_destroy (&attr). &attr. NULL). 114 . pthread_attr_init (&attr).

in *scope) pthread_attr_setscope(pthread_attr_t *attr. int scope) 115 . Thread attribute object type: pthread_attr_init (pthread_mutexattr_t *attr) pthread_attr_destroy (pthread_attr_t *attr) pthread_attr_getdetachstate *state) pthread_attr_setdetachstate state) (pthread_attr_t *attr. in (pthread_attr_t *attr.Thread Attribute Objects pthread_attr_t. int Can the thread be joined?: pthread_attr_getscope(pthread_attr_t *attr.

struct sched param *param) pthread_attr_setschedparam(pthread attr_t *attr. int *policy) pthread_attr_setschedpolicy(pthread_attr_t *attr. FIFO. int policy) Will the scheduling be RR. What will the priority be? 116 . or OTHER? pthread_attr_getschedparam(pthread_attr_t *attr.Thread Attribute Objects pthread_attr_getinheritpolicy(pthread_attr_t *attr. struct sched param *param). int policy) Will the policy in the attribute object be used? pthread_attr_getschedpolicy(pthread_attr_t *attr. int *policy) pthread_attr_setinheritpolicy(pthread_attr_t *attr.

size_t base) What will the stack’s base address be? 117 . int *inheritsched) pthread_attr_setinheritsched(pthread_attr_t *attr. int *size) pthread_attr_setstacksize(pthread_attr_t *attr.Thread Attribute Objects pthread_attr_getinheritsched(pthread_attr_t *attr. int inheritsched) Will the policy in the attribute object be used? pthread_attr_getstacksize(pthread_attr_t *attr. int size) How big will the stack be? pthread_attr_getstackaddr (pthread_attr_t *attr. size_t *base) pthread_attr_setstackaddr(pthread_attr_t *attr.

mutex attribute object type pthread_mutexattr_init(pthread_mutexattr_t *attr) pthread_mutexattr_destroy(pthread_mutexattr_t *attr) pthread_mutexattr_getshared(pthread_mutexattr_t*attr. int shared) Will the mutex be shared across processes? 118 .Mutex Attribute Objects pthread_mutexattr_t. int shared) pthread_mutexattr_setpshared (pthread_mutex attr_t *attr.

int *protocol) pthread_mutexattr_setprotocol (pthread_mutexattr_t *attr.Mutex Attribute Objects pthread_mutexattr_getprioceiling(pthread_mutexattr_t *attr. int *ceiling) What is the highest priority the thread owning this mutex can acquire? pthread_mutexattr_getprotocol (pthread_mutexattr_t *attr. int *ceiling) pthread_mutexattr_setprioceiling(pthread_mutexattr_t *attr. int protocol) Shall the thread owning this mutex inherit priorities from waiting threads? 119 .

int *shared) pthread_condattr_setpshared (pthread_condattr_t *attr. int shared) Will the mutex be shared across processes? 120 .Condition Variable Attribute Objects pthread_condattr_t. CV attribute object type pthread_condattr_init(pthread_condattr_t * attr) pthread_condattr_destroy(pthread_condattr_t *attr) pthread_condattr_getpshared (pthread_condattr_t *attr.

const pthread_attr_t *attr.Creation and Destruction (UI & POSIX) int thr_create(void *stack_base. int pthread_join (pthread_t thread. int thr_join (thread_t thread. void *(*start_routine) (void *). void *arg). long flags. void pthread_exit (void *value_ptr). void **value_ptr). size_t stacksize. thread_t thread). 121 . int pthread_create (pthread_t *thread. void * arg. void thr_exit (void *value_ptr). void **value_ptr). void * (*start_routine) (void *). int pthread_cancel (pthread_t thread).

Suspension (UI & POSIX) int thr_suspend (thread_t target) int thr_continue(thread_t target) 122 .

int *policy. int *priority) int pthread_getschedparam(pthread_t thread. struct sched param *param) 123 . int priority) int thr_getpriority(thread_t thread.Changing Priority (UI & POSIX) int thr_setpriority(thread_t thread. int policy. struct sched param *param) int pthread_setschedparam(pthread_t thread.

int rw_rdlock (rwlock_t *rwlock). int rw_unlock (rwlock_t *rwlock). int rw_tryrdlock(rwlock_t *rwlock). 124 . int type. int rw_wrlock (rwlock_t *rwlock). int rw_trywrlock (rwlock_t *rwlock). int rw_destroy (rwlock_t *rwlock).Readers / Writer Locks (UI) int rwlock_init (rwlock_t *rwlock. void *arg).

Use the libposix4. int pshared. unsigned int sema_count. int type. void *arg) int sema_wait (sema_t *sema) int sema_post (sema_t *sema) int sema_trywait (sema_t *sema) int sema_destroy (sema_t *sema) int int int int sem_init (sem_t *sema. unsigned int count) sem_post (sem_t *sema) sem_trywait (sem_t *sema) sem_destroy (sem_t *sema) (POSIX semaphores are not part of pthread.so and posix4.h) 125 .(Counting) Semaphores (UI & POSIX) int sema_init (sema_t *sema.

int type. timestruc_t *abstime) cond_destroy (cond_t *cond) int pthread_cond_init(pthread_cond_t *cond. pthread_mutex_t *mutex) int pthread_cond_signal (pthread_cond_t *cond) int pthread_cond_broadcast(pthread_cond_t *cond.Condition Variables (UI & POSIX) int int int int int int cond_init(contd_t *cond. pthread_mutex_t *mutex. void *arg) cond_wait(cond_t *cond. mutex_t *mutex. cond_signal(cond_t *cond) cond_broadcast(cond_t *cond) cond_timedwait(cond_t *cond.pthread_condattr_t *attr) int pthread_cond_wait(pthread_cond_t *cond. mutex_t *mutex). struct timespec *abstime) int pthread_cond_destroy(pthread_cond_t *cond) 126 .

sigset_t *oset). int sig) int sigwait(sigset_t *set) int pthread_sigmask(int how.Signals (UI & POSIX) int thr_sigsetmask(int how. int thr_kill(thread_t target thread. int sig) int sigwait(sigset_t *set. const sigset_t *set. int pthread_kill(thread_t target_thread. int *sig) 127 . sigset_t *oset). const sigset_t *set.

int *old_state) (void) 128 int pthread_setcancelstate int pthread_testcancel . void *arg) (int state.Cancellation (POSIX) int pthread_cancel int pthread cleanup_pop int pthread_cleanup_push (pthread_thread_t thread) (int execute) (void (*funtion) (void *).

void (*parent) (void). void (*init_routine) (void)) pthread_self (void) pthread_yield() (Thread IDs in Solaris recycle every 2^32 threads.) 129 . or about once a month if you do create/exit as fast as possible.Other APIs thr_self(void) thr_yield() int pthread_atfork (void (*prepare) (void). pthread_thread_t t2) pthread_once (pthread_once_t *once_control. void (*child) (void) pthread_equal (pthread_thread_t tl.

Compiling 130 .

libpthread.h.h.h. posix4.so.so. pthread. libposix4.h Bundled with all O/S releases b Running an MT program requires no extra effort b Compiling an MT program requires only a compiler (any compiler!) b Writing an MT program requires only a compiler (but a few MT tools will come in very handy) 131 .Solaris Libraries b b b Solaris has three libraries: libthread.so Corresponding new include files: synch. thread.

the old definitions for errno. which you don’t want b All MT-safe libraries should be compiled using the _REENTRANT flag.Compiling UI under Solaris b Compiling is no different than for non-MT programs b b libthread is just another system library in /usr/lib Example: %cc -o sema sema. 132 .c -mt b All multithreaded programs should be compiled using the _REENTRANT flag b b Applies for every module in a new application If omitted. stdio would be used.c -lthread -D_REENTRANT %cc -o sema sema. even though they may be used single in a threaded program.

Compiling POSIX under Solaris b Compiling is no different than for non-MT programs b b libpthread is just another system library in /usr/lib Example : %cc-o sema sema. stdio would be used. which you don’t want b All MT-safe libraries should be compiled using the _POSIX_C_SOURCE=199506L flag. the old definitions for errno. even though they may be used single in a threaded program 133 .c -lpthread -lposix4 -D_POSIX_C_SOURCE=19956L b All multithreaded programs should be compiled using the _POSIX_C_SOURCE=199506L flag b b Applies for every module in a new application If omitted.

alarms.. timers. thr_setconcurrency()) %cc-o sema sema.Compiling mixed UI/POSIX under Solaris b If you just want to use the UI thread functions (e..g. 134 .. etc. sigwait().c -1thread -1pthread -1posix4 D_REENTRANT _POSIX_PTHREAD_SEMANTICS If you also want to use the UI semantics for fork().

Threre is already standard for multithreading--POSIX Threre is already standard for multithreading--POSIX Multithreading support already available in the form of Multithreading support already available in the form of language syntax--Java language syntax--Java Threads allows to model the real world object (ex: in Java) Threads allows to model the real world object (ex: in Java) 135 . MT is not a silver bullet for all programming problems.Summary 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 Threads provide a more natural programming paradigm Threads provide a more natural programming paradigm Improve efficiency on uniprocessor systems Improve efficiency on uniprocessor systems Allows to take full advantage of multiprocessor Hardware Allows to take full advantage of multiprocessor Hardware Improve Throughput: simple to implement asynchronous Improve Throughput: simple to implement asynchronous I/O I/O Leverage special features of the OS Leverage special features of the OS Many applications are already multithreaded Many applications are already multithreaded MT is not a silver bullet for all programming problems.

Java Multithreading in Java 136 .

CPU Independent language Created for consumer electronics Java . and others Java -The name that survived a patent search Oak -The predecessor of Java Java is “C++ -.An Introduction b b b b b b b b Java . Arthur Van .James .++ “ 137 .Java .The new programming language from Sun Microsystems Java -Allows anyone to publish a web page with Java code in it Java .

Object Oriented Languages -A comparison Feature Encapsulation Inheritance Multiple Inherit. Polymorphism Binding (Early or Late) Concurrency Garbage Collection Genericity Class Libraries C++ Yes Yes Yes Yes Both Poor No Yes Yes Objective C Yes Yes Yes Yes Both Poor Yes No Yes Ada Yes No No Yes Early Difficult No Yes Limited Java Yes Yes No Yes Late Yes Yes No Yes 138 .

Sun defines Java as: b b b b b b b b Simple and Powerful Safe Object Oriented Robust Architecture Neutral and Portable Interpreted and High Performance Threaded Dynamic 139 .

Java Integrates Power of Compiled Languages and Flexibility of Interpreted Languages 140 .

Classes and Objects b b b b b Classes and Objects Method Overloading Method Overriding Abstract Classes Visibility modifiers default public protected private protected . private 141 .

Threads b b b b b Java has built in thread support for Multithreading Synchronization Thread Scheduling Inter-Thread Communication: currentThread start setPriority yield run getPriority sleep stop suspend resume Java Garbage Collector is a low-priority thread 142 .

extends Thread run() body of execution b new MyThread().start().Ways of Multithreading in Java b b Create a class that extends the Thread class Create a class that implements the Runnable interface b 1st Method: Extending the Thread class class MyThread { public void { // thread } } Creating thread: MyThread thr1 = Start Execution: thr1. b 143 .

144 . public void run() { // thread body of execution } } b Creating Object: ClassName myObject = new ClassName()...start(). b Creating Thread Object: Thread thr1 = new Thread( myObject )..2nd method: Threads by implementing Runnable interface class ClassName implements Runnable { .. b Start Execution: thr1.

// Methods public static int activeCount(). public class java. public final String getName().Object implements java. public static Thread currentThread().. name).lang. public void checkAccess(). // Constructors public Thread().Thread Class Members. Runnable target). public Thread(Runnable target).lang.lang. String name). 145 . String name).Thread extends java. public static void dumpStack(). public void destroy(). public final static int NORM_PRIORITY. public int countStackFrames(). public final static int MIN_PRIORITY. public Thread(String name). public Thread(Runnable target. public Thread(ThreadGroup group. Runnable target. String public Thread(ThreadGroup group.Runnable { // Fields public final static int MAX_PRIORITY.. public Thread(ThreadGroup group. public static int enumerate(Thread tarray[]).

final void setName(String name). final void setPriority(int newPriority). void interrupt().. boolean isInterrupted().Thread Class Members. 146 . void start(). int nanos). final void join(long millis. static void sleep(long millis. final void join(long millis). void run(). String toString(). final boolean isAlive(). final void join(). final boolean isDaemon(). static void sleep(long millis). int nanos). final void stop(Throwable obj). final ThreadGroup getThreadGroup().. static void yield(). public public public public public public public public public public public public public public public public public public public public public public public } final int getPriority().. final void setDaemon(boolean on). final void stop(). final void suspend(). // 1 to 10 priority-pre-emption at mid. static boolean interrupted(). final void resume().

out.main] 5 4 3 2 1 147 .java class CurrentThreadDemo { public static void main(String arg[]) { Thread ct = Thread. i>0. i--) { System.sleep(1000). try { for(int i=5. ct.Manipulation of Current Thread // CurrentThreadDemo.currentThread().out.5. } } } Run: Current Thread : Thread[My Thread. Thread. } } catch(InterruptedException e) { System."). System.println("Interrupted.setName( "My Thread" ).println("Current Thread : "+ct).println(" " + i).out.

").Creating new Thread.start(). try { Thread.out."Demo Thread").out.println("Exiting main thread."). } 148 . Thread t = new Thread(this.println("Interrupted. // ThreadDemo. } catch(InterruptedException e) { System..currentThread()..out.sleep(3000). System.java class ThreadDemo implements Runnable { ThreadDemo() { Thread ct = Thread. } System.println("Current Thread : "+ct). t.

println("Child interrupted. } System. 149 . Thread. i--) { System. 2 1 Exiting child thread..println("Exiting child thread.")..5.out.main] 5 4 3 Exiting main thread.. } } Run: Current Thread : Thread[main.out. } } catch(InterruptedException e) { System.out.").sleep(1000). public void run() { try { for(int i=5. } public static void main(String args[]) { new ThreadDemo().println(" " + i). i>0.Creating new Thread.

} public void stop() { running = false.. } public void run() { while(running) click++. // HiLoPri.Thread Priority.start(). } public void start() { t.java class Clicker implements Runnable { int click = 0. private Thread t. public Clicker(int p) { t = new Thread(this). private boolean running = true. } } 150 .setPriority(p).. t.

start(). 956228 Run2: (Window 95) 304300 vs..click).sleep(10000).NORM_PRIORITY+2). try { Thread. } catch (Exception e) { } Lo. Hi.. Hi.Thread Priority class HiLoPri { public static void main(String args[]) { Thread.println(Lo.stop(). Clicker Lo = new Clicker(Thread. Lo. Clicker Hi = new Clicker(Thread.click + " vs. System.MAX_PRIORITY).stop().setPriority(Thread.currentThread().out. " + Hi.NORM_PRIORITY-2).start().. 4066666 151 . } } Run1: (on Solaris) 0 vs.

The Java monitor model Method 1 Method 2 Key Block 1 Threads Monitor (synchronised) solves race-condition problem 152 .

println("]"). String s) { Target = t. } } class Caller implements Runnable { String msg. public Caller(Callme t.out.. Callme Target.java: race-condition without synchronisation class Callme { // Check synchronized and unsynchronized methods /* synchronized */ void call(String msg) { System. msg = s.sleep(1000).start(). new Thread(this). try { Thread.Threads Synchronisation. } 153 .out.. // Synch. } catch(Exception e) { } System.print("["+msg).

"Synchronized").. } } Run 1: With unsynchronized call method (race condition) [Hello[Synchronized[World] ] ] Run 2: With synchronized call method [Hello] [Synchronized] [World] Run3: With Synchronized object synchronized(Target) { Target. new Caller(Target. new Caller(Target.Threads Synchronisation.. "Hello"). new Caller(Target..call(msg). "World"). } The output is the same as Run2 154 . } } class Synch { public static void main(String args[]) { Callme Target = new Callme().call(msg). public void run() { Target.

System. "Producer").out.. Producer(Queue q) { Q = q. new Thread( this. return n. synchronized int get() { System.println("Put : "+n). } } class Producer implements Runnable { Queue Q.start(). } synchronized void put(int n) { this.Queue (no inter-threaded communication).n = n. // pc.. } 155 .out.java: produce and consumer class Queue { int n.println("Got : "+n).

public void run() { int i = 0. } public void run() { while(true) Q. class Consumer implements Runnable { Queue Q.get().put(i++)... } } Queue (no inter-threaded communication). Consumer(Queue q) { Q = q. } } 156 . while(true) Q. "Consumer"). new Thread( this.start().

. } } Run: Put: 1 Got: 1 Got: 1 Got: 1 Put: 2 Put: 3 Got: 3 ^C 157 . new Producer(Q)..Queue (no inter-threaded communication). new Consumer(Q).. class PC { public static void main(String[] args) { Queue Q = new Queue().

notify(). ValueSet = false. } catch(InterruptedException e) { } System. // PCnew..println("Got : "+n).Queue (interthread communication).java: produce-consumenr with interthread communication class Queue { int n. boolean ValueSet = false.out. } 158 . return n.. synchronized int get() { try { if(!ValueSet) wait().

. notify()..n = n.Queue (interthread communication). synchronized void put(int n) { try { if(ValueSet) wait(). } 159 . ValueSet = true. System. } catch(InterruptedException e) { } this. new Thread( this.out. "Producer").start(). } } class Producer implements Runnable { Queue Q. Producer(Queue q) { Q = q.println("Put : "+n).

put(i++)... public void run() { int i = 0. } } 160 . while(true) Q.get().Queue (interthread communication).start(). } } class Consumer implements Runnable { Queue Q. "Consumer"). new Thread( this. } public void run() { while(true) Q. Consumer(Queue q) { Q = q.

...Queue (no interthread communication).
class PCnew { public static void main(String[] args) { Queue Q = new Queue(); new Producer(Q); new Consumer(Q); } } Run: Put : 0 Got : 0 Put : 1 Got : 1 Put : 2 Got : 2 Put : 3 Got : 3 Put : 4 Got : 4 ^C

161

Deadlock...
// DeadLock.java class A { synchronized void foo(B b) { String name = Thread.currentThread().getName(); System.out.println(name + " entered A.foo"); try { Thread.sleep(1000); } catch(Exception e) { } System.out.println(name + " trying to call B.last()"); b.last(); } synchronized void last() { System.out.println("Inside A.last"); } }

162

Deadlock...
class B { synchronized void bar(A a) { String name = Thread.currentThread().getName(); System.out.println(name + " entered B.bar"); try { Thread.sleep(1000); } catch(Exception e) { } System.out.println(name + " trying to call A.last()"); a.last(); } synchronized void last() { System.out.println("Inside B.last"); } }

163

System.setName("Racing Thread").last() Racing Thread trying to call A.start(). B b = new B().. class DeadLock implements Runnable { A a = new A().foo Racing Thread entered B.println("Back in the main thread.currentThread().Deadlock. a. } public static void main(String args[]) { new DeadLock(). new Thread(this). System. b. } public void run() { Thread.out.last() ^C 164 .bar Main Thread trying to call B..out. } } Run: Main Thread entered A.bar(a).").setName("Main Thread").. DeadLock() { Thread.println("Back in the other thread").foo(b).currentThread().

(appealing vehicle for parallel computing) vehicle for parallel computing) Language standards have to evolve. (Portability). multiple stand-alone machines can also be used as a parallel machine -. (appealing used as a parallel machine a Popular Trend.a Popular Trend.a long way is running on sequential or parallel hardware a long way to go. Language standards have to evolve. to go. (Portability). multiple stand-alone machines can also be performance.Grand Challenges Grand Challenges (Is PP Practical?) (Is PP Practical?)      Need OS and Compiler support to use multiprocessor Need OS and Compiler support to use multiprocessor machines. With Highspeed Networks and improved microprocessor With Highspeed Networks and improved microprocessor performance. machines. Re-orientation of thinking Re-orientation of thinking b Sequential Parallel b Sequential Parallel 165 . Ideal would be for the user to be unaware if the problem Ideal would be for the user to be unaware if the problem is running on sequential or parallel hardware -.

(Portability).Grand Challenges Grand Challenges (Is PP Practical?) (Is PP Practical?)     Language standards have to Language standards have to evolve. (Portability). Re-orientation of thinking Re-orientation of thinking b Sequential b Sequential Parallel Parallel 166 . evolve.

Breaking High Performance Computing Barriers Breaking High Performance Computing Barriers 2100 2100 2100 2100 G F L O P S 2100 2100 2100 2100 2100 Single Processor Shared Memory Local Parallel Cluster Global Parallel Cluster 167 .

.. Thank You ..Thank You . 168 ..

Sign up to vote on this title
UsefulNot useful