Processes and Threads

The process abstraction combines two concepts:
- Concurrency: each process is a sequential execution stream of instructions
- Protection: each process defines an address space; the address space identifies all addresses that can be touched by the program

The Case for Threads

Consider the following code fragment:

    for(k = 0; k < n; k++)
        a[k] = b[k] * c[k] + d[k] * e[k];

Is there a missed opportunity here? On a uniprocessor? On a multiprocessor?
Threads

- Key idea: separate the concept of concurrency from protection
- A thread is a sequential execution stream of instructions
- A process defines the address space that may be shared by multiple threads
- Threads can execute on different cores on a multicore CPU (parallelism for performance) and can communicate with other threads by updating memory
The Case for Threads

(figure)

Programmer's View

(figure)
Introducing Threads

Examples:
- OS-supported: Windows' threads, Sun's LWP, POSIX threads
- Language-supported: Modula-3, Java
  (these are possibly going the way of the Dodo)
How Can it Help?

How can this code take advantage of 2 threads?

    for(k = 0; k < n; k++)
        a[k] = b[k] * c[k] + d[k] * e[k];

Rewrite this code fragment as:

    do_mult(l, m) {
        for(k = l; k < m; k++)
            a[k] = b[k] * c[k] + d[k] * e[k];
    }

    main() {
        CreateThread(do_mult, 0, n/2);
        CreateThread(do_mult, n/2, n);
    }

What did we gain?

How Can it Help?

Consider a Web server. Create a number of threads, and for each thread do:
- get network message from client
- get URL data from disk
- send data over network
Overlapping Requests (Concurrency)

    Thread 1 (Request 1)                     Thread 2 (Request 2)
    get network message (URL) from client
                                             get network message (URL) from client
    get URL data from disk
                                             get URL data from disk
    (disk access latency)
                                             (disk access latency)
    send data over network
                                             send data over network

Threads have their own…?
1. CPU
2. Address space
3. PCB
4. Stack
5. Registers
Threads vs. Processes

Threads:
- A thread has no data segment or heap
- A thread cannot live on its own; it must live within a process
- There can be more than one thread in a process; the first thread calls main & has the process's stack
- If a thread dies, its stack is reclaimed
- Inter-thread communication via memory
- Each thread can run on a different physical processor
- Inexpensive creation and context switch

Processes:
- A process has code/data/heap & other segments
- There must be at least one thread in a process
- Threads within a process share code/data/heap, share I/O, but each has its own stack & registers
- If a process dies, its resources are reclaimed & all threads die
- Inter-process communication via OS and data copying
- Each process can run on a different physical processor
- Expensive creation and context switch

Implementing Threads

A Process Control Block (PCB) contains process-specific information:
- Owner, PID, heap pointer, priority, active thread, and pointers to thread information

A Thread Control Block (TCB) contains thread-specific information:
- Stack pointer, PC, thread state (running, …), register values, a pointer to the PCB, …

(Figure: a process's address space holding mapped segments, DLLs, heap, one stack per thread, initialized data, and code; one TCB per thread recording PC, SP, state, and registers.)
Threads' Life Cycle

Threads (just like processes) go through a sequence of start, ready, running, waiting, and done states:

    Start -> Ready <-> Running -> Done
                       Running -> Waiting -> Ready

Threads have the same scheduling states as processes
1. True
2. False
User-level vs. Kernel-level Threads

(Figure: Process0 and Process1 shown above the user/kernel boundary.)

User-level threads (M-to-1 model)
- (+) Fast to create and switch
- (+) Natural fit for language-level threads
- (-) All user-level threads in a process block on OS calls
  (e.g., a read from a file can block all threads)
- (-) The user-level scheduler can fight with the kernel-level scheduler

Kernel-level threads (1-to-1 model)
- (+) Kernel-level threads do not block the process on a syscall
- (+) Only one scheduler (and the kernel has a global view)
- (-) Can be difficult to make efficient (create & switch)

Kernel-level thread vs. process
- Each process requires its own page table & hardware state (significant on the x86)

Languages vs. Systems

Kernel-level threads have won for systems
- Linux, Solaris 10, Windows
- pthreads tends to be kernel-level threads

User-level threads are still used for languages (Java)
- The user tells the JVM how many underlying system threads to use (default: 1 system thread)
- The Java runtime intercepts blocking calls and makes them non-blocking
- JNI code that makes blocking syscalls can block the JVM
- JVMs are phasing this out because kernel threads are efficient enough and intercepting system calls is complicated
Latency and Throughput

- Latency: time to complete an operation
- Throughput: work completed per unit time
- Multiplying vector example: reduced latency
- Web server example: increased throughput
- Consider plumbing:
  - Low latency: turn on the faucet and water comes out
  - High bandwidth: lots of water (e.g., to fill a pool)
- What is "high-speed Internet"?
  - Low latency: needed for interactive gaming
  - High bandwidth: needed for downloading large files
  - Marketing departments like to conflate latency and bandwidth…

Relationship between Latency and Throughput

Latency and bandwidth are only loosely coupled
- Henry Ford: assembly lines increase bandwidth without reducing latency

My factory takes 1 day to make a Model-T Ford
- But I can start building a new car every 10 minutes
- At 24 hrs/day, I can make 24 * 6 = 144 cars per day
- A special order for 1 green car still takes 1 day
- Throughput is increased, but latency is not

Latency reduction is difficult; often, one can buy bandwidth
- E.g., more memory chips, more disks, more computers
- Big server farms (e.g., Google) are high bandwidth
When a user-level thread does I/O, it blocks the entire process.
1. True
2. False

Thread or Process Pool

Creating a thread or process for each unit of work (e.g., user request) is dangerous:
- High overhead to create & delete the thread/process
- Can exhaust CPU & memory resources

A thread/process pool controls resource use:
- Allows the service to be well conditioned

(Figure: throughput vs. load; a well-conditioned service's throughput levels off under heavy load, while a not-well-conditioned service's throughput collapses.)