
CENG 334 – Operating Systems

03- Threads & Synchronization

Assoc. Prof. Yusuf Sahillioğlu


Computer Eng. Dept., METU, Turkey
Threads
2 / 144

 Threads.
 Concurrent work on the same process for efficiency.
 Several activities going on as part of the same process.
 Share memory and other OS resources (each thread keeps its own registers).
 Careful w/ data synchronization (race conditions & deadlocks):
Process (a single thread) vs. Multithreads
Naming

 Why do we call it a thread anyway?

 Execution flow of a program is not smooth, looks like a thread.


 Execution jumps around (e.g., loops go back and forth) but integrity is intact.
Concurrency vs. Parallelism

 Concurrency: 2 processes or threads run concurrently (are concurrent)


if their flows overlap in time
 Otherwise, they are sequential.
 Examples (running on single core):
 Concurrent: A & B, A & C
 Sequential: B & C

 Parallelism: requires multiple resources to execute multiple processes
or threads at a given time instant (e.g., A & B each running on its own
core are parallel).
Concurrent Programming

 Many programs want to do many things “at once”


 Web browser:
 Download web pages, read cache files, accept user input, ...
 Web server:
 Handle incoming connections from multiple clients at once
 Scientific programs:
 Process different parts of a data set on different CPUs
 In each case, would like to share memory across these activities
 Web browser: Share buffer for HTML page and inlined images
 Web server: Share memory cache of recently-accessed pages
 Scientific programs: Share memory of data set being processed
 Can't we simply do this with multiple processes?
Why processes are not always ideal?

 Processes are not very efficient


 Each process has its own PCB and OS resources, e.g., memory
 Typically high overhead for each process: e.g., 1.7KB per
task_struct on Linux for PCB! And also the address space in physical
memory
 Creating a new process is often very expensive
 Processes don't (directly) share memory
 Each process has its own address space
 Parallel and concurrent programs often want to directly manipulate
the same memory
 e.g., When processing elements of a large array in parallel
 Note: Many OS's provide some form of inter-process shared memory
 e.g., UNIX shmget() and shmat() system calls
 Still, this requires more programmer work and does not address the overhead of processes
Can we do better?

 What can we share across all of these tasks?


 Same code – generally running the same or similar programs
 Same data
 Same privileges
 Same OS resources (files, sockets, etc.)
 What is private to each task?
 Execution state: CPU registers, stack, and program counter
 Key idea of this lecture:
 Separate the concept of a process from a thread of control
 The process is the address space and OS resources
 Each thread has its own CPU execution state
Threads vs. Processes

 Processes form a tree hierarchy.


 Threads form a pool of peers.
 Each thread can kill any other.
 Each thread can wait for any other thread to terminate.
 Main thread: first thread to run in a process (always exist).
[Figure: a process hierarchy (P0 → P1, sh, sh, sh, foo) vs. a pool of peer threads (T1–T5) sharing code, data, and kernel context]
Threads vs. Processes

 Each process has one or more threads “within” it


 Each thread has its own stack, CPU registers, etc.
 All threads within a process share the same address space and OS
resources
 Threads share memory, so they can communicate directly!
 The thread is now the unit of CPU scheduling
 A process is just a “container” for its threads
 Each thread is bound to its containing process
(Old) Process Address Space

0xFFFFFFFF   (reserved for OS)
             Stack                ← stack pointer
             (free space)
             Heap
             Uninitialized vars (BSS segment)
             Initialized vars (data segment)
0x00000000   Code (text segment)  ← program counter
(New) Process Address Space w/ Threads
Implementing Threads

 Break the PCB into two pieces:


 Thread-specific stuff: Processor state
 Process-specific stuff: Address space and OS resources (open files,
etc.)
Thread Control Block (TCB)

 TCB contains info on a single thread


 Just processor state and pointer to corresponding PCB
 PCB contains information on the containing process
 Address space and OS resources ... but NO processor state!
Thread Control Block (TCB)

 TCB's are smaller and cheaper than processes


 Linux TCB (thread_struct) has 24 fields
 Linux PCB (task_struct) has 106 fields

 Context switching threads is cheaper than context switching processes.


Context Switching

 TCB is now the unit of a context switch


 Ready queue, wait queues, etc. now contain pointers to TCB's
 Context switch causes CPU state to be copied to/from the TCB

 Context switch between two threads in the same process:


 No need to change address space
 Context switch between two threads in different processes:
 Must change address space, sometimes invalidating cache
 This will become relevant when we talk about virtual memory
Thread State

 State shared by all threads in process:


 Memory content (global variables, heap, code, etc).
 I/O (files, network connections, etc).
 A change in the global variable will be seen by all other threads (unlike processes).

 State private to each thread:


 Kept in TCB (Thread Control Block).
 CPU registers, program counter.
 Stack (what functions it is calling, parameters, local variables, return addresses).
 Pointer to enclosing process (PCB).
Thread Behavior

 Some useful applications with threads:

 One thread listens to connections; others handle page requests.


 One thread handles GUI; others computations.
 One thread paints the left part, other the right part.
 https://youtu.be/-P28LKWTzrI (not just left and right)
 ..
Thread Behavior

 Single threaded
 main()
computePI(); //never finish
printf(“hi”); //never reach here
A process has a single thread of control: if it blocks on something nothing else can be done.

 Multi-threaded
 main()
createThread( computePI() ); //never finish
createThread( printf(“hi”) ); //reaches here
 main()
createThread( scanf() ); //not finish ‘till user enters
createThread( autoSaveDoc() ); //reaches here while waiting on I/O
Thread Behavior

 Execution flow:
Threads on a Single CPU

 Still possible.
 Multitasking idea
 Share one CPU among many processes (context switch).
 Multithreading idea
 Share the same process among many threads (thread switch).
 Whenever this process gets the opportunity to run on the CPU, the OS can select one
of its many threads to run for a while, and so on.
 One pid, several thread ids.

Schedulable entities increased.


Threads on a Single CPU

 If threads are all CPU-bound, e.g., no I/O or pure math, then we do


not gain much by multithreading.

 Luckily this is usually not the case, e.g., 1 thread does the I/O, ..
 Select your threads carefully, one is I/O-bound, other is CPU-bound, ..

 With multicores, we still gain big even if threads are all CPU-bound.
Single-threaded Process on Multiple CPUs

 Note that even if you have 8 CPUs, a single-threaded process can use
only one of them at a time
 Obviously other processes may utilize the unused CPUs but if there are fewer
processes than the CPU count then the system is underutilized
 If you implement the multithreaded version of the same program, then
all 8 CPUs can serve the same process and make it finish very early
Multithreading Concept

 Multithreading concept: pseudo-parallel runs. (pseudo: interleaving switches on CPU).



funct1() { .. }
funct2() { .. }
main() {
..
createThread( funct1() );
..
createThread( funct2() );
..
createThread( funct1() );
..
}
Single- vs. Multi-threaded Processes

 Shared and private stuff:


Benefits of Threads

 Responsiveness
 One thread blocks, another runs.
 One thread may always wait for the user.
 Resources sharing
 Very easy sharing (use global variables; unlike msg queues, pipes, shmget).
 Be careful about data synchronization though.
 Economy
 Thread creation is fast.
 Context switching among threads may be faster.
 ‘cos you do not have to duplicate code and global variables (unlike processes).
 Scalability
 Multiprocessors can be utilized better.
 Process that has created 8 threads can use all 8 cores (single-threaded proc. utilize 1
core).
Multithreading Example: WWW

 Client (Chrome) requests a page from server (amazon.com).

 Server gives the page name to the thread and resumes listening.
 Thread checks the disk cache in memory; if the page is not there, it does disk I/O;
then sends the page to the client (network I/O).
Threading Support

 User-level threads: are threads that the OS is not aware of. They exist
entirely within a process, and are scheduled to run within that process'
time slices.

 Kernel-level threads: The OS is aware of kernel-level threads. Kernel


threads are scheduled by the OS's scheduling algorithm, and require a
"lightweight" context switch to switch between (that is, registers, PC,
and SP must be changed, but the memory context remains the same
among kernel threads in the same process).
Threading Support

 User-level threads are much faster to switch between, as there is no


context switch; further, a problem-domain-dependent algorithm can be
used to schedule among them. CPU-bound tasks with interdependent
computations, or a task that will switch among threads often, might
best be handled by user-level threads.
Threading Support

 Kernel-level threads are scheduled by the OS, and each thread can be
granted its own time slices by the scheduling algorithm. The kernel
scheduler can thus make intelligent decisions among threads, and
avoid scheduling processes which consist of entirely idle threads (or
I/O bound threads). A task that has multiple threads that are I/O
bound, or that has many threads (and thus will benefit from the
additional time slices that kernel threads will receive) might best be
handled by kernel threads.

 Kernel-level threads require a system call for the switch to occur; user-
level threads do not.
Threading Support

 Thread libraries that provide us API for creating and managing


threads.
 pthreads, java threads, win32 threads.

 Pthreads (POSIX threads) interface.


 Common in Unix operating systems: Solaris, Mac OS, Linux.
 Not implemented in the standard C library; link against the library named pthread
while compiling: gcc -o thread1 thread1.c -lpthread
 Implementation-dependent; can be user- or kernel-level.

 Functions in the pthread library actually issue Linux system calls, e.g.,
pthread_create() → clone()

 See sample codes to warm up on pthreads:


 http://user.ceng.metu.edu.tr/~ys/ceng334-os/threadster0.c also
Pthreads

/* thread1 (main) */
int main(..)
{
    ..
    pthread_create(&tid, …, runner, ..);  /* thread2 starts running runner() */
    pthread_join(tid);                    /* thread1 waits here */
    printf (sum);
}

/* thread2 */
runner (..)
{
    ..
    sum = ..
    pthread_exit();
}
Single- to Multi-thread Conversion

 In a simple world
 Identify functions as parallel activities.
 Run them as separate threads.
 In real world
 Single-threaded programs use global variables, library functions (malloc).
 Be careful with them.
 Global variables are good for easy-communication but need special care.
Single- to Multi-thread Conversion

 Careful with global variable:


Single- to Multi-thread Conversion

 Careful with global variable:


Single- to Multi-thread Conversion

 Global, local, and thread-specific variables.


 thread-specific: global inside the thread, but not for the whole process,
i.e., other threads cannot access it, but all the functions of the thread
can (no problem ‘cos fnctns within a thread executed sequentially).

 Historically no C language support for this variable type (C11 added _Thread_local; GCC has __thread).

 Thread API has special functions to create such variables


(pthread_getspecific).
Single- to Multi-thread Conversion

 Use thread-safe (reentrant, reenterable) library routines.

 Multiple malloc()s are executed sequentially in single-threaded code.

 Now say one thread is suspended inside malloc(); another thread calls
malloc() and re-enters it while the 1st call has not finished.
 Library functions should be designed to be reentrant: able to tolerate
a second call from the same process before the first one has finished.
 To do so, do not use global variables (keep state in the caller).
Thread Issues

 All threads in a process share memory:


[Figure: Thread 0 writes variable foo in the shared address space; Thread 2 later reads it]

 What happens when two threads access the same variable?


 Which value does Thread 2 see when it reads “foo”?
 What does it depend on?
Thread Issues

/* shared */
volatile unsigned int cnt = 0;  /* see Note section below for volatile */

#define NITERS 100000000

/* thread routine */
void *count(void *arg) {
    int i;
    for (i = 0; i < NITERS; i++)
        cnt++;
    return NULL;
}

int main() {
    pthread_t tid1, tid2;
    Pthread_create(&tid1, NULL, count, NULL);
    Pthread_create(&tid2, NULL, count, NULL);
    Pthread_join(tid1, NULL);
    Pthread_join(tid2, NULL);
    if (cnt != (unsigned)NITERS*2)
        printf("BOOM! cnt=%d\n", cnt);
    else
        printf("OK cnt=%d\n", cnt);
}

linux> ./badcnt
BOOM! cnt=198841183
linux> ./badcnt
BOOM! cnt=198261801
linux> ./badcnt
BOOM! cnt=198269672

cnt should be equal to 200,000,000. What went wrong?
Thread Issues

 Assembly code for counter loop:


Thread Issues

 Assembly code for counter loop.


 Unpredictable switches of threads by scheduler will create
inconsistencies on the shared data, e.g., global variable cnt.
 Handling this is one of the most important topics of this course:
Synchronization.
Synchronization

 Synchronize threads / coordinate their activities so that when you
access the shared data (e.g., global variables) you do not run into
trouble.

 Multiple processes sharing a file or shared memory segment also


require synchronization (= critical section handling).
Synchronization

 The part of the process that is accessing and changing shared data is
called its critical section.

Thread 1 Code    Thread 2 Code    Thread 3 Code
Change X         Change X         Change Y
Change Y         Change Y         Change X

Assuming X and Y are shared data.


Synchronization

 Solution: No 2 processes/threads are in their critical section at the


same time, aka Mutual Exclusion (mutex).

 Must assume processes/threads interleave executions arbitrarily


(preemptive scheduling) and at different rates.
 Scheduling is not under application’s control.
 We control coordination using data synchronization.
 We restrict interleaving of executions to ensure consistency.
 Low-level mechanism to do this: locks,
 High-level mechanisms: mutexes, semaphores, monitors, condition variables.
Synchronization

 General way to achieve synchronization:


Synchronization

 An example: race condition.

[Figure: two interleavings of two threads' critical sections: one where the critical section is respected, one where it is not]


Synchronization

 Another example: race condition.


 Assume we had 5 items in the buffer.
 Then
 Assume producer just produced a new item, put it into buffer, and about to do
count++
 Assume consumer just retrieved an item from the buffer, and about to do
count--

[Figure: two possible orderings: the producer's count++ completes last, or the consumer's count-- completes last]
Synchronization

 Another example: race condition.

 Critical region: is where we manipulate count.

 count++ could be implemented as (similarly, count--)


 register1 = count; //read value
register1 += 1; //increase value
count = register1; //write back
Synchronization

 Then (count starts at 5; registers live in the CPU, count in main memory):

PRODUCER (count++)             CONSUMER (count--)
register1 = count              register2 = count
register1 = register1 + 1      register2 = register2 - 1
count = register1              count = register2

Depending on how these instructions interleave, count ends up 4, 5, or 6.
Synchronization

 Another example: race condition.

 2 threads executing their critical section codes 


Balance = 1000TL

Execution sequence as seen by the CPU:
  Thread 1: balance = get_balance(account);
  Thread 1: balance -= amount;              /* local = 900TL */
  Thread 2: balance = get_balance(account);
  Thread 2: balance -= amount;              /* local = 900TL */
  Thread 2: put_balance(account, balance);  /* Balance = 900TL */
  Thread 1: put_balance(account, balance);  /* Balance = 900TL! */

 Although 2 customers withdrew 100TL, balance is 900TL, not 800TL 


Synchronization

 Solution: mutual exclusion.


 Only one thread at a time can execute code in their critical section.
 All other threads are forced to wait on entry.
 When one thread leaves the critical section, another can enter.

[Figure: Thread 1 executing inside the critical section (modify account balance)]
Synchronization

 Solution: mutual exclusion (continued).

[Figure: Thread 2 arrives while Thread 1 is in the critical section; the 2nd thread must wait for the critical section to clear]


Synchronization

 Solution: mutual exclusion (continued).

[Figure: Thread 1 leaves the critical section; the 2nd thread is now free to enter]


Synchronization

 Solution: mutual exclusion.


 pthread library provides us mutex variables to control the critical section
access.
 pthread_mutex_lock(&myMutex)
.. //critical section stuff
pthread_mutex_unlock(&myMutex)
 See this in action here: http://user.ceng.metu.edu.tr/~ys/ceng334-os/threadster1.c
Synchronization

 Critical section requirements.


 Mutual exclusion: at most 1 thread is currently executing in the critical section.

 Progress: if thread T1 is outside the critical section, then T1 cannot prevent T2


from entering the critical section.

 No starvation: if T1 is waiting for the critical section, it’ll eventually enter.


 Assuming threads eventually leave critical sections.

 Performance: the overhead of entering/exiting critical section is small w.r.t.


the work being done within it.
Synchronization

 Solution: Peterson’s solution to mutual exclusion.


 Programming at the application (sw solution; no hw or kernel support).
 Peterson.enter //similar to pthread_mutex_lock(&myMutex)
.. //critical section stuff
Peterson.exit //similar to pthread_mutex_unlock(&myMutex)
 Works for 2 threads/processes (not more).
 Is this solution OK?
 Set global variable lock = 1.
 A thread that wants to enter critical section checks lock == 1.
 If true, enter. Do lock--.
 if false, another thread decremented it so not enter.
Synchronization


 This solution fails 'cos lock itself is a shared global variable: checking and decrementing it is itself an unprotected race.


 Just using a single variable without any other protection is not enough.
 Back to Peterson’s algo..
Synchronization

 Solution: Peterson’s solution to mutual exclusion.


 Programming at the application (sw solution; no hw or kernel support).
 Peterson.enter //similar to pthread_mutex_lock(&myMutex)
.. //critical section stuff
Peterson.exit //similar to pthread_mutex_unlock(&myMutex)
 Works for 2 threads/processes (not more).
 Assume that the LOAD and STORE machine instructions are atomic; that is,
cannot be interrupted.
 The two processes share two variables:
 int turn;
 boolean flag[2];
 The variable turn indicates whose turn it is to enter the critical section.
 turn = i means process Pi can execute (i=0,1).
 The flag array is used to indicate if a process is ready to enter the critical
section. flag[i] = true implies that process Pi is ready (wants to enter).
Synchronization

 Solution: Peterson’s solution to mutual exclusion.


 The variable turn indicates whose turn it is to enter the critical section.
 turn = i means process Pi can execute (i=0,1).
 The flag array is used to indicate if a process is ready to enter the critical
section. flag[i] = true implies that process Pi is ready (wants to enter).
 Algorithm for Pi; the other process is Pj:

[Figure: Pi's entry code: flag[i] = TRUE ('I want to enter'), turn = j ('but be nice to the other process'), then the busy wait: while (flag[j] && turn == j);]
Synchronization

 Solution: Peterson’s solution to mutual exclusion.


PROCESS i (0)                        PROCESS j (1)
do {                                 do {
  flag[i] = TRUE;                      flag[j] = TRUE;
  turn = j;                            turn = i;
  while (flag[j] && turn == j);        while (flag[i] && turn == i);
  /* critical section */               /* critical section */
  flag[i] = FALSE;                     flag[j] = FALSE;
  /* remainder section */              /* remainder section */
} while (1)                          } while (1)

Shared variables: flag[], turn. Local: i = 0, j = 1.
Synchronization

 Solution: hardware support for mutual exclusion.


 Kernel code can disable clock interrupts (context/thread switches).

disable interrupts   /* no switch can occur */
  ... critical section ...
enable interrupts    /* schedulable again */

Synchronization

 Solution: hardware support for mutual exclusion.


 Works for single CPU.
 Multi-CPU fails 'cos you're disabling interrupts only on your own
processor.
 That does not mean other processors do not get interrupts.
 Each processor has its own interrupt mechanism.
 Hence another process/thread running in another processor can touch
the shared data.
 Too inefficient to disable interrupts on all available processors.
Synchronization

 Solution: hardware support for mutual exclusion.


 Another support mechanism: Complex machine instructions from hw that are
atomic (not interruptible).
 Locks (not just simple integers).

do {
acquire lock
critical section
release lock
remainder section
} while (TRUE);
 How to implement acquire/release lock?
 Use special machine instructions: TestAndSet, Swap.
Synchronization

 Solution: hardware support for mutual exclusion.


 Complex machine instruction for hw synch: TestAndSet

 TestAndSet is a machine/assembly instruction.


 You must write the acquire-lock portion (entry section code) of your code in
assembly. But here is a C code for easy understanding:

--Definition of TestAndSet Instruction--

boolean TestAndSet (boolean *target)
{
    boolean rv = *target;
    *target = TRUE;
    return rv;
} //atomic (not interruptible)!!!!!!!!!!!!
Synchronization

 Solution: hardware support for mutual exclusion.


 Complex machine instruction for hw synch: TestAndSet

We use a shared Boolean variable lock, initialized to false.


do {
    while ( TestAndSet (&lock) )
        ;             //do nothing; busy wait      <- entry section

    // critical section

    lock = FALSE;     //release lock               <- exit section

    // remainder section

} while (TRUE);
Synchronization

 Solution: hardware support for mutual exclusion.


 Complex machine instruction for hw synch: TestAndSet

 Can be suspended/interrupted b/w TestAndSet & CMP, but not during TestAndSet.
Synchronization

 Advertisement: Writing assembly in C is a piece of cake.


Synchronization

 Solution: hardware support for mutual exclusion.


 Complex machine instruction for hw synch: Swap

 Swap is a machine/assembly instruction.


 You must write the acquire-lock portion (entry section code) of your code in
assembly. But here is a C code for easy understanding:

--Definition of Swap Instruction--


boolean Swap (boolean* a, boolean* b)

{
boolean temp = *a;
*a = *b;
*b = temp;
} //atomic (not interruptible)!!!!!!!!!!!!
Synchronization

 Solution: hardware support for mutual exclusion.


 Complex machine instruction for hw synch: Swap
We use a shared Boolean variable lock, initialized to false.
Each process also has a local Boolean variable key.
do {
    key = TRUE;
    while (key == TRUE)       // entry section
        Swap (&lock, &key);

    // critical section

    lock = FALSE;             // exit section

    // remainder section

} while (TRUE);
Synchronization

 Solution: hardware support for mutual exclusion.


 A comment on TestAndSet & Swap.
 Although they both guarantee mutual exclusion, they may make one
process (X) wait a lot:
 A process X may be waiting, but we can have the other process Y going into
the critical region repeatedly.
 One toy/bad solution: keep the remainder section code so long that
scheduler kicks Y out of the CPU before it reaches back to the entry section.
Synchronization

 Solution: Semaphores (= shared integer variable).


 Idea: avoid busy waiting: waste of CPU cycles by waiting in a loop
‘till the lock is available, aka spinlock.
 Example1: while (flag[i] && turn == i);//from Peterson’s algo.
 Example2: while (TestAndSet (&lock )); //from TestAndSet algo.
 How to avoid?
 If a process P calls wait() on a semaphore with a value of zero, P is
added to the semaphore’s queue and then blocked.
 The state of P is switched to the waiting state, and control is transferred
to the CPU scheduler, which selects another process to execute
(instead of busy waiting on P).
 When another process increments the semaphore by calling signal()
and there are tasks on the queue, one is taken off of it and resumed.
 wait() = P() = down(). //modify semaphore s via these functions.
 signal() = V() = up(). //modify semaphore s via these functions.
Synchronization

 Solution: Semaphores.
 wait() = P() = down(). //modify semaphore s via these functions.
 signal() = V() = up(). //modify semaphore s via these functions.
 These functions can be implemented in kernel as system calls.
 Kernel makes sure that wait(s) & signal(s) are atomic.

 Less complicated entry & exit sections.


Synchronization

 Solution: Semaphores.
 Operations (kernel codes).
Busy-waiting  vs. Efficient 

 More formally, s->value--; s->list.add(this); etc.


Synchronization

 Solution: Semaphores.
 Operations.
 wait(s):
 if s positive
s-- and return
else
s-- and block/wait (‘till somebody wakes you
up; then return)
Synchronization

 Solution: Semaphores.
 Operations.
 signal(s):
 s++
if there’s 1+ process waiting (new s<=0)
wake one of them up
return
Synchronization

 Solution: Semaphores.
 Types.
 Binary semaphore
 Integer value can range only between 0 and 1; can be simpler to implement;
aka mutex locks.
 Provides mutual exclusion; can be used for the critical section problem.

 Counting semaphore
 Integer value can range over an unrestricted domain.
 Can be used for other synchronization problems; for example for resource
allocation.
 Example: you have 10 instances of a resource. Init semaphore s to 10 in this case.
Synchronization

 Solution: Semaphores.
 Usage.
 An integer variable s that can be shared by N processes/threads.
 s can be modified only by atomic system calls: wait() & signal().
 s has a queue of waiting processes/threads that might be
sleeping on it.
typedef struct {
int value;
struct process *list;
} semaphore;

 Atomic: when process X is executing wait(), Y can execute wait()


if X finished executing wait() or X is blocked in wait().
 When X is executing signal(), Y can execute signal() if X finished.
Synchronization

 Solution: Semaphores.
 Usage.
 Binary semaphores (mutexes) can be used to solve critical
section problems.

 A semaphore variable (lets say mutex) can be shared by N


processes, and initialized to 1.

 Each process is structured as follows:

do {
    wait (mutex);
    // Critical Section
    signal (mutex);
    // remainder section
} while (TRUE);
Synchronization

 Solution: Semaphores.
 Usage.
Process 0 Process 1
do { do {
wait (mutex); wait (mutex);
// Critical Section // Critical Section
signal (mutex); signal (mutex);
// remainder section // remainder section
} while (TRUE); } while (TRUE);

wait() {…} signal() {…}


Kernel
Semaphore mutex; //initialized to 1
Synchronization

 Solution: Semaphores.
 Usage.
 Kernel puts processes/threads waiting on s in a FIFO queue. Why
FIFO?
Synchronization

 Solution: Semaphores.
 Usage other than critical section.
 Ensure S1 definitely executes before S2 (just a synch problem).

P0 P1
… …
S1; S2;
…. ….
Synchronization

 Solution: Semaphores.
 Usage other than critical section.
 Ensure S1 definitely executes before S2 (just a synch problem).

P0            P1
…             …
S1;           S2;
….            ….

Solution via semaphores: Semaphore x = 0; //inited to 0

P0            P1
…             …
S1;           wait (x);
signal (x);   S2;
….            ….
Synchronization

 Solution: Semaphores.
 Usage other than critical section.
 Resource allocation (just another synch problem).
 We have N processes that want a resource that has 5 instances.
 Solution:
Synchronization

 Solution: Semaphores.
 Usage other than critical section.
 Resource allocation (just another synch problem).
 We’ve N processes that want a resource R that has 5 instances.
 Solution:
 Semaphore rs = 5;
 Every process that wants to use R will do wait(rs);
 If some instance is available, rs stays nonnegative after the decrement → no blocking.
 If all 5 instances are used, rs goes negative → block 'till rs is nonnegative again.
 Every process that finishes with R will do signal(rs);
 A blocked process will change state from waiting to ready.
Synchronization

 Solution: Semaphores.
 Usage other than critical section.
 Enforce consumer to sleep while there’s no item in the buffer
(another synch problem).
Producer                          Consumer
do {                              do {
  // produce item                   wait (Full_Cells);  /* instead of
  ..                                busy-waiting, go to sleep and give
  put item into buffer              the CPU back to the producer for
  ..                                faster production (efficiency!) */
  signal (Full_Cells);              ..
} while (TRUE);                     remove item from buffer
                                    ..
                                  } while (TRUE);

Semaphore Full_Cells = 0; //initialized to 0 (kept in the kernel)
Synchronization

 Solution: Semaphores.
Synchronization

 Solution: Semaphores.
 Consumer can never cross the producer curve.
 Difference b/w produced and consumed items can be <= BUFSIZE
Synchronization

 Problems with semaphores: Deadlock and Starvation.


 Deadlock.
 Two or more processes are waiting indefinitely for an event that can be
caused by only one of the waiting processes.
Synchronization

 Problems with semaphores: Deadlock and Starvation.


 Deadlock.
 This code may sometimes (not all the time) cause a deadlock:
 P0 P1
wait(S); wait(Q);
wait(Q); wait(S);
. .
. .
signal(S); signal(Q);
signal(Q); signal(S);

 When does the deadlock occur?


 How to solve?
Synchronization

 Problems with semaphores: Deadlock and Starvation.


 Starvation.
 Indefinite blocking: a process may never be removed from the semaphore
queue in which it is suspended; it'll always be sleeping; no service.
 When does it occur?
 How to solve?

 Another problem:
 Low-priority process may cause high-priority process to wait.
Synchronization

 Classic Synchronization Problems to be solved with Semaphores.


 Bounded-buffer problem.
 Readers-Writers problem.
 Dining philosophers problem.
 Rendezvous problem.
 Barrier problem.
Synchronization

 Classic Synchronization Problems to be solved with Semaphores.


 Bounded-buffer problem (aka Producer-Consumer problem).
 Producer should not produce any item if the buffer is full: Semaphore empty = N; //inited
 Consumer should not consume any item if the buffer is empty: Semaphore full = 0;
 Producer and consumer should access the buffer in a mutually exc manner: mutex = 1;

[Figure: circular buffer with prod and cons pointers; 4 cells full, 6 empty, so full = 4, empty = 6]

 Types of the 3 semaphores above?


Synchronization
95 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Bounded-buffer problem.
 Producer should not produce any item if the buffer is full: Semaphore empty = N; //inited
 Consumer should not consume any item if the buffer is empty: Semaphore full = 0;
 Producer and consumer should access the buffer in a mutually exc manner: mutex = 1;

 Think about the code of this?
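The code asked for above can be sketched with Python's threading module, whose Semaphore.acquire/release correspond to the slides' wait/signal. This is a minimal illustration (the names producer, consumer, and the buffer size 4 are assumptions, not from the slides):

```python
import threading
from collections import deque

N = 4                               # BUFSIZE (assumed)
buffer = deque()
empty = threading.Semaphore(N)      # counts empty cells, inited to N
full = threading.Semaphore(0)       # counts full cells, inited to 0
mutex = threading.Lock()            # protects the buffer itself

def producer(items):
    for item in items:
        empty.acquire()             # block if no empty cell (wait(empty))
        with mutex:
            buffer.append(item)
        full.release()              # one more full cell (signal(full))

consumed = []
def consumer(n):
    for _ in range(n):
        full.acquire()              # block if buffer empty (wait(full))
        with mutex:
            consumed.append(buffer.popleft())
        empty.release()             # one more empty cell (signal(empty))

p = threading.Thread(target=producer, args=(range(10),))
c = threading.Thread(target=consumer, args=(10,))
p.start(); c.start(); p.join(); c.join()
print(consumed)   # [0, 1, ..., 9]; at most 4 items were ever buffered
```

Note the ordering: each side waits on its counting semaphore before taking the mutex, never the other way around.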


Synchronization
96 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Readers-Writers problem.
 A data set is shared among a number of concurrent processes.
 Readers: only read the data set; they do not perform any updates.
 Writers: can both read and write.

 Problem: allow multiple readers to read at the same time. Only one single writer can
access the shared data at the same time (no reader/writer when writer is active).
Synchronization
98 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Readers-Writers problem.
 A data set is shared among a number of concurrent processes.
 Readers: only read the data set; they do not perform any updates.
 Writers: can both read and write.

 Problem: allow multiple readers to read at the same time. Only one single writer can
access the shared data at the same time (no reader/writer when writer is active).

 Integer readcount initialized to 0.


 Number of readers reading the data at the moment.
 Semaphore mutex initialized to 1.
 Protects the readcount variable (multiple readers may try to modify it).
 Semaphore wrt initialized to 1.
 Protects the shared data (either writer or reader(s) should access data at a time).
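With exactly these three shared objects (readcount, mutex, wrt), the reader/writer code can be sketched in Python as follows; the shared data dictionary and the thread counts are illustrative assumptions:

```python
import threading

readcount = 0
mutex = threading.Lock()        # protects readcount
wrt = threading.Semaphore(1)    # protects the shared data

data = {"value": 0}             # the shared data set (assumed)
log = []

def reader():
    global readcount
    with mutex:
        readcount += 1
        if readcount == 1:      # first reader locks writers out
            wrt.acquire()
    log.append(data["value"])   # read the shared data
    with mutex:
        readcount -= 1
        if readcount == 0:      # last reader lets writers back in
            wrt.release()

def writer(v):
    wrt.acquire()               # exclusive access
    data["value"] = v
    wrt.release()

threads = [threading.Thread(target=writer, args=(42,))] + \
          [threading.Thread(target=reader) for _ in range(3)]
for t in threads: t.start()
for t in threads: t.join()
print(data["value"])   # 42
```

Only the first and last reader touch wrt, which is exactly the "something special" hinted at on the next slide.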
Synchronization
99 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Readers-Writers problem.
 A data set is shared among a number of concurrent processes.
 Readers: only read the data set; they do not perform any updates.
 Writers: can both read and write.

 Problem: allow multiple readers to read at the same time. Only one single writer can
access the shared data at the same time (no reader/writer when writer is active).

 Think about the code of this?


 Reader and writer processes running in (pseudo) parallel.
 Hint: first and last reader should do something special.
Synchronization
100 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Readers-Writers problem.
 A data set is shared among a number of concurrent processes.
 Readers: only read the data set; they do not perform any updates.
 Writers: can both read and write.

//acquire lock to shared data.

//release lock of shared data.


Synchronization
101 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Readers-Writers problem.

 Case1: First reader acquired the lock, reading, what happens if writer arrives?
 Case2: First reader acquired the lock, reading, what happens if reader2 arrives?
 Case3: Writer acquired the lock, writing, what happens if reader1 arrives?
 Case4: Writer acquired the lock, writing, what happens if reader2 arrives?
Synchronization
102 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Dining philosophers problem.
 A philosopher (process) needs 2 forks (resources) to eat.
 While a philosopher is holding a fork, its neighbor cannot have it.
Synchronization
103 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Dining philosophers problem.
 A philosopher (process) needs 2 forks (resources) to eat.
 While a philosopher is holding a fork, its neighbors cannot have it.



Synchronization
107 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Dining philosophers problem.
 A philosopher (process) needs 2 forks (resources) to eat.
 While a philosopher is holding a fork, its neighbors cannot have it.

 Philosopher in 2 states: eating (needs forks) and thinking (not need forks).

 We want parallelism, e.g., 4 or 5 (not 1 or 3) can be eating while 2 is eating.


 We don’t want deadlock: waiting for each other indefinitely.
 We don’t want starvation: no philosopher waits forever (starves to death).
Synchronization
108 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Dining philosophers problem.
 A philosopher (process) needs 2 forks (resources) to eat.
 While a philosopher is holding a fork, its neighbors cannot have it.

 A solution that provides concurrency but not deadlock prevention:

Semaphore forks[5]; //inited to 1 (assume 5 philosophers on table).


do {
wait( forks[i] );
wait( forks[ (i + 1) % 5] );

// eat

signal( forks[i] );
signal( forks[ (i + 1) % 5] );

// think
} while(TRUE);
Synchronization
109 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Dining philosophers problem.
 A philosopher (process) needs 2 forks (resources) to eat.
 While a philosopher is holding a fork, its neighbors cannot have it.

 A solution that provides concurrency but not deadlock prevention:


 How is deadlock possible?
Synchronization
110 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Dining philosophers problem.
 A philosopher (process) needs 2 forks (resources) to eat.
 While a philosopher is holding a fork, its neighbors cannot have it.

 A solution that provides concurrency but not deadlock prevention:


 How is deadlock possible?

 Deadlock in a circular fashion: 4 gets the left fork, context switch (cs), 3 gets the left fork,
cs, .. , 0 gets the left fork, cs, 4 now wants the right fork which is held by 0 forever.
Unlucky sequence of cs’s not likely but possible.
 A correct solution w/o deadlock danger is again possible with semaphores.

 Solution #1: put the left back if you cannot grab right.
 Solution #2: grab both forks at once (atomic).
Synchronization
111 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Rendezvous problem.
 2 threads rendezvous at a point of execution, and neither is allowed to proceed until both
arrived.

 Guarantee that a1 happens before b2 and b1 happens before a2.


 Solution?
Synchronization
112 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Rendezvous problem.
 2 threads rendezvous at a point of execution, and neither is allowed to proceed until both
arrived.

 Guarantee that a1 happens before b2 and b1 happens before a2.


 Solution: initially aArrived = bArrived = 0; //arrived at the rendezvous (line 2).
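The aArrived/bArrived solution (code on the slide image) follows one standard arrangement: each thread signals its own arrival, then waits for the other's. A runnable Python sketch (a1/a2/b1/b2 modeled as trace entries):

```python
import threading

aArrived = threading.Semaphore(0)   # A has reached the rendezvous
bArrived = threading.Semaphore(0)   # B has reached the rendezvous
trace = []

def thread_a():
    trace.append("a1")
    aArrived.release()   # signal: A arrived
    bArrived.acquire()   # wait for B
    trace.append("a2")

def thread_b():
    trace.append("b1")
    bArrived.release()   # signal: B arrived
    aArrived.acquire()   # wait for A
    trace.append("b2")

ta = threading.Thread(target=thread_a)
tb = threading.Thread(target=thread_b)
ta.start(); tb.start(); ta.join(); tb.join()
# In every interleaving: a1 before b2, and b1 before a2.
print(trace)
```

Swapping each thread's signal and wait (wait first, then signal) gives the deadlocking variant discussed two slides ahead.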
Synchronization
113 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Rendezvous problem.
 2 threads rendezvous at a point of execution, and neither is allowed to proceed until both
arrived.

 Guarantee that a1 happens before b2 and b1 happens before a2.


 Solution: initially aArrived = bArrived = 0; //arrived at the rendezvous (line 2).

 Less efficient: might have to switch b/w A and B one more time than necessary.
 A arrives first.
Synchronization
114 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Rendezvous problem.
 2 threads rendezvous at a point of execution, and neither is allowed to proceed until both
arrived.

 Guarantee that a1 happens before b2 and b1 happens before a2.


 Solution: initially aArrived = bArrived = 0; //arrived at the rendezvous (line 2).

 Any problem?
Synchronization
115 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Rendezvous problem.
 2 threads rendezvous at a point of execution, and neither is allowed to proceed until both
arrived.

 Guarantee that a1 happens before b2 and b1 happens before a2.


 Solution: initially aArrived = bArrived = 0; //arrived at the rendezvous (line 2).

 Any problem? Yes, deadlock!


Synchronization
116 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Barrier problem.
 Generalization of Rendezvous problem to more than 2 threads.

 That is, no thread executes critical point until after all threads have executed rendezvous.
 That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
 Solution?
Synchronization
117 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Barrier problem.
 Generalization of Rendezvous problem to more than 2 threads.

 That is, no thread executes critical point until after all threads have executed rendezvous.
 That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
 Solution: semaphore mutex = 1, barrier = 0; int n = 5, count = 0;
Synchronization
118 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Barrier problem.
 Generalization of Rendezvous problem to more than 2 threads.

 That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
 Solution: semaphore mutex = 1, barrier = 0; int n = 5, count = 0;

 First n-1 threads wait when they get to the barrier. nth thread unlocks the barrier.
Synchronization
119 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Barrier problem.
 Generalization of Rendezvous problem to more than 2 threads.

 That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
 Solution: semaphore mutex = 1, barrier = 0; int n = 5, count = 0;

 Problem?
Synchronization
120 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Barrier problem.
 Generalization of Rendezvous problem to more than 2 threads.

 That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
 Solution: semaphore mutex = 1, barrier = 0; int n = 5, count = 0;

 Problem: deadlock! nth thread signals 1 of the waiting threads. No one signals again.
Synchronization
121 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Barrier problem.
 Generalization of Rendezvous problem to more than 2 threads.

 That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
 Solution: semaphore mutex = 1, barrier = 0; int n = 5, count = 0;

 Correct solution! No deadlock.
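The correct solution (code on the slide image) can be sketched in Python with the exact initialization above. The key moves: release the mutex before blocking on the barrier, and have each woken thread pass the signal on (a "turnstile"), so the nth thread's single signal cascades to all waiters:

```python
import threading

n = 5
count = 0
mutex = threading.Semaphore(1)
barrier = threading.Semaphore(0)
passed = []

def worker(i):
    global count
    mutex.acquire()
    count += 1
    if count == n:
        barrier.release()   # nth thread opens the barrier
    mutex.release()         # release mutex BEFORE blocking (avoids the earlier deadlock)
    barrier.acquire()       # turnstile: wait our turn ...
    barrier.release()       # ... then let the next waiter through
    passed.append(i)

ts = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
for t in ts: t.start()
for t in ts: t.join()
print(len(passed))   # 5
```

This is a one-shot barrier; making it reusable needs a second turnstile (not shown here).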


Synchronization
122 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Barrier problem.
 Generalization of Rendezvous problem to more than 2 threads.

 That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
 Solution: semaphore mutex = 1, barrier = 0; int n = 5, count = 0;

 Problem: deadlock! 1st thread blocks. Since mutex is locked no one can do count++.
Synchronization
123 / 144

 Classic Synchronization Problems to be solved with Semaphores.


 Barrier problem.
 Generalization of Rendezvous problem to more than 2 threads.

 That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
 Solution: semaphore mutex = 1, barrier = 0; int n = 5, count = 0;

 Common deadlock source: blocking on a semaphore while holding mutex.


Synchronization
124 / 144

 Problems with semaphores.


 Careless programmer may do
 signal(mutex); .. wait(mutex); //2+ threads in critical region (unprotected).
 wait(mutex); .. wait(mutex); //deadlock (indefinite waiting).
 Forgetting corresponding wait(mutex) or signal(mutex); //unprotected & deadlock

 Need something else, something better, something easier to use:


 Monitors.
Synchronization
125 / 144

 Solution: Monitors.
 Idea: get help not from the OS but from the programming language.
 High-level abstraction for process/thread synchronization.
 C does not provide monitors (use semaphores) but Java does.
 Compiler ensures that the critical regions of your code are protected.
 You just identify the critical sections of the code, put them into a monitor, and the
compiler inserts the protection code.
 Monitor implementation using semaphores.
 Compiler writer/language developer has to worry about this stuff, not the
casual application programmer.
Synchronization
126 / 144

 Solution: Monitors.
 Monitor is a construct in the language, like class construct:
monitor monitor-name {
// shared variable declarations

procedure P1 (..) { .. }
..
procedure Pn (..) { .. }

Initialization code (..) { .. }


..
}

 monitor construct guarantees that only one process may be active


within the monitor at a time.
Synchronization
127 / 144

 Solution: Monitors.
 monitor construct guarantees that only one process may be active
within the monitor at a time.
 This means that, if a process is running inside the monitor (= running
a procedure, say P1()), then no other process can be active inside the
monitor (= can run P1() or any other procedure of the monitor) at the
same time.

 Compiler is putting some locks/semaphores to the beginning/ending


of these critical regions (procedures, shared variables, etc.).
 So it is not the programmer’s job anymore to insert these locks/semaphores.
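In a language without a monitor construct, this effect can be imitated by hand: one internal lock that every procedure of the class acquires. A Python sketch (the Counter class and its method are invented for illustration):

```python
import threading

class Counter:
    """Monitor-style class: one internal lock; every 'procedure'
    holds it, so at most one thread is active inside at a time."""
    def __init__(self):
        self._lock = threading.Lock()   # what a monitor-aware compiler would insert
        self.value = 0                  # shared variable of the monitor

    def increment(self):
        with self._lock:                # entry/exit protection of the procedure
            self.value += 1

c = Counter()
ts = [threading.Thread(target=lambda: [c.increment() for _ in range(1000)])
      for _ in range(4)]
for t in ts: t.start()
for t in ts: t.join()
print(c.value)   # 4000, never lost updates
```

With a real monitor the programmer writes only the body of increment(); the lock handling is generated.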
Synchronization
128 / 144

 Solution: Monitors.
 Schematic view of a monitor.

 All other processes that want to be active in the monitor (execute a


monitor procedure) must wait in the entry queue until the currently active
process leaves.
Synchronization
129 / 144

 Solution: Monitors.
 Schematic view of a monitor.

 This monitor solution solves the critical section (mutual exc.) problem.
 But not the other synchronization problems such as producer-consumer and
dining philosophers.
Synchronization
130 / 144

 Solution: Monitors.
 Condition variables to solve all the synchronization problems.
 In the previous model, there was no way to force a process/thread to wait until a
condition holds.
 Now we can 
 Using condition variables.
 condition x, y;

 Two operations on a condition variable:


 x.wait (): a process that invokes the operation is suspended.
 Execute wait() operation on the condition variable x.
 x.signal(): resumes one of processes (if any) that invoked x.wait().
 Usually the first one that blocked is woken up (FIFO).
Synchronization
131 / 144

 Solution: Monitors.
 condition x, y;

 wait(Semaphore s); //you may or may not block depending on s.value


 x.wait () //you (= process) definitely block.

 No integer is attached to x (unlike s.value).


Synchronization
132 / 144

 Solution: Monitors.
 Schematic view of a monitor w/ condition variables.

 If currently active process wants to wait (e.g., empty buffer), it calls


x.wait() and added to the queue of x, and it is no longer active.
Synchronization
133 / 144

 Solution: Monitors.
 Schematic view of a monitor w/ condition variables.

 New active process in the monitor (fetched from the entry queue),
does x.signal() from a different/same procedure. Prev. process
resumes from where it got blocked.
Synchronization
134 / 144

 Solution: Monitors.
 Schematic view of a monitor w/ condition variables.

 Now we may have 2 active processes: the caller of x.signal() & the woken-up one.


 Solution: put x.signal() as the last statement in the procedure.
Synchronization
135 / 144

 Solution: Monitors.
 Schematic view of a monitor w/ condition variables.

 Now we may have 2 active processes: the caller of x.signal() & the woken-up one.


 Solution: call x.wait() right after x.signal() to block the caller.
Synchronization
136 / 144

 Solution: Monitors.
 An example: We have 5 instances of a resource and N processes.
 Only 5 processes can use the resource simultaneously.
 Process code Monitor code
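The process and monitor code for this example are shown as images on the slide; a possible monitor, sketched in Python with a condition variable (class and method names are assumptions):

```python
import threading

class ResourceMonitor:
    """Monitor granting up to 5 simultaneous users of a resource."""
    def __init__(self, instances=5):
        self.available = instances
        self.cond = threading.Condition()   # monitor lock + condition variable

    def acquire(self):
        with self.cond:                     # enter the monitor
            while self.available == 0:      # wait until an instance is free
                self.cond.wait()
            self.available -= 1

    def release(self):
        with self.cond:
            self.available += 1
            self.cond.notify()              # wake one waiting process

m = ResourceMonitor()

def user():
    m.acquire()
    # ... use one of the 5 resource instances ...
    m.release()

ts = [threading.Thread(target=user) for _ in range(20)]
for t in ts: t.start()
for t in ts: t.join()
print(m.available)   # 5: all instances returned
```

Functionally this monitor behaves like a counting semaphore initialized to 5, built from mutual exclusion plus one condition variable.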
Synchronization
137 / 144

Solution: Monitors.
 An example: Dining philosophers.
monitor DP {
    enum {THINKING, //not holding/wanting resources
          HUNGRY,   //not holding but wanting
          EATING}   //has the resources
        state[5];
    condition cond[5]; //each philosopher may need to wait
                       //(no fork to eat), so need 5 condition variables

    //no need for entry/exit code to pickup() 'cos it's in a monitor
    void pickup(int i) {
        state[i] = HUNGRY;
        test(i);
        if (state[i] != EATING)
            cond[i].wait();
    }

    void putdown(int i) {
        state[i] = THINKING;
        //test left and right neighbors
        test((i + 4) % 5);
        test((i + 1) % 5);
    }

    void test(int i) {
        if ((state[(i + 4) % 5] != EATING) &&
            (state[(i + 1) % 5] != EATING) &&
            (state[i] == HUNGRY)) {
            state[i] = EATING;
            cond[i].signal();
        }
    }

    initialization_code() { //initially all thinking
        for (int i = 0; i < 5; i++)
            state[i] = THINKING;
    }
} /* end of monitor */
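The same monitor translates almost line for line into Python, using one lock as the monitor lock and one Condition per philosopher. One caveat (an assumption of this sketch, not on the slide): Python conditions have Mesa semantics, so pickup() re-checks its state in a while loop instead of the slide's if:

```python
import threading

THINKING, HUNGRY, EATING = 0, 1, 2
N = 5

class DP:
    def __init__(self):
        self.lock = threading.Lock()                # the monitor lock
        self.state = [THINKING] * N
        self.cond = [threading.Condition(self.lock) for _ in range(N)]

    def _test(self, i):
        # i may eat only if hungry and neither neighbor is eating
        if (self.state[(i + 4) % N] != EATING and
                self.state[(i + 1) % N] != EATING and
                self.state[i] == HUNGRY):
            self.state[i] = EATING
            self.cond[i].notify()

    def pickup(self, i):
        with self.lock:
            self.state[i] = HUNGRY
            self._test(i)
            while self.state[i] != EATING:          # Mesa: re-check after wakeup
                self.cond[i].wait()

    def putdown(self, i):
        with self.lock:
            self.state[i] = THINKING
            self._test((i + 4) % N)                 # maybe left neighbor can eat now
            self._test((i + 1) % N)                 # maybe right neighbor can eat now

dp = DP()
meals = [0] * N

def phil(i):
    for _ in range(3):
        dp.pickup(i)
        meals[i] += 1      # eat
        dp.putdown(i)

ts = [threading.Thread(target=phil, args=(i,)) for i in range(N)]
for t in ts: t.start()
for t in ts: t.join()
print(meals)   # [3, 3, 3, 3, 3]
```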
Synchronization
138 / 144

 Solution: Monitors.
 One philosopher/process i doing this in an endless loop:

..
DP DiningPhilosophers;
..
while (1) {
    //THINK..

    DiningPhilosophers.pickup(i);

    //EAT (use resources)

    DiningPhilosophers.putdown(i);

    //THINK..
}
Synchronization
139 / 144

 Solution: Monitors.
 First things first: what are the IDs to access the neighbors?

#define LEFT  ?
#define RIGHT ?

state[LEFT] = ?    state[i] = ?    state[RIGHT] = ?
//processes .., i, ..; each in state THINKING, HUNGRY, or EATING
Synchronization
140 / 144

 Solution: Monitors.
 General idea.

#define LEFT  (i+4)%5
#define RIGHT (i+1)%5

state[LEFT] = ?    state[i] = ?    state[RIGHT] = ?
//processes (i+4)%5, i, (i+1)%5; each in state THINKING, HUNGRY, or EATING

test((i+4)%5)    test(i)    test((i+1)%5)
Synchronization
141 / 144

 Solution: Monitors.
 An example: Allocate a resource to one of the several processes.
 Priority-based: The process that will use the resource for the shortest
amount of time (known) will get the resource first if there are other
processes that want the resource.

Processes or Threads
.. that want to use the resource

Resource
Synchronization
142 / 144

 Solution: Monitors.
 An example: Allocate a resource to one of the several processes.
 Assume we have condition variable implementation that can enqueue
sleeping/waiting processes w.r.t. a priority specified as a parameter to
wait() call.
 condition x;
 x.wait (priority);

Queue of sleeping processes waiting on condition x (sorted by priority):

x → 10 → 20 → 45 → 70

priority could be the time-duration to use the resource.


Synchronization
143 / 144

 Solution: Monitors.
 An example: Allocate a resource to one of the several processes.
monitor ResourceAllocator
{
    boolean busy;  //true if resource is currently in use/allocated
    condition x;   //sleep the process that cannot acquire the resource

    void acquire(int time) {
        if (busy)
            x.wait(time);
        busy = TRUE;
    }
    void release() {
        busy = FALSE;
        x.signal(); //wakeup the process at the head of the waiting queue
    }
    initialization_code() {
        busy = FALSE;
    }
}
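Python's Condition has no priority parameter, so a runnable sketch of this monitor has to emulate x.wait(time) itself, e.g. with a min-heap of per-thread Events popped in shortest-time-first order (all names and the hand-off scheme are assumptions of this sketch):

```python
import threading, heapq, time

class ResourceAllocator:
    """Monitor allocating one resource; shortest declared usage time first."""
    def __init__(self):
        self.lock = threading.Lock()   # the monitor lock
        self.busy = False
        self.queue = []                # (time, seq, event) min-heap = priority wait queue
        self.seq = 0                   # tie-breaker for equal times

    def acquire(self, t):
        self.lock.acquire()
        if self.busy:
            ev = threading.Event()
            heapq.heappush(self.queue, (t, self.seq, ev))
            self.seq += 1
            self.lock.release()
            ev.wait()                  # sleep until release() hands us the resource
        else:
            self.busy = True
            self.lock.release()

    def release(self):
        with self.lock:
            if self.queue:
                _, _, ev = heapq.heappop(self.queue)  # shortest requested time wins
                ev.set()               # resource stays busy, handed over directly
            else:
                self.busy = False

ra = ResourceAllocator()
order = []

ra.acquire(10)                         # main thread holds the resource
ts = [threading.Thread(target=lambda t=t: (ra.acquire(t), order.append(t), ra.release()))
      for t in (30, 5, 20)]
for t in ts: t.start()
time.sleep(0.3)                        # let all three block in the priority queue
ra.release()
for t in ts: t.join()
print(order)   # [5, 20, 30]: shortest declared time served first
```

The direct hand-off in release() mirrors the slide's Hoare-style if (busy) x.wait(time): a woken process does not re-check busy because the resource is passed to it while still marked busy.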
Synchronization
144 / 144

 Solution: Monitors.
 An example: Allocate a resource to one of the several processes.

Process/Thread 1 Process/Thread 2 Process/Thread N

ResourceAllocator RA; ResourceAllocator RA; ResourceAllocator RA;

RA.acquire(10); RA.acquire(30); RA.acquire(25);

..use resource.. ..use resource.. .. ..use resource..

RA.release(); RA.release(); RA.release();

Each process should use resource between acquire() and release() calls.
