Threads
Concurrent work within the same process, for efficiency.
Several activities going on as part of the same process.
Threads share memory and the other resources of the process, but each thread has its own registers and stack.
Be careful with data synchronization (race conditions & deadlocks).
A process (a single thread) vs. multiple threads.
Naming
(Figure: naming example with threads T5 and T3 and function foo.)
Threads vs. Processes
(Figure: Thread 1 and Thread 2 running within the same process.)
(Old) Process Address Space
The address space:

0xFFFFFFFF   (reserved for OS)
             Stack                 <- stack pointer
             ...
             Heap
             Uninitialized vars (BSS segment)
             Initialized vars (data segment)
0x00000000   Code (text segment)   <- program counter
(New) Process Address Space w/ Threads
Same layout, but each thread now gets its own stack (plus its own program counter and registers); the code, data, and heap are shared by all threads.
Implementing Threads
Single-threaded:

main()
{
    computePI();    // never finishes
    printf("hi");   // never reached
}

A process has a single thread of control: if it blocks on something, nothing else can be done.

Multi-threaded:

main()
{
    createThread( computePI() );   // never finishes
    createThread( printf("hi") );  // reached
}

main()
{
    createThread( scanf() );        // does not finish until the user enters input
    createThread( autoSaveDoc() );  // reached while scanf() waits on I/O
}

A runnable pthread version of this idea is sketched below.
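For concreteness, a minimal runnable pthread sketch of the multi-threaded case (an infinite loop stands in for computePI(); compile with gcc -pthread):

#include <pthread.h>
#include <stdio.h>

void *compute_pi(void *arg) {    /* stands in for computePI() */
    for (;;)
        ;                        /* never finishes */
    return NULL;
}

int main(void) {
    pthread_t tid;
    pthread_create(&tid, NULL, compute_pi, NULL);
    printf("hi\n");              /* reached: main() is not blocked by the thread */
    return 0;                    /* returning from main() ends all threads */
}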
Thread Behavior
Execution flow: (figure: the interleaved execution flow of the threads.)
Threads on a Single CPU
Running multiple threads on a single CPU is still possible.
Multitasking idea: share one CPU among many processes (context switch).
Multithreading idea: share the same process among many threads (thread switch).
Whenever this process gets the opportunity to run on the CPU, the OS can select one of its many threads and run it for a while, and so on.
One pid, several thread ids.
If every thread were CPU-bound, a single CPU would gain nothing; luckily this is usually not the case, e.g., 1 thread does the I/O, ..
Select your threads carefully: one is I/O-bound, another is CPU-bound, ..
With multicores, we still gain big even if the threads are all CPU-bound.
Single-threaded Process on Multiple CPUs
Note that even if you have 8 CPUs, a single-threaded process can use only one of them at a time.
Other processes may utilize the unused CPUs, but if there are fewer runnable processes than CPUs, the system is underutilized.
If you implement a multithreaded version of the same program, then all 8 CPUs can serve the same process and make it finish much earlier.
Multithreading Concept
Responsiveness
One thread blocks, another runs.
One thread may always wait for the user.
Resource sharing
Very easy sharing (use global variables; unlike message queues, pipes, shmget).
Be careful about data synchronization though.
Economy
Thread creation is fast.
Context switching among threads may be faster.
Because you do not have to duplicate code and global variables (unlike processes).
Scalability
Multiprocessors can be utilized better.
A process that has created 8 threads can use all 8 cores (a single-threaded process utilizes only 1 core).
Multithreading Example: WWW
The server hands the requested page name to a thread and resumes listening.
The thread checks the in-memory disk cache; if the page is not there, it does disk I/O; then it sends the page to the client (network I/O). A sketch of this dispatch loop follows.
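A sketch of that loop; accept_request(), in_cache(), read_from_disk(), and send_page() are hypothetical stubs standing in for the real server code:

#include <pthread.h>
#include <unistd.h>

/* Hypothetical stubs, not a real API: */
long accept_request(void)     { sleep(1); return 42; }   /* next request */
int  in_cache(long req)       { return 0; }              /* cache lookup */
void read_from_disk(long req) { }                        /* disk I/O     */
void send_page(long req)      { }                        /* network I/O  */

void *worker(void *arg) {
    long req = (long)arg;
    if (!in_cache(req))         /* check the in-memory disk cache */
        read_from_disk(req);    /* miss: do the disk I/O          */
    send_page(req);             /* send the page (network I/O)    */
    return NULL;
}

int main(void) {
    for (;;) {
        long req = accept_request();   /* hand off, resume listening */
        pthread_t tid;
        pthread_create(&tid, NULL, worker, (void *)req);
        pthread_detach(tid);           /* no join needed */
    }
}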
Threading Support
User-level threads are threads that the OS is not aware of. They exist entirely within a process and are scheduled to run within that process's time slices.
Kernel-level threads are scheduled by the OS, and each thread can be granted its own time slices by the scheduling algorithm. The kernel scheduler can thus make intelligent decisions among threads and avoid scheduling processes that consist entirely of idle threads (or I/O-bound threads). A task that has multiple I/O-bound threads, or that has many threads (and thus will benefit from the additional time slices that kernel threads receive), might best be handled by kernel threads.
Kernel-level threads require a system call for a switch to occur; user-level threads do not.
Threading Support
Functions in the pthread library actually perform Linux system calls, e.g., pthread_create() calls clone().
Example: thread1 (main) creates thread2 (runner) and waits for it.

int sum;   /* shared */

void *runner(void *param)
{
    ..
    sum = ..;
    pthread_exit(0);
}

int main(..)
{
    pthread_t tid;
    ..
    pthread_create(&tid, .., runner, ..);
    pthread_join(tid, NULL);    /* wait */
    printf("%d\n", sum);
}
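A runnable completion of this sketch; summing 0..n is an assumed filler for the elided ".." parts (compile with gcc -pthread):

#include <pthread.h>
#include <stdio.h>

int sum = 0;   /* shared between main() and the runner thread */

void *runner(void *param) {
    int n = *(int *)param;
    for (int i = 0; i <= n; i++)
        sum += i;
    pthread_exit(0);
}

int main(void) {
    pthread_t tid;
    int n = 10;
    pthread_create(&tid, NULL, runner, &n);
    pthread_join(tid, NULL);        /* wait until runner finishes */
    printf("sum = %d\n", sum);      /* safe: runner is done */
    return 0;
}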
Single- to Multi-thread Conversion
In a simple world:
Identify functions as parallel activities.
Run them as separate threads.
In the real world:
Single-threaded programs use global variables and library functions (malloc).
Be careful with them.
Global variables are good for easy communication but need special care.
Single- to Multi-thread Conversion
#define NITERS 100000000

/* shared */
volatile unsigned int cnt = 0;   /* see the Note section below for volatile */

/* thread routine */
void *count(void *arg)
{
    int i;
    for (i = 0; i < NITERS; i++)
        cnt++;
    return NULL;
}

int main()
{
    pthread_t tid1, tid2;
    Pthread_create(&tid1, NULL, count, NULL);
    Pthread_create(&tid2, NULL, count, NULL);
    Pthread_join(tid1, NULL);
    Pthread_join(tid2, NULL);
    if (cnt != (unsigned)NITERS*2)
        printf("BOOM! cnt=%d\n", cnt);
    else
        printf("OK cnt=%d\n", cnt);
}

linux> ./badcnt
BOOM! cnt=198841183
linux> ./badcnt
BOOM! cnt=198261801
linux> ./badcnt
BOOM! cnt=198269672

cnt should be equal to 200,000,000. What went wrong?
Thread Issues
The part of the process that is accessing and changing shared data is
called its critical section.
(Figure: two threads each change shared variables X and Y; the code that changes X is a critical section for X, and likewise for Y. A producer and a consumer form the same situation.)
Synchronization
count++ and count-- are not atomic; each compiles to three machine instructions that go through a CPU register:

PRODUCER (count++)            CONSUMER (count--)
register1 = count             register2 = count
register1 = register1 + 1     register2 = register2 - 1
count = register1             count = register2

Starting with count = 5 in main memory, an unlucky interleaving of these instructions can leave count at 4 or at 6 instead of 5.
Synchronization
(Figure: Thread 1 enters its critical section to modify the account balance. Thread 2 arrives while Thread 1 is inside: the 2nd thread must wait. When Thread 1 leaves, Thread 2 enters its own critical section.)
Busy wait: the waiting thread keeps checking in a loop until it can enter.
Synchronization
Shared variables: flag[] and turn. The indices i = 0 and j = 1 are local to each thread.
Synchronization
do {
    acquire lock
        critical section
    release lock
        remainder section
} while (TRUE);
How to implement acquire/release lock?
Use special machine instructions: TestAndSet, Swap.
Synchronization
boolean TestAndSet(boolean *target)
{
    boolean rv = *target;
    *target = TRUE;
    return rv;
} // atomic (not interruptible)!
Synchronization
Shared: boolean lock = FALSE;

do {
    while ( TestAndSet(&lock) )
        ;   // busy wait
    // critical section
    lock = FALSE;
    // remainder section
} while (TRUE);
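In portable C, the hardware TestAndSet is available through C11 atomics; a minimal spinlock sketch built on atomic_flag:

#include <stdatomic.h>

atomic_flag lock = ATOMIC_FLAG_INIT;     /* clear = FALSE = unlocked */

void acquire(void) {
    /* atomic_flag_test_and_set returns the old value and sets the
       flag to TRUE, atomically: exactly the TestAndSet above */
    while (atomic_flag_test_and_set(&lock))
        ;                                /* busy wait */
}

void release(void) {
    atomic_flag_clear(&lock);            /* lock = FALSE */
}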
Synchronization
A thread can be suspended/interrupted between the TestAndSet and the subsequent compare (CMP), but not during the TestAndSet itself.
Synchronization
void Swap(boolean *a, boolean *b)
{
    boolean temp = *a;
    *a = *b;
    *b = temp;
} // atomic (not interruptible)!
Synchronization
Shared: boolean lock = FALSE; each thread has a local boolean key.

do {
    key = TRUE;
    while (key == TRUE)
        Swap(&lock, &key);   // spins until lock was FALSE
    // critical section
    lock = FALSE;
    // remainder section
} while (TRUE);
Synchronization
Solution: Semaphores.
wait(s) = P(s) = down(s) and signal(s) = V(s) = up(s); the semaphore s is modified only via these functions.
These functions can be implemented in the kernel as system calls.
The kernel makes sure that wait(s) & signal(s) are atomic.
Operations (kernel code); they can be implemented with busy-waiting or, more efficiently, by blocking.

wait(s):
    if s is positive
        s-- and return
    else
        s-- and block/wait ('till somebody wakes you up; then return)
Synchronization
Solution: Semaphores.
Operations.

signal(s):
    s++
    if there is 1+ process waiting (i.e., new s <= 0)
        wake one of them up
    return
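POSIX semaphores provide exactly these operations: sem_wait() is wait/P/down and sem_post() is signal/V/up. A minimal usage sketch:

#include <semaphore.h>
#include <stdio.h>

int main(void) {
    sem_t s;
    sem_init(&s, 0, 1);   /* 2nd arg 0: shared by threads of this process */
    sem_wait(&s);         /* wait(s): decrements, blocks if s would drop below 0 */
    printf("inside\n");
    sem_post(&s);         /* signal(s): increments, wakes one waiter if any */
    sem_destroy(&s);
    return 0;
}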
Synchronization
Solution: Semaphores.
Types.
Binary semaphore
Integer value can range only between 0 and 1; can be simpler to implement;
aka mutex locks.
Provides mutual exclusion; can be used for the critical section problem.
Counting semaphore
Integer value can range over an unrestricted domain.
Can be used for other synchronization problems; for example for resource
allocation.
Example: you have 10 instances of a resource. Init semaphore s to 10 in this case.
Synchronization
Solution: Semaphores.
Usage.
An integer variable s that can be shared by N processes/threads.
s can be modified only by the atomic system calls wait() & signal().
s has a queue of waiting processes/threads that might be sleeping on it.

typedef struct {
    int value;
    struct process *list;   /* waiting processes */
} semaphore;
Solution: Semaphores.
Usage.
Binary semaphores (mutexes) can be used to solve critical
section problems.
Solution: Semaphores.
Usage.
Process 0 and Process 1 each run the same loop (mutex is a binary semaphore initialized to 1):

do {
    wait(mutex);
    // Critical Section
    signal(mutex);
    // remainder section
} while (TRUE);
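Applied to the earlier badcnt.c, guarding cnt++ this way (sketched here with a pthread mutex, which acts as a binary semaphore) yields OK cnt=200000000, at the cost of serializing the increments:

#include <pthread.h>

#define NITERS 100000000

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
volatile unsigned int cnt = 0;

void *count(void *arg) {
    for (int i = 0; i < NITERS; i++) {
        pthread_mutex_lock(&m);     /* wait(mutex)      */
        cnt++;                      /* critical section */
        pthread_mutex_unlock(&m);   /* signal(mutex)    */
    }
    return NULL;
}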
Solution: Semaphores.
Usage.
The kernel puts the processes/threads waiting on s in a FIFO queue. Why FIFO? Fairness: every waiter is eventually woken, so no one starves.
Synchronization
Solution: Semaphores.
Usage other than critical section.
Ensure S1 definitely executes before S2 (just a synchronization problem).
Solution via semaphores: Semaphore x = 0; // initialized to 0

P0:              P1:
...              ...
S1;              wait(x);
signal(x);       S2;
...              ...
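The same ordering trick, runnable with POSIX semaphores (printfs stand in for S1 and S2):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t x;   /* Semaphore x = 0; initialized in main() */

void *p0(void *arg) {
    printf("S1\n");     /* S1 */
    sem_post(&x);       /* signal(x) */
    return NULL;
}

void *p1(void *arg) {
    sem_wait(&x);       /* wait(x): blocks until P0 has done S1 */
    printf("S2\n");     /* S2 runs strictly after S1 */
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    sem_init(&x, 0, 0);
    pthread_create(&t1, NULL, p1, NULL);
    pthread_create(&t0, NULL, p0, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return 0;
}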
Synchronization
Solution: Semaphores.
Usage other than critical section.
Resource allocation (just another synchronization problem).
We have N processes that want a resource R that has 5 instances.
Solution:
Semaphore rs = 5;
Every process that wants to use R does wait(rs):
  if some instance is available, rs stays nonnegative -> no blocking;
  if all 5 instances are used, rs goes negative -> block until rs is nonnegative again.
Every process that finishes with R does signal(rs):
  a blocked process changes state from waiting to ready.
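A sketch of this with a POSIX counting semaphore:

#include <semaphore.h>

sem_t rs;                 /* Semaphore rs = 5 */

void init(void) { sem_init(&rs, 0, 5); }   /* 5 instances of R */

void use_resource(void) {
    sem_wait(&rs);        /* wait(rs): blocks when all 5 instances are in use */
    /* ... use one instance of R ... */
    sem_post(&rs);        /* signal(rs): releases the instance, wakes a waiter */
}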
Synchronization
Solution: Semaphores.
Usage other than critical section.
Enforce the consumer to sleep while there is no item in the buffer (another synchronization problem).

Semaphore Full_Cells = 0;  // initialized to 0

Producer:
do {
    // produce item
    put item into buffer
    signal(Full_Cells);
} while (TRUE);

Consumer:
do {
    wait(Full_Cells);  // instead of busy-waiting, go to sleep and give the
                       // CPU back to the producer for faster production
                       // (efficiency!)
    remove item from buffer
} while (TRUE);
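A runnable sketch; a large array stands in for the unbounded buffer, and a mutex (which the slide elides) protects the buffer indices:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

int buffer[1000000], in = 0, out = 0;   /* "unbounded" buffer */
sem_t Full_Cells;
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

void *producer(void *arg) {
    for (int item = 0; item < 10; item++) {
        pthread_mutex_lock(&m);
        buffer[in++] = item;        /* put item into buffer */
        pthread_mutex_unlock(&m);
        sem_post(&Full_Cells);      /* signal(Full_Cells) */
    }
    return NULL;
}

void *consumer(void *arg) {
    for (int i = 0; i < 10; i++) {
        sem_wait(&Full_Cells);      /* sleep instead of busy-waiting */
        pthread_mutex_lock(&m);
        int item = buffer[out++];   /* remove item from buffer */
        pthread_mutex_unlock(&m);
        printf("consumed %d\n", item);
    }
    return NULL;
}

int main(void) {
    pthread_t p, c;
    sem_init(&Full_Cells, 0, 0);    /* initialized to 0 */
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}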
Synchronization
Solution: Semaphores.
(Figure: cumulative number of items produced and consumed over time.)
The consumer can never cross the producer curve.
The difference between produced and consumed items can be at most BUFSIZE.
Synchronization
Another problem: a low-priority process may cause a high-priority process to wait (priority inversion).
Synchronization
(Figure: a bounded buffer between producer (prod) and consumer (cons); currently full = 4, empty = 6.)
Problem: allow multiple readers to read at the same time, but only a single writer can access the shared data at a time (no readers or other writers while a writer is active).
Synchronization
Case 1: The first reader acquired the lock and is reading; what happens if a writer arrives? It must wait.
Case 2: The first reader acquired the lock and is reading; what happens if reader 2 arrives? It starts reading too.
Case 3: The writer acquired the lock and is writing; what happens if reader 1 arrives? It must wait.
Case 4: The writer acquired the lock and is writing; what happens if reader 2 arrives? It must wait.
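A sketch of the classic first readers-writers solution with two POSIX semaphores; it produces exactly this behavior:

#include <semaphore.h>

sem_t rw_mutex;       /* init 1: held by the writer, or by the readers as a group */
sem_t mutex;          /* init 1: protects read_count */
int read_count = 0;

void init(void) { sem_init(&rw_mutex, 0, 1); sem_init(&mutex, 0, 1); }

void writer(void) {
    sem_wait(&rw_mutex);      /* blocks while readers are active (Case 1) */
    /* ... write the shared data ... */
    sem_post(&rw_mutex);
}

void reader(void) {
    sem_wait(&mutex);
    if (++read_count == 1)    /* first reader locks writers out */
        sem_wait(&rw_mutex);  /* blocks if a writer is active (Cases 3, 4) */
    sem_post(&mutex);

    /* ... read the shared data; more readers may join (Case 2) ... */

    sem_wait(&mutex);
    if (--read_count == 0)    /* last reader lets writers in */
        sem_post(&rw_mutex);
    sem_post(&mutex);
}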
Synchronization
A philosopher is in 2 states: eating (needs both forks) and thinking (needs no forks).

semaphore forks[5];   // one per fork, each initialized to 1

do {
    wait( forks[i] );             // grab the left fork
    wait( forks[ (i + 1) % 5] );  // grab the right fork
    // eat
    signal( forks[i] );
    signal( forks[ (i + 1) % 5] );
    // think
} while (TRUE);
Synchronization
Deadlock in a circular fashion: 4 gets the left fork, context switch (cs), 3 gets the left fork, cs, .., 0 gets the left fork, cs; 4 now wants the right fork, which is held by 0, forever.
Such an unlucky sequence of context switches is not likely, but it is possible.
A perfect solution without deadlock danger is possible, again with semaphores:
Solution #1: put the left fork back if you cannot grab the right one.
Solution #2: grab both forks at once (atomically).
Synchronization
Less efficient: might have to switch between A and B one more time than necessary. A arrives first.
Synchronization
Any problem?
Synchronization
Barrier: no thread executes its critical point until after all threads have executed rendezvous.
That is, when the first n-1 threads arrive, they should block until the nth thread arrives.
Solution attempt: semaphore mutex = 1, barrier = 0; int n = 5, count = 0;
The first n-1 threads wait when they get to the barrier; the nth thread unlocks the barrier.
Problem: deadlock! The nth thread signals only 1 of the waiting threads; no one signals again.
Another problem: deadlock! If a thread waits on the barrier while still holding mutex, the 1st thread blocks and, since mutex is locked, no one else can do count++.
A corrected sketch follows.
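A sketch of the corrected barrier with POSIX semaphores: count++ is done under mutex but the mutex is released before waiting (fixing the second deadlock), and the barrier semaphore is used as a turnstile where each woken thread wakes the next (fixing the first):

#include <semaphore.h>

sem_t mutex, barrier;             /* semaphore mutex = 1, barrier = 0 */
int n = 5, count = 0;

void init(void) {
    sem_init(&mutex, 0, 1);
    sem_init(&barrier, 0, 0);
}

void rendezvous(void) {
    sem_wait(&mutex);
    count++;
    if (count == n)
        sem_post(&barrier);       /* nth thread unlocks the barrier */
    sem_post(&mutex);             /* release mutex BEFORE waiting   */

    sem_wait(&barrier);           /* first n-1 threads block here   */
    sem_post(&barrier);           /* turnstile: each woken thread   */
                                  /* wakes the next one             */

    /* critical point: all n threads have passed rendezvous */
}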
Solution: Monitors.
Idea: get help not from the OS but from the programming language.
A high-level abstraction for process/thread synchronization.
C does not provide monitors (use semaphores), but Java does.
The compiler ensures that the critical regions of your code are protected:
you just identify the critical sections of the code, put them into a monitor, and the compiler inserts the protection code.
Monitor implementation using semaphores:
the compiler writer/language developer has to worry about this stuff, not the casual application programmer.
Synchronization
Solution: Monitors.
A monitor is a construct in the language, like the class construct:

monitor monitor-name {
    // shared variable declarations
    procedure P1 (..) { .. }
    ..
    procedure Pn (..) { .. }
    initialization_code (..) { .. }
}
Solution: Monitors.
The monitor construct guarantees that only one process may be active within the monitor at a time.
This means that if a process is running inside the monitor (= running a procedure, say P1()), then no other process can be active inside the monitor (= can run P1() or any other procedure of the monitor) at the same time.
Solution: Monitors.
(Figure: schematic view of a monitor: shared data, the procedures P1..Pn, and an entry queue of processes waiting to get in.)
This monitor solution solves the critical section (mutual exclusion) problem, but not the other synchronization problems such as producer-consumer or dining philosophers.
Synchronization
Solution: Monitors.
Condition variables solve all the synchronization problems.
In the previous model there is no way to force a process/thread to wait until some condition happens.
Now we can, using condition variables:

condition x, y;

x.wait() suspends the calling process on x; x.signal() resumes exactly one process waiting on x (and is a no-op if no one waits).
Solution: Monitors.
(Figure: schematic view of a monitor with condition variables; each condition variable has its own queue of waiting processes.)
A new active process in the monitor (fetched from the entry queue) does x.signal() from the same or a different procedure; the previously blocked process then resumes from where it got blocked.
Synchronization
Solution: Monitors.
An example: We have 5 instances of a resource and N processes.
Only 5 processes can use the resource simultaneously.
(Figure: the process code calling the monitor, and the monitor code guarding the resource.) A C translation of such a monitor is sketched below.
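C has no monitor construct, but the usual translation (one pthread mutex as the monitor lock plus a pthread condition variable) gives a sketch of this 5-instance resource monitor:

#include <pthread.h>

/* Monitor state, protected by the monitor lock m: */
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  x = PTHREAD_COND_INITIALIZER;
int available = 5;                 /* 5 instances of the resource */

void acquire(void) {
    pthread_mutex_lock(&m);        /* entry: one active process in the monitor */
    while (available == 0)         /* x.wait(): releases the lock while asleep */
        pthread_cond_wait(&x, &m); /* and reacquires it on wakeup              */
    available--;
    pthread_mutex_unlock(&m);      /* exit the monitor */
}

void release(void) {
    pthread_mutex_lock(&m);
    available++;
    pthread_cond_signal(&x);       /* x.signal(): wake one waiter */
    pthread_mutex_unlock(&m);
}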
Synchronization
Solution: Monitors.
An example: Dining philosophers.
monitor DP { void test (int i) {
enum { THINKING, //not holding/wanting resources if ( (state[(i + 4) % 5] != EATING) &&
HUNGRY, //not holding but wanting (state[(i + 1) % 5] != EATING) &&
EATING} //has the resources (state[i] == HUNGRY)) {
state[5]; condition cond[5]; //each philosopher may state[i] = EATING ;
need to wait (no fork to eat), so need 5 condition variables cond[i].signal();
}
//no need for entry/exit code to pickup() ‘cos its in monitor }
void pickup (int i) {
state[i] = HUNGRY; //initially all thinking
test(i); initialization_code() {
if (state[i] != EATING) for (int i = 0; i < 5; i++)
cond[i].wait(); state[i] = THINKING;
} }
void putdown (int i) {
state[i] = THINKING; } /* end of monitor */
// test left and right neighbors
test((i + 4) % 5)
test((i + 1) % 5);
}
Synchronization
Solution: Monitors.
Each philosopher/process does this in an endless loop:

DP DiningPhilosophers;

Philosopher i:
    while (1) {
        // THINK..
        DiningPhilosophers.pickup(i);
        // EAT..
        DiningPhilosophers.putdown(i);
        // THINK..
    }
Synchronization
Solution: Monitors.
First things first: what are the IDs to access the neighbors? Philosopher i's left neighbor is (i + 4) % 5 and the right neighbor is (i + 1) % 5.
Solution: Monitors.
General idea.
Solution: Monitors.
An example: allocate a resource to one of several processes.
Priority-based: among the processes that want the resource, the one that will use it for the shortest (known) amount of time gets it first.
(Figure: several processes/threads that want to use the resource; one resource.)
Synchronization
Solution: Monitors.
An example: allocate a resource to one of several processes.
Assume we have a condition variable implementation that can enqueue sleeping/waiting processes w.r.t. a priority specified as a parameter to the wait() call:

condition x;
x.wait(priority);
Solution: Monitors.
An example: allocate a resource to one of several processes.

monitor ResourceAllocator
{
    boolean busy;   // true if the resource is currently in use/allocated
    condition x;    // sleeps a process that cannot acquire the resource

    void acquire(int time) {   // time: how long the caller will hold the resource
        if (busy)
            x.wait(time);      // shorter requested times wake up first
        busy = TRUE;
    }

    void release() {
        busy = FALSE;
        x.signal();
    }
}
Each process should use the resource between its acquire() and release() calls.