
Concurrent Processing

Concurrent Processes
 Operating systems that support concurrent processing allow a
number of cooperating sequential processes to all run
asynchronously and share some data in common

 These cooperating processes are said to be running concurrently

 The relative speed of concurrent processes is unknown

 Cooperating processes can share:
 A logical address space (both code and data) – Threads
 Data only through files or messages only – Processes

Identifying Concurrent Executions
 A program may have some statements that can be
executed independently, i.e. in parallel or concurrently

 Such a program can be divided into independent


chunks of serial statements (executions)

 Each chunk can then execute in a different thread

 To identify these chunks, we must know the


precedence constraints among the various statements
in the program.

Precedence Graph
 A Precedence Graph is a directed acyclic
graph whose nodes correspond to
individual program statements.

 An edge from node Si to node Sj means


that statement Sj can be executed only
after statement Si has completed
execution.

Example Precedence Graph
 Example:
Consider the following program:

a = x + y;  S1
b = z + 1;  S2
c = a – b;  S3
w = c + 1;  S4

The precedence graph: S1 and S2 have no
predecessors, S3 has incoming edges from
both S1 and S2, and S4 has an incoming
edge from S3.

The Read and Write Sets


 The read set for statement Si,
R(Si) = {a1,a2, … , am}
is the set of all variables whose values are referenced
in statement Si during its execution.

 The write set for statement Si,


W(Si) = {b1, b2, …, bn}
is the set of all variables whose values are changed
by the execution of statement Si.

Concurrency Conditions
 For two statements S1 and S2 to be executed
concurrently (where S1 comes before S2 in the
program) and still give the same result, they
must satisfy the Bernstein’s Concurrency
Conditions:

1. R(S1) ∩ W(S2) = ∅  S2 does not change any variable that S1 reads.
2. W(S1) ∩ R(S2) = ∅  S1 does not change any variable that S2 reads.
3. W(S1) ∩ W(S2) = ∅  S1 & S2 do not change the same variable.

Concurrency Conditions
Example: The read and write sets for the above program statements are:

S1: a = x + y;   R(S1) = {x,y}, W(S1) = {a}
S2: b = z + 1;   R(S2) = {z},   W(S2) = {b}
S3: c = a – b;   R(S3) = {a,b}, W(S3) = {c}
S4: w = c + 1;

Then: R(S1) ∩ W(S2) = {x,y} ∩ {b} = ∅
      W(S1) ∩ R(S2) = {a} ∩ {z} = ∅
      W(S1) ∩ W(S2) = {a} ∩ {b} = ∅

      R(S2) ∩ W(S3) = {z} ∩ {c} = ∅
      W(S2) ∩ R(S3) = {b} ∩ {a,b} = {b}
      W(S2) ∩ W(S3) = {b} ∩ {c} = ∅

This means that statements S1 and S2 can be executed concurrently but S2
and S3 cannot. And that can be seen clearly from the precedence graph.
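These set-intersection checks can be done mechanically. A minimal sketch in Python (the function name and set encoding are illustrative, not from the slides):

```python
# Hypothetical helper applying Bernstein's conditions: two statements may run
# concurrently iff R(S1)∩W(S2), W(S1)∩R(S2) and W(S1)∩W(S2) are all empty.
def can_run_concurrently(r1, w1, r2, w2):
    return not (r1 & w2) and not (w1 & r2) and not (w1 & w2)

# Read/write sets of the example program.
R = {"S1": {"x", "y"}, "S2": {"z"}, "S3": {"a", "b"}}
W = {"S1": {"a"}, "S2": {"b"}, "S3": {"c"}}

print(can_run_concurrently(R["S1"], W["S1"], R["S2"], W["S2"]))  # True
print(can_run_concurrently(R["S2"], W["S2"], R["S3"], W["S3"]))  # False
```

The second call fails because W(S2) ∩ R(S3) = {b}, matching the table above.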
Specifying Concurrency
 A programming language that supports the
specification of concurrency by the user’s program
must have two abstract functions defined,
fork() and join():
1. fork(label):
Produces two concurrent executions in the program. One
starts at the statement labeled by label, and the other
continues at the statement directly after the fork.
2. join(count):
Recombines a number of concurrent computations, equal
to the value of count, into one. Each of the concurrent
computations must request to be joined; each call
decrements count, and if the result is not zero, the
calling computation terminates.

Specifying Concurrency
Example:
To implement concurrency on the previous
program:

Specifying Concurrency
count = 2;        // We need two concurrent computations.
fork(L1);         // Creates the two computations:
    a = x + y;    //   T1: a = x + y;
    go to L2;     //       go to L2;
L1: b = z + 1;    //   T2: b = z + 1;
L2: join(count);  //       join(count);
    c = a – b;    // Here, one computation terminates, and the
    w = c + 1;    // other continues with these two lines.
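The same computation can be sketched with Python threads standing in for the two fork()ed computations; the thread join() plays the role of join(2). The input values are arbitrary demo choices:

```python
import threading

# Two threads compute a and b concurrently; the main thread recombines them.
x, y, z = 2, 3, 4
result = {}

def t1():                      # T1: a = x + y
    result["a"] = x + y

def t2():                      # T2: b = z + 1
    result["b"] = z + 1

threads = [threading.Thread(target=t1), threading.Thread(target=t2)]
for t in threads:
    t.start()
for t in threads:              # plays the role of join(2)
    t.join()

c = result["a"] - result["b"]  # c = a - b
w = c + 1                      # w = c + 1
print(w)                       # 1  (a = 5, b = 5, c = 0)
```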


Specifying Concurrency
A more structured construct to specify
concurrency is:

cobegin S1; S2; S3; … ; Sn coend;

 Statement Sn+1 can execute only after
all statements Si, i = 1..n, have
finished.
The Relation between Threads and
the Precedence Graph
A thread is sequential in nature because, at any one
point in time, at most one instruction of the thread is
executed.

A program, on the other hand, may need to have


concurrent executions.

Therefore, each concurrent execution can be


represented as a thread.

For simplicity: view each node in the precedence


graph as a separate thread.


The Relation between Threads and
the Precedence Graph
Example:
Consider the Precedence graph for the previous
program,

Then: S1, S2, S3, and S4, each could have a thread.

 This arrangement increases the overhead because
many threads are created unnecessarily, so their
number can be reduced by combining statements that
are sequential into one chunk and assigning them to
one thread.

The Relation between Threads and
the Precedence Graph
Example:
Same as above: Create threads,

T1 : S1; T2 : S2; T3 : {S3; S4}

 Practically, when a program is executed, one process with a single


thread is created.

 If a fork() statement is executed by the process, a new process is
created that has a copy of the same code and global variables as the
parent, but does not share the same address space.

 If a clone() statement is executed by the process, a new thread is


created that shares the same code, and global variables as the parent.

 The CPU registers and PC are set appropriately for each one.


The Critical Section Problem

8
Background
 Concurrent access to shared data may result in data
inconsistency

 Maintaining data consistency requires mechanisms to ensure the


orderly execution of cooperating processes

 Suppose that we wanted to provide a solution to the consumer-


producer problem that fills all the buffers. We can do so by
having an integer count that keeps track of the number of full
buffers:

 Initially, count is set to 0.


 It is incremented by the producer after it produces a new buffer.
 And it is decremented by the consumer after it consumes a buffer.


The Producer-Consumer Problem


The Producer:

while (true) {
    // Produce an item and put it in NextProduced

    while (count == BUFFER_SIZE);
        // Do nothing -- no free buffers

    // Add an item to the buffer
    buffer[in] = NextProduced;
    in = (in + 1) % BUFFER_SIZE;
    count++;
}

The Consumer:

while (true) {
    while (count == 0);
        // Do nothing -- nothing to consume

    // Remove an item from the buffer
    NextConsumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    count--;

    // Consume the item in NextConsumed
}

Race Condition
 count++ could be implemented as:
register1 = count
register1 = register1 + 1
count = register1
 count-- could be implemented as:
register2 = count
register2 = register2 - 1
count = register2
 Consider this execution interleaving, with count = 5 initially:
S0: producer: register1 = count {register1 = 5}
S1: producer: register1 = register1 + 1 {register1 = 6}
S2: consumer: register2 = count {register2 = 5}
S3: consumer: register2 = register2 – 1 {register2 = 4}
S4: producer: count = register1 {count = 6}
S5: consumer: count = register2 {count = 4}
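The interleaving above can be replayed deterministically in plain Python, treating each machine step as one assignment:

```python
# Deterministic replay of the race: count++ and count-- are each three machine
# steps, and this interleaving of them loses the producer's update.
count = 5

register1 = count          # S0: producer reads count        -> 5
register1 = register1 + 1  # S1: producer increments locally -> 6
register2 = count          # S2: consumer reads count        -> 5
register2 = register2 - 1  # S3: consumer decrements locally -> 4
count = register1          # S4: producer writes back        -> count = 6
count = register2          # S5: consumer writes back        -> count = 4

print(count)  # 4, not the correct value 5
```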


Race Conditions in OS kernels


 Different parts of the kernel (kernel-mode
processes) in an operating system manipulate
common resources – data structures used for:
 Maintaining open-files list
 Maintaining memory allocation tables
 Maintaining process lists
 Interrupt handling … etc.
 Situations like the above race condition occur
frequently in operating systems when more than
one process tries to update the same data structure
at the same time.
 Process/thread synchronization and coordination is
needed to prevent race conditions
Handling Race Conditions in OS Kernels
1. Non-preemptive kernels:
 Do not allow a process running in kernel mode to be preempted
 A kernel-mode process will run until it either:
 Exits kernel mode
 Blocks
 Yields control of the CPU
 It is free from race conditions on kernel data structures, because only
one process is active in the kernel at a time
 Example non-preemptive OS kernels are: Windows 2000/XP, traditional
Unix, and Linux before release 2.6
2. Preemptive kernels:
 Allow a process to be preempted while it is running in kernel mode
 We will look at software and hardware solutions next
 More difficult to design for SMP architectures
 More suitable for real-time programming
 More responsive
 Example preemptive OS kernels are: Modern commercial versions of
Unix, Solaris, IRIX, and Linux release 2.6 and higher

The Critical Section


Definition:
In concurrent processes, the segment of
code that accesses a common variable is
called the critical section (CS).

 To avoid the critical section problem, shared


data must be protected

 Mutual exclusion (ME) is the technique used


to protect shared data

Solution to Critical-Section Problem
1. Mutual Exclusion - If process Pi is executing in its critical
section, then no other processes can be executing in their
critical sections
2. Progress - If no process is executing in its critical section
and there exist some processes that wish to enter their
critical section, then the selection of the processes that will
enter the critical section next cannot be postponed
indefinitely
3. Bounded Waiting - A bound must exist on the number of
times that other processes are allowed to enter their critical
sections after a process has made a request to enter its
critical section and before that request is granted
 Assume that each process executes at a nonzero speed
 No assumption concerning relative speed of the N processes


Solution to Critical-Section Problem

 There are several methods to


implement mutual exclusion in both
software and hardware.

 These methods try to avoid some


problems that may occur, defined next:

Problems that may happen
1. Mutual Blocking:
It happens when each process is trying
to enter its CS simultaneously, by
blocking the other processes. The result
is that each process is blocked waiting
for the others to finish. This produces a
Deadlock.


Problems that may happen


2. Alternating Access:
It happens when each process specifies
which process is allowed next into its CS,
so that a specific order is defined. This
causes what is known as Lockstep
Synchronization.

Problems that may happen
3. Indefinite Wait:
It happens when one process has
requested to enter its CS, but there is
no guarantee of the maximum amount
of time that it should wait to do so if
other processes will not allow it.
This is a sort of discrimination that
causes Starvation.


Problems that may happen


4. Busy Waiting:
It happens if processes waiting to
enter their CS are allowed to hold the
CPU until they are allowed in (i.e. not
blocked). This causes inefficiency.

Software Solutions to the
Mutual Exclusion Problem

The Concurrent Processes


Suppose that we have the following program:

{
common variable declarations;
cobegin; // start parallel processes
P0; // process number zero
P1; // process number one
coend; // wait for all processes to end
}

The Structure of Each Process
Suppose also that each one of the concurrent
processes P0 and P1 has the following structure:

while (true)
{
enter CS
Critical Section
exit CS
Remainder Section
}


Assumptions
 In implementing enter CS and exit CS
operations the following assumptions are
made:
1. They must be implemented purely in software, no
special hardware is assumed.
2. Machine language instructions are indivisible.
3. Relative speed of concurrent processes is unknown.
4. Processes outside their CS cannot prevent others
from entering CS.
5. Processes must not be indefinitely postponed from
entering CS.

Algorithm 1
Algorithm 1:
This is a two-process algorithm.
 The reference to process Pi means the current process.
 The reference to process Pj means the other process.

Let the processes share a common integer variable,


turn:

Then, if turn == i, Pi is allowed in its CS.


Algorithm 1

while (true)
{
while (turn != i) wait;
CS
turn = j;
RS
}

Algorithm 1
 This algorithm provides ME but has
problems #2 and #4.

 Example:
If turn = 1 and P0 wants to enter CS, it
cannot, even if P1 is in its RS.

The numbered steps of each process:
1  while (turn != i) wait;
2  CS
3  turn = j;
4  RS

Pid Current Step turn (before) turn (after) Next Step Process Status
P0 1 0 0 2 Allowed in CS
P0 2 0 0 3 In CS
P0 3 0 1 4 Exit CS
P0 4 1 1 1 in RS
P0 1 1 1 1 Wait for P1
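A sketch of Algorithm 1 with two Python threads shows the strict alternation; the iteration count is an arbitrary demo value, and the spin loops rely on CPython periodically switching threads:

```python
import sys
import threading

sys.setswitchinterval(1e-4)  # switch threads often so the busy waits stay short

# Strict alternation: turn is the shared variable; the spin loop is the busy
# waiting of problem #4, and the forced ordering is problem #2 (lockstep).
turn = 0
order = []   # records who entered the CS, to show the alternation

def process(i, j):
    global turn
    for _ in range(3):
        while turn != i:   # enter CS: spin until it is our turn
            pass
        order.append(i)    # critical section
        turn = j           # exit CS: hand the turn to the other process

t0 = threading.Thread(target=process, args=(0, 1))
t1 = threading.Thread(target=process, args=(1, 0))
t0.start(); t1.start()
t0.join(); t1.join()

print(order)  # [0, 1, 0, 1, 0, 1] -- strict alternation (lockstep)
```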


Algorithm 2
Algorithm 2:
Replace the variable turn with the following array.

bool flag[2];
flag[0] = false;
flag[1] = false;

 Then, if flag[i] is true, Pi is in its CS.

Algorithm 2
while (true)
{
while (flag[j]) wait;
flag[i] = true;
CS
flag[i] = false;
RS
}


Algorithm 2
 This algorithm does not guarantee ME, and it has
problem #4.

 Example:
P0: enters while statement and finds flag[1] = false.
P1: enters while statement and finds flag[0] = false.
P0: sets flag[0] = true and enters CS.
P1: sets flag[1] = true and enters CS.

The numbered steps of each process:
1  while (flag[j]) wait;
2  flag[i] = true;
3  CS
4  flag[i] = false;
5  RS

Pid  Current Step  Flag[0] before  Flag[0] after  Flag[1] before  Flag[1] after  Next Step  Process Status
P0   1             F               F              F               F              2          Continue
P1   1             F               F              F               F              2          Continue
P0   2             F               T              F               F              3          Allowed in CS
P1   2             T               T              F               T              3          Allowed in CS

Both P0 and P1 are in their CS at the same time.

Algorithm 3
Algorithm 3:
The problem with algorithm 2 is that each
process is allowed to make a decision and
proceed without giving the other process a
chance to cooperate.

 This algorithm attempts to solve this problem by
doing the same as algorithm 2, but:

when flag[i] is true,
it indicates that Pi wants to enter its CS.


Algorithm 3
while (true)
{
flag[i] = true;
while (flag[j]) wait;
CS
flag[i] = false;
RS
}

Algorithm 3
 This algorithm provides ME. It has
problems #1 and #4.

 Example:
P0: sets its flag to true.
P1: sets its flag to true.

So both processes have their flags true and


they will loop forever. This means deadlock.


Algorithm 4
Algorithm 4:
To avoid the problem in algorithm 3,
this algorithm tries to give a chance
of breaking a deadlock if it occurred.
This is done by alternating the flag of
a process who wants to enter its CS.

Algorithm 4
while (true)
{
flag[i] = true;
while (flag[j])
{
flag[i] = false;
delay;
flag[i] = true;
}
CS
flag[i] = false;
RS
}


Algorithm 4
This algorithm provides ME and no deadlock, but has
problems #3 and #4.

Example:
P0: sets its flag to true.
P1: sets its flag to true.
P0: enters the while loop.
P1: enters the while loop.
P0: sets its flag to false, then true.
P1: sets its flag to false, then true.

So every process is preventing the other from entering


its CS.

Peterson’s Algorithm
Peterson’s Algorithm:
This algorithm offers the correct solution
by combining algorithm 3 and a
modification of algorithm 1 together.

 The processes share two variables:

bool flag[2]={0,0};
int turn=0;


Peterson’s Algorithm
while (true)
{
flag[i] = true;
turn = j;
while (flag[j] && turn == j) wait;
CS
flag[i] = false;
RS
}

Peterson’s Algorithm
 This algorithm provides ME. It has
problem #4.

The numbered steps of each process:
1  flag[i] = true;
2  turn = j;
3  while (flag[j] && turn == j) wait;
4  CS
5  flag[i] = false;
6  RS

 Example:

Pid  Current Step  turn before  turn after  Flag[0] before  Flag[0] after  Flag[1] before  Flag[1] after  Next Step  Process Status

P0 1 0 0 F T F F 2 Continue
P1 1 0 0 T T F T 2 Continue
P0 2 0 1 T T T T 3 Continue
P1 2 1 0 T T T T 3 Continue
P0 3 0 0 T T T T 4 Allowed in CS
P1 3 0 0 T T T T 3 Waiting
P0 4 0 0 T T T T 5 In CS
P0 5 0 0 T F T T 6 Exit CS
P1 3 0 0 F F T T 4 Allowed in CS
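Peterson's algorithm can be sketched with two Python threads guarding a non-atomic counter increment. The iteration count is an arbitrary demo value, and the sketch relies on CPython's GIL for memory ordering; on real hardware the stores would need memory barriers:

```python
import sys
import threading

sys.setswitchinterval(1e-4)  # switch threads often so the busy waits stay short

flag = [False, False]
turn = 0
count = 0
ITERS = 100  # per process

def process(i):
    global turn, count
    j = 1 - i
    for _ in range(ITERS):
        flag[i] = True                # 1: announce intent to enter the CS
        turn = j                      # 2: yield priority to the other process
        while flag[j] and turn == j:  # 3: busy wait (problem #4)
            pass
        tmp = count                   # 4: CS -- non-atomic count++
        count = tmp + 1
        flag[i] = False               # 5: exit the CS

threads = [threading.Thread(target=process, args=(i,)) for i in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # 200: no increments lost, so mutual exclusion held
```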

Problems of Software Solutions


 All the above algorithms are software
solutions to the CS problem, in which a
process waiting for its turn must wait
while still holding the CPU.

 This problem cannot be avoided using
only software.

 All the above algorithms work only for


two processes.

Software Algorithms for N Processes
Algorithms for N processes:
When N processes share the same common
variables, each one has its own critical section,
and algorithms to ensure mutual exclusion have
been developed by:
1. Dijkstra: his algorithm had problems #3 and #4.
2. Knuth: solved the problem of indefinite wait but still
allowed lengthy delays and still has problem #4.
3. Lamport, Eisenberg, and McGuire: Presented
algorithms that guaranteed that a process will enter its
CS within N-1 tries.


Lamport Algorithm
Lamport’s Bakery Algorithm:
Every process shares the following variables:
bool choosing[n]; // initialized to false.
int number[n]; // initialized to 0.
Notation:
 (a,b) < (c,d) is true:
 if (a < c) or
 if (a = c and b < d).

 next_to_max(a0, …, an-1) is a number, K, such that K > ai,
for all i = 0, …, n-1 (i.e. one greater than the maximum).

Lamport Algorithm
 Each process that wants to enter its CS must choose a
number greater than those of all other processes waiting
or in their CS.

 Then it waits until all processes with numbers less
than its own have finished their CS.

 After that it can proceed into its CS.

This algorithm works for N processes, and in a
distributed environment, but still has the problem of
busy waiting (#4).


Lamport Algorithm
For process Pi:
while (true)
{
choosing[i] = true; // flag that process Pi is choosing a number.
number[i] = next_to_max(number[0],number[1],…,number[n-1]);
// choose a new number greater than others.
choosing[i] = false; // Signals that Pi has finished choosing a number.

for (int j = 0; j <= n-1; j++)  // check for all processes including yourself.
{
    while (choosing[j]) wait;   // wait if Pj is choosing.
    while ((number[j] != 0) && ((number[j],j) < (number[i],i))) wait;
        // wait if Pj has a number and it is less than my number.
}
CS
number[i] = 0;
RS
}
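The bakery algorithm can be sketched the same way for three Python threads; the iteration counts are arbitrary demo values, and CPython's GIL supplies the per-operation atomicity the algorithm assumes:

```python
import sys
import threading

sys.setswitchinterval(1e-4)  # switch threads often so the busy waits stay short

N = 3
choosing = [False] * N
number = [0] * N
count = 0

def process(i, iterations):
    global count
    for _ in range(iterations):
        choosing[i] = True
        number[i] = 1 + max(number)   # take a ticket above all current ones
        choosing[i] = False
        for j in range(N):
            while choosing[j]:
                pass                  # wait while Pj is choosing
            while number[j] != 0 and (number[j], j) < (number[i], i):
                pass                  # wait for lower tickets (ties by id)
        tmp = count                   # critical section: non-atomic count++
        count = tmp + 1
        number[i] = 0                 # exit: discard the ticket

threads = [threading.Thread(target=process, args=(i, 50)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # 150: 3 processes x 50 increments, none lost
```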

Hardware Solutions to the
Mutual Exclusion Problem

Synchronization Hardware
 Many systems provide hardware support for critical
section code
 Uniprocessors – could disable interrupts
 Currently running code would execute without preemption
 Generally too inefficient on multiprocessor systems
 Operating systems using this are not broadly scalable

 Modern machines provide special atomic hardware
instructions
 Atomic = non-interruptable
 Either test memory word and set value
 Or swap contents of two memory words
 Such instructions can be used to implement mutual
exclusion.

27
Test_and_Set Instruction
1. The Test-and-Set instruction:
This instruction can be described by the
following function:
bool test_and_set (bool& target)
{
    bool x = target;
    target = true;
    return x;
}


Solution Using Test_and_Set


 Mutual exclusion can be implemented by
declaring a global Boolean shared
variable lock, initialized to false.
 Each process will be as follows:
while (true)
{
while (test_and_set(lock)) wait;
CS
lock = false;
RS
}

Swap Instruction
2. The swap instruction:
This instruction can be described by the
following function:
void swap (bool& a, bool& b)
{
    bool temp = a;
    a = b;
    b = temp;
}


Solution Using Swap


 Mutual exclusion can be implemented by
declaring a global shared Boolean variable lock,
initialized to false.
 Each process has a local Boolean variable key.
 Each process will be as follows:
while (true) {
    key = true;
    while (key) swap(lock, key);
    CS
    lock = false;
    RS
}

Notes
 The last two algorithms, using special
hardware instructions, provide ME but
have problems #3 and #4.
 Problem #3 occurs because there is no
guarantee who gets its CS next.

 Complicated to use by application


programmers

Mutex Locks
3. Mutex Locks:
The simplest software tool for mutual exclusion

Definition:
A mutex lock has a Boolean variable, available,
whose value indicates if the lock is available or
not, and two atomic functions: acquire() and
release().
Usually implemented via hardware atomic
instructions such as test-and-set or swap.

Mutex Lock Operations
 The two mutex lock operations can be described by
the following two functions:
 acquire():
{
while (!available); // busy wait == (spinlock)
available = false;
}

 release():
{
available = true;
}

Use of Mutex Locks


 To protect a critical section:
 Use a shared mutex lock, L.
 Each process will be:
while (true) {
acquire(L);
Critical Section
release(L);
Remainder Section
}

 But this solution requires busy waiting; therefore this
kind of lock is called a spinlock

Semaphore
4. Semaphore:
This is a synchronization tool which is less
complicated and useful for implementing mutual
exclusion on more complex problems.

 Definition: A semaphore is a protected integer


variable, S, whose value can be:
 Set only once by an initialization operation; then
 Accessed and altered using only the two standard atomic
operations on S: wait() and signal() – originally called P()
and V().


Semaphore operations
 The two semaphore operations can be described by
the following two functions:
 wait(S) or P(S):
{
while (S <= 0) wait;
S--;
}

 signal(S) or V(S):
{
S++;
}

Semaphore Types
Definition:
 Counting Semaphore: its integer value
can range over an unrestricted domain.

Definition:
 Binary semaphore: its integer value
can range only between 0 and 1.


Use of Counting Semaphores


 Counting semaphores are usually used when a
resource is to be allocated from a pool of identical
resources, such as one disk drive from a pool of ten
disk drives.

 The semaphore is initialized to the number of


resources in the pool.

 Each wait() operation means removing one resource


from the pool, until the last one is removed. After
that, the wait() operation will cause the process
executing it to wait for some other process to release
one of the resources.
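This pool usage can be sketched with Python's built-in counting semaphore; the pool size, thread count, and "disk drive" naming are illustrative choices, not from the slides:

```python
import threading

drives = threading.Semaphore(3)   # initialized to the pool size (3 drives)
meter = threading.Lock()          # protects the bookkeeping counters below
in_use = 0
peak = 0

def worker():
    global in_use, peak
    drives.acquire()              # wait(): remove one drive from the pool
    with meter:
        in_use += 1
        peak = max(peak, in_use)  # record how many drives were ever held
    with meter:                   # ... the drive would be used here ...
        in_use -= 1
    drives.release()              # signal(): return the drive to the pool

threads = [threading.Thread(target=worker) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak <= 3, in_use)  # True 0: never more than 3 drives held at once
```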
Use of Binary Semaphores
 Binary semaphores can be simpler to implement. Also known as
mutex locks.
 Can implement a counting semaphore S as a binary semaphore
 Provides mutual exclusion:
 Use a shared semaphore S, initialized to 1.
 Each process will be:
while (true) {
wait(S);
Critical Section
signal(S);
Remainder Section
}


Semaphore Implementation
 As described in the functions above, the wait()
operation causes a busy wait situation

 An implementation of semaphores must guarantee


that wait() and signal() are executed indivisibly

 Also, for use in a multi-processor system, it must
guarantee that no two processes can execute wait()
and signal() on the same semaphore at the same
time

Semaphore Implementation
 Thus, implementation becomes the critical section problem,
where the wait and signal codes are placed in the critical section.
 Could now have busy waiting in the critical section implementation
 But implementation code is short
 Little busy waiting if critical section is occupied (rarely so)

 A semaphore that produces this result is also called a spinlock

 Spinlocks are useful when the lock is expected to be held for a
short time compared to context switch time

 Note, however, that many applications may spend lots of time in


critical sections and therefore this is not a good implementation.


Semaphore Implementation
with no Busy waiting
 The semaphore operations implementation
will use the process operations block() and
wakeup() as follows:

 block():
Suspends the process invoking this operation.

 wakeup(P):
Resumes the execution of a blocked process P,
and places it in the ready queue.

Semaphore Implementation
with no Busy waiting
 Semaphore operations should be atomic – executed without
interruption

 Each semaphore has an integer value and an associated list of


processes L as follows:

class semaphore {
    int value;      // The value of the semaphore
    process *L;     // The waiting queue list
public:
    semaphore();    // The initialization function – constructor
    wait();         // Provides the wait function
    signal();       // Provides the signal function
};


Semaphore Implementation
with no Busy waiting
 Implementation of wait:
S.wait() {
value--;
if (value < 0) {
add this process to the waiting queue at *L
block();
}
}

 Implementation of signal:
S.signal() {
    value++;
    if (value <= 0) {
        remove a process, P, from the waiting queue at *L
        wakeup(P);
    }
}
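The class above can be sketched in Python, with a Condition variable standing in for the raw block()/wakeup() primitives, which Python does not expose; the quick check at the end uses arbitrary thread and iteration counts:

```python
import threading
from collections import deque

class Semaphore:
    def __init__(self, initial):
        self.value = initial
        self.waiting = deque()            # the waiting queue list *L
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            self.value -= 1
            if self.value < 0:
                me = threading.current_thread()
                self.waiting.append(me)   # add this process to *L
                while me in self.waiting: # block() until signal removes us
                    self.cond.wait()

    def signal(self):
        with self.cond:
            self.value += 1
            if self.value <= 0:
                self.waiting.popleft()    # remove a process P from *L
                self.cond.notify_all()    # wakeup(P)

# Quick check: the semaphore serializes a non-atomic increment.
s = Semaphore(1)
count = 0

def worker():
    global count
    for _ in range(100):
        s.wait()
        tmp = count
        count = tmp + 1
        s.signal()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(count)  # 400
```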

Semaphore to Solve the CS Problem
Using Semaphores for ME (N-Processes):
All N processes share a common semaphore, say
mutex, initialized to 1. Then each process Pi has
the following structure:
while (true)
{
mutex.wait();
CS
mutex.signal();
RS
}
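This structure maps directly onto Python's built-in semaphore; the process and iteration counts are arbitrary demo values:

```python
import threading

mutex = threading.Semaphore(1)   # shared semaphore, initialized to 1
count = 0

def process(iterations):
    global count
    for _ in range(iterations):
        mutex.acquire()          # mutex.wait()
        tmp = count              # critical section: non-atomic count++
        count = tmp + 1
        mutex.release()          # mutex.signal()

threads = [threading.Thread(target=process, args=(1000,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # 5000: 5 processes x 1000 increments, none lost
```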

Notes:
 This provides Mutual Exclusion, and eliminates all
the previous problems.
 There is no busy waiting.
 Operations wait() and signal() must be atomic
(uninterruptible), can use one of two methods:
1. Uni-processor system:
Prohibit interrupts during the execution of these
operations.
2. Multi-processor system:
The two operations can use one of the correct software
algorithms for their very small critical section.

Deadlock and Starvation
 Deadlock – two or more processes are waiting indefinitely for an
event that can be caused by only one of the waiting processes
 Let S and Q be two semaphores initialized to 1:
P0 P1
wait (S); wait (Q);
wait (Q); wait (S);
. .
. .
. .
signal (S); signal (Q);
signal (Q); signal (S);
 Starvation – indefinite blocking. If the semaphore queue is
implemented as a LIFO queue, a process may never be removed
from the queue in which it is suspended.
