
Basic Concepts

Kostis Sagonas
kostis@it.uu.se



Overview

Concurrent programming using shared memory


Concurrent programming using message passing



Concurrent Programming

Remember: a concurrent program consists of independent tasks, which may
execute during overlapping time periods.

On multi-processor machines, each processor may be running one of these
tasks simultaneously (true parallelism).

On uni-processor machines, only one task is running at any instant.
However, because of multitasking (e.g., time-sharing), several tasks may
appear to run simultaneously.



Concurrency: Interaction

Truly independent tasks are easy to program—but not very useful.


We need tasks that communicate (e.g., initial data, intermediate results,
events such as user input) and share resources (e.g., system devices).

For this, we need some form of interaction between tasks.



Shared Memory

Shared memory is memory that may be accessed simultaneously by multiple
processes (or by multiple threads within a process).

Shared memory provides an efficient means of sharing data and
communicating between different processes/threads.

Most multi-processor architectures today are shared memory architectures:
each CPU core has access to the same (shared) main memory.




Shared Memory: Example

Let us write ... ||| ... to denote concurrent tasks.

What will be the output of the following program?

int x = 0;
int y = x+1;       |||   int z = x+1;
printf("%d", y);   |||   printf("%d", z);

The program will output “1” followed by another “1”. (It can’t output
anything else because printf is thread-safe. More on that later.)

Note that the two tasks may execute on different processors, or even on
different computers. Shared memory provides an abstraction from the
actual hardware.
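
As a hedged aside (not on the slide): one way the two tasks above could be
realized with POSIX threads, assuming a POSIX system and compiling with
-pthread; the names task1/task2 are made up for the sketch.

#include <pthread.h>
#include <stdio.h>

int x = 0;                        /* shared memory */

void *task1(void *arg) { int y = x + 1; printf("%d", y); return NULL; }
void *task2(void *arg) { int z = x + 1; printf("%d", z); return NULL; }

int main(void) {
  pthread_t t1, t2;
  pthread_create(&t1, NULL, task1, NULL);   /* start both tasks ...  */
  pthread_create(&t2, NULL, task2, NULL);
  pthread_join(t1, NULL);                   /* ... and wait for them */
  pthread_join(t2, NULL);
  return 0;
}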



Shared Memory: A Simple Abstraction?

The shared memory abstraction is deceptively simple:


Tasks just perform regular memory operations (loads and stores).
Communication is implicit, i.e., there are no explicit annotations in
the code to indicate where tasks are communicating.

However, writing correct code that uses shared memory can be very
tricky. We’ll now discuss some of the challenges.



Concurrency and Non-Determinism

A deterministic algorithm, given any particular input, always performs the
same computations and produces the same output.

Sequential algorithms are deterministic, unless they depend on external
state (such as user input, hardware signals, calling random(), etc.).

Concurrent programs are often timing-sensitive: their output depends on,
e.g., scheduling decisions. Thus, they are often non-deterministic.




Concurrency and Non-Determinism: Example

What will be the output of the following program?

int x = 0;
x = 1;   |||   printf("%d", x);

It could be either “0” or “1”, depending on which task is executed first.

(Actually, the program contains a data race. Depending on your hardware and
programming language, it may not output “0” or “1” after all: it could print a
different value or even crash. More on that soon.)



Problem: Race Conditions

A race condition occurs when the result of a concurrent program depends on
the timing of its execution (i.e., different tasks race to perform some
operations or to access a shared resource).

Race conditions easily lead to program bugs when the programmer did not
anticipate all possible executions.



Race Conditions: Example

Consider the following code to transfer money between accounts:

transfer(amount, account_from, account_to) {
  if (account_from.balance < amount) return NO;
  account_to.balance += amount;
  account_from.balance -= amount;
  return YES;
}
What might go wrong when there are two concurrent transfers from the
same account?



The Check-Then-Act Error Pattern

The code on the previous slide is an instance of the check-then-act error
pattern:

if (check(x)) { act(x); }

There is likely a race condition when another task concurrently modifies x
after the check, but before the action.
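
As a hedged illustration (not from the slides): the transfer race can be
avoided by holding a lock across both the check and the act. The account
structure and the use of a pthread mutex below are assumptions made for
this sketch, not part of the slide's code.

#include <pthread.h>

struct account { int balance; pthread_mutex_t lock; };

int transfer(int amount, struct account *from, struct account *to) {
  pthread_mutex_lock(&from->lock);      /* check and act become one atomic step */
  if (from->balance < amount) {
    pthread_mutex_unlock(&from->lock);
    return 0;                           /* NO  */
  }
  from->balance -= amount;
  pthread_mutex_unlock(&from->lock);

  pthread_mutex_lock(&to->lock);
  to->balance += amount;
  pthread_mutex_unlock(&to->lock);
  return 1;                             /* YES */
}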



Race Conditions: Another Example

Consider the following code to implement a counter:

int counter = 0;
count() {
  counter = counter + 1;   // the same issue arises with  counter += 1;  or  counter++;
}

What might go wrong when there are two concurrent calls of count()?


The Read-Modify-Write Error Pattern

The code on the previous slide is an instance of the read-modify-write
error pattern:

1. Read a variable.
2. Compute a new value (that depends on the value read).
3. Update the variable.

There is likely a race condition when another task concurrently modifies
the variable after the read, but before the update.
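
One way to eliminate this read-modify-write race (a sketch, assuming a C11
compiler with <stdatomic.h>) is to make the whole update a single atomic
operation:

#include <stdatomic.h>

atomic_int counter = 0;

void count(void) {
  atomic_fetch_add(&counter, 1);   /* read, add, and write as one indivisible step */
}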



Synchronization and Mutual Exclusion
To prevent race conditions, we need to achieve some synchronization
between concurrent tasks.
A critical section is a piece of code that accesses a shared resource (e.g., a
data structure in shared memory) that must not be accessed concurrently.
The basic goal of process synchronization is to ensure mutual exclusion:
no two tasks execute parts of their critical sections at the same time.



Dekker’s Algorithm
bool flag_0 = false;  bool flag_1 = false;  int turn = 0;  // or 1

P0:
  flag_0 = true;
  while (flag_1) {
    if (turn != 0) {
      flag_0 = false;
      while (turn != 0) { /* busy wait */ }
      flag_0 = true;
    }
  }
  // critical section
  ...
  turn = 1;
  flag_0 = false;

P1:
  flag_1 = true;
  while (flag_0) {
    if (turn != 1) {
      flag_1 = false;
      while (turn != 1) { /* busy wait */ }
      flag_1 = true;
    }
  }
  // critical section
  ...
  turn = 0;
  flag_1 = false;



Dekker’s Algorithm: Remarks

Dekker’s algorithm (ca. 1962) was the first algorithm to solve the mutual
exclusion problem, using only shared memory for communication.

However, Dekker’s algorithm


is limited to two processes,
makes use of busy waiting (rather than suspending processes),
assumes that the concurrent execution of P0 and P1 is equivalent to
some interleaving of their instructions (which is often not the case on
modern hardware).

There are more advanced synchronization primitives: locks, monitors,
message passing, etc.
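
For instance, with POSIX threads the same mutual exclusion can be obtained
with a lock (a minimal sketch, assuming pthreads and the shared counter
from the earlier example):

#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
int counter = 0;

void count(void) {
  pthread_mutex_lock(&m);     /* at most one task gets past this point */
  counter = counter + 1;      /* critical section */
  pthread_mutex_unlock(&m);
}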



Problem: Deadlocks

While too little synchronization leads to race conditions, too much (or
improper) synchronization likewise causes problems.

When one task is executing its critical section, other tasks that want to
begin executing their critical sections must wait.

A deadlock occurs when two (or more) tasks are waiting for each other.

“When two trains approach each other at a crossing, both shall come to a
full stop and neither shall start up again until the other has gone.”
(alleged Kansas state law, 20th century)




Deadlocks: Example

Consider the following algorithm for copying a file:


1 Open the source file for exclusive access. (Assume that this blocks until
no other process has the file open. Once this call returns, other processes
that attempt to open the file block until the file has been closed again by
the current process.)
2 Open the destination file for exclusive access.
3 Copy data from source to destination.
4 Close the destination file.
5 Close the source file.

What can possibly go wrong?


Consider concurrent calls copy("A", "B") and copy("B", "A").
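
One common way out (a hedged sketch, not from the slides) is to impose a
global order on the resources and always acquire them in that order, e.g.,
open the file with the lexicographically smaller name first. The helpers
open_exclusive(), close_file() and copy_data() below are hypothetical,
standing in for steps 1-5 of the algorithm above.

#include <string.h>

void copy(const char *src, const char *dst) {
  const char *first  = strcmp(src, dst) < 0 ? src : dst;   /* lock files in a    */
  const char *second = strcmp(src, dst) < 0 ? dst : src;   /* fixed global order */
  void *f = open_exclusive(first);    /* hypothetical: steps 1-2 of the slide     */
  void *s = open_exclusive(second);
  copy_data(src, dst);                /* hypothetical: step 3, copies src -> dst  */
  close_file(s);                      /* hypothetical: steps 4-5                  */
  close_file(f);
}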



Problem: Livelocks

A livelock is similar to a deadlock (two or more tasks are waiting for each
other), but the tasks involved keep changing their state, without making
proper progress.

Real-life example: two people meet in a narrow corridor. Each tries to be
polite by moving aside to let the other person pass.

In practice, livelocks occur less often than deadlocks, but are somewhat
harder to detect.



Problem: Resource Starvation

A task suffers from resource starvation when it is waiting for a resource
that is repeatedly granted to other tasks instead.

For instance, a (bad) scheduling algorithm might never schedule a task as
long as there is another task with higher priority.

Fairness means that as long as the system is making progress, a task that
is waiting for a resource will be granted the resource eventually. (However,
there is not necessarily a fixed upper bound on the waiting time.)



Problem: Data Races

A data race occurs when two (or more) tasks attempt to access the same
shared memory location,
at least one of the accesses is a write, and
the accesses may happen simultaneously.

For instance (as before):

int x = 0;
x = 1;   |||   printf("%d", x);

While race conditions may be benign, data races must be avoided! In many
programming languages, they have very weak semantics (e.g., your program
might crash).
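
A sketch of how the data race above can be removed in C11 (assuming
<stdatomic.h> is available): declaring x as an atomic makes the concurrent
accesses well-defined. The output still depends on timing, so a race
condition remains, but there is no longer a data race.

#include <stdatomic.h>
#include <stdio.h>

atomic_int x = 0;

void task1(void) { atomic_store(&x, 1); }             /* x = 1;            */
void task2(void) { printf("%d", atomic_load(&x)); }   /* printf("%d", x);  */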



Shared Memory: Limitations

Main issue: many processors need fast access to memory.

[Figure: several CPUs and an I/O subsystem connected to a single shared
memory through a system bus or crossbar switch.]

The CPU-to-memory connection is a bottleneck. Shared memory does not scale
well to many (> 10) processors.

Per-processor caches, commonly employed to reduce memory access times,
must be kept coherent (i.e., in sync).



Distributed Memory

Distributed memory refers to a multiple-processor system in which each
processor has its own private memory.

Tasks can only operate on local data. If remote data is required, tasks
must communicate with one or more remote processors.

http://en.wikipedia.org/wiki/Distributed_memory



Distributed Memory: Remarks

Advantages of distributed memory (vs. shared memory):


Scales to many processors
No data races (communication between processors is explicit)

Disadvantages of distributed memory (vs. shared memory):


No uniform address space
High access latency for remote data—programmers must think about
how to distribute data



Distributed Shared Memory

Physically distributed memory can be accessed via the same shared address
space from different processors.

+ : Uniform address space, implicit communication


– : High access latency for remote data



(Non-)Uniform Memory Access

We can also classify memory architectures according to how different
processors access memory.

Uniform memory access (UMA): all processors access memory in the same way,
with access times that are independent of the processor and memory
location. In a symmetric multi-processor (SMP) system, a single OS
instance additionally treats all processors equally.
→ shared memory

Non-uniform memory access (NUMA): memory access times depend on the
location relative to the processor.
→ distributed memory



Message Passing

Shared memory is tricky to program (critical sections, mutual exclusion
problem, ...). Processes might communicate in other ways.

Message passing makes communication between processes explicit. It relies
on two primitives:

send: sends a copy of some private data to another process


receive: copies data sent by another task to private address space
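
As a hedged illustration (not from the slides): on a POSIX system the two
primitives map naturally onto a pipe between two processes, with write()
playing the role of send and read() the role of receive.

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
  int fd[2];
  pipe(fd);                                /* fd[0]: receive end, fd[1]: send end      */
  if (fork() == 0) {                       /* child process: the sender                */
    const char *msg = "hello";
    write(fd[1], msg, strlen(msg) + 1);    /* send: copy private data out              */
    _exit(0);
  }
  char buf[16];
  read(fd[0], buf, sizeof buf);            /* receive: copy data into private memory   */
  printf("parent received: %s\n", buf);
  return 0;
}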



Synchronous vs. Asynchronous Message Passing

Message passing may be synchronous or asynchronous.

Synchronous: the sender of a message is blocked until the receiver calls
receive.

Asynchronous: the sender of a message can proceed immediately. The message
is buffered until the receiver calls receive.



Direct Communication vs. Channels

Processes may send messages directly to other (named) processes.

Alternatively, processes may send and receive messages via named
communication channels.

