You are on page 1of 21

Technische Universitt Mnchen

Chip Multicore Processors


Tutorial 4

S. Wallentowitz

Institute for Integrated Systems Theresienstr. 90 Building N1 www.lis.ei.tum.de

Technische Universitt Mnchen

Task 3.2: Locking

#define N 100 int buf[N]; int c=0;

mutex lock;

On the right you find a piece of source code to implement a stack. In a stack, the last written element is read as first (LIFO).

Use the functions mutex_lock(mutex*) and mutex_unlock(mutex*) to make the code thread-safe.

void write(int i) { int tmp=c; while(tmp==N) { tmp=c; } buf[c]=i; c=c+1; } int read() { int r, tmp=c; while(tmp==0) { tmp=c; } r=buf[c]; c=c-1; return r; }
Institute for Integrated Systems

Chip Multicore Processors Tutorial 3 2 S. Wallentowitz

Technische Universitt Mnchen

void write(int i) { int tmp tmp=c; while(tmp==N) { tmp=c; } buf[c]=i; c=c+1; }

Chip Multicore Processors Tutorial 3 3 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

int read() { int r, tmp tmp = c; while(tmp==0) { tmp=c; } r=buf[c]; c=c-1; return r; }

Chip Multicore Processors Tutorial 3 4 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

void write(int i) { int tmp

int read() { int r, tmp tmp = c;

tmp=c;
while(tmp==N) { tmp=c;

while(tmp==0) {
tmp=c; } r=buf[c]; c=c-1; return r; }

}
buf[c]=i; c=c+1; }

Chip Multicore Processors Tutorial 3 5 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

void write(int i) { int tmp

int read() { int r, tmp tmp = c;

tmp=c;
while(tmp==N) { tmp=c;

while(tmp==0) {
tmp=c; } r=buf[c]; c=c-1; return r; }

}
buf[c]=i; c=c+1; }

Chip Multicore Processors Tutorial 3 6 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

void write(int i) { int tmp

int read() { int r, tmp tmp = c;

tmp=c;
while(tmp==N) { tmp=c;

while(tmp==N) {
tmp=c; } r=buf[c]; c=c-1; return r; }

}
buf[c]=i; c=c+1; }

Chip Multicore Processors Tutorial 3 7 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

void write(int i) { int tmp

int read() { int r, tmp tmp = c;

tmp=c;
while(tmp==N) { tmp=c;

while(tmp==N) {
tmp=c; } r=buf[c]; c=c-1; return r; }

}
buf[c]=i; c=c+1; }

Chip Multicore Processors Tutorial 3 8 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Condition Variables
Functions
wait() signal() or broadcast()
wait()

Thread 0

Thread 1

Thread 2

wait() signal()

Wait for condition


Append thread to queue of waiting threads Suspend thread
wait() signal()

wait()
broadcast()

Signal event
Pop thread from queue Re-activate thread

Broadcast
Re-activate all threads in queue

Chip Multicore Processors Tutorial 3 9 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Condition Variables: Example POSIX Thread


Wait
Wait for a certain condition Role of mutex Needs to hold mutex Blocking call, will still hold mutex after call While waiting for the condition: Mutex is released
pthread_cond_wait(cond_t*,mutex_t*) pthread_cond_signal(cond_t*) pthread_cond_broadcast(cond_t*)

Signal
Signal to one waiting thread that the condition holds

Broadcast
Signal to all waiting threads that the condition holds Useful for example when implementing barriers

Chip Multicore Processors Tutorial 3 10 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

#define N 100 int buf[N]; int c=0; mutex lock; void write(int i) { int tmp tmp=c; int read() { int r, tmp tmp=c;

r=buf[c]; buf[c]=i; c=c-1; c=c+1; return r;

Chip Multicore Processors Tutorial 3 11 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Another Example for Conditions: Barrier Implementation


struct { int n_treads; int waiting; condition complete; mutex lock; } barr; void barrier() { mutex_lock(&barr.lock); barr.waiting++; if (barrier.waiting<barrier.n_treads) { cond_wait(&barr.complete,&barr.lock); } else { barr.waiting = 0; cond_broadcast(&barr.complete); } }

Chip Multicore Processors Tutorial 3 12 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Task 4.1: Counter Implementation

You want to share a counter among several threads.

Implement the counter with locks, compare-and-swap and loadlinked/store-conditional. How is each of the implementations characterized?

Chip Multicore Processors Tutorial 3 13 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Starting point: Counter


volatile int count; void countup() { count++; }

volatile int count; void countup() { int tmp; tmp = count; tmp = tmp+1; count = tmp; }

countup: ... R3 <- @(count); R4 <- LD(R3); R4 <- R4 + 1; R4 -> ST(R3) ...

Chip Multicore Processors Tutorial 3 14 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Lock Implementation
volatile int count; void countup() { int tmp; tmp = count; tmp = tmp+1; count = tmp; }

Chip Multicore Processors Tutorial 3 15 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Compare-and-Swap
Similar to Test-and-Set Test for data value and conditionally replace
Read value If value matches compared value: swap If value does not match: leave old value Return previous value
compare swap address

int cas(int* addr, int compare, int swap) { int tmp=*addr; if (tmp==compare) { *addr=swap; } return tmp; }
value
==

Implement non-blocking data structures


Threads always progress Ad-hoc: For specific algorithm Canned: Provided by libraries

Modify
(Bus) Interconnect

Read
data Memory

Write

Chip Multicore Processors Tutorial 3 16 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Compare-and-Swap Implementation
int CAS(int* addr,int compare,int swap);

volatile int count; void countup() { int tmp; tmp = count; tmp = tmp+1; count = tmp; }

Chip Multicore Processors Tutorial 3 17 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Load-Linked/Store-Conditional
Alternative approach: Hardware tracks changes Load-Linked (LL):
Load value and set linked bit Store value when data is unchanged Returns success of store Can also incorrectly fail (implementation dependent): Exceptions, other LL, cache line replacement

LL
value addr

SC
success addr value
Linked bit(s)

Store-Conditional (SC):

set

==1

reset write to addr


(Bus) Interconnect

Often associated to cache line


Requires cache coherency (see module 3) PowerPC, ARM, MIPS, .. Strictly RISC compared to CAS
Institute for Integrated Systems
data Memory

Implemented in most architectures

Chip Multicore Processors Tutorial 3 18 S. Wallentowitz

Technische Universitt Mnchen

LL/SC Implementation
int LL(int* addr); bool SC(int* addr,int value);

volatile int count; void countup() { int tmp; tmp = count;

tmp = tmp+1;
count = tmp; }

Chip Multicore Processors Tutorial 3 19 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

Task 4.2: FiFo Implementation

In this task you will implement a FiFo memory in software.


a) Implement a (capacity bound) FiFo as circular buffer by using an array. b) Allow the parallel access to the FiFo by multiple threads using locks. c) Is it possible to use two locks for parallel access by reading and writing threads? d) Use a (POSIX) condition variable for synchronization. e) Can you implement this data structure in a non-blocking fashion? If not, how can you change the data structure to allow for it? f) Is it possible to implement an (unbounded) FiFo by using a linked list and without locking?

Chip Multicore Processors Tutorial 3 20 S. Wallentowitz

Institute for Integrated Systems

Technische Universitt Mnchen

FiFo Implementation
int data[N]; unsigned int start; unsigned int end; void enqueue(int d) { while(((end+1)%N)==start)) {} data[end] = d; end = (end+1)%N; } int dequeue() { int d; while(end==start) {} d=data[start]; start=(start+1)%N; return d; }

Chip Multicore Processors Tutorial 3 21 S. Wallentowitz

Institute for Integrated Systems

You might also like