
The Language of Concurrency

Threads, Races, Message Passing, Sequential Consistency
By Bartosz Milewski

The Plan
Processes vs. Threads
Multithreading vs. Parallelization
Shared Memory vs. Message Passing
Data Races and Atomicity Violations
Relaxed Memory Models
Sequential Consistency and DRF Guarantee
Risks of Concurrency
Debugging Concurrent Programs


Sequential vs. Concurrent


Sequential execution
One action follows another
The effects of the previous action are visible to the next action

Concurrent execution

    // Several concurrent threads, each executing the same (unsynchronized) push loop:
    while (1) {
        tmp = stk->top;
        node->next = tmp;
        if (stk->top == tmp) {
            stk->top = node;
            break;
        }
    }

No specific order of execution for programs or parts of programs
Conceptually (or actually) multiple actions executing at the same time
Effects of actions may be visible to other concurrent actions out of order

Memory Sharing

Processes
Each has a separate address space
Communication only through special channels
Messages, sockets, (memory-mapped) files

Threads
Share the same address space within a process
Can read and write to the same memory address
Usually local (stack) variables are considered thread-private
Beware of closures (see the sketch below)

Concurrent memory access leads to collisions
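
A minimal C++ sketch of the closure pitfall (a hypothetical example, not from the slides): capturing a local variable by reference hands the new thread a pointer into this stack frame, so a supposedly thread-private variable becomes shared.

    #include <thread>
    #include <iostream>

    int main() {
        int counter = 0;                 // local, "thread-private"... until captured

        // Capturing by reference [&] shares the stack variable with the new thread.
        std::thread t([&] {
            for (int i = 0; i < 100000; ++i)
                ++counter;               // unsynchronized write: data race
        });

        for (int i = 0; i < 100000; ++i)
            ++counter;                   // concurrent unsynchronized write

        t.join();
        std::cout << counter << '\n';    // result is unpredictable
    }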


Threads vs. Tasks

Multithreading
Explicit thread creation and work assignment
Improves latency
Doesn't scale well

Parallelization
Partitioning of work for parallel execution: tasks
The system, runtime, or library assigns tasks to threads
Improves performance, if done correctly
Scales better with the number of cores
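
A minimal C++ sketch contrasting the two approaches (hypothetical example, using only the standard <thread> and <future> headers): with std::thread we create the thread and assign the work ourselves; with std::async we describe a task and let the runtime decide how to run it.

    #include <cstddef>
    #include <future>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    long sum(const std::vector<int>& v, std::size_t lo, std::size_t hi) {
        return std::accumulate(v.begin() + lo, v.begin() + hi, 0L);
    }

    int main() {
        std::vector<int> data(1000000, 1);
        std::size_t mid = data.size() / 2;

        // Multithreading: explicit thread creation and work assignment.
        long left = 0;
        std::thread t([&] { left = sum(data, 0, mid); });
        long right = sum(data, mid, data.size());
        t.join();
        std::cout << left + right << '\n';

        // Parallelization: describe a task; the runtime/library maps it to a thread.
        auto task = std::async([&] { return sum(data, 0, mid); });
        std::cout << task.get() + sum(data, mid, data.size()) << '\n';
    }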


Communication Between Threads

Shared memory
Threads concurrently accessing shared variables/objects
If not synchronized, leads to data races and atomicity violations
Synchronization through locks/atomic variables

Message passing
No sharing of memory (hard to enforce between threads)
Message Queues, Channels, Mailboxes, Actors (see the sketch below)
Scales up to inter-process and distributed communications (marshaling)
But usually slower than sharing

Within a process, possibility of passing references through messages


Unintended sharing, races
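
A minimal in-process message queue in C++ (a hypothetical sketch built from a mutex and a condition variable, not a specific library): values are moved or copied through the channel, so the two threads never share them. Sending raw pointers or references through it would quietly reintroduce the sharing and the races mentioned above.

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    // A tiny channel that transfers messages by value.
    template <typename T>
    class Channel {
    public:
        void send(T msg) {
            {
                std::lock_guard<std::mutex> lock(mtx_);
                q_.push(std::move(msg));
            }
            cv_.notify_one();
        }
        T receive() {
            std::unique_lock<std::mutex> lock(mtx_);
            cv_.wait(lock, [this] { return !q_.empty(); });
            T msg = std::move(q_.front());
            q_.pop();
            return msg;
        }
    private:
        std::mutex mtx_;
        std::condition_variable cv_;
        std::queue<T> q_;
    };

    int main() {
        Channel<std::string> chan;
        std::thread producer([&] { chan.send("hello"); });
        std::cout << chan.receive() << '\n';   // the message is moved, not shared
        producer.join();
    }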


Races

Conflict
Two or more threads accessing the same memory location, at least one of them writing (others may be reading or writing)

Data Race
A conflict with no intervening synchronization
In most cases synchronization by locking (Java synchronized)
Lock both reads and writes with the same lock

Atomic variables (Java volatile) used in lock-free programming


To be left to gurus

    // Thread 1
    ready = true;

    // Thread 2
    if (ready) ...;
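
In C++ terms (a hypothetical sketch; the slide does not fix a language), the plain flag above is a data race. Declaring it atomic removes the race and, with the default sequentially consistent ordering, also makes the payload it publishes visible:

    #include <atomic>
    #include <iostream>
    #include <thread>

    std::atomic<bool> ready{false};   // atomic: concurrent read/write is not a data race
    int payload = 0;                  // plain variable, published via 'ready'

    int main() {
        std::thread t1([] {
            payload = 42;             // write the data first...
            ready.store(true);        // ...then publish (default: seq_cst)
        });
        std::thread t2([] {
            while (!ready.load()) {}  // spin until published
            std::cout << payload << '\n';   // guaranteed to print 42
        });
        t1.join();
        t2.join();
    }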


Atomicity Violations

Updating more than one location


Momentarily breaking an invariant (inconsistent state)

Lock the related locations with one lock (see the sketch below)


For the whole duration of access (both read and write)

Reading and modifying the same location

    // Class SinglyLinkedList with member head
    synchronized void AddLink() {
        Node node = new Node();
        node.setNext(head);   // read head
        head = node;          // modify head
    }
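
A minimal C++ sketch of the "one lock for the related locations" rule (hypothetical example): the invariant from + to == 100 spans two locations, the lock is held for the whole read-modify-write, and readers take the same lock.

    #include <cassert>
    #include <mutex>
    #include <thread>

    // Invariant: from + to always sum to 100.
    struct Accounts {
        std::mutex mtx;
        int from = 100;
        int to   = 0;

        void transfer(int amount) {
            std::lock_guard<std::mutex> lock(mtx);   // one lock guards both locations
            from -= amount;                          // invariant broken here...
            to   += amount;                          // ...and restored before unlocking
        }

        int total() {
            std::lock_guard<std::mutex> lock(mtx);   // readers take the same lock
            return from + to;
        }
    };

    int main() {
        Accounts acc;
        std::thread t([&] {
            for (int i = 0; i < 1000; ++i) { acc.transfer(1); acc.transfer(-1); }
        });
        for (int i = 0; i < 1000; ++i)
            assert(acc.total() == 100);              // never sees the broken invariant
        t.join();
    }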

Relaxed Memory Models

Processors don't have a consistent view of memory


Each processor has a local cache
Relaxed guarantees about write propagation between caches
Special instructions (LOCK, memory fences) to control consistency

Reflected in modern languages


Atomic variables (Java volatile)
Atomic operations (intrinsics)
Fences
Higher synchronization primitives (locks, etc.) built on top
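
A minimal C++11 sketch of these building blocks (hypothetical example): an atomic variable, an atomic read-modify-write operation, and a pair of explicit fences that order a write to plain memory.

    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<int>  counter{0};     // atomic variable
    int               data = 0;       // plain memory, ordered by the fences below
    std::atomic<bool> flag{false};

    int main() {
        std::thread writer([] {
            counter.fetch_add(1, std::memory_order_relaxed);       // atomic operation (RMW)
            data = 42;
            std::atomic_thread_fence(std::memory_order_release);   // order the write above
            flag.store(true, std::memory_order_relaxed);
        });
        std::thread reader([] {
            while (!flag.load(std::memory_order_relaxed)) {}
            std::atomic_thread_fence(std::memory_order_acquire);   // pairs with the release fence
            assert(data == 42);       // the fence pair makes the write visible
        });
        writer.join();
        reader.join();
    }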


Sequential Consistency
    // Two threads interleaving the same (unsynchronized) push loop:
    while (1) {
        tmp = stk->top;
        node->next = tmp;
        if (stk->top == tmp) {
            stk->top = node;
            break;
        }
    }

The way we instinctively reason about threads


Interleaving of threads: potentially different for each execution
What value does a read see?
The last value written by any thread
Last in a particular interleaving
Enough to consider all possible interleavings to prove correctness


DRF Guarantee
Modern multicores/languages break sequential consistency
The usual reasoning about threads fails
Initially: x = 0, y = 0

    Core 1        Core 2
    x = 1         y = 1
    r1 = y        r2 = x

Possible outcome: r1 == 0, r2 == 0 (on x86!)

The DRF guarantee
A data-race-free program is sequentially consistent
True in C++, as long as no weak atomics are used
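
A minimal C++ rendering of the litmus test above (hypothetical sketch): with std::atomic and the default sequentially consistent ordering the program is data race free, so the DRF guarantee applies and the outcome r1 == 0 && r2 == 0 is impossible. With plain ints it would be a data race, and with weak (relaxed) atomics the guarantee no longer holds.

    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<int> x{0}, y{0};      // plain ints here would make this a data race

    int main() {
        int r1 = -1, r2 = -1;
        std::thread core1([&] { x.store(1); r1 = y.load(); });   // default: seq_cst
        std::thread core2([&] { y.store(1); r2 = x.load(); });
        core1.join();
        core2.join();
        assert(!(r1 == 0 && r2 == 0));   // ruled out by sequential consistency
    }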


Risks of Concurrency

Exponential number of interleavings


Hard to reason about
Hard to test exhaustively

Concurrency bugs (races, atomicity violations, deadlocks)


Hard to find
Hard to reproduce
Hard to find the cause


Testing and Debugging

Static analysis:
Requires complete source code
Lots of false positives
Can only reason about known primitives

Dynamic:
Slows down execution
Poor coverage

Corensic Jinx:
Slowdown manageable (analysis at random intervals)
Coverage of an interesting subset of executions


Do You Have Any Questions?

Resources
blog.corensic.com

More about Jinx


corensic.com
