
Easy Data Parallelism

Richard Warburton
Raoul-Gabriel Urma
Overview
● Why is parallelism important?

● What is data parallelism?

● Parallelising your Streams

● Performance and Internals


Why is Parallelism important?
[Chart: CPU clock speeds have plateaued while transistor counts keep rising - source: http://www.gotw.ca/images/CPU.png]

Multicore
What is Data Parallelism?
Concurrency is not Parallelism!
● Concurrency
○ At least two threads are making progress
○ May not run at the same time
○ Eg: Chrome and Eclipse both running

● Parallelism
○ At least two threads are executing simultaneously
○ A specific case of concurrency
○ Eg: servlet container dealing with two users at
once on a multicore machine
Parallelism
● Task
○ Distribute tasks over different processes
○ Threads and Executors in Java
○ Eg: each thread services a user in a JEE app

● Data
○ Distribute data over different processes
○ Support built on top of Streams
○ Eg: process a payroll and give each core 100 employees’ salaries
What are good data parallel problems?
● Big Batch Jobs

○ Transaction Processing

○ Analytics/Reporting

● Web crawlers / parsers

● Maths

○ Monte Carlo Simulations

○ Linear Algebra
What’s a good data parallel problem from your workplace?
Parallelising your Streams
Data Parallelism
● Useful when
○ you have a lot of data
○ you want to process it in a similar way

● API aims to be explicit, but unobtrusive


○ .parallelStream()
○ .parallel()

● Can flip between sequential and parallel (see the sketch below)
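
A minimal sketch (the "numbers" list is illustrative, not from the deck) of the two entry points, and of flipping one pipeline between modes; the last parallel()/sequential() call before the terminal operation wins:

import java.util.Arrays;
import java.util.List;

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

// parallel straight from the collection
int sum1 = numbers.parallelStream()
                  .mapToInt(i -> i)
                  .sum();

// start sequential, flip to parallel, then back again: the last call
// wins, so this pipeline actually runs sequentially
int sum2 = numbers.stream()
                  .parallel()
                  .sequential()
                  .mapToInt(i -> i)
                  .sum();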


Data Parallelism

// Replace stream() with parallelStream()


Set<String> origins = musicians
.parallelStream()
.filter(artist -> artist.getName().startsWith("The"))
.map(artist -> artist.getNationality())
.collect(toSet());
Not all serial code works in parallel.
DON’T interfere with data sources

// Add the double of each value to the list.

List<Integer> numbers = getNumbers();

numbers.parallelStream()
.forEach(i -> numbers.add(i * 2));
Interfering with data sources: fixed

// Build a new list containing each value and its double.

List<Integer> numbers = getNumbers();

numbers = numbers.parallelStream()
.flatMap(i -> Stream.of(i, i * 2))
.collect(toList());
DON’T misuse reduce

int totalCost(List<Purchase> items) {
    // BROKEN: DELIVERY_FEE is not an identity value, so each parallel
    // chunk can start from it and the fee gets counted more than once
    return items.parallelStream()
                .reduce(DELIVERY_FEE,
                        (tally, item) -> tally + item.cost(),
                        Integer::sum);
}
Associativity

“you can regroup the operations (move the brackets) and the result stays the same”

(4 + 2) + 1 = 4 + (2 + 1) = 7
(4 * 2) * 1 = 4 * (2 * 1) = 8
Identity

“the do nothing value”

0 + 5 = 5
1 * 5 = 5
How to fix reduce

int totalCost(List<Purchase> items) {
    // 0 is a true identity for +, and the fee is added exactly once,
    // outside the parallel reduce
    return DELIVERY_FEE
         + items.parallelStream()
                .reduce(0,
                        (tally, item) -> tally + item.cost(),
                        Integer::sum);
}
How to fix reduce (2)

int totalCost(List<Purchase> items) {
    // Simpler still: map to an IntStream and let sum() do the reduction
    return DELIVERY_FEE
         + items.parallelStream()
                .mapToInt(Purchase::cost)
                .sum();
}
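
A small runnable sketch (not from the original deck; the numbers are illustrative) showing why the seed matters: with four costs of 10 and a fee of 5 the correct total is 45, but seeding reduce with the fee lets each parallel chunk count it again.

import java.util.Arrays;
import java.util.List;

public class ReduceSeedDemo {

    static final int DELIVERY_FEE = 5;

    public static void main(String[] args) {
        List<Integer> costs = Arrays.asList(10, 10, 10, 10);

        // Broken: the fee is not an identity, so each parallel chunk may
        // start from it; the answer depends on how the work gets split
        int broken = costs.parallelStream()
                          .reduce(DELIVERY_FEE, Integer::sum);

        // Fixed: reduce with the true identity (0), add the fee once
        int fixed = DELIVERY_FEE + costs.parallelStream()
                                        .mapToInt(Integer::intValue)
                                        .sum();

        System.out.println("broken = " + broken);  // often more than 45
        System.out.println("fixed  = " + fixed);   // always 45
    }
}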
DON’T hold locks

List<Integer> values = getValues();


CountDownLatch latch = new CountDownLatch(values.size());

values.parallelStream()
      .forEach(i -> {
          try {
              doSomething(i);
              // Potential deadlock: e.g. if doSomething throws, countDown()
              // never runs and anyone awaiting the latch blocks forever
              latch.countDown();
          } catch (Exception e) {
              e.printStackTrace();
          }
      });
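
A minimal lock-free alternative (a sketch reusing the getValues/doSomething names from the slide): the terminal forEach already blocks until every element has been processed, so no latch or other blocking synchroniser is needed inside the pipeline.

List<Integer> values = getValues();

values.parallelStream()
      .forEach(i -> {
          try {
              doSomething(i);
          } catch (Exception e) {
              e.printStackTrace();
          }
      });

// Reaching this point means doSomething has run for every value,
// with no worker ever blocked on a latch.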
No mutable state!
public static long sideEffectParallelSum(long n) {
Accumulator accumulator = new Accumulator();
LongStream.rangeClosed(1,n).parallel()
.forEach(accumulator::add);
return accumulator.total;
}

public static class Accumulator {

    private long total = 0;

    public void add(long value) {
        total += value;   // not atomic: concurrent adds race and lose updates
    }
}
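
A minimal fix sketch (not from the deck): express the same sum without any shared mutable state, so the parallel run has nothing to race on.

public static long parallelSum(long n) {
    // the whole reduction happens inside the stream; no shared accumulator
    return LongStream.rangeClosed(1, n)
                     .parallel()
                     .sum();
}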
Parallel Code Summary
● Very easy to make your code parallel,

but …

● Sometimes you can get away with things


sequentially that you can’t in parallel
○ sources
○ reduce
○ locks
○ unprotected mutable data
Performance and Internals
Under the hood

● Work distributed using Fork/Join framework

● Distributed by data

● New abstraction: Spliterator


Parallel Integer Sums

int sum =
values.parallelStream()
.mapToInt(i -> i)
.sum();
Spliterator
public interface Spliterator<T> {

    /** Carve off a portion of the data into a separate Spliterator */
    Spliterator<T> trySplit();

    /** Iterate the data described by this Spliterator */
    void forEachRemaining(Consumer<? super T> action);

    /** The size of the data described by this Spliterator, if known */
    long getExactSizeIfKnown();
}
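
A minimal sketch (not from the deck) of a Spliterator doing its job by hand: trySplit() carves an array-backed source into two halves that can then be processed independently.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class SpliteratorDemo {
    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));

        Spliterator<Integer> right = values.spliterator();
        // carve off the first half of the data into a new Spliterator
        Spliterator<Integer> left = right.trySplit();

        System.out.println(left.getExactSizeIfKnown());   // 4
        System.out.println(right.getExactSizeIfKnown());  // 4

        left.forEachRemaining(i -> System.out.print(i + " "));   // 1 2 3 4
        System.out.println();
        right.forEachRemaining(i -> System.out.print(i + " "));  // 5 6 7 8
    }
}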
Always a tradeoff ...
● Parallelism eats more CPU time
○ Thread communication
○ Distributing & Decomposing work
○ Potentially increased memory pressure
○ Competing for the CPU with other processes

● It can reduce wall time


○ Time from beginning to end of the processes’
execution
○ Ideally only need to wait for 1/N of the execution
time
Decomposition Performance
● Data Size

● Source Data Structure

● Packing

● Number of Cores

● Cost per Element


Data Structures
● Good
○ ArrayList / IntStream.range / Stream.of
○ Random access + easy to balance
● Meh
○ HashSet / TreeSet
○ Usually good balance
● Bad
○ LinkedList / BufferedReader.lines() / Stream.iterate()
○ Unknown length (see the sketch below)
○ Bad random access performance
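
A minimal sketch (not from the deck) of why source shape matters: an array-backed source reports SIZED, so the framework can split it evenly up front, while Stream.iterate() has unknown length.

import java.util.Arrays;
import java.util.Spliterator;
import java.util.stream.Stream;

public class SourceShapes {
    public static void main(String[] args) {
        Spliterator<Integer> arrayBacked =
                Arrays.asList(1, 2, 3, 4).spliterator();
        Spliterator<Integer> iterated =
                Stream.iterate(0, i -> i + 1).spliterator();

        // SIZED means the length is known, which allows balanced splits
        System.out.println(arrayBacked.hasCharacteristics(Spliterator.SIZED)); // true
        System.out.println(iterated.hasCharacteristics(Spliterator.SIZED));    // false
    }
}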
Stateful Operations
● Stateless
○ no need to keep state when evaluated
○ eg: map, reduce
○ superior parallel decomposition
○ bounded amounts of data

● Stateful
○ accumulate state during evaluation
○ eg: sorted
○ unbounded caching of data
Benchmarking and Testing
● Don’t assume parallel = faster, measure it
● Use jmh (minimal sketch below):
http://openjdk.java.net/projects/code-tools/jmh/

● Best Practices
○ Warmup
○ Repeatability
○ Stop the JIT optimising the benchmark away (e.g. dead-code elimination)
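
A minimal JMH sketch (illustrative class and method names, assuming JMH is on the classpath): JMH drives the warmup and repeated measurement, and returning the result stops the JIT discarding the work as dead code.

import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class SumBenchmark {

    private List<Integer> values;

    @Setup
    public void setUp() {
        values = IntStream.range(0, 1_000_000)
                          .boxed()
                          .collect(Collectors.toList());
    }

    @Benchmark
    public int sequentialSum() {
        return values.stream().mapToInt(i -> i).sum();
    }

    @Benchmark
    public int parallelSum() {
        return values.parallelStream().mapToInt(i -> i).sum();
    }
}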
Summary
Lesson Summary

● Easy to obtain Data Parallelism

● Pick your situation well

● A lot of performance influencers

● Benchmark your parallel code


The End
Exercise
In: com.java_8_training.problems.data_parallelism

1. Look at OptimisationExample
2. Try to improve the performance of this code
3. Measure performance using the benchmark harness
4. Don’t make the code uglier!
Exercise
In: com.java_8_training.problems.data_parallelism

1. Parallelise the sum of squares method - Question1Test

2. Fix the bug in the "multiplyThrough" method - Question2Test

3. Remove the locks and keep the code safe - Question3Test
Amdahl’s Law
● Defines upper bound for parallel speedup

● Time(n) = Time(1) * (s + 1/n * (1 - s))


○ n = number of cores
○ s = proportion of code that is strictly serial

● Speedup(n) = 1 / (s + 1/n * (1 - s))

● Example (worked through in the sketch below)
○ 1024 cores, 50% serial
○ 1 / (0.5 + 1/1024 * (1 - 0.5)) ~= 2x speedup
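
A small sketch (not from the deck) plugging a few core counts into the formula, with s = 0.5 as in the example above:

public class Amdahl {

    // Speedup(n) = 1 / (s + (1 - s) / n)
    static double speedup(int n, double s) {
        return 1.0 / (s + (1.0 - s) / n);
    }

    public static void main(String[] args) {
        double s = 0.5;
        for (int n : new int[] {2, 4, 1024}) {
            System.out.printf("n = %4d -> speedup ~ %.2fx%n", n, speedup(n, s));
        }
        // prints roughly 1.33x, 1.60x and 2.00x: with half the work serial,
        // even 1024 cores cannot get past a 2x speedup
    }
}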
