You are on page 1of 43

# 15-150 Fall 2014

Lecture 7
Stephen Brookes

Announcements
Homework 3 due...

so far
sorting for integer lists

specifications and correctness

work?
span?

Work
work is number of evaluation steps,
assuming a sequential processor

## For a recursive function we can derive a

recurrence for its applicative work

## Count basic arithmetic steps, etc

For dependent sub-expressions, add the work
dependent: one must be evaluated before the other
e1 + e 2

if e0 then e1 elsee2

Span
span is the number of evaluation steps,
assuming unlimited parallelism

## For a recursive function we can derive a

recurrence for its applicative span

## Count basic arithmetic steps, etc

For dependent sub-expressions, add the span

For independent sub-expressions, max the span
independent: neither depends on the other
so can be evaluated in parallel

mergesort
msort : int list -> int list

fun msort [ ] = [ ]

| msort [x] = [x]

| msort L = let

val (A, B) = split L

in

merge(msort A, msort B)

end

work to split
fun split [ ] = ([ ], [ ])
| split [x] = ([x], [ ])
| split (x::y::L) =
let val (A, B) = split L in (x::A, y::B) end
Let Wsplit(n) = work of split(L) when length(L)=n
Wsplit(n) = c0
for n=0, 1

Wsplit(n) = c1 + Wsplit(n-2)
for n>1
for some constants c0, c1
Closed form?

Wsplit(n) is O(n)

work to merge
fun merge (A, [ ]) = A

| merge ([ ], B) = B
| merge (x::A, y::B) = case compare(x, y) of

LESS => x :: merge(A, y::B)
| EQUAL => x::y::merge(A, B)
| GREATER => y :: merge(x::A, B)

## Let Wmerge(n) = work of merge(A,B)

when length(A)+length(B)=n
Wmerge(n) is O(n)

work to sort
fun msort [ ] = [ ] | msort [x] = [x]
| msort L = let val (A, B) = split L in
merge (msort A, msort B) end
When length(L)=n>1, A and B have length n div 2

Wmsort(n) = c0

for n=0,1

## Wmsort(n) = Wsplit(n) + 2Wmsort(n div 2) + Wmerge(n) + c1

= O(n) + 2Wmsort(n div 2)
for n>1
Wmsort(n) is O(n log n)

assessment
msort(L) does O(n log n) work,
where n is the length of L

## List operations are inherently sequential

e ::e evaluates e first, then e

split and merge are not easily parallelizable

We could use parallel evaluation in msort(L)
1

## msort L = let val (A, B) = split L in

merge (msort A, msort B) end
independent
work
For dependent sub-expressions, add the work

## Wmsort(n) = Wsplit(n) + 2Wmsort(n div 2) + Wmerge(n)

span
For dependent sub-expressions, add the span

For independent sub-expressions, max the span
Smsort(n) = Ssplit(n) + Smsort(n div 2) + Smerge(n)

really?
Actually in SML tuples are sequential

(e , e ) evaluates e first, then e
1

## To exploit parallel evaluation

we should really use sequences,

a data structure with parallel ops

(coming soon)

## work and span

Wmsort(n) = O(n) + 2Wmsort(n div 2)

work

## Wmsort(n) is O(n log n)

Smsort(n) = O(n) + Smsort(n div 2)
Smsort(n) is O(n)

span

## O(n) O(n log n)

in principle, we can speed up msort

by a factor of log n

using parallel processing
To do even better, must use a different data structure

facts
span, span, work, eggs, bacon and span
work is the number of evaluation steps,
assuming sequential processing

span is the number of evaluation steps,
assuming unlimited parallelism

## . span is always work

. For sequential code, span = work

next
Trees offer better support

for parallel evaluation

Sorting a tree

Specifications and proofs

Asymptotic analysis
Insertion
Parallel Mergesort

trees
datatype tree = Empty | Node of tree * int * tree

## A user-defined type named tree

With constructors Empty and Node
Empty : tree

Node : tree * int * tree -> tree

tree values

An inductive definition

## Every tree value is either Empty

or Node(t1, x, t2), where t1 and t2 are tree values
and x is an integer.
Contrast with integer lists:
Every integer list value is either nil
or x::L, where L is an integer list value
and x is an integer.

tree patterns
Empty
Node(p1, p, p2)
pattern

what it matches

Empty

empty tree

Node(_, _, _)

non-empty tree

Node(Empty, _, Empty)

Node(_, 42, _)

## tree with 42 at root

tree patterns
Empty matches t iff t is Empty
Node(p1, p, p2) matches t iff

t is Node(t1, v, t2) such that

p1 matches t1, p matches v, p2 matches t2
and combines all the bindings

Node(A, x, B) matches 3
4

and binds x to 3,

A to Node(Empty,4,Empty)

B to Node(Empty,2,Empty)

## structural induction for trees

To prove: For all tree values t, P(t) holds
by
structural
induction
on
t
!

Inductive case:

## Assume Induction Hypothesis: P(t1) and P(t2).

Prove that, for all integer values x,
P(Node(t1, x, t2)) holds.

## Thats enough! Why?

Contrast with

structural induction for lists

size
fun size Empty = 0

| size (Node(t1, _, t2)) = size t1 + size t2 + 1
Uses tree patterns
Recursion is structural

size(Node(t1, v, t2))

calls size(t1) and size(t2)

## Easy to prove by structural induction

that for all tree values t,

size(t) = a non-negative integer
size(t) = number of nodes

size matters
For all trees t, size(t)0.

If t = Node(t , x, t ),
1

## Can use induction on size to prove

properties or analyze efficiency.

sorted trees
An inductive definition

Empty is sorted

Node(t1, x, t2) is sorted iff
say
t1 x t2

every integer in t1 is x,
every integer in t2 is x,
t1 and t2 are sorted
42
.

42
.

42
81
. .

14
81
. .

3. 14
57
99
. . .

3. 42
57
99
. . .

sorting a tree
If the tree is Empty, do nothing

Otherwise

## (recursively) sort the two children, then

merge the sorted children, then
insert the root value

## Well design helpers to insert and merge

merge will also need a helper

to split a tree in two

tree insertion
Ins : int * tree -> tree
REQUIRES t is a sorted tree
ENSURES Ins(x,t) is a sorted tree
consisting of x and all of t
fun Ins (x, Empty) = Node(Empty, x, Empty)

| Ins (x, Node(t1, y, t2)) =

case compare(x, y) of

GREATER => Node(t1, y, Ins(x, t2))

|
_
=> Node(Ins(x, t1), y, t2)
(contrast with list insertion)

value
equations
!
Ins(x, Empty) = Node(Empty, x, Empty)

!

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy

## These equations hold,

for all integer values x, y

and all tree values t1, t2
By definition of Ins

value
equations
!
Ins(x, Empty) = Node(Empty, x, Empty)

!

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.

6
2

5
. .

.
.

## These equations hold,

for all integer values x, y

and all tree values t1, t2
By definition of Ins

value
equations
!

## These equations hold,

for all integer values x, y

and all tree values t1, t2

!

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.
.

6
2

5
. .

.
.

Ins(4,

2
. . .

6)

.
.

By definition of Ins

value
equations
!

## These equations hold,

for all integer values x, y

and all tree values t1, t2

## Ins(x, Empty) = Node(Empty, x, Empty)

!

By definition of Ins

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.
.

6
2

5
. .

.
.

Ins(4,

2
. . .

6)

1
.

2 Ins(4 ,5 ) .
. . . .

value
equations
!

## These equations hold,

for all integer values x, y

and all tree values t1, t2

!

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.
.

6
2

5
. .

.
.

Ins(4,

2
. . .

6)

.
.

By definition of Ins

value
equations
!

## These equations hold,

for all integer values x, y

and all tree values t1, t2

## Ins(x, Empty) = Node(Empty, x, Empty)

!

By definition of Ins

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.
.

6
2

5
. .

.
.

Ins(4,

2
. . .

=
6)

1
.

6
2

5
. 4 .
. .

value
equations
!

## These equations hold,

for all integer values x, y

and all tree values t1, t2

## Ins(x, Empty) = Node(Empty, x, Empty)

!

By definition of Ins

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.

6
2

5
. .

1
.

6
2

5
. 4 .
. .

value
equations
!

## These equations hold,

for all integer values x, y

and all tree values t1, t2

## Ins(x, Empty) = Node(Empty, x, Empty)

!

By definition of Ins

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.

6
2

5
. .

1
.

6
2

5
. 4 .
. .

value
equations
!

## These equations hold,

for all integer values x, y

and all tree values t1, t2

## Ins(x, Empty) = Node(Empty, x, Empty)

!

By definition of Ins

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.

6
2

5
. .

1
.

6
2

5
. 4 .
. .

value
equations
!

## These equations hold,

for all integer values x, y

and all tree values t1, t2

## Ins(x, Empty) = Node(Empty, x, Empty)

!

By definition of Ins

## Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y

= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,

1
.

6
2

5
. .

T sorted

1
.

6
2

. 4 .
. .
Ins(4, T) sorted

tree merge
Merge : tree * tree -> tree

## REQUIRES t1 and t2 are sorted trees

ENSURES Merge(t1, t2) = a sorted tree

consisting of the items of t1 and t2
Merge (Node(l1,x,r1), t2) = ???!

## We could split t2 into two subtrees (l2, r2),

then do Node(Merge(l1, l2), x, Merge(r1, r2))
But we need to stay sorted and not lose any data
So split should use the value of x

and build (l2, r2) so that l2 x r2 and

tree split
SplitAt : int * tree -> tree * tree

## REQUIRES t is a sorted tree

ENSURES SplitAt(x, t) =

a pair of sorted trees (u1, u2) such that

t consists of the items in u1, u2

and u1 x u2
Not completely specific,
but thats OKAY!

Plan
Define SplitAt(t) using structural recursion

## SplitAt(x, Node(t , y, t )) should

compare x and y

call SplitAt(x, -) on a subtree

build the result
1

SplitAt
SplitAt : int * tree -> tree * tree
REQUIRES t is a sorted tree
ENSURES SplitAt(x, t) = a pair of sorted trees (u1, u2)

such that u1 x u2 and t consists of the items in u1, u2
fun SplitAt(x, Empty) = (Empty, Empty)
|

## SplitAt(x, Node(t1, y, t2)) =

case compare(y, x) of
GREATER =>
let val (l1, r1) = SplitAt(x, t1) in (l1, Node(r1, y, t2)) end
| _ =>
let val (l2, r2) = SplitAt(x, t2) in (Node(t1, y, l2), r2) end

Merge
Merge : tree * tree -> tree

## REQUIRES t1 and t2 are sorted trees

ENSURES Merge(t1, t2) = a sorted tree

consisting of the items of t1 and t2
fun Merge (Empty, t2) = t2
| Merge (Node(l1,x,r1), t2) = !
let!
val (l2, r2) = SplitAt(x, t2)!
in
Node(Merge(l1, l2), x, Merge(r1, r2))
end

(as we promised!)

Msort
Msort : tree -> tree
REQUIRES true

ENSURES Msort(t) = a sorted tree

consisting of the items of t

|

## Msort (Node(t1, x, t2)) =

Ins (x, Merge(Msort t1, Msort t2))

(as we promised)

Correct?
Q: How to prove that Msort is correct?
A: Use structural induction.

## First prove that the helper functions

Merge, SplitAt, Ins are correct.
Again use structural induction.

See
lecture
notes

## The helper specs were carefully chosen to

make the proof of Msort straightforward.

(An easy structural induction, using the proven facts about helpers.)

Notes
Ins(x,t) is inductive on (structure of) t

SplitAt(x, t) is inductive on t

Merge(t , t ) is inductive on t

Msort(t) is inductive on t
1

## look back at the code

and see why this is true