15150 Fall 2014
Lecture 7
Stephen Brookes
Announcements
Read the notes!
Homework 3 due...
so far
sorting for integer lists
specifications and correctness
but what about efficiency?
work?
span?
Work
work is number of evaluation steps,
assuming a sequential processor
For a recursive function we can derive a
recurrence for its applicative work
Count basic arithmetic steps, etc
For dependent subexpressions, add the work
dependent: one must be evaluated before the other
e1 + e 2
if e0 then e1 elsee2
Span
span is the number of evaluation steps,
assuming unlimited parallelism
For a recursive function we can derive a
recurrence for its applicative span
Count basic arithmetic steps, etc
For dependent subexpressions, add the span
For independent subexpressions, max the span
independent: neither depends on the other
so can be evaluated in parallel
mergesort
msort : int list > int list
fun msort [ ] = [ ]
 msort [x] = [x]
 msort L = let
val (A, B) = split L
in
merge(msort A, msort B)
end
work to split
fun split [ ] = ([ ], [ ])
 split [x] = ([x], [ ])
 split (x::y::L) =
let val (A, B) = split L in (x::A, y::B) end
Let Wsplit(n) = work of split(L) when length(L)=n
Wsplit(n) = c0
for n=0, 1
Wsplit(n) = c1 + Wsplit(n2)
for n>1
for some constants c0, c1
Closed form?
Wsplit(n) is O(n)
work to merge
fun merge (A, [ ]) = A
 merge ([ ], B) = B
 merge (x::A, y::B) = case compare(x, y) of
LESS => x :: merge(A, y::B)
 EQUAL => x::y::merge(A, B)
 GREATER => y :: merge(x::A, B)
Let Wmerge(n) = work of merge(A,B)
when length(A)+length(B)=n
Wmerge(n) is O(n)
work to sort
fun msort [ ] = [ ]  msort [x] = [x]
 msort L = let val (A, B) = split L in
merge (msort A, msort B) end
When length(L)=n>1, A and B have length n div 2
Let Wmsort(n) = work of msort(L) when length(L)=n
Wmsort(n) = c0
for n=0,1
Wmsort(n) = Wsplit(n) + 2Wmsort(n div 2) + Wmerge(n) + c1
= O(n) + 2Wmsort(n div 2)
for n>1
Wmsort(n) is O(n log n)
assessment
msort(L) does O(n log n) work,
where n is the length of L
List operations are inherently sequential
e ::e evaluates e first, then e
split and merge are not easily parallelizable
We could use parallel evaluation in msort(L)
1
for the recursive calls to msort A and msort B
How would this affect runtime?
msort L = let val (A, B) = split L in
merge (msort A, msort B) end
independent
work
For dependent subexpressions, add the work
Wmsort(n) = Wsplit(n) + 2Wmsort(n div 2) + Wmerge(n)
span
For dependent subexpressions, add the span
For independent subexpressions, max the span
Smsort(n) = Ssplit(n) + Smsort(n div 2) + Smerge(n)
really?
Actually in SML tuples are sequential
(e , e ) evaluates e first, then e
1
To exploit parallel evaluation
we should really use sequences,
a data structure with parallel ops
e1, e2 allows parallel evaluation
(coming soon)
work and span
Wmsort(n) = O(n) + 2Wmsort(n div 2)
work
Wmsort(n) is O(n log n)
Smsort(n) = O(n) + Smsort(n div 2)
Smsort(n) is O(n)
span
O(n) O(n log n)
in principle, we can speed up msort
by a factor of log n
using parallel processing
To do even better, must use a different data structure
facts
span, span, work, eggs, bacon and span
work is the number of evaluation steps,
assuming sequential processing
span is the number of evaluation steps,
assuming unlimited parallelism
. span is always work
. For sequential code, span = work
next
Trees offer better support
for parallel evaluation
Sorting a tree
Specifications and proofs
Asymptotic analysis
Insertion
Parallel Mergesort
trees
datatype tree = Empty  Node of tree * int * tree
A userdefined type named tree
With constructors Empty and Node
Empty : tree
Node : tree * int * tree > tree
tree values
An inductive definition
Every tree value is either Empty
or Node(t1, x, t2), where t1 and t2 are tree values
and x is an integer.
Contrast with integer lists:
Every integer list value is either nil
or x::L, where L is an integer list value
and x is an integer.
tree patterns
Empty
Node(p1, p, p2)
pattern
what it matches
Empty
empty tree
Node(_, _, _)
nonempty tree
Node(Empty, _, Empty)
tree with one node
Node(_, 42, _)
tree with 42 at root
tree patterns
Empty matches t iff t is Empty
Node(p1, p, p2) matches t iff
t is Node(t1, v, t2) such that
p1 matches t1, p matches v, p2 matches t2
and combines all the bindings
Node(A, x, B) matches 3
4
and binds x to 3,
A to Node(Empty,4,Empty)
B to Node(Empty,2,Empty)
structural induction for trees
To prove: For all tree values t, P(t) holds
by
structural
induction
on
t
!
Base case: Prove P(Empty).
Inductive case:
Assume Induction Hypothesis: P(t1) and P(t2).
Prove that, for all integer values x,
P(Node(t1, x, t2)) holds.
Thats enough! Why?
Contrast with
structural induction for lists
size
fun size Empty = 0
 size (Node(t1, _, t2)) = size t1 + size t2 + 1
Uses tree patterns
Recursion is structural
size(Node(t1, v, t2))
calls size(t1) and size(t2)
Easy to prove by structural induction
that for all tree values t,
size(t) = a nonnegative integer
size(t) = number of nodes
size matters
For all trees t, size(t)0.
If t = Node(t , x, t ),
1
size(t1)<size(t) and size(t2)<size(t).
Many recursive functions on trees make
recursive calls on trees with smaller size.
Can use induction on size to prove
properties or analyze efficiency.
sorted trees
An inductive definition
Empty is sorted
Node(t1, x, t2) is sorted iff
say
t1 x t2
every integer in t1 is x,
every integer in t2 is x,
t1 and t2 are sorted
42
.
42
.
42
81
. .
14
81
. .
3. 14
57
99
. . .
3. 42
57
99
. . .
sorting a tree
If the tree is Empty, do nothing
Otherwise
(the tree has a root integer and two child subtrees)
(recursively) sort the two children, then
merge the sorted children, then
insert the root value
Well design helpers to insert and merge
merge will also need a helper
to split a tree in two
tree insertion
Ins : int * tree > tree
REQUIRES t is a sorted tree
ENSURES Ins(x,t) is a sorted tree
consisting of x and all of t
fun Ins (x, Empty) = Node(Empty, x, Empty)
 Ins (x, Node(t1, y, t2)) =
case compare(x, y) of
GREATER => Node(t1, y, Ins(x, t2))

_
=> Node(Ins(x, t1), y, t2)
(contrast with list insertion)
value
equations
!
Ins(x, Empty) = Node(Empty, x, Empty)
!
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
These equations hold,
for all integer values x, y
and all tree values t1, t2
By definition of Ins
value
equations
!
Ins(x, Empty) = Node(Empty, x, Empty)
!
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
6
2
5
. .
.
.
These equations hold,
for all integer values x, y
and all tree values t1, t2
By definition of Ins
value
equations
!
These equations hold,
for all integer values x, y
and all tree values t1, t2
Ins(x, Empty) = Node(Empty, x, Empty)
!
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
.
6
2
5
. .
.
.
Ins(4,
2
. . .
6)
.
.
By definition of Ins
value
equations
!
These equations hold,
for all integer values x, y
and all tree values t1, t2
Ins(x, Empty) = Node(Empty, x, Empty)
!
By definition of Ins
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
.
6
2
5
. .
.
.
Ins(4,
2
. . .
6)
1
.
2 Ins(4 ,5 ) .
. . . .
value
equations
!
These equations hold,
for all integer values x, y
and all tree values t1, t2
Ins(x, Empty) = Node(Empty, x, Empty)
!
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
.
6
2
5
. .
.
.
Ins(4,
2
. . .
6)
.
.
By definition of Ins
value
equations
!
These equations hold,
for all integer values x, y
and all tree values t1, t2
Ins(x, Empty) = Node(Empty, x, Empty)
!
By definition of Ins
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
.
6
2
5
. .
.
.
Ins(4,
2
. . .
=
6)
1
.
6
2
5
. 4 .
. .
value
equations
!
These equations hold,
for all integer values x, y
and all tree values t1, t2
Ins(x, Empty) = Node(Empty, x, Empty)
!
By definition of Ins
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
6
2
5
. .
1
.
6
2
5
. 4 .
. .
value
equations
!
These equations hold,
for all integer values x, y
and all tree values t1, t2
Ins(x, Empty) = Node(Empty, x, Empty)
!
By definition of Ins
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
6
2
5
. .
1
.
6
2
5
. 4 .
. .
value
equations
!
These equations hold,
for all integer values x, y
and all tree values t1, t2
Ins(x, Empty) = Node(Empty, x, Empty)
!
By definition of Ins
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
6
2
5
. .
1
.
6
2
5
. 4 .
. .
value
equations
!
These equations hold,
for all integer values x, y
and all tree values t1, t2
Ins(x, Empty) = Node(Empty, x, Empty)
!
By definition of Ins
Ins (x, Node(t1, y, t2)) = Node(t1, y, Ins(x, t2)) if x>y
= Node(Ins(x, t1), y, t2) if xy
Ins(4 ,
1
.
6
2
5
. .
T sorted
1
.
6
2
. 4 .
. .
Ins(4, T) sorted
tree merge
Merge : tree * tree > tree
REQUIRES t1 and t2 are sorted trees
ENSURES Merge(t1, t2) = a sorted tree
consisting of the items of t1 and t2
Merge (Node(l1,x,r1), t2) = ???!
We could split t2 into two subtrees (l2, r2),
then do Node(Merge(l1, l2), x, Merge(r1, r2))
But we need to stay sorted and not lose any data
So split should use the value of x
and build (l2, r2) so that l2 x r2 and
tree split
SplitAt : int * tree > tree * tree
REQUIRES t is a sorted tree
ENSURES SplitAt(x, t) =
a pair of sorted trees (u1, u2) such that
t consists of the items in u1, u2
and u1 x u2
Not completely specific,
but thats OKAY!
Plan
Define SplitAt(t) using structural recursion
SplitAt(x, Node(t , y, t )) should
compare x and y
call SplitAt(x, ) on a subtree
build the result
1
SplitAt
SplitAt : int * tree > tree * tree
REQUIRES t is a sorted tree
ENSURES SplitAt(x, t) = a pair of sorted trees (u1, u2)
such that u1 x u2 and t consists of the items in u1, u2
fun SplitAt(x, Empty) = (Empty, Empty)

SplitAt(x, Node(t1, y, t2)) =
case compare(y, x) of
GREATER =>
let val (l1, r1) = SplitAt(x, t1) in (l1, Node(r1, y, t2)) end
 _ =>
let val (l2, r2) = SplitAt(x, t2) in (Node(t1, y, l2), r2) end
Merge
Merge : tree * tree > tree
REQUIRES t1 and t2 are sorted trees
ENSURES Merge(t1, t2) = a sorted tree
consisting of the items of t1 and t2
fun Merge (Empty, t2) = t2
 Merge (Node(l1,x,r1), t2) = !
let!
val (l2, r2) = SplitAt(x, t2)!
in
Node(Merge(l1, l2), x, Merge(r1, r2))
end
(as we promised!)
Msort
Msort : tree > tree
REQUIRES true
ENSURES Msort(t) = a sorted tree
consisting of the items of t
fun Msort Empty = Empty

Msort (Node(t1, x, t2)) =
Ins (x, Merge(Msort t1, Msort t2))
(as we promised)
Correct?
Q: How to prove that Msort is correct?
A: Use structural induction.
First prove that the helper functions
Merge, SplitAt, Ins are correct.
Again use structural induction.
See
lecture
notes
The helper specs were carefully chosen to
make the proof of Msort straightforward.
(An easy structural induction, using the proven facts about helpers.)
Notes
Ins(x,t) is inductive on (structure of) t
SplitAt(x, t) is inductive on t
Merge(t , t ) is inductive on t
Msort(t) is inductive on t
1
look back at the code
and see why this is true