Lec7 DivideAndConquerII
‣ master theorem
‣ integer multiplication
‣ matrix multiplication
SECTIONS 4.4–4.6
Divide-and-conquer recurrences
$$T(n) = a\,T\!\left(\frac{n}{b}\right) + f(n)$$
Terms.
・$a \ge 1$ is the number of subproblems.
・$b \ge 2$ is the factor by which the subproblem size decreases.
・$f(n) \ge 0$ is the work to divide and combine subproblems.
Recursion tree. [ assuming $n$ is a power of $b$ ]
・$a$ = branching factor.
・$a^i$ = number of subproblems at level $i$.
・$1 + \log_b n$ levels.
・$n / b^i$ = size of subproblem at level $i$.
Divide-and-conquer recurrences: recursion tree
Recursion tree for $f(n) = n^c$. [ assuming $n$ is a power of $b$ ]
Level 0: $T(n)$; cost $n^c$
Level 1: $a$ subproblems $T(n / b)$; cost $a\,(n / b)^c$
Level 2: $a^2$ subproblems $T(n / b^2)$; cost $a^2\,(n / b^2)^c$
⋮
Level $i$: $a^i$ subproblems $T(n / b^i)$; cost $a^i\,(n / b^i)^c$
⋮
Level $\log_b n$: $a^{\log_b n} = n^{\log_b a}$ leaves $T(1)$; cost $n^{\log_b a}$
The tree has $1 + \log_b n$ levels. With $r = a / b^c$, the cost of level $i$ is $n^c r^i$, so
$$T(n) = n^c \sum_{i=0}^{\log_b n} r^i$$
Divide-and-conquer recurrences: recursion tree analysis
Let $r = a / b^c$. Then
$$T(n) = n^c \sum_{i=0}^{\log_b n} r^i =
\begin{cases}
\Theta(n^c) & \text{if } r < 1 \quad (c > \log_b a) \quad \text{cost dominated by cost of root} \\
\Theta(n^c \log n) & \text{if } r = 1 \quad (c = \log_b a) \quad \text{cost evenly distributed in tree} \\
\Theta(n^{\log_b a}) & \text{if } r > 1 \quad (c < \log_b a) \quad \text{cost dominated by cost of leaves}
\end{cases}$$
Geometric series.
・If $0 < r < 1$, then $1 + r + r^2 + r^3 + \cdots + r^k \le 1 / (1 - r)$.
・If $r = 1$, then $1 + r + r^2 + r^3 + \cdots + r^k = k + 1$.
・If $r > 1$, then $1 + r + r^2 + r^3 + \cdots + r^k = (r^{k+1} - 1) / (r - 1)$.
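The level-by-level sum can be checked numerically. The following is a minimal Python sketch, assuming $f(n) = n^c$, $T(1) = 1$, and $n$ an exact power of $b$; the function names `recurrence` and `level_sum` are hypothetical and chosen only for illustration.

```python
def recurrence(a: int, b: int, c: int, n: int) -> int:
    """Evaluate T(n) = a*T(n/b) + n^c directly, with T(1) = 1 (assumed base case)."""
    if n == 1:
        return 1
    return a * recurrence(a, b, c, n // b) + n ** c


def level_sum(a: int, b: int, c: int, n: int) -> int:
    """Add up the recursion-tree costs a^i * (n / b^i)^c, one term per level."""
    total = 0
    level = 0
    size = n
    while True:
        total += a ** level * size ** c  # a^i subproblems, each costing (n / b^i)^c
        if size == 1:                    # reached the leaves
            return total
        size //= b
        level += 1


# Mergesort-like recurrence: a = 2, b = 2, c = 1, so r = 1 and every level costs n.
print(recurrence(2, 2, 1, 8), level_sum(2, 2, 1, 8))
```

Both functions agree for any power of $b$, which is the identity $T(n) = n^c \sum_i r^i$ written out term by term.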
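Master theorem
Master theorem. Let $a \ge 1$, $b \ge 2$, and $c \ge 0$, and suppose that $T(n)$ is a function on the nonnegative integers that satisfies the recurrence
$$T(n) = a\,T(n / b) + \Theta(n^c)$$
with $T(0) = 0$ and $T(1) = \Theta(1)$, where $n / b$ means either $\lfloor n / b \rfloor$ or $\lceil n / b \rceil$. Then
Case 1. If $c < \log_b a$, then $T(n) = \Theta(n^{\log_b a})$.
Case 2. If $c = \log_b a$, then $T(n) = \Theta(n^c \log n)$.
Case 3. If $c > \log_b a$, then $T(n) = \Theta(n^c)$.
Pf sketch.
・Prove when $b$ is an integer and $n$ is an exact power of $b$.
・Extend domain of recurrences to reals (or rationals).
・Deal with floors and ceilings. [ at most 2 extra levels in recursion tree ]
Extensions.
・Can replace Θ with O everywhere.
・Can replace Θ with Ω everywhere.
・Can replace initial conditions with $T(n) = \Theta(1)$ for all $n \le n_0$ and require recurrence to hold only for all $n > n_0$.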
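For recurrences of the form $T(n) = a\,T(n/b) + \Theta(n^c)$, the case analysis reduces to comparing $c$ with $\log_b a$. The sketch below is one way to mechanize that comparison; the function name `master_theorem` and the case numbering (Case 1 when the leaves dominate, matching the general statement with $k = \log_b a$) are assumptions for illustration, and the floating-point comparison uses a small tolerance.

```python
import math


def master_theorem(a: int, b: int, c: float) -> tuple[int, str]:
    """Classify T(n) = a T(n/b) + Theta(n^c) by comparing c with k = log_b a."""
    k = math.log(a) / math.log(b)
    if math.isclose(c, k, abs_tol=1e-12):
        return 2, f"Theta(n^{c:g} log n)"   # cost evenly distributed over levels
    if c < k:
        return 1, f"Theta(n^{k:.3f})"       # cost dominated by the leaves
    return 3, f"Theta(n^{c:g})"             # cost dominated by the root


print(master_theorem(7, 2, 2))  # a Strassen-style recurrence
print(master_theorem(2, 2, 1))  # a mergesort-style recurrence
```

The exponent in Case 1 is printed to three decimals only, so it is an approximation of $\log_b a$, not an exact value.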
Master theorem need not apply
Divide-and-conquer II: quiz 1
$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\ 3\,T(\lceil n / 2 \rceil) + \Theta(n) & \text{if } n > 1 \end{cases}$$
Divide-and-conquer II: quiz 2
$$T(n) = \begin{cases} 0 & \text{if } n \le 1 \\ T(\lfloor n / 5 \rfloor) + T(n - 3\lfloor n / 10 \rfloor) + \frac{11}{5}\,n & \text{if } n > 1 \end{cases}$$
Master theorem
Master theorem. Suppose that $T(n)$ is a function on the nonnegative integers that satisfies the recurrence
$$T(n) = a\,T(n / b) + f(n)$$
where $n / b$ means either $\lfloor n / b \rfloor$ or $\lceil n / b \rceil$. Let $k = \log_b a$. Then
Case 1. If $f(n) = O(n^{k - \varepsilon})$ for some constant $\varepsilon > 0$, then $T(n) = \Theta(n^k)$.
Case 2. If $f(n) = \Theta(n^k \log^p n)$ for some constant $p \ge 0$, then $T(n) = \Theta(n^k \log^{p+1} n)$.
Case 3. If $f(n) = \Omega(n^{k + \varepsilon})$ for some constant $\varepsilon > 0$ and if $a\,f(n / b) \le c\,f(n)$ for some constant $c < 1$ and all sufficiently large $n$, then $T(n) = \Theta(f(n))$.
[ the regularity condition $a\,f(n / b) \le c\,f(n)$ holds if $f(n) = \Theta(n^{k + \varepsilon})$ ]
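As an illustration of Case 2 with $p > 0$, here is a short worked instance. The recurrence is chosen only as an example and does not come from the slides.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + \Theta(n \log n),
  \qquad a = 2,\quad b = 2,\quad k = \log_2 2 = 1, \\
f(n) &= \Theta(n^{k} \log^{p} n) \ \text{with}\ p = 1
  \;\Longrightarrow\;
  T(n) = \Theta(n^{k} \log^{p+1} n) = \Theta(n \log^{2} n).
\end{align*}
```

Setting $p = 0$ in the same case recovers the familiar $\Theta(n \log n)$ bound for $T(n) = 2\,T(n/2) + \Theta(n)$.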
‣ integer multiplication
SECTION 5.5
Integer addition and subtraction
  1 1 1 1 1 1 0 1        [ carries ]
    1 1 0 1 0 1 0 1
+   0 1 1 1 1 1 0 1
  1 0 1 0 1 0 0 1 0
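The column-by-column addition with carries shown in the example can be sketched as follows. This is a minimal illustration, assuming both operands are given as equal-length lists of bits with the most significant bit first; the name `add_bits` is hypothetical.

```python
def add_bits(x: list[int], y: list[int]) -> list[int]:
    """Add two n-bit integers given as bit lists (most significant bit first).

    Returns an (n + 1)-bit result; one pass over the columns, so Theta(n) bit operations.
    """
    assert len(x) == len(y), "operands must have the same number of bits"
    result = []
    carry = 0
    # Walk the columns from least significant to most significant.
    for xi, yi in zip(reversed(x), reversed(y)):
        total = xi + yi + carry
        result.append(total % 2)   # bit written in this column
        carry = total // 2         # carry into the next column
    result.append(carry)           # final carry becomes the leading bit
    return result[::-1]


x = [1, 1, 0, 1, 0, 1, 0, 1]
y = [0, 1, 1, 1, 1, 1, 0, 1]
print(add_bits(x, y))
```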
Integer multiplication
                1 1 0 1 0 1 0 1
              × 0 1 1 1 1 1 0 1
                1 1 0 1 0 1 0 1
              0 0 0 0 0 0 0 0
            1 1 0 1 0 1 0 1
          1 1 0 1 0 1 0 1
        1 1 0 1 0 1 0 1
      1 1 0 1 0 1 0 1
    1 1 0 1 0 1 0 1
  0 0 0 0 0 0 0 0
0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1
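The grade-school method in the example adds one shifted copy of the first operand for each 1 bit of the second. A minimal Python sketch follows, assuming nonnegative operands that fit in `n` bits; the name `multiply_grade_school` is hypothetical. With n partial products of about n bits each, the method uses on the order of $n^2$ bit operations.

```python
def multiply_grade_school(x: int, y: int, n: int) -> int:
    """Multiply two n-bit nonnegative integers by summing shifted partial products."""
    product = 0
    for i in range(n):
        if (y >> i) & 1:        # bit i of the multiplier
            product += x << i   # partial product: x shifted left by i positions
    return product


x = 0b11010101
y = 0b01111101
print(format(multiply_grade_school(x, y, 8), "016b"))
```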
Divide-and-conquer multiplication
$m = \lceil n / 2 \rceil$
$a = \lfloor x / 2^m \rfloor$, $b = x \bmod 2^m$
$c = \lfloor y / 2^m \rfloor$, $d = y \bmod 2^m$
[ use bit shifting to compute 4 terms ]
Ex. $x = 1000\,1101$ with $a = 1000$, $b = 1101$; $y = 1110\,0001$ with $c = 1110$, $d = 0001$.
$$x\,y = (2^m a + b)(2^m c + d) = 2^{2m}\,ac + 2^m (bc + ad) + bd$$
MULTIPLY(x, y, n)
IF (n = 1)
   RETURN x × y.
ELSE
   m ← ⌈n / 2⌉.                                  Θ(n)
   a ← ⌊x / 2^m⌋; b ← x mod 2^m.
   c ← ⌊y / 2^m⌋; d ← y mod 2^m.
   e ← MULTIPLY(a, c, m).                        4 T(⌈n / 2⌉)
   f ← MULTIPLY(b, d, m).
   g ← MULTIPLY(b, c, m).
   h ← MULTIPLY(a, d, m).
   RETURN 2^{2m} e + 2^m (g + h) + f.            Θ(n)
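The MULTIPLY pseudocode translates almost line for line into Python. The sketch below assumes nonnegative operands that fit in `n` bits and uses shifts and masks for the divisions and remainders by $2^m$; the name `multiply` is hypothetical, and Python's built-in big integers are used only to hold the values, not to do the multiplication.

```python
def multiply(x: int, y: int, n: int) -> int:
    """Divide-and-conquer multiplication of two n-bit integers (4 recursive calls)."""
    if n == 1:
        return x * y                  # product of two single bits
    m = (n + 1) // 2                  # m = ceil(n / 2)
    mask = (1 << m) - 1
    a, b = x >> m, x & mask           # a = floor(x / 2^m), b = x mod 2^m
    c, d = y >> m, y & mask           # c = floor(y / 2^m), d = y mod 2^m
    e = multiply(a, c, m)
    f = multiply(b, d, m)
    g = multiply(b, c, m)
    h = multiply(a, d, m)
    # 2^(2m) * e + 2^m * (g + h) + f, with the powers of two done by shifting.
    return (e << (2 * m)) + ((g + h) << m) + f


print(multiply(0b10001101, 0b11100001, 8))
```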
Divide-and-conquer II: quiz 3
How many bit operations to multiply two n-bit integers using the divide-and-conquer multiplication algorithm?
$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\ 4\,T(\lceil n / 2 \rceil) + \Theta(n) & \text{if } n > 1 \end{cases}$$
A. $T(n) = \Theta(n^{1/2})$.
Karatsuba trick
$m = \lceil n / 2 \rceil$
$a = \lfloor x / 2^m \rfloor$, $b = x \bmod 2^m$
$c = \lfloor y / 2^m \rfloor$, $d = y \bmod 2^m$
Ex. $x = 1000\,1101$ with $a = 1000$, $b = 1101$; $y = 1110\,0001$ with $c = 1110$, $d = 0001$.
$$\begin{aligned}
x\,y &= (2^m a + b)(2^m c + d) = 2^{2m}\,ac + 2^m (bc + ad) + bd \\
     &= 2^{2m}\,ac + 2^m \bigl(ac + bd - (a - b)(c - d)\bigr) + bd
\end{aligned}$$
[ $bc + ad$ is the middle term ]
Only 3 distinct products are needed: (1) $ac$, (2) $(a - b)(c - d)$, (3) $bd$.
Karatsuba multiplication
KARATSUBA-MULTIPLY(x, y, n)
IF (n = 1)
   RETURN x × y.
ELSE
   m ← ⌈n / 2⌉.                                       Θ(n)
   a ← ⌊x / 2^m⌋; b ← x mod 2^m.
   c ← ⌊y / 2^m⌋; d ← y mod 2^m.
   e ← KARATSUBA-MULTIPLY(a, c, m).                   3 T(⌈n / 2⌉)
   f ← KARATSUBA-MULTIPLY(b, d, m).
   g ← KARATSUBA-MULTIPLY(|a – b|, |c – d|, m).
   Flip sign of g if needed.
   RETURN 2^{2m} e + 2^m (e + f – g) + f.             Θ(n)
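A Python sketch of KARATSUBA-MULTIPLY follows, under the same assumptions as the pseudocode: nonnegative operands that fit in `n` bits. The name `karatsuba` is hypothetical. The "flip sign of g" step is made explicit: the product $(a - b)(c - d)$ is negative exactly when one of the two differences is negative.

```python
def karatsuba(x: int, y: int, n: int) -> int:
    """Karatsuba multiplication of two n-bit integers (3 recursive calls)."""
    if n == 1:
        return x * y                      # product of two single bits
    m = (n + 1) // 2                      # m = ceil(n / 2)
    mask = (1 << m) - 1
    a, b = x >> m, x & mask               # high and low halves of x
    c, d = y >> m, y & mask               # high and low halves of y
    e = karatsuba(a, c, m)                # a * c
    f = karatsuba(b, d, m)                # b * d
    g = karatsuba(abs(a - b), abs(c - d), m)
    if (a < b) != (c < d):                # exactly one difference is negative
        g = -g                            # so (a - b)(c - d) is negative
    # e + f - g equals ad + bc, the middle term.
    return (e << (2 * m)) + ((e + f - g) << m) + f


print(karatsuba(0b10001101, 0b11100001, 8))
```

Because $|a - b| < 2^m$ and $|c - d| < 2^m$, the third recursive call also receives m-bit operands, which is why the recursion stays at three subproblems of size $\lceil n/2 \rceil$.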
Karatsuba analysis
$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\ 3\,T(\lceil n / 2 \rceil) + \Theta(n) & \text{if } n > 1 \end{cases}$$
By Case 1 of the master theorem, $T(n) = \Theta(n^{\log_2 3}) = O(n^{1.585})$.
Practice.
・Use base 32 or 64 (instead of base 2).
・Faster than grade-school algorithm for about 320–640 bits.
Integer arithmetic reductions
Integer arithmetic problems with the same bit complexity M(n) as integer multiplication.
History of asymptotic complexity of integer multiplication
[ table not recovered; surviving entry: $O(n)$ ]
‣ matrix multiplication
SECTION 4.2
Dot product
$$a \cdot b = \sum_{i=1}^{n} a_i\,b_i$$
Matrix multiplication
$$\begin{bmatrix} .59 & .32 & .41 \\ .31 & .36 & .25 \\ .45 & .31 & .42 \end{bmatrix}
= \begin{bmatrix} .70 & .20 & .10 \\ .30 & .60 & .10 \\ .50 & .10 & .40 \end{bmatrix}
\times \begin{bmatrix} .80 & .30 & .50 \\ .10 & .40 & .10 \\ .10 & .30 & .40 \end{bmatrix}$$
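Each entry of a matrix product is the dot product of a row of the first matrix with a column of the second. The sketch below spells that out on the 3-by-3 example, assuming matrices are stored as lists of rows; the names `dot` and `mat_mul` are hypothetical. It performs $n^2$ dot products of length n, so on the order of $n^3$ arithmetic operations.

```python
def dot(a, b):
    """Dot product of two equal-length vectors: sum of a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def mat_mul(A, B):
    """Multiply matrices stored as lists of rows: C[i][j] = row i of A . column j of B."""
    columns = list(zip(*B))  # transpose B once so its columns are easy to reach
    return [[dot(row, col) for col in columns] for row in A]


A = [[.70, .20, .10], [.30, .60, .10], [.50, .10, .40]]
B = [[.80, .30, .50], [.10, .40, .10], [.10, .30, .40]]
for row in mat_mul(A, B):
    print([round(value, 2) for value in row])
```

The entries are rounded for display because floating-point arithmetic leaves small rounding errors in the last digits.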
Block matrix multiplication
Block matrix multiplication: warmup
$C = A \times B$, where $A$, $B$, and $C$ are n-by-n matrices partitioned into ½n-by-½n blocks:
$$\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
= \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\times \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$$
$C_{11} = (A_{11} \times B_{11}) + (A_{12} \times B_{21})$
$C_{12} = (A_{11} \times B_{12}) + (A_{12} \times B_{22})$
$C_{21} = (A_{21} \times B_{11}) + (A_{22} \times B_{21})$
$C_{22} = (A_{21} \times B_{12}) + (A_{22} \times B_{22})$
・8 matrix multiplications (of ½n-by-½n matrices).
・4 matrix additions (of ½n-by-½n matrices).
Running time. $T(n) = 8\,T(n / 2) + \Theta(n^2) \Rightarrow T(n) = \Theta(n^3)$ by Case 1 of the master theorem, since $\log_2 8 = 3 > 2$.
Strassen’s trick
Key idea. Can multiply two 2-by-2 matrices via 7 scalar multiplications (plus 11 additions and 7 subtractions).
$$\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
= \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\times \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$$
[ entries are scalars ]
P1 ← A11 × (B12 – B22)
P2 ← (A11 + A12) × B22
P3 ← (A21 + A22) × B11
P4 ← A22 × (B21 – B11)
P5 ← (A11 + A22) × (B11 + B22)
P6 ← (A12 – A22) × (B21 + B22)
P7 ← (A11 – A21) × (B11 + B12)
[ 7 scalar multiplications ]
C11 = P5 + P4 – P2 + P6
C12 = P1 + P2
C21 = P3 + P4
C22 = P1 + P5 – P3 – P7
Strassen’s trick
Key idea. Can multiply two n-by-n matrices via 7 ½n-by-½n matrix multiplications (plus 11 additions and 7 subtractions of ½n-by-½n matrices).
The same seven products P1, …, P7 and the same four combinations for C11, C12, C21, C22 apply when A11, …, B22 are ½n-by-½n blocks.
[ 7 matrix multiplications (of ½n-by-½n matrices) ]
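The seven-product identity is easy to check on scalars. The sketch below uses the products P1 through P7 as defined on the slide; the combinations for C21 and C22 are the standard Strassen ones, and the function name `strassen_2x2` is hypothetical. Correctness is checked against the four defining equations of a 2-by-2 product.

```python
def strassen_2x2(A, B):
    """Multiply two 2-by-2 matrices of scalars using 7 multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    p1 = a11 * (b12 - b22)
    p2 = (a11 + a12) * b22
    p3 = (a21 + a22) * b11
    p4 = a22 * (b21 - b11)
    p5 = (a11 + a22) * (b11 + b22)
    p6 = (a12 - a22) * (b21 + b22)
    p7 = (a11 - a21) * (b11 + b12)
    return [
        [p5 + p4 - p2 + p6, p1 + p2],
        [p3 + p4, p1 + p5 - p3 - p7],
    ]


def classical_2x2(A, B):
    """Reference product using the usual 8 multiplications."""
    return [
        [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
        [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
    ]


print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The identity never uses commutativity of multiplication, which is what allows the entries to be replaced by matrix blocks.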
Strassen’s algorithm
STRASSEN(n, A, B)                              [ assume n is a power of 2 ]
IF (n = 1) RETURN A × B.
Partition A and B into ½n-by-½n blocks.
P1 ← STRASSEN(n / 2, A11, (B12 – B22)).
P2 ← STRASSEN(n / 2, (A11 + A12), B22).
P3 ← STRASSEN(n / 2, (A21 + A22), B11).
P4 ← STRASSEN(n / 2, A22, (B21 – B11)).        7 T(n / 2) + Θ(n^2)
P5 ← STRASSEN(n / 2, (A11 + A22), (B11 + B22)).
P6 ← STRASSEN(n / 2, (A12 – A22), (B21 + B22)).
P7 ← STRASSEN(n / 2, (A11 – A21), (B11 + B12)).
C11 ← P5 + P4 – P2 + P6.
C12 ← P1 + P2.                                 Θ(n^2)
C21 ← P3 + P4.
C22 ← P1 + P5 – P3 – P7.
RETURN C.
Analysis of Strassen’s algorithm
Theorem. Strassen’s algorithm requires $O(n^{2.81})$ arithmetic operations to multiply two n-by-n matrices.
Pf.
・When n is a power of 2, apply Case 1 of the master theorem:
$$T(n) = \underbrace{7\,T(n / 2)}_{\text{recursive calls}} + \underbrace{\Theta(n^2)}_{\text{add, subtract}} \;\Rightarrow\; T(n) = \Theta(n^{\log_2 7}) = O(n^{2.81})$$
・When n is not a power of 2, pad the matrices with zeros up to the next power of 2:
$$\begin{bmatrix} 1 & 2 & 3 & 0 \\ 4 & 5 & 6 & 0 \\ 7 & 8 & 9 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\times \begin{bmatrix} 10 & 11 & 12 & 0 \\ 13 & 14 & 15 & 0 \\ 16 & 17 & 18 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
= \begin{bmatrix} 84 & 90 & 96 & 0 \\ 201 & 216 & 231 & 0 \\ 318 & 342 & 366 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
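The recursive algorithm and the zero-padding step can be sketched together in Python. This is an illustrative version only, assuming square matrices stored as lists of rows; the names `strassen`, `multiply_padded`, `add`, and `sub` are hypothetical, and no crossover to the classical algorithm is included. It uses the standard Strassen products and combinations.

```python
def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def strassen(A, B):
    """Multiply two n-by-n matrices, n a power of 2, with 7 recursive calls."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2

    def block(M, i, j):
        # Extract the (i, j) block of size h-by-h.
        return [row[j * h:(j + 1) * h] for row in M[i * h:(i + 1) * h]]

    A11, A12, A21, A22 = block(A, 0, 0), block(A, 0, 1), block(A, 1, 0), block(A, 1, 1)
    B11, B12, B21, B22 = block(B, 0, 0), block(B, 0, 1), block(B, 1, 0), block(B, 1, 1)
    P1 = strassen(A11, sub(B12, B22))
    P2 = strassen(add(A11, A12), B22)
    P3 = strassen(add(A21, A22), B11)
    P4 = strassen(A22, sub(B21, B11))
    P5 = strassen(add(A11, A22), add(B11, B22))
    P6 = strassen(sub(A12, A22), add(B21, B22))
    P7 = strassen(sub(A11, A21), add(B11, B12))
    C11 = add(sub(add(P5, P4), P2), P6)
    C12 = add(P1, P2)
    C21 = add(P3, P4)
    C22 = sub(sub(add(P1, P5), P3), P7)
    # Stitch the four blocks back into one n-by-n matrix.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


def multiply_padded(A, B):
    """Pad n-by-n inputs with zeros to the next power of 2, multiply, then trim."""
    n = len(A)
    size = 1
    while size < n:
        size *= 2

    def pad(M):
        return [row + [0] * (size - n) for row in M] + [[0] * size for _ in range(size - n)]

    C = strassen(pad(A), pad(B))
    return [row[:n] for row in C[:n]]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[10, 11, 12], [13, 14, 15], [16, 17, 18]]
print(multiply_padded(A, B))
```

Padding at most doubles n, so it changes the running time by only a constant factor.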
Strassen’s algorithm: practice
Implementation issues.
・Sparsity.
・Caching.
・n not a power of 2.
・Numerical stability.
・Non-square matrices.
・Storage for intermediate submatrices.
・Crossover to classical algorithm when n is “small.”
・Parallelism for multi-core and many-core architectures.
Common misperception. “Strassen’s algorithm is only a theoretical curiosity.”
・Apple reports 8x speedup when n ≈ 2,048.
・Range of instances where it’s useful is a subject of controversy.
[ Huang, Smith, Henry, and van de Geijn, “Strassen’s Algorithm Reloaded” ]
Divide-and-conquer II: quiz 4
Suppose that you could multiply two 3-by-3 matrices with 21 scalar multiplications. How fast could you multiply two n-by-n matrices?
A. $\Theta(n^{\log_3 21})$
B. $\Theta(n^{\log_2 21})$
C. $\Theta(n^{\log_9 21})$
D. $\Theta(n^2)$
$\Theta(n^{\log_3 21}) = O(n^{2.78})$
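The exponent in this kind of question comes from the recurrence $T(n) = m\,T(n / k) + \Theta(n^2)$: if k-by-k matrices can be multiplied with $m > k^2$ scalar multiplications, Case 1 of the master theorem gives $\Theta(n^{\log_k m})$. The helper below just evaluates $\log_k m$; the name `exponent` is hypothetical, and the printed values are floating-point approximations.

```python
import math


def exponent(k: int, m: int) -> float:
    """Exponent log_k(m) obtained when k-by-k matrices take m scalar multiplications."""
    return math.log(m) / math.log(k)


print(exponent(2, 8))    # classical 2-by-2 block scheme
print(exponent(2, 7))    # Strassen's 7 multiplications
print(exponent(3, 21))   # the hypothetical 3-by-3 scheme from the quiz
```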
Divide-and-conquer II: quiz 5
A. Yes.
B. No.
C. Unknown.
Fast matrix multiplication: theory
Begun, the decimal wars have. [Pan 1978, Bini et al., Schönhage, …]
・Two 70-by-70 matrices with 143,640 scalar multiplications. $O(n^{2.7962})$
・Two 48-by-48 matrices with 47,217 scalar multiplications. $O(n^{2.7801})$
・A year later. $O(n^{2.7799})$
・December 1979. $O(n^{2.521813})$
・January 1980. $O(n^{2.521801})$
History of arithmetic complexity of matrix multiplication
[ table not recovered; surviving entry: $O(n^{2 + \varepsilon})$ ]
determinant          $|A|$       $\Theta(MM(n))$
LU decomposition     $A = LU$    $\Theta(MM(n))$