
Modular Multiplication

Dr. Arunachalam V
Associate Professor, SENSE
Introduction

• Modular multiplication means computing A×B mod N, where A and B are
residues modulo N.
• Of course, once the product C = A×B has been computed, it suffices to perform
a modular reduction C mod N, which itself reduces to an integer division.
• The algorithms presented here benefit from some precomputations involving
N, and are thus specific to the case where several reductions are performed
with the same modulus.
• Also, some algorithms avoid performing the full product C = A×B; one such
example is McLaughlin’s algorithm.
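As a baseline for the algorithms that follow, the "full product, then integer division" approach can be sketched in Python. The function name `mod_mul_naive` and the sample values are illustrative assumptions, not part of the slides.

```python
def mod_mul_naive(a: int, b: int, n: int) -> int:
    """Compute A*B mod N by forming the full product and then dividing."""
    c = a * b                              # full product C = A x B
    _quotient, remainder = divmod(c, n)    # modular reduction = integer division
    return remainder


# 123 * 456 = 56088 = 55 * 1009 + 593
print(mod_mul_naive(123, 456, 1009))       # prints 593
```

Every call pays for a full division by N; the algorithms below try to avoid exactly that cost when N is fixed.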
Precomputations and different algorithms
• Algorithms with precomputations include Barrett’s algorithm, which
computes an approximation to the inverse of the modulus, thus trading division
for multiplication; Montgomery’s algorithm, which corresponds to Hensel’s
division with remainder only, and its sub-quadratic variant, which is the LSB-
variant of Barrett’s algorithm; and finally McLaughlin’s algorithm.
• The cost of the precomputations is not taken into account: it is assumed to be
negligible if many modular reductions are performed.
• However, we assume that the precomputed data uses only linear space, that is,
O(log N).
• As usual, we assume that the modulus N has n words in base β, that A and B
have at most n words, and in some cases that they are fully reduced, i.e.,
0 ≤ A, B < N.
Barrett’s algorithm
• Barrett’s algorithm is attractive when many divisions have to be made with the
same divisor; this is the case when one performs computations modulo a fixed
integer.
• The idea is to precompute an approximation to the inverse of the divisor.
• Thus, an approximation to the quotient is obtained with just one multiplication,
and the corresponding remainder after a second multiplication.
• A small number of corrections suffice to convert the approximations into exact
values.
• For the sake of simplicity, we describe Barrett’s algorithm in base β, where β
might be replaced by any integer, in particular 2^n or β^n.
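A = 1980; B = 36; β = 64; β² = 4096; I = ⌊β²/B⌋ = 113

A = 1980 = 30 × 64 + 60, so A1 = 30 and A0 = 60

Q = ⌊A1 × I / β⌋ = 52; R = A − Q × B = 108 = 3 × B

After three corrections: Q = 55; R = 0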

Theorem 2.4.1 Algorithm BarrettDivRem is correct and step 5 (the correction
of Q and R) is performed at most 3 times.
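The procedure can be sketched in Python as follows. This is a minimal sketch assuming the usual textbook formulation of BarrettDivRem (precompute I = ⌊β²/B⌋, estimate Q from the high word of A, then repair), with 0 ≤ A < β² and β/2 < B < β. The function names and the `corrections` counter are illustrative additions.

```python
def barrett_precompute(b: int, beta: int) -> int:
    """Precompute I = floor(beta^2 / B), a scaled approximation of 1/B."""
    assert beta // 2 < b < beta, "divisor must satisfy beta/2 < B < beta"
    return beta * beta // b


def barrett_divrem(a: int, b: int, beta: int, inv: int):
    """Return (Q, R, corrections) with A = Q*B + R and 0 <= R < B."""
    assert 0 <= a < beta * beta
    a1 = a // beta             # high word of A = A1*beta + A0
    q = a1 * inv // beta       # approximate quotient, never too large
    r = a - q * b              # corresponding remainder
    corrections = 0
    while r >= b:              # repair step
        q, r = q + 1, r - b
        corrections += 1
    return q, r, corrections


inv = barrett_precompute(36, 64)           # 113
print(barrett_divrem(1980, 36, 64, inv))   # prints (55, 0, 3)
```

With A = 1980, B = 36 and β = 64 the sketch needs exactly three corrections, the maximum the theorem allows.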
Complexity of the algorithm

• The multiplications at steps 2 and 3 may be replaced by short products, more
precisely the multiplication at step 2 by a high short product, and that at step 3
by a low short product.
• Barrett’s algorithm can also be used for an unbalanced division, when dividing
(k + 1)n words by n words for k ≥ 2, which amounts to k divisions of 2n
words by the same n-word divisor.
• In this case, we say that the divisor is implicitly invariant.
• With full products at steps 2 and 3, Barrett’s algorithm costs 2M(n) for an
n-word divisor. In the FFT range, this cost might be lowered to 1.5M(n) using
the “wrap-around trick”; moreover, if the forward transforms of I and B are
stored, the cost decreases to M(n), assuming M(n) is the cost of three FFTs.
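The unbalanced case can be illustrated with single-word blocks (n = 1): a (k + 1)-word dividend is processed from the most significant word down, and each step is a 2-word-by-1-word Barrett division that reuses the same precomputed inverse. This is a hedged sketch; `unbalanced_divrem` and its word-list input format are assumptions made for illustration.

```python
def barrett_divrem(a: int, b: int, beta: int, inv: int):
    """2-word by 1-word Barrett step: requires 0 <= a < beta^2."""
    q = (a // beta) * inv // beta      # approximate quotient
    r = a - q * b
    while r >= b:                      # at most a few corrections
        q, r = q + 1, r - b
    return q, r


def unbalanced_divrem(words, b: int, beta: int):
    """Divide the base-beta number `words` (most significant first) by b.

    A dividend of k+1 words costs k Barrett steps with the same divisor.
    """
    assert len(words) >= 2 and beta // 2 < b < beta
    inv = beta * beta // b             # computed once, reused k times
    quotient, r = barrett_divrem(words[0] * beta + words[1], b, beta, inv)
    for w in words[2:]:
        # r < b < beta, so r*beta + w < beta^2 is a valid 2-word dividend
        q, r = barrett_divrem(r * beta + w, b, beta, inv)
        quotient = quotient * beta + q
    return quotient, r


words = [17, 60, 5, 63]                # a 4-word dividend in base 64 (k = 3)
print(unbalanced_divrem(words, 36, 64))
```

The result should agree with Python's built-in `divmod` applied to the integer that `words` represents.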
Montgomery’s algorithm

• Montgomery’s algorithm is very efficient for modular arithmetic modulo a
fixed modulus N.
• The main idea is to replace a residue A mod N by A′ = λA mod N, where A′
is the “Montgomery form” corresponding to the residue A, with λ an integer
constant such that gcd(N, λ) = 1.
• Addition and subtraction are unchanged, since λA + λB = λ(A + B) mod N.
• The multiplication of two residues in Montgomery form does not give exactly
what we want: (λA)(λB) ≠ λ(AB) mod N.
• The trick is to replace the classical modular multiplication by “Montgomery’s
multiplication”: MontgomeryMul(A′, B′) = A′B′/λ mod N.
• For some values of λ, MontgomeryMul(A′, B′) can easily be computed, in
particular for λ = β^n, where N uses n words in base β.
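The definitions above can be checked with a short Python sketch. It illustrates the Montgomery form only: the division by λ is done with a plain modular inverse (`pow(lam, -1, n)`, available in Python 3.8 or later) rather than with the efficient reduction. The function names and sample operands are illustrative assumptions.

```python
from math import gcd


def to_montgomery(a: int, lam: int, n: int) -> int:
    """A' = lambda * A mod N."""
    return (lam * a) % n


def from_montgomery(a_m: int, lam: int, n: int) -> int:
    """Recover A from A' by dividing by lambda modulo N."""
    return (a_m * pow(lam, -1, n)) % n


def montgomery_mul(a_m: int, b_m: int, lam: int, n: int) -> int:
    """A'B'/lambda mod N (definition only, not the fast algorithm)."""
    return (a_m * b_m * pow(lam, -1, n)) % n


n, lam = 862664913, 1000 ** 3          # lambda = beta^n with beta = 1000, n = 3
assert gcd(n, lam) == 1
a, b = 123456789, 987654321 % n
a_m, b_m = to_montgomery(a, lam, n), to_montgomery(b, lam, n)

# Addition needs no change; multiplication needs the division by lambda.
assert (a_m + b_m) % n == to_montgomery((a + b) % n, lam, n)
assert montgomery_mul(a_m, b_m, lam, n) == to_montgomery(a * b % n, lam, n)
print(from_montgomery(montgomery_mul(a_m, b_m, lam, n), lam, n) == a * b % n)  # True
```

The plain product A′B′ mod N would carry a factor λ², which is why one factor of λ has to be divided out.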
REDC & Fast REDC

• Algorithm 2.6 is a quadratic algorithm (REDC) to compute
MontgomeryMul(A′, B′) in this case, and a sub-quadratic reduction
(FastREDC) is given in Algorithm 2.7.
• Another view of Montgomery’s algorithm for λ = β^n is to consider that it
computes the remainder of Hensel’s division.
• For example, with inputs C = 766 970 544 842 443 844, N = 862 664 913,
and β = 1000,
• Algorithm REDC precomputes μ = 23; then we have q0 = 412, which
yields C ← C + 412N = 766 970 900 260 388 000;
• then q1 = 924, which yields
C ← C + 924Nβ = 767 768 002 640 000 000;
• then q2 = 720, which yields
C ← C + 720Nβ² = 1 388 886 740 000 000 000.
• At step 4, R = 1 388 886 740, and since R ≥ β³, REDC returns
R − N = 526 221 827.
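The worked example can be reproduced with a Python sketch of both reductions. It assumes the formulation in the cited reference: word-by-word quotient selection q_i = μ·c_i mod β for REDC, a single quotient Q = μ·C mod β^n for FastREDC, and a final comparison against β^n. The function and parameter names are illustrative.

```python
def redc(c: int, n: int, beta: int, nwords: int, mu: int) -> int:
    """Quadratic REDC: return R with R = C * beta^(-nwords) mod N.

    mu must equal -1/N mod beta.
    """
    assert 0 <= c < beta ** (2 * nwords) and n < beta ** nwords
    for i in range(nwords):
        c_i = (c // beta ** i) % beta     # current word i of C
        q_i = (mu * c_i) % beta           # quotient selection
        c += q_i * n * beta ** i          # word i of C becomes zero
    r = c // beta ** nwords               # exact division by beta^nwords
    return r - n if r >= beta ** nwords else r


def fast_redc(c: int, n: int, beta: int, nwords: int, mu_full: int) -> int:
    """One-shot variant: mu_full must equal -1/N mod beta^nwords."""
    big = beta ** nwords
    q = (mu_full * c) % big               # all quotient words at once
    r = (c + q * n) // big                # exact division
    return r - n if r >= big else r


C, N, BETA = 766970544842443844, 862664913, 1000
print(redc(C, N, BETA, 3, 23))            # prints 526221827

mu_full = (-pow(N, -1, BETA ** 3)) % BETA ** 3   # needs Python 3.8+
print(fast_redc(C, N, BETA, 3, mu_full))  # prints 526221827
```

Both functions add the same multiple of N to C, so they return the same value; they differ only in how the quotient is obtained.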
Precomputation of µ
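• For example, N = 862 664 913, and β = 1000.
• μ = −N⁻¹ mod β ⇒ μN = −1 mod β, which requires gcd(β, N) = 1
• Apply Euclid’s algorithm to β and the least significant word of N until the remainder is 1
• 1000 = 913 × 1 + 87 ⇒ 1000 + 913 × (−1) = 87
• 913 = 87 × 10 + 43 ⇒ 913 + 87 × (−10) = 43
• 87 = 43 × 2 + 1 ⇒ 87 + 43 × (−2) = 1
• Rewrite the factors in terms of β and the least significant word of N
• 87 + [913 + 87 × (−10)] × (−2) = 1 ⇒ 913 × (−2) + 87 × 21 = 1
• 913 × (−2) + [1000 + 913 × (−1)] × 21 = 1 ⇒ 1000 × 21 + 913 × (−23) = 1
• Therefore 913 × (−23) = 1 mod 1000, so N⁻¹ = −23 mod β and the precomputed μ = 23.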
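The same back-substitution can be automated with the extended Euclidean algorithm. The sketch below is one possible way to do it; `ext_gcd` and `montgomery_mu` are illustrative names, not functions from the slides or the reference.

```python
def ext_gcd(a: int, b: int):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


def montgomery_mu(n: int, beta: int) -> int:
    """Return mu = -1/N mod beta, using only the least significant word of N."""
    g, _x, y = ext_gcd(beta, n % beta)
    if g != 1:
        raise ValueError("N and beta must be coprime")
    # beta*x + n0*y = 1, hence n0*y = 1 mod beta and y = 1/N mod beta
    return (-y) % beta


print(ext_gcd(1000, 913))                 # prints (1, 21, -23)
print(montgomery_mu(862664913, 1000))     # prints 23
```

The coefficients 21 and −23 match the identity 1000 × 21 + 913 × (−23) = 1.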

Refer to this video https://www.youtube.com/watch?v=shaQZg8bqUM for finding µ


Comparison with classical method
• Compared to classical division (Algorithm BasecaseDivRem), Montgomery’s
algorithm has two significant advantages:
• the quotient selection is performed by a multiplication modulo the
word base β, which is more efficient than a division by the most
significant word of the divisor as in BasecaseDivRem;
• and there is no repair step inside the for-loop; the repair step is
at the very end.
Reference
1. Chapter 2.4 of Richard P. Brent and Paul Zimmermann, “Modern
Computer Arithmetic”, Cambridge University Press, 2010.
Next Class

MORE EXAMPLES
