Modular Multiplication: Dr. Arunachalam V Associate Professor, SENSE

Modular Multiplication
Dr. Arunachalam V
Associate Professor, SENSE
Introduction
• Modular multiplication means computing A×B mod N, where A and B are
residues modulo N.
• Of course, once the product C = A×B has been computed, it suffices to perform
a modular reduction C mod N, which itself reduces to an integer division.
• The algorithms presented here benefit from some precomputations involving
N, and are thus specific to the case where several reductions are performed
with the same modulus.
• Also, some algorithms avoid performing the full product C = A×B ; one such
example is McLaughlin’s algorithm.
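As a baseline for the algorithms that follow, the "full product, then one integer division" approach can be sketched in a few lines. The function name `modmul_naive` and the sample values are illustrative assumptions, not part of the slides.

```python
def modmul_naive(a, b, n):
    """Return a*b mod n by forming the full product and dividing once."""
    c = a * b                # full product C = A×B
    _, r = divmod(c, n)      # modular reduction C mod N via integer division
    return r


if __name__ == "__main__":
    # Arbitrary sample residues modulo 9973 (illustrative values only).
    print(modmul_naive(1234, 5678, 9973))
```

The algorithms in the following slides aim to replace the division hidden in `divmod` with multiplications that use data precomputed from N.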
Precomputations and different algorithms
• Algorithms with precomputations include Barrett’s algorithm, which
computes an approximation to the inverse of the modulus, thus trading division
for multiplication; Montgomery’s algorithm, which corresponds to Hensel’s
division with remainder only, and its sub-quadratic variant, which is the
LSB-variant of Barrett’s algorithm; and finally McLaughlin’s algorithm.
• The cost of the precomputations is not taken into account: it is assumed to be
negligible if many modular reductions are performed.
• However, we assume that the amount of precomputed data uses only linear,
that is O(logN), space.
• As usual, we assume that the modulus N has n words in base β, that A and B
have at most n words, and in some cases that they are fully reduced, i.e.,
0 ≤ A, B < N.
Barrett’s algorithm
• Barrett’s algorithm is attractive when many divisions have to be made with the
same divisor; this is the case when one performs computations modulo a fixed
integer.
• The idea is to precompute an approximation to the inverse of the divisor.
• Thus, an approximation to the quotient is obtained with just one multiplication,
and the corresponding remainder after a second multiplication.
• A small number of corrections suffice to convert the approximations into exact
values.
• For the sake of simplicity, we describe Barrett’s algorithm in base β, where β
might be replaced by any integer, in particular 2^n or β^n.
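Example: A = 1980; B = 36; β = 64; β² = 4096; I = ⌊β²/B⌋ = 113
A = 1980 = 30 × 64 + 60, so A₁ = 30 and A₀ = 60
Q = ⌊A₁I/β⌋ = 52; R = A − QB = 108 = 3 × B
After three corrections: Q = 55; R = 0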
Theorem 2.4.1 Algorithm BarrettDivRem is correct and step 5 is performed
at most 3 times.
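The algorithm listing itself is not reproduced in these slides, so the following is a minimal sketch of a Barrett-style division with remainder in base β, assuming 0 ≤ A < β² and β/2 < B < β. The name `barrett_divrem` and the returned correction counter are illustrative assumptions; the `while` loop plays the role of the correction step that the theorem bounds.

```python
def barrett_divrem(a, b, beta):
    """Divide a by b, assuming 0 <= a < beta**2 and beta/2 < b < beta.

    Returns (quotient, remainder, number_of_corrections).
    """
    assert 0 <= a < beta * beta and beta < 2 * b and b < beta
    i = beta * beta // b       # precomputed approximation to beta^2 / b
    a1 = a // beta             # high word of a = a1*beta + a0
    q = a1 * i // beta         # approximate quotient, never too large
    r = a - q * b              # corresponding (possibly too large) remainder
    corrections = 0
    while r >= b:              # correction loop
        q, r = q + 1, r - b
        corrections += 1
    return q, r, corrections


if __name__ == "__main__":
    print(barrett_divrem(1980, 36, 64))   # (55, 0, 3)
```

In a real setting `i` would be computed once per divisor and reused, which is where the saving over repeated division comes from.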
Complexity of the algorithm
• The multiplications at steps 2 and 3 may be replaced by short products, more
precisely the multiplication at step 2 by a high short product, and that at step 3
by a low short product.
• Barrett’s algorithm can also be used for an unbalanced division, when dividing
(k + 1)n words by n words for k ≥ 2, which amounts to k divisions of 2n
words by the same n-word divisor.
• In this case, we say that the divisor is implicitly invariant.
• With full products at steps 2 and 3, Barrett’s algorithm costs 2M(n) for an
n-word divisor. In the FFT range, this cost might be lowered to 1.5M(n) using
the “wrap-around trick”; moreover, if the forward transforms of I and B are
stored, the cost decreases to M(n), assuming M(n) is the cost of three FFTs.
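The unbalanced case can be illustrated with a short sketch in which `w` stands for β^n, so that each step divides a "2n-word" value by the same "n-word" divisor using one precomputed inverse. The names `barrett_step` and `unbalanced_divrem` are hypothetical, and the sketch assumes w/2 < B < w and that the top block of A is smaller than B.

```python
def barrett_step(a, b, inv, w):
    """Divide a (< w*w) by b (w/2 < b < w) using inv = floor(w*w / b)."""
    q = (a // w) * inv // w        # approximate quotient from the high part
    r = a - q * b
    while r >= b:                  # a few corrections make it exact
        q, r = q + 1, r - b
    return q, r


def unbalanced_divrem(a, b, w, k):
    """Divide a by b in k Barrett steps, assuming a < b * w**k."""
    inv = w * w // b               # computed once: the divisor is invariant
    q, r = 0, a // w**k            # top block, assumed smaller than b
    assert r < b
    for j in range(k - 1, -1, -1):
        block = (a // w**j) % w    # next lower block of a
        qj, r = barrett_step(r * w + block, b, inv, w)
        q = q * w + qj
    return q, r


if __name__ == "__main__":
    a, b, w, k = 123456789012345, 3000, 4096, 3
    print(unbalanced_divrem(a, b, w, k) == divmod(a, b))
```

Because the running remainder stays below B, every intermediate dividend stays below B·w, so each partial quotient fits in one block.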
Montgomery’s algorithm
• Montgomery’s algorithm is very efficient for modular arithmetic modulo a
fixed modulus N.
• The main idea is to replace a residue A mod N by A′ = λA mod N, where A′
is the “Montgomery form” corresponding to the residue A, with λ an integer
constant such that gcd(N, λ) = 1.
• Addition and subtraction are unchanged, since λA + λB = λ(A + B) mod N.
• The multiplication of two residues in Montgomery form does not give exactly
what we want: (λA)(λB) ≠ λ(AB) mod N.
• The trick is to replace the classical modular multiplication by “Montgomery’s
multiplication”: MontgomeryMul(A′, B′) = A′B′/λ mod N.
• For some values of λ, MontgomeryMul(A′, B′) can easily be computed, in
particular for λ = β^n, where N uses n words in base β.
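The algebra of the Montgomery form can be checked with a small sketch. Here the division by λ is done with a plain modular inverse (`pow(lam, -1, n)`, available in Python 3.8+), purely to illustrate the definitions; it is not the efficient reduction. The helper names `to_mont`, `from_mont` and `montgomery_mul`, and the sample values, are illustrative assumptions.

```python
def to_mont(a, lam, n):
    """Montgomery form A' = lam * A mod N."""
    return (lam * a) % n


def from_mont(x, lam, n):
    """Recover A from A' by multiplying with lam^(-1) mod N."""
    return (x * pow(lam, -1, n)) % n


def montgomery_mul(x, y, lam, n):
    """A'B'/lam mod N, computed naively with a modular inverse."""
    return (x * y * pow(lam, -1, n)) % n


if __name__ == "__main__":
    n, lam = 862664913, 1000**3          # lam = beta^n with beta = 1000, n = 3
    a, b = 123456789, 87654321
    a_m, b_m = to_mont(a, lam, n), to_mont(b, lam, n)
    # Addition carries over directly to Montgomery form.
    print((a_m + b_m) % n == to_mont((a + b) % n, lam, n))
    # Montgomery multiplication yields the Montgomery form of the product.
    print(from_mont(montgomery_mul(a_m, b_m, lam, n), lam, n) == (a * b) % n)
```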
REDC & Fast REDC
• Algorithm 2.6 is a quadratic algorithm (REDC) to compute
MontgomeryMul(A′, B′) in this case, and a sub-quadratic reduction
(FastREDC) is given in Algorithm 2.7.
• Another view of Montgomery’s algorithm for λ = β^n is to consider that it
computes the remainder of Hensel’s division.
• For example, with inputs C = 766 970 544 842 443 844, N = 862 664 913,
and β = 1000,
• Algorithm REDC precomputes μ = 23; then we have q₀ = 412, which
yields C ← C + 412N = 766 970 900 260 388 000;
• then q₁ = 924, which yields
C ← C + 924Nβ = 767 768 002 640 000 000;
• then q₂ = 720, which yields C ← C + 720Nβ² = 1 388 886 740 000 000 000.
• At step 4, R = Cβ⁻³ = 1 388 886 740, and since R ≥ β³, REDC returns
R − N = 526 221 827.
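The worked example can be reproduced with a minimal word-by-word sketch of REDC, assuming 0 ≤ C < β^(2n) and gcd(β, N) = 1. The function name `redc` and its argument names are illustrative assumptions. The final conditional subtraction compares R with β^n, which keeps the result below β^n; comparing with N instead would be an equally valid variant when a fully reduced residue is wanted.

```python
def redc(c, n_mod, beta, n_words):
    """Return R < beta**n_words with R = c * beta**(-n_words) mod n_mod."""
    mu = (-pow(n_mod, -1, beta)) % beta   # mu = -1/N mod beta (Python 3.8+)
    for i in range(n_words):
        c_i = (c // beta**i) % beta       # word i of the current C
        q_i = (mu * c_i) % beta           # quotient word: one multiply mod beta
        c += q_i * n_mod * beta**i        # makes word i of C equal to zero
    r = c // beta**n_words                # exact division by beta^n
    return r - n_mod if r >= beta**n_words else r


if __name__ == "__main__":
    C = 766970544842443844
    N = 862664913
    print(redc(C, N, 1000, 3))   # 526221827
```

Note that the loop contains no correction; the only adjustment is the single subtraction at the end.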
Precomputation of µ
• For example, N = 862 664 913, and β = 1000,
• μ = −N⁻¹ mod β, which requires gcd(N, β) = 1; only the least significant word of N, here 913, matters
• Apply Euclid’s algorithm until the remainder is 1
• 1000 = 913 × 1 + 87 ⇒ 1000 + 913 × (−1) = 87
• 913 = 87 × 10 + 43 ⇒ 913 + 87 × (−10) = 43
• 87 = 43 × 2 + 1 ⇒ 87 + 43 × (−2) = 1
• Rewrite the factors in terms of β and the least significant word of N
• 87 + (913 + 87 × (−10)) × (−2) = 1 ⇒ 913 × (−2) + 87 × 21 = 1
• 913 × (−2) + (1000 + 913 × (−1)) × 21 = 1 ⇒ 1000 × 21 + 913 × (−23) = 1
• Therefore 913 × 23 = −1 mod 1000, so the precomputed μ = 23.
Refer to this video https://www.youtube.com/watch?v=shaQZg8bqUM for finding µ.
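The hand computation of µ can be cross-checked with a short extended Euclid sketch. The names `extended_gcd` and `montgomery_mu` are illustrative assumptions; the loop runs until the remainder is 0 rather than stopping at 1, which yields the same coefficients.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b)."""
    s0, t0, s1, t1 = 1, 0, 0, 1
    while b:
        q, r = divmod(a, b)
        a, b = b, r
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0


def montgomery_mu(n_mod, beta):
    """mu = -1/N mod beta, from s*beta + t*N0 = 1 with N0 = N mod beta."""
    g, s, t = extended_gcd(beta, n_mod % beta)
    assert g == 1, "beta and N must be coprime"
    return (-t) % beta


if __name__ == "__main__":
    print(extended_gcd(1000, 913))          # (1, 21, -23)
    print(montgomery_mu(862664913, 1000))   # 23
```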

Comparison with classical method
• Compared to classical division (Algorithm BasecaseDivRem),
Montgomery’s algorithm has two significant advantages:
• the quotient selection is performed by a multiplication modulo the
word base β, which is more efficient than a division by the most
significant word of the divisor as in BasecaseDivRem;
• and there is no repair step inside the for-loop; the repair step is
at the very end.
Reference
1. Section 2.4 of Richard P. Brent and Paul Zimmermann, “Modern
Computer Arithmetic”, Cambridge University Press, 2010.
Next Class
MORE EXAMPLES
