
Mixed Precision Solvers

Numerical Algorithms

Samir Moustafa

Faculty of Computer Science


University of Vienna

December 15, 2022


Single vs. double precision

- Single precision (SP) is faster than double precision (DP) on modern processors
- Processing in vector units
  - Fit twice as many SP as DP values in a vector
- Data transfer on the memory bus
  - Move twice as many SP as DP values through the bus
- Caches and registers
  - Fit twice as many SP as DP values in the cache
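The width argument above can be checked directly, e.g. with NumPy (a minimal sketch; NumPy is not part of the slides themselves):

```python
import numpy as np

# A single-precision value occupies 4 bytes, a double-precision value 8,
# so twice as many SP values fit in any fixed-size vector register,
# cache line, or memory-bus transfer.
sp = np.ones(1024, dtype=np.float32)
dp = np.ones(1024, dtype=np.float64)

print(sp.itemsize, dp.itemsize)   # 4 8
print(dp.nbytes // sp.nbytes)     # the same data takes 2x the memory in DP
```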
Mixed precision solver

Can we solve a linear system Ax = b with SP speed, but DP accuracy?

Almost ...
What is iterative refinement?

Iterative refinement is a method for iteratively improving (refining) the accuracy of a computed solution.
The idea behind iterative refinement

- Compute an approximate solution x̂ to the linear system Ax = b
- Evaluate it through the residual r = b − Ax̂
- Thus, x̂ satisfies Ax̂ = b − r
- Hence, the error s := x − x̂ is given as s = A⁻¹r
- Error estimate: ‖x − x̂‖ ≤ ‖A⁻¹‖ ‖r‖
- Idea: for given x̂, one can approximate s = A⁻¹r with some lower accuracy in order to find a correction to x̂
- We know that the approximate solution x̂ is off by roughly s
- Add s in order to get closer to the true solution: x̂ + s ≈ x
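One such correction step can be sketched in NumPy, using the same 2×2 system that appears in the worked example later in these slides:

```python
import numpy as np

# System Ax = b and an approximate solution x_hat.
A = np.array([[1.0, 1.0],
              [2.0, 3.0]])
b = np.array([2.0, 5.0])
x_hat = np.array([0.9, 1.3])

r = b - A @ x_hat          # residual r = b - A x_hat
s = np.linalg.solve(A, r)  # error s = A^{-1} r
x_new = x_hat + s          # corrected solution, close to the true x = (1, 1)

print(x_new)
```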
Derivation of iterative refinement

- Matrix Â = A + E
  - Â result of finite-precision Gaussian elimination
  - E small
- Thus: Ax = Âx − Ex = b
- x as fixed point for the iteration Âx^(k+1) − Ex^(k) = b
- It follows:

  Âx^(k+1) − (Â − A)x^(k) = b

  and

  x^(k+1) = x^(k) + Â⁻¹(b − Ax^(k))    (1)
          = x^(k) + Â⁻¹r^(k)           (2)
          = x^(k) + s^(k)              (3)

Iterative refinement and Newton's method

- Refinement iteration: x^(k+1) = x^(k) + Â⁻¹(b − Ax^(k))
- Similar to an inexact Newton iteration
- Newton's method: x^(k+1) = x^(k) − f(x^(k)) / f'(x^(k))
  - Refines an approximate root of a function f(x)
- For iterative refinement: f(x) = b − Ax
- Refining the root of f(x) is equivalent to solving the linear system Ax = b
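The equivalence can be checked numerically. A sketch (the matrix and starting point below are arbitrary): for f(x) = b − Ax the Jacobian is −A, so a Newton step x − J⁻¹f(x) reduces to the refinement update x + A⁻¹(b − Ax).

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.array([0.5, 0.5])  # some approximation

# Newton step with f(x) = b - A x and Jacobian J = -A:
newton_step = x - np.linalg.solve(-A, b - A @ x)
# Refinement update:
refine_step = x + np.linalg.solve(A, b - A @ x)

print(np.allclose(newton_step, refine_step))  # True: the two updates coincide
```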
Newton's method

- Iteration: x^(k+1) = x^(k) − f(x^(k)) / f'(x^(k))

[Figure: one Newton step from x^(k) to x^(k+1) along the tangent of f(x). Buttari (2011)]
Basic structure of iterative refinement

- Derived iteration: x^(k+1) = x^(k) + Â⁻¹(b − Ax^(k))
- Iterative refinement method:

  1: x^(0) ← A⁻¹b              ▷ solve linear system
  2: repeat
  3:   r^(k) ← b − Ax^(k)
  4:   s^(k) ← A⁻¹r^(k)        ▷ solve linear system
  5:   x^(k+1) ← x^(k) + s^(k)
  6: until convergence
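The loop above can be sketched in NumPy (a minimal version: `numpy.linalg.solve` stands in for a stored factorization, and the residual-based stopping test is illustrative):

```python
import numpy as np

def iterative_refinement(A, b, tol=1e-12, max_iter=10):
    """Sketch of the refinement loop: solve, then repeatedly
    correct the solution by the error s = A^{-1} r."""
    x = np.linalg.solve(A, b)      # x^(0) <- A^{-1} b
    for _ in range(max_iter):
        r = b - A @ x              # residual r^(k)
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                  # converged
        s = np.linalg.solve(A, r)  # correction s^(k) <- A^{-1} r^(k)
        x = x + s                  # x^(k+1) <- x^(k) + s^(k)
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = iterative_refinement(A, b)
print(np.allclose(A @ x, b))  # True
```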
Iterative refinement in more detail

1. Compute approximate solution x^(0) to the linear system Ax = b
2. Compute residual r^(0) ← b − Ax^(0)
3. Set k ← 0
4. Solve the linear system As^(k) = r^(k) for s^(k)
5. Take x^(k+1) ← x^(k) + s^(k) as the new approximate solution
6. Compute residual r^(k+1) ← b − Ax^(k+1)
7. If ‖r^(k+1)‖ is not sufficiently small → set k ← k + 1 and continue at step 4
Main benefit of iterative refinement

The process can be repeated to refine the solution successively until convergence.

This potentially produces a solution with a residual as small as possible for the arithmetic precision used.
Cost analysis of iterative refinement

  1: x^(0) ← A⁻¹b              ▷ O(n³)
  2: repeat
  3:   r^(k) ← b − Ax^(k)      ▷ O(n²)
  4:   s^(k) ← A⁻¹r^(k)        ▷ O(n²)
  5:   x^(k+1) ← x^(k) + s^(k) ▷ O(n)
  6: until convergence
Mixed-precision iterative refinement

  1: x^(0) ← A⁻¹b              ▷ single precision
  2: repeat
  3:   r^(k) ← b − Ax^(k)      ▷ double precision
  4:   s^(k) ← A⁻¹r^(k)        ▷ single precision
  5:   x^(k+1) ← x^(k) + s^(k) ▷ double precision
  6: until convergence

- Perform the expensive factorization in single precision
- Do the refinement in double precision


Iterative refinement for dense linear systems

  1: LU ← PA                      ▷ LU factorization
  2: Solve Ly = Pb for y          ▷ forward substitution
  3: Solve Ux^(0) = y for x^(0)   ▷ back substitution
  4: r^(0) ← b − Ax^(0)
  5: for k ← 0, 1, ... do
  6:   Solve Ly = Pr^(k) for y    ▷ forward substitution
  7:   Solve Us^(k) = y for s^(k) ▷ back substitution
  8:   x^(k+1) ← x^(k) + s^(k)
  9:   r^(k+1) ← b − Ax^(k+1)
 10:   check convergence
 11: end for
Iterative refinement for dense linear systems

  1: LU ← PA                      ▷ single precision
  2: Solve Ly = Pb for y          ▷ single precision
  3: Solve Ux^(0) = y for x^(0)   ▷ single precision
  4: r^(0) ← b − Ax^(0)           ▷ double precision
  5: for k ← 0, 1, ... do
  6:   Solve Ly = Pr^(k) for y    ▷ single precision
  7:   Solve Us^(k) = y for s^(k) ▷ single precision
  8:   x^(k+1) ← x^(k) + s^(k)    ▷ double precision
  9:   r^(k+1) ← b − Ax^(k+1)     ▷ double precision
 10:   check convergence
 11: end for
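A sketch of this mixed-precision variant using SciPy's LU routines (`lu_factor`/`lu_solve`); the tolerance, iteration cap, and test matrix below are illustrative, not part of the slides:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def mixed_precision_solve(A, b, tol=1e-12, max_iter=30):
    """Factorize once in single precision; accumulate residuals
    and solution updates in double precision."""
    lu, piv = lu_factor(A.astype(np.float32))   # O(n^3) work in SP
    x = lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)
    for _ in range(max_iter):
        r = b - A @ x                           # residual in DP
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        s = lu_solve((lu, piv), r.astype(np.float32))  # O(n^2) SP solves
        x = x + s.astype(np.float64)            # update in DP
    return x

rng = np.random.default_rng(0)
n = 100
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix
b = rng.standard_normal(n)
x = mixed_precision_solve(A, b)
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))  # DP-level relative residual
```

Note that the factors are kept only in single precision; the double-precision matrix A is needed solely for the residual computation.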
Example for iterative refinement

- Consider solving the following linear system:

  x₁ + x₂ = 2
  2x₁ + 3x₂ = 5

- We obtain the matrix A, factors L and U, and right-hand side b:

  A = [ 1  1 ]  =  [ 1  0 ] [ 1  1 ]  = LU,      b = [ 2 ]
      [ 2  3 ]     [ 2  1 ] [ 0  1 ]                 [ 5 ]

- Assume: approximate solution x₁ = 0.9 and x₂ = 1.3 computed
- Residual:

  r = b − Ax = [ 2 ] − [ 1  1 ] [ 0.9 ]  =  [ −0.2 ]
               [ 5 ]   [ 2  3 ] [ 1.3 ]     [ −0.7 ]
Example for iterative refinement

- Solve As = r for s
- Forward substitution: solve Ly = r for y

  [ 1  0 ] [ y₁ ]  =  [ −0.2 ]    ⇒    [ y₁ ]  =  [ −0.2 ]
  [ 2  1 ] [ y₂ ]     [ −0.7 ]         [ y₂ ]     [ −0.3 ]

- Back substitution: solve Us = y for s

  [ 1  1 ] [ s₁ ]  =  [ −0.2 ]    ⇒    [ s₁ ]  =  [  0.1 ]
  [ 0  1 ] [ s₂ ]     [ −0.3 ]         [ s₂ ]     [ −0.3 ]
Example for iterative refinement

- Refined solution:

  x^(1) = x^(0) + s = [ 0.9 ] + [  0.1 ]  =  [ 1 ]
                      [ 1.3 ]   [ −0.3 ]     [ 1 ]

- Since

  Ax^(1) = [ 1  1 ] [ 1 ]  =  [ 2 ]  =  b,
           [ 2  3 ] [ 1 ]     [ 5 ]

  we have the solution in only 1 iteration (exact arithmetic!)
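The same arithmetic, reproduced in NumPy with the triangular solves written out by hand:

```python
import numpy as np

# Worked example: one refinement step recovers x = (1, 1).
A = np.array([[1.0, 1.0],
              [2.0, 3.0]])
b = np.array([2.0, 5.0])
x0 = np.array([0.9, 1.3])

r = b - A @ x0                         # r = (-0.2, -0.7)
y = np.array([r[0], r[1] - 2 * r[0]])  # forward substitution with L = [[1,0],[2,1]]
s = np.array([y[0] - y[1], y[1]])      # back substitution with U = [[1,1],[0,1]]
x1 = x0 + s                            # refined solution, approximately (1, 1)

print(x1)
```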


Convergence of iterative refinement

- Subtract Âx − Ex = b from Âx^(k+1) − Ex^(k) = b
- We obtain

  Â(x^(k+1) − x) − E(x^(k) − x) = 0

  and hence

  x^(k+1) − x = Â⁻¹E (x^(k) − x)

- Take norms:

  ‖x^(k+1) − x‖ ≤ ‖Â⁻¹E‖ ‖x^(k) − x‖

- If ‖Â⁻¹E‖ < 1, it is guaranteed that x^(k) → x as k → ∞
  - But only in exact arithmetic
- In practice, the residual is computed in finite precision
  ⇒ no further progress at some point
Iterative refinement in finite precision

- Derived iteration: x^(k+1) = x^(k) + Â⁻¹(b − Ax^(k))
- Floating-point arithmetic:

  x^(k+1) = x^(k) + Â⁻¹(b − Ax^(k) + δ^(k)) + µ^(k)

- δ^(k) is the error associated with computing the residual
- µ^(k) is the error associated with the update
- If ‖δ^(k)‖ < α, ‖µ^(k)‖ < β (for all k) and ‖Â⁻¹E‖ is not too close to 1 (e.g. ‖Â⁻¹E‖ < 1/2):

  ‖x^(k) − x‖ ≤ ‖Â⁻¹E‖^k ‖x^(0) − x‖ + c(α‖Â⁻¹‖ + β)

- c is a constant which is not too large
Iterative refinement in finite precision

- From evaluating the residual, we obtain

  α ≤ c₁ ε_m ‖A‖ ‖x‖
  β ≤ c₂ ε_m ‖x‖

  for modest c₁, c₂
- For large enough k:

  ‖x^(k) − x‖ / ‖x‖ ≤ c₁ ε_m κ(A) + c₂ ε_m

  ⇒ the relative error of iterative refinement is not too much greater than the error due to a small relative perturbation of A
- The result is backward stable
- If one evaluates the residual accurately enough and α κ(A) ≲ β, then one also gets a small forward error
Disadvantages of mixed precision iterative refinement

- The residual may not be large enough to be computed with sufficient accuracy without requiring extra precision
  - Only if the initial solution is already very good
- To produce maximum benefit, the residual must be computed with higher precision than that used in computing the initial solution
- Uses considerably more (∼50%) memory than the standard (double precision) algorithm
  - Factors of the original matrix need to be stored in both high and low precision
Advantages of mixed precision iterative refinement

- Recover full accuracy for systems that are badly scaled
- Can stabilize solution methods which are otherwise potentially unstable
  - e.g. Gaussian elimination without pivoting
- Single precision computation is significantly faster than double precision computation
- Convergence is often achieved within a few cheap iterations


Experiments: double vs. mixed precision

- AMD Opteron 246, 2.0 GHz

[Figure: Gflop/s vs. problem size (0–5000) for symmetric and unsymmetric problems; single and mixed precision clearly outperform double precision. Buttari (2011)]

- Convergence is always achieved within 3 iterations when

  ‖b − Ax‖₂ ≤ ‖x‖₂ · ‖A‖_F · ε_d · n
Experiments: double vs. mixed precision

- IBM PowerPC 970, 2.0 GHz

[Figure: Gflop/s vs. problem size (0–5000) for symmetric and unsymmetric problems; single and mixed precision clearly outperform double precision. Buttari (2011)]

- Convergence is always achieved within 3 iterations when

  ‖b − Ax‖₂ ≤ ‖x‖₂ · ‖A‖_F · ε_d · n
Experiments: double vs. mixed precision

- Intel Woodcrest, 3.0 GHz

[Figure: Gflop/s vs. problem size (0–5000) for symmetric and unsymmetric problems; single and mixed precision clearly outperform double precision. Buttari (2011)]

- Convergence is always achieved within 3 iterations when

  ‖b − Ax‖₂ ≤ ‖x‖₂ · ‖A‖_F · ε_d · n
Experiments: double vs. mixed precision

[Figure: performance vs. matrix size for the SP solve, the DP solve obtained via mixed-precision iterative refinement, and the plain DP solve. Dongarra (2012)]
Summary

- Iterative refinement allows improving the accuracy of solutions from linear equation solvers
- Similar to an inexact Newton iteration
- Computations can be done in mixed precision
  - Almost as fast as lower precision
  - Usually as accurate as higher precision
  - More memory required
