Numerical Methods Using MATLAB
Third Edition

John H. Mathews, California State University, Fullerton
Kurtis D. Fink, Northwest Missouri State University

Prentice Hall, Upper Saddle River, NJ 07458

Contents

Preface

1 Preliminaries
  1.1 Review of Calculus
  1.2 Binary Numbers
  1.3 Error Analysis

2 The Solution of Nonlinear Equations f(x) = 0
  2.1 Iteration for Solving x = g(x)
  2.2 Bracketing Methods for Locating a Root
  2.3 Initial Approximation and Convergence Criteria
  2.4 Newton-Raphson and Secant Methods
  2.5 Aitken's Process and Steffensen's and Muller's Methods (Optional)

3 The Solution of Linear Systems AX = B
  3.1 Introduction to Vectors and Matrices
  3.2 Properties of Vectors and Matrices
  3.3 Upper-triangular Linear Systems
  3.4 Gaussian Elimination and Pivoting
  3.5 Triangular Factorization
  3.6 Iterative Methods for Linear Systems
  3.7 Iteration for Nonlinear Systems: Seidel and Newton's Methods (Optional)

4 Interpolation and Polynomial Approximation
  4.1 Taylor Series and Calculation of Functions
  4.2 Introduction to Interpolation
  4.3 Lagrange Approximation
  4.4 Newton Polynomials
  4.5 Chebyshev Polynomials (Optional)
  4.6 Padé Approximations

5 Curve Fitting
  5.1 Least-squares Line
  5.2 Curve Fitting
  5.3 Interpolation by Spline Functions
  5.4 Fourier Series and Trigonometric Polynomials

6 Numerical Differentiation
  6.1 Approximating the Derivative
  6.2 Numerical Differentiation Formulas

7 Numerical Integration
  7.1 Introduction to Quadrature
  7.2 Composite Trapezoidal and Simpson's Rule
  7.3 Recursive Rules and Romberg Integration
  7.4 Adaptive Quadrature
  7.5 Gauss-Legendre Integration (Optional)
8 Numerical Optimization
  8.1 Minimization of a Function

9 Solution of Differential Equations
  9.1 Introduction to Differential Equations
  9.2 Euler's Method
  9.3 Heun's Method
  9.4 Taylor Series Method
  9.5 Runge-Kutta Methods
  9.6 Predictor-Corrector Methods
  9.7 Systems of Differential Equations
  9.8 Boundary Value Problems
  9.9 Finite-difference Method

10 Solution of Partial Differential Equations
  10.1 Hyperbolic Equations
  10.2 Parabolic Equations
  10.3 Elliptic Equations

11 Eigenvalues and Eigenvectors
  11.1 Homogeneous Systems: The Eigenvalue Problem
  11.2 Power Method
  11.3 Jacobi's Method
  11.4 Eigenvalues of Symmetric Matrices

Appendix: An Introduction to MATLAB
Some Suggested References for Reports
Bibliography and References
Answers to Selected Exercises
Index

Preface

This book provides a fundamental introduction to numerical analysis suitable for undergraduate students in mathematics, computer science, physical sciences, and engineering. It is assumed that the reader is familiar with calculus and has taken a structured programming course. The text has enough material fitted modularly for either a single-term course or a year sequence. In short, the book contains enough material so instructors will be able to select topics appropriate to their needs.

Students of various backgrounds should find numerical methods quite interesting and useful, and this is kept in mind throughout the book. Thus, there is a wide variety of examples and problems that help to sharpen one's skill in both the theory and practice of numerical analysis. Computer calculations are presented in the form of tables and graphs whenever possible so that the resulting numerical approximations are easier to visualize and interpret. MATLAB programs are the vehicle for presenting the underlying numerical algorithms.

Emphasis is placed on understanding why numerical methods work and their limitations.
This is challenging and involves a balance between theory, error analysis, and readability. An error analysis for each method is presented in a fashion that is appropriate for the method at hand, yet does not turn off the reader. A mathematical derivation for each method is given that uses elementary results and builds the student's understanding of calculus. Computer assignments using MATLAB give students an opportunity to practice their skills at scientific programming.

Shorter numerical exercises can be carried out with a pocket calculator/computer, and the longer ones can be done using MATLAB subroutines. It is left for the instructor to guide the students regarding the pedagogical use of numerical computations. Each instructor can make assignments that are appropriate to the available computing resources. Experimentation with the MATLAB subroutine libraries is encouraged. These materials can be used to assist students in the completion of the numerical analysis component of computer laboratory exercises.

This Third Edition grows out of much polishing of the narrative for the Second Edition. For example, the QR method has been added to the chapter on Eigenvalues and Eigenvectors. New to this edition is the explicit use of the software MATLAB. An appendix gives an introduction to MATLAB syntax. Examples have been added throughout the text with MATLAB, and complete MATLAB programs are given in each section. An instructor's disk is available upon request from the publisher.

Previously we took the attitude that any software program that students mastered would work fine. However, many students entering this course have yet to master a programming language (computer science students excepted). MATLAB has become the tool of nearly all engineers and applied mathematicians, and its newest versions have improved the programming aspects.
So we think that students will have an easier and more productive time in this MATLAB version of our text.

Acknowledgments

We would like to express our gratitude to all the people whose efforts contributed to the various editions of this book. I (John Mathews) thank the students at California State University, Fullerton. I thank my colleagues Stephen Goode, Mathew Koshy, Edward Sabotka, Harris Schultz, and Soo Tang Tan for their support in the first edition; additionally, I thank Russell Egbert, William Gearhart, Ronald Miller, and Greg Pierce for their suggestions for the second edition. I also thank James Friel, Chairman of the Mathematics Department at CSUF, for his encouragement.

Reviewers who made useful recommendations for the first edition are Walter M. Patterson, III, Lander College; George B. Miller, Central Connecticut State University; Peter J. Gingo, The University of Akron; Michael A. Freedman, The University of Alaska, Fairbanks; and Kenneth P. Bube, University of California, Los Angeles. For the second edition, we thank Richard Bumby, Rutgers University; Robert L. Curry, U.S. Army; Bruce Edwards, University of Florida; and David R. Hill, Temple University.

For this third edition we wish to thank Tim Sauer, George Mason University; Gerald M. Pitstick, University of Oklahoma; Victor De Brunner, University of Oklahoma; George Trapp, West Virginia University; Tad Jarik, University of Alabama, Huntsville; Jeffrey S. Scroggs, North Carolina State University; Kurt Georg, Colorado State University; and James N. Craddock, Southern Illinois University at Carbondale.

Suggestions for improvements and additions to the book are always welcome and can be made by corresponding directly with the authors.

John H. Mathews
Mathematics Department
California State University
Fullerton, CA 92634
mathews@fullerton.edu

Kurtis D. Fink
Department of Mathematics
Northwest Missouri State University
Maryville, MO 64468
kfink@mail.nwmissouri.edu

1 Preliminaries

Consider the function $f(x) = \cos(x)$,
its derivative $f'(x) = -\sin(x)$, and its antiderivative $F(x) = \sin(x) + C$. These formulas were studied in calculus. The former is used to determine the slope $m = f'(x_0)$ of the curve $y = f(x)$ at a point $(x_0, f(x_0))$, and the latter is used to compute the area under the curve for $a \le x \le b$.

[...]

Definition 1.1. Assume that $f(x)$ is defined on a set $S$ of real numbers. Then $f$ is said to have the limit $L$ at $x = x_0$, and we write

(1)    $\lim_{x \to x_0} f(x) = L$

if, given any $\epsilon > 0$, there exists a $\delta > 0$ such that, whenever $x \in S$, $0 < |x - x_0| < \delta$ implies that $|f(x) - L| < \epsilon$. When the h-increment notation $x = x_0 + h$ is used, equation (1) becomes

(2)    $\lim_{h \to 0} f(x_0 + h) = L$.

Definition 1.2. Assume that $f(x)$ is defined on a set $S$ of real numbers and let $x_0 \in S$. Then $f$ is said to be continuous at $x = x_0$ if

(3)    $\lim_{x \to x_0} f(x) = f(x_0)$.

The function $f$ is said to be continuous on $S$ if it is continuous at each point $x \in S$. The notation $C^n(S)$ stands for the set of all functions $f$ such that $f$ and its first $n$ derivatives are continuous on $S$. When $S$ is an interval, say $[a, b]$, then the notation $C^n[a, b]$ is used. As an example, consider the function $f(x) = x^{4/3}$ on the interval $[-1, 1]$. Clearly, $f(x)$ and $f'(x) = (4/3)x^{1/3}$ are continuous on $[-1, 1]$, while $f''(x) = (4/9)x^{-2/3}$ is not continuous at $x = 0$.

Definition 1.3. Suppose that $\{x_n\}_{n=1}^{\infty}$ is an infinite sequence. Then the sequence is said to have the limit $L$, and we write

(4)    $\lim_{n \to \infty} x_n = L$

if, given any $\epsilon > 0$, there exists a positive integer $N = N(\epsilon)$ such that $n > N$ implies that $|x_n - L| < \epsilon$.

When a sequence has a limit, we say that it is a convergent sequence. Another commonly used notation is "$x_n \to L$ as $n \to \infty$." Equation (4) is equivalent to

(5)    $\lim_{n \to \infty} (x_n - L) = 0$.

Thus we can view the sequence $\{\epsilon_n\}_{n=1}^{\infty} = \{x_n - L\}_{n=1}^{\infty}$ as an error sequence. The following theorem relates the concepts of continuity and convergent sequence.

Theorem 1.1. Assume that $f(x)$ is defined on the set $S$ and $x_0 \in S$. The following statements are equivalent:
(a) The function $f$ is continuous at $x_0$.
(b) If $\lim_{n \to \infty} x_n = x_0$, then

(6)    $\lim_{n \to \infty} f(x_n) = f(x_0)$.

Theorem 1.2 (Intermediate Value Theorem). Assume that $f \in C[a, b]$ and $L$ is any number between $f(a)$ and $f(b)$. Then there exists a number $c$, with $c \in (a, b)$, such that $f(c) = L$.

Example 1.1.
The function $f(x) = \cos(x - 1)$ is continuous over $[0, 1]$, and the constant $L = 0.8$ lies between $f(0) = \cos(1)$ and $f(1) = \cos(0)$. The solution to $f(x) = 0.8$ over $[0, 1]$ is $c_1 = 0.356499$. Similarly, $f(x)$ is continuous over $[1, 2.5]$, and $L = 0.8$ lies between $f(2.5) = \cos(1.5)$ and $f(1) = \cos(0)$. The solution to $f(x) = 0.8$ over $[1, 2.5]$ is $c_2 = 1.643501$. These two cases are shown in Figure 1.2.

Figure 1.2 The intermediate value theorem applied to the function $f(x) = \cos(x - 1)$ over $[0, 1]$ and over the interval $[1, 2.5]$.

Figure 1.3 The extreme value theorem applied to the function $f(x) = 35 + 59.5x - 66.5x^2 + 15x^3$ over the interval $[0, 3]$.

Theorem 1.3 (Extreme Value Theorem for a Continuous Function). Assume that $f \in C[a, b]$. Then there exists a lower bound $M_1$, an upper bound $M_2$, and two numbers $x_1, x_2 \in [a, b]$ such that

(7)    $M_1 = f(x_1) \le f(x) \le f(x_2) = M_2$ whenever $x \in [a, b]$.

We sometimes express this by writing

$M_1 = f(x_1) = \min_{a \le x \le b} \{f(x)\}$ and $M_2 = f(x_2) = \max_{a \le x \le b} \{f(x)\}$.

Differentiable Functions

Definition 1.4. Assume that $f(x)$ is defined on an open interval containing $x_0$. Then $f$ is said to be differentiable at $x_0$ if

(8)    $\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$

exists. When this limit exists, it is denoted by $f'(x_0)$ and is called the derivative of $f$ at $x_0$. An equivalent way to express this limit is to use the h-increment notation:

(9)    $\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0)$.

A function that has a derivative at each point in a set $S$ is said to be differentiable on $S$. Note that the number $m = f'(x_0)$ is the slope of the tangent line to the graph of the function $y = f(x)$ at the point $(x_0, f(x_0))$.

Theorem 1.4. If $f(x)$ is differentiable at $x = x_0$, then $f(x)$ is continuous at $x = x_0$.

It follows from Theorem 1.3 that, if the function $f$ is differentiable on a closed interval $[a, b]$, then its extreme values occur at the endpoints of the interval or at the critical points (solutions of $f'(x) = 0$) in the open interval $(a, b)$.

Example 1.2. The function $f(x) = 15x^3$
$- 66.5x^2 + 59.5x + 35$ is differentiable on $[0, 3]$. The solutions to $f'(x) = 45x^2 - 133x + 59.5 = 0$ are $x_1 = 0.54955$ and $x_2 = 2.40601$. The maximum and minimum values of $f$ on $[0, 3]$ are

$\min\{f(0), f(3), f(x_1), f(x_2)\} = \min\{35, 20, 50.10438, 2.11850\} = 2.11850$

and

$\max\{f(0), f(3), f(x_1), f(x_2)\} = \max\{35, 20, 50.10438, 2.11850\} = 50.10438$.

Theorem 1.5 (Rolle's Theorem). Assume that $f \in C[a, b]$ and that $f'(x)$ exists for all $x \in (a, b)$. If $f(a) = f(b) = 0$, then there exists a number $c$, with $c \in (a, b)$, such that $f'(c) = 0$.

Theorem 1.6 (Mean Value Theorem). Assume that $f \in C[a, b]$ and that $f'(x)$ exists for all $x \in (a, b)$. Then there exists a number $c$, with $c \in (a, b)$, such that

(10)    $f'(c) = \frac{f(b) - f(a)}{b - a}$.

Geometrically, the Mean Value Theorem says that there is at least one number $c \in (a, b)$ such that the slope of the tangent line to the graph of $y = f(x)$ at the point $(c, f(c))$ equals the slope of the secant line through the points $(a, f(a))$ and $(b, f(b))$.

Example 1.3. The function $f(x) = \sin(x)$ is continuous on the closed interval $[0.1, 2.1]$ and differentiable on the open interval $(0.1, 2.1)$. Thus, by the Mean Value Theorem, there is a number $c$ such that

$f'(c) = \frac{f(2.1) - f(0.1)}{2.1 - 0.1} = \frac{0.863209 - 0.099833}{2.1 - 0.1} = 0.381688$.

The solution to $f'(c) = \cos(c) = 0.381688$ in the interval $(0.1, 2.1)$ is $c = 1.179174$. The graphs of $f(x)$, the secant line $y = 0.381688x + 0.061664$, and the tangent line $y = 0.381688x + 0.474215$ are shown in Figure 1.4.

Figure 1.4 The graphs of $f(x) = \sin(x)$, the secant line, and the tangent line of Example 1.3.

Theorem 1.7 (Generalized Rolle's Theorem). Assume that $f \in C[a, b]$ and that $f'(x), f''(x), \ldots, f^{(n)}(x)$ exist over $(a, b)$ and $x_0, x_1, \ldots, x_n \in [a, b]$. If $f(x_j) = 0$ for $j = 0, 1, \ldots, n$, then there exists a number $c$, with $c \in (a, b)$, such that $f^{(n)}(c) = 0$.

Integrals

Theorem 1.8 (First Fundamental Theorem). If $f$ is continuous over $[a, b]$ and $F$ is any antiderivative of $f$ on $[a, b]$, then

(11)    $\int_a^b f(x)\,dx = F(b) - F(a)$ where $F'(x) = f(x)$.

Theorem 1.9 (Second Fundamental Theorem). If $f$ is continuous over $[a, b]$ and $x \in (a, b)$, then

(12)    $\frac{d}{dx} \int_a^x f(t)\,dt = f(x)$.

Example 1.4.
The function $f(x) = \cos(x)$ satisfies the hypotheses of Theorem 1.9 over the interval $[0, \pi/2]$; thus by the chain rule

$\frac{d}{dx} \int_0^{x^2} \cos(t)\,dt = \cos(x^2)\,(x^2)' = 2x\cos(x^2)$.

Theorem 1.10 (Mean Value Theorem for Integrals). Assume that $f \in C[a, b]$. Then there exists a number $c$, with $c \in (a, b)$, such that

(13)    $\frac{1}{b - a} \int_a^b f(x)\,dx = f(c)$.

The value $f(c)$ is the average value of $f$ over the interval $[a, b]$.

Figure 1.5 The mean value theorem for integrals applied to $f(x) = \sin(x) + \frac{1}{3}\sin(3x)$ over the interval $[0, 2.5]$.

Example 1.5. The function $f(x) = \sin(x) + \frac{1}{3}\sin(3x)$ satisfies the hypotheses of Theorem 1.10 over the interval $[0, 2.5]$. An antiderivative of $f(x)$ is $F(x) = -\cos(x) - \frac{1}{9}\cos(3x)$. The average value of the function $f(x)$ over the interval $[0, 2.5]$ is

$\frac{1}{2.5 - 0} \int_0^{2.5} f(x)\,dx = \frac{F(2.5) - F(0)}{2.5} = \frac{0.762629 - (-1.111111)}{2.5} = \frac{1.873740}{2.5} = 0.749496$.

There are three solutions to the equation $f(c) = 0.749496$ over the interval $[0, 2.5]$: $c_1 = 0.440566$, $c_2 = 1.268010$, and $c_3 = 1.873583$. The area of the rectangle with base $b - a = 2.5$ and height $f(c_j) = 0.749496$ is $f(c_j)(b - a) = 1.873740$. The area of the rectangle has the same numerical value as the integral of $f(x)$ taken over the interval $[0, 2.5]$. A comparison of the area under the curve $y = f(x)$ and that of the rectangle can be seen in Figure 1.5.

Theorem 1.11 (Weighted Integral Mean Value Theorem). Assume that $f, g \in C[a, b]$ and $g(x) \ge 0$ for $x \in [a, b]$. Then there exists a number $c$, with $c \in (a, b)$, such that

(14)    $\int_a^b f(x)g(x)\,dx = f(c) \int_a^b g(x)\,dx$.

Example 1.6. The functions $f(x) = \sin(x)$ and $g(x) = x^2$ satisfy the hypotheses of Theorem 1.11 over the interval $[0, \pi/2]$. Thus there exists a number $c$ such that

$\sin(c) = \frac{\int_0^{\pi/2} x^2 \sin(x)\,dx}{\int_0^{\pi/2} x^2\,dx} = \frac{1.14159}{1.29193} = 0.883631$,

or $c = \sin^{-1}(0.883631) = 1.08356$.

Series

Definition 1.5. Let $\{a_n\}_{n=1}^{\infty}$ be a sequence. Then $\sum_{n=1}^{\infty} a_n$ is an infinite series. The $n$th partial sum is $S_n = \sum_{k=1}^{n} a_k$. The infinite series converges if and only if the sequence $\{S_n\}_{n=1}^{\infty}$ converges to a limit $S$, that is,

(15)    $\lim_{n \to \infty} S_n = \lim_{n \to \infty} \sum_{k=1}^{n} a_k = S$.

If a series does not converge, we say that it diverges.

Example 1.7.
Consider the infinite sequence $\{a_n\}_{n=1}^{\infty} = \left\{\frac{1}{n(n+1)}\right\}_{n=1}^{\infty}$. Then the $n$th partial sum is

$S_n = \sum_{k=1}^{n} \frac{1}{k(k+1)} = \sum_{k=1}^{n} \left(\frac{1}{k} - \frac{1}{k+1}\right) = 1 - \frac{1}{n+1}$.

Therefore the sum of the infinite series is

$S = \lim_{n \to \infty} S_n = \lim_{n \to \infty} \left(1 - \frac{1}{n+1}\right) = 1$.

Theorem 1.12 (Taylor's Theorem). Assume that $f \in C^{n+1}[a, b]$ and let $x_0 \in [a, b]$. Then, for every $x \in (a, b)$, there exists a number $c = c(x)$ (the value of $c$ depends on the value of $x$) that lies between $x_0$ and $x$ such that

(16)    $f(x) = P_n(x) + R_n(x)$,

where

(17)    $P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k$

and

(18)    $R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!} (x - x_0)^{n+1}$.

Example 1.8. The function $f(x) = \sin(x)$ satisfies the hypotheses of Theorem 1.12. Its Taylor polynomial $P_n(x)$ of degree $n = 9$ expanded about $x_0 = 0$ is obtained by evaluating the following derivatives at $x = 0$ and substituting the numerical values into formula (17):

$f(x) = \sin(x)$,    $f(0) = 0$,
$f'(x) = \cos(x)$,    $f'(0) = 1$,
$f''(x) = -\sin(x)$,    $f''(0) = 0$,
$f^{(3)}(x) = -\cos(x)$,    $f^{(3)}(0) = -1$,
...
$f^{(9)}(x) = \cos(x)$,    $f^{(9)}(0) = 1$.

$P_9(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!}$.

A graph of both $f$ and $P_9$ over the interval $[0, 2\pi]$ is shown in Figure 1.6.

Figure 1.6 The graph of $f(x) = \sin(x)$ and the Taylor polynomial $P_9(x) = x - x^3/3! + x^5/5! - x^7/7! + x^9/9!$.

Corollary 1.1. If $P_n(x)$ is the Taylor polynomial of degree $n$ given in Theorem 1.12, then

(19)    $P_n^{(k)}(x_0) = f^{(k)}(x_0)$ for $k = 0, 1, \ldots, n$.

Evaluation of a Polynomial

Let the polynomial $P(x)$ of degree $n$ have the form

(20)    $P(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0$.

Horner's method, or synthetic division, is a technique for evaluating polynomials. It can be thought of as nested multiplication. For example, a fifth-degree polynomial can be written in the nested multiplication form

$P_5(x) = ((((a_5 x + a_4)x + a_3)x + a_2)x + a_1)x + a_0$.

Theorem 1.13 (Horner's Method for Polynomial Evaluation). Assume that $P(x)$ is the polynomial given in equation (20) and $x = c$ is a number for which $P(c)$ is to be evaluated. Set $b_n = a_n$ and compute

(21)    $b_k = a_k + c\,b_{k+1}$ for $k = n-1, n-2, \ldots, 1, 0$;

then $b_0 = P(c)$. Moreover, if

(22)    $Q_0(x) = b_n x^{n-1} + b_{n-1} x^{n-2} + \cdots + b_3 x^2 + b_2 x + b_1$,

then

(23)    $P(x) = (x - c)Q_0(x) + R_0$,

where $Q_0(x)$ is the quotient polynomial of degree $n - 1$ and $R_0 = b_0 = P(c)$ is the remainder.

Proof. Substituting the right side of equation (22) for $Q_0(x)$ and $b_0$ for $R_0$ in equation (23) yields

(24)    $P(x) = (x - c)(b_n x^{n-1} + b_{n-1} x^{n-2} + \cdots + b_3 x^2 + b_2 x + b_1) + b_0$
        $= b_n x^n + (b_{n-1} - c\,b_n)x^{n-1} + \cdots + (b_2 - c\,b_3)x^2 + (b_1 - c\,b_2)x + (b_0 - c\,b_1)$.

The numbers $b_k$ are determined by comparing the coefficients of $x^k$ in equations (20) and (24), as shown in Table 1.1. The value $P(c) = b_0$ is easily obtained by substituting $x = c$ into equation (23) and using the fact that $R_0 = b_0$:

(25)    $P(c) = (c - c)Q_0(c) + R_0 = b_0$.

The recursive formula for $b_k$ given in (21) is easy to implement with a computer. A simple algorithm is

    b(n) = a(n);
    for k = n-1:-1:0
        b(k) = a(k) + c*b(k+1);
    end

Table 1.1 Coefficients $b_k$ for Horner's Method

$x^k$     | Comparing (20) and (24)          | Solving for $b_k$
$x^n$     | $a_n = b_n$                      | $b_n = a_n$
$x^{n-1}$ | $a_{n-1} = b_{n-1} - c\,b_n$     | $b_{n-1} = a_{n-1} + c\,b_n$
...       | ...                              | ...
$x^k$     | $a_k = b_k - c\,b_{k+1}$         | $b_k = a_k + c\,b_{k+1}$
...       | ...                              | ...
$x^0$     | $a_0 = b_0 - c\,b_1$             | $b_0 = a_0 + c\,b_1$

When Horner's method is performed by hand, it is easier to write the coefficients of $P(x)$ on a line and perform the calculation $b_k = a_k + c\,b_{k+1}$ below $a_k$ in a column. The format for this procedure is illustrated in Table 1.2.

Example 1.9. Use synthetic division (Horner's method) to find $P(3)$ for the polynomial

$P(x) = x^5 - 6x^4 + 8x^3 + 8x^2 + 4x - 40$.

            | $a_5$ | $a_4$ | $a_3$ | $a_2$ | $a_1$ | $a_0$
Input $c = 3$ | 1   | -6    | 8     | 8     | 4     | -40
$c\,b_{k+1}$  |     | 3     | -9    | -3    | 15    | 57
$b_k$         | 1   | -3    | -1    | 5     | 19    | 17 = $P(3)$

Therefore, $P(3) = 17$.

Exercises for Review of Calculus

1. (a) Find $L = \lim_{n \to \infty} (4n + 1)/(2n + 1)$. Then determine $\{\epsilon_n\} = \{L - x_n\}$ and find $\lim_{n \to \infty} \epsilon_n$.
   (b) Find $L = \lim_{n \to \infty} (2n^2 + 6n - 1)/(4n^2 + 2n + 1)$. Then determine $\{\epsilon_n\} = \{L - x_n\}$ and find $\lim_{n \to \infty} \epsilon_n$.
2. Let $\{x_n\}_{n=1}^{\infty}$ be a sequence such that $\lim_{n \to \infty} x_n = 2$.
   (a) Find $\lim_{n \to \infty} \sin(x_n)$.
   (b) Find $\lim_{n \to \infty} \ln(x_n^2)$.
3. Find the number(s) $c$ referred to in the intermediate value theorem for each function over the indicated interval and for the given value of $L$.
   (a) $f(x) = -x^2 + 2x + 3$ over $[-1, 0]$ using $L = 2$
   (b) $f(x) = \sqrt{x^2 - 5x - 2}$ over $[6, 8]$ using $L = 3$
4. Find the upper and lower bounds referred to in the extreme value theorem for each function over the indicated interval.
   (a) $f(x) = x^2 - 3x + 1$ over $[-1, 2]$
   (b) $f(x) = \cos^2(x) - \sin(x)$ over $[0, 2\pi]$
5. Find the number(s) $c$ referred to in Rolle's theorem for each function over the indicated interval.
   (a) $f(x) = x^4 - 4x^2$
over $[-2, 2]$
   (b) $f(x) = \sin(x) + \sin(2x)$ over $[0, 2\pi]$
6. Find the number(s) $c$ referred to in the mean value theorem for each function over the indicated interval.
   (a) $f(x) = \sqrt{x}$ over $[0, 4]$
   (b) [...]
7. Apply the generalized Rolle's theorem to $f(x) = x(x - 1)(x - 2)$ over $[0, 3]$.
8. Apply the first fundamental theorem of calculus to each function over the indicated interval.
   (a) $f(x) = xe^x$ over $[0, 2]$
   (b) [...]
9. Apply the second fundamental theorem of calculus to each function.
   (a) [...]
   (b) [...]
10. Find the number(s) $c$ referred to in the mean value theorem for integrals for each function over the indicated interval.
   (a) $f(x) = 6x^2$ over $[-3, 4]$
   (b) $f(x) = x\cos(x)$ over $[0, 3\pi/2]$
11. Find the sum of each sequence or series. [...]
12. Find the Taylor polynomial of degree $n = 4$ for each function expanded about the given value of $x_0$.
   (a) $f(x) = \sqrt{x}$, $x_0 = 1$
   (b) $f(x) = x^5 + 4x^2 + 3x + 1$, $x_0 = 0$
   (c) $f(x) = \cos(x)$, $x_0 = 0$
13. Given that $f(x) = \sin(x)$ and $P(x) = x - x^3/3! + x^5/5! - x^7/7! + x^9/9!$, show that $P^{(k)}(0) = f^{(k)}(0)$ for $k = 1, 2, \ldots, 9$.
14. Use synthetic division (Horner's method) to find $P(c)$.
   (a) $P(x) = x^4 + x^3 - 13x^2 - x - 12$, $c = 3$
   (b) [...]
15. Find the average area of all circles centered at the origin with radii between 1 and 3.
16. Assume that a polynomial, $P(x)$, has $n$ real roots in the interval $[a, b]$. Show that $P^{(n-1)}(x)$ has at least one real root in the interval $[a, b]$.
17. Assume that $f$, $f'$, and $f''$ are defined on the interval $[a, b]$; $f(a) = f(b) = 0$; and $f(c) > 0$ for $c \in (a, b)$. Show that there is a number $d \in (a, b)$ such that $f''(d) < 0$.

1.2 Binary Numbers

Human beings do arithmetic using the decimal (base 10) number system. Most computers do arithmetic using the binary (base 2) number system. It may seem otherwise, since communication with the computer (input/output) is in base 10 numbers. This transparency does not mean that the computer uses base 10. In fact, it converts inputs to base 2 (or perhaps base 16), then performs base 2 arithmetic, and finally translates the answer into base 10 before it displays a result. Some experimentation is required to verify this. One computer with nine decimal digits of accuracy gave the answer

(1)    $\sum_{k=1}^{100{,}000} 0.1 = 9999.99447$.

Here the intent was to add the number $\frac{1}{10}$ repeatedly 100,000 times. The mathematical answer is exactly 10,000. One goal is to understand the reason for the computer's apparently flawed calculation. At the end of this section it will be shown how something is lost when the computer translates the decimal fraction $\frac{1}{10}$ into a binary number.

Binary Numbers

Base 10 numbers are used for most mathematical purposes. For illustration, the number 1563 is expressible in expanded form as

$1563 = (1 \times 10^3) + (5 \times 10^2) + (6 \times 10^1) + (3 \times 10^0)$.

In general, let $N$ denote a positive integer; then the digits $a_0, a_1, \ldots, a_k$ exist so that $N$ has the base 10 expansion

$N = (a_k \times 10^k) + (a_{k-1} \times 10^{k-1}) + \cdots + (a_1 \times 10^1) + (a_0 \times 10^0)$,

where the digits $a_k$ are chosen from $\{0, 1, \ldots, 8, 9\}$. Thus $N$ is expressed in decimal notation as

(2)    $N = a_k a_{k-1} \cdots a_2 a_1 a_0{}_{\text{ten}}$ (decimal).

If it is understood that 10 is the base, then (2) is written as

$N = a_k a_{k-1} \cdots a_2 a_1 a_0$.

For example, we understand that $1563 = 1563_{\text{ten}}$.

Using powers of 2, the number 1563 can be written

(3)    $1563 = (1 \times 2^{10}) + (1 \times 2^9) + (0 \times 2^8) + (0 \times 2^7) + (0 \times 2^6) + (0 \times 2^5) + (1 \times 2^4) + (1 \times 2^3) + (0 \times 2^2) + (1 \times 2^1) + (1 \times 2^0)$.

This can be verified by performing the calculation

$1563 = 1024 + 512 + 16 + 8 + 2 + 1$.

In general, let $N$ denote a positive integer; the digits $b_0, b_1, \ldots, b_J$ exist so that $N$ has the base 2 expansion

(4)    $N = (b_J \times 2^J) + (b_{J-1} \times 2^{J-1}) + \cdots + (b_1 \times 2^1) + (b_0 \times 2^0)$,

where each digit $b_j$ is either 0 or 1. Thus $N$ is expressed in binary notation as

(5)    $N = b_J b_{J-1} \cdots b_2 b_1 b_0{}_{\text{two}}$ (binary).

Using the notation (5) and the result in (3) yields

$1563 = 11000011011_{\text{two}}$.

Remarks. The word "two" will always be used as a subscript at the end of a binary number. This will enable the reader to distinguish binary numbers from the ordinary base 10 usage. Thus 111 means one hundred eleven, whereas $111_{\text{two}}$ stands for seven.

It is usually the case that the binary representation for $N$ will require more digits than the decimal representation. This is due to the fact that powers of 2 grow much more slowly than do powers of 10.
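The expansion of 1563 in powers of 2 can be checked mechanically. The following is a minimal Python sketch (it is not one of the text's MATLAB programs, and the function name `binary_to_int` is chosen here purely for illustration). It evaluates the base 2 expansion digit by digit and compares the number of binary digits with the number of decimal digits.

```python
def binary_to_int(digits: str) -> int:
    """Evaluate a base 2 expansion: the sum of b_j * 2**j.

    `digits` is written most significant digit first, e.g. "111" for seven.
    """
    if not digits or any(b not in "01" for b in digits):
        raise ValueError("expected a nonempty string of 0s and 1s")
    total = 0
    # reversed() makes position j line up with the power 2**j
    for j, b in enumerate(reversed(digits)):
        total += int(b) * 2 ** j
    return total


if __name__ == "__main__":
    n = binary_to_int("11000011011")
    print(n)                                  # 1563
    print(binary_to_int("111"))               # 7
    print(len("11000011011"), len(str(n)))    # 11 4
```

The last line of output shows the digit counts for this one number: eleven binary digits against four decimal digits.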
An efficient algorithm for finding the base 2 representation of the integer $N$ can be derived from equation (4). Dividing both sides of (4) by 2 yields

(6)    $\frac{N}{2} = (b_J \times 2^{J-1}) + (b_{J-1} \times 2^{J-2}) + \cdots + (b_1 \times 2^0) + \frac{b_0}{2}$.

Hence the remainder, upon dividing $N$ by 2, is the digit $b_0$. Next determine $b_1$. If (6) is written as $N/2 = Q_0 + b_0/2$, then

(7)    $Q_0 = (b_J \times 2^{J-1}) + (b_{J-1} \times 2^{J-2}) + \cdots + (b_2 \times 2^1) + (b_1 \times 2^0)$.

Now divide both sides of (7) by 2 to get

$\frac{Q_0}{2} = (b_J \times 2^{J-2}) + (b_{J-1} \times 2^{J-3}) + \cdots + (b_2 \times 2^0) + \frac{b_1}{2}$.

[...]

Theorem 1.14 (Geometric Series). The geometric series has the following properties:

(12)    If $|r| < 1$, then $\sum_{n=0}^{\infty} c\,r^n = \frac{c}{1 - r}$.

(13)    If $|r| \ge 1$, then the series diverges.

Proof. The summation formula for a finite geometric series is

(14)    $S_n = c + cr + cr^2 + \cdots + cr^n = \frac{c(1 - r^{n+1})}{1 - r}$ for $r \ne 1$.

To establish (12), observe that

(15)    $|r| < 1$ implies that $\lim_{n \to \infty} r^{n+1} = 0$.

Taking the limit as $n \to \infty$, use (14) and (15) to get

$\lim_{n \to \infty} S_n = \frac{c}{1 - r} \left(1 - \lim_{n \to \infty} r^{n+1}\right) = \frac{c}{1 - r}$.

By equation (15) of Section 1.1, the limit above establishes (12). When $|r| \ge 1$, the sequence $\{r^{n+1}\}$ does not converge. Hence the sequence $\{S_n\}$ in (14) does not tend to a limit. Therefore, (13) is established.

Equation (12) in Theorem 1.14 represents an efficient way to convert an infinite repeating decimal into a fraction.

Example 1.11. [...]

Binary Fractions

Binary (base 2) fractions can be expressed as sums involving negative powers of 2. If $R$ is a real number that lies in the range $0 < R < 1$, there exist digits $d_1, d_2, \ldots, d_n, \ldots$ so that

(16)    $R = (d_1 \times 2^{-1}) + (d_2 \times 2^{-2}) + \cdots + (d_n \times 2^{-n}) + \cdots$,

where $d_j \in \{0, 1\}$. We usually express the quantity on the right side of (16) in the binary fraction notation

(17)    $R = 0.d_1 d_2 \cdots d_n \cdots{}_{\text{two}}$.

There are many real numbers whose binary representation requires infinitely many digits. The fraction 7/10 can be expressed as 0.7 in base 10, yet its base 2 representation requires infinitely many digits:

(18)    $\frac{7}{10} = 0.1\overline{0110}_{\text{two}}$.

The binary fraction in (18) is a repeating fraction where the group of four digits 0110 is repeated forever.

An efficient algorithm for finding base 2 representations can now be developed. If both sides of (16) are multiplied by 2, the result is

(19)    $2R = d_1 + \left((d_2 \times 2^{-1}) + \cdots + (d_n \times 2^{-n+1}) + \cdots\right)$.

The quantity in parentheses on the right side of (19) is a positive number and is less than 1. Therefore $d_1$ is the integer part of $2R$, denoted $d_1 = \operatorname{int}(2R)$.
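To continue the process, take the fractional part of (19) and write

(20)    $F_1 = \operatorname{frac}(2R) = (d_2 \times 2^{-1}) + \cdots + (d_n \times 2^{-n+1}) + \cdots$,

where $\operatorname{frac}(2R)$ is the fractional part of the real number $2R$. Multiplication of both sides of (20) by 2 results in

(21)    $2F_1 = d_2 + \left((d_3 \times 2^{-1}) + \cdots + (d_n \times 2^{-n+2}) + \cdots\right)$.

Now take the integer part of (21) and obtain $d_2 = \operatorname{int}(2F_1)$.

The process is continued, possibly ad infinitum (if $R$ has an infinite nonrepeating base 2 representation), and two sequences $\{d_k\}$ and $\{F_k\}$ are recursively generated:

(22)    $d_k = \operatorname{int}(2F_{k-1})$,    $F_k = \operatorname{frac}(2F_{k-1})$,

where $d_1 = \operatorname{int}(2R)$ and $F_1 = \operatorname{frac}(2R)$. The binary decimal representation of $R$ is then given by the convergent geometric series

$R = \sum_{j=1}^{\infty} d_j (2)^{-j}$.

Example 1.12. The binary decimal representation of 7/10 given in (18) was found using the formulas in (22). Let $R = 7/10 = 0.7$; then

$2R = 1.4$,      $d_1 = \operatorname{int}(1.4) = 1$,    $F_1 = \operatorname{frac}(1.4) = 0.4$
$2F_1 = 0.8$,    $d_2 = \operatorname{int}(0.8) = 0$,    $F_2 = \operatorname{frac}(0.8) = 0.8$
$2F_2 = 1.6$,    $d_3 = \operatorname{int}(1.6) = 1$,    $F_3 = \operatorname{frac}(1.6) = 0.6$
$2F_3 = 1.2$,    $d_4 = \operatorname{int}(1.2) = 1$,    $F_4 = \operatorname{frac}(1.2) = 0.2$
$2F_4 = 0.4$,    $d_5 = \operatorname{int}(0.4) = 0$,    $F_5 = \operatorname{frac}(0.4) = 0.4$
$2F_5 = 0.8$,    $d_6 = \operatorname{int}(0.8) = 0$,    $F_6 = \operatorname{frac}(0.8) = 0.8$
$2F_6 = 1.6$,    $d_7 = \operatorname{int}(1.6) = 1$,    $F_7 = \operatorname{frac}(1.6) = 0.6$

Note that $2F_2 = 1.6 = 2F_6$. The patterns $d_k = d_{k+4}$ and $F_k = F_{k+4}$ will occur for $k = 2, 3, 4, \ldots$. Thus $7/10 = 0.1\overline{0110}_{\text{two}}$.

Geometric series can be used to find the base 10 rational number that a binary number represents.

Example 1.13. Find the base 10 rational number that the binary number $0.\overline{01}_{\text{two}}$ represents. In expanded form,

$0.\overline{01}_{\text{two}} = (0 \times 2^{-1}) + (1 \times 2^{-2}) + (0 \times 2^{-3}) + (1 \times 2^{-4}) + \cdots$
$= \sum_{k=1}^{\infty} (2^{-2})^k = -1 + \sum_{k=0}^{\infty} (2^{-2})^k$
$= -1 + \frac{1}{1 - \frac{1}{4}} = -1 + \frac{4}{3} = \frac{1}{3}$.

Binary Shifting

If a rational number that is equivalent to an infinite repeating binary expansion is to be found, then a shift in the digits can be helpful. For example, let $S$ be given by

(23)    $S = 0.00000\overline{11000}_{\text{two}}$.

Multiplying both sides of (23) by $2^5$ will shift the binary point five places to the right, and $32S$ has the form

(24)    $32S = 0.\overline{11000}_{\text{two}}$.

Similarly, multiplying both sides of (23) by $2^{10}$ will shift the binary point ten places to the right, and $1024S$ has the form

(25)    $1024S = 11000.\overline{11000}_{\text{two}}$.

The result of naively taking the differences between the left- and right-hand sides of (24) and (25) is $992S = 11000_{\text{two}}$, or $992S = 24$, since $11000_{\text{two}} = 24$. Therefore, $S = \frac{24}{992} = \frac{3}{124}$.

Scientific Notation

A standard way to present a real number, called scientific notation, is obtained by shifting the decimal point and supplying an appropriate power of 10.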
For example,

$0.0000747 = 7.47 \times 10^{-5}$,
$31.4159265 = 3.14159265 \times 10$,
$9{,}700{,}000{,}000 = 9.7 \times 10^9$.

In chemistry, an important constant is Avogadro's number, which is $6.02252 \times 10^{23}$. It is the number of atoms in the gram atomic weight of an element. In computer science, $1\text{K} = 1.024 \times 10^3$.

Table 1.3 Decimal Equivalents for a Set of Binary Numbers with 4-Bit Mantissa and Exponent of $n = -3, -2, \ldots, 3, 4$

Mantissa                 | $n = -3$  | $n = -2$ | $n = -1$ | $n = 0$ | $n = 1$ | $n = 2$ | $n = 3$ | $n = 4$
$0.1000_{\text{two}}$    | 0.0625    | 0.125    | 0.25     | 0.5     | 1       | 2       | 4       | 8
$0.1001_{\text{two}}$    | 0.0703125 | 0.140625 | 0.28125  | 0.5625  | 1.125   | 2.25    | 4.5     | 9
$0.1010_{\text{two}}$    | 0.078125  | 0.15625  | 0.3125   | 0.625   | 1.25    | 2.5     | 5       | 10
$0.1011_{\text{two}}$    | 0.0859375 | 0.171875 | 0.34375  | 0.6875  | 1.375   | 2.75    | 5.5     | 11
$0.1100_{\text{two}}$    | 0.09375   | 0.1875   | 0.375    | 0.75    | 1.5     | 3       | 6       | 12
$0.1101_{\text{two}}$    | 0.1015625 | 0.203125 | 0.40625  | 0.8125  | 1.625   | 3.25    | 6.5     | 13
$0.1110_{\text{two}}$    | 0.109375  | 0.21875  | 0.4375   | 0.875   | 1.75    | 3.5     | 7       | 14
$0.1111_{\text{two}}$    | 0.1171875 | 0.234375 | 0.46875  | 0.9375  | 1.875   | 3.75    | 7.5     | 15

Machine Numbers

Computers use a normalized floating-point binary representation for real numbers. This means that the mathematical quantity $x$ is not actually stored in the computer. Instead, the computer stores a binary approximation to $x$:

(26)    $x \approx \pm q \times 2^n$.

The number $q$ is the mantissa, and it is a finite binary expression satisfying the inequality $1/2 \le q < 1$. The integer $n$ is called the exponent.

In a computer, only a small subset of the real number system is used. Typically, this subset contains only a portion of the binary numbers suggested by (26). The number of binary digits is restricted in both the numbers $q$ and $n$. For example, consider the set of all positive real numbers of the form

(27)    $0.d_1 d_2 d_3 d_4{}_{\text{two}} \times 2^n$,

where $d_1 = 1$ and $d_2$, $d_3$, and $d_4$ are either 0 or 1, and $n \in \{-3, -2, -1, 0, 1, 2, 3, 4\}$. There are eight choices for the mantissa and eight choices for the exponent in (27), and this produces a set of 64 numbers:

(28)    $\{0.1000_{\text{two}} \times 2^{-3},\ 0.1001_{\text{two}} \times 2^{-3},\ \ldots,\ 0.1110_{\text{two}} \times 2^{4},\ 0.1111_{\text{two}} \times 2^{4}\}$.

The decimal forms of these 64 numbers are given in Table 1.3.
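The set of 64 numbers is small enough to enumerate directly. The following is a minimal Python sketch (not one of the text's MATLAB programs; the names `machine_numbers` and `nearest_machine_number` are hypothetical). It builds the set with exact rational arithmetic and picks the closest member to a given value. The rule that ties go to the larger value is an assumption of this sketch.

```python
from fractions import Fraction


def machine_numbers():
    """Return the sorted values 0.1 d2 d3 d4 (base 2) times 2**n for n = -3, ..., 4."""
    values = []
    for n in range(-3, 5):
        for m in range(8, 16):      # m/16 runs over 0.1000_two, ..., 0.1111_two
            values.append(Fraction(m, 16) * Fraction(2) ** n)
    return sorted(values)


def nearest_machine_number(x):
    """Closest member of the set to x; a tie goes to the larger value (an assumption)."""
    return min(machine_numbers(), key=lambda v: (abs(v - x), -v))


if __name__ == "__main__":
    nums = machine_numbers()
    print(len(nums))                                        # 64
    print(float(nums[0]), float(nums[-1]))                  # 0.0625 15.0
    print(float(nearest_machine_number(Fraction(1, 10))))   # 0.1015625
```

Exact fractions are used so that the enumeration itself introduces no rounding. The value stored for 1/10 is 13/128 = 0.1015625, the mantissa $0.1101_{\text{two}}$ with exponent $n = -3$.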
It is important to learn that when the mantissa and exponent in (27) are restricted, the computer has a limited number of values it chooses from to store as an approximation to the real number $x$.

What would happen if a computer had only a 4-bit mantissa and was restricted to perform the computation $\left(\frac{1}{10} + \frac{1}{5}\right) + \frac{1}{6}$? Assume that the computer rounds all real numbers to the closest binary number in Table 1.3. At each step the reader can look at the table to see that the best approximation is being used.

(29)
    $\frac{1}{10} \approx 0.1101_{\text{two}} \times 2^{-3} = 0.01101_{\text{two}} \times 2^{-2}$
    $\frac{1}{5} \approx 0.1101_{\text{two}} \times 2^{-2} = 0.1101_{\text{two}} \times 2^{-2}$
    Sum: $\frac{3}{10} \approx 1.00111_{\text{two}} \times 2^{-2}$

The computer must decide how to store the number $1.00111_{\text{two}} \times 2^{-2}$. Assume that it is rounded to $0.1010_{\text{two}} \times 2^{-1}$. The next step is

(30)
    $\frac{3}{10} \approx 0.1010_{\text{two}} \times 2^{-1} = 0.1010_{\text{two}} \times 2^{-1}$
    $\frac{1}{6} \approx 0.1011_{\text{two}} \times 2^{-2} = 0.01011_{\text{two}} \times 2^{-1}$
    Sum: $\frac{7}{15} \approx 0.11111_{\text{two}} \times 2^{-1}$

The computer must decide how to store the number $0.11111_{\text{two}} \times 2^{-1}$. Since rounding is assumed to take place, it stores $0.10000_{\text{two}} \times 2^{0}$. Therefore, the computer's solution to the addition problem is

(31)    $\frac{7}{15} \approx 0.10000_{\text{two}} \times 2^{0}$.

The error in the computer's calculation is

(32)    $\frac{7}{15} - 0.10000_{\text{two}} \approx 0.466667 - 0.500000 \approx -0.033333$.

Expressed as a percentage of 7/15, this amounts to 7.14%.

Computer Accuracy

To store numbers accurately, computers must have floating-point binary numbers with at least 24 binary bits used for the mantissa; this translates to about seven decimal places. If a 32-bit mantissa is used, numbers with nine decimal places can be stored. Now, again, consider the difficulty encountered in (1) at the beginning of the section, when a computer added 1/10 repeatedly.

Suppose that the mantissa $q$ in (26) contains 32 binary bits. The condition $1/2 \le q$ [...]

Suppose that $k$ is the maximum number of decimal digits carried in the floating-point computations of a computer; then the real number $p$ is represented by $fl_{\text{chop}}(p)$, which is given by

(4)    $fl_{\text{chop}}(p) = \pm 0.d_1 d_2 d_3 \cdots d_k \times 10^n$,

where $1 \le d_1 \le 9$ and $0 \le d_j \le 9$ for $1 < j \le k$. The number $fl_{\text{chop}}(p)$ is called the chopped floating-point representation of $p$. In this case the $k$th digit of $fl_{\text{chop}}(p)$ agrees with the $k$th digit of $p$.
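Chopping to $k$ digits can be sketched with decimal arithmetic. The following minimal Python example (not one of the text's MATLAB programs; the name `fl_chop` simply mirrors the notation above) uses the standard `decimal` module with `ROUND_DOWN`, which discards every digit after the first $k$ significant digits. The input is passed as a decimal string, an assumption made so that binary floating-point conversion does not disturb the digits being chopped.

```python
from decimal import Context, Decimal, ROUND_DOWN


def fl_chop(p: str, k: int) -> Decimal:
    """Keep the first k significant decimal digits of p and discard the rest."""
    # create_decimal applies the context's precision and rounding mode
    return Context(prec=k, rounding=ROUND_DOWN).create_decimal(p)


if __name__ == "__main__":
    print(fl_chop("3.14159265", 4))   # 3.141
    print(fl_chop("0.0123456", 3))    # 0.0123
    print(fl_chop("-2.71828", 3))     # -2.71
```

The result is printed in ordinary decimal notation rather than the normalized form $\pm 0.d_1 d_2 \cdots d_k \times 10^n$; the retained digits are the same.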
An alternative $k$-digit representation is the rounded floating-point representation $fl_{\text{round}}(p)$, which is given by

(5)    $fl_{\text{round}}(p) = \pm 0.d_1 d_2 d_3 \cdots r_k \times 10^n$,

where $1 \le d_1 \le 9$ and $0 \le d_j \le 9$ for $1 < j < k$, and the last digit, $r_k$, is obtained by rounding the number $d_k . d_{k+1} d_{k+2} \cdots$ to the nearest integer. For example, the real number

$p = \frac{22}{7} = 3.142857142857142857\ldots$

has the following six-digit representations:

$fl_{\text{chop}}(p) = 0.314285 \times 10^1$,
$fl_{\text{round}}(p) = 0.314286 \times 10^1$.

For common purposes the chopping and rounding would be written as 3.14285 and 3.14286, respectively. The reader should note that essentially all computers use some form of the rounded floating-point representation method.

Loss of Significance

Consider the two numbers $p = 3.1415926536$ and $q = 3.1415957341$, which are nearly equal and both carry 11 decimal digits of precision. Suppose that their difference is formed: $p - q = -0.0000030805$. Since the first six digits of $p$ and $q$ are the same, their difference $p - q$ contains only five decimal digits of precision. This phenomenon is called loss of significance or subtractive cancellation. This reduction in the precision of the final computed answer can creep in when it is not suspected.

Example 1.17. Compare the results of calculating $f(500)$ and $g(500)$ using six digits and rounding. The functions are $f(x) = x\left(\sqrt{x+1} - \sqrt{x}\right)$ and $g(x) = \frac{x}{\sqrt{x+1} + \sqrt{x}}$. For the first function,

$f(500) = 500\left(\sqrt{501} - \sqrt{500}\right) = 500(22.3830 - 22.3607) = 500(0.0223) = 11.1500$.

For $g(x)$,

$g(500) = \frac{500}{\sqrt{501} + \sqrt{500}} = \frac{500}{22.3830 + 22.3607} = \frac{500}{44.7437} = 11.1748$.

The second function, $g(x)$, is algebraically equivalent to $f(x)$, as shown by the computation

$f(x) = \frac{x\left(\sqrt{x+1} - \sqrt{x}\right)\left(\sqrt{x+1} + \sqrt{x}\right)}{\sqrt{x+1} + \sqrt{x}} = \frac{x\left(\left(\sqrt{x+1}\right)^2 - \left(\sqrt{x}\right)^2\right)}{\sqrt{x+1} + \sqrt{x}} = \frac{x}{\sqrt{x+1} + \sqrt{x}}$.

The answer, $g(500) = 11.1748$, involves less error and is the same as that obtained by rounding the true answer $11.174755300747198\ldots$ to six digits.

The reader is encouraged to study Exercise 12 on how to avoid loss of significance in the quadratic formula. The next example shows that a truncated Taylor series will sometimes help avoid the loss of significance error.

Example 1.18.
Compare the results of calculating $f(0.01)$ and $P(0.01)$ using six digits and rounding, where

$$f(x) = \frac{e^x - 1 - x}{x^2} \quad\text{and}\quad P(x) = \frac{1}{2} + \frac{x}{6} + \frac{x^2}{24}.$$

The function $P(x)$ is the Taylor polynomial of degree $n = 2$ for $f(x)$ expanded about $x = 0$.

For the first function,

$$f(0.01) = \frac{e^{0.01} - 1 - 0.01}{(0.01)^2} = \frac{1.010050 - 1 - 0.01}{0.0001} = 0.5.$$

For the second function,

$$P(0.01) = \frac{1}{2} + \frac{0.01}{6} + \frac{0.0001}{24} = 0.5 + 0.001667 + 0.000004 = 0.501671.$$

The answer $P(0.01) = 0.501671$ contains less error and is the same as that obtained by rounding the true answer $0.50167084168057542\ldots$ to six digits.

For polynomial evaluation, the rearrangement of terms into nested multiplication form will sometimes produce a better result.

Example 1.19. Let $P(x) = x^3 - 3x^2 + 3x - 1$ and $Q(x) = ((x - 3)x + 3)x - 1$. Use three-digit rounding arithmetic to compute approximations to $P(2.19)$ and $Q(2.19)$. Compare them with the true values, $P(2.19) = Q(2.19) = 1.685159$.

$$P(2.19) \approx (2.19)^3 - 3(2.19)^2 + 3(2.19) - 1 = 10.5 - 14.4 + 6.57 - 1 = 1.67.$$

$$Q(2.19) \approx ((2.19 - 3)2.19 + 3)2.19 - 1 = 1.69.$$

The errors are 0.015159 and $-0.004841$, respectively. Thus the approximation $Q(2.19) \approx 1.69$ has less error. Exercise 6 explores the situation near the root of this polynomial.

$O(h^n)$ Order of Approximation

Clearly the sequences $\{1/n^2\}$ and $\{1/n\}$ are both converging to zero. In addition, it should be observed that the first sequence is converging to zero more rapidly than the second sequence. In the coming chapters some special terminology and notation will be used to describe how rapidly a sequence is converging.

Definition 1.9. The function $f(h)$ is said to be big Oh of $g(h)$, denoted $f(h) = O(g(h))$, if there exist constants $C$ and $c$ such that

$$|f(h)| \le C|g(h)| \quad\text{whenever } h \le c. \tag{7}$$

Example 1.20. Consider the functions $f(x) = x^2 + 1$ and $g(x) = x^3$. Since $x^2 \le x^3$ and $1 \le x^3$ for $x \ge 1$, it follows that $x^2 + 1 \le 2x^3$ for $x \ge 1$. Therefore, $f(x) = O(g(x))$.

The big Oh notation provides a useful way of describing the rate of growth of a function in terms of well-known elementary functions ($x^n$, $x^{1/n}$, $a^x$, $\log_a x$, etc.). The rate of convergence of sequences can be described in a similar manner.

Definition 1.10.
Let $\{x_n\}_{n=1}^{\infty}$ and $\{y_n\}_{n=1}^{\infty}$ be two sequences. The sequence $\{x_n\}$ is said to be of order big Oh of $\{y_n\}$, denoted $x_n = O(y_n)$, if there exist constants $C$ and $N$ such that

$$|x_n| \le C|y_n| \quad\text{whenever } n \ge N. \tag{8}$$

Example 1.21. $\dfrac{n^2 - 1}{n^3} = O\!\left(\dfrac{1}{n}\right)$, since $\dfrac{n^2 - 1}{n^3} \le \dfrac{n^2}{n^3} = \dfrac{1}{n}$ whenever $n \ge 1$.

Often a function $f(h)$ is approximated by a function $p(h)$ and the error bound is known to be $M|h^n|$. This leads to the following definition.

Definition 1.11. Assume that $f(h)$ is approximated by the function $p(h)$ and that there exist a real constant $M > 0$ and a positive integer $n$ so that

$$\frac{|f(h) - p(h)|}{|h^n|} \le M \quad\text{for sufficiently small } h. \tag{9}$$

We say that $p(h)$ approximates $f(h)$ with order of approximation $O(h^n)$ and write

$$f(h) = p(h) + O(h^n). \tag{10}$$

When relation (9) is rewritten in the form $|f(h) - p(h)| \le M|h^n|$, we see that the notation $O(h^n)$ stands in place of the error bound $M|h^n|$. The following results show how to apply the definition to simple combinations of two functions.

Theorem 1.15. Assume that $f(h) = p(h) + O(h^n)$, $g(h) = q(h) + O(h^m)$, and $r = \min\{m, n\}$. Then

$$f(h) + g(h) = p(h) + q(h) + O(h^r), \tag{11}$$

$$f(h)g(h) = p(h)q(h) + O(h^r), \tag{12}$$

and

$$\frac{f(h)}{g(h)} = \frac{p(h)}{q(h)} + O(h^r) \quad\text{provided that } g(h) \ne 0 \text{ and } q(h) \ne 0. \tag{13}$$

It is instructive to consider $p(x)$ to be the $n$th Taylor polynomial approximation of $f(x)$; then the remainder term is simply designated $O(h^{n+1})$, which stands for the presence of omitted terms starting with the power $h^{n+1}$. The remainder term converges to zero with the same rapidity that $h^{n+1}$ converges to zero as $h$ approaches zero, as expressed in the relationship

$$O(h^{n+1}) \approx M h^{n+1} \approx \frac{f^{(n+1)}(c)}{(n+1)!}\, h^{n+1} \tag{14}$$

for sufficiently small $h$. Hence the notation $O(h^{n+1})$ stands in place of the quantity $M h^{n+1}$, where $M$ is a constant or "behaves like a constant."

Theorem 1.16 (Taylor's Theorem). Assume that $f \in C^{n+1}[a, b]$. If both $x_0$ and $x = x_0 + h$ lie in $[a, b]$, then

$$f(x_0 + h) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\, h^k + O(h^{n+1}). \tag{15}$$

The following example illustrates the above theorems. The computations use the addition properties (i) $O(h^p) + O(h^p) = O(h^p)$, (ii)
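$O(h^p) + O(h^q) = O(h^r)$, where $r = \min\{p, q\}$, and the multiplicative property (iii) $O(h^p)\,O(h^q) = O(h^s)$, where $s = p + q$.

Example 1.22. Consider the Taylor polynomial expansions

$$e^h = 1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + O(h^4) \quad\text{and}\quad \cos(h) = 1 - \frac{h^2}{2!} + \frac{h^4}{4!} + O(h^6).$$

Determine the order of approximation for their sum and product.

For the sum we have

$$e^h + \cos(h) = 1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + O(h^4) + 1 - \frac{h^2}{2!} + \frac{h^4}{4!} + O(h^6) = 2 + h + \frac{h^3}{3!} + O(h^4) + \frac{h^4}{4!} + O(h^6).$$

Since $O(h^4) + \dfrac{h^4}{4!} = O(h^4)$ and $O(h^4) + O(h^6) = O(h^4)$, this reduces to

$$e^h + \cos(h) = 2 + h + \frac{h^3}{3!} + O(h^4),$$

and the order of approximation is $O(h^4)$.

The product is treated similarly:

$$e^h \cos(h) = \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!} + O(h^4)\right)\left(1 - \frac{h^2}{2!} + \frac{h^4}{4!} + O(h^6)\right)$$

$$= \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!}\right)\left(1 - \frac{h^2}{2!} + \frac{h^4}{4!}\right) + \left(1 + h + \frac{h^2}{2!} + \frac{h^3}{3!}\right)O(h^6) + \left(1 - \frac{h^2}{2!} + \frac{h^4}{4!}\right)O(h^4) + O(h^4)\,O(h^6)$$

$$= 1 + h - \frac{h^3}{3} - \frac{5h^4}{24} - \frac{h^5}{24} + \frac{h^6}{48} + \frac{h^7}{144} + O(h^6) + O(h^4) + O(h^4)\,O(h^6).$$

Since $O(h^4)\,O(h^6) = O(h^{10})$ and

$$-\frac{5h^4}{24} - \frac{h^5}{24} + \frac{h^6}{48} + \frac{h^7}{144} + O(h^6) + O(h^4) + O(h^{10}) = O(h^4),$$

the preceding equation is simplified to yield

$$e^h \cos(h) = 1 + h - \frac{h^3}{3} + O(h^4),$$

and the order of approximation is $O(h^4)$.

Order of Convergence of a Sequence

Numerical approximations are often arrived at by computing a sequence of approximations that get closer and closer to the desired answer. The definition of big Oh for sequences was given in Definition 1.10, and the definition of order of convergence for a sequence is analogous to that given for functions in Definition 1.11.

Definition 1.12. Suppose that $\lim_{n\to\infty} x_n = x$ and $\{r_n\}_{n=1}^{\infty}$ is a sequence with $\lim_{n\to\infty} r_n = 0$. We say that $\{x_n\}_{n=1}^{\infty}$ converges to $x$ with the order of convergence $O(r_n)$ if there exists a constant $K > 0$ such that

$$\frac{|x_n - x|}{|r_n|} \le K \quad\text{for } n \text{ sufficiently large}.$$

This is indicated by writing $x_n = x + O(r_n)$, or $x_n \to x$ with order of convergence $O(r_n)$.

Example 1.23. Let $x_n = \cos(n)/n^2$ and $r_n = 1/n^2$; then $\lim_{n\to\infty} x_n = 0$ with rate of convergence $O(1/n^2)$. This follows immediately from the relation

$$\left|\frac{\cos(n)/n^2}{1/n^2}\right| = |\cos(n)| \le 1 \quad\text{for all } n.$$

Propagation of Error

Let us investigate how error might be propagated in successive computations. Consider the addition of two numbers $p$ and $q$ (the true values) with the approximate values $\hat{p}$ and $\hat{q}$, which contain errors $\epsilon_p$ and $\epsilon_q$, respectively. Starting with $p = \hat{p} + \epsilon_p$ and $q = \hat{q} + \epsilon_q$, the sum is

$$p + q = (\hat{p} + \epsilon_p) + (\hat{q} + \epsilon_q) = (\hat{p} + \hat{q}) + (\epsilon_p + \epsilon_q). \tag{16}$$

Hence, for addition, the error in the sum is the sum of the errors in the addends.

The propagation of error in multiplication is more complicated. The product is

$$pq = (\hat{p} + \epsilon_p)(\hat{q} + \epsilon_q) = \hat{p}\hat{q} + \hat{p}\epsilon_q + \hat{q}\epsilon_p + \epsilon_p\epsilon_q. \tag{17}$$

Hence,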
if $\hat{p}$ and $\hat{q}$ are larger than 1 in absolute value, the terms $\hat{p}\epsilon_q$ and $\hat{q}\epsilon_p$ show that there is a possibility of magnification of the original errors $\epsilon_p$ and $\epsilon_q$. Insights are gained if we look at the relative error. Rearrange the terms in (17) to get

$$pq - \hat{p}\hat{q} = \hat{p}\epsilon_q + \hat{q}\epsilon_p + \epsilon_p\epsilon_q. \tag{18}$$

Suppose that $p \ne 0$ and $q \ne 0$; then we can divide (18) by $pq$ to obtain the relative error in the product $\hat{p}\hat{q}$:

$$R_{pq} = \frac{pq - \hat{p}\hat{q}}{pq} = \frac{\hat{p}\epsilon_q + \hat{q}\epsilon_p + \epsilon_p\epsilon_q}{pq} = \frac{\hat{p}\epsilon_q}{pq} + \frac{\hat{q}\epsilon_p}{pq} + \frac{\epsilon_p\epsilon_q}{pq}. \tag{19}$$

Furthermore, suppose that $\hat{p}$ and $\hat{q}$ are good approximations for $p$ and $q$; then $\hat{p}/p \approx 1$, $\hat{q}/q \approx 1$, and $R_p R_q = (\epsilon_p/p)(\epsilon_q/q) \approx 0$ ($R_p$ and $R_q$ are the relative errors in the approximations $\hat{p}$ and $\hat{q}$). Then making these substitutions into (19) yields a simplified relationship:

$$R_{pq} = \frac{pq - \hat{p}\hat{q}}{pq} \approx \frac{\epsilon_q}{q} + \frac{\epsilon_p}{p} + 0 = R_q + R_p. \tag{20}$$

This shows that the relative error in the product $pq$ is approximately the sum of the relative errors in the approximations $\hat{p}$ and $\hat{q}$.

Often an initial error will be propagated in a sequence of calculations. A quality that is desirable for any numerical process is that a small error in the initial conditions will produce small changes in the final result. An algorithm with this feature is called stable; otherwise, it is called unstable. Whenever possible we shall choose methods that are stable. The following definition is used to describe the propagation of error.

Definition 1.13. Suppose that $\epsilon$ represents an initial error and $\epsilon(n)$ represents the growth of the error after $n$ steps. If $|\epsilon(n)| \approx n\epsilon$, the growth of error is said to be linear. If $|\epsilon(n)| \approx K^n \epsilon$, the growth of error is called exponential. If $K > 1$, the exponential error grows without bound as $n \to \infty$, and if $0 < K < 1$, the exponential error diminishes to zero as $n \to \infty$.

The next two examples show how an initial error can propagate in either a stable or an unstable fashion. In the first example, three algorithms are introduced. Each algorithm recursively generates the same sequence. Then,
in the second example, small changes will be made to the initial conditions and the propagation of error will be analyzed.

[Table 1.4 The Sequence $\{x_n\} = \{1/3^n\}$ and the Approximations $\{r_n\}$, $\{p_n\}$, and $\{q_n\}$, for $n = 0, 1, \ldots, 10$; entries not legible.]

[Table 1.5 The Error Sequences $\{x_n - r_n\}$, $\{x_n - p_n\}$, and $\{x_n - q_n\}$, for $n = 0, 1, \ldots, 10$; entries not legible.]

Example 1.24. Show that the following three schemes can be used with infinite-precision arithmetic to recursively generate the terms in the sequence $\{1/3^n\}_{n=0}^{\infty}$.

$$r_0 = 1 \quad\text{and}\quad r_n = \frac{1}{3}\, r_{n-1} \quad\text{for } n = 1, 2, \ldots, \tag{21a}$$

$$p_0 = 1,\ p_1 = \frac{1}{3}, \quad\text{and}\quad p_n = \frac{4}{3}\, p_{n-1} - \frac{1}{3}\, p_{n-2} \quad\text{for } n = 2, 3, \ldots, \tag{21b}$$

$$q_0 = 1,\ q_1 = \frac{1}{3}, \quad\text{and}\quad q_n = \frac{10}{3}\, q_{n-1} - q_{n-2} \quad\text{for } n = 2, 3, \ldots. \tag{21c}$$

Formula (21a) is obvious. In (21b) the difference equation has the general solution $p_n = A(1/3^n) + B$. This can be verified by direct substitution:

$$\frac{4}{3}\, p_{n-1} - \frac{1}{3}\, p_{n-2} = \frac{4}{3}\left(\frac{A}{3^{n-1}} + B\right) - \frac{1}{3}\left(\frac{A}{3^{n-2}} + B\right) = \left(\frac{4}{3^n} - \frac{3}{3^n}\right)A + \left(\frac{4}{3} - \frac{1}{3}\right)B = A\,\frac{1}{3^n} + B = p_n.$$

Setting $A = 1$ and $B = 0$ will generate the desired sequence. In (21c) the difference equation has the general solution $q_n = A(1/3^n) + B\,3^n$. This too is verified by substitution:

$$\frac{10}{3}\, q_{n-1} - q_{n-2} = \frac{10}{3}\left(\frac{A}{3^{n-1}} + B\,3^{n-1}\right) - \left(\frac{A}{3^{n-2}} + B\,3^{n-2}\right) = \left(\frac{10}{3^n} - \frac{9}{3^n}\right)A + (10 - 1)\,3^{n-2}B = A\,\frac{1}{3^n} + B\,3^n = q_n.$$

Setting $A = 1$ and $B = 0$ generates the required sequence.

Example 1.25. Generate approximations to the sequence $\{x_n\} = \{1/3^n\}$ using the schemes

$$r_0 = 0.99996 \quad\text{and}\quad r_n = \frac{1}{3}\, r_{n-1} \quad\text{for } n = 1, 2, \ldots, \tag{22a}$$

$$p_0 = 1,\ p_1 = 0.33332, \quad\text{and}\quad p_n = \frac{4}{3}\, p_{n-1} - \frac{1}{3}\, p_{n-2} \quad\text{for } n = 2, 3, \ldots,$$
$$q_0 = 1,\ q_1 = 0.33332, \quad\text{and}\quad q_n = \frac{10}{3}\, q_{n-1} - q_{n-2} \quad\text{for } n = 2, 3, \ldots. \tag{22c}$$

In (22a) the initial error in $r_0$ is 0.00004, and in (22b) and (22c) the initial errors in $p_1$ and $q_1$ are 0.000013. Investigate the propagation of error for each scheme.

Table 1.4 gives the first ten numerical approximations for each sequence, and Table 1.5 gives the error in each formula. The error for $\{r_n\}$ is stable and decreases in an exponential manner. The error for $\{p_n\}$ is stable. The error for $\{q_n\}$ is unstable and grows at an exponential rate. Although the error for $\{p_n\}$ is stable, the terms $p_n \to 0$ as $n \to \infty$, so that the error eventually dominates and the terms past $p_8$ have no significant digits. Figures 1.8, 1.9, and 1.10 show the errors in $\{r_n\}$, $\{p_n\}$, and $\{q_n\}$, respectively.

[Figure 1.8 The error sequence $\{x_n - r_n\}$.]

[Figure 1.9 A stable error sequence $\{x_n - p_n\}$.]

[Figure 1.10 An unstable increasing error sequence $\{x_n - q_n\}$.]

Uncertainty in Data

Data from real-world problems contain uncertainty or error. This type of error is referred to as noise. It will affect the accuracy of any numerical computation that is based on the data. An improvement of precision is not accomplished by performing successive computations using noisy data. Hence, if you start with data with $d$ significant digits of accuracy, then the result of a computation should be reported in $d$ significant digits of accuracy. For example, suppose that the data $p_1 = 4.152$ and $p_2 = 0.07931$ both have four significant digits of accuracy. Then it is tempting to report all the digits that appear on your calculator (i.e., $p_1 + p_2 = 4.23131$). This is an oversight, because you should not report conclusions from noisy data that have more significant digits than the original data. The proper answer in this situation is $p_1 + p_2 = 4.231$.

Exercises for Error Analysis

1. Find the error $E_x$ and relative error $R_x$.
Also determine the number of significant digits in the approximation.
(a) $x = 2.71828182$, $\hat{x} = 2.7182$

2. Complete the following computation:

$$\int_0^{1/4} e^{x^2}\,dx \approx \int_0^{1/4} \left(1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!}\right) dx = \hat{p}.$$

State what type of error is present in this situation. Compare your answer with the true value $p = 0.2553074606$.

3. (a) Consider the data $p_1 = 1.414$ and $p_2 = 0.09125$, which have four significant digits of accuracy. Determine the proper answer for the sum $p_1 + p_2$ and the product $p_1 p_2$.
(b) Consider the data $p_1 = 31.415$ and $p_2 = 0.027182$, which have five significant digits of accuracy. Determine the proper answer for the sum $p_1 + p_2$ and the product $p_1 p_2$.

4. Complete the following computation and state what type of error is present in this situation.
(a) $\dfrac{\sin\left(\frac{\pi}{4} + 0.00001\right) - \sin\left(\frac{\pi}{4}\right)}{0.00001} = \dfrac{0.70711385222 - 0.70710678119}{0.00001} = \cdots$
(b) $\dfrac{\ln(2 + 0.00005) - \ln(2)}{0.00005} = \dfrac{0.69317218025 - 0.69314718056}{0.00005} = \cdots$

5. Sometimes the loss of significance error can be avoided by rearranging terms in the function using a known identity from trigonometry or algebra. Find an equivalent formula for the following functions that avoids a loss of significance.
(a) $\ln(x + 1) - \ln(x)$ for large $x$
(b) $\sqrt{x^2 + 1} - x$ for large $x$
(c) $\cos^2(x) - \sin^2(x)$ for $x \approx \pi/4$
(d) [...]

6. Polynomial Evaluation. Let $P(x) = x^3 - 3x^2 + 3x - 1$, $Q(x) = ((x - 3)x + 3)x - 1$, and $R(x) = (x - 1)^3$.
(a) Use four-digit rounding arithmetic and compute $P(2.72)$, $Q(2.72)$, and $R(2.72)$. In the computation of $P(x)$, assume that $(2.72)^3 = 20.12$ and $(2.72)^2 = 7.398$.
(b) Use four-digit rounding arithmetic and compute $P(0.975)$, $Q(0.975)$, and $R(0.975)$. In the computation of $P(x)$, assume that $(0.975)^3 = 0.9268$ and $(0.975)^2 = 0.9506$.

7. Use three-digit rounding arithmetic to compute the following sums (sum in the given order):
(a) [...]
(b) [...]

8. Discuss the propagation of error for the following:
(a) The sum of three numbers: $p + q + r = (\hat{p} + \epsilon_p) + (\hat{q} + \epsilon_q) + (\hat{r} + \epsilon_r)$.
(b) The quotient of two numbers: $\dfrac{p}{q} = \dfrac{\hat{p} + \epsilon_p}{\hat{q} + \epsilon_q}$.
(c) The product of three numbers: $pqr = (\hat{p} + \epsilon_p)(\hat{q} + \epsilon_q)(\hat{r} + \epsilon_r)$.

9. Given the Taylor polynomial expansions [...] and $\cos(h) =$ [...], determine the order of approximation for their sum and product.

10.
Given the Taylor polynomial expansions [...] and $\sin(h) =$ [...], determine the order of approximation for their sum and product.

11. Given the Taylor polynomial expansions $\cos(h) =$ [...] and $\sin(h) = h\ -$ [...], determine the order of approximation for their sum and product.

12. Improving the Quadratic Formula. Assume that $a \ne 0$ and $b^2 - 4ac > 0$ and consider the equation $ax^2 + bx + c = 0$. The roots can be computed with the quadratic formulas

$$\text{(i)}\qquad x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \quad\text{and}\quad x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$

Show that these roots can be calculated with the equivalent formulas

$$\text{(ii)}\qquad x_1 = \frac{-2c}{b + \sqrt{b^2 - 4ac}} \quad\text{and}\quad x_2 = \frac{-2c}{b - \sqrt{b^2 - 4ac}}.$$

Hint. Rationalize the numerators in (i). Remark. In the cases when $|b| \approx \sqrt{b^2 - 4ac}$, one must proceed with caution to avoid loss of precision due to a catastrophic cancellation. If $b > 0$, then $x_1$ should be computed with formula (ii) and $x_2$ should be computed using (i). However, if $b < 0$, then $x_1$ should be computed using (i) and $x_2$ should be computed using (ii).

13. Use the appropriate formula for $x_1$ and $x_2$ mentioned in Exercise 12 to find the roots of the following quadratic equations.
(a) $x^2 - 1{,}000.001x + 1 = 0$
(b) $x^2 - 10{,}000.0001x + 1 = 0$
(c) $x^2 - 100{,}000.00001x + 1 = 0$
(d) $x^2 - 1{,}000{,}000.000001x + 1 = 0$

Algorithms and Programs

1. Use the results of Exercises 12 and 13 to construct an algorithm and MATLAB program that will accurately compute the roots of a quadratic equation in all situations, including the troublesome ones when $|b| \approx \sqrt{b^2 - 4ac}$.

2. Follow Example 1.25 and generate the first ten numerical approximations for each of the following three difference equations. In each case a small initial error is introduced. If there were no initial error, then each of the difference equations would generate the sequence $\{1/2^n\}_{n=1}^{\infty}$. Produce output analogous to Tables 1.4 and 1.5 and Figures 1.8, 1.9, and 1.10.
(a) $r_0 = 0.994$ and $r_n = \frac{1}{2}\, r_{n-1}$ [...]
(b) $p_0 = 1$, $p_1 = 0.497$, and $p_n =$ [...]

The Solution of Nonlinear Equations $f(x) = 0$

Consider the physical problem that involves a spherical ball of radius $r$ that is submerged to a depth $d$ in water (see Figure 2.1).
Assume that the ball is constructed from a variety of longleaf pine that has a density of $\rho = 0.638$ and that its radius measures $r = 10$ cm. How much of the ball will be submerged when it is placed in water?

The mass $M_w$ of water displaced when a sphere is submerged to depth $d$ is

$$M_w = \int_0^d \pi\left(r^2 - (x - r)^2\right) dx = \frac{\pi d^2 (3r - d)}{3},$$

and the mass of the ball is $M_b = 4\pi r^3 \rho/3$. Applying Archimedes' law $M_w = M_b$ produces the following equation that must be solved:

$$\frac{\pi\left(d^3 - 3d^2 r + 4r^3 \rho\right)}{3} = 0.$$

[Figure 2.1 The portion of a sphere of radius $r$ that is to be submerged to a depth $d$.]

[Figure 2.2 The cubic $y = 2552 - 30d^2 + d^3$.]

In our case (with $r = 10$ and $\rho = 0.638$) this equation becomes

$$\frac{\pi\left(2552 - 30d^2 + d^3\right)}{3} = 0.$$

The graph of the cubic polynomial $y = 2552 - 30d^2 + d^3$ is shown in Figure 2.2, and from it one can see that the solution lies near the value $d = 12$.

The goal of this chapter is to develop a variety of methods for finding numerical approximations for the roots of an equation. For example, the bisection method could be applied to obtain the three roots $d_1 = -8.17607212$, $d_2 = 11.86150151$, and $d_3 = 26.31457061$. The first root $d_1$ is not a feasible solution for this problem, because $d$ cannot be negative. The third root $d_3$ is larger than the diameter of the sphere and it is not the desired solution. The root $d_2 = 11.86150151$ lies in the interval $[0, 20]$ and is the proper solution. Its magnitude is reasonable because a little more than one-half of the sphere must be submerged.

Iteration for Solving $x = g(x)$

A fundamental principle in computer science is iteration. As the name suggests, a process is repeated until an answer is achieved. Iterative techniques are used to find roots of equations, solutions of linear and nonlinear systems of equations, and solutions of differential equations. In this section we study the process of iteration using repeated substitution.

A rule or function $g(x)$ for computing successive terms is needed, together with a starting value $p_0$. Then a sequence of values $\{p_k\}$ is obtained using the iterative rule
$p_{k+1} = g(p_k)$. The sequence has the pattern

$$\begin{aligned} p_0 &\quad\text{(starting value)} \\ p_1 &= g(p_0) \\ p_2 &= g(p_1) \\ &\ \ \vdots \\ p_k &= g(p_{k-1}) \\ p_{k+1} &= g(p_k) \\ &\ \ \vdots \end{aligned} \tag{1}$$

What can we learn from an unending sequence of numbers? If the numbers tend to a limit, we feel that something has been achieved. But what if the numbers diverge or are periodic? The next example addresses this situation.

Example 2.1. The iterative rule $p_0 = 1$ and $p_{k+1} = 1.001\,p_k$ for $k = 0, 1, \ldots$ produces a divergent sequence. The first 100 terms look as follows:

$$\begin{aligned} p_1 &= 1.001\,p_0 = (1.001)(1.000000) = 1.001000, \\ p_2 &= 1.001\,p_1 = (1.001)(1.001000) = 1.002001, \\ p_3 &= 1.001\,p_2 = (1.001)(1.002001) = 1.003003, \\ &\ \ \vdots \\ p_{100} &= 1.001\,p_{99} = (1.001)(1.104012) = 1.105116. \end{aligned}$$

The process can be continued indefinitely, and it is easily shown that $\lim_{n\to\infty} p_n = +\infty$. In Chapter 9 we will see that the sequence $\{p_k\}$ is a numerical solution to the differential equation $y' = 0.001y$. The solution is known to be $y(x) = e^{0.001x}$. Indeed, if we compare the 100th term in the sequence with $y(100)$, we see that $p_{100} = 1.105116 \approx 1.105171 = e^{0.1} = y(100)$.

In this section we are concerned with the types of functions $g(x)$ that produce convergent sequences $\{p_k\}$.

Finding Fixed Points

Definition 2.1 (Fixed Point). A fixed point of a function $g(x)$ is a real number $P$ such that $P = g(P)$.

Geometrically, the fixed points of a function $y = g(x)$ are the points of intersection of $y = g(x)$ and $y = x$.

Definition 2.2 (Fixed-point Iteration). The iteration $p_{n+1} = g(p_n)$ for $n = 0, 1, \ldots$ is called fixed-point iteration.

Theorem 2.1. Assume that $g$ is a continuous function and that $\{p_n\}_{n=0}^{\infty}$ is a sequence generated by fixed-point iteration. If $\lim_{n\to\infty} p_n = P$, then $P$ is a fixed point of $g(x)$.

Proof. If $\lim_{n\to\infty} p_n = P$, then $\lim_{n\to\infty} p_{n+1} = P$. It follows from this result, the continuity of $g$, and the relation $p_{n+1} = g(p_n)$ that

$$g(P) = g\left(\lim_{n\to\infty} p_n\right) = \lim_{n\to\infty} g(p_n) = \lim_{n\to\infty} p_{n+1} = P. \tag{2}$$

Therefore, $P$ is a fixed point of $g(x)$.

Example 2.2.
Consider the convergent iteration

$$p_0 = 0.5 \quad\text{and}\quad p_{k+1} = e^{-p_k} \quad\text{for } k = 0, 1, \ldots.$$

The first 10 terms are obtained by the calculations

$$\begin{aligned} p_1 &= e^{-0.500000} = 0.606531 \\ p_2 &= e^{-0.606531} = 0.545239 \\ p_3 &= e^{-0.545239} = 0.579703 \\ &\ \ \vdots \\ p_9 &= e^{-0.566409} = 0.567560 \\ p_{10} &= e^{-0.567560} = 0.566907 \end{aligned}$$

The sequence is converging, and further calculations reveal that

$$\lim_{n\to\infty} p_n = 0.567143\ldots.$$

Thus we have found an approximation for the fixed point of the function $y = e^{-x}$.

The following two theorems establish conditions for the existence of a fixed point and the convergence of the fixed-point iteration process to a fixed point.

Theorem 2.2. Assume that $g \in C[a, b]$.

(3) If the range of the mapping $y = g(x)$ satisfies $y \in [a, b]$ for all $x \in [a, b]$, then $g$ has a fixed point in $[a, b]$.

(4) Furthermore, suppose that $g'(x)$ is defined over $(a, b)$ and that a positive constant $K < 1$ exists with $|g'(x)| \le K < 1$ for all $x \in (a, b)$; then $g$ has a unique fixed point $P$ in $[a, b]$.

Proof of (3). If $g(a) = a$ or $g(b) = b$, the assertion is true. Otherwise, the values of $g(a)$ and $g(b)$ must satisfy $g(a) \in (a, b]$ and $g(b) \in [a, b)$. The function $f(x) = x - g(x)$ has the property that

$$f(a) = a - g(a) < 0 \quad\text{and}\quad f(b) = b - g(b) > 0.$$

Now apply Theorem 1.2, the Intermediate Value Theorem, to $f(x)$, with the constant $L = 0$, and conclude that there exists a number $P$ with $P \in (a, b)$ so that $f(P) = 0$. Therefore, $P = g(P)$ and $P$ is the desired fixed point of $g(x)$.

Proof of (4). Now we must show that this solution is unique. By way of contradiction, let us make the additional assumption that there exist two fixed points $P_1$ and $P_2$. Now apply Theorem 1.6, the Mean Value Theorem, and conclude that there exists a number $d \in (a, b)$ so that

$$g'(d) = \frac{g(P_2) - g(P_1)}{P_2 - P_1}. \tag{5}$$

Next, use the facts that $g(P_1) = P_1$ and $g(P_2) = P_2$ to simplify the right side of equation (5) and obtain

$$g'(d) = \frac{P_2 - P_1}{P_2 - P_1} = 1.$$

But this contradicts the hypothesis in (4) that $|g'(x)| < 1$ over $(a, b)$, so it is not possible for two fixed points to exist. Therefore, $g(x)$ has a unique fixed point $P$ in $[a, b]$ under the conditions given in (4).

Example 2.3. Apply Theorem 2.2 to rigorously show that $g(x) = \cos(x)$ has a unique fixed point in $[0, 1]$.

Clearly, $g \in C[0, 1]$.
Secondly, $g(x) = \cos(x)$ is a decreasing function on $[0, 1]$; thus its range on $[0, 1]$ is $[\cos(1), 1] \subseteq [0, 1]$. Thus condition (3) of Theorem 2.2 is satisfied and $g$ has a fixed point in $[0, 1]$. Finally, if $x \in (0, 1)$, then $|g'(x)| = |-\sin(x)| = \sin(x) \le \sin(1) < 0.8415 < 1$. Thus $K = \sin(1) < 1$, condition (4) of Theorem 2.2 is satisfied, and $g$ has a unique fixed point in $[0, 1]$.

We can now state a theorem that can be used to determine whether the fixed-point iteration process given in (1) will produce a convergent or divergent sequence.

Theorem 2.3 (Fixed-point Theorem). Assume that (i) $g, g' \in C[a, b]$, (ii) $K$ is a positive constant, (iii) $p_0 \in (a, b)$, and (iv) $g(x) \in [a, b]$ for all $x \in [a, b]$.

(6) If $|g'(x)| \le K < 1$ for all $x \in [a, b]$, then the iteration $p_n = g(p_{n-1})$ will converge to the unique fixed point $P \in [a, b]$. In this case, $P$ is said to be an attractive fixed point.

(7) If $|g'(x)| > 1$ for all $x \in [a, b]$, then the iteration $p_n = g(p_{n-1})$ will not converge to $P$. In this case, $P$ is said to be a repelling fixed point and the iteration exhibits local divergence.

[Figure 2.3 The relationship among $P$, $p_0$, $p_1$, $|P - p_0|$, and $|P - p_1|$.]

Remark 1. It is assumed that $p_0 \ne P$ in statement (7).

Remark 2. Because $g'$ is continuous on an interval containing $P$, it is permissible to use the simpler criterion $|g'(P)| \le K < 1$ and $|g'(P)| > 1$ in (6) and (7), respectively.

Proof. We first show that the points $\{p_n\}_{n=0}^{\infty}$ all lie in $(a, b)$. Starting with $p_0$, we apply Theorem 1.6, the Mean Value Theorem. There exists a value $c_0 \in (a, b)$ so that

$$|P - p_1| = |g(P) - g(p_0)| = |g'(c_0)(P - p_0)| = |g'(c_0)|\,|P - p_0| \le K|P - p_0| < |P - p_0|. \tag{8}$$

Therefore, $p_1$ is no further from $P$ than $p_0$ was, and it follows that $p_1 \in (a, b)$ (see Figure 2.3). In general, suppose that $p_{n-1} \in (a, b)$; then

$$|P - p_n| = |g(P) - g(p_{n-1})| = |g'(c_{n-1})(P - p_{n-1})| = |g'(c_{n-1})|\,|P - p_{n-1}| \le K|P - p_{n-1}| < |P - p_{n-1}|. \tag{9}$$

Therefore, $p_n \in (a, b)$ and hence, by induction, all the points $\{p_n\}_{n=0}^{\infty}$ lie in the interval
$(a, b)$.

To complete the proof of (6), we will show that

$$\lim_{n\to\infty} |P - p_n| = 0. \tag{10}$$

First, a proof by induction will establish the inequality

$$|P - p_n| \le K^n |P - p_0|. \tag{11}$$

The case $n = 1$ follows from the details in relation (8). Using the induction hypothesis $|P - p_{n-1}| \le K^{n-1}|P - p_0|$ and the ideas in (9), we obtain

$$|P - p_n| \le K|P - p_{n-1}| \le K K^{n-1}|P - p_0| = K^n|P - p_0|.$$

Thus, by induction, inequality (11) holds for all $n$. Since $0 < K < 1$, the term $K^n$ goes to zero as $n$ goes to infinity. Hence

$$0 \le \lim_{n\to\infty} |P - p_n| \le \lim_{n\to\infty} K^n|P - p_0| = 0. \tag{12}$$

The limit of $|P - p_n|$ is squeezed between zero on the left and zero on the right, so we can conclude that $\lim_{n\to\infty} |P - p_n| = 0$. Thus $\lim_{n\to\infty} p_n = P$ and, by Theorem 2.1, the iteration $p_n = g(p_{n-1})$ converges to the fixed point $P$. Therefore, statement (6) of Theorem 2.3 is proved. We leave statement (7) for the reader to investigate.

[Figure 2.4 (b) Oscillating convergence when $-1 < g'(P) < 0$.]

Corollary 2.1. Assume that $g$ satisfies the hypothesis given in (6) of Theorem 2.3. Bounds for the error involved when using $p_n$ to approximate $P$ are given by

$$|P - p_n| \le K^n|P - p_0| \quad\text{for all } n \ge 1, \tag{13}$$

and

$$|P - p_n| \le \frac{K^n|p_1 - p_0|}{1 - K} \quad\text{for all } n \ge 1. \tag{14}$$

[Figure 2.5 (a) Monotone divergence when $1 < g'(P)$.]

[Figure 2.5 (b) Divergent oscillation when $g'(P) < -1$.]

Graphical Interpretation of Fixed-point Iteration

Since we seek a fixed point $P$ to $g(x)$, it is necessary that the graph of the curve $y = g(x)$ and the line $y = x$ intersect at the point $(P, P)$. Two simple types of convergent iteration, monotone and oscillating, are illustrated in Figure 2.4(a) and (b), respectively.

To visualize the process, start at $p_0$ on the $x$-axis and move vertically to the point $(p_0, p_1) = (p_0, g(p_0))$ on the curve $y = g(x)$. Then move horizontally from $(p_0, p_1)$ to the point $(p_1, p_1)$ on the line $y = x$. Finally, move vertically downward to $p_1$ on the $x$-axis. The recursion
$p_{n+1} = g(p_n)$ is used to construct the point $(p_n, p_{n+1})$ on the graph, then a horizontal motion locates $(p_{n+1}, p_{n+1})$ on the line $y = x$, and then a vertical movement ends up at $p_{n+1}$ on the $x$-axis. The situation is shown in Figure 2.4.

If $|g'(P)| > 1$, then the iteration $p_{n+1} = g(p_n)$ produces a sequence that diverges away from $P$. The two simple types of divergent iteration, monotone and oscillating, are illustrated in Figure 2.5(a) and (b), respectively.

Example 2.4. Consider the iteration $p_{n+1} = g(p_n)$ when the function $g(x) = 1 + x - x^2/4$ is used. The fixed points can be found by solving the equation $x = g(x)$. The two solutions (fixed points of $g$) are $x = -2$ and $x = 2$. The derivative of the function is $g'(x) = 1 - x/2$, and there are only two cases to consider.

Case (i): $P = -2$. Start with $p_0 = -2.05$; then get

$$p_1 = -2.100625, \quad p_2 = -2.20378135, \quad p_3 = -2.41794441, \quad \ldots, \quad \lim_{n\to\infty} p_n = -\infty.$$

Since $|g'(x)| > \frac{3}{2}$ on $[-3, -1]$, by Theorem 2.3, the sequence will not converge to $P = -2$.

Case (ii): $P = 2$. Start with $p_0 = 1.6$; then get

$$p_1 = 1.96, \quad p_2 = 1.9996, \quad p_3 = 1.99999996, \quad \ldots, \quad \lim_{n\to\infty} p_n = 2.$$

Since $|g'(x)| < \frac{1}{2}$ on $[1, 3]$, by Theorem 2.3, the sequence will converge to $P = 2$.

Theorem 2.3 does not state what will happen when $g'(P) = 1$. The next example has been specially constructed so that the sequence $\{p_n\}$ converges whenever $p_0 > P$ and it diverges if we choose $p_0 < P$.

Example 2.5. Consider the iteration $p_{n+1} = g(p_n)$ when the function $g(x) = 2(x - 1)^{1/2}$ for $x \ge 1$ is used. Only one fixed point $P = 2$ exists. The derivative is $g'(x) = 1/(x - 1)^{1/2}$ and $g'(2) = 1$, so Theorem 2.3 does not apply. There are two cases to consider when the starting value lies to the left or right of $P = 2$.

Case (i): Start with $p_0 = 1.5$; then get

$$p_1 = 1.41421356, \quad p_2 = 1.28718851, \quad p_3 = 1.07179943, \quad p_4 = 0.53590832, \quad p_5 = 2(-0.46409168)^{1/2}.$$

Case (ii): Start with $p_0 = 2.5$; then get

$$p_1 = 2.44948974, \quad p_2 = 2.40789513, \quad p_3 = 2.37309514, \quad p_4 = 2.34358284, \quad \ldots, \quad \lim_{n\to\infty} p_n = 2.$$
In case (i), since $p_4$ lies outside the domain of $g(x)$, the term $p_5$ cannot be computed. In case (ii), the sequence is converging too slowly to the value $P = 2$; indeed, $p_{1000} = 2.00398714$.

Absolute and Relative Error Considerations

In Example 2.5, case (ii), the sequence converges slowly, and after 1000 iterations the three consecutive terms are

$$p_{1000} = 2.00398714, \quad p_{1001} = 2.00398317, \quad\text{and}\quad p_{1002} = 2.00397921.$$

This should not be disturbing; after all, we could compute a few thousand more terms and find a better approximation! But what about a criterion for stopping the iteration? Notice that if we use the difference between consecutive terms,

$$|p_{1001} - p_{1002}| = |2.00398317 - 2.00397921| = 0.00000396.$$

Yet the absolute error in the approximation $p_{1000}$ is known to be

$$|P - p_{1000}| = |2.00000000 - 2.00398714| = 0.00398714.$$

This is about 1000 times larger than $|p_{1001} - p_{1002}|$ and it shows that closeness of consecutive terms does not guarantee that accuracy has been achieved. But it is usually the only criterion available and is often used to terminate an iterative procedure.

function [k,p,err,P]=fixpt(g,p0,tol,max1)
% Input  - g is the iteration function input as a string 'g'
%        - p0 is the initial guess for the fixed point
%        - tol is the tolerance
%        - max1 is the maximum number of iterations
% Output - k is the number of iterations that were carried out
%        - p is the approximation to the fixed point
%        - err is the error in the approximation
%        - P contains the sequence {pn}
P(1)=p0;
for k=2:max1
   P(k)=feval(g,P(k-1));
   % absolute and relative differences between consecutive terms
   err=abs(P(k)-P(k-1));
   relerr=err/(abs(P(k))+eps);
   p=P(k);
   if (err<tol) || (relerr<tol), break; end
end
if k==max1
   disp('maximum number of iterations exceeded')
end
P=P';

[...]

[...] $|g'(x)| > 1$ on this interval. If the fixed point $P$ and the initial approximations $p_0$ and $p_1$ lie in the interval $(a, b)$, then show that $p_1 = g(p_0)$ implies that $|E_1| = |P - p_1| > |P - p_0| = |E_0|$. Hence statement (7) of Theorem 2.3 is established (local divergence).

8. Let
$g(x) = -0.0001x^2 + x$ and $p_0 = 1$, and consider fixed-point iteration.
(a) Show that $p_0 > p_1 > \cdots > p_n > p_{n+1} > \cdots$.
(b) Show that $p_n > 0$ for all $n$.
(c) Since the sequence $\{p_n\}$ is decreasing and bounded below, it has a limit. What is the limit?

9. Let $g(x) = 0.5x + 1.5$ and $p_0 = 4$, and consider fixed-point iteration.
(a) Show that the fixed point is $P = 3$.
(b) Show that $|P - p_n| = |P - p_{n-1}|/2$ for $n = 1, 2, 3, \ldots$.
(c) Show that $|P - p_n| = |P - p_0|/2^n$ for $n = 1, 2, 3, \ldots$.

10. Let $g(x) = x/2$, and consider fixed-point iteration.
(a) Find the quantity $|p_{k+1} - p_k|/|p_{k+1}|$.
(b) Discuss what will happen if only the relative error stopping criterion were used in Program 2.1.

11. For fixed-point iteration, discuss why it is an advantage to have $g'(P) \approx 0$.

Algorithms and Programs

1. Use Program 2.1 to approximate the fixed points (if any) of each function. Answers should be accurate to 12 decimal places. Produce a graph of each function and the line $y = x$ that clearly shows any fixed points.
(a) $g(x) =$ [...] $3x^3 - 2x^2 + 2$
(b) $g(x) = \cos(\sin(x))$
(c) [...]
(d) [...]

Bracketing Methods for Locating a Root

Consider a familiar topic of interest. Suppose that you save money by making regular monthly deposits $P$ and the annual interest rate is $I$; then the total amount $A$ after $N$ deposits is

$$A = P + P\left(1 + \frac{I}{12}\right) + P\left(1 + \frac{I}{12}\right)^2 + \cdots + P\left(1 + \frac{I}{12}\right)^{N-1}. \tag{1}$$

The first term on the right side of equation (1) is the last payment. Then the next-to-last payment, which has earned one period of interest, contributes $P\left(1 + \frac{I}{12}\right)$. The second-from-last payment has earned two periods of interest and contributes $P\left(1 + \frac{I}{12}\right)^2$, and so on. Finally, the first payment, which has earned interest for $N - 1$ periods, contributes $P\left(1 + \frac{I}{12}\right)^{N-1}$ toward the total.
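The payment-by-payment description above translates directly into a short computation. The following MATLAB sketch accumulates the contribution of each deposit, beginning with the last payment (which earns no interest). The values P = 250, I = 0.12, and N = 240 and the variable names are assumptions chosen only for illustration.

```matlab
% Sketch: total amount after N monthly deposits, summed term by term.
P = 250;        % monthly deposit (illustrative assumption)
I = 0.12;       % annual interest rate (illustrative assumption)
N = 240;        % number of deposits (illustrative assumption)

growth = 1 + I/12;     % growth factor for one monthly period
A = 0;
for j = 0:N-1
    % a deposit that has earned interest for j periods contributes P*growth^j
    A = A + P * growth^j;
end
fprintf('Total after %d deposits: %.2f\n', N, A);
```

Summing the terms one at a time costs N multiplications and additions per evaluation, which is one practical reason a closed-form expression for the same total is convenient when the rate has to be varied repeatedly.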
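Recall that the formula for the sum of the $N$ terms of a geometric series is

$$1 + r + r^2 + r^3 + \cdots + r^{N-1} = \frac{1 - r^N}{1 - r}. \tag{2}$$

We can write (1) in the form

$$A = P\left(1 + \left(1 + \frac{I}{12}\right) + \left(1 + \frac{I}{12}\right)^2 + \cdots + \left(1 + \frac{I}{12}\right)^{N-1}\right),$$

and use the substitution $r = 1 + I/12$ in (2) to obtain

$$A = P\,\frac{1 - \left(1 + \frac{I}{12}\right)^N}{1 - \left(1 + \frac{I}{12}\right)}.$$

This can be simplified to obtain the annuity-due equation,

$$A = \frac{P}{I/12}\left(\left(1 + \frac{I}{12}\right)^N - 1\right). \tag{3}$$

The following example uses the annuity-due equation and requires a sequence of repeated calculations to find an answer.

Example 2.6. You save \$250 per month for 20 years and desire that the total value of all payments and interest is \$250,000 at the end of the 20 years. What interest rate $I$ is needed to achieve your goal? If we hold $N = 240$ fixed, then $A$ is a function of $I$ alone; that is, $A = A(I)$. We will start with two guesses, $I_0 = 0.12$ and $I_1 = 0.13$, and perform a sequence of calculations to narrow down the final answer. Starting with $I_0 = 0.12$ yields

$$A(0.12) = \frac{250}{0.12/12}\left(\left(1 + \frac{0.12}{12}\right)^{240} - 1\right) = 247{,}314.$$

Since this value is a little short of the goal, we next try $I_1 = 0.13$:

$$A(0.13) = \frac{250}{0.13/12}\left(\left(1 + \frac{0.13}{12}\right)^{240} - 1\right) = 283{,}311.$$

This is a little high, so we try the value in the middle, $I_2 = 0.125$:

$$A(0.125) = \frac{250}{0.125/12}\left(\left(1 + \frac{0.125}{12}\right)^{240} - 1\right) = 264{,}623.$$

This is again high, and we conclude that the desired rate lies in the interval $[0.12, 0.125]$. The next guess is the midpoint $I_3 = 0.1225$:

$$A(0.1225) = \frac{250}{0.1225/12}\left(\left(1 + \frac{0.1225}{12}\right)^{240} - 1\right) = 255{,}803.$$

This is high, and the interval is now narrowed to $[0.12, 0.1225]$. Our last calculation uses the midpoint approximation $I_4 = 0.12125$:

$$A(0.12125) = \frac{250}{0.12125/12}\left(\left(1 + \frac{0.12125}{12}\right)^{240} - 1\right) = 251{,}518.$$

Further iterations can be done to obtain as many significant digits as required. The purpose of this example was to find the value of $I$ that produced a specified level $L$ of the function value, that is, to find a solution to $A(I) = L$. It is standard practice to place the constant $L$ on the left and solve the equation $A(I) - L = 0$.

[Figure 2.6 The decision process for the bisection process. (a) If $f(a)$ and $f(c)$ have opposite signs, then squeeze from the right. (b) If $f(c)$ and $f(b)$ have opposite signs, then squeeze from the left.]

Definition 2.3 (Root of an Equation, Zero of a Function). Assume that $f(x)$ is a continuous function. Any number $r$ for which $f(r) = 0$ is called a root of the equation $f(x) = 0$. Also, we say $r$ is a zero of the function $f(x)$.

For example, the equation $2x^2 + 5x - 3 = 0$ has two real roots $r_1 = 0.5$ and $r_2 = -3$, whereas the corresponding function $f(x) = 2x^2 + 5x - 3 = (2x - 1)(x + 3)$ has two real zeros, $r_1 = 0.5$ and $r_2 = -3$.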
The Bisection Method of Bolzano
In this section we develop our first bracketing method for finding a zero of a continuous function. We must start with an initial interval $[a, b]$, where $f(a)$ and $f(b)$ have opposite signs. Since the graph $y = f(x)$ of a continuous function is unbroken, it will cross the $x$-axis at a zero $x = r$ that lies somewhere in the interval (see Figure 2.6). The bisection method systematically moves the end points of the interval closer and closer together until we obtain an interval of arbitrarily small width that brackets the zero. The decision step for this process of interval halving is first to choose the midpoint $c = (a + b)/2$ and then to analyze the three possibilities that might arise:
(4) If $f(a)$ and $f(c)$ have opposite signs, a zero lies in $[a, c]$.
(5) If $f(c)$ and $f(b)$ have opposite signs, a zero lies in $[c, b]$.
(6) If $f(c) = 0$, then the zero is $c$.
If either case (4) or (5) occurs, we have found an interval half as wide as the original interval that contains the root, and we are "squeezing down on it" (see Figure 2.6). To continue the process, relabel the new smaller interval $[a, b]$ and repeat the process until the interval is as small as desired. Since the bisection process involves sequences of nested intervals and their midpoints, we will use the following notation to keep track of the details in the process:
(7)
$[a_0, b_0]$ is the starting interval and $c_0 = \frac{a_0 + b_0}{2}$ is the midpoint.
$[a_1, b_1]$ is the second interval, which brackets the zero $r$, and $c_1$ is its midpoint; the interval $[a_1, b_1]$ is half as wide as $[a_0, b_0]$.
After arriving at the $n$th interval $[a_n, b_n]$, which brackets $r$ and has midpoint $c_n$, the interval $[a_{n+1},$
$b_{n+1}]$ is constructed, which also brackets $r$ and is half as wide as $[a_n, b_n]$.
It is left as an exercise for the reader to show that the sequence of left end points is increasing and the sequence of right end points is decreasing; that is,
$$a_0 \le a_1 \le \cdots \le a_n \le \cdots \le r \le \cdots \le b_n \le \cdots \le b_1 \le b_0, \tag{8}$$
where $c_n = \frac{a_n + b_n}{2}$, and if $f(a_{n+1}) f(b_{n+1}) < 0$, then
$$[a_{n+1}, b_{n+1}] = [a_n, c_n] \quad\text{or}\quad [a_{n+1}, b_{n+1}] = [c_n, b_n] \quad\text{for all } n. \tag{9}$$

Theorem 2.4 (Bisection Theorem). Assume that $f \in C[a, b]$ and that there exists a number $r \in [a, b]$ such that $f(r) = 0$. If $f(a)$ and ...

    % ya = f(a) and yb = f(b) are assumed to have been evaluated already
    if ya*yb > 0
        error('f(a) and f(b) must have opposite signs');
    end
    max1 = 1 + round((log(b-a) - log(delta))/log(2));
    for k = 1:max1
        c = (a+b)/2;
        yc = feval(f, c);
        if yc == 0
            a = c;
            b = c;
        elseif yb*yc > 0
            b = c;
            yb = yc;
        else
            a = c;
            ya = yc;
        end
        if b-a < delta, break, end
    end
    c = (a+b)/2;
    err = abs(b-a);
    yc = feval(f, c);

Program 2.3 (False Position or Regula Falsi Method). To approximate a root of the equation $f(x) = 0$ in the interval $[a, b]$. Proceed with the method only if $f(x)$ is continuous and $f(a)$ and $f(b)$ have opposite signs.

    function [c,err,yc] = regula(f,a,b,delta,epsilon,max1)
    %Input  - f is the function input as a string 'f'
    %       - a and b are the left and right end points
    %       - delta is the tolerance for the zero
    %       - epsilon is the tolerance for the value of f at the zero
    %       - max1 is the maximum number of iterations
    %Output - c is the zero
    %       - yc = f(c)
    %       - err is the error estimate for c
    ya = feval(f, a);
    yb = feval(f, b);
    if ya*yb > 0
        error('Note: f(a)*f(b) > 0');
    end
    for k = 1:max1
        dx = yb*(b-a)/(yb-ya);
        c = b - dx;
        ac = c - a;
        yc = feval(f, c);
        if yc == 0, break;
        elseif yb*yc > 0
            b = c;
            yb = yc;
        else
            a = c;
            ya = yc;
        end
        dx = min(abs(dx), ac);
        if abs(dx) < delta, break, end
        if abs(yc) < epsilon, break, end
    end
    err = abs(b-a)/2;
    yc = feval(f, c);

... $b_0 > 3$, then $f(a_0) f(b_0) < 0$. Thus, on the interval $[a_0, b_0]$ the bisection method will converge to one of the three zeros. If $a_0 < 1$ and $b_0 > 3$ are selected such that $c_n = \frac{a_n + b_n}{2}$ is not equal to 1, 2, or 3 for any $n \ge 1$, then the bisection method will never converge to which zero(s)? Why?
18. If a polynomial, $f(x)$, has an odd number of real zeros in the interval $[a_0, b_0]$, and each of the zeros is of odd multiplicity, then $f(a_0) f(b_0) < 0$, and the bisection method will converge to one of the zeros.
If $a_0 < 1$ and $b_0 > 3$ are selected such that $c_n = \frac{a_n + b_n}{2}$ is not equal to any of the zeros of $f(x)$ for any $n \ge 1$, then the bisection method will never converge to which zero(s)? Why?

Algorithms and Programs
1. Find an approximation (accurate to 10 decimal places) for the interest rate $I$ that will yield a total annuity value of \$500,000 if 240 monthly payments of \$300 are made.
2. Consider a spherical ball of radius $r = 15$ cm that is constructed from a variety of white oak that has a density of $\rho = 0.710$. How much of the ball (accurate to 8 decimal places) will be submerged when it is placed in water?
3. Modify Programs 2.2 and 2.3 to output a matrix analogous to Tables 2.1 and 2.2, respectively (i.e., the first row of the matrix would be $[0 \;\; a_0 \;\; c_0 \;\; b_0 \;\; f(c_0)]$).
4. Use your programs from Problem 3 to approximate the three smallest positive roots of $x = \tan(x)$ (accurate to 8 decimal places).
5. A unit sphere is cut into two segments by a plane. One segment has three times the volume of the other. Determine the distance $x$ of the plane from the center of the sphere (accurate to 10 decimal places).

2.3 Initial Approximation and Convergence Criteria
The bracketing methods depend on finding an interval $[a, b]$ so that $f(a)$ and $f(b)$ have opposite signs. Once the interval has been found, no matter how large, the iterations will proceed until a root is found. Hence these methods are called globally convergent. However, if $f(x) = 0$ has several roots in $[a, b]$, then a different starting interval must be used to find each root. It is not easy to locate these smaller intervals on which $f(x)$ changes sign.
In Section 2.4 we develop the Newton-Raphson method and the secant method for solving $f(x) = 0$. Both of these methods require that a close approximation to the root be given to guarantee convergence. Hence these methods are called locally convergent. They usually converge more rapidly than do global ones.
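To illustrate what global convergence of a bracketing method means in practice, here is a minimal MATLAB sketch of interval halving. It is not one of the text's programs: the test function $x\sin(x) - 1$, the interval $[0, 2]$, and the tolerance `delta` are assumptions chosen only for illustration.

```matlab
% Sketch of a bracketing (bisection) iteration. The function f, the end
% points a and b, and the tolerance delta are illustrative assumptions.
f = @(x) x.*sin(x) - 1;   % assumed test function with a sign change on [0, 2]
a = 0;  b = 2;  delta = 1e-10;
ya = f(a);  yb = f(b);
if ya*yb > 0
    error('f(a) and f(b) must have opposite signs');
end
while b - a > delta
    c = (a + b)/2;  yc = f(c);
    if yc == 0
        a = c;  b = c;            % landed exactly on the zero
    elseif ya*yc < 0
        b = c;  yb = yc;          % zero lies in [a, c]
    else
        a = c;  ya = yc;          % zero lies in [c, b]
    end
end
c = (a + b)/2;
fprintf('c = %.10f, f(c) = %.2e, width = %.2e\n', c, f(c), b - a);
```

No matter how wide the starting bracket is, the width is halved at every pass, so the loop always terminates with an interval shorter than `delta` that still contains a zero.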
Some hybrid algorithms start with a globally convergent method and switch to a locally convergent method when the iteration gets close to a root.
If the computation of roots is one part of a larger project, then a leisurely pace is suggested and the first thing to do is graph the function. We can view the graph $y = f(x)$ and make decisions based on what it looks like (concavity, slope, oscillatory behavior, local extrema, inflection points, etc.). But more important, if the coordinates of points on the graph are available, they can be analyzed and the approximate location of roots determined. These approximations can then be used as starting values in our root-finding algorithms.
We must proceed carefully. Computer software packages use graphics software of varying sophistication. Suppose that a computer is used to graph $y = f(x)$ on $[a, b]$. Typically, the interval is partitioned into $N + 1$ equally spaced points: $a = x_0 < x_1 < \cdots < x_N = b$ and the function values $y_k = f(x_k)$ computed. Then either a line segment or a "fitted curve" is plotted between consecutive points $(x_{k-1}, y_{k-1})$ and $(x_k, y_k)$ for $k = 1, 2, \ldots, N$. There must be enough points so that we do not miss a root in a portion of the curve where the function is changing rapidly. If $f(x)$ is continuous and two adjacent points $(x_{k-1}, y_{k-1})$ and $(x_k, y_k)$ lie on opposite sides of the $x$-axis, then the Intermediate Value Theorem implies that at least one root lies in the interval $[x_{k-1}, x_k]$. But if there is a root, or even several closely spaced roots, in the interval $[x_{k-1}, x_k]$ and the two adjacent points $(x_{k-1}, y_{k-1})$ and $(x_k, y_k)$ lie on the same side of the $x$-axis, then the computer-generated graph would not indicate a situation where the Intermediate Value Theorem is applicable.
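The sign test between adjacent sample points can be sketched in a few lines of MATLAB. This is only an illustration: it uses the cubic of Example 2.9 with $N = 8$, and the variable names are our own.

```matlab
% Sketch: sample f at N+1 equally spaced points and report the subintervals
% whose end points lie on opposite sides of the x-axis.
f = @(x) x.^3 - x.^2 - x + 1;   % cubic of Example 2.9: simple zero at -1, double zero at 1
a = -1.2;  b = 1.2;  N = 8;
x = linspace(a, b, N + 1);
y = f(x);
idx = find(y(1:end-1) .* y(2:end) < 0);   % sign change between x(k) and x(k+1)
for k = idx
    fprintf('sign change on [%.2f, %.2f]\n', x(k), x(k+1));
end
% The double zero at x = 1 produces no sign change, so it is not reported.
```

Only the simple zero near $-1$ is bracketed. The double zero at $x = 1$ lies between sample points on the same side of the axis, which is exactly the situation in which the Intermediate Value Theorem gives no information.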
The graph produced by the computer will not be a true representation of the actual graph of the function $f$. It is not unusual for functions to have "closely" spaced roots; that is, roots where the graph touches but does not cross the $x$-axis, or roots "close" to a vertical asymptote. Such characteristics of a function need to be considered when applying any numerical root-finding algorithm.
Finally, near two closely spaced roots or near a double root, the computer-generated curve between $(x_{k-1}, y_{k-1})$ and $(x_k, y_k)$ may fail to cross or touch the $x$-axis. If $|f(x_k)|$ is smaller than a preassigned value $\epsilon$ (i.e., $f(x_k) \approx 0$), then $x_k$ is a tentative approximate root. But the graph may be close to zero over a wide range of values near $x_k$, and thus $x_k$ may not be close to an actual root. Hence we add the requirement that the slope change sign near $(x_k, y_k)$; that is, $m_{k-1} = \frac{y_k - y_{k-1}}{x_k - x_{k-1}}$ and $m_k = \frac{y_{k+1} - y_k}{x_{k+1} - x_k}$ must have opposite signs. Since $x_k - x_{k-1} > 0$ and $x_{k+1} - x_k > 0$, it is not necessary to use the difference quotients, and it will suffice to check to see if the differences $y_k - y_{k-1}$ and $y_{k+1} - y_k$ change sign. In this case, $x_k$ is the approximate root. Unfortunately, we cannot guarantee that this starting value will produce a convergent sequence. If the graph of $y = f(x)$ has a local minimum (or maximum) that is extremely close to zero, then it is possible that $x_k$ will be reported as an approximate root when $f(x_k) \approx 0$, although $x_k$ may not be close to a root.

Table 2.3 Finding Approximate Locations for Roots
x_k | y_{k-1} | y_k | y_k - y_{k-1} | y_{k+1} - y_k | Significant changes in f(x) or f'(x)
-1.2 | -3.125 | -0.968 | 2.157 | 1.329 |
-0.9 | -0.968 | 0.361 | 1.329 | 0.663 | f changes sign in [x_{k-1}, x_k]
-0.6 | 0.361 | 1.024 | 0.663 | 0.159 |
-0.3 | 1.024 | 1.183 | 0.159 | -0.183 | f' changes sign near x_k
0.0 | 1.183 | 1.000 | -0.183 | -0.363 |
0.3 | 1.000 | 0.637 | -0.363 | -0.381 |
0.6 | 0.637 | 0.256 | -0.381 | -0.237 |
0.9 | 0.256 | 0.019 | -0.237 | 0.069 | f' changes sign near x_k
1.2 | 0.019 | 0.088 | 0.069 | 0.537 |

Figure 2.10 The graph of the cubic polynomial $y = x^3 - x^2 - x + 1$.
Example 2.9. Find the approximate location of the roots of $x^3 - x^2 - x + 1 = 0$ on the interval $[-1.2, 1.2]$. For illustration, choose $N = 8$ and look at Table 2.3.
The three abscissas for consideration are $-1.05$, $-0.3$, and $0.9$. Because $f(x)$ changes sign on the interval $[-1.2, -0.9]$, the value $-1.05$ is an approximate root; indeed, $f(-1.05) = -0.210$. Although the slope changes sign near $-0.3$, we find that $f(-0.3) = 1.183$; hence $-0.3$ is not near a root. Finally, the slope changes sign near $0.9$ and $f(0.9) = 0.019$, so $0.9$ is an approximate root (see Figure 2.10).

Figure 2.11 (a) The horizontal convergence band for locating a solution to $f(x) = 0$. (b) The vertical convergence band for locating a solution to $f(x) = 0$.

Checking for Convergence
A graph can be used to see the approximate location of a root, but an algorithm must be used to compute a value $p_n$ that is an acceptable computer solution. Iteration is often used to produce a sequence $\{p_k\}$ that converges to a root $p$, and a termination criterion or strategy must be designed ahead of time so that the computer will stop when an accurate approximation is reached. Since the goal is to solve $f(x) = 0$, the final value $p_n$ should have the property that $|f(p_n)| < \epsilon$.
The user can supply a tolerance value $\epsilon$ for the size of $|f(p_n)|$ and then an iterative process produces points $P_k = (p_k, f(p_k))$ until the last point $P_n$ lies in the horizontal band bounded by the lines $y = +\epsilon$ and $y = -\epsilon$, as shown in Figure 2.11(a). This criterion is useful if the user is trying to solve $h(x) = L$ by applying a root-finding algorithm to the function $f(x) = h(x) - L$.
Another termination criterion involves the abscissas, and we can try to determine if the sequence $\{p_k\}$ is converging. If we draw the vertical lines $x = p + \delta$ and $x = p - \delta$ on each side of $x = p$, we could decide to stop the iteration when the point $P_n$ lies between these two vertical lines, as shown in Figure 2.11(b).
The latter criterion is often desired, but it is difficult to implement because it involves the unknown solution $p$. We adapt this idea and terminate further calculations when the consecutive iterates $p_{n-1}$ and $p_n$ are sufficiently close or if they agree within $M$ significant digits.
Sometimes the user of an algorithm will be satisfied if $p_n \approx p_{n-1}$ and other times when $f(p_n) \approx 0$. Correct logical reasoning is required to understand the consequences. If we require that $|p_n - p| < \delta$ and $|f(p_n)| < \epsilon$, the point $P_n$ will be located in the rectangular region about the solution $(p, 0)$, as shown in Figure 2.12(a). If we stipulate that $|p_n - p| < \delta$ or $|f(p_n)| < \epsilon$, the point $P_n$ could be located anywhere in the region formed by the union of the horizontal and vertical stripes, as shown in Figure 2.12(b). The sizes of the tolerances $\delta$ and $\epsilon$ are crucial. If the tolerances are chosen too small, iteration may continue forever. They should be chosen about 100 times larger than $10^{-M}$, where $M$ is the number of decimal digits in the computer's floating-point numbers. The closeness of the abscissas is checked with one of the criteria
$$|p_n - p_{n-1}| < \delta \quad \text{(estimate for the absolute error)}$$
or
$$\frac{2|p_n - p_{n-1}|}{|p_n| + |p_{n-1}|} < \delta \quad \text{(estimate for the relative error)}.$$
The closeness of the ordinate is usually checked by $|f(p_n)| < \epsilon$.

Troublesome Functions
A computer solution to $f(x) = 0$ will almost always be in error due to roundoff and/or instability in the calculations. If the graph $y = f(x)$ is steep near the root $(p, 0)$, then the root-finding problem is well conditioned (i.e., a solution with several significant digits is easy to obtain). If the graph $y = f(x)$ is shallow near $(p, 0)$, then the root-finding problem is ill conditioned (i.e., the computed root may have only a few significant digits). This occurs when $f(x)$ has a multiple root at $p$. This ...

    >>approot (X,0.00001)
    -1.9875 -1.6765 -1.1625 1.1625 1.6765 1.9875
Comparing the results with the graph of $f$, we now have good initial approximations for one of our root-finding algorithms.

Exercises for Initial Approximation
In Exercises 1 through 6 use a computer or graphics calculator to graphically determine the approximate location of the roots of $f(x) = 0$ in the given interval. In each case, determine an interval $[a, b]$ over which Programs 2.2 and 2.3 could be used to determine the roots (i.e., $f(a) f(b) < 0$).
1. $f(x) = x^2 - e^x$ for $-2 \le$ ...

Figure 2.13 The geometric construction of $p_1$ and $p_2$ for the Newton-Raphson method.

The process above can be repeated to obtain a sequence $\{p_k\}$ that converges to $p$. We now make these ideas more precise.

Theorem 2.5 (Newton-Raphson Theorem). Assume that $f \in C^2[a, b]$ and there exists a number $p \in [a, b]$, where $f(p) = 0$. If $f'(p) \ne 0$, then there exists a $\delta > 0$ such that the sequence $\{p_k\}_{k=0}^{\infty}$ defined by the iteration
$$p_k = g(p_{k-1}) = p_{k-1} - \frac{f(p_{k-1})}{f'(p_{k-1})} \quad \text{for } k = 1, 2, \ldots \tag{4}$$
will converge to $p$ for any initial approximation $p_0 \in [p - \delta, p + \delta]$.
Remark. The function $g(x)$ defined by formula
$$g(x) = x - \frac{f(x)}{f'(x)} \tag{5}$$
is called the Newton-Raphson iteration function. Since $f(p) = 0$, it is easy to see that $g(p) = p$. Thus the Newton-Raphson iteration for finding the root of the equation $f(x) = 0$ is accomplished by finding a fixed point of the function $g(x)$.
Proof. The geometric construction of $p_1$ shown in Figure 2.13 does not help in understanding why $p_0$ needs to be close to $p$ or why the continuity of $f''(x)$ is essential. Our analysis starts with the Taylor polynomial of degree $n = 1$ and its remainder term:
$$f(x) = f(p_0) + f'(p_0)(x - p_0) + \frac{f''(c)(x - p_0)^2}{2!}, \tag{6}$$
where $c$ lies somewhere between $p_0$ and $x$. Substituting $x = p$ into equation (6) and using the fact that $f(p) = 0$ produces
$$0 = f(p_0) + f'(p_0)(p - p_0) + \frac{f''(c)(p - p_0)^2}{2!}. \tag{7}$$
If $p_0$ is close enough to $p$, the last term on the right side of (7) will be small compared to the sum of the first two terms. Hence it can be neglected and we can use the approximation
$$0 \approx f(p_0) + f'(p_0)(p - p_0). \tag{8}$$
Solving for $p$ in equation (8), we get $p \approx p_0 - f(p_0)/f'(p_0)$. This is used to define the next approximation $p_1$ to the root:
$$p_1 = p_0 - \frac{f(p_0)}{f'(p_0)}. \tag{9}$$
When $p_1$ is used in place of $p_0$ in equation (9), the general rule (4) is established. For most applications this is all that needs to be understood. However, to fully comprehend what is happening, we need to consider the fixed-point iteration function and apply Theorem 2.2 in our situation. The key is in the analysis of $g'(x)$:
$$g'(x) = 1 - \frac{f'(x) f'(x) - f(x) f''(x)}{(f'(x))^2} = \frac{f(x) f''(x)}{(f'(x))^2}.$$
By hypothesis, $f(p) = 0$; thus $g'(p) = 0$. Since $g'(p) = 0$ and $g'(x)$ is continuous, it is possible to find a $\delta > 0$ so that the hypothesis $|g'(x)| < 1$ of Theorem 2.2 is satisfied on $(p - \delta, p + \delta)$. Therefore, a sufficient condition for $p_0$ to initialize a convergent sequence $\{p_k\}_{k=0}^{\infty}$, which converges to a root of $f(x) = 0$, is that $p_0 \in (p - \delta, p + \delta)$ and that $\delta$ be chosen so that
$$\frac{|f(x) f''(x)|}{|f'(x)|^2} < 1 \quad \text{for all } x \in (p - \delta, p + \delta).$$

Corollary 2.2 (Newton's Iteration for Finding Square Roots). Assume that $A > 0$ is a real number and let $p_0 > 0$ be an initial approximation to $\sqrt{A}$. Define the sequence $\{p_k\}_{k=0}^{\infty}$ using the recursive rule
$$p_k = \frac{p_{k-1} + \dfrac{A}{p_{k-1}}}{2} \quad \text{for } k = 1, 2, \ldots. \tag{11}$$
Then the sequence $\{p_k\}_{k=0}^{\infty}$ converges to $\sqrt{A}$; that is, $\lim_{k \to \infty} p_k = \sqrt{A}$.
Outline of Proof. Start with the function $f(x) = x^2 - A$, and notice that the roots of the equation $x^2 - A = 0$ are $\pm\sqrt{A}$. Now use $f(x)$ and the derivative $f'(x)$ in formula (5) and write down the Newton-Raphson iteration formula
$$g(x) = x - \frac{f(x)}{f'(x)} = x - \frac{x^2 - A}{2x}. \tag{12}$$
This formula can be simplified to obtain
$$g(x) = \frac{x + \dfrac{A}{x}}{2}. \tag{13}$$
When $g(x)$ in (13) is used to define the recursive iteration in (4), the result is formula (11). It can be proved that the sequence that is generated in (11) will converge for any starting value $p_0 > 0$. The details are left for the exercises.
An important point of Corollary 2.2 is the fact that the iteration function $g(x)$ involved only the arithmetic operations $+$, $-$, $\times$, and $/$.
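The recursive rule (11) is short enough to try directly. The following MATLAB lines are a minimal sketch; the data $A = 5$, $p_0 = 2$ and the fixed count of six iterations are illustrative assumptions.

```matlab
% Sketch of the square-root iteration (11): p_k = (p_{k-1} + A/p_{k-1})/2.
A = 5;  p = 2;                 % illustrative data: approximate sqrt(5) from p0 = 2
for k = 1:6                    % fixed number of steps (illustrative)
    p = (p + A/p)/2;           % uses only addition and division
    fprintf('p_%d = %.9f\n', k, p);
end
```

No square root is evaluated anywhere in the loop, which is the point made above about the arithmetic operations involved in $g(x)$.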
If $g(x)$ had involved the calculation of a square root, we would be caught in the circular reasoning that being able to calculate the square root would permit you to recursively define a sequence that will converge to $\sqrt{A}$. For this reason, $f(x) = x^2 - A$ was chosen, because it involved only the arithmetic operations.

Example 2.11. Use Newton's square-root algorithm to find $\sqrt{5}$.
Starting with $p_0 = 2$ and using formula (11), we compute
$$p_1 = \frac{2 + 5/2}{2} = 2.25$$
$$p_2 = \frac{2.25 + 5/2.25}{2} = 2.236111111$$
$$p_3 = \frac{2.236111111 + 5/2.236111111}{2} = 2.236067978$$
$$p_4 = \frac{2.236067978 + 5/2.236067978}{2} = 2.236067978.$$
Further iterations produce $p_k \approx 2.236067978$ for $k > 4$, so we see that convergence accurate to nine decimal places has been achieved.

Now let us turn to a familiar problem from elementary physics and see why determining the location of a root is an important task. Suppose that a projectile is fired from the origin with an angle of elevation $b_0$ and initial velocity $v_0$. In elementary courses, air resistance is neglected and we learn that the height $y = y(t)$ and the distance traveled $x = x(t)$, measured in feet, obey the rules
$$y = v_y t - 16t^2 \quad \text{and} \quad x = v_x t, \tag{14}$$
where the horizontal and vertical components of the initial velocity are $v_x = v_0 \cos(b_0)$ and $v_y = v_0 \sin(b_0)$, respectively. The mathematical model expressed by the rules in (14) is easy to work with, but tends to give too high an altitude and too long a range for the projectile's path. If we make the additional assumption that the air resistance is proportional to the velocity, the equations of motion become
$$y = f(t) = (C v_y + 32C^2)\left(1 - e^{-t/C}\right) - 32Ct \tag{15}$$
and
$$x = r(t) = C v_x \left(1 - e^{-t/C}\right), \tag{16}$$
where $C = m/k$ and $k$ is the coefficient of air resistance and $m$ is the mass of the projectile. A larger value of $C$ will result in a higher maximum altitude and a longer range for the projectile. The graph of a flight path of a projectile when air resistance is considered is shown in Figure 2.14.
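Equations (15) and (16) can be explored numerically. The MATLAB sketch below applies the Newton-Raphson rule (4) to the height function; the parameter values $C = 10$ and $v_x = v_y = 160$, the starting guess, the tolerance, and the iteration cap are all illustrative assumptions, and the derivative `df` was obtained by differentiating (15) by hand.

```matlab
% Sketch: time of impact for the air-resistance model (15)-(16), found with
% Newton-Raphson iteration. C, vx, vy and the starting guess are illustrative.
C = 10;  vx = 160;  vy = 160;
f  = @(t) (C*vy + 32*C^2)*(1 - exp(-t/C)) - 32*C*t;   % height, equation (15)
df = @(t) (vy + 32*C)*exp(-t/C) - 32*C;               % derivative of the height
r  = @(t) C*vx*(1 - exp(-t/C));                       % horizontal distance, equation (16)
t = 8;                                                % initial guess (assumed)
for k = 1:20
    dt = f(t)/df(t);
    t = t - dt;
    if abs(dt) < 1e-12, break, end
end
fprintf('impact at t = %.8f s, range = %.4f ft\n', t, r(t));
```

The elementary model (14) would give the impact time in closed form; here the exponential term makes an iterative solver the natural tool.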
This improved model is more realistic, but requires the use of a root-finding algorithm for solving $f(t) = 0$ to determine the elapsed time until the projectile hits the ground. The elementary model in (14) does not require a sophisticated procedure to find the elapsed time.

Figure 2.14 Path of a projectile with air resistance considered.

Table 2.4
k | Time, p_k | p_{k+1} - p_k | Height, f(p_k)
0 | 8.00000000 | 0.79773101 | 83.22097200
1 | 8.79773101 | -0.05530160 | -6.68369700
2 | 8.74242941 | -0.00025475 | -0.03050700
3 | 8.74217467 | -0.00000001 | -0.00000100
4 | 8.74217466 | 0.00000000 | 0.00000000

Example 2.12. A projectile is fired with an angle of elevation $b_0 = 45^\circ$, $v_y = v_x = 160$ ft/sec, and $C = 10$. Find the elapsed time until impact and find the range.
Using formulas (15) and (16), the equations of motion are $y = f(t) = 4800(1 - e^{-t/10}) - 320t$ and $x = r(t) = 1600(1 - e^{-t/10})$. Since $f(8) = 83.220972$ and $f(9) = -31.534367$, we will use the initial guess $p_0 = 8$. The derivative is $f'(t) = 480e^{-t/10} - 320$, and its value $f'(p_0) = f'(8) = -104.3220972$ is used in formula (4) to get
$$p_1 = 8 - \frac{83.22097200}{-104.3220972} = 8.797731010.$$
A summary of the calculation is given in Table 2.4.
The value $p_4$ has eight decimal places of accuracy, and the time until impact is $t \approx 8.74217466$ seconds. The range can now be computed using $r(t)$, and we get
$$r(8.74217466) = 1600\left(1 - e^{-0.874217466}\right) = 932.4986302 \text{ ft}.$$

The Division-by-Zero Error
One obvious pitfall of the Newton-Raphson method is the possibility of division by zero in formula (4), which would occur if $f'(p_{k-1}) = 0$. Program 2.5 has a procedure to check for this situation, but what use is the last calculated approximation $p_{k-1}$ in this case? It is quite possible that $f(p_{k-1})$ is sufficiently close to zero and that $p_{k-1}$ is an acceptable approximation to the root. We now investigate this situation and will uncover an interesting fact, that is, how fast the iteration converges.

Definition 2.4 (Order of a Root). Assume that $f(x)$ and its derivatives $f'(x), \ldots, f^{(M)}(x)$ are defined and continuous on an interval about $x = p$.
We say that $f(x) = 0$ has a root of order $M$ at $x = p$ if and only if
$$f(p) = 0, \quad f'(p) = 0, \quad \ldots, \quad f^{(M-1)}(p) = 0, \quad \text{and} \quad f^{(M)}(p) \ne 0. \tag{17}$$
A root of order $M = 1$ is often called a simple root, and if $M > 1$, it is called a multiple root. A root of order $M = 2$ is sometimes called a double root, and so on. The next result will illuminate these concepts.

Lemma 2.1. If the equation $f(x) = 0$ has a root of order $M$ at $x = p$, then there exists a continuous function $h(x)$ so that $f(x)$ can be expressed as the product
$$f(x) = (x - p)^M h(x), \quad \text{where } h(p) \ne 0. \tag{18}$$

Example 2.13. The function $f(x) = x^3 - 3x + 2$ has a simple root at $p = -2$ and a double root at $p = 1$. This can be verified by considering the derivatives $f'(x) = 3x^2 - 3$ and $f''(x) = 6x$. At the value $p = -2$, we have $f(-2) = 0$ and $f'(-2) = 9$, so $M = 1$ in Definition 2.4; hence $p = -2$ is a simple root. For the value $p = 1$, we have $f(1) = 0$, $f'(1) = 0$, and $f''(1) = 6$, so $M = 2$ in Definition 2.4; hence $p = 1$ is a double root. Also, notice that $f(x)$ has the factorization $f(x) = (x + 2)(x - 1)^2$.

Speed of Convergence
The distinguishing property we seek is the following. If $p$ is a simple root of $f(x) = 0$, Newton's method will converge rapidly, and the number of accurate decimal places (roughly) doubles with each iteration. On the other hand, if $p$ is a multiple root, the error in each successive approximation is a fraction of the previous error. To make this precise, we define the order of convergence. This is a measure of how rapidly a sequence converges.

Definition 2.5 (Order of Convergence). Assume that $\{p_n\}_{n=0}^{\infty}$ converges to $p$ and set $E_n = p - p_n$ for $n \ge 0$. If two positive constants $A \ne 0$ and $R > 0$ exist, and
$$\lim_{n \to \infty} \frac{|p - p_{n+1}|}{|p - p_n|^R} = \lim_{n \to \infty} \frac{|E_{n+1}|}{|E_n|^R} = A, \tag{19}$$
then the sequence is said to converge to $p$ with order of convergence $R$. The number $A$ is called the asymptotic error constant. The cases $R = 1, 2$ are given special consideration.
(20) If $R = 1$, the convergence of $\{p_n\}_{n=0}^{\infty}$ is called linear.
(21) If $R = 2$, the convergence of $\{p_n\}_{n=0}^{\infty}$ is called quadratic.
If $R$ is large, the sequence $\{p_n\}$ converges rapidly to $p$; that is, relation (19) implies that for large values of $n$ we have the approximation $|E_{n+1}| \approx A|E_n|^R$. For example, suppose that $R = 2$ and $|E_n| \approx 10^{-2}$; then we would expect that $|E_{n+1}| \approx A \times 10^{-4}$.
Some sequences converge at a rate that is not an integer, and we will see that the order of convergence of the secant method is $R = (1 + \sqrt{5})/2 \approx 1.618033989$.

Example 2.14 (Quadratic Convergence at a Simple Root). Start with $p_0 = -2.4$ and use Newton-Raphson iteration to find the root $p = -2$ of the polynomial $f(x) = x^3 - 3x + 2$. The iteration formula for computing $\{p_k\}$ is
$$p_k = g(p_{k-1}) = \frac{2p_{k-1}^3 - 2}{3p_{k-1}^2 - 3}. \tag{22}$$
Using formula (21) to check for quadratic convergence, we get the values in Table 2.5.

Table 2.5 Newton's Method Converges Quadratically at a Simple Root
k | p_k | p_{k+1} - p_k | E_k = p - p_k | |E_{k+1}|/|E_k|^2
0 | -2.400000000 | 0.323809524 | 0.400000000 | 0.476190475
1 | -2.076190476 | 0.072594465 | 0.076190476 | 0.619469086
2 | -2.003596011 | 0.003587422 | 0.003596011 | 0.664202613
3 | -2.000008589 | 0.000008589 | 0.000008589 |
4 | -2.000000000 | 0.000000000 | 0.000000000 |

Table 2.6 Newton's Method Converges Linearly at a Double Root
k | p_k | p_{k+1} - p_k | E_k = p - p_k | |E_{k+1}|/|E_k|
0 | 1.200000000 | -0.096969697 | -0.200000000 | 0.515151515
1 | 1.103030303 | -0.050673883 | -0.103030303 | 0.508165253
2 | 1.052356420 | -0.025955609 | -0.052356420 | 0.504251647
3 | 1.026400811 | -0.013143081 | -0.026400811 | 0.502171316
4 | 1.013257730 | -0.006614311 | -0.013257730 | 0.501097775
5 | 1.006643419 | -0.003318055 | -0.006643419 | 0.500550093

A detailed look at the rate of convergence in Example 2.14 will reveal that the error in each successive iteration is proportional to the square of the error in the previous iteration. That is,
$$|p - p_{k+1}| \approx A|p - p_k|^2,$$
where $A \approx 2/3$. To check this, we use
To check this, we use 000008589 and |p — pl and it is easy to see that 0.000012931 Ip pst (0.003596011 0.000008589 ~ 0,00000862: 2 lp ps jie mk. Example 2.15 (Linear Convergence at 2 Double Root). Start with pp = 1.2 and use "Newton-Raphsom iteration to find the double root p = 1 of the polynomial f(x) =x? — arr? Using formula (20) 10 check for linear convergence, we get the values in Table 2.6. Notice that che Newton-Raphson method is converging to the double rot, but at slow rate. The values of f(ps) in Example 2.15 goto zero faster than the values of #"(pe) 30 the quotient f(p.)/f’(pe) in formula (4) is defined when pe # ‘The sequence is converging linearly, and the eror is decreasing by a factor of approx. imately 1/2 with each successive teration. The following theorem summarizes the performance of Newnon’s method oa simple and double roots. ‘Theorem 2.6 (Convergence Rate for Newton-Raphson Iteration). Assume that "Newton-Raphson iteration produces a sequence {4} verges to the root p of the function f(x). If p isa simple oot, convergence is quadratic and 2x) Ett ~i2 4 [Eai? for m sufficiently lange I p isa multiple oot of order M, convergence is linear and M ay Enel = G7 lEn! for m sufiienty large. Pitfalls ‘The diyision-by-rero error was easy to anticipate, but there are other difficulties that tre not $0 easy to spot, Suppose that the function is f(x) = x? ~ 4x + 5; then the sequence {px} of real numbers generated by formula (4) will wander back and forth from left to right and not converge. A simple analysis of the situation reveals that (2) > and has no real roots. 78 CuAP.2_ THE SOLUTION OF NONLINEAR EQUATIONS f(s) = 0 03 02 01 00 7 or? Par Ere Figure 215 (a) New2on-Raphson iteration for f(x) = xe can produce a divergent sequence Sometimes the initial approximation po is 100 far away from the desired root and the sequence (pe converges to some other root. 
This usually happens when the slope $f'(p_0)$ is small and the tangent line to the curve $y = f(x)$ is nearly horizontal. For example, if $f(x) = \cos(x)$ and we seek the root $p = \pi/2$ and start with $p_0 = 3$, calculation reveals that $p_1 = -4.01525255$, $p_2 = -4.85265757$, ..., and $\{p_k\}$ will converge to a different root $-3\pi/2 \approx -4.71238898$.
Suppose that $f(x)$ is positive and monotone decreasing on the unbounded interval $[a, \infty)$ and $p_0 > a$; then the sequence $\{p_k\}$ might diverge to $+\infty$. For example, if $f(x) = xe^{-x}$ and $p_0 = 2.0$, then
$$p_1 = 4.0, \quad p_2 = 5.333333333, \quad \ldots, \quad p_{15} = 19.723549434, \quad \ldots,$$
and $\{p_k\}$ diverges slowly to $+\infty$ (see Figure 2.15(a)). This particular function has another surprising problem. The value of $f(x)$ goes to zero rapidly as $x$ gets large, for example, $f(p_{15}) = 0.0000000536$, and it is possible that $p_{15}$ could be mistaken for a root. For this reason we designed the stopping criterion in Program 2.5 to involve the relative error $2|p_{k+1} - p_k|/(|p_k| + 10^{-6})$, and when $k = 15$, this value is 0.106817, so the tolerance $\delta = 10^{-6}$ will help guard against reporting a false root.
Another phenomenon, cycling, occurs when the terms in the sequence $\{p_k\}$ tend to repeat or almost repeat. For example, if $f(x) = x^3 - x - 3$ and the initial approximation is $p_0 = 0$, then the sequence is
$$p_1 = -3.000000, \quad p_2 = -1.961538, \quad p_3 = -1.147176, \quad p_4 = -0.006579,$$
$$p_5 = -3.000389, \quad p_6 = -1.961818, \quad p_7 = -1.147430, \quad \ldots$$
and we are stuck in a cycle where $p_{k+4} \approx p_k$ for $k = 0, 1, \ldots$ (see Figure 2.15(b)). But if the starting value $p_0$ is sufficiently close to the root $p \approx 1.671699881$, then $\{p_k\}$ converges. If $p_0 = 2$, the sequence converges: $p_1 = 1.72727272$, $p_2 = 1.67369173$, $p_3 = 1.671702570$, and $p_4 = 1.671699881$.

Figure 2.15 (b) Newton-Raphson iteration for $f(x) = x^3 - x - 3$ can produce a cyclic sequence.
Figure 2.15 (c) Newton-Raphson iteration for $f(x) = \arctan(x)$ can produce a divergent oscillating sequence.

When $|g'(x)| \ge 1$ on an interval containing the root $p$, there is a chance of divergent oscillation.
For example, let $f(x) = \arctan(x)$; then the Newton-Raphson iteration function is $g(x) = x - (1 + x^2)\arctan(x)$, and $g'(x) = -2x\arctan(x)$. If the starting value $p_0 = 1.45$ is chosen, then
$$p_1 = -1.550263297, \quad p_2 = 1.845931751, \quad p_3 = -2.889109054,$$
etc. (see Figure 2.15(c)). But if the starting value is sufficiently close to the root $p = 0$, a convergent sequence results. If $p_0 = 0.5$, then
$$p_1 = -0.079559511, \quad p_2 = 0.000335302, \quad p_3 = 0.000000000.$$
The situations above point to the fact that we must be honest in reporting an answer. Sometimes the sequence does not converge. It is not always the case that after $N$ iterations a solution is found. The user of a root-finding algorithm needs to be warned of the situation when a root is not found. If there is other information concerning the context of the problem, then it is less likely that an erroneous root will be found. Sometimes $f(x)$ has a definite interval in which a root is meaningful. If knowledge of the behavior of the function or an "accurate" graph is available, then it is easier to choose $p_0$.

Figure 2.16 The geometric construction of $p_2$ for the secant method.

The Secant Method
The Newton-Raphson algorithm requires the evaluation of two functions per iteration, $f(p_{k-1})$ and $f'(p_{k-1})$. Traditionally, the calculation of derivatives of elementary functions could involve considerable effort. But, with modern computer algebra software packages, this has become less of an issue. Still many functions have nonelementary forms (integrals, sums, etc.), and it is desirable to have a method that converges almost as fast as Newton's method yet involves only evaluations of $f(x)$ and not of $f'(x)$. The secant method will require only one evaluation of $f(x)$ per step and at a simple root has an order of convergence $R \approx 1.618033989$. It is almost as fast as Newton's method, which has order 2.
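The advice about honest reporting can be made concrete. The following MATLAB sketch, which is not the text's Program 2.5, runs Newton-Raphson iteration on $f(x) = \arctan(x)$ from the two starting values discussed above and records whether each run actually converged. The tolerance `delta`, the cap `max1`, and the names `found` and `roots_` are illustrative assumptions.

```matlab
% Sketch: Newton-Raphson iteration that reports honestly whether it converged.
% The tolerance, iteration cap, and variable names are illustrative.
f  = @(x) atan(x);
df = @(x) 1./(1 + x.^2);
delta = 1e-10;  max1 = 50;
starts = [0.5, 1.45];                 % starting values from the arctan example
found  = false(size(starts));
roots_ = nan(size(starts));
for j = 1:numel(starts)
    p = starts(j);
    for k = 1:max1
        step = f(p)/df(p);
        p = p - step;
        if ~isfinite(p), break, end   % iterates have blown up
        if abs(step) < delta
            found(j) = true;  roots_(j) = p;
            break
        end
    end
    if found(j)
        fprintf('p0 = %.2f: converged to %.10f after %d iterations\n', starts(j), p, k);
    else
        fprintf('p0 = %.2f: no root found; the last iterate is not reported\n', starts(j));
    end
end
```

The design choice is to return an explicit success flag rather than the last iterate alone, so that a caller cannot mistake a diverging run for an answer.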
The formula involved in the secant method is the same one that was used in the regula falsi method, except that the logical decisions regarding how to define each succeeding term are different. Two initial points $(p_0, f(p_0))$ and $(p_1, f(p_1))$ near the point $(p, 0)$ are needed, as shown in Figure 2.16. Define $p_2$ to be the abscissa of the point of intersection of the line through these two points and the $x$-axis; then Figure 2.16 shows that $p_2$ will be closer to $p$ than either $p_0$ or $p_1$. The equation relating $p_2$, $p_1$, and $p_0$ is found by considering the slope
$$m = \frac{f(p_1) - f(p_0)}{p_1 - p_0} \quad \text{and} \quad m = \frac{0 - f(p_1)}{p_2 - p_1}. \tag{25}$$
The values of $m$ in (25) are the slope of the secant line through the first two approximations and the slope of the line through $(p_1, f(p_1))$ and $(p_2, 0)$, respectively. Set the right-hand sides equal in (25) and solve for $p_2 = g(p_1, p_0)$ and get
$$p_2 = g(p_1, p_0) = p_1 - \frac{f(p_1)(p_1 - p_0)}{f(p_1) - f(p_0)}. \tag{26}$$
The general term is given by the two-point iteration formula
$$p_{k+1} = g(p_k, p_{k-1}) = p_k - \frac{f(p_k)(p_k - p_{k-1})}{f(p_k) - f(p_{k-1})}. \tag{27}$$

Example 2.16 (Secant Method at a Simple Root). Start with $p_0 = -2.6$ and $p_1 = -2.4$ and use the secant method to find the root $p = -2$ of the polynomial function $f(x) = x^3 - 3x + 2$.
In this case the iteration formula (27) is
$$p_{k+1} = g(p_k, p_{k-1}) = p_k - \frac{(p_k^3 - 3p_k + 2)(p_k - p_{k-1})}{p_k^3 - p_{k-1}^3 - 3p_k + 3p_{k-1}}. \tag{28}$$
This can be algebraically manipulated to obtain
$$p_{k+1} = g(p_k, p_{k-1}) = \frac{p_k^2 p_{k-1} + p_k p_{k-1}^2 - 2}{p_k^2 + p_k p_{k-1} + p_{k-1}^2 - 3}. \tag{29}$$
The sequence of iterates is given in Table 2.7.

Table 2.7 Convergence of the Secant Method at a Simple Root
k | p_k | p_{k+1} - p_k | E_k = p - p_k | |E_{k+1}|/|E_k|^1.618
0 | -2.600000000 | 0.200000000 | 0.600000000 | 0.914152831
1 | -2.400000000 | 0.293401015 | 0.400000000 | 0.469497765
2 | -2.106598985 | 0.083957573 | 0.106598985 | 0.847290012
3 | -2.022641412 | 0.021130314 | 0.022641412 | 0.693608822
4 | -2.001511098 | 0.001488561 | 0.001511098 | 0.825841116
5 | -2.000022537 | 0.000022515 | 0.000022537 | 0.727100587
6 | -2.000000022 | 0.000000022 | 0.000000022 |
7 | -2.000000000 | 0.000000000 | 0.000000000 |
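The two-point formula (27) translates directly into a short loop. The following MATLAB lines are a minimal sketch for the data of Example 2.16, not the text's own secant program; the tolerance `delta` and the cap `max1` are illustrative assumptions.

```matlab
% Sketch of the two-point secant iteration (27) for Example 2.16.
f = @(x) x.^3 - 3*x + 2;
p0 = -2.6;  p1 = -2.4;
delta = 1e-12;  max1 = 30;          % illustrative tolerance and iteration cap
f0 = f(p0);  f1 = f(p1);
for k = 1:max1
    if f1 == f0, break, end         % secant line is horizontal: stop
    p2 = p1 - f1*(p1 - p0)/(f1 - f0);
    p0 = p1;  f0 = f1;
    p1 = p2;  f1 = f(p1);           % the only new function evaluation in this step
    fprintf('p_%d = %.9f\n', k + 1, p1);
    if abs(p1 - p0) < delta, break, end
end
```

Storing `f0` and `f1` between passes is what keeps the cost at one new evaluation of $f$ per step, in contrast with the two function evaluations per step of Newton-Raphson iteration.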
There is a relationship between the secant method and Newton's method. For a polynomial function $f(x)$, the secant method two-point formula $p_{k+1} = g(p_k, p_{k-1})$ will reduce to Newton's one-point formula $p_{k+1} = g(p_k)$ if $p_k$ is replaced by $p_{k-1}$. Indeed, if we replace $p_k$ by $p_{k-1}$ in (29), then the right side becomes the same as the right side of (22) in Example 2.14.
Proofs about the rate of convergence of the secant method can be found in advanced texts on numerical analysis. Let us state that the error terms satisfy the relationship
$$|E_{k+1}| \approx |E_k|^{1.618}\left|\frac{f''(p)}{2f'(p)}\right|^{0.618}, \tag{30}$$
where the order of convergence is $R = (1 + \sqrt{5})/2 \approx 1.618$ and the relation in (30) is valid only at simple roots. To check this, we make use of Example 2.16 and the specific values
$$|p - p_5| = 0.000022537,$$
$$|p - p_4|^{1.618} = 0.001511098^{1.618} = 0.000027296,$$
and
$$A = \left|\frac{f''(-2)}{2f'(-2)}\right|^{0.618} = (2/3)^{0.618} = 0.778351205.$$
Combine these and it is easy to see that
$$|p - p_5| = 0.000022537 \approx 0.000021246 = A|p - p_4|^{1.618}.$$

Accelerated Convergence
We could hope that there are root-finding techniques that converge faster than linearly when $p$ is a root of order $M$. Our final result shows that a modification can be made to Newton's method so that convergence becomes quadratic at a multiple root.

Theorem 2.7 (Acceleration of Newton-Raphson Iteration). Suppose that the Newton-Raphson algorithm produces a sequence that converges linearly to the root $x = p$ of order $M > 1$. Then the Newton-Raphson iteration formula
$$p_k = p_{k-1} - \frac{M f(p_{k-1})}{f'(p_{k-1})} \tag{31}$$
will produce a sequence $\{p_k\}_{k=0}^{\infty}$ that converges quadratically to $p$.

Table 2.8 Acceleration of Convergence at a Double Root
k | p_k | p_{k+1} - p_k | E_k = p - p_k | |E_{k+1}|/|E_k|^2
0 | 1.200000000 | -0.193939394 | -0.200000000 | 0.151515150
1 | 1.006060606 | -0.006054519 | -0.006060606 | 0.165718578
2 | 1.000006087 | -0.000006087 | -0.000006087 |
3 1000000000 cooanence | .oncmneneo Tabie29 Compariton of te Speed of Comergence Special Retaion bowen Matt conslsrions or Biscction agi © HIERi Regula as Fiat = AIEML Scent eo Matipe rot Big = AIELL Newton Raphson Mutipe root Bini = AIBA Scant metod Simpl ot Bugs = ale ott Newton Rapin Simple ot Bust = alee? Accelerated Malipe rot Buns SALE ‘Newen-Raphion Exaraple 2.17 (Acceleration of Convergence at a Double Root). Start with po = 1.2 Ap ue cele Newton Raton aon find the donb at p = # of 8) = = 3c +2. Since M = 2, the acceleration formula (31) becomes pee pay 2 LD. Phat Ski —4 FO) 3p ,-3 ‘nd we obtain the values in Table 2.8 ‘ Table 2.9 compares the speed of convergence of the various root-finding methods ‘that we have studied so far. The value of the constant A is different for each method, 84 CHAP.2 Tuk SOLUTION OF NONLINEAR EQUATIONS f(x) = 0 | Program 2.5 (Newton-Raphson eration). To approximate a root of F() {given one initial approximation py and using the iteration Mm for k= 1,2, function [p0, err, x,y) -noveon af ,p0 delta, epeilon aaxi) Waput ~ £ is the object function input as a string ’f? ~ af is the derivative of # input as a string 'éf? ~ pO is the initial approximation to & zero of £ = delta ie the tolerance for pO ~ epsilon is the tolerance for the function values y ~ maxi is the naxinuz nusber of iterations = pO is the Nevton-Raphson approximation to the zero ~ err 19 the error estinate for pO ~ x Je the munber of iterations ~ y is the function value £(g0) for kel:maxt pi=p0-feval(t pO) /taval (€,p0); err=abs(pt-p0) ; relerr~2+erz/(abs(pi)+delta); po=pti; yofeval (£0): Af Corrcdelca) | (relerr = 0.000062, and Es = 0000000. Estimate the asymptotic ervor constant A and the order of convergence 2 ofthe sequence geverated by the iterative method, Algorithms and Programs 1. 
Modify Programs 2.5 and 2.6 to display an appropriate error message when (i) division by zero occurs in (4) or (27), respectively, or (ii) the maximum number of iterations, max1, is exceeded.

2. It is often instructive to display the terms in the sequences generated by (4) and (27) (i.e., the second column of Table 2.4). Modify Programs 2.5 and 2.6 to display the sequences generated by (4) and (27), respectively.

3. Modify Program 2.5 to use Newton's square-root algorithm to approximate each of the following square roots to 10 decimal places.
(a) Start with $p_0 = 3$ and approximate $\sqrt{8}$.
(b) Start with $p_0 = 10$ and approximate $\sqrt{91}$.
(c) Start with $p_0 = -3$ and approximate $-\sqrt{8}$.

4. Modify Program 2.5 to use the cube-root algorithm in Exercise 11 to approximate each of the following cube roots to 10 decimal places.
(a) Start with $p_0 = 2$ and approximate $7^{1/3}$.
(b) Start with $p_0 = 6$ and approximate $200^{1/3}$.
(c) Start with $p_0 = -2$ and approximate $(-7)^{1/3}$.

5. Modify Program 2.5 to use the accelerated Newton-Raphson algorithm in Theorem 2.7 to find the root $p$ of order $M$ of each of the following functions.
(a) $f(x) = (x - 2)^5$, $M = 5$, $p = 2$; start with $p_0 = 1$.
(b) $f(x) = \sin(x^3)$, $M = 3$, $p = 0$; start with $p_0 = 1$.
(c) $f(x) = (x - 1)\ln(x)$, $M = 2$, $p = 1$; start with $p_0 = 2$.

6. Modify Program 2.5 to use Halley's method in Exercise 22 to find the simple zero of $f(x) = x^3 - 3x + 2$, using $p_0 =$ [illegible].

7. Suppose that the equations of motion for a projectile are $y =$ [illegible] and $x = r(t) = 2400(1 - e^{[\text{illegible}]})$.
(a) Find the elapsed time until impact accurate to 10 decimal places.
(b) Find the range accurate to 10 decimal places.

8. (a) Find the point on the parabola $y = x^2$ that is closest to the point $(3, 1)$ accurate to 10 decimal places.
(b) Find the point on the graph of $y = \sin(x - \sin(x))$ that is closest to the point $(2.1, 0.5)$ accurate to 10 decimal places.
(c) Find the value of $x$ at which the minimum vertical distance between the graphs of $f(x) = x^2 + 2$ and $g(x) = (x/5) - \sin(x)$ occurs accurate to 10 decimal places.

9. An open-top box is constructed from a rectangular piece of sheet metal measuring 10 by 16 inches. Squares of what size (accurate to 0.000000001 inch) should be cut from the corners if the volume of the box is to be 100 cubic inches?

10. A catenary is the curve formed by a hanging cable. Assume that the lowest point is $(0, 0)$; then the formula for the catenary is $y = C\cosh(x/C) - C$. To determine the catenary that goes through $(\pm a, b)$ we must solve the equation $b = C\cosh(a/C) - C$ for $C$.
(a) Show that the catenary through $(\pm 10, 6)$ is $y = 9.1889\cosh(x/9.1889) - 9.1889$.
(b) Find the catenary that passes through $(\pm 12, 5)$.

2.5 Aitken's Process and Steffensen's and Muller's Methods (Optional)

In Section 2.4 we saw that Newton's method converged slowly at a multiple root and the sequence of iterates $\{p_k\}$ exhibited linear convergence. Theorem 2.7 showed how to speed up convergence, but it depends on knowing the order of the root in advance.

Aitken's Process

A technique called Aitken's $\Delta^2$ process can be used to speed up convergence of any sequence that is linearly convergent. In order to proceed, we will need a definition.

Definition 2.6. Given the sequence $\{p_n\}_{n=0}^{\infty}$, define the forward difference $\Delta p_n$ by

$$\Delta p_n = p_{n+1} - p_n \quad\text{for } n \ge 0. \tag{1}$$

Higher powers $\Delta^k p_n$ are defined recursively by

$$\Delta^k p_n = \Delta^{k-1}(\Delta p_n) \quad\text{for } k \ge 2. \tag{2}$$

Theorem 2.8 (Aitken's Acceleration). Assume that the sequence $\{p_n\}_{n=0}^{\infty}$ converges linearly to the limit $p$ and that $p - p_n \ne 0$ for all $n \ge 0$. If there exists a real number $A$ with $|A| < 1$ such that

$$\lim_{n \to \infty} \frac{p - p_{n+1}}{p - p_n} = A, \tag{3}$$

then the sequence $\{q_n\}_{n=0}^{\infty}$ defined by

$$q_n = p_n - \frac{(\Delta p_n)^2}{\Delta^2 p_n} = p_n - \frac{(p_{n+1} - p_n)^2}{p_{n+2} - 2p_{n+1} + p_n} \tag{4}$$

converges to $p$ faster than $\{p_n\}_{n=0}^{\infty}$, in the sense that

$$\lim_{n \to \infty} \left| \frac{p - q_n}{p - p_n} \right| = 0. \tag{5}$$

Proof. We will show how to derive formula (4) and will leave the proof of (5) as an exercise.
Since the terms in (3) are approaching a limit, we can write

$$\frac{p - p_{n+1}}{p - p_n} \approx A \quad\text{and}\quad \frac{p - p_{n+2}}{p - p_{n+1}} \approx A \quad\text{when } n \text{ is large.} \tag{6}$$

The relations in (6) imply that

$$(p - p_{n+1})^2 \approx (p - p_{n+2})(p - p_n). \tag{7}$$

When both sides of (7) are expanded and the terms $p^2$ are canceled, the result is

$$p \approx \frac{p_{n+2}\,p_n - p_{n+1}^2}{p_{n+2} - 2p_{n+1} + p_n} = q_n \quad\text{for } n = 0, 1, \ldots. \tag{8}$$

The formula in (8) is used to define the term $q_n$. It can be rearranged algebraically to obtain formula (4), which has less error propagation when computer calculations are made.

Example 2.18. Show that the sequence $\{p_n\}$ in Example 2.2 exhibits linear convergence, and show that the sequence $\{q_n\}$ obtained by Aitken's $\Delta^2$ process converges faster.

The sequence $\{p_n\}$ was obtained by fixed-point iteration using the function $g(x) = e^{-x}$ and starting with $p_0 = 0.5$. After convergence has been achieved, the limit is $p \approx 0.567143290$. The values $p_n$ and $q_n$ are given in Tables 2.10 and 2.11. For illustration, the value of $q_1$ is given by the calculation

$$q_1 = p_1 - \frac{(p_2 - p_1)^2}{p_3 - 2p_2 + p_1} = 0.606530660 - \frac{(-0.061291448)^2}{0.095755331} = 0.567298989.$$

Table 2.10 Linearly Convergent Sequence $\{p_n\}$

n    p_n            E_n = p_n - p    A_n = E_n / E_{n-1}
1    0.606530660     0.039387369     -0.586616609
2    0.545239212    -0.021904078     -0.556119357
3    0.579703095     0.012559805     -0.573400269
4    0.560064628    -0.007078663     -0.563596551
5    0.571172149     0.004028859     -0.569155385
6    0.564862947    -0.002280343     -0.566002381

Table 2.11 Derived Sequence $\{q_n\}$ Using Aitken's Process

n    q_n            q_n - p
1    0.567298989    0.000155699
2    0.567193142    0.000049852
3    0.567159364    0.000016074
4    0.567148453    0.000005163
5    0.567144952    0.000001662
6    0.567143825    0.000000534

Figure 2.17 The starting approximations $p_0$, $p_1$, and $p_2$ for Muller's method, and the differences $h_0$ and $h_1$.

Although the sequence $\{q_n\}$ in Table 2.11 converges linearly, it converges faster than $\{p_n\}$ in the sense of Theorem 2.8, and usually Aitken's method gives a better improvement than this. When Aitken's process is combined with fixed-point iteration, the result is called Steffensen's acceleration.
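To make formula (4) concrete, here is a small MATLAB sketch that builds the fixed-point sequence of Example 2.18 and applies Aitken's $\Delta^2$ process to it. The names N, d1, and d2 are illustrative assumptions, not part of the book's Program 2.7, and the array element p(n+1) stores $p_n$ because MATLAB indices start at 1.

```matlab
% Minimal sketch of Aitken's process (4) for the sequence of Example 2.18.
% Assumption: p(n+1) stores p_n and q(n+1) stores q_n (1-based indexing).
g = @(x) exp(-x);              % fixed-point function from Example 2.18
N = 8;                         % number of fixed-point steps (illustrative)
p = zeros(1, N+1);
p(1) = 0.5;                    % p_0
for n = 1:N
    p(n+1) = g(p(n));          % p_{n} = g(p_{n-1})
end
d1 = diff(p);                  % forward differences  p_{n+1} - p_n
d2 = diff(p, 2);               % second differences   p_{n+2} - 2p_{n+1} + p_n
q  = p(1:end-2) - d1(1:end-1).^2 ./ d2;   % formula (4)
fprintf('%d  %.9f  %.9f\n', [0:numel(q)-1; p(1:numel(q)); q]);
```

Each printed row holds $n$, $p_n$, and $q_n$, so the last two columns can be compared with Tables 2.10 and 2.11. Formula (4) is used rather than (8) because, as noted above, it propagates less rounding error.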
The details are given in Program 2.7 and in the exercises.

Muller's Method

Muller's method is a generalization of the secant method, in the sense that it does not require the derivative of the function. It is an iterative method that requires three starting points $(p_0, f(p_0))$, $(p_1, f(p_1))$, and $(p_2, f(p_2))$. A parabola is constructed that passes through the three points; then the quadratic formula is used to find a root of the quadratic for the next approximation. It has been proved that near a simple root Muller's method converges faster than the secant method and almost as fast as Newton's method. The method can be used to find real or complex zeros of a function and can be programmed to use complex arithmetic.

Without loss of generality, we assume that $p_2$ is the best approximation to the root and consider the parabola through the three starting values, shown in Figure 2.17. Make the change of variable

$$t = x - p_2, \tag{9}$$

and use the differences

$$h_0 = p_0 - p_2 \quad\text{and}\quad h_1 = p_1 - p_2. \tag{10}$$

Consider the quadratic polynomial involving the variable $t$:

$$y = at^2 + bt + c. \tag{11}$$

Each point is used to obtain an equation involving $a$, $b$, and $c$:

$$\begin{aligned}
\text{At } t = h_0:&\quad a h_0^2 + b h_0 + c = f_0,\\
\text{At } t = h_1:&\quad a h_1^2 + b h_1 + c = f_1,\\
\text{At } t = 0:&\quad a\,0^2 + b\,0 + c = f_2.
\end{aligned} \tag{12}$$

From the third equation in (12), we see that

$$c = f_2. \tag{13}$$

Substituting (13) into the first two equations in (12) and using the definitions $e_0 = f_0 - c$ and $e_1 = f_1 - c$ results in the linear system

$$\begin{aligned}
a h_0^2 + b h_0 &= f_0 - c = e_0,\\
a h_1^2 + b h_1 &= f_1 - c = e_1.
\end{aligned} \tag{14}$$

Solving the linear system for $a$ and $b$ results in

$$a = \frac{e_0 h_1 - e_1 h_0}{h_1 h_0^2 - h_0 h_1^2}, \qquad b = \frac{e_1 h_0^2 - e_0 h_1^2}{h_1 h_0^2 - h_0 h_1^2}. \tag{15}$$

The quadratic formula is used to find the roots $t = z_1, z_2$ of (11):

$$z = \frac{-2c}{b \pm \sqrt{b^2 - 4ac}}. \tag{16}$$

Formula (16) is equivalent to the standard formula for the roots of a quadratic and is better in this case because we know that $c = f_2$.

To ensure stability of the method, we choose the root in (16) that has the smallest absolute value. If $b > 0$, use the positive sign with the square root, and if $b < 0$, use the negative sign.
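The derivation in (9) through (16), together with the update $p_3 = p_2 + z$, can be sketched in MATLAB as follows. This is only an illustration under stated assumptions: the test function, the starting values, the tolerance, and the iteration cap are chosen here for the example, a negative discriminant is simply replaced by zero to keep the arithmetic real, and all three function values are re-evaluated on every pass for brevity. It is not the book's program.

```matlab
% Minimal sketch of Muller's method using formulas (10), (13), (15), (16).
% Assumptions: test function, starting points, tolerance, and the real-only
% treatment of the discriminant are illustrative choices.
f = @(x) x.^3 - 3*x + 2;          % assumed test function with simple root -2
P = [-2.6, -2.5, -2.4];           % assumed starting values p_0, p_1, p_2
for iter = 1:20
    Y  = f(P);
    h0 = P(1) - P(3);             % differences (10)
    h1 = P(2) - P(3);
    c  = Y(3);                    % (13)
    e0 = Y(1) - c;
    e1 = Y(2) - c;
    denom = h1*h0^2 - h0*h1^2;
    a = (e0*h1 - e1*h0)/denom;    % (15)
    b = (e1*h0^2 - e0*h1^2)/denom;
    disc = sqrt(max(b^2 - 4*a*c, 0));   % real arithmetic only (assumption)
    if b < 0, disc = -disc; end   % sign giving the root of smallest magnitude
    z  = -2*c/(b + disc);         % (16)
    p3 = P(3) + z;
    [~, far] = max(abs(P - p3));  % discard the point farthest from p_3
    P(far) = [];
    P = [P, p3];                  % the newest iterate becomes p_2
    if abs(z) < 1e-12, break, end
end
root = P(3);
fprintf('approximate root: %.12f after %d iterations\n', root, iter);
```

Stopping as soon as the step $z$ is tiny avoids a later pass in which two of the three points coincide and the denominator in (15) vanishes.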
Then $p_3$ is shown in Figure 2.17 and is given by

$$p_3 = p_2 + z. \tag{17}$$

To update the iterates, choose $p_0$ and $p_1$ to be the two values selected from among $\{p_0, p_1, p_2\}$ that lie closest to $p_3$ (i.e., throw out the one that is farthest away). Then replace $p_2$ with $p_3$. Although a lot of auxiliary calculations are done in Muller's method, it only requires one function evaluation per iteration.

If Muller's method is used to find the real roots of $f(x) = 0$, it is possible that one may encounter complex approximations, because the roots of the quadratic in (16) might be complex (nonzero imaginary components). In these cases the imaginary components will have a small magnitude and can be set equal to zero so that the calculations proceed with real numbers.

Table 2.12 Comparison of Convergence near a Simple Root (columns: k, secant method, Muller's method, Newton's method, Steffensen with Newton; entries illegible)

Table 2.13 Comparison of Convergence near a Double Root (columns: k, secant method, Muller's method, Newton's method, Steffensen with Newton; entries illegible)

    [illegible] sqrt(b^2-4 [illegible] Y(2)-Y(3);c=Y(3);
    %Find the smallest root of (17)
    if b [illegible]

[illegible] 1 and shrinks the vector when $|c| < 1$. This is shown by using equation (10):

$$\|cX\| = (c^2 x_1^2 + c^2 x_2^2 + \cdots + c^2 x_N^2)^{1/2} = |c|\,(x_1^2 + x_2^2 + \cdots + x_N^2)^{1/2} = |c|\,\|X\|. \tag{11}$$

An important relationship exists between the dot product and norm of a vector. If both sides of equation (10) are squared and equation (9) is used, with $Y$ being replaced with $X$, we have

$$\|X\|^2 = x_1^2 + x_2^2 + \cdots + x_N^2 = X \cdot X. \tag{12}$$

If $X$ and $Y$ are position vectors that locate the two points $(x_1, x_2, \ldots, x_N)$ and $(y_1, y_2, \ldots, y_N)$ in N-dimensional space, then the displacement vector from $X$ to $Y$ is given by the difference

$$Y - X \quad\text{(displacement from position } X \text{ to position } Y\text{).} \tag{13}$$

Notice that if a particle starts at the position $X$ and moves through the displacement $Y - X$, its new position is $Y$.
This can be obtained by the following vector sum:

$$Y = X + (Y - X). \tag{14}$$

Using equations (10) and (13), we can write down the formula for the distance between two points in N-space:

$$\|Y - X\| = \left( (y_1 - x_1)^2 + (y_2 - x_2)^2 + \cdots + (y_N - x_N)^2 \right)^{1/2}. \tag{15}$$

When the distance between points is computed using formula (15), we say that the points lie in N-dimensional Euclidean space.
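The norm and distance relations above are easy to check numerically. The short MATLAB sketch below uses two made-up vectors (the values of X, Y, and c are assumptions chosen so that the arithmetic is exact) and compares the distance formula (15) with MATLAB's built-in norm function.

```matlab
% Minimal numerical check of the norm and distance relations.
% Assumption: X, Y, and c are illustrative values, not from the text.
X = [1, 2, 2];
Y = [4, 6, 2];
c = -3;
d_formula = sqrt(sum((Y - X).^2));   % distance formula (15)
d_builtin = norm(Y - X);             % built-in Euclidean norm
scaled    = norm(c*X);               % should equal |c| * ||X||
normsq    = dot(X, X);               % should equal ||X||^2
fprintf('distance: %g (formula), %g (built-in)\n', d_formula, d_builtin);
fprintf('||cX|| = %g, |c|*||X|| = %g, X.X = %g\n', scaled, abs(c)*norm(X), normsq);
```

For these vectors the displacement is $Y - X = (3, 4, 0)$, so both distance computations give 5.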
