While it was noticed early that the roots of a polynomial are not rational expressions of the coefficients, the mathematician Girard noticed in 1629 that certain expressions in the roots were rational expressions of the coefficients. Suppose we consider the polynomial equation $t^3 + p_1 t^2 + p_2 t + p_3 = 0$. Let $x_1, x_2, x_3$ be the roots. Then Girard noted that
$$x_1 + x_2 + x_3 = -p_1$$
$$x_1^2 + x_2^2 + x_3^2 = p_1^2 - 2p_2$$
$$x_1^3 + x_2^3 + x_3^3 = -p_1^3 + 3p_1 p_2 - 3p_3$$
$$x_1^4 + x_2^4 + x_3^4 = p_1^4 - 4p_1^2 p_2 + 4p_1 p_3 + 2p_2^2$$
While it seems clear that we can continue, it was not clear to Girard what the general pattern was. In fact, the general pattern is quite complicated, but Isaac Newton published in 1683 a simple set of recursive equations. Note that for convenience we are writing the coefficients of the polynomial in a non-standard order.

Theorem 4.6.1 (Newton's Identities) Let $f(t) = t^n + p_1 t^{n-1} + p_2 t^{n-2} + \cdots + p_{n-1} t + p_n$ be a polynomial with roots (counted according to multiplicity) $x_1, x_2, \ldots, x_n$. For $j = 1, 2, 3, \ldots$ let
$$s_j = x_1^j + x_2^j + \cdots + x_n^j.$$
Set $p_k = 0$ for $k > n$. Then for all $j > 0$
$$s_j + p_1 s_{j-1} + p_2 s_{j-2} + \cdots + p_{j-1} s_1 + j p_j = 0.$$

What this theorem says is that we have the equations
$$s_1 + p_1 = 0$$
$$s_2 + p_1 s_1 + 2p_2 = 0$$
$$s_3 + p_1 s_2 + p_2 s_1 + 3p_3 = 0$$
$$s_4 + p_1 s_3 + p_2 s_2 + p_3 s_1 + 4p_4 = 0$$
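The recursion in these equations is easy to run on a machine. The following Python sketch (ours, not part of the text, which uses Maple later in the chapter) computes the power sums $s_j$ from the coefficients $p_j$ by Newton's identities:

```python
def power_sums(p, count):
    """Given [p1, ..., pn], return [s1, ..., s_count] via Newton's identities."""
    n = len(p)
    s = []
    for j in range(1, count + 1):
        pj = p[j - 1] if j <= n else 0            # p_k = 0 for k > n
        total = j * pj
        for i in range(1, j):                     # the p_i * s_{j-i} terms
            pi = p[i - 1] if i <= n else 0
            total += pi * s[j - i - 1]
        s.append(-total)
    return s

# Roots 1, 2, 3 give t^3 - 6t^2 + 11t - 6, i.e. p1, p2, p3 = -6, 11, -6;
# the power sums 1^j + 2^j + 3^j should come out to 6, 14, 36, 98.
print(power_sums([-6, 11, -6], 4))   # -> [6, 14, 36, 98]
```

Substituting Girard's formulas for $s_1, \ldots, s_4$ gives the same values, as the reader can check.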
The first equation allows us to solve for $s_1$ in terms of $p_1$. Then, since we know $s_1$ and the $p_k$, the second equation allows us to solve for $s_2$. Now we know $s_1$, $s_2$ and the $p_k$, so we can solve the third equation for $s_3$. We can continue in this manner to find $s_4$, $s_5$ and so on for as long as we like. The reader should check that solving the four equations gives Girard's formulas above.

Newton did not bother to supply a proof for these identities. We will sketch a proof modified from Uspensky. We will work with formal power series, i.e. expressions of the form $\sum_{i=0}^{\infty} a_i x^i$. By formal we mean that we won't worry about convergence; we will only do algebraic operations. The operations of sum and product for formal power series are as defined for polynomials in Chapter 1, and we obtain an integral domain. The main surprise is that formal power series with non-zero constant term have multiplicative inverses. In particular we have the geometric series
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$$

Proof: We start with the factorization
$$f(t) = t^n + p_1 t^{n-1} + \cdots + p_n = (t - x_1)(t - x_2) \cdots (t - x_n).$$
We then take the logarithmic derivative
$$\frac{f'(t)}{f(t)} = \frac{1}{t - x_1} + \frac{1}{t - x_2} + \cdots + \frac{1}{t - x_n}.$$
Multiplying by $t$ gives
$$t\,\frac{f'(t)}{f(t)} = \frac{t}{t - x_1} + \frac{t}{t - x_2} + \cdots + \frac{t}{t - x_n}. \tag{4.10}$$
Now if we expand each term on the right using the geometric series we have
$$\frac{t}{t - x_i} = \frac{1}{1 - \frac{x_i}{t}} = 1 + \frac{x_i}{t} + \frac{x_i^2}{t^2} + \frac{x_i^3}{t^3} + \cdots$$
Adding, we see that the right hand side of (4.10) is
$$n + (x_1 + \cdots + x_n)\frac{1}{t} + (x_1^2 + \cdots + x_n^2)\frac{1}{t^2} + (x_1^3 + \cdots + x_n^3)\frac{1}{t^3} + \cdots$$
or, alternatively (writing $s_0 = n$),
$$t\,\frac{f'(t)}{f(t)} = s_0 + \frac{s_1}{t} + \frac{s_2}{t^2} + \frac{s_3}{t^3} + \cdots \tag{4.11}$$
Multiplying (4.10) by $f(t)$ then gives
$$t f'(t) = (t^n + p_1 t^{n-1} + \cdots + p_n)\left(s_0 + \frac{s_1}{t} + \frac{s_2}{t^2} + \cdots\right). \tag{4.12}$$
Multiplying out the right hand side of (4.12) gives a power series
$$b_n t^n + b_{n-1} t^{n-1} + \cdots + b_0 + b_{-1} t^{-1} + \cdots$$
where
$$b_{n-j} = s_j + p_1 s_{j-1} + \cdots + p_j s_0.$$
On the other hand,
$$t f'(t) = n t^n + (n-1) p_1 t^{n-1} + \cdots + p_{n-1} t.$$
This says that the coefficient of $t^{n-j}$ on the left is $(n-j)p_j$, which makes sense for all $j > 0$ since we set $p_j = 0$ for $j > n$. Finally (since we are using formal power series) we may equate the coefficients of $t^{n-j}$ on both sides of (4.12) to get
$$(n-j)p_j = s_j + p_1 s_{j-1} + \cdots + p_j s_0.$$
Subtracting $(n-j)p_j$ from both sides of this equation and using the fact that $s_0 = n$ gives Newton's identity.

We give an application of the use of Newton's Identities. This is the application that Newton had in mind, and it gives a method for finding the largest real root of a polynomial (assuming such a root exists, is not a multiple root, and is actually the root of largest modulus). We note that this method is of no practical value today. The method is based on the fact that if the real root $x_n$ is of larger modulus than any other root, then for large $k$, $s_k = x_1^k + \cdots + x_n^k \approx x_n^k$. Thus the sequence
$$s_1, \sqrt{s_2}, \sqrt[3]{s_3}, \sqrt[4]{s_4}, \ldots$$
should converge to $x_n$.

Example 4.6.2 Let $f(x) = x^3 - 5x^2 + 6x - 1$. Then $p_1 = -5$, $p_2 = 6$ and $p_3 = -1$. From the identities we get
$$s_1 = -p_1 = 5$$
$$s_2 = -p_1 s_1 - 2p_2 = -(-5)(5) - 2(6) = 13$$
$$s_3 = -p_1 s_2 - p_2 s_1 - 3p_3 = -(-5)(13) - (6)(5) - 3(-1) = 38$$
$$s_4 = -p_1 s_3 - p_2 s_2 - p_3 s_1 = -(-5)(38) - (6)(13) - (-1)(5) = 117$$
Thus our sequence is $s_1 = 5$, $\sqrt{s_2} = 3.6055$, $\sqrt[3]{s_3} = 3.36197$, $\sqrt[4]{s_4} = 3.2888, \ldots$ The actual root is $3.2469$.
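Carrying Example 4.6.2 further on a machine shows the convergence. The following Python sketch (ours, not from the text) computes the $s_k$ by the recursion and takes $k$-th roots:

```python
# Newton's largest-root method for f(x) = x^3 - 5x^2 + 6x - 1 (Example 4.6.2):
# the k-th roots s_k^(1/k) approach the dominant root.
def power_sums(p, count):
    # Newton's identities: s_j = -(j*p_j + sum_i p_i * s_{j-i})
    n = len(p)
    s = []
    for j in range(1, count + 1):
        pj = p[j - 1] if j <= n else 0
        total = j * pj + sum((p[i - 1] if i <= n else 0) * s[j - i - 1]
                             for i in range(1, j))
        s.append(-total)
    return s

s = power_sums([-5, 6, -1], 20)
approx = [s[j - 1] ** (1.0 / j) for j in range(1, 21)]
print(approx[0], approx[1], approx[3])   # 5.0, then about 3.6055 and 3.2888
print(approx[-1])                        # close to the true root 3.2469...
```

After twenty terms the approximation agrees with the true root to several decimal places.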
Maple Implementation For actually calculating the ghost coefficients $s_j$ with Maple it is easier to use equation (4.11) directly than Newton's identities. The idea is to expand the left hand side of this equation in an asymptotic series, that is, a series in negative powers of $t$. Thus to find the $s_j$ in the above example one might do

f := t^3 - 5*t^2 + 6*t - 1;
asympt(t*diff(f,t)/f, t, 5);

and the result would look like
$$3 + \frac{5}{t} + \frac{13}{t^2} + \frac{38}{t^3} + \frac{117}{t^4} + O\!\left(\frac{1}{t^5}\right)$$
where $s_0, s_1, \ldots, s_4$ are the numerators, and the $O$ term at the end is to remind you that this is an infinite series. Replacing the 5 with a larger number will get you as many terms as you like.

Exercise 23a [10 points] Use the method just described to find the largest real root of $f(x) = x^4 - 2x^3 - 5x^2 + 6x + 3$ correct to 2 significant digits.
4.7 More on Newton's Identities
In this optional section we consider some more applications of Newton's identities. The material in this section will not be needed in the sequel, so you may wish to skip this section to maintain the continuity of our story.

For our next application we note that if we know the $s_k$ we can then calculate the coefficients $p_j$ of our polynomial. For, solving Newton's identities for $p_j$ in terms of the $s_k$, we have
$$p_1 = -s_1$$
$$p_2 = \frac{s_1^2 - s_2}{2}$$
$$p_3 = \frac{-2s_3 + 3s_1 s_2 - s_1^3}{6}$$
and so on. Evidently these formulas get very complicated quickly, but numerically it is easy to solve for the $p_j$'s directly from Newton's Identities.
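Solving the identities numerically, as suggested, amounts to the same recursion run in reverse. A minimal Python sketch (ours; exact arithmetic via `fractions` since each step divides by $j$):

```python
# Recover p_j from known power sums via j*p_j = -(s_j + p_1 s_{j-1} + ... + p_{j-1} s_1).
from fractions import Fraction

def coeffs_from_power_sums(s):
    """s = [s1, ..., sn]; returns [p1, ..., pn]."""
    p = []
    for j in range(1, len(s) + 1):
        total = s[j - 1] + sum(p[i - 1] * s[j - i - 1] for i in range(1, j))
        p.append(-Fraction(total) / j)
    return p

# Roots 1, 2, 3 have s1 = 6, s2 = 14, s3 = 36; we should recover
# p1 = -6, p2 = 11, p3 = -6, i.e. (t-1)(t-2)(t-3) = t^3 - 6t^2 + 11t - 6.
print(coeffs_from_power_sums([6, 14, 36]))   # -> [-6, 11, -6]
```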
This technique has been used to calculate the eigenvalues of a matrix. Recall that the eigenvalues of an $n \times n$ matrix $A$ are the roots of the characteristic polynomial $f(\lambda) = \det(\lambda I - A) = \lambda^n + p_1 \lambda^{n-1} + p_2 \lambda^{n-2} + \cdots + p_{n-1}\lambda + p_n$. (Note that we are using $\det(\lambda I - A)$ rather than the formula $\det(A - \lambda I)$ found in some linear algebra books in order that our polynomial be monic; these formulas differ by a factor of $(-1)^n$.) If the eigenvalues (complex and counted according to multiplicities) are $x_1, x_2, \ldots, x_n$, then it is not too difficult to see that the sum $s_1 = x_1 + \cdots + x_n$ is the sum of the diagonal entries of $A$; in fact both are equal to $-p_1$. The sum of the diagonal entries of a matrix $A$ is called the trace of the matrix, $\mathrm{trace}(A)$.

It is a bit harder to see that the eigenvalues of $A^2$ are $x_1^2, x_2^2, \ldots, x_n^2$. When there are no repeated eigenvalues this follows from the fact that $\lambda$ is an eigenvalue of $A$ if there exists a vector $v \neq 0$ so that $Av = \lambda v$. Then $A^2 v = A(Av) = A(\lambda v) = \lambda A v = \lambda^2 v$, so $\lambda^2$ is an eigenvalue of $A^2$. It follows that the trace of $A^2$ is then $x_1^2 + x_2^2 + \cdots + x_n^2 = s_2$. More generally the argument above suggests that $s_k = x_1^k + \cdots + x_n^k$ is the trace of $A^k$ for each $k > 0$. In fact this is true. Thus the method is to calculate $A^k$ for $k = 1, 2, \ldots, n$ and set $s_k$ to be the trace of $A^k$. Then, working backwards using Newton's identities, we can find the coefficients $p_1, p_2, \ldots, p_n$ of the characteristic polynomial.

Example 4.7.1 We wish to find the eigenvalues of the matrix
$$A = \begin{pmatrix} 1 & 2 & 0 \\ 3 & -2 & 1 \\ 2 & 1 & 4 \end{pmatrix}$$
We first calculate
$$A^2 = \begin{pmatrix} 7 & -2 & 2 \\ -1 & 11 & 2 \\ 13 & 6 & 17 \end{pmatrix}, \qquad A^3 = \begin{pmatrix} 5 & 20 & 6 \\ 36 & -22 & 19 \\ 65 & 31 & 74 \end{pmatrix}$$
We then see that $s_1 = \mathrm{trace}(A) = 3$, $s_2 = \mathrm{trace}(A^2) = 35$ and $s_3 = \mathrm{trace}(A^3) = 57$. Now from the identity
$$s_1 + p_1 = 0$$
we calculate $p_1 = -3$. From the identity
$$s_2 + p_1 s_1 + 2p_2 = 0$$
we have $35 + (-3)(3) + 2p_2 = 0$, so $p_2 = -13$. Finally from
$$s_3 + p_1 s_2 + p_2 s_1 + 3p_3 = 0$$
we obtain $57 + (-3)(35) + (-13)(3) + 3p_3 = 0$, so $p_3 = 29$. We thus conclude that the characteristic polynomial of $A$ is
$$f(\lambda) = \lambda^3 - 3\lambda^2 - 13\lambda + 29.$$
Using any method from Chapter 2 we see that the roots of $f(\lambda)$ are $-3.3813$, $1.9244$ and $4.4569$, i.e. these are the eigenvalues of $A$.

Exercise 23b [20 points] Find the characteristic polynomial and the eigenvalues of the matrix
$$A = \begin{pmatrix} 1 & 2 & 0 & -1 \\ 2 & 1 & -2 & 0 \\ 0 & -2 & 3 & 3 \\ -1 & 0 & 3 & 1 \end{pmatrix}$$
Hint: The eigenvalues are real.

There are similar identities for finding sums of negative powers of the roots of a polynomial.

Theorem 4.7.2 Let $f(t) = p_0 t^n + p_1 t^{n-1} + \cdots + p_n$ be a polynomial with roots (counted according to multiplicity) $x_1, x_2, \ldots, x_n$. Assume that $p_n \neq 0$, i.e. that no $x_i = 0$, and set $p_k = 0$ for $k < 0$. Define for $k \geq 0$
$$s_{-k} = x_1^{-k} + x_2^{-k} + \cdots + x_n^{-k} = \frac{1}{x_1^k} + \cdots + \frac{1}{x_n^k}.$$
Then for all $j \leq n$
$$(n-j)p_j + p_{j+1} s_{-1} + \cdots + p_n s_{j-n} = 0.$$

The proof is similar to that of Theorem 4.6.1, by noting that expansion as a formal power series in positive powers of $t$ gives
$$-t\,\frac{f'(t)}{f(t)} = s_{-1} t + s_{-2} t^2 + s_{-3} t^3 + \cdots \tag{4.13}$$
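The trace method of Example 4.7.1 is mechanical enough to sketch in a few lines of Python (ours, not the text's Maple; the matrix powers are the ones displayed above):

```python
# s_k = trace(A^k) for k = 1..n, then Newton's identities give the
# characteristic polynomial coefficients p_1, ..., p_n.
from fractions import Fraction

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    n = len(A)
    power, s = A, []
    for k in range(n):
        s.append(sum(power[i][i] for i in range(n)))   # trace of A^(k+1)
        power = mat_mul(power, A)
    p = []
    for j in range(1, n + 1):                          # Newton's identities
        total = s[j - 1] + sum(p[i - 1] * s[j - i - 1] for i in range(1, j))
        p.append(-Fraction(total) / j)
    return p

A = [[1, 2, 0], [3, -2, 1], [2, 1, 4]]
print(char_poly(A))    # -> [-3, -13, 29], i.e. lambda^3 - 3 lambda^2 - 13 lambda + 29
```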
Maple Implementation Again the most efficient way to calculate the ghost coefficients $s_{-j}$ is to expand the left hand side of equation (4.13) as a series, e.g.

series(-t*diff(f,t)/f, t, 8);

would give $s_{-j}$ as the coefficient of $t^j$ for $j = 1, 2, \ldots, 7$.
Often in the literature the Newton's identities for positive and negative powers are combined by adding the two identities as follows:

Corollary 4.7.3 With hypotheses as in the previous theorem, setting $s_0 = n$, we have for all integers $j$ (positive, negative and zero)
$$p_0 s_j + p_1 s_{j-1} + p_2 s_{j-2} + \cdots + p_n s_{j-n} = 0.$$

Actually, the Newton's Identity for negative powers looks nicer if we use our more standard notation for polynomials:

Corollary 4.7.4 Let $f(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n$ be a polynomial of degree $n$ with $a_0 \neq 0$. Let $x_1, \ldots, x_n$ be the roots counted according to multiplicity and let $s_{-k} = x_1^{-k} + \cdots + x_n^{-k}$ for all $k > 0$. Then for all $j > 0$
$$j a_j + a_{j-1} s_{-1} + a_{j-2} s_{-2} + \cdots + a_0 s_{-j} = 0.$$

It should be noted from this result that although the number of roots and the actual roots depend on the degree and all the coefficients, the sum of the $j$th powers of the reciprocals of the roots depends only on the coefficients $a_0, a_1, \ldots, a_j$ when $j < n$, and is independent of the degree. This observation motivated the mathematician Euler to apply this last corollary to power series. Most modern mathematicians will say that Euler's argument is wrong, but his results are correct.

Euler wanted to calculate the number
$$\zeta(n) = \sum_{k=1}^{\infty} \frac{1}{k^n}$$
for $n$ a positive even integer. To this end he started with the series
$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
and noted that $\sin(x)$ has roots $k\pi$ for all integers $k$. Dividing by $x$ eliminates the root at $0$, and replacing $x$ by $\sqrt{x}$ eliminates the negative roots, according to Euler. Thus Euler argued that
$$f(x) = \frac{\sin(\sqrt{x})}{\sqrt{x}} = 1 - \frac{x}{3!} + \frac{x^2}{5!} - \frac{x^3}{7!} + \cdots$$
has roots at $(k\pi)^2$ for $k = 1, 2, 3, \ldots$. This is certainly correct; however (here is the questionable step!) Euler then set
$$s_{-j} = \sum_{k=1}^{\infty} \frac{1}{((k\pi)^2)^j} = \frac{1}{\pi^{2j}} \sum_{k=1}^{\infty} \frac{1}{k^{2j}} \tag{4.14}$$
and calculated the $s_{-j}$ using the Newton's Identities of the last corollary. For $j = 1$ he had $1 \cdot a_1 + a_0 s_{-1} = 0$, so $s_{-1} = -a_1 = \frac{1}{3!} = 1/6$. Multiplying equation (4.14) by $\pi^2$ gives
$$\zeta(2) = \sum_{k=1}^{\infty} \frac{1}{k^2} = \pi^2 s_{-1} = \frac{\pi^2}{6}$$
Next, using $2a_2 + a_1 s_{-1} + a_0 s_{-2} = 0$ gives $s_{-2} = \left(\frac{1}{3!}\right)^2 - \frac{2}{5!} = \frac{1}{90}$. Multiplying (4.14) by $\pi^4$ gives
$$\zeta(4) = \sum_{k=1}^{\infty} \frac{1}{k^4} = \pi^4 s_{-2} = \frac{\pi^4}{90}$$
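Euler's computation can be mechanized with Corollary 4.7.4. A Python sketch (ours), using $a_j = (-1)^j/(2j+1)!$, the coefficients of $\sin(\sqrt{x})/\sqrt{x}$:

```python
# s_{-j} = zeta(2j)/pi^(2j) via j*a_j + a_{j-1} s_{-1} + ... + a_0 s_{-j} = 0.
import math
from fractions import Fraction

def zeta_over_pi_power(count):
    a = [Fraction((-1) ** j, math.factorial(2 * j + 1)) for j in range(count + 1)]
    s = []                                   # s[j-1] holds s_{-j}
    for j in range(1, count + 1):
        total = j * a[j] + sum(a[j - i] * s[i - 1] for i in range(1, j))
        s.append(-total / a[0])
    return s

s = zeta_over_pi_power(2)
print(s)                                     # -> [Fraction(1, 6), Fraction(1, 90)]
print(float(s[0]) * math.pi ** 2)            # zeta(2) = pi^2/6, about 1.6449
```

Running it with a larger `count` produces the higher even zeta values asked for in the exercise below.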
It should be mentioned that there is still no simple formula for odd values of $\zeta$; for example $\zeta(3) = \sum_{k=1}^{\infty} \frac{1}{k^3}$ is known only numerically, and only in the last few years has it even been shown to be irrational.

Exercise 23c [10 points] Assuming that Euler's calculation is correct, find $\zeta(6)$ and $\zeta(8)$.
4.8 Symmetric Polynomials
The expressions $s_k = x_1^k + \cdots + x_n^k$ are examples of symmetric polynomials. Symmetric polynomials played a large role in the development of the modern theory of solvability of polynomials. The 18th century mathematician Edward Waring is often credited for much of the development of the theory of symmetric polynomials.

More generally we start with a polynomial $f(x_1, x_2, \ldots, x_n)$ in $n$ variables $x_j$. This means that $f(x_1, x_2, \ldots, x_n)$ is a sum of terms, each one a constant times positive powers of some or all of the variables $x_j$. $f(x_1, x_2, \ldots, x_n)$ is symmetric if any permutation of the variables leaves the result unchanged. More precisely, $f(x_1, x_2, \ldots, x_n)$ is symmetric if for all $1 \leq j < k \leq n$
$$f(\ldots, x_j, \ldots, x_k, \ldots) = f(\ldots, x_k, \ldots, x_j, \ldots).$$
For example
$$f(x_1, x_2, x_3) = x_1^2 x_2^2 x_3 + x_1^2 x_2 x_3^2 + x_1 x_2^2 x_3^2$$
is a symmetric polynomial, but
$$g(x_1, x_2, x_3) = x_1 x_2^2 x_3^3$$
is not, for $g(x_2, x_1, x_3) = x_2 x_1^2 x_3^3 \neq g(x_1, x_2, x_3)$.

Consider a polynomial $f(t) = t^3 + p_1 t^2 + p_2 t + p_3$ of degree three with roots $x_1, x_2, x_3$. Then $f(t)$ factors as $f(t) = (t - x_1)(t - x_2)(t - x_3)$. Multiplying this last expression back out we get $f(t) = t^3 - (x_1 + x_2 + x_3)t^2 + (x_1 x_2 + x_1 x_3 + x_2 x_3)t - x_1 x_2 x_3$. Thus we conclude that
$$p_1 = -(x_1 + x_2 + x_3)$$
$$p_2 = x_1 x_2 + x_1 x_3 + x_2 x_3$$
$$p_3 = -x_1 x_2 x_3$$
We note that the coefficients $p_1, p_2, p_3$ are symmetric functions of the roots $x_1, x_2, x_3$, which should not be surprising since the order in which we listed the roots was clearly irrelevant. This pattern holds also for polynomials of higher degree and was well known to mathematicians of the 17th and 18th century.

Theorem 4.8.1 Let $f(t) = t^n + p_1 t^{n-1} + \cdots + p_{n-1} t + p_n$ be a monic polynomial of degree $n$ with roots $x_1, x_2, \ldots, x_n$. Then
$$p_1 = -(x_1 + x_2 + \cdots + x_n) = -\sum_{1 \leq j \leq n} x_j$$
$$p_2 = x_1 x_2 + \cdots + x_{n-1} x_n = \sum_{1 \leq j < k \leq n} x_j x_k$$
$$p_3 = -(x_1 x_2 x_3 + \cdots + x_{n-2} x_{n-1} x_n) = -\sum_{1 \leq j < k < \ell \leq n} x_j x_k x_\ell$$
$$\vdots$$
$$p_n = (-1)^n x_1 x_2 \cdots x_n$$
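The definition of symmetry can be checked by brute force over all permutations of the arguments. The following Python sketch (ours) tests the two example polynomials above at a few sample points; such a test can refute symmetry, and passing it at generic points is strong evidence for it:

```python
from itertools import permutations

def is_symmetric(poly, nvars, samples):
    return all(poly(*[x[i] for i in perm]) == poly(*x)
               for x in samples for perm in permutations(range(nvars)))

# f and g are the example polynomials of the text.
f = lambda x1, x2, x3: x1**2 * x2**2 * x3 + x1**2 * x2 * x3**2 + x1 * x2**2 * x3**2
g = lambda x1, x2, x3: x1 * x2**2 * x3**3

samples = [(1, 2, 3), (2, 5, 7), (-1, 4, 2)]
print(is_symmetric(f, 3, samples))   # -> True
print(is_symmetric(g, 3, samples))   # -> False
```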
Example 4.8.2 Find a polynomial with roots 1, 2, 3, 4. By the Theorem
$$p_1 = -(1 + 2 + 3 + 4) = -10$$
$$p_2 = 1 \cdot 2 + 1 \cdot 3 + 1 \cdot 4 + 2 \cdot 3 + 2 \cdot 4 + 3 \cdot 4 = 35$$
$$p_3 = -(1 \cdot 2 \cdot 3 + 1 \cdot 2 \cdot 4 + 1 \cdot 3 \cdot 4 + 2 \cdot 3 \cdot 4) = -50$$
$$p_4 = 1 \cdot 2 \cdot 3 \cdot 4 = 24$$
so $f(t) = t^4 - 10t^3 + 35t^2 - 50t + 24$.
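Rather than summing the elementary symmetric expressions directly, one can expand $(t - x_1)\cdots(t - x_n)$ one factor at a time. A Python sketch (ours) of that expansion, checked against Example 4.8.2:

```python
def coeffs_from_roots(roots):
    coeffs = [1]                       # the constant polynomial 1, highest power first
    for r in roots:
        # multiply the current polynomial by (t - r)
        coeffs = coeffs + [0]          # shift: multiply by t
        for i in range(len(coeffs) - 1, 0, -1):
            coeffs[i] -= r * coeffs[i - 1]
    return coeffs                      # [1, p1, p2, ..., pn]

print(coeffs_from_roots([1, 2, 3, 4]))   # -> [1, -10, 35, -50, 24]
```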
Example 4.8.3 Suppose we know for some reason (e.g. we started Graeffe's method) that $f(x) = x^3 - x^2 - x - 15$ has an imaginary root of modulus $\sqrt{5}$. We wish to find the roots. Call this root $x_1 = a + bi$. Then $x_2 = a - bi$ is also a root, so $|x_1|^2 = x_1 x_2 = 5$. Now $p_3 = -15 = -x_1 x_2 x_3 = -5x_3$, so we see easily that $x_3 = 3$. But $p_1 = -x_1 - x_2 - x_3 = -2a - 3 = -1$, so $-2a = 2$, or $a = -1$. Since $|x_1|^2 = 5 = a^2 + b^2$ it follows that $b^2 = 4$, so $b = \pm 2$. Thus the roots are $-1 + 2i$, $-1 - 2i$ and $3$.

Since Theorem 4.8.1 gives the close relation between the roots and the coefficients, one might hope that these formulas might give a way to find the roots given the coefficients. Unfortunately this will not work, but to some extent these formulas are the basis for all attempts after the days of Cardano.

In Theorem 4.8.1 we can view $p_1, p_2, \ldots$ as symmetric polynomials in the variables $x_1, \ldots, x_n$. These symmetric polynomials are often known in the literature as the elementary symmetric polynomials. Thus we can view Theorem 4.6.1 as saying that the symmetric functions $s_k = x_1^k + \cdots + x_n^k$ are polynomials in the elementary symmetric polynomials, e.g. as Girard noted $s_3 = -p_1^3 + 3p_1 p_2 - 3p_3$. What is much more surprising is the following:

Theorem 4.8.4 (Fundamental Theorem on Symmetric Polynomials) Let $f(x_1, x_2, \ldots, x_n)$ be a symmetric polynomial in $n$ variables with coefficients in an integral domain $R$. Then $f(x_1, x_2, \ldots, x_n)$ can be expressed as a polynomial in the elementary symmetric functions $p_1, p_2, \ldots, p_n$ with coefficients in $R$.

The coefficient ring $R$ will generally be $\mathbb{Z}$ or $\mathbb{Q}$. Proofs of this theorem are contained in the books by Uspensky and Lang; in addition Adams and Loustaunau sketch a modern proof as an exercise, and details of this proof can be found in Fine and Rosenberger. We will simply illustrate by some examples.

Example 4.8.5 In the two variable case the elementary symmetric functions are $p_1 = -x_1 - x_2$ and $p_2 = x_1 x_2$.
Consider the symmetric polynomial $D = (x_1 - x_2)^2$. Expanding, $D = x_1^2 - 2x_1 x_2 + x_2^2 = s_2 - 2p_2$ where $s_2 = x_1^2 + x_2^2$ as in §4.6. By Newton's Identities $s_2 = p_1^2 - 2p_2$, so $D = p_1^2 - 2p_2 - 2p_2 = p_1^2 - 4p_2$. Note that this is just the discriminant of the quadratic polynomial $f(t) = t^2 + p_1 t + p_2$.

Example 4.8.6 For a real cubic polynomial $f(t) = t^3 + p_1 t^2 + p_2 t + p_3$ the discriminant is
$$D = (x_1 - x_2)^2 (x_1 - x_3)^2 (x_2 - x_3)^2$$
where $x_1, x_2, x_3$ are the roots. It is a bit too much work to put down here, but it has been shown that
$$D = 18 p_1 p_2 p_3 - 4 p_1^3 p_3 + p_1^2 p_2^2 - 4 p_2^3 - 27 p_3^2.$$
If $p_1 = 0$ as in §4.1, then we simply have $D = -4p_2^3 - 27p_3^2$, as in Theorem 4.1.4.
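The cubic formula of Example 4.8.6 is easy to check numerically. The following Python sketch (ours) computes the discriminant of the cubic with roots 1, 2, 4 three ways: from the definition, from the formula, and from the determinant of power sums stated for general $n$ later in the text:

```python
def disc_from_roots(roots):
    D = 1
    for j in range(len(roots)):
        for k in range(j + 1, len(roots)):
            D *= (roots[j] - roots[k]) ** 2
    return D

roots = [1, 2, 4]
# (t-1)(t-2)(t-4) = t^3 - 7t^2 + 14t - 8, so p1, p2, p3 = -7, 14, -8
p1, p2, p3 = -7, 14, -8
formula = 18*p1*p2*p3 - 4*p1**3*p3 + p1**2*p2**2 - 4*p2**3 - 27*p3**2

s = [3] + [sum(r**k for r in roots) for k in (1, 2, 3, 4)]   # s0, ..., s4
hankel = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
def det3(m):
    return (m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0]))

print(disc_from_roots(roots), formula, det3(hankel))   # -> 36 36 36
```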
More generally, given a polynomial $f(t) = t^n + p_1 t^{n-1} + \cdots + p_n$ of degree $n$ with roots $x_1, \ldots, x_n$, the discriminant is defined by
$$D = (x_1 - x_2)^2 (x_1 - x_3)^2 \cdots (x_{n-1} - x_n)^2,$$
i.e. the product of all differences $(x_j - x_k)^2$ for $j < k$. The general formula, which requires some advanced techniques to justify, is
$$D = \det \begin{pmatrix} s_0 & s_1 & \cdots & s_{n-1} \\ s_1 & s_2 & \cdots & s_n \\ s_2 & s_3 & \cdots & s_{n+1} \\ \vdots & \vdots & \ddots & \vdots \\ s_{n-1} & s_n & \cdots & s_{2n-2} \end{pmatrix}$$
where as in §4.6 $s_k = x_1^k + \cdots + x_n^k$. Newton's Identities can then be used to express $D$ in terms of the $p_j$'s. Note in particular that if the coefficients $p_j$ are real then $D$ is real, and if the $p_j$ are all integers then $D$ is an integer. The generalization of Theorem 4.1.4 is

Theorem 4.8.7 Let $D$ be the discriminant of $f(t)$ as above, where the coefficients $p_j$ are real. Then $f(t)$ has multiple roots if and only if $D = 0$. Otherwise, if $D > 0$ then $f(t)$ has an even number of pairs of imaginary roots; if $D < 0$ then $f(t)$ has an odd number of pairs of imaginary roots.

It should go without saying that except for $n = 2, 3$ calculating the discriminant is a terrible way to tell if $f(t)$ has multiple roots or to count real roots.

We will need a few more calculations for the next section:

Example 4.8.8 Let $f(t) = t^4 + p_1 t^3 + p_2 t^2 + p_3 t + p_4$ be a biquadratic polynomial with roots $x_1, x_2, x_3, x_4$. Consider
$$A = x_1 + x_2 - x_3 - x_4$$
$$B = x_1 - x_2 + x_3 - x_4$$
$$C = x_1 - x_2 - x_3 + x_4$$
Let $a = A^2$, $b = B^2$ and $c = C^2$. Then $a + b + c$, $ab + ac + bc$ and $ABC$ are all seen to be symmetric polynomials in $x_1, \ldots, x_4$. We can calculate
$$a + b + c = 3p_1^2 - 8p_2 \tag{4.15}$$
$$ab + ac + bc = 3p_1^4 - 16p_1^2 p_2 + 16p_1 p_3 + 16p_2^2 - 64p_4 \tag{4.16}$$
$$ABC = -p_1^3 + 4p_1 p_2 - 8p_3 \tag{4.17}$$
We leave the first two calculations as rather hard exercises for the reader (see Uspensky, for example) and tackle only the third. We note that, multiplying out, we can have 3 types of terms: $x_j^3$, $x_j^2 x_k$ and $x_j x_k x_\ell$, where in each term $j, k, \ell$ are different. Note that $x_1$ occurs only with "+" signs, but each of $x_2, x_3, x_4$ occurs exactly twice with "−" signs in the expressions $A, B, C$. Thus the expansion of $ABC$ will contain the terms $x_j^3$ for each $j$, so it will contain $s_3 = x_1^3 + x_2^3 + x_3^3 + x_4^3$.

To get a term of the form $x_j^2 x_k$ we must pick two of $A, B, C$ to pick out the $j$, and the $k$ comes from the third. There are 3 ways to do this. It can be seen that two of the ways give a "−" sign but the third gives a "+", so each of the 12 terms $x_j^2 x_k$ occurs with a coefficient of $-1$ in the expansion. Now note, for example, when $j = 1$:
$$x_1^2 x_2 + x_1^2 x_3 + x_1^2 x_4 = x_1^2 x_1 + x_1^2 x_2 + x_1^2 x_3 + x_1^2 x_4 - x_1^3 = x_1^2(-p_1) - x_1^3.$$
Repeating this for $j = 2, 3, 4$, adding, and multiplying by the $-1$, we see that the contribution of the $x_j^2 x_k$ terms in $ABC$ is
$$(x_1^2 + x_2^2 + x_3^2 + x_4^2)p_1 + (x_1^3 + x_2^3 + x_3^3 + x_4^3) = s_2 p_1 + s_3.$$

Finally we have terms of the form $x_j x_k x_\ell$, where we can take $j < k < \ell$. Note that each such term can be generated 6 ways, i.e. we can choose the $j$ from either factor $A$, $B$ or $C$, then we have only two factors from which to choose the $k$, and we must choose the $\ell$ from the remaining factor. By careful inspection, we see that 4 ways give "+" signs and 2 ways give "−" signs; thus the contribution of the $x_j x_k x_\ell$ terms in the product $ABC$ is
$$2x_1 x_2 x_3 + 2x_1 x_2 x_4 + 2x_1 x_3 x_4 + 2x_2 x_3 x_4 = -2p_3.$$

Thus we can conclude that
$$ABC = s_3 + (s_2 p_1 + s_3) - 2p_3 = 2s_3 + s_2 p_1 - 2p_3 = 2(-p_1^3 + 3p_1 p_2 - 3p_3) + (p_1^2 - 2p_2)p_1 - 2p_3 = -p_1^3 + 4p_1 p_2 - 8p_3$$
as claimed.

Exercise 23d [10 points] Write the symmetric polynomial $f(x_1, x_2, x_3) = x_1^3 x_2 x_3 + x_1 x_2^3 x_3 + x_1 x_2 x_3^3$ in three variables in terms of the elementary symmetric functions $p_1, p_2, p_3$ of three variables.
4.9 Lagrange's Solution of the Biquadratic
After Cardano's publication of del Ferro, Tartaglia and Ferrari's solutions of the cubic and biquadratic, many mathematicians tried to find similar methods for solving the quintic (5th degree) and higher degree equations. They failed, as we now know they must, and so, for the most part, history has not recorded their efforts. Several of the attempts were more noteworthy than the others, for example Vandermonde's almost correct solution (in 1770) of the cyclotomic equation $x^n - 1$ in radicals of degree less than $n$ (Gauss filled in the details in 1801). However, the most significant attempt was made by Lagrange. In order to attack higher degree polynomials he started with a detailed analysis of the solution of the cubic and biquadratic. The information he gained
was, of course, not enough to help him solve the quintic, but it laid the foundation for the proofs by Ruffini, Abel and Galois that the quintic and higher degree polynomials could not, in general, be solved by radicals. A good discussion of the history of these ideas can be found in B.L. van der Waerden's "A History of Algebra". Lagrange devised solution methods for the cubic and the biquadratic; these are given in Chapter XI of Uspensky. We give his solution of the biquadratic since, unlike the cubic, where Lagrange re-derives Cardano's equations (see Exercise 23e), the solution is given in a different, and more elegant, form than that of Ferrari.

We start with a polynomial
$$f(t) = t^4 + pt^3 + qt^2 + rt + s$$
where for notational simplicity we are using the letters $p, q, r, s$ instead of $p_1, p_2, p_3, p_4$ respectively. Let $x_1, x_2, x_3, x_4$ be the roots of $f(t)$. As in Example 4.8.8 we let
$$A = x_1 + x_2 - x_3 - x_4$$
$$B = x_1 - x_2 + x_3 - x_4 \tag{4.18}$$
$$C = x_1 - x_2 - x_3 + x_4$$
and $a = A^2$, $b = B^2$ and $c = C^2$. We then have
$$a + b + c = 3p^2 - 8q = u$$
$$ab + ac + bc = 3p^4 - 16p^2 q + 16pr + 16q^2 - 64s = v \tag{4.19}$$
$$abc = (p^3 - 4pq + 8r)^2 = w$$
From Theorem 4.8.1 it follows that $a, b, c$ are the roots of the resolvent cubic
$$g(t) = t^3 - ut^2 + vt - w \tag{4.20}$$
The cubic equation $g(t) = 0$ can be solved by Cardano's method (or Lagrange's method in Exercise 23e), so $a, b, c$ can be calculated. Then $A, B, C$ can be found by taking square roots of $a, b, c$, being careful to select signs so that
$$ABC = -p^3 + 4pq - 8r$$
as required by Example 4.8.8. We then have equations (4.18) together with the equation
$$-p = x_1 + x_2 + x_3 + x_4 \tag{4.21}$$
Thus we have a system of 4 linear equations in the 4 unknowns $x_1, x_2, x_3, x_4$ which can
be solved once and for all by
$$x_1 = \frac{-p + A + B + C}{4}$$
$$x_2 = \frac{-p + A - B - C}{4}$$
$$x_3 = \frac{-p - A + B - C}{4}$$
$$x_4 = \frac{-p - A - B + C}{4}$$
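The whole method can be checked end to end on an example. The following Python sketch (ours) takes the quartic with roots 1, 2, 3, 4 from Example 4.8.2, verifies that $a, b, c$ are roots of the resolvent cubic written as $t^3 - ut^2 + vt - w$, verifies the sign condition on $ABC$, and recovers the roots from the four linear equations:

```python
# f(t) = t^4 - 10t^3 + 35t^2 - 50t + 24, i.e. p, q, r, s = -10, 35, -50, 24.
p, q, r, s = -10, 35, -50, 24
u = 3*p**2 - 8*q
v = 3*p**4 - 16*p**2*q + 16*p*r + 16*q**2 - 64*s
w = (p**3 - 4*p*q + 8*r) ** 2

A, B, C = 1+2-3-4, 1-2+3-4, 1-2-3+4           # -4, -2, 0 for roots 1, 2, 3, 4
for t in (A*A, B*B, C*C):                      # a, b, c are roots of the resolvent
    assert t**3 - u*t**2 + v*t - w == 0
assert A*B*C == -p**3 + 4*p*q - 8*r            # sign condition on the square roots

roots = sorted((-p + sA + sB + sC) / 4
               for sA, sB, sC in [(A, B, C), (A, -B, -C), (-A, B, -C), (-A, -B, C)])
print(roots)                                   # -> [1.0, 2.0, 3.0, 4.0]
```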
While this method is very elegant in theory, we warn the reader that in practice we may have $u \neq 0$ in the resolvent cubic, and in any case the resolvent cubic may have a messy solution, the square roots of which must then be calculated!

We end this section with an analysis of Lagrange's method. It is based on the following theorem, which is a special case of a theorem proved by Lagrange. This theorem deals with what we might call somewhat symmetric polynomials, in that some permutations, but perhaps not all, may leave the value unchanged.

Theorem 4.9.1 Let $g(x_1, x_2, \ldots, x_n)$ be a polynomial in $n$ variables. Suppose that under all permutations of the variables the polynomial takes on exactly $m$ different values $g_1 = g, g_2, \ldots, g_m$. Then there is a polynomial $f(t)$ of degree $m$, whose coefficients are polynomials in the elementary symmetric functions of $x_1, \ldots, x_n$, so that the roots of $f(t)$ are $g_1, g_2, \ldots, g_m$.

The idea of the proof is to construct the elementary symmetric functions $P_1, \ldots, P_m$ of the $g_j$, i.e. $P_1 = -(g_1 + g_2 + \cdots + g_m)$, ..., $P_m = (-1)^m g_1 g_2 \cdots g_m$, etc. It can be seen that the $P_j$ are actual symmetric functions of $x_1, \ldots, x_n$ and hence by Theorem 4.8.4 expressible in the elementary symmetric functions of $x_1, \ldots, x_n$. But by Theorem 4.8.1, $g_1, \ldots, g_m$ are roots of $f(t) = t^m + P_1 t^{m-1} + \cdots + P_m$.

In Lagrange's solution of the biquadratic we took for $g(x_1, \ldots, x_n)$ the somewhat symmetric function $a = (x_1 + x_2 - x_3 - x_4)^2$. It should be noted that permuting the variables gives exactly three different values, namely $a$, $b$ and $c$. Thus $a, b, c$ are roots of a polynomial whose coefficients are polynomials in $p, q, r$ and $s$, namely the resolvent polynomial.

In connection with Theorem 4.9.1, Lagrange noted that if one multiplied the number of different values taken by $g(x_1, \ldots, x_n)$ by the number of permutations which leave $g(x_1, \ldots, x_n)$ unchanged, the product will always be $n!$, the number of all permutations
of the $n$ variables. For example, in the paragraph above, $a$ takes on 3 distinct values under permutations, and there are 8 ways to permute the variables so that $a$ remains unchanged: $3 \cdot 8 = 24 = 4!$. At the time Lagrange lived, group theory had not yet been invented. When group theory was invented 100 years later, the mathematician Camille Jordan named a now famous theorem of group theory after Lagrange, because Lagrange's observation was simply a special case of this general theorem.

Exercise 23e [30 points] Derive Lagrange's solution of the cubic. Let $f(t) = t^3 + pt^2 + qt + r$ have roots $x_1, x_2, x_3$ and set $A = x_1 + \omega x_2 + \omega^2 x_3$ and $B = x_1 + \omega^2 x_2 + \omega x_3$, where $\omega = \frac{-1 + \sqrt{3}i}{2}$ is a cube root of 1. Show that $a = A^3$ takes on only 2 values under permutations of the roots, namely $a$ and $b = B^3$, and thus these two values are roots of the resolvent quadratic. Find the coefficients of the resolvent quadratic in terms of $p, q$ and $r$. Note also that $AB$ is a symmetric function of the roots, so calculate $AB$ in terms of $p, q, r$. Then solving the resolvent quadratic, taking appropriate cube roots of $a, b$ to get the correct $AB$, the formulas for $A, B$ and $-p = x_1 + x_2 + x_3$ give three linear equations in 3 unknowns which can be solved to obtain Cardano's Equations.
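Lagrange's counting observation can be confirmed by brute force. A Python sketch (ours) for $a = (x_1 + x_2 - x_3 - x_4)^2$, evaluated at one generic sample point:

```python
from itertools import permutations

x = (1, 3, 7, 15)                    # generic values for x1, x2, x3, x4
a = lambda x1, x2, x3, x4: (x1 + x2 - x3 - x4) ** 2

values = {a(*[x[i] for i in perm]) for perm in permutations(range(4))}
stabilizer = sum(1 for perm in permutations(range(4))
                 if a(*[x[i] for i in perm]) == a(*x))
print(len(values), stabilizer, len(values) * stabilizer)   # -> 3 8 24
```

The 8 permutations fixing $a$ are those preserving the partition $\{x_1, x_2\}, \{x_3, x_4\}$ (including swapping the two pairs, since $a$ is a square), and $3 \cdot 8 = 24 = 4!$ as Lagrange observed.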
4.10 Insolvability of the Quintic
Lagrange had hoped that his study of solution methods of the cubic and biquadratic would lead to a solution of the quintic (5th degree polynomial equation). After all, by his Theorem, all one needed to find was a suitable somewhat symmetric function of 5 variables that took on exactly 4 different values under permutation of the variables. If this function was a power of a linear function of the variables, then with the additional equation $-p_1 = x_1 + \cdots + x_5$ he could then solve the "resolvent biquadratic" and take roots to obtain 5 equations in 5 unknowns which could be solved for the roots of the original polynomial. Unfortunately, such a function does not exist.

While Lagrange was optimistic that a solution method would be found, the Italian mathematician Paolo Ruffini realized that Lagrange's analysis could lead instead to a proof that no solution could exist. Ruffini claimed that he had such a proof in 1798, but many mathematicians were skeptical. What Ruffini actually proved was that Lagrange's method would not lead to a solution, but there was a gap in his argument that if Lagrange's method did not work, no method would work. In 1824 the 22 year old Niels Henrik Abel filled the gap in Ruffini's proof.

It is important to understand exactly what Ruffini and Abel proved. They started with variables $x_1, x_2, \ldots, x_5$ and defined $p_1, p_2, \ldots, p_5$ as in Theorem 4.8.1. The polynomial
$f(t) = t^5 + p_1 t^4 + \cdots + p_5$ is called the general quintic. The goal was to solve this in terms of $p_1, \ldots, p_5$ using only the algebraic operations of addition, subtraction, multiplication, division and the taking of roots (square roots, cube roots, etc.). In other words, the goal is to recover the variables $x_1, \ldots, x_5$ from the polynomials $p_1, \ldots, p_5$ using only algebraic operations. Of course, if this were possible, then by replacing the $p_j$ by the coefficients of an actual polynomial and then doing these algebraic operations, the numbers $x_1, \ldots, x_5$ obtained would be the actual roots.

For example, in the case of the quadratic, the general quadratic is $f(t) = t^2 + pt + q$ where $p = -(x_1 + x_2)$ and $q = x_1 x_2$. The general quadratic formula
$$x = \frac{-p \pm \sqrt{p^2 - 4q}}{2}$$
then recovers $x_1, x_2$ from $p, q$ as follows:
$$\frac{-p \pm \sqrt{p^2 - 4q}}{2} = \frac{x_1 + x_2 \pm \sqrt{(x_1 + x_2)^2 - 4x_1 x_2}}{2} = \frac{x_1 + x_2 \pm \sqrt{x_1^2 - 2x_1 x_2 + x_2^2}}{2} = \frac{x_1 + x_2 \pm \sqrt{(x_1 - x_2)^2}}{2} = \frac{x_1 + x_2 \pm (x_1 - x_2)}{2} = x_1 \text{ or } x_2$$
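This recovery can be checked numerically. A small Python sketch (ours):

```python
# Recover x1, x2 from p = -(x1 + x2) and q = x1*x2 via the quadratic formula.
import math

def recover(x1, x2):
    p, q = -(x1 + x2), x1 * x2
    d = math.sqrt(p * p - 4 * q)
    return sorted([(-p + d) / 2, (-p - d) / 2])

print(recover(3.0, 11.0))   # -> [3.0, 11.0]
```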
A solution of this type is called a solution by radicals for the general equation. What Ruffini and Abel proved is that a solution by radicals does not exist for the general quintic. In particular, there is no single solution method which works for all quintic equations.

Several questions still remain. First, perhaps while no one single solution method works for all quintics, maybe there are several methods, one of which would work for any given quintic. For instance, Cardano thought, wrongly as we now know, that 13 different methods were necessary to solve all cubics. Perhaps we need 13 methods to solve all quintics? There certainly are some types of quintics which can be solved; for example, the cyclotomic equation $x^5 - 1 = 0$ was solved by radicals by Vandermonde. Perhaps different methods would solve other types of quintics. The first question then is whether this is actually true.

Even if it were not possible to have a finite list of solution methods covering all quintics, one would surely expect that for any given quintic with rational coefficients the roots would be algebraic expressions involving rational numbers, sums, differences, products, quotients and roots of various orders. So the second question is: "is this true?"
In 1831 Evariste Galois showed that in fact the answer to both questions is no! For example, not only is there no algebraic method to find the roots of $x^5 - 6x + 3 = 0$, but the roots cannot be expressed in terms of radicals. Galois went much further than this, by showing that this negative result applies also to polynomials of degree higher than 5. More importantly, he gave a method for determining whether or not a given polynomial could be solved by radicals (at least in principle, if not in practice). Galois' method, like the method of Ruffini and Abel following Lagrange, involves permuting the roots of the polynomial. But unlike Lagrange, Galois does not allow all permutations of the roots, only a group of permutations which somehow preserve the algebra of numbers which can be built up from the roots (the set of such numbers is called the root field of the polynomial). Thus Galois replaces the polynomial by its root field, and then replaces the root field by the abstract algebraic object now known as the Galois group, and shows that solvability of the polynomial is equivalent to some facts about the structure of the group. Galois' proof method, now known as Galois Theory, remains to this day one of the most elegant theories in mathematics. As there are many good accounts in the mathematical literature we will not pursue this any further here. For the reader who wants a reasonably elementary introduction we recommend the account in Birkhoff and Mac Lane or the one in Fine and Rosenberger.

The technique of replacing one mathematical object (e.g. polynomials) by others easier to analyze (e.g. fields, groups) has become central in modern mathematics. In addition there are many important direct applications of Galois theory. In a modern research journal such as the Bulletin of the American Mathematical Society the name "Galois" appears as often as the name of any other single mathematician.
Yet Galois' work did not bring him any fame, or even recognition, in his lifetime. Galois sent two papers to Cauchy, who lost them. He sent one paper to Fourier, who promptly died, and this paper is also lost. His most important paper (1831) was given to the mathematicians Poisson and Lacroix to review, but they couldn't understand it. A year later Galois was shot in a duel, not yet 21 and not yet known. Finally, in 1846 the paper was published by Liouville in his journal. However, the importance of Galois' work did not become apparent to the mathematical public until 1870, when Jordan published his full account of Galois theory.