PRACTICAL MATHEMATICS

REAL FUNCTIONS, COMPLEX ANALYSIS, FOURIER TRANSFORMS, PROBABILITIES

Efthimios Kaxiras Department of Physics and School of Engineering and Applied Sciences Harvard University

PREFACE

This book is the result of teaching applied mathematics courses to undergraduate students at Harvard. The emphasis is on using certain concepts of mathematics for applications that often arise in scientific and engineering problems. In typical books of applied mathematics, when all the details of the mathematical derivations and subtleties are included, the text becomes too formal; students usually have a hard time extracting the useful information from the text in a way that is easy to digest. Here I tried to maintain some of the less-than-formal character of a lecture, with more emphasis on examples than on the exact statement and proof of mathematical theorems.

Mathematics is the language of science. Someone can learn a new language in a formal way and have a deep knowledge of the many aspects of the language (grammar, literature, etc.). This takes a lot of effort and, at least in the beginning, can be rather tiresome. On the other hand, one can pick up the essentials of a language and develop the ability to communicate effectively in this language without a deep knowledge of all its richness. The contents of this book are a study guide for such a quick familiarization with some topics in applied mathematics that can be very handy in science and engineering. It is a bit like learning a language at a street-smart level, without being exposed to its fine points. Of course much is missed this way. But the satisfaction of being able to quickly and effectively use the language for new experiences may compensate for this.

Contents

1 Functions of real variables
  1.1 Functions as mappings
  1.2 Continuity
  1.3 Derivative
  1.4 Integral

2 Power series expansions
  2.1 Taylor series
  2.2 Convergence of number series
  2.3 Convergence of series of functions

3 Functions of complex variables
  3.1 Complex numbers
  3.2 Complex variables
  3.3 Continuity, analyticity, derivatives
  3.4 Cauchy-Riemann relations and harmonic functions
  3.5 Branch points and branch cuts
  3.6 Mappings

4 Contour integration
  4.1 Complex integration
  4.2 Taylor and Laurent series
  4.3 Types of singularities - residue
  4.4 Integration by residues
  4.5 Problems

5 Fourier analysis
  5.1 Fourier expansions
    5.1.1 Real Fourier expansions
    5.1.2 Fourier expansions of arbitrary range
    5.1.3 Complex Fourier expansions
    5.1.4 Error analysis in series expansions
  5.2 The δ-function and the θ-function
    5.2.1 Definitions
    5.2.2 The δ-function and θ-function as limiting functions
  5.3 The Fourier transform
    5.3.1 Definition of the Fourier transform
    5.3.2 Properties of the Fourier transform
    5.3.3 Singular transforms
  5.4 Application to differential equations
    5.4.1 Diffusion equation
    5.4.2 Poisson equation
  5.5 Fourier analysis of signals

6 Probabilities and random numbers
  6.1 Probability distributions
    6.1.1 Binomial distribution
    6.1.2 Poisson distribution
    6.1.3 Gaussian or normal distribution
  6.2 Multivariable probabilities
  6.3 Conditional probabilities
  6.4 Random numbers
    6.4.1 Arbitrary probability distributions
    6.4.2 Monte Carlo integration

Chapter 1
Functions of real variables

In this chapter we give a very brief review of the basic concepts related to functions of real variables, as these pertain to our subsequent discussion of functions of complex variables. There are two aspects of interest: the fact that a function implies a mapping, and the fact that functions can be approximated by series expansions of simpler functions.

1.1 Functions as mappings

A function f(x) of the real number x is defined as the mathematical object which for a given value of the variable x takes some other real value. The variable x is called the argument of the function. Some familiar examples of functions are:

f(x) = c,  c : real constant   (1.1)
f(x) = ax + b,  a, b : real constants   (1.2)
f(x) = a_0 + a_1 x + a_2 x^2 + · · · + a_n x^n,  a_n : real constants   (1.3)

The first is a constant function, the second represents a line, and the third is a general polynomial in powers of the variable x, called an nth-order polynomial after the largest power of x that appears in it.

A useful set of functions of a real variable are the trigonometric functions. Since we will be dealing with these functions frequently and extensively in the following chapters, we recall here some of their properties. The two basic trigonometric functions are defined by the projections onto the vertical and horizontal axes of a radius of the unit circle, centered at the origin, which lies at an angle θ with respect to the horizontal axis (θ here is measured in radians); these are called the "sine" (sin(θ)) and "cosine" (cos(θ)) functions, respectively. Additional functions can be defined by the line segments where the ray at angle θ with respect to the horizontal axis intersects the two axes that are parallel to the vertical and horizontal axes and tangent to the unit circle; these are called the "tangent" (tan(θ)) and "cotangent" (cot(θ)) functions, respectively. These definitions are shown in Fig. 1.1.

Figure 1.1: The definition of the basic trigonometric functions in the unit circle.

From the definitions, it is evident that the trigonometric functions are periodic in the argument θ with a period of 2π, that is, if the argument of any of these functions is changed by an integer multiple of 2π the value of the function does not change:

sin(θ + 2kπ) = sin(θ),  cos(θ + 2kπ) = cos(θ),  k : integer   (1.4)

The following relation links the values of the sine and cosine functions:

sin^2(θ) + cos^2(θ) = 1   (1.5)

From their definition, we can deduce

tan(θ) = sin(θ)/cos(θ),  cot(θ) = cos(θ)/sin(θ)   (1.6)

Moreover, by geometric arguments, a number of useful relations between trigonometric functions of different arguments can be easily established,

for example:

cos(θ1 + θ2) = cos(θ1) cos(θ2) − sin(θ1) sin(θ2)  ⇒  cos(2θ) = cos^2(θ) − sin^2(θ)   (1.7)
sin(θ1 + θ2) = cos(θ1) sin(θ2) + sin(θ1) cos(θ2)  ⇒  sin(2θ) = 2 cos(θ) sin(θ)   (1.8)

Another very useful function is the exponential exp(x) = e^x, defined as a limit of a simpler function that involves powers of x:

f(x) = e^x = lim_{N→∞} (1 + x/N)^N   (1.9)

Other important functions that are defined through the exponential are the so-called hyperbolic sine (sinh(x)) and hyperbolic cosine (cosh(x)) functions:

f(x) = cosh(x) = (e^x + e^{−x})/2   (1.10)
f(x) = sinh(x) = (e^x − e^{−x})/2   (1.11)

Note that, from their definition, it is easy to show that the hyperbolic sine and cosine functions satisfy

cosh^2(x) − sinh^2(x) = 1   (1.12)

which is reminiscent of the relation between the sine and cosine functions, Eq. (1.5). We can also define the so-called hyperbolic tangent (tanh(x)) and cotangent (coth(x)) functions as:

tanh(x) = sinh(x)/cosh(x) = (e^x − e^{−x})/(e^x + e^{−x})   (1.13)
coth(x) = cosh(x)/sinh(x) = (e^x + e^{−x})/(e^x − e^{−x})   (1.14)

by analogy to the trigonometric sine and cosine functions. The exponential and the hyperbolic functions are not periodic functions of the argument x.

Figure 1.2: The exponential function and the hyperbolic cosine and sine functions for values of the argument in the range [−2, 2].
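As a quick numerical illustration (my own sketch, not part of the original text; the sample values are arbitrary), the following Python lines check the limit definition of the exponential, Eq. (1.9), and the hyperbolic identity, Eq. (1.12):

import math

x = 1.3
for N in (10, 1000, 100000):
    approx = (1.0 + x / N) ** N          # finite-N version of Eq. (1.9)
    print(N, approx, math.exp(x))        # approaches e^x as N grows

# hyperbolic identity, Eq. (1.12), checked at a few points
for x in (-2.0, 0.5, 3.0):
    print(math.cosh(x)**2 - math.sinh(x)**2)   # 1.0 up to rounding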

The inverse of the mathematical object we called a function is not necessarily a function. For the first example, Eq. (1.1), every value of the variable x gives the same value for the function, namely c; that is, if we know the value of the argument we know the value of the function, as it should be according to the definition, for all values of x in the domain of the function (the domain being the set of x values where the function is well defined). But, if we know the value of the function f(x), we cannot tell from which value of the argument it was obtained, since all values of the argument give the same value for the function. In some of the other examples, the relation can be inverted: if we know the value of the function f(x) = y, we can uniquely determine the value of the argument that produced it, as in the second example, Eq. (1.2):

y = ax + b  ⇒  x = (y − b)/a   (1.15)

assuming a ≠ 0 (otherwise the function would be a constant, just like Eq. (1.1)). If the relation can be inverted, then we can obtain x from the inverse function, denoted by f^{−1}, evaluated at the value of the original function y:

f(x) = y  ⇒  x = f^{−1}(y)   (1.16)

This brings us to the idea of mapping. A function represents the mapping of the set of real values (denoted by the argument x) to another set of real values, which we denote by f(x) or y. A given value of the argument x gets mapped to a precise value, that is, it corresponds to some value of the function. Thus, a function takes us unambiguously from the value of x to the value of y = f(x), but not necessarily in the opposite direction (from y to x). If each value of x gets mapped to one and only one value y, the relation can be inverted; otherwise it cannot.

The mapping can be visualized by plotting the values of the function y = f(x) on a vertical axis, for each value of the argument x on a horizontal axis. Examples of plots produced by the familiar functions in Eqs. (1.9)-(1.11) are shown in Fig. 1.2. These plots give a handy way of determining whether the relation can be inverted or not: draw a horizontal line (parallel to the x axis) that intersects the vertical axis at an arbitrary value. If this line intersects the plot of the function only once, then the relation can be inverted (as in the second example, Eq. (1.2)); otherwise it cannot (as in the first example, Eq. (1.1)). In fact, it is obvious from the plot of the exponential function, Eq. (1.9), that in this case the relation can be inverted to produce another function: the inverse of the exponential function is the familiar natural logarithm,

f(x) = e^x  ⇒  f^{−1}(x) = ln(x),  domain of exp(x): x ∈ (−∞, +∞)   (1.17)

The domain of the exponential is the entire real axis (all real values of x),

but exp(x) takes only real positive values, x ∈ (0, +∞), with the lower bound corresponding to the value of the exponential at the lowest value of its argument (−∞) and the upper bound corresponding to the value of the exponential at the highest value of its argument (+∞). This means that the domain of the logarithm is only the positive real axis:

domain of ln(x): x ∈ (0, +∞)

1.2 Continuity

An important notion in studying the behavior of a function is that of continuity. Crudely speaking, a function is called continuous if its values do not make any jumps as the value of the argument is varied smoothly within its domain; otherwise it is called discontinuous. To make this notion more precise, we require that as the value of the argument x approaches some specific value x0, the value of the function f(x) approaches the value which it takes for x0, that is, f(x0). In mathematical terms, this is expressed as follows: for every value of ε > 0, there exists a value of δ > 0 such that when x is closer than δ to the specific value x0, f(x) is closer than ε to the corresponding value f(x0):

|x − x0| < δ  ⇒  |f(x) − f(x0)| < ε   (1.18)

Usually, we can think of ε as being very small; when ε becomes smaller we take δ to be correspondingly smaller, but this is not mandatory. Since we require this to hold for any value ε > 0, in fact ε → 0, this means that for a function which is continuous near x0 we can always find values of f(x) arbitrarily close to f(x0). This concept is illustrated in Fig. 1.3.

Figure 1.3: Illustration of the continuity argument: when x is closer than δ to x0, f(x) gets closer than ε to f(x0).

A counter-example is the function

f(x) = 1/x

in the neighborhood of x0 = 0: when x → 0+ the value of f(x) → +∞, and when x → 0− the value of f(x) → −∞. Therefore, if we change the argument of the function by some amount δ we can jump from a very large positive value of f(x) to a very large negative value, which means f(x) cannot be restricted within an arbitrary value ε > 0 of f(x0); in fact, we cannot even assign a real value to f(x0) for x0 = 0. Therefore, this function is discontinuous at x0 = 0. Note that it is continuous at every other real value of x. To show this explicitly, for a given ε we take the value of δ to be:

δ = ε|x0|^2 / (1 + ε|x0|)  ⇒  ε = δ / [(|x0| − δ)|x0|]

and consider what happens for |x − x0| < δ. First, we have:

|x0| = |x0 − x + x| ≤ |x0 − x| + |x| < δ + |x|  ⇒  |x| > |x0| − δ  ⇒  1/|x| < 1/(|x0| − δ)

where we have used the triangle inequality (see Problem 1) to obtain the first inequality above. For this proof to be valid, we must have |x0| > δ, which is always possible to satisfy as long as x0 ≠ 0, since ε and δ must both be greater than zero; with our choice of δ, (|x0| − δ) is always positive as long as x0 ≠ 0. Using the last relationship together with |x − x0| < δ we obtain:

|f(x) − f(x0)| = |1/x − 1/x0| = |x − x0| / (|x||x0|) < δ / [(|x0| − δ)|x0|] = ε

that is, for any value of ε we have managed to satisfy the continuity requirement, Eq. (1.18).
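A small numerical experiment (my own illustration, with an arbitrary point and tolerance) makes the ε-δ argument concrete: using the δ chosen above for f(x) = 1/x at a point x0 ≠ 0, the difference |f(x) − f(x0)| indeed stays below ε for all sampled x with |x − x0| < δ:

x0, eps = 0.5, 1e-3
delta = eps * x0**2 / (1 + eps * abs(x0))     # the choice of delta from the text

worst = 0.0
for k in range(1, 10001):
    x = x0 - delta + 2 * delta * k / 10001.0  # stays strictly inside (x0 - delta, x0 + delta)
    worst = max(worst, abs(1.0/x - 1.0/x0))
print(worst, eps, worst < eps)                # the continuity bound is satisfied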

1.3 Derivative

The notion of continuity makes it feasible to define the derivative of a function:

f'(x0) = df/dx (x0) ≡ lim_{x→x0} [f(x) − f(x0)] / (x − x0)   (1.19)

(the notation f' is used interchangeably with the notation df/dx). When the function is continuous at x = x0 this limit gives a finite value, so the derivative is properly defined. Conversely, if the function is discontinuous, the numerator in the limit can be finite (or even infinite) as the denominator tends to zero, which makes it impossible to assign a precise value to the limit, so that the derivative does not exist. If the limit exists for all values of x in a domain, then the derivative is itself a proper function in this domain. As an example, the derivative of the function 1/x is:

d(1/x)/dx (x0) = lim_{x→x0} (1/x − 1/x0) / (x − x0) = lim_{x→x0} (x0 − x) / [(x − x0) x x0] = −1/x0^2   (1.20)

which holds for every value of x0 except x0 = 0, where the function is discontinuous. Thus, the derivative of f(x) = 1/x is f'(x) = −1/x^2 for all x ≠ 0. The derivatives of some familiar functions are:

f(x) = x^a → f'(x) = a x^{a−1},  a : real   (1.21)
f(x) = e^x → f'(x) = e^x   (1.22)
f(x) = cosh(x) → f'(x) = sinh(x)   (1.23)
f(x) = sinh(x) → f'(x) = cosh(x)   (1.24)
f(x) = ln(x) → f'(x) = 1/x,  x > 0   (1.25)
f(x) = cos(x) → f'(x) = −sin(x)   (1.26)
f(x) = sin(x) → f'(x) = cos(x)   (1.27)

Two useful relations in taking derivatives are the rule for the derivative of a product and the rule for the derivative of a function of a function (the chain rule):

d/dx [f(x) g(x)] = f'(x) g(x) + f(x) g'(x)   (1.28)
d/dx [f(g(x))] = f'(g(x)) g'(x)   (1.29)

The notation f'(g(x)) implies the derivative of the function f with respect to its argument, which in this case is g, and the evaluation of the resulting expression at g(x).

The notion of the derivative can be extended to higher derivatives, with the first derivative playing the role of the function in the definition of the second derivative, and so on:

f''(x0) = d^2 f/dx^2 (x0) = lim_{x→x0} [f'(x) − f'(x0)] / (x − x0)   (1.30)

For higher derivatives it becomes awkward to keep adding primes, so the notation f^(n) is often used for d^n f/dx^n, the nth derivative.

Using Eqs. (1.28) and (1.29), we can obtain the general expression for the derivative of the ratio of two functions. We first rewrite the ratio as the product

h(x)/g(x) = h(x) f(g(x)),  where  f(x) = 1/x

which with the help of Eqs. (1.28) and (1.29) gives

d/dx [h(x)/g(x)] = h'(x) f(g(x)) + h(x) f'(g(x)) g'(x)

and this, by using Eq. (1.21) with a = −1, produces:

d/dx [h(x)/g(x)] = h'(x)/g(x) − h(x) g'(x)/[g(x)]^2 = [h'(x) g(x) − h(x) g'(x)] / [g(x)]^2   (1.31)

Another generalization is the "partial derivative" of a function of several variables with respect to one of them. For instance, suppose that we are dealing with a function of two variables, f(x, y), and want to take its derivative with respect to x for x = x0, holding y constant at a value y = y0; this is called the "partial derivative with respect to x", evaluated at (x0, y0):

∂f/∂x (x0, y0) ≡ lim_{x→x0} [f(x, y0) − f(x0, y0)] / (x − x0)   (1.32)

and similarly for the partial derivative with respect to y evaluated at the same point:

∂f/∂y (x0, y0) ≡ lim_{y→y0} [f(x0, y) − f(x0, y0)] / (y − y0)   (1.33)

The geometric interpretation of the derivative f'(x) of the function f(x) is that it gives the slope of the tangent of the function at x. This is easily justified by the fact that the line described by the equation

q(x) = f(x0) + f'(x0)(x − x0)

passes through the point (x = x0, y = f(x0)), that is, q(x0) = f(x0), and has a slope of

[q(x) − q(x0)] / (x − x0) = f'(x0)

as illustrated in Fig. 1.4.
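The limit in Eq. (1.19) can be approximated numerically with a finite difference; the short Python check below (my own illustration, with an arbitrary sample point) compares such a quotient with the analytic derivatives listed above and with the product rule, Eq. (1.28):

import math

def numerical_derivative(f, x0, h=1e-6):
    # symmetric difference quotient approximating the limit in Eq. (1.19)
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

x0 = 0.7
print(numerical_derivative(math.exp, x0), math.exp(x0))      # Eq. (1.22)
print(numerical_derivative(math.log, x0), 1.0 / x0)          # Eq. (1.25)
print(numerical_derivative(math.sin, x0), math.cos(x0))      # Eq. (1.27)

# product rule, Eq. (1.28), for f(x) = x^2 sin(x)
g = lambda x: x**2 * math.sin(x)
print(numerical_derivative(g, x0), 2*x0*math.sin(x0) + x0**2*math.cos(x0))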

1.4 Integral

A notion related to the derivative is the integral of the function f(x). We define a function F(x) such that its infinitesimal increment at x, denoted by dF(x), is equal to f(x) multiplied by the infinitesimal increment in the variable x, denoted by dx:

dF(x) = f(x) dx  ⇒  F'(x) ≡ dF(x)/dx = f(x)   (1.34)

From this last relation, it is evident that finding the function F(x) is equivalent to determining a function whose derivative is equal to f(x).[1] Note that because the derivative of a constant function is zero, F(x) is defined up to an unspecified constant. Now consider the summation of all the values dF(x) takes, starting at some initial point xi and ending at the final point xf. Since these are successive infinitesimal increments in the value of F(x), this summation will give the total difference ∆F of the values of F(x) between the end points:

∆F = F(xf) − F(xi) = Σ_{x = xi}^{xf} dF(x)

Because this sum involves infinitesimal quantities, it has a special character, so we give it a new name, the definite integral, and use the symbol ∫ instead of the summation:

∫_{xi}^{xf} f(x) dx = F(xf) − F(xi)   (1.35)

If we omit the limits on the integration symbol, we call this quantity the indefinite integral, which implies finding the antiderivative of the function f(x) that appears under the ∫ symbol. In the definite integral, the difference between the values at the end points of integration cancels the unspecified constant involved in F(x). We can express the integral in terms of a usual sum of finite quantities if we break the interval xf − xi into N equal parts and then take the limit of N → ∞:

∆x = (xf − xi)/N,  xj = xi + j∆x,  j = 0, 1, ..., N,  x0 = xi,  xN = xf

∫_{xi}^{xf} f(x) dx = lim_{N→∞} [ Σ_{j=0}^{N} f(xj) ∆x ]   (1.36)

From this definition of the integral, we conclude that the definite integral is the area below the graph of the function from xi to xf; this is illustrated in Fig. 1.4.

Figure 1.4: Geometric interpretation of (a) the derivative as the slope of the tangent of f(x) at x = x0 and (b) the integral as the area below the graph of the function from x = xi to x = xf.

From this we can also infer that the definite integral will be a finite quantity if there are no infinities in the values of the function f(x) (called here the "integrand") in the range of integration, or if these infinities are of a special nature so that they cancel. Moreover, if the upper or lower limit of the range of integration is ±∞, then in order to obtain a finite value for the integral, the integrand in absolute value should vanish faster than 1/|x|, that is, as 1/|x|^a with a > 1, as this limit is approached.

As a simple exercise we apply these concepts to a very useful function, the exponential of −x^2, which is known as the gaussian function:

g_α(x) = e^{−α^2 x^2}

We can include a constant multiplicative factor in the exponential, chosen here to be α^2, which allows us to adjust the range of the domain in which the gaussian takes non-negligible values. The domain of the gaussian function is the entire real axis, but the function takes non-negligible values only for a relatively narrow range of values of |x| near x = 0, because for larger values of |x| the argument of the exponential becomes very large and negative, making the value of the function essentially zero. Using the rule for the derivative of a function of a function, Eq. (1.29), we find the derivative of the gaussian to be:

g'_α(x) = −2 α^2 x e^{−α^2 x^2}

where we have also used Eq. (1.21) with a = 2.

It is often useful to multiply the gaussian by a factor in front, to make the definite integral of the product over the entire domain exactly equal to unity (this is called "normalization" of the function). In the limit of α → ∞ the normalized gaussian tends to an infinitely sharp spike of infinite height which integrates to unity; this is a useful mathematical construct called a δ-function, which we will examine in great detail later. In order to find the constant factor which will ensure normalization of the gaussian function we must calculate its integral over all values of the variable x. One of the nice features of the gaussian is that we can obtain analytically the value of the definite integral over the entire domain. This integral is most easily performed by using Gauss' trick of taking the square root of the square of the integral and turning the integration over the cartesian coordinates (x, y) into one over the polar coordinates (r, θ), with the two sets of coordinates related by

x = r cos(θ),  y = r sin(θ)  →  r = √(x^2 + y^2),  θ = tan^{−1}(y/x)

These relations imply dx dy = r dr dθ, so that:

∫_{−∞}^{+∞} e^{−α^2 x^2} dx = [ ∫_{−∞}^{+∞} e^{−α^2 x^2} dx ∫_{−∞}^{+∞} e^{−α^2 y^2} dy ]^{1/2}
  = [ ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} e^{−α^2 (x^2 + y^2)} dx dy ]^{1/2}
  = [ ∫_0^{2π} ∫_0^{∞} e^{−α^2 r^2} r dr dθ ]^{1/2}
  = [ 2π · 1/(2α^2) ]^{1/2} = √π / α   (1.37)

Examples of gaussian functions with different values of α are shown in Fig. 1.5.

Figure 1.5: Examples of gaussian functions with α = 1 and 5. These plots do not include the normalization factors α/√π.

[1] For this reason F(x) is also referred to as the antiderivative of f(x).
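The Riemann-sum definition, Eq. (1.36), can be used to check the gaussian result, Eq. (1.37), numerically. The Python sketch below is my own illustration (the cutoff L and the number of subintervals N are arbitrary choices); truncating the infinite range to [−L, L] is harmless because the integrand is negligible outside a narrow region:

import math

def gaussian_integral(alpha, L=10.0, N=200000):
    # Riemann (midpoint) sum of exp(-alpha^2 x^2) over [-L, L], following Eq. (1.36)
    dx = 2 * L / N
    total = 0.0
    for j in range(N):
        x = -L + (j + 0.5) * dx
        total += math.exp(-(alpha * x)**2) * dx
    return total

for alpha in (1.0, 5.0):
    print(gaussian_integral(alpha), math.sqrt(math.pi) / alpha)   # Eq. (1.37)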

A general expression that uses the rules of differentiation and integration we discussed so far is the so-called "integration by parts". Suppose that the integrand can be expressed as a product of two functions, one of which is a derivative:

∫_{xi}^{xf} f'(x) g(x) dx = ∫_{xi}^{xf} { d/dx [f(x) g(x)] − f(x) g'(x) } dx
  = [f(x) g(x)]_{xi}^{xf} − ∫_{xi}^{xf} f(x) g'(x) dx   (1.38)

where we have used primes to denote first derivatives, we have employed the rule for differentiating a product, Eq. (1.28), and we introduced the notation

[h(x)]_{xi}^{xf} = h(xf) − h(xi)

An application of the integration by parts formula, Eq. (1.38), which will prove very useful in chapter 5, involves powers of x and sines or cosines of multiples of x. Consider the following definite integrals:

∫_0^π x^k cos(nx) dx,   ∫_0^π x^k sin(nx) dx

with k, n positive integers. Using the identities

d/dx sin(nx) = n cos(nx),   d/dx cos(nx) = −n sin(nx)

and the integration by parts formula, we find:

∫_0^π x^k cos(nx) dx = −(k/n) ∫_0^π x^{k−1} sin(nx) dx
  = k π^{k−1} (−1)^n / n^2 − [k(k−1)/n^2] ∫_0^π x^{k−2} cos(nx) dx   (1.39)

∫_0^π x^k sin(nx) dx = −π^k (−1)^n / n + (k/n) ∫_0^π x^{k−1} cos(nx) dx
  = −π^k (−1)^n / n − [k(k−1)/n^2] ∫_0^π x^{k−2} sin(nx) dx   (1.40)

where, to obtain the second result in each case, we have used the integration by parts formula once again. The usefulness of these results lies in the fact that the final expression contains an integrand which is similar to the original one with the power of x reduced by 2. Applying these results recursively, we can evaluate any integral containing powers of x and sines or cosines.
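The reduction formulas (1.39)-(1.40) translate directly into a short recursive routine. The sketch below is my own illustration (function names are hypothetical); it uses the elementary base cases ∫_0^π cos(nx) dx = 0 and ∫_0^π sin(nx) dx = (1 − (−1)^n)/n for integer n ≥ 1, and checks the result against a brute-force Riemann sum:

import math

def int_x_cos(k, n):
    # ∫_0^π x^k cos(nx) dx; one integration by parts (the boundary term vanishes)
    if k == 0:
        return 0.0
    return -(k / n) * int_x_sin(k - 1, n)

def int_x_sin(k, n):
    # ∫_0^π x^k sin(nx) dx, first line of Eq. (1.40)
    if k == 0:
        return (1 - (-1)**n) / n
    return -(math.pi**k) * (-1)**n / n + (k / n) * int_x_cos(k - 1, n)

def brute_force(f, N=200000):
    dx = math.pi / N
    return sum(f((j + 0.5) * dx) * dx for j in range(N))

k, n = 3, 2
print(int_x_cos(k, n), brute_force(lambda x: x**k * math.cos(n*x)))
print(int_x_sin(k, n), brute_force(lambda x: x**k * math.sin(n*x)))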

We give two more examples of definite integrals, in anticipation of integration methods based on complex analysis (see chapter 3), to demonstrate difficulties that may arise in their evaluation. The first example we consider is the definite integral

∫_0^∞ 1/(x^2 + 1) dx

We note that the integrand is always finite over the range of integration and it vanishes as ∼ 1/x^2 for x → ∞, which suggests that this integral should give a finite value. Indeed, with the change of variables

x = tan(t) = sin(t)/cos(t)  ⇒  dx = [cos^2(t) + sin^2(t)]/cos^2(t) dt = dt/cos^2(t)

we find for the original integral:

∫_0^∞ 1/(x^2 + 1) dx = ∫_0^{π/2} 1/[tan^2(t) + 1] · 1/cos^2(t) dt = ∫_0^{π/2} dt = π/2   (1.41)

Other than the inspired change of variables, this integral presents no particular difficulties.

The next definite integral we consider is the following:

∫_0^∞ 1/(x^2 − 1) dx

which is somewhat more demanding: for x → ∞ the integrand again vanishes as ∼ 1/x^2, but in this case there are infinities of the integrand as x approaches 1 from below or above. Fortunately, these can be made to cancel. To see this, just below x = 1 we write the variable x as x = 1 − ε with ε > 0, which gives

1/(x^2 − 1) = 1/[(1 − ε)^2 − 1] = 1/[ε(ε − 2)] → −1/(2ε)  for ε → 0

and similarly, just above x = 1 we write the variable x as x = 1 + ε with ε > 0, which gives

1/(x^2 − 1) = 1/[(1 + ε)^2 − 1] = 1/[ε(ε + 2)] → +1/(2ε)  for ε → 0

so that when we add the two contributions they cancel each other. Notice that this cancellation is achieved only if the value x = 1 is approached symmetrically from above and below, that is, if we use the same value of ε for x → 1− and x → 1+. In the general case where the integrand has an infinite value at a point x0 in the interval of integration x ∈ [a, b], this symmetric approach of the infinite value on either side of x0 is expressed as follows:

∫_a^b f(x) dx = lim_{ε→0} [ ∫_a^{x0−ε} f(x) dx + ∫_{x0+ε}^b f(x) dx ]   (1.42)

This way of evaluating the integral is called taking its "principal value".

If we attempted to solve the problem by a change of variables analogous to the one we made for the previous example, we might try first to separate the integral into two parts to facilitate the change of variables:

∫_0^∞ 1/(x^2 − 1) dx = ∫_0^1 1/(x^2 − 1) dx + ∫_1^∞ 1/(x^2 − 1) dx

and for the first part we could choose

x = tanh(t) = sinh(t)/cosh(t)  ⇒  dx = [cosh^2(t) − sinh^2(t)]/cosh^2(t) dt = dt/cosh^2(t)

with x = 0 ⇒ t = 0, x = 1 ⇒ t → ∞, where we have used Eq. (1.12), so that the first integral gives:

∫_0^1 1/(x^2 − 1) dx = ∫_0^∞ 1/[tanh^2(t) − 1] · 1/cosh^2(t) dt = −∫_0^∞ dt

while for the second part we could choose

x = coth(t) = cosh(t)/sinh(t)  ⇒  dx = [sinh^2(t) − cosh^2(t)]/sinh^2(t) dt = −dt/sinh^2(t)

with x = 1 ⇒ t → ∞, x → ∞ ⇒ t = 0, so that the second integral gives:

∫_1^∞ 1/(x^2 − 1) dx = ∫_∞^0 1/[coth^2(t) − 1] · (−1/sinh^2(t)) dt = ∫_0^∞ dt

This, even though from the preceding analysis we expect the result to be finite, has led to an impasse, because both integrals are infinite and we cannot decide what the final answer is. The problem in this approach is that we broke the integral into two parts and tried to evaluate them independently, because that facilitated the change of variables, while the preceding analysis suggested that the two parts must be evaluated simultaneously to make sure that the infinities are cancelled (at least in the "principal value" sense of the integral). This is an illustration of difficulties that may arise in the evaluation of real definite integrals using traditional approaches. The approaches we will develop in chapter 3 provide an elegant way of avoiding these difficulties.
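A numerical experiment (my own sketch; the cutoff R, the exclusion ε and the subinterval counts are arbitrary choices) illustrates the principal-value prescription, Eq. (1.42), for the second example: as ε shrinks, the symmetric sums settle to a small finite number rather than diverging; in fact the antiderivative (1/2) ln|(x − 1)/(x + 1)| vanishes at both x = 0 and x → ∞, so the principal value approaches zero:

def midpoint(f, a, b, N=200000):
    # simple midpoint Riemann sum over [a, b]
    dx = (b - a) / N
    return sum(f(a + (j + 0.5) * dx) for j in range(N)) * dx

f = lambda x: 1.0 / (x * x - 1.0)
R = 1000.0                          # finite upper cutoff for the numerical experiment
for eps in (1e-1, 1e-2, 1e-3):
    pv = midpoint(f, 0.0, 1.0 - eps) + midpoint(f, 1.0 + eps, 2.0) + midpoint(f, 2.0, R)
    print(eps, pv)                  # finite, and shrinking toward ~0 as eps decreases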

Problems

1. For any two real numbers x1, x2 show that:

|x1 + x2| ≤ |x1| + |x2|

This is known as the "triangle inequality". Show that this inequality can also be generalized to an arbitrary number of real numbers, that is,

|x1 + · · · + xn| ≤ |x1| + · · · + |xn|

2. Is the inverse of cos(x) a properly defined function, if cos(x) is defined in the domain x ∈ (−∞, +∞)? Is it a properly defined function, if cos(x) is defined in the domain x ∈ [0, 2π]? Is it a properly defined function, if cos(x) is defined in the domain x ∈ [0, π]? What about the inverse of sin(x) for the same three domains?

3. Prove that the derivative of the exponential function is given by the expression of Eq. (1.22), using the definition of the exponential given in Eq. (1.9).

4. Show that the even moments of the gaussian are given by the expressions below, by simply taking derivatives with respect to α^2 of both sides of Eq. (1.37):

∫_{−∞}^{∞} x^2 e^{−α^2 x^2} dx = (1/2) √π / α^3
∫_{−∞}^{∞} x^4 e^{−α^2 x^2} dx = (3/4) √π / α^5
∫_{−∞}^{∞} x^{2n} e^{−α^2 x^2} dx = [(2n − 1)!! / 2^n] √π / α^{2n+1}

where the symbol (2n − 1)!! denotes the product of all odd integers up to (2n − 1).

Chapter 2
Power series expansions

We are often interested in representing a function f(x) as a series expansion of other functions, gn(x), with n = 1, 2, ..., which can be manipulated more easily than the original function:

f(x) = Σ_{n=1}^{∞} cn gn(x)   (2.1)

By this we mean that the functions in the series are easier to evaluate, differentiate or integrate, and it is consequently easier to deal with these functions than with the original function. In general, the series expansion is an exact representation of the function only if we include an infinite number of terms. This approach is useful if the exact infinite series expansion can be truncated to a few terms which give an adequate approximation of the function.

There are three important issues in dealing with series expansions:

• What are the functions gn(x) that should be included in the series?
• What are the numerical coefficients cn accompanying the functions in the expansion?
• Does the series converge and how fast?

Typically, the choices in response to the above questions should be such that:

• The functions gn(x) are easy to work with (evaluate, differentiate, integrate).
• The coefficients cn can be obtained easily from the definition of gn(x) and the function f(x).

• The series must converge fast to f(x), so that truncation to the first few terms gives a good approximation.

2.1 Taylor series

The simplest terms on which a series expansion can be based are powers of the variable x, or more generally powers of (x − x0), where x0 is a specific value of the variable near which we want to know the values of the function. This is known as a "power series" expansion. For the case of power series, there exists an expression, known as the Taylor series expansion, which gives the best possible representation of the function in terms of powers of (x − x0). Another very useful expansion, the Fourier series expansion, will be discussed in detail in chapter 5.

The Taylor series expansion of a continuous and infinitely differentiable function of x around x0 in powers of (x − x0) is:

f(x) = f(x0) + (1/1!) f'(x0)(x − x0) + (1/2!) f''(x0)(x − x0)^2 + · · · + (1/n!) f^(n)(x0)(x − x0)^n + · · ·   (2.2)

with the first, second, ..., nth derivatives evaluated at x = x0. The term which involves the nth derivative is called the term of order n; the numerical coefficient accompanying this term is the inverse of the factorial n! = 1 · 2 · 3 · · · n. For x close to x0, the expansion can be truncated to the first few terms, giving a very good approximation for f(x) in terms of the function and its lowest few derivatives evaluated at x = x0. Using the general expression for the Taylor series, we obtain for the common exponential, logarithmic, trigonometric and hyperbolic functions:

e^x = 1 + x + (1/2) x^2 + (1/6) x^3 + · · · ,  x0 = 0   (2.3)
log(1 + x) = x − (1/2) x^2 + (1/3) x^3 + · · · ,  x0 = 0   (2.4)
cos x = 1 − (1/2) x^2 + (1/24) x^4 + · · · ,  x0 = 0   (2.5)
sin x = x − (1/6) x^3 + (1/120) x^5 + · · · ,  x0 = 0   (2.6)
tan x ≡ sin x / cos x = x + (1/3) x^3 + (2/15) x^5 + · · · ,  x0 = 0   (2.7)
cot x ≡ cos x / sin x = 1/x − (1/3) x − (1/45) x^3 + · · · ,  x0 = 0   (2.8)

cosh x = 1 + (1/2) x^2 + (1/24) x^4 + · · · ,  x0 = 0   (2.9)
sinh x = x + (1/6) x^3 + (1/120) x^5 + · · · ,  x0 = 0   (2.10)
tanh x ≡ sinh x / cosh x = x − (1/3) x^3 + (2/15) x^5 + · · · ,  x0 = 0   (2.11)
coth x ≡ cosh x / sinh x = 1/x + (1/3) x − (1/45) x^3 + · · · ,  x0 = 0   (2.12)

Another very useful power series expansion is the binomial expansion:

(a + b)^p = a^p + [p/1!] a^{p−1} b + [p(p−1)/2!] a^{p−2} b^2 + [p(p−1)(p−2)/3!] a^{p−3} b^3 + · · ·
  + [p(p−1)(p−2)···(p−(n−1))/n!] a^{p−n} b^n + · · ·   (2.14)

with a > 0, and b, p real. This can be easily obtained as a Taylor series expansion (see Problem 1). It is evident from this expression that for p = N, an integer, the binomial expansion can be written as:

(a + b)^N = Σ_{k=0}^{N} [N!/((N − k)! k!)] a^{N−k} b^k
  = a^N + [N!/((N−1)! 1!)] a^{N−1} b + [N!/((N−2)! 2!)] a^{N−2} b^2 + · · ·   (2.15)

When p = N, an integer, the binomial expansion is finite and has a total number of (N + 1) terms; if p is not an integer it has infinite terms.

Fig. 2.1 shows the exponential function near x0 = 0 and its approximation by the two and three terms of lowest order in the Taylor expansion. As is evident from this figure, the more terms we keep in the expansion, the better the approximation is over a wider range of values around x0.

Figure 2.1: The exponential function and its approximation by the Taylor expansion keeping terms of order up to one and two.

With the help of the Taylor expansion of the exponential and the binomial expansion, we can decipher the deeper meaning of the exponential function: it is the result of continuous compounding, which mathematically is expressed by the relation

lim_{N→∞} (1 + κt/N)^N = e^{κt}   (2.16)

where we use the variable t (multiplied by the constant factor κ) instead of x, to make the analogy to compounding with time more direct.
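Before continuing with the compounding argument, the comparison in Fig. 2.1 is easy to reproduce numerically; the short Python sketch below (my own illustration, with arbitrary sample points) evaluates e^x against its Taylor truncations of order one, two and three:

import math

def taylor_exp(x, order):
    # partial sum of Eq. (2.3) up to the given order
    return sum(x**k / math.factorial(k) for k in range(order + 1))

for x in (0.1, 0.5, 1.0):
    print(x, math.exp(x), [round(taylor_exp(x, m), 6) for m in (1, 2, 3)])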

The argument goes as follows: suppose that at time t = 0 we have a capital K, to which we add interest at a rate κ every time interval dt, so our original capital increases by a factor (1 + κdt) at every time interval.[1] Then we will have the following amounts at successive time intervals:

- at time 0 → K,
- at time dt → K(1 + κdt),
- at time 2dt → K(1 + κdt)^2,
- ...
- at time Ndt → K(1 + κdt)^N.

In order to reach the continuous compounding limit we will let dt → 0, and we choose N large enough, so that even though dt → 0, Ndt = t with t a finite value. With the help of the binomial expansion we get:

(1 + κdt)^N = Σ_{k=0}^{N} [N!/(k!(N − k)!)] (κdt)^k

The general term in this series can be written as:

[N!/(k!(N − k)!)] (κdt)^k = (1/k!) (κdt)^k (N − k + 1)(N − k + 2) · · · N

but since N is very large, each factor involving N in the parentheses is approximately equal to N, giving

[N!/(k!(N − k)!)] (κdt)^k ≈ (1/k!) (κdt)^k N^k = (1/k!) (κNdt)^k = (1/k!) (κt)^k

and for N → ∞, Ndt = t with t a finite value, we get:

Σ_{k=0}^{∞} (1/k!) (κt)^k = e^{κt}

where the last equality follows from the Taylor expansion of the exponential function.

[1] If we want to be very precise, we must assign dimensions of [1/time] to the constant κ: for dt = 1 day and κ = 0.01 [1/day], the capital would increase by a factor 1.01 at every time interval, as in 1% per day.

2.2 Convergence of number series

An important consideration when dealing with series expansions is whether the series converges or not. We are in principle interested in the convergence of series expansions of functions, such as the power series expansions we discussed above. However, it is easier to establish the basic concepts in the context of series of real numbers.

To discuss the issue of convergence of a number series, we refer first to the more familiar case of the convergence of a sequence. A sequence of numbers a1, a2, ..., an, ... is said to converge if for n → ∞ the terms get closer and closer to a well defined, finite value. For example, the sequence

2/1, 3/2, 4/3, 5/4, ...

defined by the following expression for the nth term

a_n = (n + 1)/n  ⇒  lim_{n→∞} a_n = 1

converges to the value 1 (for a large enough value of n, a_n comes arbitrarily close to unity). The precise definition of the convergence of a sequence is the following: a sequence a_n converges if for any ε > 0 we can find a value of the index n = N such that

|a_n − â| < ε,  ∀ n > N

where â is the limit (also called the limiting or asymptotic value) of the sequence. Since this must hold for any ε > 0, by taking ε → 0 the above definition ensures that beyond a certain value of the index n all terms of the sequence are within the narrow range â ± ε, that is, arbitrarily close to the limit of the sequence.

Now consider an infinite series, defined by

Σ_{n=1}^{∞} a_n

The issue here is whether or not the summation of the infinite number of terms a_n gives a well defined (finite) value. Notice that the problem comes from the infinite number of terms, as well as from the values that these terms take for large values of n. Accordingly, typically we are not interested in the sum of the first few terms, or for that matter in the sum of any finite number of terms, because this part always gives a finite number, as long as the terms themselves are finite. For this reason, we usually focus on what happens for the "tail" of the series, that is, the sum of the terms starting at some value of the index n and going all the way to infinity. We will therefore often denote the infinite series as a sum over n of terms a_n, without specifying the starting value of n and with the understanding that the final value of the index is always ∞.

In order to determine the convergence of an infinite series we can turn it into a sequence, by considering the partial sums:

s_n = Σ_{j=1}^{n} a_j

because then we can ask whether the sequence s_n converges. We use this concept to define the convergence of series: a series is said to converge if the sequence of partial sums converges, since the asymptotic term of the sequence s_n for n → ∞ is the same as the infinite series. If it does, so will the series; otherwise we say that the series diverges.

As an example, consider the sequence of numbers given by a_j = t^j, j = 0, 1, ..., with 0 < t < 1, and the corresponding series, given by

Σ_{j=0}^{∞} a_j = Σ_{j=0}^{∞} t^j = 1 + t + t^2 + t^3 + · · · + t^n + · · ·

which is known as the "geometric series". The corresponding sequence of partial sums is:

s_n = Σ_{j=0}^{n} t^j = 1 + t + t^2 + t^3 + · · · + t^n = (1 − t^{n+1})/(1 − t)

which for large n converges because

t < 1  ⇒  lim_{n→∞} t^n = 0  ⇒  lim_{n→∞} (1 − t^{n+1})/(1 − t) = 1/(1 − t)   (2.17)

which proves that the limit of the geometric series, for 0 < t < 1, is 1/(1 − t).

One useful thing to notice is that if the series of absolute values converges, then so does the original series:

Σ_n |a_n| : converges  ⇒  Σ_n a_n : converges   (2.18)

The reason for this is simple: the absolute values are all positive, whereas the original terms can be positive or negative. If all the terms have the same sign, then the sum is the same as summing all the absolute values with the overall sign outside. If the terms can have either sign, then their total sum in absolute value will necessarily be smaller than the sum of the absolute values, and therefore if the series of absolute values converges so does the original series. If the series Σ|a_n| converges, then we say that the series Σa_n converges absolutely (and of course it converges in the usual sense). Note, however, that if a series diverges absolutely we cannot conclude that it diverges.

A number of tests have been devised to determine whether or not a series converges by looking at the behavior of the nth term. We give some of these tests here.

1. Comparison test: Consider two series, Σa_n and Σb_n:

if |b_n| ≤ |a_n| and Σa_n converges absolutely  ⇒  Σb_n converges absolutely   (2.19)
if |a_n| ≤ |b_n| and Σa_n diverges absolutely  ⇒  Σb_n diverges absolutely   (2.20)

2. Ratio test: For the series Σa_n, if

|a_{n+1}| / |a_n| ≤ t < 1,  ∀ n > N  ⇒  the series converges

Proof: from the inequality obeyed by successive terms for n > N we obtain:

|a_{N+2}| ≤ t |a_{N+1}|
|a_{N+3}| ≤ t |a_{N+2}| ≤ t^2 |a_{N+1}|
|a_{N+4}| ≤ t |a_{N+3}| ≤ t^3 |a_{N+1}|
...

By summing the two sides of the above inequalities we obtain:

|a_{N+1}| + |a_{N+2}| + |a_{N+3}| + · · · ≤ |a_{N+1}| (1 + t + t^2 + · · ·) = |a_{N+1}| / (1 − t)

where we have used the result for the geometric series summation, Eq. (2.17). The last relation for the sum of a_n with n > N shows that the "tail" of the series of absolute values is bounded by a finite number, |a_{N+1}|/(1 − t), therefore the series converges absolutely.

3. Limit tests: For the series Σa_n, we define the following limits:

lim_{n→∞} |a_{n+1}|/|a_n| = L1,   lim_{n→∞} |a_n|^{1/n} = L2

if L1 or L2 < 1 → the series converges absolutely;
if L1 or L2 > 1 → the series diverges;
if L1 or L2 = 1 → we cannot determine convergence.

4. Integral test: If 0 < a_{n+1} ≤ a_n and f(x) is a continuous non-increasing function such that a_n = f(n), then the series Σ_{n=1}^{∞} a_n converges if and only if

∫_1^∞ f(x) dx < ∞

5. Dirichlet test: Suppose that we have a series with terms c_j which can be expressed as follows:

Σ_{j=1}^{∞} c_j = Σ_{j=1}^{∞} a_j b_j

Consider a_j and b_j as separate series: if the partial sums of Σa_j are bounded, and the terms b_j are positive and tend monotonically to 0, then the series Σc_j converges.

As an example of a series that satisfies the Dirichlet test, consider the series with terms given by:

c_j = cos[(j − 1)π/4] (1/j)

Making the obvious choices a_j = cos[(j − 1)π/4] and b_j = (1/j), and defining the partial sums as

s_n = Σ_{j=1}^{n} a_j

leads to the following table of values for a_n, b_n and s_n:

n    :  1     2        3        4       5      6        7       8      9    · · ·
a_n  :  1    1/√2      0      −1/√2    −1    −1/√2      0      1/√2    1    · · ·
s_n  :  1   1+1/√2   1+1/√2     1       0    −1/√2    −1/√2     0      1    · · ·
b_n  :  1    1/2      1/3      1/4     1/5    1/6      1/7     1/8    1/9   · · ·

As is evident from this table, the partial sums of Σa_j are bounded by the value (1 + 1/√2), and the b_j terms are positive and monotonically decreasing. From this we conclude that the original series Σc_j converges.
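To make the tests concrete, here is a short numerical sketch (my own illustration, not from the text): the partial sums of the geometric series approach 1/(1 − t), and the partial sums of the Dirichlet-test example settle down to a finite value even though the harmonic series itself diverges:

import math

t = 0.5
partial = 0.0
for n in range(200):
    partial += t**n
print(partial, 1.0 / (1.0 - t))           # geometric series, Eq. (2.17)

s = 0.0
for j in range(1, 100001):
    s += math.cos((j - 1) * math.pi / 4) / j
print(s)                                   # Dirichlet example: stays finite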

2.3 Convergence of series of functions

We turn next to the case of series of functions, as in Eq. (2.1). As in the case of series of numbers, we study the convergence of a series of functions by turning them into sequences through the partial sums:

s_n(x) = Σ_{j=1}^{n} c_j g_j(x)

where g_j(x) is the set of functions we have chosen for the expansion. Of course, for a given value of x this is just a series of real numbers, since given the value of x, each function g_n(x) takes on a specific real value. If the sequence s_n(x) converges to the limiting function ŝ(x), so will the series, because

Σ_{j=1}^{∞} c_j g_j(x) = lim_{n→∞} s_n(x) = ŝ(x)

In this case, the condition for convergence of the sequence s_n(x) is: for any ε > 0 and a given value of x, we can find N such that for all n > N, |s_n(x) − ŝ(x)| < ε. However, the dependence of this condition on x introduces another dimension to the problem, and it is useful to consider the convergence of the series as a function of x. This leads to the notion of uniform convergence, which has to do with convergence of the series for a range of values of x.

The definition of uniform convergence for x ∈ [x1, x2] is the following: for any ε > 0 we can find N such that |s_n(x) − ŝ(x)| < ε for all n > N and for all x in the interval of interest, x ∈ [x1, x2]. N always depends on ε (usually, the smaller the value of ε the larger the N we have to choose), but as long as we can find one N for all x in the interval [x1, x2] that satisfies this condition (that is, N is finite and does not depend on x), the sequence converges uniformly. The meaning of uniform convergence of the sequence s_n(x) for x ∈ [x1, x2] is that for ε > 0 we can find N such that for n > N all terms of s_n(x) lie within ±ε of ŝ(x), and this applies to all x in the interval [x1, x2].

As an example of uniform convergence we consider the sequence

s_n(x) = [n/(n + 1)] (1 − x),  for x ∈ [0, 1]

(we are not concerned here from what series this sequence was produced). These functions for n = 1, 2, 3 are illustrated in Fig. 2.2(a). We first need to determine the limit of the sequence:

ŝ(x) = lim_{n→∞} s_n(x) = lim_{n→∞} [n/(n + 1)] (1 − x) = (1 − x)

To determine whether the sequence is uniformly convergent, we examine

|s_n(x) − ŝ(x)| = |[n/(n + 1)](1 − x) − (1 − x)| = [1/(n + 1)] (1 − x)

Notice that because x ∈ [0, 1], we will have 0 ≤ (1 − x) ≤ 1, so that

|s_n(x) − ŝ(x)| = (1 − x)/(n + 1) ≤ 1/(n + 1)

Given an ε > 0, we want to find the value of N such that (1 − x)/(n + 1) < ε for n > N. But (1 − x)/(n + 1) ≤ 1/(n + 1), so it is sufficient to find N such that 1/(n + 1) < ε. If we choose

1/(N + 1) < ε  ⇒  N > 1/ε − 1

then 1/(n + 1) < 1/(N + 1) < ε and

|s_n(x) − ŝ(x)| ≤ 1/(n + 1) < 1/(N + 1) < ε

for n > N. Since our choice of N is independent of x, we conclude that the sequence s_n(x) converges uniformly.

As a counter-example of a sequence that does not converge uniformly, we consider the sequence

s_n(x) = nx / (1 + n^2 x^2),  x ∈ [0, ∞)

These functions for n = 1, 2, 3 are illustrated in Fig. 2.2(b).

Figure 2.2: Examples of (a) uniform and (b) non-uniform convergence of sequences s_n(x). In each case the limiting function is denoted by ŝ(x).

Again we first determine the limit ŝ(x): for a fixed value of x

ŝ(x) = lim_{n→∞} s_n(x) = lim_{n→∞} nx / (1 + n^2 x^2) = 0

To determine whether the sequence converges uniformly or not, we examine the behavior of

|s_n(x) − ŝ(x)| = nx / (1 + n^2 x^2) = s_n(x)

Let us study the behavior of s_n(x) for a given n, as a function of x. This function has a maximum at x = 1/n, which is equal to 1/2. For fixed x, and given ε > 0, if the sequence converges uniformly we should be able to find N such that nx/(1 + n^2 x^2) < ε for n > N. How large should N be to make this work? We define n0 to be the index of the function with its maximum at our chosen value of x, that is, n0 = 1/x (we assume that x is small enough so that 1/x is arbitrarily close to an integer). For n larger than n0, we will have s_n(x) < s_{n0}(x), and for sufficiently large n we will achieve s_n(x) < ε. How much larger? Call N the first value of n for which s_N(x) < ε; then

Nx / (1 + N^2 x^2) = (N/n0) / [1 + (N/n0)^2] < ε  ⇒  1 + (N/n0)^2 > (1/ε)(N/n0)

This inequality is satisfied by the choice (N/n0) > 1/ε:

(N/n0) > 1/ε  ⇒  (N/n0)^2 > (1/ε)(N/n0)  ⇒  1 + (N/n0)^2 > (1/ε)(N/n0)

But N > n0/ε ⇒ N > 1/(εx), which makes the choice of N depend on x. For x ∈ [0, ∞) it is not possible to find a single N to satisfy N > 1/(xε) for all x, from which we conclude that the sequence s_n(x) does not converge uniformly for x ∈ [0, ∞); this contradicts the requirements for uniform convergence. The problem with this sequence is that there is a spike near x = 0: as n gets larger the spike moves closer to zero and becomes narrower, but it never goes away. This spike cannot be contained within ±ε of ŝ(x), which is zero everywhere. If we concentrate for a moment on values x ∈ [x̂, ∞), where x̂ is a finite number > 0, the choice

N > 1/(x̂ε) ≥ 1/(xε)

covers all possibilities in the interval [x̂, ∞), so that s_n(x) converges uniformly for x in that interval. In other words, if we truncate the interval to [x̂, ∞), then we can squeeze the spike to values below x̂ and the sequence converges uniformly in this range; but if we include x = 0 in the interval we can never squeeze the spike away.

As is evident from these two examples, non-uniform convergence implies that the difference between s_n(x) and ŝ(x) can be made arbitrarily small for each x with a suitable choice of n > N, but cannot be made uniformly small for all x simultaneously for the same value of N.

It is often useful to try to establish uniform convergence by a graphical, rather than an analytic, argument. For an example, consider the series

1 − x + (1/2!) x^2 − (1/3!) x^3 + (1/4!) x^4 − (1/5!) x^5 + · · ·   (2.21)

in the range x ∈ [0, 1]. The partial sums of this series are given by:

s_n(x) = 1 + Σ_{k=1}^{n} (−1)^k (1/k!) x^k   (2.22)

We easily see that in the limit n → ∞ this sequence becomes the Taylor expansion of the exponential of −x, so that:

ŝ(x) = lim_{n→∞} s_n(x) = e^{−x}   (2.23)
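A quick numerical version of the graphical argument (my own sketch; the grid and the sample values of n are arbitrary choices): the largest deviation of the partial sums (2.22) from e^{−x} on [0, 1] shrinks rapidly with n, while for the non-uniform example s_n(x) = nx/(1 + n^2 x^2) the largest deviation stays at 1/2 no matter how large n is:

import math

def s(n, x):
    # partial sum of Eq. (2.22)
    return sum((-x)**k / math.factorial(k) for k in range(n + 1))

xs = [j / 1000.0 for j in range(1001)]            # grid on [0, 1]
for n in (5, 6, 7):
    worst = max(abs(s(n, x) - math.exp(-x)) for x in xs)
    print(n, worst)                               # largest gap occurs at x = 1

# the non-uniform example: the "spike" keeps height 1/2 for every n
for n in (10, 100, 1000):
    print(n, max(n*x / (1 + (n*x)**2) for x in xs))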

Figure 2.3: Graphical argument for the uniform convergence of the sequence of partial sums defined in Eq. (2.22) to exp(−x) in the interval x ∈ [0, 1].

In order to establish uniform convergence of the original series in the given interval, we must take the difference |ŝ(x) − s_n(x)| and ask whether or not it is bounded by some ε > 0 for n > N and for all x ∈ [0, 1]. As we can see from the plots of this difference as a function of x ∈ [0, 1] for different values of n, shown in Fig. 2.3, no matter how small ε is, we can always find a value for N which satisfies this condition. For example, if ε = 0.0015 → N = 5; if ε = 0.0002 → N = 6; if ε = 0.00003 → N = 7; etc. These plots also show that the worst case scenario corresponds to x = 1, so all we have to do is to make sure that |ŝ(x) − s_n(x)| for x = 1 is smaller than the given ε, which will then guarantee that the same condition is met for all other values of x in the interval [0, 1].

By analogy to the tests for convergence of series of numbers, we can devise tests for the uniform convergence of series of functions. Some useful tests are:

• Weierstrass test: consider the sequence of functions s_j(x) and the sequence of positive numbers a_j. If |s_j(x)| < a_j for x ∈ [x1, x2] and a_j is a convergent sequence, then so is the sequence s_j(x) in the interval [x1, x2]. Since a_j does not involve x, this implies uniform convergence of the sequence s_j(x).

• Cauchy test: a sequence for which |s_n(x) − s_m(x)| < ε for n, m > N is called a Cauchy sequence. This is a necessary condition for convergence of the sequence, because if it is convergent we will have:

|s_n(x) − ŝ(x)| < ε/2,  |s_m(x) − ŝ(x)| < ε/2,  for n, m > N

where we have called ε/2 the small quantity that restricts s_n(x) and s_m(x) close to ŝ(x) for n, m > N. We can rewrite the difference between s_n(x) and s_m(x) as:

|s_n(x) − s_m(x)| = |s_n(x) − ŝ(x) + ŝ(x) − s_m(x)|

and using the triangle inequality and the previous relation, we obtain

|s_n(x) − s_m(x)| ≤ |s_n(x) − ŝ(x)| + |s_m(x) − ŝ(x)| < ε

It turns out that this is also a sufficient condition, so that if a sequence is a Cauchy sequence then it converges uniformly.

Finally, we mention here four useful theorems related to uniform convergence, without delving into their proofs:

• Theorem 1. If there is a convergent series of constants, Σ c_n, such that |g_n(x)| ≤ c_n for all values of x ∈ [a, b], then the series Σ g_n(x) is uniformly and absolutely convergent in [a, b].

• Theorem 2. Let Σ g_n(x) be a series such that each g_n(x) is a continuous function of x in the interval [a, b]. If the series is uniformly convergent in [a, b], then the sum of the series is also a continuous function of x in [a, b].

• Theorem 3. If the series Σ g_n(x) converges uniformly to ĝ(x) in [a, b], then

∫_α^β ĝ(x) dx = ∫_α^β g_1(x) dx + ∫_α^β g_2(x) dx + · · · + ∫_α^β g_n(x) dx + · · ·

where a ≤ α ≤ b and a ≤ β ≤ b. Moreover, the convergence is uniform with respect to α and β.

• Theorem 4. Let Σ g_n(x) be a series of differentiable functions that converges to ĝ(x) in [a, b]. If the series of derivatives Σ g_n'(x) consists of continuous functions and converges uniformly in [a, b], then it converges to ĝ'(x).

The usefulness of uniform convergence, as is evident from the above theorems, is that it allows us to interchange the operations Σ_n and ∫dx or d/dx at will, and thereby makes possible the evaluation of integrals and derivatives through term-by-term integration or differentiation of a series. Non-uniform convergence implies that in general these operations cannot be interchanged, although in some cases this may be possible.
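As a small illustration of Theorem 3 (my own sketch; the truncation order is an arbitrary choice), integrating the uniformly convergent series (2.21) term by term over [0, 1] reproduces ∫_0^1 e^{−x} dx = 1 − e^{−1}:

import math

N = 20
term_by_term = sum((-1)**k / (math.factorial(k) * (k + 1)) for k in range(N + 1))
print(term_by_term, 1 - math.exp(-1))     # the two values agree to many digits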

Problems

1. Derive the binomial expansion, Eq. (2.14), from the Taylor expansion of the function f(x) = a^p (1 + x)^p with x = (b/a).

2. (a) Show that the harmonic series

Σ_{n=1}^{∞} 1/n

diverges (Hint: use the integral test).

(b) Show that the series

Σ_{n=1}^{∞} 1/n^α

converges if and only if α > 1 (Hint: use the integral or comparison test).

(c) Show that the alternating harmonic series

Σ_{n=1}^{∞} (−1)^n / n

converges (Hint: combine successive terms in pairs in the partial sums of the series. Show that the partial sums converge to a unique limit.)

3. Find the Taylor expansion of the following functions:

f(x) = cos(πe^x),  near x0 = 0
f(x) = cosh(sin(x)),  near x0 = 0

4. Using the Taylor expansion, expanded around x0 = 0, find power series representations of the function

1/√(1 ± x)

For what range of values of x can these power series expansions be used effectively as approximations of the function, that is, truncated at some term with the remainder of the terms making an insignificant contribution?

5. Establish by both an analytic and a graphical argument that the sequence

s_n(x) = (1 − x/n)^n

converges uniformly in the interval x ∈ [0, 1].

6. Consider the sequence s_n(x) = 2nx e^{−nx^2} for x ∈ [0, 1] and n ≥ 1.

(a) For any chosen finite x in the given interval, what is the limit of the sequence for n → ∞? Call this limit ŝ(x).

(b) Make a sketch showing the character of s_n(x) for the values n = 1, 2, 3. Does the sequence converge uniformly in this interval? Justify your answer.

(c) If the interval is specified as x ∈ [x0, 1], where 0 < x0 < 1, does the sequence converge uniformly in this interval?

(d) Evaluate the integral ∫_0^1 s_n(x) dx and take its limit for n → ∞. How does this limit compare to the integral ∫_0^1 ŝ(x) dx?

7. Consider the sequence s_n(x) = e^{−x^2/n^2} for x ∈ [0, ∞] and n ≥ 1.

(a) For any chosen finite x in the given interval, what is the limit of the sequence for n → ∞? Call this limit ŝ(x).

(b) Make a sketch showing the character of s_n(x) for the values n = 1, 2, 3. Does this sequence converge uniformly in the interval x ∈ [0, ∞)? Explain in detail your answer. (Uniform convergence requires that we be able to bound the difference between ŝ(x) and s_n(x), for all x in the interval, such that the bound → 0 as n → ∞.)

(c) If the interval had been specified as x ∈ [0, x0], where 0 < x0 is finite, how might the index N be chosen for a given bound ε, so that |ŝ(x) − s_n(x)| < ε for n > N and for all x in the interval?

8. Consider the sequence s_n(x) = 2n^λ e^{−nx} for x ∈ [0, ∞], with n ≥ 1 and λ real. Find the values of λ for which the sequence converges uniformly.

Chapter 3 Functions of complex variables

3.1 Complex numbers

We begin with some historical background to complex numbers. Complex numbers were originally introduced to solve the innocent-looking equation

x² + 1 = 0    (3.1)

which in the realm of real numbers has no solution, because it requires the square of a number to be a negative quantity. This led to the introduction of the imaginary unit i, defined as

i = √−1 ⇒ i² = −1    (3.2)

which solved the original problem seemingly by brute force. This solution, however, had far-reaching implications. For starters, the above equation is just a special case of the second-order polynomial equation

ax² + bx + c = 0    (3.3)

The usual solutions of this equation, in terms of the discriminant ∆ = b² − 4ac, are for ∆ ≥ 0

x = (−b ± √∆)/(2a)    (3.4)

but there are no real solutions for ∆ < 0. With the use of the imaginary unit, for ∆ < 0 the solutions become:

z = (−b ± i√|∆|)/(2a)    (3.5)
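For ∆ < 0 the two roots can be checked numerically; the following short sketch (assuming Python and its standard cmath module, with our own variable names) evaluates Eq. (3.5) for the equation x² + x + 1 = 0:

import cmath

# Roots of a*x**2 + b*x + c = 0 with a negative discriminant, e.g. x**2 + x + 1 = 0.
a, b, c = 1, 1, 1
delta = b**2 - 4*a*c                      # discriminant, here -3
root1 = (-b + cmath.sqrt(delta)) / (2*a)
root2 = (-b - cmath.sqrt(delta)) / (2*a)
print(root1, root2)                       # (-0.5 + 0.866...j) and (-0.5 - 0.866...j)
print(a*root1**2 + b*root1 + c)           # ~0 up to rounding, confirming the root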


z is a complex number with real part −b/(2a) and imaginary part ±√|∆|/(2a). We denote the real part of a complex number z by x = R[z] and the imaginary part by y = I[z]. Thus, the introduction of the imaginary unit made sure that the second-order polynomial equation always has exactly two solutions (also called “roots”). In fact, every polynomial equation of degree n

a_n x^n + a_{n−1} x^{n−1} + ··· + a_1 x + a_0 = 0    (3.6)

with real coefficients a_j, j = 0, 1, 2, ..., n has exactly n solutions if we allow them to be complex numbers. A significant breakthrough in working with complex numbers was introduced by the representation of the complex number z = x + iy on the so-called “complex plane”, with abscissa (x) along the real axis (horizontal) and ordinate (y) along the imaginary axis (vertical). This is illustrated in Fig. 3.1.
Figure 3.1: Illustration of the representation of a complex number z on the complex plane, in terms of its real x and imaginary y parts and the polar coordinates r and θ, with x = r cos(θ) and y = r sin(θ).

An extension of this idea is the so-called Riemannian sphere, introduced to include the point at infinity in the representation. This is called the extended complex plane. In this picture, illustrated in Fig. 3.2, the South Pole corresponds to the value z = 0, the sphere has diameter equal to unity, and the North Pole corresponds to the point z → ∞. Each point on the z-plane is mapped onto a point on the sphere through the following construction: consider a line that begins at the North Pole and ends on the z-plane; the point where this line intersects the surface of the sphere is the image of the point where the line meets the z-plane. Through this


procedure, the equator of the sphere maps to the unit circle on z-plane, the points in the southern hemisphere map to the interior of this circle and the points in the northern hemisphere map to the exterior of this circle.
Figure 3.2: Illustration of the representation of a complex number on the Riemannian sphere and its mapping to the complex z-plane. A line from the North Pole to a point z1 on the unit circle intersects the sphere on the equator; to a point z2 outside the unit circle, at a point in the northern hemisphere; and to a point z3 inside the unit circle, at a point in the southern hemisphere.

A useful concept is the complex conjugate z̄ of the complex number z:

z = x + iy ⇒ z̄ = x − iy    (3.7)

A complex number and its conjugate have equal magnitude, given by

|z| = √(z · z̄) = √(x² + y²)    (3.8)

We examine next the form that different operations with complex numbers take. The product and the quotient of two complex numbers z1, z2 in terms of their real and imaginary components, z1 = x1 + iy1, z2 = x2 + iy2, are:

z1 z2 = (x1 x2 − y1 y2) + i(x1 y2 + x2 y1)    (3.9)

z1/z2 = (x1 x2 + y1 y2)/(x2² + y2²) + i(−x1 y2 + x2 y1)/(x2² + y2²)    (3.10)

An extremely useful expression is the so-called Euler formula:

e^{ix} = cos(x) + i sin(x)    (3.11)

We can prove Euler's formula by using the series expansions of the trigonometric functions (see chapter 1):

cos(x) = 1 − x²/2! + x⁴/4! − ··· = 1 + (ix)²/2! + (ix)⁴/4! + ···
i sin(x) = ix − i x³/3! + i x⁵/5! − ··· = (ix) + (ix)³/3! + (ix)⁵/5! + ···

where we have used i² = −1, i³ = −i, i⁴ = 1, i⁵ = i, etc. Adding these two expressions we find

cos(x) + i sin(x) = 1 + (ix) + (ix)²/2! + (ix)³/3! + (ix)⁴/4! + (ix)⁵/5! + ···    (3.12)

in which we recognize the Taylor series expansion for the exponential of (ix), which is the desired result. It may seem strange that we identify the above infinite series with an exponential, because we have only dealt with exponentials of real numbers so far. What allows us to do this is that the laws of addition and multiplication are the same for real and imaginary numbers; therefore the term-by-term operations involved in the above series make it identical to the power series expansion à la Taylor of the exponential of ix.

Applying Euler's formula for x = π we obtain:

e^{iπ} = −1

which relates in a neat way the irrational constants e = 2.7182818284590452··· and π = 3.1415926535897932··· and the real and imaginary units.

There is a different representation of complex numbers, in terms of polar coordinates, which produces a much more convenient way of handling complex number operations. In terms of the polar coordinates the complex number takes the form:

z = x + iy = r[cos(θ) + i sin(θ)]    (3.13)

as illustrated in Fig. 3.1. The polar coordinates, radius r and angle θ, and the usual cartesian coordinates x, y are related by:

x = r cos(θ), y = r sin(θ) ⇒ r = √(x² + y²), θ = tan⁻¹(y/x)    (3.14)
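Euler's formula can be verified numerically to machine precision; the following sketch (assuming Python and its cmath module) compares e^{ix} with cos(x) + i sin(x), evaluates the partial sums of the series (3.12), and computes e^{iπ}:

import cmath, math

x = 0.7
lhs = cmath.exp(1j * x)                          # e^{ix}
rhs = complex(math.cos(x), math.sin(x))          # cos(x) + i sin(x)
print(lhs, rhs, abs(lhs - rhs))                  # difference ~1e-16

# Partial sums of 1 + (ix) + (ix)^2/2! + ... approach the same value
partial = sum((1j * x)**n / math.factorial(n) for n in range(20))
print(partial)

print(cmath.exp(1j * math.pi))                   # ~ -1 + 1.2e-16j, i.e. e^{i*pi} = -1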
r is called the magnitude of z, because r = |z|, and θ is called the argument, denoted by Arg[z]. The use of Euler's formula and the polar coordinate representation trivializes operations with complex numbers:

product: z1 · z2 = (r1 r2) e^{i(θ1+θ2)} = (r1 r2)[cos(θ1 + θ2) + i sin(θ1 + θ2)]
inverse: z^{−1} = r^{−1} e^{−iθ} = r^{−1}[cos(−θ) + i sin(−θ)]
quotient: z1/z2 = (r1/r2) e^{i(θ1−θ2)} = (r1/r2)[cos(θ1 − θ2) + i sin(θ1 − θ2)]
nth power: z^n = r^n e^{inθ} = r^n[cos(nθ) + i sin(nθ)]
nth root: z^{1/n} = r^{1/n} e^{i(θ+2kπ)/n}, k: any integer

Applying the expression for the nth power to a complex number of magnitude 1, we obtain:

(e^{iθ})^n = [cos(θ) + i sin(θ)]^n = [cos(nθ) + i sin(nθ)]

the last equation is known as De Moivre's formula. As an application of this powerful formula, we can prove the trigonometric relations discussed in chapter 1, Eqs. (1.7), (1.8):

[cos(θ) + i sin(θ)]² = [cos(2θ) + i sin(2θ)] ⇒ cos²(θ) − sin²(θ) = cos(2θ), 2 cos(θ) sin(θ) = sin(2θ)

where we simply raised the left-hand side of the first equation to the square power in the usual way and then equated real and imaginary parts on the two sides of the resulting equation.

Note that the use of polar coordinates introduces an ambiguity: if the argument is changed by an integral multiple of 2π, the complex number stays the same:

r[cos(θ + 2kπ) + i sin(θ + 2kπ)] = r[cos(θ) + i sin(θ)]    (3.15)

where k: any integer. This produces certain complications in handling complex numbers using polar coordinates, which we will discuss in detail below.

3.2 Complex variables

The next natural step is to consider functions of the complex variable z. First, notice that the ambiguity in the value of the argument θ of z in the polar coordinate representation has important consequences in the evaluation of the nth root: it produces n different values for the root

z^{1/n} = r^{1/n} [cos((θ + 2kπ)/n) + i sin((θ + 2kπ)/n)], k = 0, 1, 2, ..., n − 1    (3.16)

A value of k outside the range [0, n − 1] gives the same result as the value of k within this range that differs from it by a multiple of ±n. This is actually what should be expected if we consider the root as the solution of an algebraic equation: call the unknown root w, then

w = z^{1/n} ⇒ w^n − z = 0

which implies that there should be n values of w satisfying the equation. For example, consider the equation

w⁴ − 16 = 0 ⇒ w⁴ = 2⁴ e^{i(0+2kπ)} ⇒ w = 2 e^{i(2kπ/4)}, k = 0, 1, 2, 3 ⇒ w = 2, 2e^{iπ/2}, 2e^{iπ}, 2e^{i3π/2} ⇒ w = 2, 2i, −2, −2i

as can be easily verified by raising these four solutions to the fourth power.
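The n roots produced by Eq. (3.16) are easy to generate numerically; the sketch below (Python with cmath; the variable names are ours) reproduces the four fourth roots of 16 and checks them by raising each to the fourth power:

import cmath, math

# The four solutions of w**4 = 16, via w = r**(1/n) * exp(i*(theta + 2*k*pi)/n)
z = 16
r, theta = abs(z), cmath.phase(z)        # r = 16, theta = 0
n = 4
roots = [r**(1/n) * cmath.exp(1j * (theta + 2*k*math.pi) / n) for k in range(n)]
print(roots)                             # ~ 2, 2i, -2, -2i
print([w**4 for w in roots])             # each ~ 16, up to rounding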
From Euler's formula we can derive the following expressions for the trigonometric functions sin(x) and cos(x) in terms of exp(±ix):

cos(x) = (e^{ix} + e^{−ix})/2    (3.17)
sin(x) = (e^{ix} − e^{−ix})/(2i)    (3.18)

The exponential of (ix) involved in the above expressions is a new type of function, because it contains the imaginary unit i. Taking this idea a step further, we can consider functions of complex numbers, such as the exponential of z = x + iy:

e^z = e^{x+iy} = e^x e^{iy}

and since we know how to handle both the exponential of a real number exp(x) and the exponential of an imaginary number exp(iy), this expression is well defined. It turns out (see Problem 1) that the exponential of z can be expressed as:

e^z = 1 + z + z²/2! + z³/3! + ···    (3.19)

which is consistent with the Taylor expansion of the exponential of a real variable (see chapter 1) or the exponential of a purely imaginary variable (see above, proof of Euler's formula). We can also define the hyperbolic and trigonometric functions of the complex variable z as:

cosh(z) = (e^z + e^{−z})/2,  sinh(z) = (e^z − e^{−z})/2    (3.20)
cos(z) = (e^{iz} + e^{−iz})/2,  sin(z) = (e^{iz} − e^{−iz})/(2i)    (3.21)

by analogy to the definition of the same functions for a real variable. These definitions lead to the following relations:

cosh(iz) = cos(z),  sinh(iz) = i sin(z)    (3.22)
cos(iz) = cosh(z),  sin(iz) = i sinh(z)    (3.23)

Another use of the Euler formula is to provide a definition for the natural logarithm of a complex number:

ln(z) = ln(r e^{i(θ+2kπ)}) = ln(r) + i(θ + 2kπ)    (3.24)

In this case, the ambiguity in the argument of z produces an ambiguity in both the argument and the magnitude of ln(z), which are:

|ln(z)| = √([ln(r)]² + (θ + 2kπ)²),  Arg[ln(z)] = tan⁻¹[(θ + 2kπ)/ln(r)]

which both depend on the value of k. This bothersome situation is dealt with by considering what is called the principal value, limiting the values of the argument to one specific interval of 2π, usually chosen as the interval 0 ≤ θ < 2π (which corresponds to k = 0 in the above expressions).

The logarithm can be used to define arbitrary powers, such as raising the complex number t to a complex power z:

w = t^z ⇒ ln(w) = z ln(t) ⇒ ln(rw) + iθw = (x + iy)[ln(rt) + iθt + i2kπ]

where rt and θt are the magnitude and argument of t, and again the last expression allows us to determine the magnitude and argument of w by equating real and imaginary parts on the two sides of the equation:

ln(rw) = x ln(rt) − y(θt + 2kπ),  θw = x(θt + 2kπ) + y ln(rt)

Example: We are interested in finding the magnitude and argument of w defined by the equation

w = (1 + i)^{(1+i)}
We first express (1 + i) in polar coordinates:

1 + i = √2 (1/√2 + i/√2) = √2 e^{i(π/4+2kπ)} = e^{ln(√2)} e^{i(π/4+2kπ)}

which, when substituted in the above equation, gives:

w = [e^{ln(√2)+i(π/4+2kπ)}]^{(1+i)} = e^{ln(√2)−(π/4+2kπ)} e^{i[ln(√2)+π/4+2kπ]} ⇒ rw = √2 e^{−(π/4+2kπ)},  θw = ln(√2) + π/4

that is, the magnitude of w takes an infinite number of values, for k an arbitrary integer, whereas its argument is unique (up to a multiple of 2π, which is irrelevant). This is actually very problematic, because the function w is multivalued, something that is not allowed for a properly defined function. The reason for this multivaluedness is that we relied on the logarithm to obtain the expression for w, and we have seen above that the logarithm suffers from this problem as well, unless we restrict the value of the argument (as in the definition of the principal value).

3.3 Continuity, analyticity, derivatives

As in the case of a complex number z = x + iy, a function of a complex variable f(z) can always be separated into its real and imaginary parts:

f(z) = w = u(x, y) + iv(x, y)    (3.25)

where u(x, y) and v(x, y) are real functions of the real variables x, y. The function f(z) implies a mapping of the z-plane to the w-plane, a topic we discuss in great detail in section 3.6. We will consider several issues related to the behavior of functions of complex variables. The first is the issue of continuity. We define continuity of a function f(z) of the complex variable z as follows: the function is continuous if for any ǫ > 0 there exists δ > 0 such that

|z − z0| < δ ⇒ |f(z) − f(z0)| < ǫ    (3.26)

in close analogy to what was done for a function of a real variable in chapter 1. Usually, we must take δ smaller when we take ǫ smaller, but this is not always necessary. The essence of this condition is illustrated in Fig. 3.3.
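The principal value (k = 0) of the example above can be checked numerically; Python's ** operator for complex numbers uses precisely this principal branch, so the following sketch (assuming the cmath module) should reproduce the magnitude and argument derived above:

import cmath, math

w = (1 + 1j) ** (1 + 1j)                 # Python evaluates the principal value (k = 0)
print(abs(w), cmath.phase(w))

# Compare with the expressions derived above, for k = 0:
r_w = math.sqrt(2) * math.exp(-math.pi / 4)
theta_w = math.log(math.sqrt(2)) + math.pi / 4
print(r_w, theta_w)
print(cmath.rect(r_w, theta_w))          # the same complex number as w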
Figure 3.3: Illustration of the continuity of the function f(z) = w near the point z0, with f(z0) = w0.

As an example, consider the function

w = f(z) = z²

In order to show that it is continuous, for any given ǫ > 0 we have to show that when |z − z0| < δ, we can always find a value for δ > 0 that satisfies the continuity condition, Eq. (3.26), that is, we can manage to make |w − w0| < ǫ. We must find a way to relate ǫ and δ. Given that |z − z0| < δ, we will have:

|w − w0| = |z² − z0²| = |(z − z0)(z + z0)| < δ|z + z0| = δ|(z − z0) + 2z0|

Using the triangle inequality, we obtain from the last relation:

|w − w0| ≤ δ[|z − z0| + |2z0|] ≤ δ[δ + 2|z0|]

If we take the relation between δ and ǫ to be

ǫ = δ[δ + 2|z0|] ⇒ δ² + 2|z0|δ − ǫ = 0 ⇒ δ = √(|z0|² + ǫ) − |z0| > 0

(where we have chosen the positive root of the second-order polynomial equation in δ), then the last expression shows that no matter how small ǫ is, we can always find a value for δ > 0 that satisfies the continuity condition. From this expression, it is also evident that when ǫ gets smaller we must take a correspondingly smaller δ, a situation that is common but not always necessary.

The next issue is the definition of the derivative. By analogy to the case of functions of real variables, we define the derivative of a function f(z) of a complex variable as:

f′(z0) = lim_{z→z0} [f(z) − f(z0)]/(z − z0) = lim_{∆z→0} [f(z0 + ∆z) − f(z0)]/∆z    (3.27)

In order for the derivative f′(z) to make sense, the limit must exist and must be independent of how ∆z → 0, that is, independent of the direction in which the point z0 on the complex plane is being approached. This aspect did not exist in the case of the derivative of a function of a real variable, since the point x0 on the real axis can only be approached along other real values on the axis. The possible dependence of the ratio ∆f(z)/∆z on the approach to z0 introduces another notion, that of analyticity.

A function f(z) is called analytic in the domain D of the xy-plane if:
• f(z) is defined in D, that is, it is single-valued, and
• f(z) is differentiable in D, that is, the derivative f′(z) exists and is finite.

If these conditions are not met, the function is called non-analytic. There are important consequences of the analyticity of a function f(z):
• f′(z) is continuous (this is known as the Goursat theorem), and
• f(z) has continuous derivatives of all orders.

To demonstrate the importance of this aspect, we consider a counterexample, that is, a function for which the derivative is not properly defined. This function is the complex conjugate:

w = f(z) = z̄

From the definition of the derivative we have:

f′(z0) = lim_{∆z→0} [(z̄0 + ∆z̄) − z̄0]/∆z = lim_{∆x→0, ∆y→0} (∆x − i∆y)/(∆x + i∆y)

If we choose the approach along the line determined by

y = y0 + µ(x − x0) ⇒ ∆y/∆x = µ

then the derivative will take the form

f′(z0) = (1 − iµ)/(1 + iµ) = (1 − iµ)²/(1 + µ²)

This last expression obviously depends on the path of approach ∆z → 0, that is, on the value of the slope of the line µ; therefore the derivative of z̄ does not exist.
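The direction dependence of this difference quotient can also be seen numerically; in the following sketch (Python with cmath; the small step h and the helper name are our own choices) the quotient is evaluated along three different directions of approach:

import cmath, math

def difference_quotient(z0, direction, h=1e-6):
    dz = h * cmath.exp(1j * direction)          # approach z0 along a fixed direction
    return ((z0 + dz).conjugate() - z0.conjugate()) / dz

z0 = 1 + 2j
for angle in (0.0, math.pi/4, math.pi/2):       # real axis, diagonal, imaginary axis
    print(angle, difference_quotient(z0, angle))
# The results (1, -i, -1) differ with the direction, so the limit does not exist:
# f(z) = conj(z) has no complex derivative.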

In the following, when we employ the notion of analyticity we will assume that it holds at least in a finite region around some point z0. By region we mean an area, that is, a part of the complex plane that has the same dimensionality as the plane itself. For instance, the set of all points on a straight line does not constitute a region, because the line is one-dimensional whereas the plane is two-dimensional. The convention we adopt is that the notion of analyticity applies strictly in a finite, two-dimensional portion of the complex plane. By this convention, it does not make sense to talk about analyticity at a single point of the complex plane. In contrast to this, it makes perfect sense to talk about a function being non-analytic at certain points inside a region where the function is analytic. Isolated points where an otherwise analytic function is non-analytic are called singularities. There exist different types of singularities, depending on what goes wrong with the analyticity of the function, as we will discuss in more detail in following sections. In order to classify the singularities, we will need to develop the power series expansions for functions of complex variables.

We give a few examples of analytic and non-analytic functions. The function f(z) = z^n, where n is a positive integer, is analytic, and so are all simple functions containing it, except if it is in the denominator, because then it can vanish for z = 0, causing problems. For instance, the function

f(z) = 1/z

is analytic everywhere on the complex plane except at z = 0, where it is non-analytic because it does not have a finite value; this point is a singularity. For any other point, arbitrarily close to zero, the function is properly defined and so are all its derivatives. In fact, f(z) = 1/z^n is analytic everywhere except for z = 0, which is a singularity.

The function f(z) = z̄ is a non-analytic function, and so are all simple functions containing it. Had we been considering lines as regions, we would conclude that the function f(z) = z̄ is analytic along any straight line, as can easily be established by the preceding analysis of this function (the derivative of f(z) = z̄ is well defined along a particular line). For example, the function f(z) = R[z] = x is not an analytic function, because

f(z) = R[z] = x = (z + z̄)/2

contains z̄, which is not analytic. What does it mean to have a function like R[z]? This expression certainly leads to a
mapping, u(x, y) = x, v(x, y) = 0, but this mapping is not a function in the usual sense, that is, it cannot be differentiated.

3.4 Cauchy-Riemann relations and harmonic functions

For an analytic function f(z) = w = u(x, y) + iv(x, y), the existence of the derivative leads to the following relations between the partial derivatives of the real and imaginary parts:

∂u/∂x = ∂v/∂y,  ∂u/∂y = −∂v/∂x    (3.28)

These are called the Cauchy-Riemann relations. To prove this statement we rely on the definition of the derivative, which for an analytic function must exist and must be independent of the approach ∆z → 0:

f′(z) = lim_{∆z→0} ∆f/∆z    (3.29)

Thinking of f(z) in terms of its real and imaginary parts, u(x, y) and v(x, y) respectively, which are both functions of x and y, we can write the differential of f(z) in terms of the partial derivatives with respect to x and y:

∆f = (∂f/∂x)∆x + (∂f/∂y)∆y = (∂u/∂x + i ∂v/∂x)∆x + (∂u/∂y + i ∂v/∂y)∆y
   = (∂u/∂x + i ∂v/∂x)∆x + (∂v/∂y − i ∂u/∂y) i∆y = fx ∆x + fy i∆y

where the last equation serves as the definition of the partial derivatives fx and fy in terms of partial derivatives of u and v. Let us assume that fx and fy are not both zero, and we will take fx ≠ 0; then the derivative can be put in the form

f′(z0) = lim_{∆x,∆y→0} fx [∆x + (fy/fx) i∆y]/(∆x + i∆y)

The derivative must be independent of the approach of ∆x, ∆y → 0, which can be achieved only for fx = fy, because this is the only way of making the coefficients of ∆x and i∆y in the numerator equal, as they are in the denominator. Consequently,

fx = fy ⇒ ∂u/∂x + i ∂v/∂x = ∂v/∂y − i ∂u/∂y

Equating the real and imaginary parts of fx and fy we obtain the Cauchy-Riemann relations, Eq. (3.28). Taking cross-derivatives of these relations and adding them produces another set of relations:

∂²u/∂x∂y = ∂²v/∂y² and ∂²u/∂x∂y = −∂²v/∂x² ⇒ ∂²v/∂x² + ∂²v/∂y² = 0    (3.30)
∂²u/∂x² = ∂²v/∂x∂y and ∂²u/∂y² = −∂²v/∂x∂y ⇒ ∂²u/∂x² + ∂²u/∂y² = 0    (3.31)

Functions that satisfy this condition, called Laplace's equation, are referred to as harmonic. The harmonic functions appear frequently in equations related to several physical phenomena, such as electrostatic fields produced by electrical charges, the temperature field in an isotropic thermal conductor under steady heat flow, or the pressure field for incompressible liquid flow in a porous medium. From these we can also derive the following general statements:
• the u(x, y), v(x, y) functions (often called fields) are determined by the boundary values;
• if one field is known, the conjugate field (u from v or v from u) is determined up to a constant of integration;
• any linear combination of the two fields u(x, y), v(x, y) of an analytic function is a harmonic function (because it corresponds to multiplication of the original function f(z) by a complex number);
• a complex analytic function is determined by the boundary values.

If f1(z), f2(z) are two analytic functions in a domain D, then so is the function f(z) given by:

f(z) = f1(z) + f2(z)

f(z) = f1(z) f2(z)
f(z) = 1/fi(z), except where fi(z) = 0 (i = 1, 2)
f(z) = f1(f2(z))

in the same domain D. By analogy to the case of functions of real variables, the following rules also apply to the differentiation of analytic functions:

f(z) = f1(z) f2(z) ⇒ df/dz = [df1/dz] f2(z) + f1(z) [df2/dz]
f(z) = f1(f2(z)) ⇒ df/dz = [df1/df2](f2(z)) · [df2/dz]

Finally, an important implication of the Cauchy-Riemann relations is

∂f/∂z̄ = 0    (3.32)

In order to prove this, we begin by expressing the variables x and y in terms of z and z̄:

x = (z + z̄)/2,  y = (z − z̄)/(2i)

from which we obtain for the partial derivatives with respect to x and y:

∂/∂x = (∂z/∂x) ∂/∂z + (∂z̄/∂x) ∂/∂z̄ = ∂/∂z + ∂/∂z̄
∂/∂y = (∂z/∂y) ∂/∂z + (∂z̄/∂y) ∂/∂z̄ = i ∂/∂z − i ∂/∂z̄

Using these expressions in the Cauchy-Riemann relations we find:

∂u/∂x = ∂v/∂y ⇒ ∂u/∂z + ∂u/∂z̄ = i ∂v/∂z − i ∂v/∂z̄
∂u/∂y = −∂v/∂x ⇒ i ∂u/∂z − i ∂u/∂z̄ = −∂v/∂z − ∂v/∂z̄ ⇒ i ∂v/∂z + i ∂v/∂z̄ = ∂u/∂z − ∂u/∂z̄

Adding the two equations on the far right of the above lines we get:

∂(u + iv)/∂z + ∂(u + iv)/∂z̄ = ∂(u + iv)/∂z − ∂(u + iv)/∂z̄

which simplifies to the following result:

∂f/∂z̄ = −∂f/∂z̄ ⇒ ∂f/∂z̄ = 0
as desired. A consequence of this result is that a function that contains z̄ explicitly cannot be analytic, a fact we have already mentioned.

Example: To illustrate how these concepts work, we consider the following simple example: given the field u(x, y) = cos(x)e^{−y}, we would like to know whether or not it is harmonic and, if it is, what is the conjugate field v(x, y) and what is the complex analytic function f(z) = u(x, y) + iv(x, y)? To answer the first question we calculate the second partial derivatives of the given field and check whether their sum vanishes or not:

∂²u/∂x² = −cos(x)e^{−y},  ∂²u/∂y² = +cos(x)e^{−y}

the sum of which vanishes, which means that u(x, y) is a harmonic function. We next try to determine the conjugate field v(x, y). From the Cauchy-Riemann relations we obtain:

∂v/∂x = −∂u/∂y = cos(x)e^{−y} ⇒ v(x, y) = ∫[cos(x)e^{−y}]dx = e^{−y}[sin(x) + c1(y)]
∂v/∂y = ∂u/∂x = −sin(x)e^{−y} ⇒ v(x, y) = ∫[−sin(x)e^{−y}]dy = sin(x)[e^{−y} + c2(x)]

where c1(y) and c2(x) are constants of integration, the first produced from the indefinite integral over dx, the second from the indefinite integral over dy. The two different expressions for v(x, y) obtained from the two integrations must be equal, so

e^{−y} c1(y) = sin(x) c2(x)

which can only be satisfied if they are both equal to the same constant c, because the first is exclusively a function of y and the second is exclusively a function of x. This conclusion gives

v(x, y) = sin(x)e^{−y} + c

and we will take the constant c = 0 for simplicity. Then the complex function f(z) must be given by

f(z) = u(x, y) + iv(x, y) = [cos(x) + i sin(x)]e^{−y} = e^{ix−y} = e^{iz}

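These steps can also be checked symbolically; the following sketch (assuming the sympy library) verifies that u is harmonic, that u and v satisfy the Cauchy-Riemann relations, and that u + iv coincides with e^{iz}:

import sympy as sp

x, y = sp.symbols('x y', real=True)
u = sp.cos(x) * sp.exp(-y)

# u is harmonic: u_xx + u_yy = 0
print(sp.simplify(sp.diff(u, x, 2) + sp.diff(u, y, 2)))      # 0

# Conjugate field found above (constant of integration set to zero)
v = sp.sin(x) * sp.exp(-y)
print(sp.simplify(sp.diff(u, x) - sp.diff(v, y)))            # 0  (first Cauchy-Riemann relation)
print(sp.simplify(sp.diff(u, y) + sp.diff(v, x)))            # 0  (second Cauchy-Riemann relation)

# And u + i v agrees with exp(i z) for z = x + i y
z_expr = sp.exp(sp.I * (x + sp.I * y))
print(sp.simplify((u + sp.I * v - z_expr).rewrite(sp.exp)))  # 0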
3.5 Branch points and branch cuts

One type of singularity that is very important when discussing mappings implied by functions of complex variables is that which gives rise to multivaluedness. We call such singularities branch points. To illustrate the idea of a branch point, we use a simple function in which this problem comes into play, the nth root:

w = z^{1/n} = [r e^{i(θ+2kπ)}]^{1/n} = r^{1/n} e^{i(θ+2kπ)/n},  k = 0, 1, ..., n − 1

For any point other than zero, this function takes n different values depending on the choice of value for k: even though these are the same point on the complex plane, for the same value of the complex variable z we get multiple values for the function w. Only for z = 0 does it not matter what the value of k is. In this sense, z = 0 is a special point as far as the function w is concerned.

We can look at this from a different perspective: suppose that we are at some point z = r ≠ 0 and we start increasing the argument from θ = 0 at constant r. All the points we encounter while doing so are different points on the complex plane for 0 ≤ θ < 2π, but when θ hits the value 2π we get back to the same point where we had started. If we continue increasing the value of θ we will hit this point again every time θ is an integral multiple of 2π. When z is raised to the 1/n power, these values of z will in general be different from each other, since their argument will be divided by n. This only happens if we go around the point z = 0 more than once. If we go around a path that encircles any other point but not zero, the argument of the variable will not keep increasing, and therefore the problem will not arise.

Not all singularities are branch points. For example, z = 0 is a singularity of the function f(z) = 1/z, because the function is not well defined at this point, but it is not a branch point, because this function does not suffer from multivaluedness.

The same problematic behavior is encountered for the natural logarithm: if the argument of the variable z is allowed to increase without limit, then both the argument and the magnitude of the function w = ln(z) change when we go through points on the complex plane whose argument differs by a multiple of 2π. It is useful to notice that, in the case of the logarithm, because of the relation

ln(1/z) = −ln(z)

the behavior of the function in the neighborhood of the points z and 1/z is the same, with an overall minus sign. Thus, if z = 0 is a special point in

the sense described above, so is the point z → ∞. We conclude that the logarithm has two branch points, z = 0 and z → ∞. In fact, since all non-integer powers of complex numbers can be expressed with the help of the logarithm, we conclude that the functions involving non-integer powers of z have the same two branch points.

In order to remove these problems, which become particularly acute when we discuss mappings implied by multivalued functions of z, we introduce the idea of the branch cut. This is a cut of the complex plane which joins two branch points: we imagine that with the help of a pair of “mathematical scissors” we literally cut the complex plane along a line, so that it is not allowed to cross from one side of the line to the other. In the case of the logarithm, one possible cut would be along the x axis from x = 0 to x → ∞, where we imagine that we can approach the cut from one side but not from the other. That is, we can reach the values of z that lie on the x axis approaching from above the axis, but we cannot reach them from below the axis. This choice restricts the argument θ to the range of values [0, 2π). This is a matter of convention, and the reverse would have been equally suitable, being allowed to approach the values of z that lie on the x axis from below the axis but not from above the axis; this would be equivalent to restricting the argument θ to the range (0, 2π]. The effect of this is that it eliminates any ambiguity in the values that the function ln(z) is allowed to take!

Note that by branch points we refer to the values of the variable where the quantity whose logarithm we want to evaluate, or the base of the non-integer power, vanishes or becomes infinite. For example, the function

f(z) = (z − a)^{1/π},  a : finite

has branch points z = a and z → ∞; the function

f(z) = (1/z − b)^{1/π},  b ≠ 0

has branch points z = 1/b, where the content of the parenthesis becomes zero, and z = 0, but not z → ∞, because for this value the content of the parenthesis is finite; and the function

f(z) = ln(√(z² − 1)/√(z² + 1))

has branch points at z = ±1, where the content of the parenthesis becomes zero, and at z = ±i, where the content of the parenthesis becomes infinite, and so on.
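The jump across a branch cut is easy to observe numerically. Note that Python's cmath uses the principal branch with argument in (−π, π], that is, a cut along the negative real axis, rather than the 0 ≤ θ < 2π convention adopted here; the following sketch evaluates the logarithm and the square root just above and just below that cut:

import cmath

above = complex(-1.0, 1e-12)    # just above the negative real axis
below = complex(-1.0, -1e-12)   # just below it
print(cmath.log(above))         # ~ 0 + i*pi
print(cmath.log(below))         # ~ 0 - i*pi : the imaginary part jumps by 2*pi across the cut
print(cmath.sqrt(above), cmath.sqrt(below))   # ~ +i and -i : the same jump for the square root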

There are usually many possible choices for a branch cut. For the case of the logarithm or a non-integer power, any line joining the points 0 and ∞ is a legitimate choice, as long as it does not cross itself, because then a portion of the complex plane would be completely cut off (it could not be reached from the rest of the plane), which is not allowed to happen. Each such choice of branch cut restricts the range of the argument of z to one set of values spanning an interval of 2π. For example, if we had chosen the negative x axis as the branch cut, the values of θ would be in the interval [π, 3π); if we had chosen the positive y axis as the branch cut, the allowed values of θ would be in the interval [π/2, 5π/2); and so on. There are also more complicated possible choices, for which the range of θ may be different for each value of the radius r, but it always spans an interval of 2π for a given value of r. Notice that the introduction of a branch cut that restricts the values of the argument to 0 ≤ θ < 2π is equivalent to taking the principal value of the argument.

With the introduction of branch points and the associated branch cuts, we have therefore managed to eliminate the problem of multivaluedness for the natural logarithm and non-integer powers, which are in fact related. So far we have met only these two types of functions that suffer from this problem; in fact, for the purposes of the present discussion, the only truly multivalued function that must be fixed by the introduction of branch cuts is the natural logarithm, which occasionally lurks in the definition of other functions we will examine. This approach can be applied to any function that suffers from multivaluedness.

3.6 Mappings

Next, we will study mappings of the z-plane to the w-plane through complex functions w = f(z). These mappings take us from a value of z to a value of w through f(z). A very simple, almost trivial case is the mapping produced by the function

w = az + b,  a, b : constants

which, expressing w in terms of its real and imaginary parts w = u + iv, can be written as:

w = u + iv = (ar + iai)(x + iy) + (br + ibi) ⇒ u(x, y) = (ar x − ai y) + br,  v(x, y) = (ar y + ai x) + bi

where we have also separated the constants a and b into real and imaginary parts. This represents a shift of the origin by −b and a rotation and scaling of the original set of values of x and y. Specifically, if we define the angle φ through

φ = tan⁻¹(ai/ar) ⇒ ar = |a| cos(φ),  ai = |a| sin(φ)

the original values of x, y are rotated by the angle φ and scaled by the factor |a|. Thus, if the values of x, y describe a curve on the z plane, this curve will be shifted, rotated and scaled through the mapping implied by the function f(z), but its shape will not change.

We illustrate these ideas with several examples. The central idea is to consider a set of complicated curves on the z-plane which through f(z) gets mapped to a set of very simple curves on the w-plane. The simplest set of curves on the w-plane correspond to u = constant (these are straight vertical lines, with v assuming any value between −∞ and +∞), and to v = constant (these are straight horizontal lines, with u assuming any value between −∞ and +∞). Other simple sets of curves on the w plane are those that correspond to a fixed magnitude rw and variable argument θw of w, which are circular arcs centered at the origin at fixed radius rw (the arcs are full circles for 0 ≤ θw < 2π), or those that correspond to a fixed argument θw and variable magnitude rw of w, which are rays emanating from the origin at fixed angle θw with the u axis.

It is often useful, from a practical point of view, to view the mapping from the opposite perspective. Specifically, we will pose the question as follows: given the function f(z), what kind of curves in the z plane produce some simple curves in the w plane? This is because the usefulness of the mapping concept lies in simplifying the curves and shapes of interest. An even more difficult question is: given a set of complicated curves on the z plane, what function f(z) can map them onto a set of simple curves on the w plane? We will not attempt to address this question directly, but the familiarity we will develop with the mappings of several common functions goes a long way toward suggesting an answer to it in many situations.

Example 1: We consider the mapping of the z-plane to the w-plane through the exponential function:

w = e^z = e^x cos(y) + i e^x sin(y) ⇒ u(x, y) = e^x cos(y),  v(x, y) = e^x sin(y)

In the spirit discussed above, we consider what curves in the z plane get mapped to vertical lines u(x, y) = u0 or horizontal lines v(x, y) = v0 on the
Figure 3.4: Mapping of the z-plane to the w-plane through w = f(z) = exp(z), using cartesian coordinates in the w plane.

w plane. For vertical lines on the w plane the corresponding curves on the z plane are given by:

u(x, y) = e^x cos(y) = u0 ⇒ x = ln(u0) − ln(cos(y)) for [u0 > 0, cos(y) > 0 ⇒ −π/2 < y < π/2]
⇒ x = ln(−u0) − ln(−cos(y)) for [u0 < 0, cos(y) < 0 ⇒ π/2 < y < 3π/2]

Similarly, for horizontal lines on the w plane the corresponding curves on the z plane are given by:

v(x, y) = e^x sin(y) = v0 ⇒ x = ln(v0) − ln(sin(y)) for [v0 > 0, sin(y) > 0 ⇒ 0 < y < π]
⇒ x = ln(−v0) − ln(−sin(y)) for [v0 < 0, sin(y) < 0 ⇒ π < y < 2π]

These mappings are illustrated in Fig. 3.4.

We can examine the mapping produced by the same function using polar coordinates in the w plane:

w = e^z ⇒ ln(w) = z ⇒ ln(rw) + iθw = x + iy ⇒ ln(rw) = x,  θw = y

In this case, curves described by a constant value of x = x0, with y in a range of values spanning 2π, map to circles of constant radius rw = e^{x0}:

[x = x0, 0 ≤ y < 2π] → [rw = e^{x0}, 0 ≤ θw < 2π]

Similarly, curves described by a constant value of y = y0, with x taking all real values −∞ < x < ∞, map to straight lines (rays) emanating from the origin at constant angle θw to the u axis:

[−∞ < x < ∞, y = y0] → [0 ≤ rw < ∞, θw = y0]

as illustrated in Fig. 3.5. This analysis of the mappings produced by the function w = exp(z) indicates that every strip of width 2π in y and −∞ < x < ∞ on the z-plane maps onto the entire w-plane. This is true for the mapping using cartesian or polar coordinates in the w plane.
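A quick numerical check of this mapping (a Python sketch with cmath; the sample points are arbitrary) confirms that a vertical segment x = x0 maps onto a circle of radius e^{x0}, and a horizontal line y = y0 maps onto a ray of constant angle y0:

import cmath, math

x0, y0 = 0.5, math.pi / 3

# Points on the vertical segment x = x0, 0 <= y < 2*pi: images all have modulus e^{x0}
vertical = [complex(x0, 2*math.pi*k/8) for k in range(8)]
print([abs(cmath.exp(z)) for z in vertical])             # all equal to e^{0.5} ~ 1.6487

# Points on the horizontal line y = y0: images all have argument y0
horizontal = [complex(-1 + 0.5*k, y0) for k in range(6)]
print([cmath.phase(cmath.exp(z)) for z in horizontal])   # all equal to y0 ~ 1.0472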
Figure 3.5: Mapping of the z-plane to the w-plane through w = f(z) = exp(z), using polar coordinates in the w plane.
Example 2: We next consider the mapping of the inverse function of the exponential, namely the natural logarithm, which will prove to be much trickier. We employ polar coordinates for z, which, with the use of Euler's formula, make things simpler:

w = ln(z) = ln(r e^{iθ}) = ln(r) + iθ ⇒ u(x, y) = ln(r) = ln(√(x² + y²)),  v(x, y) = θ = tan⁻¹(y/x)

Since this is the inverse mapping of the exponential, we can make the correspondence (x ↔ u, y ↔ v, rw ↔ r, θw ↔ θ), and from this obtain the mapping by reversing what we found for the mapping w = e^z. This simple argument gives the following mappings:

[r = r0, 0 ≤ θ < 2π] → [u(x, y) = u0 = ln(r0), 0 ≤ v(x, y) < 2π]
[θ = θ0, 0 ≤ r < ∞] → [−∞ < u(x, y) < ∞, v(x, y) = v0 = θ0]

that is, circles of radius r0 on the z plane map to straight vertical lines u(x, y) = u0 = ln(r0) on the w plane, of height 2π, while rays at an angle θ0 to the x axis on the z plane map to straight horizontal lines v(x, y) = v0 = θ0 on the w plane. This is all well and nice when θ lies within an interval of 2π, as we assumed so far, but leads to difficulties if the range of values of θ is not specified. This feature is of course related to the problem of multivaluedness of the logarithm, which, as we have seen earlier, can be eliminated by identifying the branch points and introducing a branch cut. We have already established that the function ln(z) has two branch points, zero and ∞, and any line between them which does not intersect itself is a legitimate branch cut.

Figure 3.6: Branch cut for the function f(z) = ln(z + a) − ln(z − a), joining the branch points z = −a and z = +a. A point z is characterized by the distances r1, r2 and the angles θ1, θ2 measured from the two branch points.

As another illustrative example of the need to introduce branch cuts, consider the function

± ln(z − z0)

This function is often used to simplify the description of radial or circumferential flow from a source (with the + sign) or a sink (with the − sign) situated at z = z0. The combination of the two choices of signs, with the source at z = −a (denoted by the function f1(z)) and the sink at z = +a (denoted by the function f2(z)), is called the source-sink pair:

f(z) = f1(z) + f2(z) = ln(z + a) − ln(z − a) = ln(r1/r2) + i(θ1 − θ2)

where r1 = |z + a|, θ1 = Arg[z + a] and r2 = |z − a|, θ2 = Arg[z − a]. One question we need to answer is whether the function has any other branch points, besides z = ±a, where the function is non-analytic. One possible candidate is z → ∞, but the function is actually well defined in this limit:

f(z) = ln(z + a) − ln(z − a) = ln[(z + a)/(z − a)] → ln(1) = 0 for z → ∞

The other possible candidate, z = 0, also does not present any problem, because:

f(0) = ln(a) − ln(−a) = ln[a/(−a)] = ln(−1) = ln(e^{iπ}) = iπ

Thus the only branch points are z = ±a. Another way to see this is to evaluate the derivative of the function, which is:

f′(z) = 1/(z + a) − 1/(z − a)

which is well defined everywhere except for z = ±a. Consequently, for this function we need to introduce branch cuts, because there are at least two branch points, z = ±a, where the function is non-analytic. Evidently, we need a branch cut joining the points z = ±a. Any such path will do, and it does not have to pass through z → ∞. To see this, we notice that for a closed path that encircles either both or neither of z = +a and z = −a, the quantity (θ1 − θ2) takes the same value at the end of the path as at the beginning; it does not increase by 2π even though each of the angles θ1 and θ2 increases by 2π. Consequently, such a path would not lead to multivaluedness. For a closed path that encircles only one of z = +a or z = −a, the quantity (θ1 − θ2) increases by 2π at the end of the path relative to its value at the beginning. These facts are established by inspecting Fig. 3.6.
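The statements about closed paths can be checked numerically; the following sketch (Python with cmath; the function winding_change is our own construction, with a = 1) follows θ1 − θ2 continuously around a circle and reports its net change after one full loop:

import cmath, math

def winding_change(center, radius, a=1.0, steps=2000):
    # Follow theta1 - theta2 = Arg(z + a) - Arg(z - a) continuously around a circle.
    total = 0.0
    prev = None
    for k in range(steps + 1):
        z = center + radius * cmath.exp(2j * math.pi * k / steps)
        val = cmath.phase(z + a) - cmath.phase(z - a)
        if prev is not None:
            d = val - prev
            # remove the artificial 2*pi jumps of the principal value
            while d > math.pi:
                d -= 2 * math.pi
            while d < -math.pi:
                d += 2 * math.pi
            total += d
        prev = val
    return total

print(winding_change(0.0, 3.0))    # loop around both branch points: ~0
print(winding_change(1.0, 0.5))    # loop around z = +a only: ~ -2*pi
print(winding_change(-1.0, 0.5))   # loop around z = -a only: ~ +2*pi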

From this mapping we see that half of the z plane maps onto the entire w plane. the square. A possible choice for a branch cut is shown in Fig. y) + iv(x. we still can obtain all the possible values of u and v. b : (l. y) = u0 ) or to horizontal lines (v(x. identified by the corners at z : (x. Similar arguments apply to sub-regions of the z plane.6. l). y) a : (l.7: Mapping of z-plane to w-plane through w = f (z) = z 2 . y) = (x2 − y 2 ) + i(2xy) The last expression gives.6. we have: w = z 2 ⇒ u(x. 3. MAPPINGS w=z z-plane y 2 65 v w-plane 0 x 0 u Figure 3. y) = v0 ) on the w plane: v0 x2 − y 2 = u0 . because if we restrict the values of z to x ≥ 0 or to y ≥ 0. the region enclosed by the square z : 0 ≤ x ≤ l. Example 3: We consider next a simple power of z. have to pass through z → ∞. For instance. d : (0. c : (0. y = 2x which are hyperbolas.3. l). 0). which through w = f (z) = z 2 are “inflated” to cover much larger regions on the w plane. as illustrated in Fig. Using cartesian coordinates. 0) 0≤y≤l .7. 3. for the curves on the z plane that map to vertical lines (u(x.

Example 4: As another example we examine the inverse of the previous function. we have: w = z 2 ⇒ w 2 = z ⇒ (u2 − v 2 ) + i2uv = x + iy ⇒ x = (u2 − v 2 ). 0). which lies entirely on the first quadrant (upper right quarter plane). 0) with the correspondence a → A. 0). on the z plane. y) = u0 ) or straight horizontal lines (v(x. c → C. we will have: u = u0 → v = y y2 → u2 − 2 = x 0 2u0 4u0 1 . Another way to see this is to use polar coordinates: w = z 2 ⇒ rw eiθw = r 2 ei2θ ⇒ rw = r 2 .8: Mapping of a square region on the z-plane to the w-plane through w = f (z) = z 2 . that is. then its image on the w plane will cover the entire upper half plane. v) A : (l2 . d → D. the square root. If we think of the square on the z plane as covering the entire first quadrant (l → ∞). 2l2 ). the range of values from zero to π in the argument of z is mapped to a range of values from zero to 2π in the argument of w. D : (0. y) = v0 ) on the w plane. FUNCTIONS OF COMPLEX VARIABLES y z−plane w−plane v B 2l 2 c d0 l b l a x C D0 l 2 A u Figure 3. gets mapped to the region identified by the points w : (u. b → B. This region extends over the first two quadrants of the w plane (upper half plane). θw = 2θ so 0 ≤ θ < π → 0 ≤ θw < 2π. that is. Using cartesian coordinates.66 CHAPTER 3. C : (−l2 . B : (0. y = 2uv For the curves on the z plane that map to straight vertical lines (u(x.

we use polar coordinates: w = z 1/2 ⇒ rw eiθw = r 1/2 ei(θ+2kπ)/2 ⇒ rw = r 1/2 . Any proper branch cut joining these two points removes the ambiguity and renders the function single-valued. In order to remove the ambiguity. the entire z plane maps onto half of the w plane. In terms of real numbers. In this case. this corresponds to the fact that the root of x2 is ±x. we need again to introduce branch cuts. 1 2 so there are two possible choices for the values of k. 3.6. because we are not sure at which of the two halves of the w plane we will end up. This is a troubling fact. θw = θ + kπ. however. as illustrated in Fig. so that two values of opposite sign correspond to the same curve.9: Mapping of z-plane to w-plane through w = f (z) = v = v0 → u = y y2 2 → 2 − v0 = x 2v0 4v0 √ z. since the values of u0 and v0 appear squared in the final equations. given that the function z 1/2 is the inverse of the function z 2 . which are parabolas.3.9. k = 0. MAPPINGS w=z z-plane y 1/2 67 v w-plane 0 x 0 u Figure 3. To see this more clearly. the standard choice of the positive x axis as the branch cut with 0 ≤ θ < 2π . Evidently. This is what we would expect. because of the involvement of the logarithm in the definition of z a with a: non-integer. This has actually to do with the fact that the function z 1/2 is multivalued (it involves a non-integer power. the branch points in this case are again z = 0 and z → ∞. giving rise to the two possible mappings. For example. and hence the logarithm lurks somewhere in its definition).

y0 ) and radius ρ is: (x − x0 )2 + (y − y0 )2 = ρ2 ⇒ a(x2 + y 2 ) + bx + cy + d = 0 where the parameters a. FUNCTIONS OF COMPLEX VARIABLES (equivalent to the choice k = 0). (3. c. the same choice for a branch cut with 2π ≤ θ < 4π (equivalent to the choice k = 1). y0 .33) Using the following relations: x= z+z ¯ z−z 2 ¯ .68 CHAPTER 3.33). u2 + v 2 = w w = ¯ 2 2z¯ z 2i 2iz¯ z z¯ z we find that the above equation for a circle on the z plane is transformed to a + bu − cv + d(u2 + v 2 ) = 0 (3. d in describing the circle. b=− 2 . is not a problem because the latter set of parameters are not independent. the following relation between them can easily be established: b2 + c2 = 4a(d + 1) 1 . d= 0 2 0 −1 ρ2 ρ ρ ρ (3. c=− 2 . v= = . x + y 2 = z¯ z 2 2i and the fact that w = u + iv = 1/z ⇒ w = u − iv = 1/¯. The equation for a circle on the z plane with its center at (x0 . leads to the entire z plane mapping to the upper half (0 ≤ θw < π or v ≥ 0) of the w plane. c. b. Eq. in terms of the variables x and y. y= .34) which has exactly the same functional form in the variables u and v as the equation for the circle on the z plane. ρ to a set of four parameters a. b. therefore ¯ z u= w+w ¯ z+z ¯ w−w ¯ z−z ¯ 1 = . Example 5: For another example we consider the function w = f (z) = 1 z and ask what is the image of a circle on the z plane through the mapping by f (z) on the w plane. From this we conclude that the original circle on the z plane is mapped to a circle on the w plane. leads to the entire z plane mapping to the lower half (π ≤ θw < 2π or v ≤ 0) of the w plane. The fact that we pass from a set of three parameters x0 . ρ by 1 : a= 2x0 2y0 x2 + y 2 1 . y0. d are related to x0 .

10: 1 l2 Mapping of the shaded region on the z-plane to the w-plane through w = f (z) = 1/z 2 . MAPPINGS A related mapping is that given by the function w = f (z) = (z − a) (z − b) 69 which also maps a circle to a circle. shift origin by − 1 which proves that the mapping by w = (z − a)/(z − b) is also a mapping of a circle to a circle.3. All that remains to do is find the value of the scaling parameter λ. the other two merely shift or scale the curve without changing its shape. we perform the following changes of variables: z ′ = z − b : shift origin by b z ′′ = 1 : maps circles to circles z′ w = λz ′′ + 1 : linear scaling by λ. To see this.6. A generalization of this is the mapping produced by the so-called M¨bius funtion (see Problem 6). By requiring that the final expression is equal to the original one. o 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 1111111111111111111111111111 0000000000000000000000000000 y z−plane w−plane v 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 
11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 11111111111111 00000000000000 b B 0 C A u c 0 a x l Figure 3. . we find: w = λz ′′ + 1 = λ 1 1 z−a +1=λ +1= ⇒λ =b−a ′ z z−b z−b which completes the argument. since the only relation that involves a mapping is the second change of variables.

y) = (x2 − y 2 ) 1 + (x2 1 1 . with the point at infinity mapping to the origin of the circle. The function z 2 produces a full circle centered at the origin. consider the region on the z plane shown in Fig.70 CHAPTER 3. This curve is mapped through f (z) to a curve on the w plane given by: u(t) = r 2 e2t v(t) = r 2 e2t 2 /a2 cos(2t) 1 + sin(2t) 1 − 1 r 4 e4t2 /a2 1 r 4 e4t2 /a2 (3.10 (see also Problem 7). v(x. consisting of the entire upper half plane except for a semicircle of radius l centered at the origin. y(t) = ret 2 /a2 sin(t). with radius 1/l. The net result from applying the two mappings is shown in Fig. FUNCTIONS OF COMPLEX VARIABLES Having studied both the mapping of the function 1/z and the function z . this curve is illustrated in Fig. 3. y) + iv(x. −π < t < π (3. y) we find for the real and imaginary parts of w: u(x.35) where r. y) = 2xy 1 − 2 2 )2 +y (x + y 2)2 As an illustration of the mapping produced by f (z). The mapping of this by the function 1/z produces a semicircle centered at the origin. with radius 1/l2 . 3.36) 2 /a2 . that is the function: 2 w = f (z) = 1 z2 As an exercise.11.10. we consider a curve on the z plane described by the parametric form: x(t) = ret 2 /a2 cos(t). we can consider the mapping produced by the composition of the two functions. a are real positive constants. Example 6: We next study the mapping implied by a function which ivolves both powers and inverse powers of z: f (z) = z 2 + 1 z2 Expressing the complex variable z in terms of its real ad imaginary parts z = x + iy leads to: f (z) = (x2 − y 2) 1 + (x2 1 + y 2 )2 + i2xy 1 − (x2 1 + y 2 )2 and setting as usual the function f (z) equal to w = u(x. 3.

MAPPINGS 71 Mapping under f(z) = z^2+1/z^2 1 In[2]:= ParametricPlot 2 Exp t ^ 2 5 Cos t . (3.6.35) and its map on the w plane through w = z 2 √ 1/z 2 . Pi. in + this example we have chosen r = 2 and a = 5. t. (3. Pi ^2 1 1 2 Exp t ^ 2 5 ^4 .5 -10 -7. Pi 2 1 -12.3. .5 -5 -2.5 -1 -2 Out[2]= Graphics In[3]:= ParametricPlot 2 Exp t ^ 2 5 Cos t ^ 2 2 Exp t ^ 2 5 Sin t 2 2 Exp t ^ 2 5 Cos t 2 Exp t ^ 2 5 Sin t 1 1 2 Exp t ^ 2 5 ^ 4 . t. described by Eq. 60 40 20 50 -20 -40 -60 Out[3]= 100 150 200 Graphics Figure 3.36).11: Illustration of the curve on the z plane described by Eq. Pi. Exp t ^ 2 5 Sin t .

72 CHAPTER 3. In this case the quantity under the square root is always positive and the argument of the natural logarithm is positive only for the solution with the plus sign in front of the square root. the sets of curves on the z plane that correspond to vertical or horizontal lines on the w plane can be drawn. y) = sinh(x) sin(y) We investigate first the types of curves on the z plane that produce vertical (u = u0 : constant) or horizontal (v = v0 : constant) lines on the w plane. the quantity under the logarithm is positive for either choice of sign in front of the square root. for horizontal lines we have: v0 = ex − e−x 2v0 sin(y) ⇒ ex − − e−x = 0 ⇒ 2 sin(y) 1/2 2 v0 v0 ± +1 ⇒ x = ln  sin(y) sin2 (y) u2 u0 0 ± −1 ⇒ x = ln  cos(y) cos2 (y)  1/2   2 v0 v0 ± ex = +1 sin(y) sin2 (y)  1/2   where again we have used the second-order polynomial solutions. . for these values of y. v(x. which is no surprise since the hyperbolic cosine is the sum of exp(z) and exp(−z).11. From these equations. These sets of curves are quite similar to the curves that get mapped to vertical or horizontal lines through the function exp(z). For vertical lines we have: ex + e−x 2u0 u0 = cos(y) ⇒ ex − + e−x = 0 2 cos(y) which can be turned into a second-order polynomial in exp(x) by multiplying through by exp(x). 3. y) = cosh(x) cos(y). which are closely related. We begin by examining the mapping implied by the hyperbolic cosine: w = cosh(z) = ex (cos(y) + i sin(y)) e−x (cos(y) − i sin(y)) ez + e−z = + 2 2 2 ⇒ u(x. From this statement. Example 7: As our next example we will consider the mappings implied by the hyperbolic and trigonometric functions. FUNCTIONS OF COMPLEX VARIABLES which is also illustrated in Fig. Similarly. from which we obtain the solutions: u0 u2 0 ex = ± −1 cos(y) cos2 (y) 1/2 The last expression is meaningful only for values of y which make the quantity under the square root positive.

from the relation: π ez+iπ/2 + e−z−iπ/2 iez − ie−z ⇒ cosh(z ′ ) = = = i sinh(z) 2 2 2 we conclude that the function sinh(z) can be viewed as the combination of two functions three functions: the first consists of a shift of the origin by −iπ/2. just as we had found for the case of the mapping produced by exp(z). we find y2 x2 + = sin2 (u) + cos2 (u) = 1 2 2 cosh (v) sinh (v) . For instance. extending over the entire x axis and covering a width of 2π on the y axis. is mapped onto the entire w plane. because of the relations z′ = z + i cos(iz) = cosh(z). y = cos(u) sinh(v) From the last two relations. sin(iz) = i sinh(z) we can derive the mappings due to the trigonometric sine and cosine functions in terms of the mappings due to the hyperbolic sine and cosine functions. the values z and −z get mapped to the same value of w. in other words. Similarly. Example 8: Finally. using the properties of the trigonometric and hyperbolic sine and cosine functions. the strip of width 2π in y gets mapped twice to the w plane. w = sin−1 (z) Taking the sine of both sides of this equation we obtain: (cos(u) + i sin(u))e−v − (cos(u) − i sin(u))ev z = sin(w) = sin(u + iv) = = 2i sin(u) cosh(v) + i cos(u) sinh(v) ⇒ x = sin(u) cosh(v). From the above analysis. we consider the mapping due to an inverse trigonometric function. MAPPINGS 73 we conclude that a strip of values on the z plane.6. z ′ → z ′′ = cosh(z ′ ). so it is straightforward to derive the mapping of the hyperbolic sine. In particular. since multiplication by −i involves a rotation by π/2. In the present case. the second is the hyperbolic cosine and the third is a multiplication by −i: π z → z ′ = z + i . z ′′ → w = −iz ′′ = sinh(z) 2 But we know what each of these functions does in terms of a mapping.3. we can easily obtain the mappings due to the other hyperbolic and trigonometric functions. we conclude that under w = sinh(z) a strip of the z plane of width 2π on the x axis and extending over the entire y axis gets mapped twice to the entire w plane.

74 CHAPTER 3. 3.12. on the w plane is illustrated in Fig. FUNCTIONS OF COMPLEX VARIABLES y ellipse −1 0 y hyperbola 1 2a 2b x −1 0 1 2a 2b x Figure 3. as illustrated in Fig.12. y2 x2 = cosh2 (v) − sinh2 (v) = 1 − sin2 (u) cos2 (u) which is the equation of a hyperbola with length scales 2a = 2| sin(u)| on the x axis and 2b = 2| cos(u)| on the y axis. because we are really interested in obtaining the values of w from z.13.12: Illustration of the basic geometric features of the ellipse and the hyperbola. respectively. Setting v = v0 : constant. Similarly. This example deserves closer scrutiny. as illustrated in Fig. we see that confocal ellipses on the z plane with foci at z = ±1 get mapped to straight horizontal lines on the w plane. 3. by expressing z as a function of w. so we examine the expression of w in terms of z in more detail: z = sin(w) = eiw eiw − e−iw ⇒ eiw − 2iz − e−iw = 0 ⇒ 2i √ √ = iz ± 1 − z 2 = i z ± z 2 − 1 where we have used the solution of the second-order polynomial equation to obtain the last exrpession. 3. Setting u = u0 : constant. This may have hidden problems with multivaluedness. but we derived the relations between x and y by going in the inverse direction. which is the equation for an ellipse with extend 2a = 2 cosh(v) on the x axis and 2b = 2| sinh(v)| on the y axis. The images of ellipses and hyperbolae on the z plane through w = sin−1 (z) to horizontal and vertical lines. With this. we obtain the explicit expression of w as . that is. we see that confocal hyperbolae on the z plane with foci at z = ±1 get mapped to straight vertical lines on the w plane.

This example deserves closer scrutiny, because we are really interested in obtaining the values of w from z, but we derived the relations between x and y by going in the inverse direction, that is, by expressing z as a function of w. This may have hidden problems with multivaluedness, so we examine the expression of w in terms of z in more detail:

z = sin(w) = (e^{iw} − e^{−iw})/(2i) ⇒ e^{2iw} − 2iz e^{iw} − 1 = 0 ⇒ e^{iw} = iz ± √(1 − z²) = i (z ± √(z² − 1)),

where we have used the solution of the second-order polynomial equation to obtain the last expression. With this, we obtain the explicit expression of w as a function of z:

w = (1/i) ln[ i (z ± √(z² − 1)) ] = π/2 + (1/i) ln( z ± √(z − 1) √(z + 1) ),

from which we conclude that we do need to worry about branch points and branch cuts for this situation: there are two square root functions involved as well as the logarithm function, all of which introduce branch points. The two square roots imply that the points z = ±1, ∞ are branch points. The branch points of the logarithm correspond to the values zero and ∞ of its argument. By inspection, we see that the expression under the logarithm above is never zero, because

z ± √(z² − 1) = 0 ⇒ z² = z² − 1,

which cannot be satisfied for any value of z. On the other hand, for z → ∞ we have:

z ± √(z² − 1) = z ± z (1 − 1/z²)^{1/2} = z ± z (1 − 1/(2z²) + ···),

where we have used the binomial expansion to get the last result. With the plus sign, the last expression becomes

z + z (1 − 1/(2z²) + ···) ≈ 2z   for z → ∞,

while with the minus sign, it becomes

z − z (1 − 1/(2z²) + ···) ≈ 1/(2z)   for z → ∞.

Thus, the values where the argument of the logarithm becomes zero or infinity are both approached for z → ∞; in other words, the two branch points for the logarithm are both at z → ∞. From this analysis, we conclude that the branch cut we need for this function must go through the points z = ±1 as well as through the point z → ∞. A possible choice for a branch cut is shown in Fig. 3.13: it consists of two parts, located on the x axis from −∞ to −1 and from +1 to +∞ on the z plane, that is, −∞ < x ≤ −1 and +1 ≤ x < +∞. The images of these lines on the w plane are the vertical lines at u = ±π/2. The entire z plane is mapped onto the strip of values extending over the entire range of v and having width 2π on the u axis on the w plane. This is precisely what we expect from the preceding analysis of the mapping of the function sin(z)!

Figure 3.13: Mapping of ellipses and hyperbolae with foci at z = ±1 on the z plane to horizontal or vertical lines on the w plane through w = sin⁻¹(z). The wavy lines indicate the branch cuts.
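The explicit expression for w(z) can be compared directly with the principal branch of the inverse sine. The following minimal check uses Python's standard cmath module (the three sample points are arbitrary choices) and confirms that −i ln(iz + √(1 − z²)) reproduces asin(z), with the real part of the result staying inside the strip |u| ≤ π/2.

```python
import cmath

for z in (0.3 + 0.7j, -2.5 + 0.1j, 1.8 - 3.2j):       # arbitrary sample points
    w = -1j * cmath.log(1j * z + cmath.sqrt(1 - z * z))
    print(abs(w - cmath.asin(z)), abs(w.real) <= cmath.pi / 2)
# differences of order 1e-16, and Re(w) always lies in the strip |u| <= pi/2
```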

Problems

1. Using the Taylor series expansions for exp(ix) and exp(y), show that the series which defines the exponential of z = x + iy is that given by Eq. (3.3).

2. Prove the relations between the hyperbolic and trigonometric functions of z and iz given in Eqs. (3.22) and (3.23).

3. The function w = f(z) = (az + b)/(cz + d) and the mapping it produces are named after Möbius. Show that this function maps circles to circles and lines to lines. Also show that the combination of two such functions, that is, w = (a2 z′ + b2)/(c2 z′ + d2) with z′ = (a1 z + b1)/(c1 z + d1), is another Möbius function, and find the explicit expression for w as a Möbius function in terms of the parameters that enter in the two functions of the combination.

4. From the proof that the function f(z) = 1/z maps a circle in the z plane to a circle in the w plane, determine the radius and the position of the center of the circle on the w plane in terms of the radius ρ and the position (x0, y0) of the center of the circle on the z plane. Comment on what happens if (x0² + y0²) = ρ².

5. By following the same steps as those that led to Eq. (3.19), prove the more general result, Eq. (3.35).

6. Find the mapping of the curve described in parametric form by Eq. (3.34) through the function f(z) = z + 1/z.

7. Show explicitly that the function w = f(z) = 1/z² maps the region on the z plane shown in Fig. 3.10 to the full circle on the w plane centered at the origin, by finding the images of a few representative points on the boundary of the z-plane region.

8. We are interested in the mapping of the z plane to the w plane through w = sin⁻¹(z), which was discussed in the text. For this mapping to be meaningful, we must introduce a branch cut which includes the points z = ±1 and ∞. Consider the branch cut consisting of the x axis from −1 to +∞, that is, −1 ≤ x < +∞. Is this an appropriate branch cut for the function of interest? What is the image of the branch cut on the w plane, and what is the region of the w plane to which the entire z plane is mapped? Does this region have the same area as the region corresponding to the branch cut shown in Fig. 3.13?

Chapter 4
Contour Integration

4.1 Complex integration

By analogy to the case of integrals of real functions, we define the integral of a complex function as:

∫_{z_i}^{z_f} f(z) dz = lim_{Δz→0} Σ_{z=z_i}^{z_f} f(z) Δz   (4.1)

There is a qualitative difference in performing definite integrals of functions of real variables and functions of complex variables. In the case of real variables, a definite integral is the integral between two points on the real axis, x_i and x_f, and there is only one way to go from x_i to x_f. In the case of complex variables z_i = x_i + i y_i and z_f = x_f + i y_f there are many (actually infinite) ways of going from one number to the other. It is then possible that the definite integral between z_i and z_f depends on the path (called "contour") followed on the complex plane. Recall that the derivative evaluated at z = z_0, df(z_0)/dz, in general depends on the direction of approach to z_0, but for analytic functions it does not depend on the direction of approach. An analogous statement applies to the integral of f(z). Indeed, this is the case in general: the integral ∫_{z_i}^{z_f} f(z) dz depends on the contour along which the integration is performed. But it is not always the case.
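The following minimal numerical sketch (assuming numpy is available; the endpoints, paths, and test functions are arbitrary choices) illustrates this point: two different contours joining the same endpoints give the same value for an analytic integrand such as z², but different values for the non-analytic integrand conj(z).

```python
import numpy as np

t = np.linspace(0.0, 1.0, 20001)

# two different paths from z = 0 to z = 1 + i
z1, dz1 = t + 1j * t,     (1.0 + 1j) * np.ones_like(t)   # straight line
z2, dz2 = t**2 + 1j * t,  2.0 * t + 1j                   # parabolic path

for f in (lambda z: z**2, lambda z: np.conj(z)):
    I1 = np.trapz(f(z1) * dz1, t)      # trapezoidal approximation of the contour integral
    I2 = np.trapz(f(z2) * dz2, t)
    print(I1, I2)
# f(z) = z**2    : both paths give (-2 + 2i)/3, i.e. the value is path independent
# f(z) = conj(z) : the two results differ, i.e. the integral depends on the contour
```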

For analytic functions integrated along a closed contour, the integral does not depend on the contour and is zero if the contour does not enclose any singularities of f(z). This is known as Cauchy's Integral Theorem (CIT for short).

The simplest way to perform an integration of a function of a complex variable along a contour is in a parametric fashion: suppose that the contour C can be described by z = z(t), with t a real variable. Then

dz = (dz/dt) dt = ż(t) dt = (ẋ(t) + i ẏ(t)) dt ⇒

∫_C f(z) dz = ∫_{t_min}^{t_max} [u(t) + i v(t)] ż(t) dt = ∫_{t_min}^{t_max} [u(t) + i v(t)][ẋ(t) + i ẏ(t)] dt
= ∫_{t_min}^{t_max} [u(t) ẋ(t) − v(t) ẏ(t)] dt + i ∫_{t_min}^{t_max} [u(t) ẏ(t) + v(t) ẋ(t)] dt,

and in this last expression we are dealing with integrals of functions of real variables, which presumably we know how to handle. Although this method is straightforward, it is not always applicable, because there is no guarantee that the variable z can be put in a parametric form along the path of integration.

A useful relation that bounds the value of an integral of a function f(z) is the so-called ML inequality:

| ∫_{z_1}^{z_2} f(z) dz | ≤ max(|f(z)|) ∫_{l_1}^{l_2} dl = M L   (4.2)

where M is the maximum value of |f(z)| along the contour, L is the length of the contour, and l_1, l_2 correspond to the end points of the path z_1, z_2.

Related to the integral of f(z) over a contour are the notions of a closed contour, that is, one for which the starting and ending points are the same; of a contour enclosing a simply-connected region, which is called a "simple contour" and has a unique sense of traversing; and of a contour enclosing a multiply-connected region, which we will call a "complicated contour" and which has no unique sense of traversing. The term complicated refers to the fact that such a contour intersects itself, which is a consequence of the fact that it contains a multiply-connected region; the points of intersection create problems with the sense of traversing, since at each intersection there is more than one choice for the direction of the path. Examples of such contours are shown in Fig. 4.1. Another important notion is that of contour deformation: it is often convenient to change the original contour along which we need to perform the integration into a simpler one, as long as this deformation does not change the result.

Figure 4.1: Illustration of (a) a "complicated" contour enclosing a multiply connected region, which does not have a unique sense of traversing; (b) a "simple" contour enclosing a simply connected region with a unique sense of traversing, in this example counter-clockwise; and (c) contour deformation, as described in the text.

How useful is the above argument? In other words, can we use contour deformation in a practical sense? The answer is a resounding yes, as numerous examples will demonstrate in the following sections of this chapter. As a first example consider the following integral

∮_C (1/z) dz,

where C is an arbitrary simple closed contour around the point z = 0 where the integrand is singular. For example, consider an irregularly shaped simple contour such as the one shown in Fig. 4.1(b). Let us denote the integral of the function around the contour as I_0. We perform contour deformation as indicated in Fig. 4.1(c). The deformed contour does not enclose the singularity z = 0, so the function 1/z is analytic everywhere inside and on it, and we conclude by Cauchy's Integral Theorem that the value of the integral over the new contour vanishes:

I_0 + I_1 + I_2 + I_3 = 0.

But in the limit where the inside and outside contours are fully closed, we will have I_3 = −I_1, because along the two straight segments f(z) takes the same values while dz_1 = −dz_3; the integrals along the straight line segments cancel each other. This leads to I_0 = −I_2. This means that we only have to evaluate the integral I_2 in order to find the value of I_0. We can then evaluate the original integral by calculating the integral along the inner circular contour; since the path for this new integral is a circle of radius R, it can be done by parametric integration, which reduces the problem to one of doing integrals of real functions as described above.

Along this circle we have:

z = R e^{iθ}, 0 ≤ θ < 2π ⇒ dz = i R e^{iθ} dθ,

and the contour is traversed in the clockwise sense, which means θ goes from 2π to 0. Substituting these relations for the integral along the circular path we obtain:

∮_{C_R} (1/z) dz = ∫_{2π}^{0} (1/(R e^{iθ})) i R e^{iθ} dθ = −2πi,

which shows that the original integral is

I_0 = ∮_C (1/z) dz = 2πi.   (4.3)

This is a very useful result, because it holds for an arbitrary simple closed contour which encloses the singularity of the integrand at z = 0. By exactly the same procedure, we can prove that for n a positive integer and for an arbitrary simple closed contour C enclosing z = 0:

∮_C (1/z^n) dz = 2πi δ_{1n}   (4.4)

where δ_{ij} is the Kronecker delta, equal to one for i = j and zero otherwise.

We next prove Cauchy's Integral Theorem. The basic setup of the proof is illustrated in Fig. 4.2.

Figure 4.2: Setup for the proof of Cauchy's Integral Theorem.

The function f(z) is assumed to be analytic on and inside the simple closed contour C, and S is the total area enclosed by the contour. In general we have:

∮_C f(z) dz = ∮_C [u(x, y) + i v(x, y)][dx + i dy] = ∮_C [u(x, y) dx − v(x, y) dy] + i ∮_C [u(x, y) dy + v(x, y) dx]   (4.5)

The various terms in the last expression give: the first term,

∮_C u(x, y) dx = ∫_{x_min}^{x_max} u_l(x, y) dx + ∫_{x_max}^{x_min} u_u(x, y) dx = ∫_{x_min}^{x_max} [u_l(x, y) − u_u(x, y)] dx = −∬_S (∂u/∂y) dx dy   (4.6)

where u_l(x, y) and u_u(x, y) refer to the values that u(x, y) takes in the lower and upper parts of the contour, respectively, as those are identified in Fig. 4.2. The second term:

∮_C v(x, y) dy = ∫_{y_min}^{y_max} v_r(x, y) dy + ∫_{y_max}^{y_min} v_l(x, y) dy = ∫_{y_min}^{y_max} [v_r(x, y) − v_l(x, y)] dy = ∬_S (∂v/∂x) dx dy   (4.7)

where v_l(x, y) and v_r(x, y) refer to the values that v(x, y) takes in the left and right parts of the contour, respectively. Combining these two terms we obtain for the real part of the last line in Eq. (4.5):

∮_C [u(x, y) dx − v(x, y) dy] = ∬_S [ −∂u/∂y − ∂v/∂x ] dx dy   (4.8)

Similarly we can derive for the imaginary part of the last line in Eq. (4.5):

∮_C [u(x, y) dy + v(x, y) dx] = ∬_S [ −∂v/∂y + ∂u/∂x ] dx dy   (4.9)

Therefore, if f(z) is analytic on and inside the contour C, the Cauchy-Riemann relations, Eq. (3.28),

∂u/∂x = ∂v/∂y,   ∂u/∂y = −∂v/∂x,

make both integrands in Eqs. (4.8) and (4.9) vanish, and we conclude that its integral over C vanishes identically.

As a counter example, we consider the integral of a non-analytic function over a simple closed contour C, which encloses a total area S. Our example concerns the prototypical non-analytic function that we discussed before, namely z̄. Using the fact that

z̄ = x − iy ⇒ u(x, y) = x, v(x, y) = −y ⇒ ∂u/∂x = 1, ∂v/∂y = −1, ∂u/∂y = 0, ∂v/∂x = 0,

its integral around C is:

∮_C z̄ dz = ∬_S [ −∂u/∂y − ∂v/∂x ] dx dy + i ∬_S [ −∂v/∂y + ∂u/∂x ] dx dy = 2iS,

where we have used Eqs. (4.8) and (4.9), which are valid for any function.

A corollary of the CIT is that the integral of an analytic function between two points A and B on the complex plane is independent of the path we take to connect A and B, as long as A, B and the entire path lie within the domain of analyticity of the function. To prove this, consider two different paths going from A to B (see Fig. 4.2, labeled path 1 and path 2); if one of the paths is traversed in the opposite sense (from B to A, rather than from A to B), the combination of these paths forms a closed contour, and the result of the integration over the entire closed contour is zero. Therefore, the integration over either path gives the same result.

A notion related to the CIT is the Cauchy Integral Formula (CIF for short): for a function f(z) which is analytic everywhere inside the region enclosed by a simple contour C, which contains the point z_0, the following relation holds:

f(z_0) = (1/(2πi)) ∮_C f(z)/(z − z_0) dz   (4.10)

The proof of this statement proceeds by considering a circular contour of constant radius R around z_0,

C_R : z = z_0 + R e^{iθ} ⇒ dz = i R e^{iθ} dθ, 0 ≤ θ < 2π,

and letting R → 0, in which case the integral of f(z)/(z − z_0) around this contour is:

lim_{R→0} ∮_{C_R} f(z)/(z − z_0) dz = lim_{R→0} ∫_0^{2π} [ f(z_0 + R e^{iθ}) / (R e^{iθ}) ] i R e^{iθ} dθ = 2πi f(z_0).

By using contour deformation we can easily prove that this result holds for any simple contour C that encloses z_0, as illustrated in Fig. 4.3.

Figure 4.3: Setup for the proof of the Cauchy Integral Formula.

The deeper meaning of Cauchy's Integral Formula is the following: if an analytic function is known at the boundary of a region (the contour of integration C), then it can be calculated everywhere within that region by performing the appropriate contour integral over the boundary; that is, the CIF gives the value of f(z_0), if z_0 is in the interior of the contour, by using only values of the function on the contour C. This is consistent with the statements that we made earlier, namely that harmonic functions are fully determined by the boundary values, and that an analytic function has real and imaginary parts which are harmonic functions. The CIF basically provides the recipe for obtaining all the values of an analytic function (and hence of the harmonic functions which are its real and imaginary parts) when the boundary values (the values on the contour C) are known.

Taking the CIF one step further, we can evaluate the derivatives of an analytic function by differentiating the expression in Eq. (4.10) with respect to z_0:

f′(z_0) = d/dz_0 [ (1/(2πi)) ∮_C f(z)/(z − z_0) dz ] = (1/(2πi)) ∮_C f(z)/(z − z_0)² dz,

and similarly for higher orders:

f^{(n)}(z_0) = (n!/(2πi)) ∮_C f(z)/(z − z_0)^{n+1} dz   (4.11)
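Equations (4.10) and (4.11) lend themselves to a direct numerical test. The sketch below (assuming numpy is available; the circle radius, the number of sample points, and the choice f(z) = e^z are arbitrary) evaluates the contour integral on a circle around z_0 and recovers f(z_0) and its third derivative.

```python
import numpy as np
from math import factorial

def cauchy_derivative(f, z0, n=0, R=1.0, N=4096):
    """Approximate Eq. (4.11): the n-th derivative of f at z0 from a contour integral
    over a circle of radius R centered at z0 (n = 0 reproduces the CIF, Eq. (4.10))."""
    theta = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
    z = z0 + R * np.exp(1j * theta)
    dz_dtheta = 1j * R * np.exp(1j * theta)
    integral = np.mean(f(z) / (z - z0) ** (n + 1) * dz_dtheta) * 2.0 * np.pi
    return factorial(n) / (2j * np.pi) * integral

z0 = 0.3 + 0.4j
print(cauchy_derivative(np.exp, z0, n=0), np.exp(z0))   # recovers f(z0)
print(cauchy_derivative(np.exp, z0, n=3), np.exp(z0))   # every derivative of exp equals exp
```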

This is also consistent with an earlier statement that we had made, namely that all derivatives of an analytic function exist. We should caution the reader that in the case of derivatives, we cannot use a simple contour of constant radius R and let R → 0 to obtain explicit expressions, but we must perform the contour integration over C.

4.2 Taylor and Laurent series

We next consider a series expansion of functions of complex variables, assuming that we are interested in the behavior of the function near the point z_0 on the complex plane:

f(z) = Σ_{n=0}^{∞} a_n (z − z_0)^n   (4.12)

This is the power series expansion of f(z) at a reference point z_0, with the terms in the series being (z − z_0)^n, n a non-negative integer. By analogy to what we discussed for power series expansions of functions of real variables, we can think of the convergence of the series in absolute terms, and since |z − z_0| is a real quantity,

|z − z_0| = √( (x − x_0)² + (y − y_0)² ),

we can apply the tests we learned for the convergence of real series. Using our knowledge of convergence criteria, we conclude that the following statements must hold for the power series on the right-hand side of Eq. (4.12):
- if it converges for z_1 and |z_2 − z_0| < |z_1 − z_0| then it converges for z_2;
- if it diverges for z_3 and |z_4 − z_0| > |z_3 − z_0| then it diverges for z_4.
In particular the comparison test, Eq. (2.19), is useful in establishing the two statements mentioned. These lead us to the definition of the radius of convergence: it is the smallest circle of radius R = |z − z_0| that encloses all points around z_0 for which the series converges. In the limit R → ∞ the series converges everywhere on the complex plane; in the limit R = 0 the series diverges everywhere on the complex plane.

Example 4.1: To illustrate these points, we consider the familiar binomial expansion involving a complex number z:

(a + z)^p = a^p + p a^{p−1} z + [p(p−1)/2!] a^{p−2} z² + [p(p−1)(p−2)/3!] a^{p−3} z³ + ···

assuming that it convergences absolutely and uniformly inside the disk |z − z0 | < R.12). we have put the function f (z) = (a+z)p in an infinite series form of the type shown in Eq. Since (z − z0 )n is continuous.4. Defining ζ = z/a. the following statements are true (some for obvious reasons): 1. The ratio of the (n + 1)th to the nth term is p(p − 1) · · · (p − n)ζ n+1 (n + 1)! p(p − 1) · · · (p − (n − 1))ζ n n! −1 = p−n ζ n+1 and the ratio test requires that this last expression be bounded by a positive number smaller than unity for all n > N. Next.12). (4. which is infinitely differentiable within the disk of convergence. we ask what is the radius of convergence of this series. (4. but n→∞⇒ p−n →1 n+1 leaving as the condition for the convergence of the series |ζ| < 1 ⇒ |z| < |a| which gives the radius of convergence R = |a| of the series around z0 = 0. with well defined coefficients given by an = p(p − 1) · · · (p − (n − 1)) p−n a n! around the point z0 = 0.20).13) We then apply the ratio test.2. f (z) is also continuous. For the power series representation of a function f (z). (2. Eq. TAYLOR AND LAURENT SERIES 87 In this expression. we obtain: (a + z)p = ap 1 + pζ + p(p − 1) 2 p(p − 1)(p − 2) 3 ζ + ζ +··· 2! 3! (4. 2. as given in Eq. 3. the power series represents an analytic function. to obtain a new convergent series. 4. . For R > 0. The power series representation about z0 is unique inside the disk. since the resulting series has the same ratio of the (n + 1)th to the nth term as the original series. We can differentiate and integrate the series term by term inside the disk. to determine the radius of convergence of this series.

the second to a function which is not analytic in the neighborhood of z0 . known as the Taylor and the Laurent series expansions. (2. We present next two specific power series expansions of functions of complex variables. but is analytic in a circular annulus around z0 . by applying the CIF.10) (note that in the following proof we have replaced the variable of integration by the symbol ζ and we have called the point where the function is evaluated simply z): f (z) = 1 2πi f (ζ) 1 dζ = (ζ − z) 2πi  f (ζ) C C (1 − z−z0 )(ζ ζ−z0 − z0 ) 2 dζ  1 = 2πi C = f (z0 ) + (z − z0 )f ′ (z0 ) + f (ζ)  z − z0 1+ (ζ − z0 ) ζ − z0 1 (z − z0 )2 f ′′ (z0 ) + · · · 2! z − z0 + ζ − z0 + · · · dζ where in the next-to-last step we have used the geometric series summation 1 = 1 + t + t2 + t3 + · · · 1−t (see chapter 1. with t = (z − z0 )/(ζ − z0 ) which is valid for |t| = |z − z0 |/|ζ − z0 | < 1 ⇒ |z − z0 | < |ζ − z0 |. it should have Taylor series expansion around this point. and for the last step we have used the CIF for the derivatives of f (z).17)). Eq. Eq. This can be actually derived for a function f (z) which is analytic everywhere on and inside a simple closed contour C. We discuss first the Taylor series expansion. Example 4. Eq. We first notice that this function is analytic in a disk of non-zero radius around the point of interest. and any method of obtaining this series is acceptable. The first refers to a function that is analytic in a disk around a point z0 .88 CHAPTER 4.2: Suppose we want to calculate the power series of f (z) = sin−1 (z) about the point z0 = 0. However. every function has a unique power series representation about z0 when it is analytic in a disk of radius R around it. Conversely. CONTOUR INEGRATION 5. such as evaluating integrals. Therefore. it is messy to try to take derivatives of the inverse sine function. (4.11). (4. so we resort to a trick: since any way . Both series are extremely useful in representing functions of complex variables and in using those representations for practical applications.

of obtaining the power series expansion is acceptable for an analytic function, we call the original function w and examine its inverse, which is simply the sine function:

w = sin⁻¹(z) ⇒ z = sin(w) ⇒ dz/dw = cos(w) = √(1 − sin²(w)) = √(1 − z²),

from which we can derive the following relation:

dw/dz = 1/√(1 − z²) = [1 + (−z²)]^{−1/2},

and in the last expression we recognize the familiar binomial of Eq. (4.13) with a = 1, p = −1/2, ζ = −z²; the condition |ζ| < 1 is certainly satisfied for z in the neighborhood of zero. When expanded in powers of (−z²) and integrated term by term, this leads to:

w = z + (1/6) z³ + (3/40) z⁵ + (5/112) z⁷ + ···

Note that we can apply integration term by term to the series we obtain for dw/dz because the binomial series converges uniformly for |ζ| = |−z²| < 1. Moreover, the constant of integration which appears when we integrate the series for dw/dz in order to obtain the series for w can be determined by observing that from the definition of w we have z = 0 ⇒ w = 0, and hence the constant of integration must be zero.

Another important power series expansion is the Laurent series; its derivation makes use of the circular annulus around the point z_0 shown in Fig. 4.4.

Figure 4.4: Circular annulus around the point z_0 on the complex plane.
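The coefficients obtained in Example 4.2 can be confirmed symbolically. The sketch below (assuming sympy is available) asks directly for the series of sin⁻¹(z) and also reproduces it by integrating the binomial series for dw/dz term by term, as done above.

```python
import sympy as sp

z = sp.symbols('z')
print(sp.series(sp.asin(z), z, 0, 9))
# z + z**3/6 + 3*z**5/40 + 5*z**7/112 + O(z**9)

# the same coefficients, from integrating the binomial series for dw/dz term by term
dw_dz = (1 - z**2) ** sp.Rational(-1, 2)
print(sp.integrate(sp.series(dw_dz, z, 0, 8).removeO(), z))
```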

taken as usual in the counter-clockwise sense: f (ζ) dζ. we obtain: f (z) = 1 2πi 1 f (ζ) dζ − (ζ − z) 2πi f (ζ) dζ (ζ − z) (4. The first integral can be written as: I1 = 1 2πi 1 f (ζ) dζ = (ζ − z) 2πi f (ζ) C1 C1 (1 − z−z0 )(ζ ζ−z0 − z0 ) dζ (4.15) and since for z in the annulus and ζ on the C1 contour we have z − z0 <1 ζ − z0 we can apply the geometric series expansion to obtain 1 I1 = 2πi f (ζ)  z − z0 1+ (ζ − z0 ) ζ − z0 f (ζ) 1 dζ = (ζ − z) 2πi  2 C1 z − z0 + ζ − z0 + · · · dζ  (4.90 CHAPTER 4. the function f (z) is analytic everywhere in and on the resulting closed contour. In the limit where the two line segments lie on top of each other. i = 1.17) . as illustrated in Fig. both centered at z0 . 4.16) For the second integral we have: I2 = − 1 2πi f (ζ) C2 C2 (1 − ζ−z0 )(z z−z0 − z0 ) dζ (4.14) C1 C2 In the last expression. and inner radius ρ2 on the circular contour C2 . and therefore we can apply the CIF for this situation. Consider a function f (z) which is analytic on a circular annulus with outer radius ρ1 on the circular contour C1 . each contour is traversed in the counter-clockwise sense and the integration over C2 has an overall minus sign because that part was traversed in the clockwise direction in the joined contour. CONTOUR INEGRATION Another important power series expansion is the Laurent series. hence their contributions cancel out.4. We deal with the two integrals that appear on the right-hand side of the above equation separately. 2 (ζ − z) Ci We notice that if we join the two circular contours by two line segments traversed in the opposite direction. We will employ the following integrals on the contours C1 and C2 .

2.19) + · · · dζ  (4. as can be easily justified by contour deformation. y II III I 0 1 2 x Figure 4. where the function is analytic on a disc or on a circular annulus.4. III. Example 4.3: To illustrate these concepts we consider the function f (z) = 1 1 1 = + (z − 1)(z − 2) 1−z z−2 . we arrive at the general expression f (z) = 1 ∞ (z − z0 )n 2πi n=−∞ f (ζ) dζ (ζ − z0 )n+1 (4. (4. II.19) is known as the Laurent power series expansion and contains both negative and positive powers of (z − z0 )n . The expression in Eq.5: Example of Taylor and Laurent series expansions of the function f (z) = 1/[(z − 1)(z − 2)] around z = 0: the complex plane is separated into the three regions labeled I.18) C with C a closed contour that lies entirely within the circular annulus bounded by the circular contours C1 and C2 . TAYLOR AND LAURENT SERIES and since for z in the annulus and ζ on the C2 contour we have ζ − z0 <1 z − z0 we can apply the geometric series expansion to obtain 1 I2 = 2πi f (ζ)  ζ − z0 1+ (z − z0 ) z − z0  91 C2 ζ − z0 + z − z0 2 Combining these two results.

III : 1+ + 2 +··· = = 1 1−z z z z (1 − z )(−z) I : The second fraction. Regions II and III can be considered as circular annuli which contain a singularity at z = 1. The first fraction. 1/(z − 2). we need to separate the complex plane in three regions in which we can easily determine the analyticity of the function. is analytic in region I. so we expect a Taylor expansion for it which is easily obtained from the geometric series. where both 1/(1 − z) and 1/(z − 2) are analytic. in both regions |z| > 1 ⇒ 1/|z| < 1. — region II : 1 < |z| < 2. These are: — region I : |z| < 1. so we expect a Laurent expansion. In each region. To answer this question. so we expect a Laurent expansion. we can determine separately the expansion of the two fractions that appear in f (z). since in this region |z| < 1. we obtain: 1 1 1 + z2 1 − +··· +z 1− 2 4 8 1 1 1 1 1 II : f (z) = · · · − 2 − − − z − z 2 + · · · z z 2 4 8 1 1 1 III : f (z) = (−1 + 2) 2 + (−1 + 22 ) 3 + · · · + (−1 + 2n−1 ) n + · · · z z z I : f (z) = . — region III : 2 < |z|. which encloses a singularity of both 1/(1 − z) and of 1/(z − 2). since in these regions |z| < 2 ⇒ |z/2| < 1.92 CHAPTER 4. 1/(1 − z) and 1/(z − 2). II : 1 −1 z −1 1 z 2 1+ + = +··· z = z−2 2 (1 − 2 ) 2 2 2 2 1 1 1 2 2 1+ + +··· = = III : 2 z−2 (1 − z )z z z z Combining the results in the different regions. which encloses a singularity of 1/(1 − z). hence if we rewrite the denominator so as to contain the factor (1 − 1/z) we can use the geometric series again to obtain: 1 = 1 + z + z2 + · · · 1−z 1 1 1 −1 1 II. Region III can be considered as a circular annulus which contains a singularity at z = 2. hence if we rewrite the denominator so as to contain the factor (1 − 2/z) we can use the geometric series again to obtain: I. that is. in this region |z| > 2 ⇒ 2/|z| < 1. 1/(1 −z). so we expect a Taylor expansion for it which is easily obtained from the geometric series. CONTOUR INEGRATION and ask what is its power series expansion around z = 0. is analytic in regions I and II.
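The expansions obtained in Example 4.3 can be checked symbolically. The sketch below (assuming sympy is available) reproduces the region I series directly, builds the region II series from the two partial fractions, and obtains the region III series as an expansion at infinity.

```python
import sympy as sp

z = sp.symbols('z')
f = 1 / ((z - 1) * (z - 2))

# Region I (|z| < 1): ordinary Taylor series about z = 0
print(sp.series(f, z, 0, 4))                       # 1/2 + 3*z/4 + 7*z**2/8 + ...

# Region II (1 < |z| < 2): expand 1/(1 - z) in powers of 1/z and 1/(z - 2) in powers of z
part1 = sp.series(1 / (1 - z), z, sp.oo, 4).removeO()
part2 = sp.series(1 / (z - 2), z, 0, 4).removeO()
print(sp.expand(part1 + part2))                    # ... - 1/z**2 - 1/z - 1/2 - z/4 - ...

# Region III (|z| > 2): expansion at infinity, only negative powers, starting at 1/z**2
print(sp.series(f, z, sp.oo, 6))                   # 1/z**2 + 3/z**3 + 7/z**4 + ...
```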

3. z0 is a pole of order m if the coefficient b−m = 0 and coefficients with lower indices n < −m are all zero.3. . are zero. that is. . A pole of order one is also called a simple pole.4..RESIDUE 93 We note that the function has a Taylor expansion in region I. the function is then continuous and analytic everywhere in the neighborhood of z0 . the branch points. b−1 . If we define the value of the function to be the same as the value of the series expansion for z = z0 . TYPES OF SINGULARITIES . We have already encountered one type of removable singularities. z0 is a removable singularity if the coefficients of the negative exponents in the Laurent series. A different type of removable singularity is one in which the function itself is not properly defined at z = z0 (hence the singularity). We give some examples to illustrate these notions. For an example. consider the function f (z) = sin(z) z . but the Taylor series expansion that approximates it near z0 is well defined for all values of z. If the value of the function is assigned to be equal to the constant term of the Taylor series expansion. In this case the function has a Taylor series expansion around z0 . 4. 2. a Laurent expansion with all powers of z in region II and a Laurent expansion with only negative powers of z starting at n = −2 in region III. This problem is eliminated (removed) by the introduction of branch cuts that connect pairs of branch points.3 Types of singularities . then the singularity is eliminated (removed) and the function becomes continuous and analytic. the singularity is due to multivaluedness of the function which results from different values of the function when going around the point z0 more than once.residue With the help of the Laurent series expansion. In such cases. z0 is an essential sigularity if an infinite number of the coefficients of the negative exponents in the Laurent series are non-zero. . the function at z0 is assigned the value of the constant term in the series expansion. we classify the isolated singularities of a function into three categories: 1. b−2 . including z0 .

which for z = 0 is not properly defined: both numerator and denominator are zeroes, so strictly speaking any value is possible for the function. However, if we consider the power series expansion of the numerator and divide by the denominator for finite z, we have:

sin(z) = z − (1/3!) z³ + (1/5!) z⁵ − ··· ⇒ sin(z)/z = 1 − (1/3!) z² + (1/5!) z⁴ − ···,   z : finite,

and the value of the last expression for z = 0 is unity. If we now assign the same value to the function for z = 0, that is, f(0) ≡ 1, then the apparent singularity at z = 0 is removed and the function becomes analytic in the neighborhood of this point.

As far as poles are concerned, the standard example of a pole of order m is the function

f(z) = 1/(z − z_0)^m.

Occasionally, the pole is a little harder to identify, as for example in the case:

f(z) = 1/sin^m(z),

which has a pole of order m at z = 0. To see this, consider the expansion of sin(z) near z = 0, which gives:

sin(z) = z − (1/3!) z³ + ··· ⇒ sin^m(z) = z^m [1 − (1/3!) z² + ···]^m ⇒
f(z) = (1/z^m) [1 − (1/3!) z² + ···]^{−m} → 1/z^m   for z → 0,

which is the behavior we expect for a pole of order m at z = 0.

An example of an essential singularity is the point z = 0 for the function

f(z) = e^{1/z} = 1 + (1/z) + (1/2!)(1/z²) + (1/3!)(1/z³) + ···,

where we have used the familiar power series expansion of the exponential. In the above expansion, all negative powers of z appear; hence we have an infinite number of coefficients of the negative powers which are not zero (a Laurent expansion). For z → 0, each term z^{−m} gives a larger and larger contribution as m increases.
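These three types of behavior are easy to see from computed series. The following sketch (assuming sympy is available) expands sin(z)/z (no negative powers: a removable singularity), 1/sin²(z) (a finite number of negative powers: a pole of order 2), and e^{1/z} via the exponential series (infinitely many negative powers: an essential singularity).

```python
import sympy as sp

z, w = sp.symbols('z w')
print(sp.series(sp.sin(z) / z, z, 0, 6))        # 1 - z**2/6 + z**4/120 + ... : no negative powers
print(sp.series(1 / sp.sin(z)**2, z, 0, 4))     # z**(-2) + 1/3 + z**2/15 + ... : pole of order 2
print(sp.series(sp.exp(w), w, 0, 6).removeO().subs(w, 1 / z))   # 1 + 1/z + 1/(2 z**2) + ...
```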

Another important consequence of the Laurent series expansion is that the integral of a function f(z) around a closed contour C[z_0]¹ which encloses a singularity of the function at z_0 can be evaluated using the coefficients of the Laurent series expansion. Assuming that the Laurent series converges uniformly for z inside a circular annulus around z_0, we have:

f(z) = Σ_{n=−∞}^{∞} b_n (z − z_0)^n ⇒ ∮_{C[z_0]} f(z) dz = Σ_{n=−∞}^{∞} b_n ∮_{C[z_0]} (z − z_0)^n dz,

but we have shown earlier that

∮_{C[z_0]} 1/(z − z_0)^n dz = 2πi δ_{1n},

and of course all the terms (z − z_0)^n with n ≥ 0 give vanishing integrals because they are analytic. This leads to:

∮_{C[z_0]} f(z) dz = 2πi b_{−1} = 2πi (Residue at z_0),

that is, the integral is equal to 2πi times the coefficient of the term (z − z_0)^n with exponent n = −1, to which we give a special name, the "residue" at z = z_0. This is known as the "residue theorem". The usefulness of the theorem lies in the fact that the coefficients in the Laurent expansion may be determined by much simpler methods than by having to evaluate the integrals which appear in the general expression of Eq. (4.19).

[1] In the following we adopt the notation C[z_0] to denote that the contour C encloses the point z_0 at which the integrand has a singularity. Other features of the contour C are indicated as subscripts.

4.4 Integration by residues

The simplest application of the residue theorem involves simple poles, that is, integrands which have simple roots in the denominator. We assume that the integrand can be written as a ratio of two functions, f(z) and g(z), such that f(z) is analytic inside the region enclosed by C and g(z) has simple roots at z = z_k, k = 1, . . . , n in this region. A simple root of g(z) at z = z_k means that the function vanishes at this value of z but its first derivative does not (see also Problem 2). Near the roots we can expand the function g(z) in a Taylor series as:

g(z) = g′(z_k)(z − z_k) + (1/2!) g″(z_k)(z − z_k)² + ···

We can then write:

g(z) = g′(z_k)(z − z_k) [ 1 + (1/2!)(g″(z_k)/g′(z_k))(z − z_k) + ··· ] ⇒ g(z) ≈ g′(z_k)(z − z_k)   for z → z_k,

where the dots represent higher order terms in (z − z_k) which vanish much faster than the first order term as z → z_k. We also know that, because we are dealing with simple roots of g(z), g′(z_k) ≠ 0, since g(z_k) = 0. The integral of the ratio f(z)/g(z) over the contour C is then given by:

∮_{C[z_k (k=1,...,n)]} f(z)/g(z) dz = Σ_{k=1}^{n} ∮_{C[z_k]} f(z)/[ g′(z_k)(z − z_k) ] dz = 2πi Σ_{k=1}^{n} f(z_k)/g′(z_k)   (4.20)

where the final result comes from performing contour deformation as indicated in Fig. 4.6, using the small circular paths with radii tending to zero around each simple root of the denominator.

Figure 4.6: Illustration of contour deformation for the integration of a function with several simple residues at z = z_1, . . . , z_k in the closed contour C: the small circles centered at each pole have vanishing radii and the contributions from the straight segments cancel out.

Evidently, in this case the residues at z = z_k (k = 1, . . . , n) come out to be f(z_k)/g′(z_k), which can be easily evaluated.

A generalization of this result is the calculation of the contribution from a pole of order m at z = z_0, that is, a root of order m in the denominator, for which the result of Eq. (4.4) applies. In this case, the function of interest can be written in general form as:

f(z) = b_{−m}/(z − z_0)^m + b_{−m+1}/(z − z_0)^{m−1} + ··· + b_{−1}/(z − z_0) + b_0 + b_1 (z − z_0) + ···
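Before continuing with poles of higher order below, the simple-pole result of Eq. (4.20) can be checked symbolically. In the sketch (assuming sympy is available; the integrand 1/(z² + 1) and the root z_k = i are arbitrary test choices) sympy's residue function agrees with f(z_k)/g′(z_k).

```python
import sympy as sp

z = sp.symbols('z')
f, g = sp.Integer(1), z**2 + 1          # arbitrary test case with simple roots at z = +i, -i
zk = sp.I

r_direct  = sp.residue(f / g, z, zk)            # residue computed by sympy
r_formula = (f / sp.diff(g, z)).subs(z, zk)     # f(z_k)/g'(z_k), as in Eq. (4.20)
print(r_direct, sp.simplify(r_direct - r_formula))   # -I/2 and 0
```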

.7: I + I ′ + IR = C z2 1 dz +1 .23) which we will evaluate as one part of a complex integral over a closed contour C which is shown in Fig. 4. it has a Laurent expansion starting with the term of order −m.. . . n) within the contour C. each singularity being a pole of arbitrary order (but not an essential singularity.. we can generalize the residue theorem to any number n of singularities at z = zk (k = 1.21) z=z0 With this result. .4. with a factor of 1/(m − 1)! in front: b−1 = 1 dm−1 h(z) (m − 1)! dz m−1 z=z0 or. Example 4. INTEGRATION BY RESIDUES 97 that is. we illustrate it by several representative examples. .4. therefore the coefficient of the term of order m − 1 is the derivative of the same order evaluated at z = z0 .22) These results are extermely useful in evaluating both complex integrals as well as integrals of real functions. equivalently.n)] f (z) dz = 2πi (Residues in C) k=1 (4.. In fact. We define a new function h(z) by multiplying f (z) with the factor (z − z0 )m : h(z) = (z − z0 )m f (z) = b−m + b−m+1 (z − z0 ) + · · · + b−1 (z − z0 )m−1 + b0 (z − z0 )m + · · · which is a Taylor series expansion for the function h(z). because in that case we cannot evaluate the residue): n C[zk (k=1.4: We consider the following real integral: I= ∞ 0 x2 1 dx +1 (4. they can be used to evaluate many real integrals by cleverly manipulating the expressions that appear in the integrand. since it contains no negative powers of (z − z0 ). Since this assertion cannot be described in general terms. for a pole of order m the residue takes the form: (Residue at z0 ) = b−1 = dm−1 1 ((z − z0 )m f (z)) (m − 1)! dz m−1 (4.

(4. But the integrand along the real axis is an even function of x. 4. the real integral I corresponds to the integral along the part of the contour C extending from x = 0 to x → ∞.7. have for the total contribution to the contour C from its different parts: 1 =π 2i because there is only one singularity of the integrand contained in the contour C at z = i and the value value of the Residue for this singularity is 1/(2i). along the negative real axis (x → −∞ to x = 0) and along the semicircle of radius R centered at the origin. 0 ≤ θ ≤ π ⇒ dz = Rieiθ dθ and with these. the integral IR over the large semicircle with radius R → ∞ vanishes. From the residue theorem we y IR i −R I’ 0 I R x Figure 4.98 CHAPTER 4. The other two parts of the contour.23). because the integrand goes to zero faster than 1/R.7: Contour for evaluating the real integral of Eq. which leads to: 0 ∞ 1 1 I′ = dx = d(−x) = I 2 +1 (−x)2 + 1 −∞ x 0 I + I ′ + IR = 2πi (Residues in C) = 2πi Moreover. give rise to the contributions denoted as I ′ and IR respectively. it is the same for x → −x. To justify this claim. that is. (4. we note that we can write the variable of integration z and the differntial dz over the semicircle as: z = Reiθ .20) with f (z) = 1 and g(z) = z 2 + 1. the integral IR in the limit R → ∞ takes the form: R→∞ lim (IR ) = lim π 0 1 R→∞ R2 e2iθ + 1 iReiθ dθ = lim i R→∞ R π 0 eiθ e2iθ + 1 R2 dθ = 0 . CONTOUR INEGRATION As is evident from Fig. as can be easily derived from Eq.

INTEGRATION BY RESIDUES 99 since the integral over θ gives a finite value. using an inspired change of variables. we want this integral to give a vanishing contribution when . Typically.41). Before we proceed we further examples.4. Eq. θ1 ≤ θ ≤ θ2 which. the integration variable z and the differential dz take the form: z = Reiθ . leads to: IR = θ2 θ1 a0 + a1 Reiθ + a2 R2 ei2θ + · · · + ap Rp eipθ iReiθ dθ b0 + b1 Reiθ + b2 R2 ei2θ + · · · + bq Rq eiqθ Taking common factors Rp eipθ and Rq eiqθ from the numerator and the denominator and keeping only the dominant terms (because eventually we will let R → ∞) we arrive at: IR = ap p−q+1 R bq θ2 θ1 ei(p−q+1)θ idθ = 1 ap p−q+1 eiθ2 − eiθ1 R bq p−q+1 (4.24) where in order to obtain the last expression we must assume that p − q = −1. Having established that I ′ = I and IR → 0 for R → ∞. which contributes IR to the value of the contour integral. we will generalize what we found above for the behavior of the integral IR over a semicircle of radius R centered at the origin. Along this arc. we conclude that the value of the original integral is I= ∞ 0 x2 1 π dx = +1 2 a result we had obtained in chapter 1. here the same result was derived by simply applying the residue theorem. when substituted in the complex integral. (1.4. respectively: I= xf xi a0 + a1 x + a2 x2 + · · · + ap xp dx b0 + b1 x + b2 x2 + · · · + bq xq and we will try to evaluate this integral by turning it into a complex integral over a contour C one part of which can be identified as the real integral I: a0 + a1 z + a2 z 2 + · · · + ap z p dz b0 + b1 z + b2 z 2 + · · · + bq z q C Let us assume that the contour C contains an arc of radius R centered at the origin. Consider the case where the integrand is the ratio of two polynomials of degree p and q.

when the integrand is a ratio of two polynomials. for evaluating the real integral of Eq. to guarantee that IR → 0 as R → 0. For this to happen. we need to have p − q + 1 ≤ −1 ⇒ q ≥ p + 2 so that the factor Rp−q+1 makes the value of IR tend to zero.8: I’ I R x 0 I R x Two possible contour choices. 4. CONTOUR INEGRATION R → ∞. which is useful when one of the limits of the original is infinite. as in the previous example.25). the order of the polynomial in the denominator must be larger by at least two than the order of the polynomial in the numerator. we turn into a complex integral over a closed contour C1 . (4. shown in Fig. y Path 1 IR z2 z3 z1 z0 0 y Path 2 I’ z0 IR −R Figure 4. labeled Path 1 (C1 ) and Path 2 (C2 ).8: I + I + IR = C1 z 8 1 dz +1 This contour is chosen so that one part of it lies along the positive real axis (from x = 0 to x → ∞).25) 8 +1 x 0 which.100 CHAPTER 4. The conclusion from this analysis is that. for which a simple change of variables would not work. can still be applied with equal ease. however. one along the negative real axis . There are two more parts to the contour.5: As a second example. The real integral we want to evaluate is: ∞ 1 I= dx (4. Example 4. we consider a similar real integral. The residue theorem. which gives the integral I that we want to evaluate.

7 7 8zk Of those residues the ones with k = 0. we use the expression of Eq. 1. which for each residue leads to: 1 8 . for zk = −1 ⇒ zk = ei(2k+1)π/8 . 4. . (4. INTEGRATION BY RESIDUES 101 leading to the integral I ′ .4. We can also argue. we note from the above relations that 8 7 zk = −1 ⇒ zk = − zk 1 1 ⇒ 7 =− zk 8zk 8 which leads to the following result for the original integral: 2I = 2πi −1 iπ/8 e + ei3π/8 + ei5π/8 + ei7π/8 8 and using the trigonometric relations cos cos 7π 8 = − cos π . . 8 sin sin 7π 8 5π 8 = sin = sin π 8 3π 8 5π 3π = − cos . IR → 0 for R → ∞. that IR → 0 for R → ∞. Moreover. the part on the x axis (from x = 0 to x → ∞) gives the desired integral I.20) since the denominator g(z) = z 8 + 1 has only simple roots inside the contour C1 and the numerator is simply unity. in the notation of Fig. for the same reasons as in the previous example. 8 8 we arrive at the final result I= π π 3π sin + sin 4 8 8 = π π π sin + cos 4 8 8 Let us try to evaluate the same integral with the use of contour C2 : As for the previous path. For the . k = 0. therefore I + I ′ + lim (IR ) = 2I = 2πi R→0 (Residues in C1 ) For the residues. . We notice that the integrand is an even function under x → −x. that is. and one along the semicircle of radius R centered at the origin. so the integral on the negative real axis (from x → −∞ to x = 0) is the same as I. while the path along the circular arc gives a vanishing contribution by the same arguments as before. 2. 3 are within C1 . leading to the integral IR . I ′ = I. .8.4.

along the line at an angle π/4 relative to the x axis. and from the . I(1 − eiπ/4 ) = 2πi πi eiπ/8 −1 iπ/8 e ⇒I=− 8 4 1 − eiπ/4 or. as for example in the case of the integral I= ∞ 0 x8 1 dx −1 (4. using the relations we established above for the value of the residues. which we will call contour C1 : 1 dz I + I ′ + IR + Ia + Ib = C1 z 8 − 1 Note that in addition to the usual contributions from the positive real axis (which corrsponds to the desired integral I). that is.102 CHAPTER 4.26) with which we deal in the same fashion as before. we can write the variable z and the differential dz as: z = reiπ/4 . from the negative real axis (which corresponds to I ′ and as in the previous two examples I ′ = I). as it should be.9.6: A slightly more complicated situation arises when the integrand has poles on the real axis. dz = eiπ/4 dr which leads to the following contribution: I′ = 0 r=∞ 1 d(reiπ/4 ) = −eiπ/4 iπ/4 )8 + 1 (re ∞ 0 1 dr = −eiπ/4 I 8+1 r We therefore have for the entire contour: I + I ′ + lim (IR ) = I(1 − eiπ/4 ) = 2πi R→0 (Residues in C2 ) and since the only singularity within C2 is at z0 = exp(iπ/8). In the following we concentrate on the first choice. Example 4. CONTOUR INEGRATION last part of this contour. after using Euler’s formula for the complex exponentials. by turning it into a complex integral over a closed contour. Two possible contour choices are shown in Fig. a part of which can be identified as the desired real integral. I =− π π π π 1 πi sin + cos = π = −iπ/8 − eiπ/8 4e 8 sin 8 4 8 8 which is identical to the previous answer from contour C1 . we find the final answer. 4.

This situation is analogous to the example we had discussed in chapter 1. INTEGRATION BY RESIDUES 103 y Path 1 IR z2 y Path 2 z3 z1 z1 I’ Ib Ia 1 I IR Ib −R I’ −1 0 Ia 1 I R x 0 R x Figure 4.4.9: Two possible contour choices. This is precisely what the two semicircles with vanishing radii lead to: For the semicircle centered at x = +1 we have: Ca [+1] : za = 1 + ǫa eiθa . for the evaluation of the real integral in Eq. labeled Path 1 (C1 ) and Path 2 (C2 ). These contributions are required in order to avoid the singularities of the integrand at z = ±1. for the integral ∞ 0 x2 1 dx −1 where we had argued that approaching the root of the denominator at x = 1 symmetrically from above and below is required in order to produce a well defined value for the integral (we had called this approach taking the principal value of the integral). π ≤ θb ≤ 0 ⇒ Ib = Cb [−1] 1 dza = − 8 −1 za π 0 1 ǫa ieiθa dθa (1 + ǫa eiθa )8 − 1 1 dzb = − 8 zb − 1 π 0 1 ǫ ieiθb dθb iθb )8 − 1 b (−1 + ǫb e .26). for the semicircle centered at x = −1 we have: Cb [−1] : zb = −1 + ǫb eiθb . for the same reasons as in the previous examples vanishes for R → ∞).4. semicircle centered at the origin with radius R (which. (4. π ≤ θa ≤ 0 ⇒ Ia = Ca [+1] and similarly. we also have two more contributions from the the semicircles with vanishing radii. centered at x = ±1.

104 CHAPTER 4. 8 ǫb →0 lim (Ib ) = −iπ −1 8 that is. b. 7 zk k = 1. i = a. 3 Using the identities: cos π 4 = − cos 3π π . z2 = ei2π/4 . eiπ/2 = i 4 . Putting all these results together. 2. since we are interested in the limit ǫi → 0. we arrive at the final result: I + I ′ + lim (Ia ) + lim (Ib ) + lim (IR ) = 2I = 2πi ǫa →0 ǫb →0 R→0 (Residues in C1 ) 1 iπ/4 e + eiπ/2 + ei3π/4 8 because the only residues enclosed by the entire contour C are the simple poles at z1 = eiπ/4 . the two contributions from the infinitesimal semicircles centered at x = ±1 cancel each other. CONTOUR INEGRATION so in the limit ǫi → 0. b the integrals I and I ′ are evaluated in the principal value sense: I = lim I ′ = lim 1−ǫa 0 −1−ǫb ǫa →0 1 dx + 8 −1 x x8 ∞ 1+ǫa x8 1 dx −1 0 1 1 dx + dx 8 −1 ǫb →0 −1 −1+ǫb x −∞ We can expand the denominators in the integrands of Ia and Ib with the use of the binomial expansion: for the pole at z = +1 we have 8 8 za = (1 + ǫa eiθa )8 = 1 + 8ǫa eiθa + · · · ⇒ za − 1 = 8ǫa eiθa while for the pole at z = −1 we have 8 8 zb = (−1 + ǫb eiθb )8 = (1 − ǫb eiθb )8 = 1 − 8ǫb eiθb + · · · ⇒ zb − 1 = −8ǫb eiθb where we have neglected higher order terms. i = a. z3 = ei3π/4 = 2πi all of which are solutions of the equation 8 zk = 1 ⇒ 1 = zk . sin 4 4 = sin 3π . we find: ǫa →0 1 lim (Ia ) = −iπ . When these expansions are substituted in the integrals Ia and Ib .

which lies entirely within the annulus centered at z0 : Cǫ.28) where Cǫ. θ0 ≤ θ ≤ θ0 + ϕ To prove this statement. bm = 0 for m ≤ −2 which leads to: ∞ ∞ Cǫ. we note first that since f (z) is analytic on an annulus around z0 we can use its Laurent series expansion in the neighborhood of z0 assuming that it converges unifromly in the annulus: f (z) = ∞ n=−∞ bn (z − z0 )n Since the fuction has a simple pole at z0 .ϕ [z0 ] f (z)dz = 2πi ϕ (Residue at z = z0 ) 2π (4. it is instructive to write the expressions for Ia and Ib in a different format: ǫa →0 lim (Ia ) = 2πi π (−1)(Residue at z = 1) 2π π (−1)(Residue at z = −1) ǫb →0 2π These expressions can be rationalized as follows: The integral over a circular arc of vanishing radius centered at z = z0 where the integrand f (z) has a simple pole.ϕ [z0 ] f (z)dz = bn θ0 +ϕ n=−∞ θ0 iǫeiθ ǫn einθ dθ = bn In n=−1 . then lim (Ib ) = 2πi lim ǫ→0 Cǫ. an extra factor of −1 accompanies the residue when the arc is traversed in the clockwise sense. with a simple pole at z = z0 .27) In the above example. This is actually a specific application of a the following more general result: Consider a function f (z) analytic on annulus centered at z0 .ϕ [z0 ] : z = z0 + ǫeiθ .ϕ [z0 ] is a circular arc of radius ǫ and angular width ϕ.4.4. INTEGRATION BY RESIDUES we get for the value of I: √ π I = − (1 + 2) 8 105 (4. is equal to 2πi times the fraction of the arc in units of 2π times the residue of the function at the simpe pole.

when ϕ = 2π the expression on the right-hand side of Eq.29) vanishes only for n ≥ 0. as already mentioned in the example above. which is equivalent to evaluating integrals by taking their principal value.ϕ [z0 ] which proves the desired formula. and only the term n = −1 survives. because for these values of n the power of ǫ is positive and the limit ǫ → 0 produces a zero value for In .28) allows us to skip simple poles that happen to lie on the path of integration. (4.106 CHAPTER 4. which implies a pole of order higher than unity. since the integrals In produced by these terms will vanish and we will be left with the value of the residue. We emphasize. we observe that for an arbitrary value of ϕ = 2π the expression on the right-hand side of Eq. Eq. In contrast to this. Note that if ϕ = 2π this formula simply reduces to the residue theorem.28). because of the identity ei2π(n+1) = 1 which is a direct consequence of the Euler formula. that the formula of Eq. CONTOUR INEGRATION where the integral In is defined as: In = ǫn+1 i and evaluation of this gives: for n = −1 for n ≥ 0 → → I−1 = iϕ In = ǫn+1 θ0 +ϕ θ0 ei(n+1)θ dθ 1 iθ0 (n+1) iϕ(n+1) e −1 e n+1 (4. therefore ϕ > 0. We also note that if the circular arc is traversed in the counter-clockwise sense. the angle θ increases. whereas the residue theorem is valid for a pole of any order. Thus.29) In the limit ǫ → 0 the terms with n = −1 vanish. . leading to f (z) dz = iϕ b−1 = i ϕ (Residue at z = z0 ) Cǫ. whereas if it is traversed in the clockwise sense. (4. (4. ϕ < 0 which produces the (−1) factor factor mentioned earlier. (4. The formula of Eq. for ϕ = 2π we can afford to have terms with n < −1 in the Laurent expansion.29) vanishes identically for any value of n. (4. To see where this comes from.28) applies to simple poles only. however.

Another very useful general criterion for when the integrals over circular arcs of infinite radius vanish is the following.

Jordan's Lemma: Consider the integral

I_R = ∫_C e^{iλz} f(z) dz   (4.30)

where λ is a real, positive constant, C is a circular arc of radius R centered at the origin and lying on the upper-half complex plane, that is,

C : z = R e^{iθ}, 0 ≤ θ_1 ≤ θ ≤ θ_2 ≤ π,

and the function f(z) is bounded on C, that is, |f(z)| ≤ F with F finite. Then the integral I_R over the circular arc is bounded, and for F → 0 it vanishes.

This statement can be proved as follows: on the circular arc extending from θ_1 to θ_2 we have z = R e^{iθ}, dz = R i e^{iθ} dθ, and the absolute value of the integral over this arc takes the form:

|I_R| = | ∫_{θ_1}^{θ_2} e^{iλ(x+iy)} f(z) i R e^{iθ} dθ | ≤ ∫_{θ_1}^{θ_2} | e^{iλ(x+iy)} f(z) i R e^{iθ} | dθ,

where the last relation is derived from the triangle inequality. But |i| = |e^{iλx}| = |e^{iθ}| = 1 and y = R sin(θ), which leads to:

|I_R| ≤ F R ∫_{θ_1}^{θ_2} e^{−λR sin(θ)} dθ.

Assume θ_1 = 0, θ_2 = π, which covers the entire range of allowed values for the situation we are considering. Then, since the integrand in the last expression is a symmetric function of θ with respect to π/2,

sin(π − θ) = sin(θ) ⇒ ∫_0^π e^{−λR sin(θ)} dθ = 2 ∫_0^{π/2} e^{−λR sin(θ)} dθ,

we can evaluate it only in the range 0 ≤ θ ≤ π/2 and multiply it by a factor of 2. But in this range we have:

0 ≤ θ ≤ π/2 : sin(θ) ≥ 2θ/π

only then the path of integration must lie on the lower half of the complex plane (that is.31) I= −∞ x2 − a2 with a.108 CHAPTER 4. λ real positive constants. Example 4. and if we replace sin(θ) in the integrand by 2θ/π we will get an upper bound for the original value: 0≤θ≤ 2θ π : − sin(θ) ≤ − ⇒ 2 π π/2 0 e−λR sin(θ) dθ ≤ π/2 0 e−λR2θ/π dθ With all this we obtain: |IR | ≤ 2F R The following change of variables t= when inserted in the integral gives: |IR | ≤ Fπ λ λR 0 π/2 0 e−λR2θ/π dθ λR2θ π e−t dt = Fπ (1 − e−λR ) λ From this result. x2 − a2 e−iλx dx x2 − a2 . note that we can rewrite this integral in terms of two other integrals: I= I1 = 1 2 ∞ −∞ ∞ −∞ 1 1 eiλx + e−iλx dx = I1 + I2 2 − a2 x 2 2 I2 = ∞ −∞ eiλx dx. y ≤ 0). Note that Jordan’s Lemma can be applied for λ a real. we consider the following example of a real integral which can be evaluated by contour integration: ∞ cos(λx) dx (4.7: To illustrate the usefulness of Jordan’s Lemma. CONTOUR INEGRATION which can be easily shown by a graphical argument. First. we obtain in the limit of R → ∞ R → ∞ ⇒ |IR | ≤ Fπ λ which shows that the integral IR is bounded since F is finite and in the limit F → 0 it vanishes. negative constant as well.

10: Contour for the evaluating the real integral of Eq. we perform the integration on a contour skipping over the poles at x = ±a on the real axis and closing by a semicircle lying on the upper half of the complex plane. This contour integration leads to: ′ ′ I1 + lim (Ia ) + lim (Ia ) + lim (IR+ ) = 0 ⇒ I1 = − lim (Ia ) − lim (Ia ) ǫa →0 ǫa →0 R→0 ǫa →0 ǫa →0 where the integral IR+ on the upper half plane is eliminated by Jordan’s Lemma.4. in order to satisfy Jordan’s Lemma for y ≤ 0 for the integral IR− .4. we perform the integration on a contour skipping over the poles at x = ±a on the real axis and closing by a semicircle on the lower half of the complex plane. The other two integrals at the singularities on the x axis give: πi 1 ˜ lim (Ia ) = (2πi)(Residue at z = −a) = − e−iλa ǫa →0 2 2a . For I1 . because in this case the exponential in the integrand involves the real constant λ > 0 so that Jordan’s Lemma is satisfied for y ≥ 0 for the integral IR+ . and I1 . 4. INTEGRATION BY RESIDUES y −R IR+ −a ~ I’ a 0 109 y I2 a ~ Ia R x I’ a −R −a 0 Ia I1 a R x IR− Figure 4. This contour integration leads to: ˜ ˜′ ˜ ˜′ I2 + lim (Ia ) + lim (Ia ) + lim (IR− ) = 0 ⇒ I2 = − lim (Ia ) − lim (Ia ) ǫa →0 ǫa →0 R→0 ǫa →0 ǫa →0 where the integral IR− on the lower half plane is eliminated by Jordan’s Lemma. because in this case the exponential in the integral involves the real constant −λ < 0.10.31) using Jordan’s Lemma. (4. The other two integrals at the singularities on the x axis give: πi 1 lim (Ia ) = − (2πi)(Residue at z = −a) = e−iλa ǫa →0 2 2a 1 πi ′ lim (Ia ) = − (2πi)(Residue at z = +a) = − eiλa ǫa →0 2 2a For I2 . I2 can be considered as parts of contour integrals on the complex plane which satisfy the conditions of Jordan’s Lemma for the contours shown in Fig.

lim_{ε_a→0} (Ĩ′_a) = (1/2)(2πi)(Residue at z = +a) = (πi/(2a)) e^{−iλa}.

In both cases, the integrals around the singularities at x = ±a were evaluated in the usual way, using the results we discussed above for skipping simple poles; this is equivalent to evaluating the integrals I_1 and I_2 in the principal value sense. Note that for I_a and I′_a the semicircle of radius ε_a → 0 is traversed in the clockwise sense, while for Ĩ_a and Ĩ′_a the semicircle is traversed in the counter-clockwise sense. Combining the results, we obtain for the value of the original integral:

I = (1/2)[I_1 + I_2] = −(1/2)[ lim(I_a) + lim(I′_a) + lim(Ĩ_a) + lim(Ĩ′_a) ] = −(π/a) sin(λa).

Example 4.8: We look at a different integral where Jordan's Lemma comes again into play: consider the real integral

I = ∫_0^∞ (sin(x)/x)² dx   (4.32)

First, note that there are no singularities involved in this integral, because sin(x)/x is well defined everywhere, even for x = 0, because

lim_{x→0} sin(x)/x = 1,

as can be easily established from the Taylor expansion of sin(x) near x = 0 (see chapter 1). Moreover, we can extend the integration to negative values of x because the integrand is an even function of x, so

I = (1/2) ∫_{−∞}^{∞} (sin(x)/x)² dx.

We use the expression for the sine in terms of the complex exponentials and rewrite the original integral as follows:

I = (1/2) ∫_{−∞}^{∞} [ (e^{ix} − e^{−ix})/(2ix) ]² dx = −(1/8) ∫_{−∞}^{∞} (e^{2ix} − 2 + e^{−2ix})/x² dx.
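The value obtained above for Example 4.7 can be cross-checked numerically. The sketch below (assuming scipy is available; λ = 1.3, a = 0.7 and the cutoff L are arbitrary choices) computes the principal value with quad's 'cauchy' weight after splitting 1/(x² − a²) into partial fractions, and compares it with −(π/a) sin(λa).

```python
import numpy as np
from scipy.integrate import quad

lam, a = 1.3, 0.7        # arbitrary sample values of the constants lambda and a
L = 60.0                 # truncation of the infinite range; the neglected tail is small
f = lambda x: np.cos(lam * x)

# 1/(x**2 - a**2) = (1/(2a)) [1/(x - a) - 1/(x + a)]; quad with weight='cauchy' returns
# the principal value of the integral of f(x)/(x - wvar) over a finite interval.
pv1, _ = quad(f, -L, L, weight='cauchy', wvar=a,  limit=400)
pv2, _ = quad(f, -L, L, weight='cauchy', wvar=-a, limit=400)
print((pv1 - pv2) / (2 * a), -np.pi * np.sin(lam * a) / a)   # the two values agree closely
```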

and in the limit ǫ → 0 all these contributions are equal. Eq. As far as I0 is concerned. which consists of a semicircle on the upper half plane of radius ǫ → 0. for all three integrals. 4. they cancel out: Iǫ + Iǫ − 2Iǫ = 0 For contour that contains the integral I+ . centered at zero. whereas for I− we use a contour that closes on the lower half plane by a semicircle of radius R → ∞. (4. 4.11. we must treat them on an equal footing. by doing this we have introduced in the integrands functions with singularities. Since there are two contributions from this semicircle corresponding to I+ and I− and one more contribution corresponding to I0 which has an overall factor of −2 in front. that is. I− .24).4. We jump over the singularities at z = 0 with the same path. we conclude that the contribution from the . to find their values.11. x2 I0 = ∞ −∞ 1 dx x2 However. To avoid infinities that will arise from the singularities. We will employ the contours shown in Fig. the contribution of the semicircle of radius R → ∞ and since the enclosed area contains no sigularities we conclude that the contribution of this contour to the sum is zero. from Jordan’s Lemma. denoted by Iǫ in Fig. for which the contribution along the real axis represents the desired integrals: For I+ and I0 we use a contour that closes on the upper half plane by a semicircle of radius R → ∞. From the last expression. I0 .32) is decomposed. by using the same variables to describe portions of the path that skip over the singularities. using the result for the integral of a ratio of two polynomials. into which the real integral in Eq. we can break the integral into three parts and perform contour integration in terms of the complex variable z. INTEGRATION BY RESIDUES y −R 0 111 y Iε I− R x Iε −R 0 I+ I0 R x Figure 4. I+ .4. I± = 8 ∞ −∞ e±2ix dx.11: Integration contours for the evaluation of the three integrals. 1 I = − (I+ − 2I0 + I− ). (4.

Finally. and since the enclosed area for the corresponding contour contains no singularities its contribution to the sum is zero. Since the singularity at this point is a second-order pole. Consider the real integral I= ∞ 0 xλ dx. The contour is closed by a large circle of readius R → ∞ also centered at the origin. we can use the general result of Eq. the contour encloses the point z = 0.33) which can be expressed as part of a contour integral of a complex function.32): 1 π I = − [−2πi(Residue at z = 0)] = 8 2 Example 4. hence the small circle of radius ǫ → 0 around it. In this case. we obtain for the integral of Eq. 4.21) to find: (Residue at z = 0) = e−2iz d z2 2 dz z = −2i z=0 Putting all these partial results together. (4. for the integral I− . from z = 0 to z → ∞. with the branch cut along the real axis. we have: z = xe2πi ⇒ xλ zλ = ei2πλ ⇒ I ′ = −ei2πλ I z+1 x+1 The circle of radius ǫ around the origin contributes: z = ǫeiθ ⇒ Iǫ = − 2π 0 (ǫeiθ )λ iθ ǫie dθ = −iǫ1+λ ǫeiθ + 1 2π 0 eiθ(1+λ) dθ ǫeiθ + 1 . The point z = 0.12. For the part of the contour just below the real axis. from z = x = 0 to z = x → ∞. x+1 −1 < λ < 0 (4. CONTOUR INEGRATION semicircle of radius R → ∞ vanishes. being a singularity of the integrand (a branch point) must be avoided. where the argument of the complex variable z approaches the value 2π. Note that raising the variable x to the non-integer power λ implies that we will need to introduce a branch cut when we pass to the complex variable z. We therefore consider the contour shown in Fig.9: Another type of contour integral where similar issues arise concerns functions which require a branch cut. The part of the contour which lies just above the real axis approaches the value of the desired integral I.112 CHAPTER 4. (4. from Jordan’s Lemma we will have another vanishing contribution from the semicircle of radius R → ∞ on the lower half plane and the contour integral is therefore equal to 2πi times the sum of the enclosed residues.

the residue at that value being: |Reiθ + 1|−1 dθ ≤ (Residue at z = −1) = (eiπ )λ = eiλπ Putting all this together. the entire contour encloses only one singularity. By analogous considerations. the circle of radius R → ∞ around the origin contributes: z = Reiθ ⇒ IR = |IR | ≤ R1+λ 2π 2π 0 (Reiθ )λ Rieiθ dθ = iR1+λ iθ + 1 Re 2π 0 eiθ(1+λ) dθ ⇒ Reiθ + 1 R1+λ 2π ⇒ lim |IR | ≤ lim (2πRλ ) = 0 R→∞ R→∞ R−1 0 where we have used the ML inequality.33) with a branch cut along the positive real axis. the point z = −1 which is evidently a simple pole. (4. ⇒ lim(Iǫ ) = 0 ǫ→0 since the contribution of the integral over θ for any ǫ is a finite quantity and 1 + λ > 0.4.4.12: Integration contour for the evaluation of the real integral in Eq. Finally. INTEGRATION BY RESIDUES 113 y IR Iε −R 0 I I’ R x Figure 4. Eq. we obtain: I + I ′ + lim(Iǫ ) + lim (IR ) = 2πi(Residue at z = −1) ⇒ ǫ→0 R→∞ I(1 − ei2πλ ) = 2πieiλπ ⇒ I = − π sin(λπ) . to find an upper bound for the integral |IR | before taking the limit R → ∞. (4.2).
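As a rough numerical check on residue-theorem results of this kind, the two integrals evaluated above can also be computed by direct quadrature. The short sketch below is an added illustration, not part of the original text; it assumes standard numpy/scipy, the value λ = −1/2 is an arbitrary choice, and the oscillatory integrand may trigger an accuracy warning from `quad` even though the values agree to several digits.

```python
import numpy as np
from scipy.integrate import quad

# Example 4.8: integral of (sin x / x)^2 over [0, infinity) should equal pi/2
val1, _ = quad(lambda x: (np.sin(x) / x) ** 2, 0, np.inf, limit=500)
print(val1, np.pi / 2)

# Example 4.9: integral of x**lam / (x + 1) over [0, infinity), with -1 < lam < 0,
# should equal -pi / sin(lam * pi)
lam = -0.5
val2, _ = quad(lambda x: x**lam / (x + 1), 0, np.inf, limit=500)
print(val2, -np.pi / np.sin(lam * np.pi))
```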

Integrals of that type appear when the integrand involves trigonometric functions.10: As a final example we consider an integration that requires a rectangular rather than a semicircular contour. Our example consists ot the following real integral: I0 = ∞ −∞ sin(λx) dx = I sinh(x) ∞ −∞ eiλx dx = Im[I] sinh(x) (4. CONTOUR INEGRATION Example 4.34).34) We notice that from the definition of the hyperbolic functions.34) because along the two horizontal parts of the contour the denominators in the integrands are the same. (4.13: Rectangular integration contour for the evaluation of the real integral in Eq. (4. ex − e−x ex+2πi − e−x−2πi sinh(x) = ⇒ sinh(x + 2πi) = = sinh(x) 2 2 so we can use the contour shown in Fig. along z = x + i2π we have: I′ = − ∞ −∞ eiλ(x+2πi) d(x + 2πi) = −e−2λπ I sinh(x + 2πi) For the contribution IR along the vertical path z = R + iy. For the part of this contour y 2π I’ R −R Ib π Ia 0 I’ IR R x I Figure 4.13 for the evaluation of the integral I in Eq. 4. 0 ≤ y ≤ 2π the integrand becomes: 2eiλ(R+iy) 2e−λy eiλR = R iy ⇒ (eR+iy − e−R−iy ) e (e − e−2R−iy ) .114 CHAPTER 4.

when substituted in the previous equation give for the value of the original real integral I0 = I[I] = π 1 − 1 sinh(λπ) . From the residue theorem we have: ′ I + I ′ + lim (IR ) + lim (IR ) + lim (Ia ) + lim (Ib ) = R→∞ R→∞ ǫa →0 ǫb →0 2ie−λy e−iλR dy = 0 e−2R+iy − e−iy I − e−2πa I + lim (Ia ) + lim (Ib ) = 2πi(Residue at z = πi) ǫa →0 ǫb →0 the singularity at z = πi being the only one inside the closed contour.4. we obtain: ǫa →0 ǫb →0 lim (Ia ) = (−1)πi (Residue at z = 0) = −πi lim (Ib ) = (−1)πi (Residue at z = 2πi) = −πie−2λπ (Residue at z = πi) = −e−λπ which. and since the poles at z = 0. πi. 2π ≥ y ≥ 0 we find 2eiλ(−R+iy) 2e−λy e−iλR = R −2R+iy ⇒ (e−R+iy − eR−iy ) e (e − e−iy ) R→∞ ′ lim (IR ) = lim R→∞ − 1 eR 2π 0 because again the integration over y gives a finite value. INTEGRATION BY RESIDUES lim (IR ) = lim 1 eR 2π 0 115 2ie−λy eiλR dy = 0 eiy − e−2R−iy R→∞ R→∞ since the integration over y gives a finite value (the integrand is finite in ′ magnitude). 2πi are simple poles (see Problem 3). Similarly for IR along the vertical path z = −R + iy.4.

(4. What is the relation of g1 (z) to g(z) near z0 ? Show that if the root of g(z) is of order m. do the two answers agree? .5 Problems 1. Find all the poles of the following functions f1 (z) = 1 1 1 1 . What is the relation of gm (z) to g(z) near z0 ? 3. g1 (z0 ) = 0 then the function f (z) = 1/g(z) has a simple pole at z = z0 . z = R exp(iθ). and make sure you get the same result as in Eq. that is. 5.31). CONTOUR INEGRATION 4. Show that if it is a simple root.30) with λ < 0 and the integration path C lying on the lower half of the complex z plane. as they apply to the neighborhood of the denominator roots). Consider a function g(z) which is analytic everywhere on the complex plane and has at least one root at z = z0 : g(z0 ) = 0. f2 (z) = . use the result found for the integral in Eq. 4. Pay attention to the integrations along the two semicircles with vanishing radii centered at z = 1 and at z = exp(iπ/4). f4 (z) = sinh(z) tanh(z) sin(z) tan(z) and show that they are all simple poles (Hint: use the results of Problem 2. x2 + p2 p>0 Alternatively. if the function can be written as g(z) = (z − z0 )g1 (z).9. that is.31) with a = ip. π ≤ θ ≤ 2π.116 CHAPTER 4. (4. 2.27). gm (z0 ) = 0 then the function f (z) = 1/g(z) has a pole of order m at z = z0 . Prove Jordan’s Lemma for the integral defined in Eq. (4. (4. using the contour C2 labeled Path 2 in Fig. that is. f3 (z) = . evaluate the real integral I= ∞ −∞ cos(x) dx. (4. Evaluate the real integral in Eq. 4. Using the same procedure as for the evaluation of the integral in Eq.26). if the function can be written as g(z) = (z − z0 )m gm (z).

Our first task is to determine the values of the coefficients an and bn that appear in the real Fourier expansion. We have explicitly separated out the cosine term with n = 0. f (x) = a0 + ∞ n=1 an cos(nx) + ∞ n=1 bn sin(nx) (5.1 5.Chapter 5 Fourier analysis 5. The expression on the right-hand side of Eq. (5. because sin(0) = 0).1.(5. since cos(−nx) = cos(nx) and sin(−nx) = − sin(nx).1) implies that the function f (x) is periodic: f (x + 2π) = f (x) because each term in the expansion obeys this periodicity.1 Fourier expansions Real Fourier expansions A very important representation of functions is in terms of the trigonometric functions.1) is referred to as the “real Fourier series representation” or “real Fourier expansion”. The expression of Eq. the sine and cosine functions with negative indices are the same or involve at most a sign change (in the case of the sine). thus they do not provide any additional flexibility in represneting the general function f (x). Note that we only need positive values for the indices n.1) where an and bn are real constants. which is a constant a0 (there is no such sine term. We consider a function f (x) of the real variable x and its representation as a series expansion in terms of the functions cos(nx) and sin(nx) with n: integer. we multiply 117 . In order to do this.

(5.4) (5. which produces the following terms on the right-hand side: cos(nx) cos(mx) = sin(nx) sin(mx) = 1 1 cos(nx − mx) + cos(nx + mx) 2 2 1 1 cos(nx − mx) − cos(nx + mx) 2 2 1 1 sin(nx) cos(mx) = sin(nx − mx) + sin(nx + mx) 2 2 where we have used the trigonometric identities cos(a ± b) = cos(a) cos(b) ∓ sin(a) sin(b) to rewrite the products of two cosines.1) by cos(nx) or by sin(nx) and integrate over x from −π to π. . we can easily integrate these terms over −π ≤ x ≤ π which all give zeros because of the periodicity of the sine and cosine functions: 1 sin(nx − mx) sin(nx + mx) + cos(nx) cos(mx)dx = 2 n−m n+m −π π π −π π −π π sin(a ± b) = sin(a) cos(b) ± cos(a) sin(b) =0 −π π sin(nx) sin(mx)dx = sin(nx) cos(mx)dx = − 1 cos(nx − mx) cos(nx + mx) + 2 n−m n+m 1 2 π −π 1 sin(nx − mx) sin(nx + mx) − 2 n−m n+m =0 −π π =0 −π The only non-vanishing terms are those for n = m: π −π cos2 (nx)dx = π −π sin2 (nx)dx = sin2 (nx) + cos2 (nx) dx = π (5.118 CHAPTER 5. we arrive a0 = an bn 1 π f (x)dx 2π −π 1 π f (x) cos(nx)dx = π −π 1 π f (x) sin(nx)dx = π −π (5. producing 2π. Collecting these results.2) and for n = 0 the integral is trivial and concerns only the cosine term. FOURIER ANALYSIS both sides of Eq. two sines and a sine and cosine. For n = m.3) (5.5) These are known as the “Euler formulae”.

which turn out to be: an = 0. π 1 sn (x) = √ sin(nx) π (5. We next give some examples of real Fourier series expansions.5. that is. bn = 4k . m. as in the complex exponential. n = odd. can be rewritten in more compact form with a change in notation. Here we will only deal with the sines and cosines and linear combinations of them. for all n. the integral vanishes unless we have only one function multiplying itself in the integrand.1: Consider the function that describes a so-called square wave: f (x) = −k. −π ≤ x ≤ 0 = +k. We define the functions cn (x) and sn (x) through the relations 1 cn (x) = √ cos(nx). πn bn = 0. π −π sn (x)sm (x) dx = δnm π −π (5. some involving various types of polynomials.6) in terms of which the relations between sines of different indices n and m take the form: π −π cn (x)cm (x) dx = δnm . the functions are properly “normalized”. Example 5. These sets are extremely useful as bases for expansion of arbitrary functions. which we want to express as a real Fourier expansion. From the Euler formulae we can determine the coefficients of this expansion. we can then express the orginal function as a Fourier expansion: f (x) = 1 1 1 4k sin(x) + sin(3x) + sin(5x) + sin(7x) + · · · π 3 5 7 . FOURIER EXPANSIONS 119 The relations we derived above between sines and cosines with different indices n. the integral is unity when it does not vanish. that is.7) cn (x)sm (x) dx = 0 The sets of functions that satisfy such relations are called “orthonormal”: they are “orthogonal” upon integration over the argument x from −π to π.8) with k > 0 a real constant. 0 ≤ x ≤ π (5. There are many sets of orthonormal functions.1. n = even With these coefficients. moreover.


A useful application of this expression is to evaluate both sides at x = π/2; from f(x = π/2) = k we obtain:
$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} = \frac{\pi}{4}$$

Example 5.2: As a second example, consider the function which describes the so-called triangular wave:
$$f(x) = -k\,\frac{x}{\pi}, \ \ -\pi \le x \le 0; \qquad f(x) = +k\,\frac{x}{\pi}, \ \ 0 \le x \le \pi \qquad (5.9)$$
with k a real constant. From the Euler formulae we determine the coefficients for the Fourier expansion of this function:
$$a_0 = \frac{k}{2}; \qquad a_n = -\frac{4k}{\pi^2 n^2}, \ n = \text{odd}; \qquad a_n = 0, \ n = \text{even}; \qquad b_n = 0, \ \text{for all } n$$
From these we reconstruct f(x) as a Fourier expansion
$$f(x) = \frac{k}{2} - \frac{4k}{\pi^2}\left[\cos(x) + \frac{1}{9}\cos(3x) + \frac{1}{25}\cos(5x) + \frac{1}{49}\cos(7x) + \cdots\right]$$
A useful application of this expression is to evaluate both sides at x = π; from f(x = π) = k we obtain:
$$1 + \frac{1}{9} + \frac{1}{25} + \frac{1}{49} + \cdots = \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{\pi^2}{8}$$

It is instructive to consider the expansions in the above two examples in some more detail. Both expressions are approximations to the original functions which become better the more terms we include in the expansion. This is shown in Fig. 5.1, where we present the original functions and their expansions with the lowest one, two, three and four terms. Moreover, the terms that appear in the expansions have the same symmetry as the functions themselves. Specifically, for the square wave which is an odd function of x, f (−x) = −f (x), only the sine terms appear in the expansion, which are also odd functions of x: sin(−nx) = − sin(nx). Similarly, for



Figure 5.1: Fourier expansion representation of the square wave (left) and the
triangular wave (right): the bottom curve is the original wave, the next four curves are representations keeping the lowest one, two, three and four terms in the expansion. The different curves are displaced in the vertical axis for clarity. In both cases, we have chosen the constant k = 1 in Eqs. (5.8) and (5.9).

the triangular wave which is an even function of x, f (−x) = f (x), only the cosine terms appear in the expansion, which are also even functions of x: cos(−nx) = cos(nx). If the function f (x) has no definite symmetry for x → −x, then the Fourier expansion will contain both types of terms (sines and cosines).
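The convergence behavior described above (and illustrated in Fig. 5.1) is easy to reproduce numerically. The sketch below is an added illustration, not part of the original text; k = 1 and the evaluation points x = π/2 and x = π are arbitrary choices, chosen so that the partial sums should approach the exact value f = k = 1.

```python
import numpy as np

k = 1.0

def square_partial(x, nterms):
    # (4k/pi) [sin x + sin 3x/3 + sin 5x/5 + ...], keeping nterms odd harmonics
    return 4 * k / np.pi * sum(np.sin((2*i + 1) * x) / (2*i + 1) for i in range(nterms))

def triangle_partial(x, nterms):
    # k/2 - (4k/pi^2) [cos x + cos 3x/9 + cos 5x/25 + ...]
    return k / 2 - 4 * k / np.pi**2 * sum(np.cos((2*i + 1) * x) / (2*i + 1)**2
                                          for i in range(nterms))

for nterms in (1, 2, 3, 4, 100):
    print(nterms, square_partial(np.pi / 2, nterms), triangle_partial(np.pi, nterms))
# both columns approach f = k = 1 as more terms are retained
```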

5.1.2

Fourier expansions of arbitrary range

In the preceding discussion, we assumed that the functions under consideration were defined in the interval [−π, π] and were periodic with a period of 2π (as the Fourier expansion requires). The limitation of the values of x in this interval is not crucial. We can consider a periodic function f (x) defined in an interval of arbitrary length 2L, with x ∈ [−L, L] and f (x + 2L) = f (x). Then, with the following change of variables x′ = π L x ⇒ x = x′ L π

and by rewriting the function as f (x) = g(x′ ), we see that g(x′ ) has period 2π in the variable x′ which takes values in the interval [−π, π]. Therefore, we can apply the Euler formulae to find the Fourier expansion of g(x′ ) and


from this the corresponding expansion of f (x). The result, after we change variables back to x, is given by:
$$f(x) = a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{\pi n}{L}x\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{\pi n}{L}x\right) \qquad (5.10)$$
$$a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(x')\,dx' = \frac{1}{2L}\int_{-L}^{L} f(x)\,dx$$
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(x')\cos(nx')\,dx' = \frac{1}{L}\int_{-L}^{L} f(x)\cos\left(\frac{\pi n}{L}x\right)dx$$
$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(x')\sin(nx')\,dx' = \frac{1}{L}\int_{-L}^{L} f(x)\sin\left(\frac{\pi n}{L}x\right)dx$$

For functions that have definite symmetry for x → −x, we can again reduce the computation to finding only the coefficients of terms with the same symmetry. Specifically, for odd functions, f₋(−x) = −f₋(x), only the sine terms survive, whereas for even functions, f₊(−x) = f₊(x), only the cosine terms survive, and the coefficients can be expressed as:
$$a_n = \frac{2}{L}\int_{0}^{L} f_+(x)\cos\left(\frac{\pi n}{L}x\right)dx, \qquad b_n = 0 \ \text{for all } n$$
$$b_n = \frac{2}{L}\int_{0}^{L} f_-(x)\sin\left(\frac{\pi n}{L}x\right)dx, \qquad a_n = 0 \ \text{for all } n$$
Often, the function of interest is defined only in half of the usual range. For example, we might be given a function f(x) defined only in the interval [0, π]. In this case, we cannot readily apply the expressions of the Euler formulae to obtain the coefficients of the Fourier expansion. What is usually done is to assume either a cosine series expansion for the function in the interval [−π, π], or a sine series expansion in the same interval. The first choice produces an even function while the second choice produces an odd function in the interval [−π, π]. Since both expansions approximate the original function accurately in the interval [0, π] (assuming enough terms in the expansion have been retained), it does not matter which choice we make. However, the behavior of the function may make it clear which is the preferred expansion choice, which will also converge faster to the desired result.
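Since the coefficients are just integrals, they can be evaluated numerically for any given f(x). The sketch below is an added illustration, not part of the original text; the test function |x|, the half-period L = 2 and the truncation order are arbitrary choices, and scipy quadrature stands in for the Euler formulae. Because the test function is even, the sine coefficients come out essentially zero, as stated above.

```python
import numpy as np
from scipy.integrate import quad

L, nmax = 2.0, 20
f = lambda x: np.abs(x)                      # an even test function on [-L, L]

a0 = quad(f, -L, L)[0] / (2 * L)
a = [quad(lambda x: f(x) * np.cos(np.pi * n * x / L), -L, L)[0] / L for n in range(1, nmax)]
b = [quad(lambda x: f(x) * np.sin(np.pi * n * x / L), -L, L)[0] / L for n in range(1, nmax)]

print(max(abs(bn) for bn in b))              # essentially zero: sine terms vanish for an even f

def series(x):
    return a0 + sum(a[n - 1] * np.cos(np.pi * n * x / L) +
                    b[n - 1] * np.sin(np.pi * n * x / L) for n in range(1, nmax))

for x0 in (-1.0, 0.3, 1.5):
    print(f(x0), series(x0))                 # the truncated expansion tracks f(x) inside [-L, L]
```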

5.1.3

Complex Fourier expansions

We can generalize the notion of the Fourier expansion to complex numbers by using the complex exponentials exp(inx) as the basis set, instead of

(5. With this complex notation we obtain the complex Fourier series expansion: f (x) = ∞ −∞ cn einx (5. f (x) and g(x). The values of these coefficients can be obtained from the Euler formula for the complex exponential eix = cos(x) + i sin(x) and the Euler formulae for coefficeints that appear in the real Fourier series expansion. The coefficient cn of exp(inx) is given by: cn = 1 2π π −π f (x)e−inx dx (5. π]. . Alternatively.11) by e−imx and integrate over x in [−π. we can multiply both sides of Eq. f (x) = ∞ n=−∞ cn einx g(x) = ∞ m=−∞ dm eimx that is. the fact that the complex exponentials with different indices n and m are orthogonal. using: π −π ei(n−m)x dx = 2πδnm that is. Then the convolution of the two functions will be given by f ⋆ g(x) ≡ ∞ ∞ 1 2π π −π f (t)g(x − t)dt = π −π 1 cn d m n=−∞ m=−∞ 2π ∞ ∞ ∞ ∞ π −π eint eim(x−t) dt = 1 cn dm eimx n=−∞ m=−∞ 2π ei(n−m)t dt = ∞ n=−∞ 1 cn dm eimx (2πδnm ) = n=−∞ m=−∞ 2π cn dn einx From the last expression we conclude that the Fourier expansion of the convolution of two periodic functions is given by the term-by-term product of the Fourier expansions of the functions.1.12) We discuss next a useful relation that applies to the Fourier expansion of the convolution of two periodic functions.11) where cn are complex constants. FOURIER EXPANSIONS 123 the sines and cosines. Let us assume that the Fourier expansions of these functions are known.5. the coefficients cn and dm have been determined.
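This convolution property is easy to verify numerically. The sketch below is an added illustration, not part of the original text; the two periodic test functions are arbitrary, and numpy's FFT (divided by the number of samples) is used as a stand-in for the coefficients cₙ and dₙ.

```python
import numpy as np

N = 256
x = 2 * np.pi * np.arange(N) / N            # one period of a 2*pi-periodic grid
f = np.exp(np.cos(x))                       # arbitrary smooth periodic test functions
g = np.sin(3 * x) + 0.5 * np.cos(x)

# Fourier coefficients c_n = (1/2pi) * integral f(x) exp(-inx) dx, approximated by a DFT
c = np.fft.fft(f) / N
d = np.fft.fft(g) / N

# convolution (f*g)(x) = (1/2pi) * integral f(t) g(x - t) dt, approximated on the grid
h = np.array([np.sum(f * g[(k - np.arange(N)) % N]) for k in range(N)]) / N
e = np.fft.fft(h) / N

print(np.max(np.abs(e - c * d)))            # machine-precision small: coefficients multiply term by term
```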

those that consitute the basis for the expansion. containing n terms. Therefore. we inevitably introduce an error in the function. the hope is that we can truncate the series to a finite number of terms and still get a reasonable representation. n λj = xf xi p2 (x)dx j (5. A convenient measure of the error introduced by the truncation is the square of the integral of f (x) − sn (x): E(n) = = ∞ j=n+1 xf xi [f (x) − sn (x)]2 dx = xf xi xf xi   ∞ j=n+1 xf xi c2 j p2 (x)dx + j = ∞ ∞ j=k=n+1 cj pj (x) dx 2 cj ck pj (x)pk (x)dx c2 λ j j j=n+1 where the last equation is obtained from the orthogonality of the functions pj (x). the error can be bound by ∞ E(n) ≤ c2 (j)λ(j)dj n . which is not possible from a practical point. xf ]: xf xi pj (x)pk (x)dx = δjk λj . using series expansions to represent a function is useful because we have to deal with simpler functions.4 Error analysis in series expansions As we have mentioned several times before. Let us assume that we have an infinite series representation of the function f (x) f (x) = ∞ cj pj (x) j=1 where pj (x) is a set of orthogonal functions in the interval [xi . It is useful to have a measure of this error. which we will denote by n. typically as a function of the number of terms retained. (5. FOURIER ANALYSIS 5. When we do this truncation.13) We define sn (x) = cj pj (x) j=1 which is the approximation of f (x) by a truncated series. as it is represented by the finite number of terms that we have retained in the series. However.1.124 CHAPTER 5. the representation of the original function in terms of a series expansion is exact only if we include an infinite number of terms in the expansion. By considering cj and λj as functions of j. Eq.13).

2. The behavior of this quantity is governed by the diffusion equation ∂2T ∂T κ 2 = (5.1 Application to differential equations Diffusion equation The diffusion equation is typically used to describe how a physical quantity spreads with time and space.5.2.2 5. which we call −c. Assume that the temperature distribution is given at t = 0. t) = f (x)g(t) We substitute this expression in the differential equation that T (x. From this we obtain: √ c f ′′ (x) = − f (x) ⇒ f (x) = f0 e±ix c/κ κ In order to the satisfy boundary condition at t = 0.14) ∂x ∂t where κ is the thermal diffusivity coefficient. APPLICATION TO DIFFERENTIAL EQUATIONS 125 5. by assuming that T (x. t) = T (L. a function of x is equal to a function of t which is possible only if they are both equal to a constant. Consider T (x. and that the boundary conditions are: T (0. The usual way of solving differential equations is by separation of variables. y) obeys to obtain: f ′′ (x) g ′ (t) κf ′′ (x)g(t) = f (x)g ′(t) ⇒ κ = f (x) g(t) In the last expression. t) to be the temperature distribution in a physical system. assuming for the moment that c > 0. t) = 0 for all t. t) = Tn sin g ′(t) = −cg(t) ⇒ g(t) = g0 e−tc nπ 2 2 2 x e−κn π t/L L . that is. we must choose the √ linear combination of e±ix c/κ functions which vanishes at x = 0: the proper choice is √ √ −ix c/κ +ix c/κ −e = 2i sin(x c/k) e In order to satisfy the boundary condition at x = L we must have sin(L c/κ) = 0 ⇒ L c/κ = nπ ⇒ c = n2 π 2 κ/L2 With this we obtain: (0) Tn (x.

consider two-dimensional functions in (x. y): ∂2 ∂2 + 2 Φ(x. t) = L n=1 in order to solve the diffusion equation. 0) = 0.15) . Our choice of the expansion automatically satisfies the boundary conditions T (0. FOURIER ANALYSIS and the most general solution is a superposition of these terms: T (x. The solution proceeds as follows: we substitute the Fourier expansion in the differential equation: κ ∞ n=1 bn (t)(− ∞ dbn (t) n2 π 2 nπ nπ ) sin x = sin x 2 L L dt L n=1 Equating term-by-term the two sides of the equation we obtain: n2 π 2 dbn (t) 2 2 2 = κbn (t)(− 2 ) ⇒ bn (t) = e−κn π t/L bn (0) dt L which gives the solution to the initial equation: T (x. we can use the Fourier expansion for the temperature distribution: ∞ nπ bn (t) sin x T (x. t) = ∞ n=1 (0) Tn e−κn 2 π 2 t/L2 sin nπ x L Alternatively. with Tn = bn (0). t) = ∞ n=1 bn (0)e−κn 2 π 2 t/L2 sin nπ x L and bn (0) are obtained from T (x. 0) = T (L. y) = −4πρ(x.126 CHAPTER 5.2. 5. The advantage of using the Fourier expansion was that the boundary conditions are automatically satisfied by proper choice of expansion functions.2 Poisson equation Teh Poisson equation is typically used to determine a multi-dimensional potential function Φ which arises form a source function with the same dimensionality. This is identical to the expression obtained from usual approach (0) through separation of variables. y) ∂x2 ∂y (5. For example. 0) (which is assumed given) by using Euler formulae.
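A minimal numerical sketch of the Fourier-expansion solution of the diffusion equation derived above is given below; it is an added illustration, not part of the original text. The initial profile T(x,0) = x(L−x), the values of κ and L, and the number of modes are arbitrary choices; the coefficients bₙ(0) are obtained from the sine-series Euler formula and each mode is then propagated with the exponential factor.

```python
import numpy as np
from scipy.integrate import quad

kappa, L, nmodes = 0.1, 1.0, 40
T0 = lambda x: x * (L - x)                   # initial profile, vanishes at x = 0 and x = L

# b_n(0) = (2/L) * integral_0^L T(x,0) sin(n pi x / L) dx
b0 = [2 / L * quad(lambda x: T0(x) * np.sin(n * np.pi * x / L), 0, L)[0]
      for n in range(1, nmodes + 1)]

def T(x, t):
    return sum(b0[n - 1] * np.exp(-kappa * n**2 * np.pi**2 * t / L**2)
               * np.sin(n * np.pi * x / L) for n in range(1, nmodes + 1))

print(T(0.5, 0.0))   # ~0.25: reproduces the initial profile at x = L/2
print(T(0.5, 1.0))   # decays toward zero as t grows
```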

m einπx/a eimπy/b n=−∞ m=−∞ b a 1 Φ(x. ∞ −∞ δ(x − x′ )dx′ = 1 (5. y).3. y) 5.1 The δ-function and the θ-function Definitions A very useful. since ρ(x. y) = cn.16) .m 4π which gives the solution for Φ(x. Specifically: ρ(x.m = ∞ ∞ cn. y) = ∞ ∞ dn. Then ∞ ∞ ∂2 ∂2 n2 π 2 m2 π 2 inπx/a imπy/b + 2 Φ(x.3. n2 π 2 m2 π 2 + 2 a2 b −1 = −4πρ(x.m = dn. if somewhat unconventional. Its definition is given by the relations:: δ(0) → ∞.m given by dn. we use two-dimensional Fourier expansions for Φ(x. δ(x = 0) = 0. function is the so called δfunction.m = 1 4ab a −a b −b ρ(x. y) is known. y)e−inπx/a e−imπy/b dxdy 4ab −a −b The Poisson equation can then be solved by equating coefficients term-byterm on the two sides of the equation and obtaining the coefficients of Φ from those of ρ. y)e−inπx/a e−imπy/b dxdy and considered known. also known as the Dirac function.3 5.m einπx/a eimπy/b n=−∞ m=−∞ with the coefficients dn. y) for a given source function ρ(x.5.m − 2 − 2 e e ∂x2 ∂y a b n=−∞ m=−∞ and comparing the two sides term by term we obtain ⇒ cn. y) which are defined in 0 ≤ x ≤ a and 0 ≤ y ≤ b: Φ(x. y) and ρ(x. y). which is presumed known. y) = cn. THE δ-FUNCTION AND THE θ-FUNCTION 127 To find the solution for the potential Φ(x.

and it integrates to unity. x) = . for |x| ≤ a 2a . for x < x′ . which gives back the function f (x). also known as the Heavyside function: θ(x − x′ ) = 0. it follows that the product of the δ-function with an arbitrary function f (x) integrated over all values of x must satisfy: ∞ −∞ f (x′ )δ(x − x′ )dx′ = f (x) (5. Thus. for x > x′ (5. it is a function with an infinite peak at the zero of its argument. The value of f (x) that is picked by this operation corresponds to the value of x for which the argument of the δ-function vanishes. 5. The set of of values of f (x) that is picked by this operation corresponds to the values of x for which the argument of the θ-function is positive.128 θ (x) CHAPTER 5. The δ-function is not a function in the usual sense. 5. The behavior of the δ-function is shown schematically in Fig. The θ-function picks up a set of values of the function f (x) when the product of the two is integrated over the entire range of values of f (x). a simple generalization of the Kronecker δ is: 1 w(a.17) The behavior of the θ-function is shown schematically in Fig. θ(x − x′ ) = 1. it is a function represented by a limit of usual functions. it is zero everywhere else. For example. From its definition.2: Definition of the θ-function and the δ-function (a → 0). that is. the δ-function picks out one value of another function f (x) when the product of the two is integrated over the entire range of values of f (x). FOURIER ANALYSIS δ (x) height ~ 1/a 1 width ~ a 0 x 0 x Figure 5.2.18) This expression can also be thought of as the convolution of the function f (x) with the δ-function. A function closely related to the δ-function is the so called θ-function or step-function.2.

1 The Fourier transform Definition of Fourier Transform The Fourier transform is the limit of the Fourier expansion for non-periodic functions that are defined in interval −∞ < x < ∞. 5.4.3: Examples of simple functions which in the proper limit become δfunctions: the window function w(a. Taking the limit of w(a.4 5. for |x| > a (5. (5. 5. x) = |x| 1 1− . since its product with any other function picks out the values of that function in the range −a ≤ x ≤ a. Another simple representation of the δ-function is given by the triangular spike function: s(a.19) We will refer to this as the “window” function.5. for |x| ≤ b for |x| > b (5. in a window of width 2a in x around zero (the window can be moved to any other point x0 on the x-axis by replacing x by x − x0 in the above relqations). = 0. THE FOURIER TRANSFORM y 1/2a y 1/b 129 -a 0 a x -b 0 b x Figure 5. The window function and the triangular spike function are shown in Fig.19) and the triangular spike function.20). (5.4. We need to make the . x) for a → 0 produces a function with the desired behavior to represent the δ-function. that is. b b = 0.20) which for b → 0 produces a δ-function. x) defined in Eq. defined in Eq. Both of these functions are continuous functions of x but their first derivatives are discontinuous.3.

130 CHAPTER 5. because: ∞ −∞ f (x)e−iωx dx ≤ ∞ −∞ |f (x)|dx ≤ M From the expression for f (x) in terms of the Fourier expansion. From the Fourier expansion coefficients we obtain: ∞ ˜ lim (cn 2L) = f (x)e−iωx dx ≡ f (ω) L→∞ −∞ ˜ where f (ω) is a finite quantity. when L → ∞ the spacing of values of ω becomes infinitesimal. we start with the Fourier expansion: f (x) = ∞ n=−∞ cn einπx/L cn = 1 2L L −L f (x)e−inπx/L dx Define a variable ω = nπ/L. Note that. To derive the Fourier transform. which can be easily derived from its definition: .4. FOURIER ANALYSIS following assumption in order to make sure that the Fourier transform is a mathematically meaningful expression: ∞ −∞ |f (x)|dx ≤ M where M is a finite number.2 Properties of the Fourier transform We discuss several properties of the FT. we obtain: f (x) = ∞ n=−∞ cn einπx/L = ∞ n=−∞ (2Lcn )einπx/L ∞ 1 1 ˜ = f (ω)eiωx 2L n=−∞ 2L The spacing of values of the new variable ω that we introduced above is: ∆ω = dω 1 ∆ω 1 π = ⇒ = ⇒ lim L→∞ 2L L 2L 2π 2π and therefore. the infinite sum can be turned into an integral in the continuous variable ω: 1 ∞ ˜ f(ω)eiωx dω f (x) = 2π −∞ 5.

the FT of f (n) (x) is given by (iω)n f (ω). Shifting: Given that f(ω) is the FT of f (x). b. we can obtain the FT of f (x − x0 ): ∞ −∞ f (x − x0 )e−iωx dx = ∞ = e −∞ −iωx0 f (x − x0 )e−iω(x−x0 ) d(x − x0 )e−iωx0 ˜ f(ω) (5. Derivative: Given that f (ω) is the FT of f (x).26) . then the FT of the linear combination of f (x) and g(x) with two arbitrary constants a.4. is: ˜ af (x) + bg(x) −→ af (ω) + b˜(ω) g (5.23) ˜ = f (ω − ω0 ) ˜ 3.25) This is easily generalized to the expression for the nth moment: ∞ −∞ x f (x)dx = n n˜ nd f i (0) dω n (5. the FT of eiω0 x f (x) is: ∞ −∞ f (x)eiω0 x e−iωx dx = ∞ −∞ f (x)e−i(ω−ω0 )x dx (5. 4. Linearity: If f (ω) and g (ω) are the FT’s of the functions f (x) and ˜ g(x). we can obtain the ′ FT of f (x).21) ˜ 2. assuming that f (x) → 0 for x → ±∞: ∞ −∞ f ′ (x)e−iωx dx = f (x)e−iωx ∞ −∞ − ∞ ˜ = iω f(ω) −∞ f (x)(−iω)e−iωx dx (5.24) ˜ Similarly.5. Monents: The nth moment of the function f (x) is defined as: ∞ −∞ xn f (x)dx ˜ From the definition of the FT we see: 0th moment = f (0). We take the derivative of the FT with respect to ω: ˜ df (ω) = dω ˜ df i (0) = dω ∞ −∞ ∞ −∞ f (x)(−ix)e−iωx dx ⇒ xf (x)dx (5. THE FOURIER TRANSFORM 131 ˜ 1.22) Similarly.

g(ω). FOURIER ANALYSIS 5.28) ˜ g = f (ω)˜(ω) ˜ 6.132 CHAPTER 5.11): δ(x) = ∞ n=−∞ dn ei L x ⇒ dn = nπ 1 2L L −L δ(x)e−i L x dx = nπ 1 2L (5.29) The Fourier transform of the correlation is: c(ω) = ˜ = ∞ −∞ ∞ −∞ c[f. Convolution: Consider two functions f (x). FT of δ-function: To derive the Fourier transform of the δ-function we start with its Fourier expansion representation as in Eq. The convolution f ⋆ g(x) is defined as: ˜ f ⋆ g(x) ≡ ∞ −∞ f (t)g(x − t)dt = h(x) (5. g(x) and their FT’s f (ω). g](x) ≡ ∞ −∞ f (t + x)g(t)dt = ∞ −∞ f (t)g(t − x)dt (5. the Inverse Fourier transform of the δ-function gives: δ(x − x′ ) = 1 2π 1 ′ ˜ δ(ω)eiω(x−x ) dω = 2π −∞ ∞ ∞ −∞ eiω(x−x ) dω ′ (5.(5.30) ˜ g = f (ω)˜(−ω) 7.31) which in the limit L → ∞ produces: ˜ δ(ω) = lim (2Ldn ) = 1 L→∞ (5. g](x) is defined as: c[f.33) .27) The Fourier transform of the convolution is: ˜ h(ω) = = ∞ −∞ ∞ −∞ h(x)e−iωx dx = ∞ −∞ ∞ −∞ dxe−iωx ∞ −∞ f (t)g(x − t)dt dxe−iω(x−t) g(x − t) e−iωt f (t)dt (5. g(x) and their FT’s ˜ f (ω). g](x)e−iωx dx = ∞ −∞ ∞ −∞ e−iωx ∞ −∞ f (t + x)g(t)dt dx e−iω(t+x) f (t + x)dx e−i(−ω)t g(t)dt (5. g(ω). Correlation: Consider two functions f (x). ˜ The correlation c[f.32) Therefore.

We give below several examples of Fourier transforms.

Example 5.3: Consider the function f(x) defined as
$$f(x) = e^{-ax}, \ \ x > 0; \qquad f(x) = 0, \ \ x < 0$$
The FT of this function is
$$\tilde f(\omega) = \int_0^{\infty} e^{-i\omega x - ax}\, dx = \frac{1}{i(\omega - ia)}$$

which is easily established by the regular rules of integration. The inverse FT gives: 1 ∞ eiωx f (x) = dω 2π −∞ i(ω − ia) To perform this integral we use contour integration: For x > 0, we close the contour on the upper half plane to satisfy Jordan’s Lemma, f (x) = 1 2πi(Residue at ω = ia) = e−ax 2π

For x < 0, we close the contour on the lower half plane to satisfy Jordan’s Lemma, f (x) = 0 (no residues in contour)

Example 5.4: Consider the function f(x) defined as
$$f(x) = 1, \ \text{for} \ -1 < x < 1; \qquad f(x) = 0 \ \text{otherwise}$$
The FT of this function is
$$\tilde f(\omega) = \int_{-1}^{1} e^{-i\omega x}\, dx = \frac{1}{-i\omega}\left(e^{-i\omega} - e^{i\omega}\right) = \frac{2\sin\omega}{\omega}$$
The inverse FT gives:
$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \left(\frac{e^{i\omega}}{i\omega} - \frac{e^{-i\omega}}{i\omega}\right) e^{i\omega x}\, d\omega = \frac{1}{2\pi i}\left[\int_{-\infty}^{\infty} \frac{e^{i\omega(x+1)}}{\omega}\, d\omega - \int_{-\infty}^{\infty} \frac{e^{i\omega(x-1)}}{\omega}\, d\omega\right]$$


The last two integrals can be done by contour integration:
$$I_1 = \int_{-\infty}^{\infty} \frac{e^{i\omega(x+1)}}{\omega}\, d\omega = \pi i, \ \text{for } x+1>0; \quad = -\pi i, \ \text{for } x+1<0$$
$$I_2 = \int_{-\infty}^{\infty} \frac{e^{i\omega(x-1)}}{\omega}\, d\omega = \pi i, \ \text{for } x-1>0; \quad = -\pi i, \ \text{for } x-1<0$$
Combining the results, we find:
−∞ < x < −1: both x + 1 < 0 and x − 1 < 0 → f(x) = 0
−1 < x < 1: x + 1 > 0 and x − 1 < 0 → f(x) = 1
1 < x < ∞: both x + 1 > 0 and x − 1 > 0 → f(x) = 0

Example 5.5: Consider the function defined as f (x) = 1, for − 1 < x < 1, = 0 otherwise and calculate its moments using FT: 2 sin ω ω3 2 ω5 1 1 ˜ f(ω) = ω− = + − · · · = 2 − ω2 + ω4 − · · · ω ω 6 120 3 60 From this we obtain: ˜ ˜ ˜ ˜ d2 f 2 d3 f d4 f 2 df ˜ f(0) = 2, i (0) = 0, i2 2 (0) = , i3 3 (0) = 0, i4 4 (0) = , · · · dω dω 3 dω dω 5 which give the 0th , 1st, 2nd , 3rd , 4th , ... moments of f (x). In this case the results can be easily verified by evaluating
$$\int_{-\infty}^{\infty} f(x)\,x^n\, dx = \int_{-1}^{1} x^n\, dx = \frac{2}{n+1} \qquad (5.34)$$

for n = even, and 0 for n = odd. Example 5.6: Consider the function f (x) which has a discontinuity ∆f at x0 : ∆f = lim[f (x0 + ǫ) − f (x0 − ǫ)]
ǫ→0

We want to calculate the FT of this function:
$$\tilde f(\omega) = \int_{-\infty}^{\infty} e^{-i\omega x} f(x)\, dx = \lim_{\epsilon\to 0}\left[\int_{-\infty}^{x_0-\epsilon} e^{-i\omega x} f(x)\, dx + \int_{x_0+\epsilon}^{\infty} e^{-i\omega x} f(x)\, dx\right]$$
$$= \lim_{\epsilon\to 0}\left[ f(x)\,\frac{e^{-i\omega x}}{-i\omega}\Bigg|_{-\infty}^{x_0-\epsilon} + f(x)\,\frac{e^{-i\omega x}}{-i\omega}\Bigg|_{x_0+\epsilon}^{\infty}\right] + \int_{-\infty}^{\infty} \frac{e^{-i\omega x}}{i\omega}\, f'(x)\, dx$$
$$= \frac{e^{-i\omega x_0}}{i\omega}\,\Delta f + \frac{1}{i\omega}\int_{-\infty}^{\infty} e^{-i\omega x} f'(x)\, dx$$

where we have assumed that f(x) → 0 for x → ±∞, which is required for the FT to be meaningful. Similarly, a discontinuity Δf⁽ⁿ⁾ in the nth derivative produces the following FT:
$$\frac{e^{-i\omega x_0}}{(i\omega)^{n+1}}\,\Delta f^{(n)}$$

Example 5.7: Calculate the FT of the gaussian function:
$$f(x) = \frac{1}{a\sqrt{2\pi}}\, e^{-x^2/2a^2} \ \rightarrow \ \tilde f(\omega) = \frac{1}{a\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2a^2 - i\omega x}\, dx$$

Notice that we are dealing with a properly normalized gaussian function. Complete the square in the exponential: x2 + iωx = 2a2 x √ a 2 =
2

+2

x √ a 2
2

ω 2 a2 ω 2 a2 iωa √ − + 2 2 2 + ω 2 a2 2

x iωa √ +√ a 2 2

√ which when inserted in the integral, with the change of variables: z = x/a 2+ √ √ iωa/ 2 ⇒ dz = d(x/a 2), gives: 1 ˜ f(ω) = √ π
x=∞ x=−∞

e−z dz e−ω

2

2 a2 /2

= e−ω

2 a2 /2

where we have used the fact that
x=∞ x=−∞

e−z dz =

2

π,

z = x + iω

Example 5. we find that the function whose FT is 1 ˜ g (ω) + f (ω) = πδ(ω) + ˜ iω is the sum of the function g(x) = 1/2 for all x. for x > 0. Example 5. that is the Heavyside step function. FOURIER ANALYSIS as we showed in chapter 1 1 . and in the lower half plane for x < 0.9: Which function gives g (ω) = πδ(ω) as its Fourier transform? ˜ Here we define the δ-function so that: ∞ −∞ δ(x)dx = 1 ⇒ g(x) = ∞ −∞ δ(x)f (x)dx = f (0) 1 ∞ 1 dωeiωx πδ(ω) = 2π −∞ 2 Using the linearity of the FT. and its FT becomes 1 for all ω. The function f (x) then is: f (x) = 1/2. which gives f (x) = 1/2. 1 .10: From the shifting properties of the FT and Example 5.9. we deduce that the function whose FT is ˜ f(ω) = πδ(ω − ω0 ) + πδ(ω + ω0 ) Notice that this result was proven for a real variable x. Example 5.8: Which function gives 1 ˜ f (ω) = iω as its Fourier transform? f (x) = 1 2π ∞ −∞ dω eiωx iω This is evaluated by contour integration. which gives f (x) = −1/2. f (x) = −1/2. 0. it is easy to extend it to complex variable z = x + iω by the same trick we used to prove the original result. and the function f (x) of the previous example. which gives the following function: 1. for x < 0. Therefore the FT of a gaussian is a gaussian. shifted down by 1/2. for x < 0.136 CHAPTER 5. The inverse FT also involves a gaussian integral which is again performed by completing the square. and simply shifting the origin of both axes by −iω. for x > 0. However. closing in the upper half plane for x > 0. Notice that for a → 0 the gaussian in x becomes a δ-function. This is just like the Heavyside step function.

5.4. THE FOURIER TRANSFORM is given by:

137

1 iω0 x 1 −iω0 x e + e = cos(ω0 x) 2 2 The function whose FT is ˜ f (ω) = −iπδ(ω − ω0 ) + iπδ(ω + ω0 )

is given by:

1 1 −i eiω0 x + i e−iω0 x = sin(ω0 x) 2 2

5.4.3

Singular transforms - limiting functions
We have seen in the previous section that
$$\tilde f(\omega) = \frac{1}{i\omega} \qquad (5.35)$$
is the Fourier transform of the function
$$f(x) = \frac{1}{2}, \ \text{for } x > 0 \qquad (5.36)$$
$$f(x) = -\frac{1}{2}, \ \text{for } x < 0 \qquad (5.37)$$

This FT is singular, i.e. it blows up for ω → 0, which contradicts our ˜ earlier assumption that f(ω) is bounded. The reason for the contradiction is that f (x) is discontinuous: 1 f (x) → , x → 0+ 2 1 f (x) → − , x → 0− 2

If we attempted to calculate the FT from f (x) we would take: ˜ f (ω) =
∞ −∞

e−iωx f (x)dx = −

1 2

0− −∞

e−iωx dx +

1 2

∞ 0+

e−iωx dx

The limits at ±∞ can be evaluated in principal value sense, R → ∞: 1 ˜ f(ω) = − 2
0− −R

e−iωx dx +

1 2

R 0+

e−iωx dx =

1 1 − eiωR + 1 − e−iωR 2iω

In the last expression, [eiωR + e−iωR ] = 2 cos(ωR) vanishes for R → ∞ because the argument of the cosine goes through 2π for infinitesimal changes

Figure 5.4: The function sin(ωR)/ω (height ∼ R, width ∼ 1/R), which behaves just like a δ(ω) function for R → ∞.

in ω, and the average of the cosine function is 0. Then, what is left in the above equation is exactly the relation of Eq. (5.35). We have also seen in the previous section that ˜ f (ω) = πδ(ω) is the Fourier transform of 1 f (x) = , 2 for all x (5.38)

This FT is singular, i.e. it blows up for ω → 0, which contradicts our ˜ earlier assumption that f(ω) is bounded. The reason for the contradiction is that the integral of f (x) is not bounded:
∞ −∞

(1/2)dx → ∞

˜ If we attempted to calculate f (ω) from f (x) we would have: 1 2
∞ −∞

e−iωx dx =

1 lim e−iωx R→∞ −2iω

R −R

= lim

R→∞

sin(ωR) ω

The function sin(ωR)/(ωR) is 1 at ω = 0, falls off for ω → ∞, and has width π/R, obtained from the first zero of sin(ωR) which occurs at ωR = π. This behavior is illustrated in Fig. 5.4. Therefore, the function sin(ωR)/ω


has height ∼ R and width ∼ 1/R. Its width goes to 0 and its heigh goes to ∞ when R → ∞, while its integral is: eiωR − e−iωR 1 dω = 2π 2iω 2π −∞ R→∞

lim

∞ −∞

˜ f (ω)eiω0dω = 2πf (0) = π

Therefore, the function sin(ωR)/ω indeed behaves just like a δ function in the variable ω for R → ∞, but is normalized so that it gives π rather than 1 when integrated over all values of ω, so we can identify lim sin(ωR) ω = πδ(ω)

R→∞

5.5

Fourier analysis of signals

In the following we will consider functions of a real variable and their FT’s. We will assume that the real variable represents time and denote it by t, while the variable that appears in the FT represents frequency, and denote it by ω 2 . Very often, when we are trying to measure a time signal f (t), it is convenient to measure its frequency content, that is, its Fourier transform ˜ f (ω). This is because we can construct equipment that responds accurately to specific frequencies. Then, we can reconstruct the original signal f (t) ˜ by doing an inverse Fourier transform on the measured signal f(ω) in the frequency domain: FT ˜ IFT f (t) −→ f(ω) −→ f (t) Some useful notions in doing this type of analysis are the total power and the spectral density. The total power of the signal is defined as: total power : P = while the spectral density is defined as ˜ spectral density : |f (ω)|2 A useful relation linking these two notions is Parceval’s theorem
$$\text{total power}: \ \ P = \int_{-\infty}^{\infty} |f(t)|^2\, dt \qquad (5.39)$$
$$\text{spectral density}: \ \ |\tilde f(\omega)|^2 \qquad (5.40)$$
$$\int_{-\infty}^{\infty} |\tilde f(\omega)|^2\, d\omega = 2\pi P \qquad (5.41)$$

The variables t and ω are referred to as “conjugate”, and their physical dimensions are the inverse of each other, so that their product is dimensionless.

An essential feature of time-signal measurements is their sampling at finite time intervals. assume that we are dealing with “band-width limited” signal. which produced a δ function with argument t − t′ .140 CHAPTER 5. which we will call τ tn = nτ. Integrating the absolute value squared of the above expression over all values of ω leads to: ∞ −∞ ˜ |f(ω)|2dω = ∞ −∞ ∞ −∞ ∞ −∞ ∞ −∞ f (t)f(t′ ) ∞ −∞ eiω(t −t) dω dtdt′ = ∞ −∞ ′ 2π f (t)f(t′ )δ(t − t′ )dtdt′ = 2π |f (t)|2 dt where we have used Eq. Usually. n = −N. suppose that we sample a time signal f (t) at the 2N discrete time moments separated by a time interval 3 .33). 0. FOURIER ANALYSIS We can prove Parceval’s theorem with the use of the FT of the δ-function. . but this is a misnomer because a rate has the dimensions of inverse time. . Eq. This leads to important complications in determining the true signal from the measured signal. .43) This time interval is often called the “sampling rate”. we are interested in the limit of N → ∞ and τ → 0 with 2Nτ = ttot . and then performed an integration over one of the two time variables. . the 2N values of f (tn ) are what we can obtain from an experiment. a signal whose FT vanishes for frequencies outside the range |ω| > ωc : ˜ f (ω) = 0 for ω < −ωc and ω > ωc For such a signal. N − 1 (5. that is. the so-called “sampling theorem” asserts that we can reconstruct the entire signal using the expression: f (t) = 3 N −1 n=−N f (tn ) sin ωc (t − tn ) ωc (t − tn ) (5. . (5. Let us define the characteristic frequency ωc (also called Nyquist frequency) π ωc = τ Then. .33) to perform the integration over ω in the square brackets.42) that is. . the total duration of the measurements. To illustrate this problem. From the definition of the FT of f (t) we have: ˜ f (ω) = ∞ −∞ ˜ e−iωt f (t)dt ⇒ |f (ω)|2 = ∞ −∞ e−iωt f (t)dt ∞ −∞ ¯ eiωt f (t′ )dt′ ¯ where f is the complex conjugate of f . (5. . The result is Parcevel’s theorem. .

because it gives the time signal f (t) for all values of t (which is a continuous variable).5. We can apply this analysis to the sampling of a signal f (t). that is. If we are still forced to sample such a signal at the same time moments defined above. FOURIER ANALYSIS OF SIGNALS 141 where tn are the moments in time defined in Eq. When the FT of the signal is measured by sampling the signal at intervals τ apart. at least in the regions near ±ωc . Similarly. This effect is called “aliasing”. then the contributions of these frequencies will not be differentiated. f2 (t) = eiω2 t . which significantly alters the measured specturm from the true one. (5. the contribution of the frequency ωc +δω (where δω > 0) will appear at the frequency −ωc +δω because theses two frequencies differ by 2ωc . they each have a unique frequency content and the two frequencies differ by integer multiple of 2ωc . This is a very useful expression. the information content of the signal can be determined entirely by sampling it at the time moments tn .42). For instance. To show this. then performing an inverse Fourier transform will create difficulties. k : integer that is. However. In other words. consider two functions f1 (t) and f2 (t) which are described by f1 (t) = eiω1 t . If the signal contains frequencies that differ by integer multiples of 2ωc . the contribution of the frequency −ωc − δω will appear at the frequency ωc − δω because theses two frequencies differ by 2ωc . we will have f1 (tn ) = eiω1 tn = ei(ω2 +2ωc k)tn = eiω2 tn = f2 (tn ) because from the definitions of ωc and tn we have: ei2ωc ktn = ei2πkn = 1 The result is that the two functions will appear identical when sampled at the time moments tn .5. even though we measured it only at the discrete time moments tn . The reconstruction of the signal f (t) from the measurements f (tn ) essentially amounts to an inverse Fourier transform. typically a time signal is not band-width limited. the components that fall outside the interval −ωc ≤ ω ≤ ωc will appear as corresponding to a frequency within that interval. Fortunately. typical signals . from which they differ by 2ωc . ω1 − ω2 = 2ωc k. it has Fourier components for all values of ω. Then. if these two functions are sampled at the time moments tn . The net result is that the frequency spectrum outside the limits −ωc ≤ ω ≤ ωc gets mapped to the spectrum within these limits.

the Nyquist frequency is evidently given by ωc = π/τ = 256.

namely: +ω1 + ω2 + ω3 −ω1 + ω2 + ω3 +ω1 − ω2 + ω3 −ω1 − ω2 + ω3 = = = = +25 + 60 + 155 = 240 −25 + 60 + 155 = 190 +25 − 60 + 155 = 120 −25 − 60 + 155 = 70 The other combinations of the frequencies which have negative values appear in the [257.5. The additional feature at ω = 0 comes from the uniform shift of the signal. there are four very prominent peaks in the Fourier transform of the signal in the interval [1. ω2 = 60. 256] 4 . 4 . which appears as a component of zero frequency. 512] range by aliasing: −ω1 − ω2 − ω3 + 512 +ω1 − ω2 − ω3 + 512 −ω1 + ω2 − ω3 + 512 +ω1 + ω2 − ω3 + 512 = = = = 512 − 240 = 272 512 − 190 = 322 512 − 120 = 392 512 − 70 = 442 Notice that all frequencies have equal Fourier components. ω3 = 155 Indeed.5. as expected from the form of the signal. 5. we obtain: f (t) = 1− 1 i25(2π)t (e − e−i25(2π)t )(ei60(2π)t + e−i60(2π)t )(ei155(2π)t − e−i155(2π)t ) 8 from which we expect the FT to contain δ-functions at all the possible combinations ±ω1 ± ω2 ± ω3 of the three frequencies (in units of 2π) ω1 = 25. FOURIER ANALYSIS OF SIGNALS 143 The Fourier analysis of the signal should reveal its characteristic frequencies.6. which are due to the four possible combinations of frequencies that appear in the signal. Expressing the signal in terms of complex exponentials. as is seen in Fig.

Figure 5.6: Illustration of Fourier analysis of a time signal by sampling at a finite time interval. Even though the original signal appears very noisy (top panel), the characteristic frequencies are readily identified by the Fourier transform (bottom panel). The figure also shows the Mathematica commands used to build the sampled signal, take its Fourier transform, and plot the magnitude of the transform.
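The analysis of Example 5.11 can be reproduced with a discrete Fourier transform. The sketch below is an added illustration, not part of the original text; it uses numpy's FFT in place of the Mathematica commands of Fig. 5.6 and samples the same signal at 512 equally spaced moments over one unit of time, so that each frequency combination falls on an integer bin.

```python
import numpy as np

n = np.arange(512)
t = n / 512.0
signal = np.sin(25 * 2 * np.pi * t) * np.cos(60 * 2 * np.pi * t) \
         * np.sin(155 * 2 * np.pi * t) + 1.0

spectrum = np.abs(np.fft.fft(signal))
peaks = np.sort(np.argsort(spectrum)[-9:])   # indices of the nine largest components
print(peaks)
# expected: [0 70 120 190 240 272 322 392 442] -- the four frequency combinations of
# +-25, +-60, +-155, their aliases at 512 minus those values, and the zero-frequency shift
```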

then P (A) = 3/6 = 0. the event space is H. then P (A) = 1/2. which are all the possible outcomes for the face of the coin that is up when the coin lands. if we define as event A that a fair die gives 1. For example. if we label three objects as 145 . the event space cosists of the numbers 1.1) N where m is the number of outcomes in the event space that correpsonds to event A and N is the total number of all possible outcomes in the event space. For example.3.6.T (for “heads” and “tails”).Chapter 6 Probabilities and random numbers 6. that a fair coin gives heads. that the die gives an even number. taken all at a time. 3. if our experiment involves the tossing of a fair die. which are all the possible outcomes for the top face of the labeled die.4.2. This is equal to m P (A) = (6.5. 2.5. Event space: it consists of all possible outcomes of an experiment. If our experiment involves flipping a fair coin. then P (A) = 1/6. Probability of event A: it is the probability that event A will occur. There are n! = n · (n − 1) · (n − 2) · · · 1 permutations of n objects.1 Probability distributions We begin the discussion of probability theory by defining a few key concepts: 1. Permutations: it is the number of ways to arrange n different things. For example.

. we conclude that F (xmax ) = 1.5) for a discrete or a continuous variable. Note that from this definition. acb. q real values.xj ≤x f (xj ) or F (x) = x −∞ f (y)dy (6. But since we do not care which ojects we pick. bac. . . we have to divide the total number by the k possible choices for the first one. etc. . N). Notice that these values enter in the binomial expansion: n n! pk q n−k (6. For f (x) to be a proper probabiity distribution function. 4.4) where the first expression holds for a discrete variable with possible values xj identified by the discrete index j (taking the values 0.2) This expression is easy to justify: there are n ways of picking the first object. We also define the cumulative probability function F (x) that any value of the variable smaller than or equal to x occurs.3) (p + q)n = (n − k)!k! k=0 with p. . . that we have encountered several times before (see Eq. that is. it must be normalized to unity. (n − k + 1) ways of picking the k th object our of a total of n. and the second expression holds for a continuous variable. n − 1 ways of picking the second. c. 3 · 2 · 1 = 6 permutations.146 CHAPTER 6. cab. by the k − 1 possible choices for the second one. .14)). which is given by: F (x) = j. and the probability distribution function f (x) that the value x of the variable occurs. PROBABILITIES AND RANDOM NUMBERS a. b. bca. . Combinations: it is the number of ways of taking k things out of a group of n things. which can be discrete or continuous. The number of combinations of k things out of n are given by: n · (n − 1) · · · (n − k + 1) n! = k · (k − 1) · · · 1 (n − k)!k! (6. without specifying their order or identity. (2. cba. We define a general random variable x. then the possible permutations are: abc. where xmax is the maximum value that x can take (in either the discrete or continuous version). that is: N f (xj ) = 1 or j=0 +∞ −∞ f (x)dx = 1 (6.

the mean and the variance can be expressed as: µ = E(X) σ 2 = E((X − µ)2 ) (6. respectively. The mean value µ and the variance σ of the the probability distribution are defined as: N µ= j=0 N xj f (xj ) or µ = +∞ −∞ xf (x)dx (6.6) From the definition of F (x) we conclude that the total probability of all values of x above m is also 1/2. Thus.9) that is. the median m splits the range of values of x in two parts. With this definition.1. From the definitions of the mean and the variance. The notation E(g(X)) signifies that the expectation involves a function g of the random variable.11) (6. We define the expectation E of a function g(x) of the random variable as N E(g(X)) = j=0 g(xj )f (xj ) or E(g(X)) = +∞ −∞ g(x)f (x)dx (6.12) .7) σ2 = j=0 (xj − µ)2 f (xj ) or σ 2 = +∞ −∞ (x − µ)2 f (x)dx (6.8) for the discrete and continuous cases.6. but the expectation itself is not a function of x. the variance squared is equal to the second moment minus the mean squared. we can easily prove the following relation: N σ2 = j=0 x2 f (xj ) − µ2 or σ 2 = j +∞ −∞ x2 f (x)dx − µ2 (6. since the values of x are summed (or integrated) over in the process of determining the expectation.10) for the discrete and continuous cases. PROBABILITY DISTRIBUTIONS 147 We define the median m as the value of the variable x such that the total probability of all values of x below m is equal to 1/2: F (m) = 1 2 (6. each containing a set of values that have total probability 1/2.

Consider an event A that occurs with probability p. we use the expression of Eq. no matter in what sequence the successes and failures occured.13) fb (x) = (n − x)!x! This is called the binomial distribution for the obvious reason that these expressions are exactly the terms in the binomial expansion. it did not occur on the (n − 1)th trial and it occured in the nth trial. We are interested (as in all cases of probability distributions) in the mean and variance of the binomial distribution. But this corresponds to a particular sequence of trials. Then the probability that event does not occur is q = 1 − p..3) we have: n fb (x) = x=0 n! px q n−x = (p + q)n = 1 x=0 (n − x)!x! n because (p + q) = 1. it did not occur on the second one. . (6. The probability that the event A in n trials occurs x times (x here is by definition an integer.7) for the discrete case: n µb = x=0 x n! px q n−x (n − x)!x! (6. (6. then we have to consider all the possible combinations of the above sequence of outcomes. .148 CHAPTER 6. which we learned earlier is given by Eq.1. PROBABILITIES AND RANDOM NUMBERS 6. we conclude that the probability distribution we are seeking is given by: n! px q n−x (6.15) = (pe−iω + q)n .1 Binomial distribution The simplest case of a probability distribution invloving a single random variable is the binomial distribution.2). since according to Eq. Combining the two results.14) To calculate the value of µb we turn to Fourier transforms. thus we treat it as a discrete variable) is given by: p · q · · · q · p = px q n−x where we assumed that it occured in the first trial. If all we care is to find the probability of x successful outcomes out of n trials. . Note that this is a properly normalized probability distribution function. The Fourier transform of fb (x) is given by ˜ fb (ω) = n n! n! pe−iω px q n−x e−iωx = (n − x)!x! (n − x)!x! x=0 x=0 n x q n−x (6. To this end. (6.
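A short numerical sketch of the binomial distribution is given below; it is an added illustration, not part of the original text, and the values n = 20, p = 0.3 are arbitrary. It confirms the normalization and anticipates the mean np and variance npq derived in the text.

```python
from math import comb

n, p = 20, 0.3
q = 1 - p

fb = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

total = sum(fb)
mean = sum(x * fb[x] for x in range(n + 1))
var = sum(x**2 * fb[x] for x in range(n + 1)) - mean**2
print(total, mean, var)          # ~1.0, n*p = 6.0, n*p*q = 4.2
```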

(5. (6. which.18) where we have used q = 1 − p to obtain the last expression.6. The ensuing distribution for the variable x (x being still an integer (discrete) variable) takes the following form: fp (x) = µx e−µ x! (6. Eq. Since the mean is the first moment of the variable x with respect to the function f (x). n very large.1. np = µ finite and x ≪ n. PROBABILITY DISTRIBUTIONS 149 where we have used the binomial expansion Eq. Eq. Eq. combined with the above result produces for the variance of the binomial distribution: 2 σb = i2 ˜ d2 fb dω 2 ω=0 2 − µ2 ⇒ σb = npq b (6. This is actually the limit of the binomial distribution for p very small. q ≤ 1). as is evident from the definition Eq. Conclusion: the binomial distribution applies when we know the total number of trials n. Both n and x are integers.9).14) to obtain the last step. (5. We can also use the expression for the second moment that involves FT’s.19) . (2. 2 The mean is µb = np and the variance σb = npq.7).16) ω=0 where we have also used the fact that p+q = 1 to obtain the last expression.17) ω=0 But we have found that the variance squared is equal to the second moment minus the mean squared. (6. we can evaluate it from the general expression we derived for the first moment using FT’s.2 Poisson distribution Another important probability distribution of a single variable is the so called Poisson distribution. the probability of success for each trial p (the probability of failure is q = 1−p) and we are interested in the probability of x successful outcomes.1. 6. to obtain: i2 ˜ d2 fb dω 2 = n(n − 1)p2 + np = np − np2 + n2 p2 (6.25): µb = i ˜ dfb dω = np(−i) · i(p + q)n−1 ⇒ µb = np (6.26). both p and q are real (0 ≤ p.
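The statement that the Poisson distribution is the small-p, large-n limit of the binomial (derived in detail in the text that follows) can be checked directly. The sketch below is an added illustration, not part of the original text; the values n = 10000 and p = 0.0005 (so that µ = np = 5) are arbitrary.

```python
from math import comb, exp, factorial

n, p = 10_000, 0.0005            # large n, small p, mu = n*p = 5
mu = n * p

for x in range(10):
    binom = comb(n, x) * p**x * (1 - p)**(n - x)
    poisson = mu**x * exp(-mu) / factorial(x)
    print(x, binom, poisson)     # the two columns agree closely
```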

in the limit when the number of trials n is much larger than the number of successes we interested in x ≪ n. PROBABILITIES AND RANDOM NUMBERS To obtain this from the binomial distribution. Next. we note first that n! = (n − x + 1) · (n − x + 2) · · · (n − 1) · n ≈ nx (n − x)! since for x ≪ n each term in the above product is approximately equal to n and there are x such terms. x is an integer and µ is a real number. x = µ + δ. we will start again with the binomial distribution and set: µ = np. taking exponentials of both sides. From this last result. and we are interested in the probability of x successful outcomes. Combining this and the previous relation between n!/(n − x)! and nx .150 CHAPTER 6. Conclusion: The Poisson distribution applies when we know the mean value µ.3 Gaussian or normal distribution Another very important probability distribution is the so-called Gaussian or normal distribution. the probability of success for an individual trial is very small (p ≪ 1). but the mean is finite (np = µ). since we assumed x ≪ n.20) 2πσ To obtain this expression. 6. This is obtained from the binomial distribution in the limit when n is very large and x is close to the mean µ. n → ∞ (6. we obtain q n−x ≈ e−np where we have neglected xp compared to np.21) .1. and is given by the expression: 1 2 2 fg (x) = √ e(x−µ) /2σ (6. we arrive at (np)x e−np fp (x) ≈ x! which is the desired expression for the Poisson distribution if set µ = np everywhere. we note that q n−x = (1 − p)n ⇒ ln(q n−x ) = n ln(1 − p) − x ln(1 − p) ≈ −np + xp (1 − p)x where we have used q = 1 − p and ln(1 − ǫ) ≈ −ǫ for ǫ → 0.

(6. with ǫ = δ/µ. p and q through the variables µ. we find that the binomial expression takes the form: nn n n! px q n−x ≈ x px q n−x n−x (n − x)!x! x (n − x) x(n − x) 1 =√ 2π np x x 1/2 1 √ 2π nq n−x n−x x(n − x) n −1/2 We will deal with each term in parentheses in the above expression separately. With this.24) which is valid in the limit of large n.s (6. substituting n. using the relations of Eq. PROBABILITY DISTRIBUTIONS with p.21).22) and (6.23). (6. The first term produces: np x x = µ µ+δ µ+δ δ = 1+ µ −(µ+δ) Taking the logarithm of this expression.1.22) (6.23) We will use the Stirling formula n! ≈ (2πn)1/2 nn e−n (6. to rewrite the factorials that appear in the binomial expression Eq.6. the second term in parentheses produces: nq n−x n−x = nq nq − δ nq−δ = 1− δ nq −(nq−δ) .13). Similarly. x and δ. q finite. which implies that δ ≪ np. n − x = n − µ − δ = nq − δ 151 (6. we get −(µ + δ) ln 1 + δ µ = −(µ + δ) δ δ2 δ2 − 2 + · · · = −δ − ··· µ 2µ 2µ where we have used the fact that δ ≪ np = µ and we have employed the Taylor expansion of the logarithm ln(1 − ǫ) for ǫ ≪ 1 and kept the first two terms in the expansion. δ ≪ nq as well as the relations: n − µ = n(1 − p) = nq.

Taking the logarithm of this expression, with ε = δ/(nq), we get

   −(nq−δ) ln(1 − δ/(nq)) = −(nq−δ)(−δ/(nq) − δ²/(2(nq)²) + ···) = δ − δ²/(2nq) + ···

where we have used the fact that δ ≪ nq, and we have employed again the Taylor expansion of the logarithm for ε ≪ 1 and kept the first two terms in the expansion. The product of these two parenthesized terms, which upon taking logarithms is the sum of their logarithms, produces:

   −δ²/(2np) − δ²/(2nq) = −(δ²/2n)(1/p + 1/q) = −δ²(p+q)/(2npq) = −δ²/(2σ²)

since p + q = 1 and σ² = npq. The third term in parentheses produces:

   (x(n−x)/n)^{−1/2} = ((np+δ)/(np) · (nq−δ)/(nq) · npq)^{−1/2} ≈ (npq)^{−1/2} (1 − δ/(2np) + δ/(2nq) + ···) ≈ (npq)^{−1/2} = (σ²)^{−1/2}

Putting all the partial results together, we arrive at:

   f_g(x) ≈ e^{−δ²/2σ²} (2πσ²)^{−1/2}

which, with δ = x − µ, gives the desired result of Eq. (6.20).

In the final form we obtained, the variable x ranges from 0 to n. When n is very large, it is difficult to work with the original values of the parameters that enter in the definition of f_g(x); it is much more convenient to scale and shift the origin of x as follows. We define first the scaled variables

   x̃ = x/n,  µ̃ = µ/n,  σ̃ = σ/n   (6.25)

Since the limits of x are [0, n], the limits of the variable x̃ are [0, 1]. Next, we define the variable y through

   y = (x̃ − µ̃)/σ̃  ⇒  dy = dx̃/σ̃   (6.26)

and the limits of the variable y are [−µ̃/σ̃, (1 − µ̃)/σ̃]. But we have:

   µ = pn ⇒ µ̃ = µ/n = p ⇒ (1 − µ̃) = 1 − p = q

We also have:

   σ² = npq ⇒ σ̃ = σ/n = √(npq)/n = √(pq/n)

With these results, the limits of the variable y become:

   −µ̃/σ̃ = −√n √(p/q),  (1 − µ̃)/σ̃ = √n √(q/p)

and since we are taking the limit of very large n, we conclude that the limits of the variable y are essentially ±∞. Note that the mean and variance of the variable y have now become

   µ_y = (µ̃ − µ̃)/σ̃ = 0,  σ_y = 1

The final expression we produced is a properly normalized probability distribution function, since

   ∫_{−∞}^{+∞} (1/√(2π)) e^{−y²/2} dy = 1

as was shown in Eq. (1.37); x is now a real variable and can range from −∞ to +∞.

Conclusion: the Gaussian or normal distribution is valid when the number of trials n is very large, with p and q finite values between 0 and 1. There are two parameters needed to completely specify this distribution, µ and σ, both real numbers.

It is also convenient to define the following function of the variable w, since it is extremely useful in determining probabilities that involve the gaussian distribution:

   Φ(w) = (1/√(2π)) ∫_{−∞}^{w} e^{−y²/2} dy = (1/√(2π)) ∫_{−w}^{+∞} e^{−y²/2} dy   (6.27)

from which we deduce the following relations:

   Φ(+∞) = 1,  Φ(−∞) = 0,  Φ(−w) = 1 − Φ(w)   (6.28)

The function Φ(w) cannot be easily computed by simple methods, so it has been tabulated (see the Gaussian table in the Appendix at the end of this chapter).

Example 6.1: Five fair coins are tossed at once. What is the probability that the result is at least one head? Answer: There are 2⁵ = 32 combinations, all of which contain at least one head except one, that with all tails. Therefore the probability of at least one head is 31/32.

Example 6.2: A fair coin is tossed n times. What is the probability of getting x heads, where x = 0, 1, ..., n? Answer: Here the binomial distribution applies, because we need to select x successful outcomes out of n tries, with the probability of success in each try p = 0.5 (and probability of failure q = 1 − p = 0.5), and we do not care about the order in which the successes or failures occur:

   f_b(x) = n!/(x!(n−x)!) p^x q^{n−x},  p = q = 1/2

For instance, for n = 10, the probabilities are:

   f_b(0) = 10!/(0!10!) (1/2)⁰ (1/2)¹⁰ = 1/1024
   f_b(1) = 10!/(1!9!) (1/2)¹ (1/2)⁹ = 10/1024
   f_b(2) = 10!/(2!8!) (1/2)² (1/2)⁸ = 45/1024
   f_b(3) = 10!/(3!7!) (1/2)³ (1/2)⁷ = 120/1024
   f_b(4) = 10!/(4!6!) (1/2)⁴ (1/2)⁶ = 210/1024
   f_b(5) = 10!/(5!5!) (1/2)⁵ (1/2)⁵ = 252/1024

Note that from the general expression for f_b(x) we have f_b(n−x) = f_b(x) for 0 ≤ x ≤ n/2, so the remaining probabilities are the same as the ones we already calculated. In Fig. 6.1 we show the values of f_b(x) for all values of 0 ≤ x ≤ n for three values of n = 4, 10, 30.

Example 6.3: A fair coin is tossed n = 10 times. What is the probability of getting at least three heads? Answer: Using the results of the previous example, the probability is 1 minus the probability of getting any number of heads smaller than three, that is, zero, one or two heads:

   1 − f_b(0) − f_b(1) − f_b(2) = 1 − 56/1024 = 0.9453125
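The numbers in Examples 6.2 and 6.3 are easy to verify numerically; the following is a minimal Python sketch (not part of the original text), using the binomial probability defined above:

```python
# Binomial probabilities for a fair coin tossed n = 10 times (Examples 6.2 and 6.3).
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5
probs = [binom_pmf(x, n, p) for x in range(n + 1)]
print([round(q * 1024) for q in probs[:6]])   # [1, 10, 45, 120, 210, 252] (out of 1024)
print(1 - sum(probs[:3]))                     # 0.9453125 = P(at least three heads)
```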

Figure 6.1: The binomial probability f_b(x) for all x ∈ [0, n], for three values of n = 4 (squares), 10 (rhombuses) and 30 (circles). Note that the shape of f_b(x) evolves closer to that of a gaussian for increasing n, and is very close to the gaussian form already for the largest value, n = 30. In order to plot the three distributions on a common scale, the horizontal axis is scaled as x/n.

Example 6.4: We consider a fair coin tossed n = 10⁶ times. What is the probability that there is a 0.1% imbalance in favor of heads in all of the tosses together, that is, at least 50.1% heads or at most 49.9% tails? Answer: Here we are obviously dealing with a gaussian distribution, since n is very large. We could, in principle, use again the binomial distribution, as in Examples 6.2 and 6.3, which would give essentially the same result, but this would lead to extremely tedious and difficult calculations. In the context of the gaussian distribution, we have:

   p = 1/2,  q = 1/2,  µ = np = (1/2)×10⁶,  σ² = npq = (1/4)×10⁶ ⇒ σ = (1/2)×10³

The question asks us to calculate the probability of at most 49.9% tails; this is given by the integral of the gaussian probability distribution over all values of x from 0 (no tails) to x₁ = 0.499n (49.9% tails), with µ and σ determined above:

   ∫₀^{0.499n} (1/√(2πσ²)) e^{−(x−µ)²/2σ²} dx

This, however, is an awkward integral to calculate. Instead, we change variables as in the general discussion of the gaussian distribution:

   x̃ = x/n,  µ̃ = µ/n = 0.5,  σ̃ = σ/n = 0.5×10⁻³

so that the range of the new variable x̃ is [0, 1]. We express the limits of the integral in terms of µ̃ and σ̃:

   x̃₁ = 0.499 = 0.5 − 2×0.5×10⁻³ = µ̃ − 2σ̃
   0 = 0.5 − 10³×0.5×10⁻³ = µ̃ − 10³σ̃

and the desired integral then becomes:

   ∫_{µ̃−10³σ̃}^{µ̃−2σ̃} (1/√(2πσ̃²)) e^{−(x̃−µ̃)²/2σ̃²} dx̃

We next shift the origin of the variable x̃ by −µ̃ and scale it by σ̃, defining a new variable y:

   y = (x̃ − µ̃)/σ̃  ⇒  dy = dx̃/σ̃

with which the above integral becomes:

   ∫_{−∞}^{−2} (1/√(2π)) e^{−y²/2} dy = Φ(−2) = 0.0228

or a little over 2%. The numerical value is obtained from the values of the normalized gaussian integral Φ(w) defined in Eq. (6.27). In the last expression we have replaced the lower limit of y = −10³ with −∞, since for such large negative values of y the integrand is zero for all practical purposes.

Example 6.5: Cars pass through a toll booth at the rate of r = 20 per hour. What is the probability that in 5 minutes 3 cars will go through? Answer: In this case we can calculate the average number of cars µ that pass through the toll booth in the given time interval from the rate r:

   µ = 5 min × (20/60 min) = 5/3

and we also know the number of cars that we want to go through in this interval, x = 3, so we can use the Poisson distribution:

   f_p(x=3) = µ^x e^{−µ}/x! = (5/3)³ e^{−5/3}/3! = 0.145737   (6.29)

Alternatively, we can look at this problem from the point of view of the binomial distribution, but then we must subdivide the total time interval of five minutes into more elementary units τ in order to have a well posed problem, and eventually take the limit of these elementary units going to zero. For example, for τ = 60 sec there are n = 5 such subdivisions in the total time interval of interest (5 min), the probability of success in one subdivision is

   p = 1 min × (20/60 min) = 1/3

the number of successful outcomes is x = 3, and the probability of failure is q = 1 − p = 2/3, giving:

   n = 5, p = 1/3, q = 2/3, x = 3 ⇒ f_b(3) = 5!/(3!2!) (1/3)³ (2/3)² = 0.1646

Similarly, for τ = 0.5 min we find:

   n = 10, p = 1/6, q = 5/6, x = 3 ⇒ f_b(3) = 10!/(3!7!) (1/6)³ (5/6)⁷ = 0.1550

and for τ = 0.25 min we find:

   n = 20, p = 1/12, q = 11/12, x = 3 ⇒ f_b(3) = 20!/(3!17!) (1/12)³ (11/12)¹⁷ = 0.1503

We see that the result keeps changing and gets closer and closer to the answer we obtained with the Poisson distribution as we decrease the size of the subdivision τ or, equivalently, increase the number of trials n. It is evident that in the limit τ → 0 we find n → ∞ and p → 0, but np = 5/3 = µ stays finite, since nτ equals the total time interval (5 min in this example), and we recover the earlier result of Eq. (6.29), as in the case of the Poisson distribution.
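The convergence of the binomial result toward the Poisson value as the subdivision becomes finer can be checked directly; below is a short Python sketch (ours, for illustration, not code from the text):

```python
# Example 6.5: binomial probability of x = 3 cars in 5 minutes for finer subdivisions,
# compared with the Poisson limit.
from math import comb, exp, factorial

mu, x = 5.0 / 3.0, 3
for n in (5, 10, 20, 100, 1000):
    p = mu / n                                   # probability of one car per subdivision
    fb = comb(n, x) * p**x * (1 - p)**(n - x)
    print(n, fb)                                 # n=5: 0.1646, n=10: 0.1550, n=20: 0.1503, ...
print(mu**x * exp(-mu) / factorial(x))           # Poisson limit: ~0.145737
```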

6.2 Multivariable probabilities

We consider next cases where we have more than one random variable. We will deal explicitly with two random variables, but the discussion is easily generalized to more. We will use the symbol P(X, Y) to denote the value of the probability that depends on two variables, without specifying the values of the variables (this is what the capital symbols X, Y mean). When these variables take specific values, the quantity P(X, Y) is equal to the probability of these specific values occurring. We define the probability distribution f(x, y) for the two random variables x, y as the probability of getting the first variable with value x and the second variable with value y:

   f(x, y) = P(X = x, Y = y)   (6.30)

With this definition, we can define the probability of getting the value x for the first variable, independent of the value of the second variable, whether this is a discrete or a continuous variable. This is given by:

   f₁(x) = P(X = x, Y: arbitrary) = Σ_y f(x, y)   (6.31)

where the symbol Σ_y implies a summation over all the possible values of y. The quantity f₁(x) is known as the "marginal probability" for the random variable x. Similarly, the probability of getting the value y for the second variable, independent of the value of the first variable, is given by:

   f₂(y) = P(X: arbitrary, Y = y) = Σ_x f(x, y)   (6.32)

which is the marginal probability for the random variable y. If the two variables are independent of each other, then the following relation is true:

   f(x, y) = f₁(x) f₂(y)   (6.33)

A consequence of this is that the cumulative probability distributions obey a similar relation:

   F(x, y) = F₁(x) F₂(y)   (6.34)

where F₁(x), F₂(y) are the cumulative probability distributions corresponding to the single-variable probability distributions f₁(x) and f₂(y). By analogy to what we had done in the case of single-variable probability distributions, we define the expectation value of a function of the two variables g(x, y) as:

   E(g(X, Y)) = ∫∫_{−∞}^{+∞} g(x, y) f(x, y) dx dy   (6.35)

Using this definition, we can derive the following relations:

   E(X + Y) = E(X) + E(Y)   (6.36)

which holds for any pair of random variables, and

   E(XY) = E(X) E(Y)   (6.39)

which holds for independent variables only. We can express the mean and variance of each of the random variables as:

   µ_x = E(X),  σ_x² = E((X − µ_x)²)   (6.37)
   µ_y = E(Y),  σ_y² = E((Y − µ_y)²)   (6.38)

A consequence of these relations is:

   σ_x² = E(X²) − µ_x² = E(X²) − [E(X)]²   (6.40)
   σ_y² = E(Y²) − µ_y² = E(Y²) − [E(Y)]²   (6.41)

We can define a new random variable z which is the sum of the original two random variables:

   z = x + y

For this new random variable, we can calculate the mean value and the variance, using the corresponding values for the original random variables µ_x, µ_y, σ_x, σ_y, as follows:

   µ_z = E(Z) = E(X + Y) = µ_x + µ_y
   σ_z² = E((Z − µ_z)²) = E(Z² − 2Zµ_z + µ_z²) = E(Z²) − µ_z²   (6.42)

But the first term in the last expression can be re-written as:

   E(Z²) = E(X² + 2XY + Y²) = E(X²) + 2E(XY) + E(Y²)

and from the definition of E(Z) we have:

   [E(Z)]² = [E(X) + E(Y)]² = [E(X)]² + [E(Y)]² + 2E(X)E(Y)

Subtracting the second expression from the first and using Eqs. (6.40) and (6.41), we obtain:

   σ_z² = σ_x² + σ_y² + 2σ_xy   (6.43)

where we have defined the covariance σ_xy as:

   σ_xy = E(XY) − E(X)E(Y)   (6.44)

If the two variables are independent, the covariance is equal to zero.

Example 6.6: Consider two yards, which we will label A and B, and a big tree which sits right in the middle of the fence separating the two yards. The leaves from the tree fall randomly in one of the two yards, with probabilities p and q. The values of p and q are not necessarily equal (for example, a slight breeze may lead to p > q), but they always add up to unity: p + q = 1. Let x be the random variable associated with the fate of leaf 1, which may land either in yard A, in which case we assign the value x₀ = 0 to it, or in yard B, in which case we assign the value x₁ = 1.

Similarly, the random variable y describes the fate of leaf 2, depending on where leaf 2 lands, and it can also have values y₀ = 0 or y₁ = 1. The random variables x, y take on the value 0 with probability p each, that is,

   p_x(X = 0) = p,  p_y(Y = 0) = p

and they take on the value 1 with probability q each, that is,

   p_x(X = 1) = q,  p_y(Y = 1) = q

We define a new random variable z as the sum of x and y. From the way we have indexed the possible values of the random variables x, y, we have:

   x_i = i,  y_j = j,  i, j = 0, 1

The values that the random variable z assumes are:

   z₀ = 0 → (X, Y) = (x₀, y₀)
   z₁ = 1 → (X, Y) = (x₁, y₀) or (x₀, y₁)
   z₂ = 2 → (X, Y) = (x₁, y₁)

From the way we have indexed the possible values of z we deduce:

   z_k = x_i + y_{k−i}

As far as the probabilities associated with the three possible values of z are concerned, from the above relation we find that:

   p_z(Z_k) = Σ_i p_x(X_i) p_y(Y_{k−i})

but this expression is exactly what we have defined as the convolution of the two functions p_x(x) and p_y(y), if we think of the subscript of the variables X, Y as the integration variable in the general definition of the convolution. From the definition of the probabilities p_x(X_i) and p_y(Y_j) we then find:

   p_z(0) = p_x(0)p_y(0) = p²
   p_z(1) = p_x(1)p_y(0) + p_x(0)p_y(1) = 2pq
   p_z(2) = p_x(1)p_y(1) = q²

It is easy to check that

   Σ_k p_z(Z_k) = p² + 2pq + q² = (p + q)² = 1

that is, p_z(Z) is normalized to unity, whatever the values of p and q are, as long as they add up to unity; thus, p_z(Z) is a proper probability distribution function.

We next want to calculate the average and variance of p_z(Z) in terms of the averages and variances of p_x(X), p_y(Y), defined as µ_x, µ_y and σ_x, σ_y, respectively. We can use the fact that p_z(Z) is the convolution of the probability distributions p_x(X) and p_y(Y), and the fact that the Fourier transform of the convolution of two functions is the product of the Fourier transforms of the two functions:

   p̃_z(ω) = p̃_x(ω) p̃_y(ω)

From the moments of the probabilities p_x(X), p_y(Y), using the Fourier transforms and their derivatives (which are all evaluated at ω = 0), we have:

   0th moment: p̃_x(0) = 1 = p̃_y(0)
   1st moment: (1/i) p̃′_x(0) = µ_x,  (1/i) p̃′_y(0) = µ_y
   2nd moment: (1/i²) p̃″_x(0) = σ_x² + µ_x²,  (1/i²) p̃″_y(0) = σ_y² + µ_y²

while for the Fourier transform of the probability p_z(Z) we have:

   (1/i) p̃′_z(ω) = (1/i)[p̃′_x(ω) p̃_y(ω) + p̃_x(ω) p̃′_y(ω)]  ⇒  (1/i) p̃′_z(0) = µ_z = µ_x + µ_y

as we would expect for any random variable defined to be the sum of two other random variables. Similarly, for the variance in the variable z we obtain:

   p̃″_z(ω) = p̃″_x(ω) p̃_y(ω) + 2 p̃′_x(ω) p̃′_y(ω) + p̃_x(ω) p̃″_y(ω)  ⇒
   σ_z² + µ_z² = (σ_x² + µ_x²) + 2µ_x µ_y + (σ_y² + µ_y²)  ⇒  σ_z² = σ_x² + σ_y²

as we would expect for two independent variables, which is what the variables x and y are in the present case.

6.3 Conditional probabilities

Consider two sets of events, one denoted by A, the other by B. The events common to both A and B comprise the intersection A ∩ B. The events that can be either in A or in B comprise the union A ∪ B.

If all events belong to a superset S that includes both A and B, and possibly other events that do not belong to either A or B, then the following relations hold for the probabilities P(A), P(B), P(S) associated with the various sets:

   P(S) = 1,  0 ≤ P(A) ≤ 1,  0 ≤ P(B) ≤ 1   (6.45)
   P(A ∪ B) = P(A) + P(B) − P(A ∩ B)   (6.46)

the last relation being justified by the fact that the events that belong to both A and B are counted twice in P(A) + P(B). We define as mutually exclusive the sets of events that have no common elements, that is, A ∩ B = 0; in that case:

   P(A ∪ B) = P(A) + P(B)   (6.47)

We further define the complement of A, denoted by A^c, as the set of all events that do not belong to A, and similarly for the complement of B, denoted by B^c; the complements satisfy the relations:

   A^c = S − A,  A ∪ A^c = S,  P(A^c) = 1 − P(A)   (6.48)

We define the conditional probability P(B|A) as the probability that an event in B will occur under the condition that an event in A has occurred:

   P(B|A) = P(A ∩ B) / P(A)   (6.49)

Based on this definition, we find that the probability of an event which belongs to both A and B to occur, given by P(A ∩ B), is obtained from the conditional probabilities as:

   P(A ∩ B) = P(B|A)P(A) = P(A|B)P(B)   (6.50)

that is, this probability is equal to the probability of an event in B occurring under the condition that an event in A has occurred, times the probability of an event in A occurring (and the same holds with the roles of A and B interchanged). Notice that P(A ∩ B) = P(B ∩ A), while in general P(A|B) ≠ P(B|A).

We call the events in A and B independent if the probability of an event in A occurring has no connection to the probability of an event in B occurring. This is expressed by the relation:

   P(A ∩ B) = P(A)P(B)   (6.51)

In this case the conditional probabilities become:

   P(A|B) = P(A ∩ B)/P(B) = P(A),  P(B|A) = P(B ∩ A)/P(A) = P(B)   (6.52)

which simply says that the probability of an event in A occurring under the condition that an event in B has occurred is the same as the probability of an event in A occurring, since this has no connection to an event in B (and the same holds with the roles of A and B interchanged).

Example 6.7: Consider two fair dice that are thrown simultaneously. We define two events A and B as follows. Event A: the sum of values of the two dice is equal to seven. Event B: only one of the two dice has the value two. We can construct the table of all possible outcomes and identify the events A and B in this table, as shown below (the row of values i = 1-6 represents the outcome of the first die, the column of values j = 1-6 the values of the second die):

   j\i   1     2      3     4     5     6
   1     -     B      -     -     -     A
   2     B     -      B     B     A,B   B
   3     -     B      -     A     -     -
   4     -     B      A     -     -     -
   5     -     A,B    -     -     -     -
   6     A     B      -     -     -     -

The total number of outcomes, 6 × 6 = 36, is the event space S. The boxes labeled with A represent the set of values corresponding to event A, and the boxes labeled with B the set of values corresponding to event B. There are several boxes carrying both labels, and these form the intersection of the two sets, A ∩ B; all the labeled boxes comprise the union of A and B, A ∪ B. From this we can calculate the probabilities:

   P(A) = 6 × (1/36) = 1/6,  P(B) = 10 × (1/36) = 10/36

that is, the probability of event A is the total number of boxes containing the label A, which is 6, divided by the total number of possible outcomes, which is 36; similarly for the probability of event B. We also have:

   P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 6/36 + 10/36 − 2/36 = 14/36
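The probabilities in this example can be confirmed by brute-force enumeration of the 36 outcomes; here is a small Python sketch (ours, for illustration):

```python
# Example 6.7: enumerate all outcomes of two dice and compute the probabilities directly.
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))
A = {o for o in outcomes if sum(o) == 7}          # event A: the sum of values is seven
B = {o for o in outcomes if o.count(2) == 1}      # event B: exactly one die shows a two
P = lambda S: len(S) / len(outcomes)

print(P(A), P(B), P(A | B))       # 6/36, 10/36, 14/36
print(P(A & B) / P(B))            # P(A|B) = 2/10
print(P(A & B) / P(A))            # P(B|A) = 2/6
```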

Indeed, the conditional probabilities are obtained as follows: the probability that event A will occur (sum of values is 7) under the condition that event B has occurred (only one die has value 2) is

   P(A|B) = P(A ∩ B)/P(B) = (2/36)/(10/36) = 2/10

and the probability that event B will occur (only one die has value 2) under the condition that event A has occurred (sum of values is 7) is

   P(B|A) = P(B ∩ A)/P(A) = (2/36)/(6/36) = 2/6

These results make good sense if we check the table of events: out of the 10 boxes containing a label B, that is, all the cases in which only one die has value 2, there are only two boxes that also contain the label A, that is, the sum of values of the two dice is 7; thus the probability that the sum of values is 7, under the condition that only one of the two dice has value 2, is 2/10, which is precisely the conditional probability P(A|B). We can rationalize the value we obtained for the conditional probability P(B|A) by a similar argument. Finally, there are 14 occupied entries in the entire table, two of which carry both labels (representing the intersection A ∩ B); that is, the probability of either event occurring is the probability of event A occurring plus the probability of event B occurring minus the probability of the intersection, which is counted twice.

Example 6.8: Consider two boxes, both containing gold and silver coins, but in different proportions: the first box contains 70% gold and 30% silver coins, and the second box contains the reverse ratio. The total number of coins in each box is the same, and from the outside the boxes are identical in size, shape and total weight. We want to figure out which box contains more gold coins by sampling them: we are allowed to pick coins from each box, see what kind of coins they are, and put them back in the same box. Based on our sampling, we should be able to figure out which box is which with a certain degree of certainty. We will use conditional probabilities to do this. We define the following events. Event A: choose a coin from box 1; A^c then must correspond to choosing a coin from box 2. Event B: choose a gold coin; B^c then must correspond to choosing a silver coin, since these are the only two possibilities as far as the type of coin we can pick is concerned. Since the boxes look identical from the outside, we must assign P(A) = P(A^c) = 0.5.

We next calculate the conditional probabilities, which depend on which box we are choosing the coin from, and which in this case are actually easy to determine:

   P(B|A) = 0.7,  P(B^c|A) = 0.3,  P(B|A^c) = 0.3,  P(B^c|A^c) = 0.7

These are the result of the definitions we have made: the probability of getting a gold coin under the condition that we have chosen the first box is 70%, and so on. In contrast, we cannot easily determine P(B) and P(B^c) directly, since these depend on which box we are choosing from, and we cannot tell this from the outside. We can use the conditional probabilities to calculate the probabilities of the intersections P(A ∩ B), P(A ∩ B^c), P(A^c ∩ B), P(A^c ∩ B^c), using the general formula of Eq. (6.50):

   P(A ∩ B) = P(B|A)P(A) = 0.35,  P(A ∩ B^c) = P(B^c|A)P(A) = 0.15
   P(A^c ∩ B) = P(B|A^c)P(A^c) = 0.15,  P(A^c ∩ B^c) = P(B^c|A^c)P(A^c) = 0.35

We put these intersection probabilities in a table:

            B       B^c
   A        0.35    0.15    0.50
   A^c      0.15    0.35    0.50
            0.50    0.50

The four entries under B, B^c and across A, A^c are the corresponding intersection probabilities P(A ∩ B) = P(B ∩ A), and so on. The extra entries are the column and row sums, which are interesting quantities. The first row sum is P(A ∩ B) + P(A ∩ B^c), that is, the probability of getting either a gold coin (event B) or a silver coin (event B^c) from box 1 (event A), which must be equal to P(A) = 0.5, as indeed it is; similarly for the second row. The first column sum is the probability of choosing a gold coin (event B) from either box 1 (event A) or box 2 (event A^c): it is the marginal probability for event B, which turns out to be P(B) = 0.5; similarly, the second column sum gives the marginal probability P(B^c) = 0.5.

Our next step is to consider a situation where we pick not one but several coins at a time, always putting each coin back in the box after looking at it. We illustrate what can happen if we pick three coins from one box.

We know that the probability of getting a gold coin from box 1 is p = 0.7 and the probability of getting a silver coin is q = 1 − p = 0.3; since each coin is put back, each time we pick a coin these probabilities are the same. If we assume that the box is the one containing 70% gold and 30% silver coins, then we can actually figure out the probabilities of picking three gold coins, two gold and one silver, one gold and two silver, or three silver coins. Using the binomial distribution, the probabilities of getting 0, 1, 2 or 3 gold coins in the three picks, on the condition that we are picking coins from box 1 (event A), are:

   P(B₃|A) = 3!/(3!0!) (0.7)³(0.3)⁰ = 0.343  (3 gold, 0 silver)
   P(B₂|A) = 3!/(2!1!) (0.7)²(0.3)¹ = 0.441  (2 gold, 1 silver)
   P(B₁|A) = 3!/(1!2!) (0.7)¹(0.3)² = 0.189  (1 gold, 2 silver)
   P(B₀|A) = 3!/(0!3!) (0.7)⁰(0.3)³ = 0.027  (0 gold, 3 silver)

where we have now defined the events B_j: get j gold coins (and hence 3 − j silver coins), j = 0, 1, 2, 3, in n = 3 tries. These probabilities add up properly to 1. We can use these conditional probabilities to obtain the intersection probabilities P(A ∩ B_j) by the same procedure as before, through the relation

   P(A ∩ B_j) = P(B_j|A) P(A),  with P(A) = 0.5

and similarly for the intersection probabilities P(A^c ∩ B_j), with P(A^c) = 0.5. We put all these intersection probabilities in a table as before:

            B₀        B₁        B₂        B₃
   A        0.0135    0.0945    0.2205    0.1715    0.50
   A^c      0.1715    0.2205    0.0945    0.0135    0.50
            0.185     0.315     0.315     0.185

The numbers under both entries A or A^c and B_j are the intersection probabilities P(B_j ∩ A) or P(B_j ∩ A^c); the numbers in the last row are the column sums of the P(B_j ∩ A) and P(B_j ∩ A^c) entries; and the last column is the sum of the individual rows of intersection probabilities, which, as expected, add up to 0.5 each.

Notice that the column sums represent the marginal probabilities

   P(B_j) = P(B_j ∩ A) + P(B_j ∩ A^c)

of getting j gold coins, whether we chose box 1 (event A) or box 2 (event A^c); that is, the numbers in the bottom row of the above table are the marginal probabilities P(B₀), P(B₁), P(B₂), P(B₃), in this order.

Having determined the intersection and marginal probabilities, we can use them to calculate another set of conditional probabilities, which are actually more interesting. Specifically, if in our three tries we get three gold coins, we can calculate the probability that we have chosen box 1 under the condition that we drew three gold coins in our three trials; this is given by the expression:

   P(A|B₃) = P(B₃ ∩ A)/P(B₃) = 0.1715/0.185 = 0.927

In other words, if in our three tries we get three gold coins, then the probability that we have found the box with the most gold coins (box 1, or event A) is 92.7%. This is indeed very useful information, if our goal is to identify the box with the most gold coins by picking coins from one of the two identical-looking boxes! Conversely, if we had obtained three silver coins (and hence zero gold coins) in our three picks, the probability that we had found box 1 would be:

   P(A|B₀) = P(B₀ ∩ A)/P(B₀) = 0.0135/0.185 = 0.073

a rather small value, equal to 1 − P(A|B₃).

Let us also calculate the average number of gold coins in the n = 3 picks. It is given by

   µ = Σ_{j=0}^{n} j P(B_j) = 0×0.185 + 1×0.315 + 2×0.315 + 3×0.185 = 1.5  ⇒  µ/n = 1.5/3 = 0.5

We note that, because of the statement of the problem and the fact that the ratio of gold and silver coins in box 1 is the reverse of that in box 2, the average number scaled by the number of picks will not change if we change n, and will remain equal to 0.5.

What would be the right strategy for finding the box with the most gold coins? A logical strategy would be to choose, as the box with the most gold coins, the one from which we get a larger than average number of gold coins in three picks. Since the average is 1.5, getting more than average gold coins means getting two or three gold coins. We should then compare the probability of getting two or three gold coins to the probability of getting zero or one gold coin.

We note that the events of getting i and j gold coins are mutually exclusive for i ≠ j; therefore:

   P(B_i ∪ B_j) = P(B_i) + P(B_j),  i ≠ j

and the same relation holds for the intersection probabilities:

   P((B_i ∪ B_j) ∩ A) = P(B_i ∩ A) + P(B_j ∩ A),  i ≠ j
   P((B_i ∪ B_j) ∩ A^c) = P(B_i ∩ A^c) + P(B_j ∩ A^c),  i ≠ j

From the table of intersection probabilities for the individual B_j events, we can then calculate the intersection probabilities P((B₂ ∪ B₃) ∩ A), P((B₀ ∪ B₁) ∩ A), P((B₂ ∪ B₃) ∩ A^c) and P((B₀ ∪ B₁) ∩ A^c):

              B₀ ∪ B₁   B₂ ∪ B₃
   A          0.108     0.392     0.50
   A^c        0.392     0.108     0.50
              0.50      0.50

From these entries, we can next obtain the conditional probabilities of getting more than average gold coins (B₂ ∪ B₃) or fewer than average gold coins (B₀ ∪ B₁), on the condition that we are picking from box 1 (event A) or box 2 (event A^c):

   P((B₂ ∪ B₃)|A) = 0.784,  P((B₀ ∪ B₁)|A) = 0.216
   P((B₂ ∪ B₃)|A^c) = 0.216,  P((B₀ ∪ B₁)|A^c) = 0.784

These results tell us that if we get more than average gold coins, there is a 78.4% probability that we are picking from the box with the larger percentage of gold coins, but there is also a 21.6% probability that we are picking from box 2, the one with the smaller percentage of gold coins. Assuming that our strategy is to keep the box as the right choice if we got more than the average number of gold coins, and to reject it if we got fewer, then we would be making the correct choice with probability 78.4% and the wrong choice with probability 21.6%: even though we got more than average gold coins in our three picks, there is a non-negligible 21.6% probability that we have the wrong box. Similarly, if we had gotten fewer than average gold coins, there is a 78.4% probability that we are right to reject the box, but also a 21.6% probability that we have rejected the box that actually contains the larger percentage of gold coins.
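The numbers quoted for this three-pick strategy follow from the binomial conditional probabilities and the intersection and marginal probabilities above; a short Python sketch (ours, with illustrative variable names) reproduces them:

```python
# Example 6.8 with n = 3 picks: conditional, intersection and marginal probabilities.
from math import comb

def binom(j, n, p):
    return comb(n, j) * p**j * (1 - p)**(n - j)

n, PA = 3, 0.5                                          # P(A) = P(Ac) = 0.5
PBj_A  = [binom(j, n, 0.7) for j in range(n + 1)]       # 0.027, 0.189, 0.441, 0.343
PBj_Ac = [binom(j, n, 0.3) for j in range(n + 1)]       # 0.343, 0.441, 0.189, 0.027
PBj = [PA * a + PA * b for a, b in zip(PBj_A, PBj_Ac)]  # marginals: 0.185, 0.315, 0.315, 0.185

print(PA * PBj_A[3] / PBj[3])        # P(A|B3) ~ 0.927
print(PBj_A[2] + PBj_A[3])           # P((B2 u B3)|A) = 0.784
print(PBj_A[0] + PBj_A[1])           # P((B0 u B1)|A) = 0.216
```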

Figure 6.2: The probabilities of intersections, P(B_j ∩ A) and P(B_j ∩ A^c) (shown by open symbols), and the marginal probabilities P(B_j) = P(B_j ∩ A) + P(B_j ∩ A^c) (shown by filled symbols), for j = 1, ..., n. Circles correspond to n = 3, squares to n = 10 and rhombuses to n = 30. In order to display all values in the same range, the horizontal axis is j/n, that is, j is scaled by n. The probabilities for n = 10 and n = 30 have been shifted up for clarity (by 0.4 and 0.2, respectively).

It is evident that if we increase the number of samplings n, the odds of finding the right box will increase, but there will always be a non-zero chance of making the wrong choice. Essentially, with n increasing, the distribution of probabilities for each case evolves into a gaussian, and for large enough n the gaussians corresponding to A and A^c will have very little overlap. This is shown in Fig. 6.2.

We next pose the problem in a more general, and challenging, manner. Suppose that we wanted to be assured of a desired degree of certitude that we picked the right box, that is, the one that has a majority of gold coins. How many trials should we make to be sure we have the right box with the desired degree of certitude? If the number of trials n is large enough, we know that we will have gaussian distributions. If we are picking from box 1, then:

   A: µ₁ = np₁,  p₁ = 0.7 ⇒ µ₁ = 0.7 n

If we are picking from box 2, then:

   A^c: µ₂ = np₂,  p₂ = 0.3 ⇒ µ₂ = 0.3 n

Since the two boxes look identical from the outside, we may be picking coins from either box, and therefore the normalization of each gaussian representing the probabilities of getting any combination of gold and silver coins must be equal to P(A) = P(A^c) = 0.5. For both curves the variance is given by

   σ₁² = np₁q₁,  σ₂² = np₂q₂  ⇒  σ₁/n = σ₂/n = √(0.21)/√n

The first curve will be peaked at µ₁/n = 0.7, the second at µ₂/n = 0.3, and for large enough n the overlap between them will be small. The tails of the gaussians have an interesting meaning (in both cases the random variable represents the number of gold coins divided by n):

- the tail of the first curve, for values of the random variable less than 0.5, corresponds to situations where we would get less than half gold coins by picking from box 1;

- the tail of the second curve, for values of the random variable greater than 0.5, corresponds to situations where we would get more than half gold coins by picking from box 2.

Assuming that the strategy is to keep the box if we get more than half gold coins and to reject it if we get less than half, we could be making a mistake if we hit the values corresponding to either tail. The first type of mistake would be a false rejection, because we hit an event in the tail of the first curve; the second type of mistake would be a false acceptance, if we hit an event that corresponds to the tail of the second curve. Suppose that we want to make sure that either type of mistake is avoided with a certain degree of confidence. The only freedom we have is to make more picks n. To obtain the total probability of a mistake we have to sum up all the probabilities that correspond to the tails of the two gaussian curves above and below the average number of gold coins. In the present example, the two gaussians are the same as far as variance and normalization are concerned, and their positions with respect to the average are symmetrical, so the contribution of each tail is the same: we can simply calculate the contribution of one tail and multiply it by 2, which effectively turns one of the gaussians (each normalized to 0.5) into a properly normalized one.

How big should n be to guarantee that the probability of making either type of mistake is less than 1%? To answer this question, we want to calculate the gaussian integral:

   Φ(w) = (1/√(2π)) ∫_{−∞}^{w} e^{−y²/2} dy = 0.01  ⇒  w = −2.326

The tabulated values of Φ(w) give w = −2.326 for this case: the integral of the tail of the normalized gaussian from −∞ to w gives a 1% contribution. The value of w, which denotes the cutoff point of the tail, is of course expressed in units of the variance (defined to be 1 for the integrand of Φ(w)) away from the mean (defined to be 0 for the integrand of Φ(w)). All that remains is to turn this result into an actual number of picks n. To this end, we need to express everything in terms of the scaled variables

   µ̃₁ = µ₁/n = 0.7,  σ̃₁ = σ₁/n = √(0.21)/√n

and express the value of the average (scaled by n) in terms of the mean and variance of the gaussian:

   0.5 = µ̃₁ − 2.326 σ̃₁

Plugging into this expression the values of µ̃₁ and σ̃₁ from above, and solving for n, we find n = 28.4. If we take the value of n to be the nearest integer larger than this non-integer calculated value, then we are guaranteed that the overall error of either type in choosing the right box will not exceed 1%.

We can generalize the results of the example above to the case where we have a choice of two possible events, A and A^c, one of which is the desired one, and we try to figure out which one is the right choice by sampling n times the possible outcomes. The two choices need not be symmetrical, as was the case in the example above. For large n the distributions will be gaussian. The two choices will have different means µ₁, µ₂ and different variances σ₁, σ₂, and the normalizations N₁ and N₂ will not necessarily each be equal to 0.5, although their sum must be 1. The overall mean value of a random variable sampling the two distributions will be given by:

   µ = µ₁N₁ + µ₂N₂  ⇒  µ/n = (µ₁/n)N₁ + (µ₂/n)N₂  ⇒  µ̃ = µ̃₁N₁ + µ̃₂(1 − N₁)

We will assume that the values of µ̃₁, µ̃₂ and N₁, N₂ are such that:

   µ̃₁ < µ̃ < µ̃₂

If the value of µ̃ is used as the criterion for choosing one of the two possible options, then the tails representing the two types of errors extend from µ̃ → +∞ for the first distribution and from −∞ → µ̃ for the second.
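The arithmetic that leads to n = 28.4 can be written out explicitly; the following Python sketch simply solves the condition 0.5 = µ̃₁ − 2.326 σ̃₁ for n (an illustration of the step above, not code from the text):

```python
# Solve 0.5 = 0.7 - 2.326 * sqrt(0.21 / n) for the number of picks n.
from math import ceil

p1, q1 = 0.7, 0.3
w = 2.326                                   # tabulated value: Phi(-2.326) = 0.01
n = p1 * q1 * (w / (p1 - 0.5))**2           # rearranged condition
print(n, ceil(n))                           # ~28.4, so n = 29 picks suffice
```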

These must be expressed in terms of the mean and standard deviation of each distribution (the latter containing a factor of 1/√n):

   µ̃ = µ̃₁ + λ₁σ̃₁,  µ̃ = µ̃₂ − λ₂σ̃₂,  λ₁, λ₂ > 0

Then the total probability of a wrong call is given by the sum of the two tails:

   ∫_{−∞}^{−λ₂} e^{−t²/2} dt + ∫_{λ₁}^{∞} e^{−t²/2} dt

This will be related to the tolerable value of error, from which the number of trials n can be determined. In order to proceed, more information is needed about the various quantities that enter in the above expression, that is, µ̃₁, µ̃₂, σ̃₁, σ̃₂ and N₁, N₂, as was the case for Example 6.8.

Example 6.9: A television game played between a host and a player consists of finding a prize which is hidden behind one of three doors. The rules of the game are:
1. The player chooses one of the three doors without opening it.
2. The host opens one of the two remaining doors, which does not have the prize behind it.
3. The player has a chance to change her choice of door or stick with the original choice.
What is the right strategy for the player to maximize her chances of finding the prize: switch choice of doors or stay with the original choice?

Answer: We label the door that the player chooses initially as door 1 and the other two doors as 2 and 3. We have to calculate probabilities from the player's perspective, taking into account only the information that the player has. We define the following events:

   A_i: the prize is behind door i = 1, 2, 3
   B_i: the host opens door i = 1, 2, 3

From the player's perspective, the prize can be behind any one of the three doors, hence P(A₁) = P(A₂) = P(A₃) = 1/3. Also, the host will not open door 1, since the player picked it first, but can open door 2 or 3 with equal probability as far as the player is concerned; hence P(B₁) = 0, P(B₂) = P(B₃) = 1/2. We can also calculate the following conditional probabilities:

   if the prize is behind door 1: P(B₂|A₁) = 1/2, P(B₃|A₁) = 1/2

since the host is equally likely (from the player's perspective) to open door 2 or 3;

   if the prize is behind door 2: P(B₂|A₂) = 0, P(B₃|A₂) = 1
   if the prize is behind door 3: P(B₂|A₃) = 1, P(B₃|A₃) = 0

since the host will not open the door with the prize. With this information and the marginal probabilities for A_i and B_i, we can now calculate the table of intersection probabilities:

          B₁    B₂     B₃
   A₁     0     1/6    1/6    1/3
   A₂     0     0      1/3    1/3
   A₃     0     1/3    0      1/3
          0     1/2    1/2

The entries that are under both B₁, B₂, B₃ and across A₁, A₂, A₃ are the intersection probabilities; the entries in the last column are the marginal probabilities P(A_i) and the entries in the last row are the marginal probabilities P(B_i). From this table we can then construct the following conditional probabilities, which are what the player actually wants to know:

   P(A₁|B₂) = P(A₁ ∩ B₂)/P(B₂) = 1/3,  P(A₃|B₂) = P(A₃ ∩ B₂)/P(B₂) = 2/3
   P(A₁|B₃) = P(A₁ ∩ B₃)/P(B₃) = 1/3,  P(A₂|B₃) = P(A₂ ∩ B₃)/P(B₃) = 2/3

Thus, the probability that the prize is behind door 1 (the original choice), under the condition that the host opens door 2 or 3, is 1/3 in each case, but the probability that the prize is behind door 2 if the host opens door 3, or behind door 3 if the host opens door 2, is 2/3 in each case. Therefore, the player should always switch choice of doors to maximize her chances.

6.4 Random numbers

It is often useful to be able to generate a sequence of random numbers on the computer, in order to apply it to various problems related to probabilities.
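The conclusion of Example 6.9 is easy to confirm with a direct simulation; below is a Python sketch (ours, for illustration) of the game as described by the three rules:

```python
# Simulate the three-door game: switching wins about 2/3 of the time.
import random

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        prize = random.randint(1, 3)
        choice = 1                                                   # door picked first
        opened = random.choice([d for d in (2, 3) if d != prize])    # host avoids the prize
        if switch:
            choice = next(d for d in (1, 2, 3) if d not in (choice, opened))
        wins += (choice == prize)
    return wins / trials

print(play(switch=False))   # ~1/3: staying with door 1
print(play(switch=True))    # ~2/3: switching doors
```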

However, since by definition these numbers are generated by some deterministic algorithm, they cannot be truly random; such numbers are "pseudo-random" numbers. If a sequence of pseudo-random numbers can adequately approximate the properties of a truly random sequence of numbers, it can be very useful. A standard algorithm for generating random numbers is the modulo algorithm:

   R_i = mod[(A R_{i−1} + B), M],  r_i = R_i/M,  i = 1, 2, ..., N,  N ≪ M   (6.53)

where r_i are the random numbers and A, B, M, R₀ are positive integers; R₀ is usually referred to as the "seed". The integers R_i lie in the range [0, M−1], so the r_i numbers lie in [0, 1). A different choice of seed, with the same values for the rest of the parameters, generates a different sequence of random numbers. Note that with this algorithm we can generate a number N of pseudo-random numbers only with N ≪ M. The reason for this restriction is that the numbers generated by the algorithm repeat after a period M, and if we allow N to reach M the numbers we generate do not have any semblance of randomness anymore! In particular, M should be very large, and a careful choice of the variables A, B, M, R₀ is important to avoid a short period.

For example, with M = 4, A = 5, B = 3, R₀ = 1, the first five values generated are R₁ = 0, R₂ = 3, R₃ = 2, R₄ = 1, R₅ = 0, and then the numbers keep repeating (R₅ = R₁, R₆ = R₂, and so on), because at R₄ we have hit the same value that R₀ had. These numbers appear to occur in a "random" sequence in the first period, until the range of possible outcomes is exhausted, and then the sequence starts over; this is to be expected, since M = 4 and all the possible outcomes of the modulo operation are 0, 1, 2, 3.

As this simple example shows, the sequence of pseudo-random numbers generated by the modulo algorithm always involves correlations for any finite value of M. Thus, if we use D such numbers to define a random variable in a D-dimensional space (D = 2 for a plane, D = 3 for a cube, and so on), these points do not fill the D-dimensional space uniformly but lie on (D−1)-dimensional hyper-planes, and there are at most M^{1/D} such hyper-planes (there can be fewer such hyper-planes, meaning poorer quality pseudo-random numbers, if the values of the parameters are not well chosen).
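The small numerical example above is reproduced by the following Python sketch of the modulo (linear congruential) algorithm; the function name is ours:

```python
# Modulo algorithm of Eq. (6.53): R_i = (A*R_{i-1} + B) mod M, r_i = R_i / M.
def lcg(A, B, M, seed, count):
    R, out = seed, []
    for _ in range(count):
        R = (A * R + B) % M
        out.append(R)
    return out

print(lcg(5, 3, 4, 1, 8))   # [0, 3, 2, 1, 0, 3, 2, 1]: the sequence repeats with period M = 4
```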

One way to improve the randomness of pseudo-random numbers is to create a shuffling algorithm. This works as follows: we first create an array of N random numbers r_j, j = 1, ..., N, with N large, using the modulo algorithm. Next we choose at random a value j between 1 and N and use the random number r_j, but at the same time we take it out of the array and replace it by a new random number. This continuous shuffling of the random numbers reduces the correlations between them. Elaborate algorithms have been produced to make this very efficient; they are known as Random Number Generators (RNG's).

6.4.1 Arbitrary probability distributions

An important consideration is to be able to generate random numbers of any desired probability distribution. This can be achieved by requiring that the two distributions p(x) and p(y) are related by

   |p(x)dx| = |p(y)dy|  ⇒  p(y) = p(x) |dx/dy|   (6.54)

For example, we may start with a uniform probability distribution of random numbers x in the range [0, 1], that is, p(x) = 1, and derive from it random numbers y that have an exponential probability distribution, that is, random numbers y in the range [0, ∞] with probability distribution p(y) = e^{−y}: the probability of finding the value y is proportional to e^{−y}. To achieve this, we would then need:

   e^{−y} = dx/dy  ⇒  x = e^{−y}  ⇒  y(x) = −ln(x)

that is, the new random numbers y are given in terms of the original random numbers x (which have uniform probability distribution) through the relation y(x) = −ln(x). The scheme we described earlier, using the modulo algorithm, generates random numbers x with uniform probability distribution p(x) = 1 in the range [0, 1]; taking the natural logarithm of these numbers (with a minus sign, which makes sure that the new numbers are positive) generates random numbers y in the range [0, ∞] with the desired probability distribution p(y) = e^{−y}.

Another important distribution, which we discussed in detail in this chapter, is the gaussian distribution. This can be generated from random numbers of uniform distribution as described next.
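As an illustration of this inverse-transform recipe, the short Python sketch below (not from the text) turns uniform random numbers into exponentially distributed ones and checks two simple properties:

```python
# Exponentially distributed random numbers via y = -ln(x), with x uniform in (0, 1].
import math, random

N = 100_000
ys = [-math.log(1.0 - random.random()) for _ in range(N)]   # 1 - random() avoids log(0)

print(sum(ys) / N)                  # ~1.0, the mean of p(y) = e^{-y}
print(sum(y > 2 for y in ys) / N)   # ~e^{-2} ~ 0.135, the probability of y > 2
```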

Starting with two random numbers x₁, x₂ of uniform distribution, we define the new numbers y₁, y₂ through:

   y₁ = √(−2 ln(x₁)) cos(2πx₂),  y₂ = √(−2 ln(x₁)) sin(2πx₂)   (6.55)
   ⇒  x₁ = e^{−(y₁²+y₂²)/2},  x₂ = (1/2π) tan⁻¹(y₂/y₁)

and we use the general formula of Eq. (6.54) to relate the two distributions:

   p(y₁, y₂) dy₁ dy₂ = p(x₁, x₂) |∂(x₁, x₂)/∂(y₁, y₂)| dy₁ dy₂

where the quantity inside the absolute value is the Jacobian:

   ∂(x₁, x₂)/∂(y₁, y₂) = (∂x₁/∂y₁)(∂x₂/∂y₂) − (∂x₁/∂y₂)(∂x₂/∂y₁)

Using the relations between x₁, x₂ and y₁, y₂ from above, we can calculate the Jacobian to obtain:

   p(y₁, y₂) = (1/√(2π)) e^{−y₁²/2} (1/√(2π)) e^{−y₂²/2}   (6.56)

which is actually two independent gaussian probabilities in the variables y₁, y₂. This is useful, because no computation is wasted in producing two (rather than one) uniform random numbers x₁, x₂: they yield two gaussian-distributed numbers y₁, y₂.

By shifting and scaling the random numbers generated by a RNG we can arrange their values to span the interval [−1, 1]:

   r′ = 2(r − 0.5)   (6.57)

where r is the original set of random values generated by the RNG with uniform distribution in [0, 1] and r′ is a random number with uniform distribution in the interval [−1, 1].

6.4.2 Monte Carlo integration

We can use random numbers to do integration and other calculations that may be otherwise very difficult to perform, by sampling efficiently the relevant space of possible outcomes.
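A minimal Python sketch of this construction (often called the Box-Muller method; the code below is ours) generates pairs of gaussian random numbers and checks their mean and variance:

```python
# Gaussian random numbers from two uniform random numbers, Eq. (6.55).
import math, random

def gaussian_pair():
    x1, x2 = 1.0 - random.random(), random.random()      # x1 in (0, 1] avoids log(0)
    r = math.sqrt(-2.0 * math.log(x1))
    return r * math.cos(2.0 * math.pi * x2), r * math.sin(2.0 * math.pi * x2)

samples = [y for _ in range(50_000) for y in gaussian_pair()]
m = sum(samples) / len(samples)
var = sum((y - m)**2 for y in samples) / len(samples)
print(round(m, 3), round(var, 3))    # ~0.0 and ~1.0: zero mean, unit variance
```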

For example, we can calculate the value of π by generating random numbers in pairs, representing the (x, y) coordinates of points in the plane. By shifting and scaling, as in Eq. (6.57), we can arrange these values to span the interval [−1, 1]; if we keep generating such pairs of points, we eventually fill a square of side 2 and total area 2 × 2 = 4. If we keep track of how many of these points are inside a circle of radius 1, that is, satisfy x² + y² < 1, and compare this number to the total number of generated points, we can calculate the value of π: the number of points inside the circle of radius 1 compared to the number of points inside the square of side 2 must give the same ratio as the respective areas, that is, π/4. Alternatively, we can use the random values in the interval [0, 1] in pairs as the (x, y) points, which fill the square of side 1, and check whether x² + y² < 1, which gives us the number of points within one quadrant of the circle of radius 1; the area of this quadrant is π/4. Comparing again the number of points within the quadrant to the total number of points generated, we can calculate the value of π. In Fig. 6.3 we show the distribution of random points for N = 10³ for this calculation.

Figure 6.3: Example of Monte Carlo evaluation of π. Filled dots are random points within the range of the circle, x² + y² < 1, and open circles are outside this region.

We can use the same concept to calculate integrals. For example, we calculate the integral

   I₁ = ∫₀¹ x² dx = [x³/3]₀¹ = 0.333333   (6.58)

by generating random points in the range [0, 1], which are the values of the variable x, and for each point comparing the value of f(x) to another random point, y. If y < f(x), then this point lies under the curve f(x). Since the integral represents the area under the curve, we can compare the number of points that satisfy the relation y < f(x) to the total number of points generated, and this ratio gives the result for the integral of Eq. (6.58). We must also make sure that we scale the result by the total area which would be covered by the random points. A simple code in Mathematica that does these calculations with N = 10⁶ random points is given in Fig. 6.4. In Fig. 6.5 we show the distribution of random points for N = 10³ for this calculation.
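The book's own code for these two calculations (Fig. 6.4) is written in Mathematica; an equivalent Python sketch (ours) is:

```python
# Monte Carlo estimates of pi (quarter-circle method) and of the integral of x^2 over [0, 1].
import random

N = 1_000_000

inside = 0
for _ in range(N):
    x, y = random.random(), random.random()
    inside += (x * x + y * y < 1.0)        # point falls inside the quarter circle
print(4.0 * inside / N)                    # ~3.1416

under = 0
for _ in range(N):
    x, y = random.random(), random.random()
    under += (y < x * x)                   # point (x, y) lies under the curve f(x) = x^2
print(under / N)                           # ~0.3333 = I_1
```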

As another example, we can use Monte Carlo integration to calculate the integral

   I₂(Q) = ∫_{−Q}^{Q} (1/√(2π)) e^{−x²/2} dx   (6.59)

for finite values of Q. This can also be obtained from the tabulated values of the gaussian integral Φ(Q). In this case we must generate values of the random variable x in the interval [−Q, Q], or, due to the fact that the integrand is an even function of x, we can do the integration in the interval [0, Q] and multiply the result by a factor of 2. In order not to waste computation, we can arrange the total area sampled to be a rectangle of width 2Q (or Q, if we use the half range [0, Q]) and height 1/√(2π) = 0.39894228, which is the highest value of the integrand; we must then scale the result by this total area, which would be covered by the random points. In Fig. 6.6 we show the distribution of random points for N = 10³ for this calculation, for Q = 2.

In Table 6.1 we collect the results of the three Monte Carlo calculations, using different numbers of random points N (from one thousand to one billion).

Table 6.1: Monte Carlo calculation of π and of the integrals I₁ of Eq. (6.58) and I₂(2) of Eq. (6.59), for numbers of random points N = 10³ through 10⁹; ∆ᵢ, i = 0, 1, 2, are the differences of the three estimates from the exact results.

As is seen from this table, the larger the number of points used, the more accurate the result (that is, the closer it is to the exact value), as we would intuitively expect. The Monte Carlo estimate is not an exact result, but the deviation from the exact value can be estimated roughly, assuming that it corresponds to one standard deviation of a gaussian-type distribution. This can be quantified by the following expression, for an integral in the range x ∈ [0, L]:

   (1/L) ∫₀^L f(x) dx = ⟨f⟩ ± (1/√N) [⟨f²⟩ − ⟨f⟩²]^{1/2}   (6.60)

where the expressions in brackets are defined as:

   ⟨f⟩ = (1/N) Σ_{i=1}^{N} f(x_i),  ⟨f²⟩ = (1/N) Σ_{i=1}^{N} [f(x_i)]²   (6.61)

This expression states that the fluctuations, which are proportional to [⟨f²⟩ − ⟨f⟩²]^{1/2}, die out with a prefactor 1/√N.
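A compact Python sketch of this sample-mean estimate with its error bar, applied to the integral I₁ of Eq. (6.58) (the function and names below are ours):

```python
# Monte Carlo integration with the error estimate of Eqs. (6.60)-(6.61).
import random

def mc_integrate(f, L, N):
    xs = [L * random.random() for _ in range(N)]
    favg  = sum(f(x) for x in xs) / N                 # <f>
    f2avg = sum(f(x)**2 for x in xs) / N              # <f^2>
    err = ((f2avg - favg**2) / N) ** 0.5              # fluctuations ~ 1/sqrt(N)
    return L * favg, L * err

for N in (10**3, 10**4, 10**5):
    print(N, mc_integrate(lambda x: x**2, 1.0, N))    # estimate ~1/3, error shrinking with N
```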

Figure 6.4: Examples of Monte Carlo calculations in Mathematica: the integration of f(x) = x² over [0, 1] (output 0.333142) and the evaluation of π (output 3.14136), each with N = 10⁶ random points.

Figure 6.5: Example of Monte Carlo integration for the integral of Eq. (6.58). Filled dots are random points under the curve f(x) = x² and open circles are outside this region.

Figure 6.6: Example of Monte Carlo integration for the integral of Eq. (6.59), for Q = 2. Filled dots are random points under the curve f(x) = e^{−x²/2}/√(2π) and open circles are outside this region.

APPENDIX: GAUSSIAN TABLE

   Φ(w) = (1/√(2π)) ∫_{−∞}^{w} e^{−u²/2} du,  Φ(0) = 0.5,  Φ(−w) = 1 − Φ(w)

The table lists the values of Φ(w) for w from 0.01 to 2.50 in steps of 0.01. For example, Φ(1.00) = 0.841345, Φ(2.00) = 0.977250, Φ(2.33) = 0.990097 and Φ(2.50) = 0.993790.
