# BASIC CALCULUS REFRESHER
Ismor Fischer, Ph.D. Dept. of Statistics UW-Madison

1. Introduction

This is a very condensed and simplified version of basic calculus, which is a prerequisite for many courses in Mathematics, Statistics, Engineering, Pharmacy, etc. It is not comprehensive, and absolutely not intended to be a substitute for a one-year freshman course in differential and integral calculus. You are strongly encouraged to do the included Exercises to reinforce the ideas. Important mathematical terms are in boldface; key formulas and concepts are boxed and highlighted. To view a color .pdf version of this document (recommended), see http://www.stat.wisc.edu/~ifischer.

2. Exponents – Basic Definitions and Properties

For any real number base $x$, we define powers of $x$: $x^0 = 1$, $x^1 = x$, $x^2 = x \cdot x$, $x^3 = x \cdot x \cdot x$, etc. (The exception is $0^0$, which is considered indeterminate.) Powers are also called exponents.

Examples: $5^0 = 1$, $(-11.2)^1 = -11.2$, $(8.6)^2 = 8.6 \times 8.6 = 73.96$, $10^3 = 10 \times 10 \times 10 = 1000$, $(-3)^4 = (-3) \times (-3) \times (-3) \times (-3) = 81$.

Also, we can define fractional exponents in terms of roots, such as $x^{1/2} = \sqrt{x}$, the square root of $x$. Similarly, $x^{1/3} = \sqrt[3]{x}$, the cube root of $x$, $x^{2/3} = (\sqrt[3]{x})^2$, etc. In general, we have $x^{m/n} = (\sqrt[n]{x})^m$, i.e., the $n$th root of $x$, raised to the $m$th power.

Examples: $64^{1/2} = \sqrt{64} = 8$, $64^{3/2} = (\sqrt{64})^3 = 8^3 = 512$, $64^{1/3} = \sqrt[3]{64} = 4$, $64^{2/3} = (\sqrt[3]{64})^2 = 4^2 = 16$.

Finally, we can define negative exponents: $x^{-r} = \frac{1}{x^r}$. Thus, $x^{-1} = \frac{1}{x^1}$, $x^{-2} = \frac{1}{x^2}$, $x^{-1/2} = \frac{1}{x^{1/2}} = \frac{1}{\sqrt{x}}$, etc.

Examples: $10^{-1} = \frac{1}{10^1} = 0.1$, $7^{-2} = \frac{1}{7^2} = \frac{1}{49}$, $36^{-1/2} = \frac{1}{\sqrt{36}} = \frac{1}{6}$, $9^{-5/2} = \frac{1}{(\sqrt{9})^5} = \frac{1}{3^5} = \frac{1}{243}$.
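Properties of Exponents

1. $x^a \cdot x^b = x^{a+b}$
   Examples: $x^3 \cdot x^2 = x^5$, $x^{1/2} \cdot x^{1/3} = x^{5/6}$, $x^3 \cdot x^{-1/2} = x^{5/2}$

2. $\frac{x^a}{x^b} = x^{a-b}$
   Examples: $\frac{x^5}{x^3} = x^2$, $\frac{x^3}{x^5} = x^{-2}$, $\frac{x^3}{x^{1/2}} = x^{5/2}$

3. $(x^a)^b = x^{ab}$
   Examples: $(x^3)^2 = x^6$, $(x^{-1/2})^7 = x^{-7/2}$, $(x^{2/3})^{5/7} = x^{10/21}$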
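The definitions and properties above can be spot-checked numerically. The following is a minimal sketch in Python, not part of the original notes; the helper name `frac_power` and the sample values are illustrative assumptions, and floating-point results are only approximately equal to the exact values.

```python
import math

def frac_power(x, m, n):
    """Hypothetical helper: the n-th root of x (x > 0), raised to the m-th power."""
    root = x ** (1.0 / n)
    return root ** m

# Fractional and negative exponents from the examples above.
print(frac_power(64, 3, 2))   # close to 512
print(frac_power(64, 2, 3))   # close to 16
print(9 ** (-5 / 2), 1 / 243) # the two values should agree closely

# Properties 1-3, checked at an arbitrary positive base.
x, a, b = 2.5, 3.0, -0.5
assert math.isclose(x**a * x**b, x**(a + b))
assert math.isclose(x**a / x**b, x**(a - b))
assert math.isclose((x**a)**b, x**(a * b))
```

Comparisons use `math.isclose` rather than `==` because roots such as `64 ** (1/3)` are not computed exactly in floating point.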

3. Functions and Their Graphs

[Figure: Input x → f → Output y]

If a quantity $y$ always depends on another quantity $x$ in such a way that every value of $x$ corresponds to one and only one value of $y$, then we say that “$y$ is a function of $x$,” written $y = f(x)$; $x$ is said to be the independent variable, $y$ is the dependent variable. (Example: “Distance traveled per hour ($y$) is a function of velocity ($x$).”)

For a given function $y = f(x)$, the set of all ordered pairs of $(x, y)$-values that algebraically satisfy its equation is called the graph of the function, and can be represented geometrically by a collection of points in the XY-plane. (Recall that the XY-plane consists of two perpendicular copies of the real number line – a horizontal X-axis, and a vertical Y-axis – that intersect at a reference point $(0, 0)$ called the origin, and which partition the plane into four disjoint regions called quadrants. Every point $P$ in the plane can be represented by the ordered pair $(x, y)$, where the first value is the x-coordinate – indicating its horizontal position relative to the origin – and the second value is the y-coordinate – indicating its vertical position relative to the origin. Thus, the point $P(4, 7)$ is 4 units to the right of, and 7 units up from, the origin.)

[Figure: Descartes, ~1640]

Examples: $y = f(x) = 7$; $y = f(x) = 2x + 3$; $y = f(x) = x^2$; $y = f(x) = x^{1/2}$; $y = f(x) = x^{-1}$; $y = f(x) = 2^x$.

The first three are examples of polynomial functions. (In particular, the first is constant, the second is linear, the third is quadratic.) The last is an exponential function; note that $x$ is an exponent! Let’s consider these examples, one at a time.

• $y = f(x) = 7$: If $x$ = any value, then $y = 7$. That is, no matter what value of $x$ is chosen, the value of the height $y$ remains at a constant level of 7. Therefore, all points that satisfy this equation must have the form $(x, 7)$, and thus determine the graph of a horizontal line, 7 units up. A few typical points are plotted in the figure.

[Figure: the horizontal line y = 7, with plotted points (–1.8, 7), (0, 7), (2.5, 7)]

Exercise: What would the graph of the equation $y = -4$ look like? $x = -4$? $y = 0$? $x = 0$?

• $y = f(x) = 2x + 3$: If $x = 0$, then $y = f(0) = 2(0) + 3 = 3$, so the point $(0, 3)$ is on the graph of this function. Likewise, if $x = -1.5$, then $y = f(-1.5) = 2(-1.5) + 3 = 0$, so the point $(-1.5, 0)$ is also on the graph of this function. (However, many points, such as $(1, 1)$, do not satisfy the equation, and so do not lie on the graph.) The set of all points $(x, y)$ that do satisfy this linear equation forms the graph of a line in the XY-plane, hence the name.

[Figure: the line y = 2x + 3, with plotted points (0, 3) and (−1.5, 0)]

Exercise: What would the graph of the line $y = x$ look like? $y = -x$? The absolute value function $y = |x|$?

Notice that the line has the generic equation $y = f(x) = mx + b$, where $b$ is the Y-intercept (in this example, $b = +3$), and $m$ is the slope of the line (in this example, $m = +2$). In general, the slope of any line is defined as the ratio of “height change” $\Delta y$ to “length change” $\Delta x$, that is,

$$m = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$$

for any two points $(x_1, y_1)$ and $(x_2, y_2)$ that lie on the line. For example, for the two points $(0, 3)$ and $(-1.5, 0)$ on our line, the slope is $m = \frac{\Delta y}{\Delta x} = \frac{0 - 3}{-1.5 - 0} = 2$, which confirms our observation.

• $y = f(x) = x^2$: This is not the equation of a straight line (because of the “squaring” operation). The set of all points that satisfies this quadratic equation – e.g., $(-3, 9)$, $(-2, 4)$, $(-1, 1)$, $(0, 0)$, $(1, 1)$, $(2, 4)$, $(3, 9)$, etc. – forms a curved parabola in the XY-plane. (In this case, the curve is said to be concave up, i.e., it “holds water.” Similarly, the graph of $-x^2$ is concave down; it “spills water.”)

Exercise: How does this graph differ from $y = f(x) = x^3$? $x^4$? Find a pattern for $y = x^n$, for $n = 1, 2, 3, 4, \ldots$

• $y = f(x) = x^{1/2} = \sqrt{x}$: The “square root” operation is not defined for negative values of $x$ (e.g., $\sqrt{-64}$ does not exist as a real number, since both $(+8)^2 = +64$ and $(-8)^2 = +64$). Hence the real-valued domain of this function is restricted to $x \ge 0$ (i.e., positive values and zero), where the “square root” operation is defined (e.g., $\sqrt{+64} = +8$). Pictured here is its graph, along with the first-quadrant portion of $y = x^2$ for comparison.

[Figure: graphs of y = x² and y = √x in the first quadrant]

Exercise: How does this graph differ from $y = f(x) = x^{1/3} = \sqrt[3]{x}$? Hint: What is its domain? (E.g., $\sqrt[3]{+64}$ = ??, $\sqrt[3]{-64}$ = ???) Graph this function together with $y = x^3$.

The “square” operation $x^2$ and “square root” operation $x^{1/2} = \sqrt{x}$ are examples of inverse functions of one another, for $x \ge 0$. That is, the effect of applying either one, followed immediately by the other, lands you back where you started from. More precisely, starting with a domain value $x$, the composition of the function $f(x)$ with its inverse – written $f^{-1}(x)$ – is equal to the initial value $x$ itself. As in the figure above, when graphed together, the two functions exhibit symmetry across the “diagonal” line $y = x$ (not explicitly drawn).
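As an illustration only, the slope formula and the inverse relationship between squaring and taking square roots can be checked with a few lines of Python. The function names `slope` and `square` are assumptions introduced for this sketch, not part of the original notes.

```python
import math

def slope(p, q):
    """Slope m = (y2 - y1) / (x2 - x1) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)

def square(x):
    return x * x

# The two points found on the line y = 2x + 3.
print(slope((0, 3), (-1.5, 0)))

# For x >= 0, applying one operation and then the other returns x
# (up to floating-point rounding).
for x in (0.0, 2.0, 7.5, 64.0):
    assert math.isclose(math.sqrt(square(x)), x)
    assert math.isclose(square(math.sqrt(x)), x)
```

Any other pair of points on the same line should give the same slope, which is what makes $m$ a property of the line rather than of the points chosen.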

• $y = f(x) = x^{-1} = \frac{1}{x}$: This is a bit more delicate. Let’s first restrict our attention to positive domain x-values, i.e., $x > 0$. If $x = 1$, then $y = f(1) = \frac{1}{1} = 1$, so the point $(1, 1)$ lies on the graph of this function. Now from here, as $x$ grows larger (e.g., $x = 10, 100, 1000$, etc.), the values of the height $y = \frac{1}{x}$ (e.g., $\frac{1}{10} = 0.1$, $\frac{1}{100} = 0.01$, $\frac{1}{1000} = 0.001$, etc.) become smaller, although they never actually reach 0. Therefore, as we continue to move to the right, the graph approaches the X-axis as a horizontal asymptote, without ever actually touching it.

Moreover, as $x$ gets smaller from the point $(1, 1)$ on the graph (e.g., $x = 0.1, 0.01, 0.001$, etc.), the values of the height $y = \frac{1}{x}$ (e.g., $\frac{1}{0.1} = 10$, $\frac{1}{0.01} = 100$, $\frac{1}{0.001} = 1000$, etc.) become larger. Therefore, as we continue to move to the left, the graph shoots upwards, approaching the Y-axis as a vertical asymptote, without ever actually touching it. (If $x = 0$, then $y$ becomes infinite ($+\infty$), which is undefined as a real number, so $x = 0$ is not in the domain of this function.) A similar situation exists for negative domain x-values, i.e., $x < 0$. This is the graph of a hyperbola, which has two symmetric branches, one in the first quadrant and the other in the third.

[Figure: the hyperbola y = 1/x, with the point (1, 1) marked]

Exercise: How does this graph differ from that of $y = f(x) = x^{-2} = \frac{1}{x^2}$? Why?

NOTE: The preceding examples are special cases of power functions, which have the general form $y = x^p$, for any real value of $p$, for $x > 0$. If $p > 0$, then the graph starts at the origin and continues to rise to infinity. (In particular, if $p > 1$, then the graph is concave up, such as the parabola $y = x^2$. If $p = 1$, the graph is the straight line $y = x$. And if $0 < p < 1$, then the graph is concave down, such as the parabola $y = x^{1/2} = \sqrt{x}$.) However, if $p < 0$, such as $y = x^{-1} = \frac{1}{x}$, or $y = x^{-2} = \frac{1}{x^2}$, then the Y-axis acts as a vertical asymptote for the graph, and the X-axis is a horizontal asymptote.

[Figure: graphs of y = x^p for p > 1, p = 1, 0 < p < 1, p = 0, and p < 0]

Exercise: Why is $y = x^x$ not a power function? Sketch its graph for $x > 0$.

Exercise: Sketch the graph of the piecewise-defined functions

$$f(x) = \begin{cases} x^2, & \text{if } x \le 1 \\ x^3, & \text{if } x > 1 \end{cases}$$

(This graph is the parabola $y = x^2$ up to and including the point $(1, 1)$, then picks up with the curve $y = x^3$ after that. Note that this function is therefore continuous at $x = 1$, and hence for all real values of $x$.)

$$g(x) = \begin{cases} x^2, & \text{if } x \le 1 \\ x^3 + 5, & \text{if } x > 1 \end{cases}$$

(This graph is the parabola $y = x^2$ up to and including the point $(1, 1)$, but then abruptly changes over to the curve $y = x^3 + 5$ after that, starting at $(1, 6)$. Therefore, this graph has a break, or “jump discontinuity,” at $x = 1$. (Think of switching a light from off = 0 to on = 1.) However, since it is continuous before and after that value, $g$ is described as being piecewise continuous.)
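The difference between the two piecewise-defined functions can be seen numerically by evaluating each one just to the left and just to the right of $x = 1$. This is a rough sketch, not a proof of continuity; the helper name `jump_at` and the step size `h` are assumptions chosen for illustration.

```python
def f(x):
    # Parabola up to and including x = 1, cubic afterwards.
    return x**2 if x <= 1 else x**3

def g(x):
    # Same parabola, but the cubic branch is shifted up by 5.
    return x**2 if x <= 1 else x**3 + 5

def jump_at(func, x0, h=1e-9):
    """Difference between values just right and just left of x0."""
    return func(x0 + h) - func(x0 - h)

print(jump_at(f, 1.0))  # very close to 0: no break at x = 1
print(jump_at(g, 1.0))  # very close to 5: a jump discontinuity at x = 1
```

A jump estimate near zero is consistent with continuity at that point, while a value that stays near 5 as `h` shrinks reflects the break in the graph of $g$.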

• $y = f(x) = 2^x$: This is not a power function! The graph of this increasing function is an exponential growth curve, which doubles in height $y$ with every “unit increase” (i.e., $+1$) in $x$. (Think of a population of bacteria that doubles its size every hour.) If the exponent is changed from $x$ to $-x$, then the resulting graph represents an exponential decay curve, decreasing by a constant factor of one-half from left to right. (Think of “half-life” of a radioactive isotope, e.g., carbon dating of prehistoric artifacts, fossils, etc.)

NOTE: For general exponential functions $y = f(x) = b^x$, the base $b$ can be any positive constant. The resulting graph increases if $b > 1$ (e.g., $b = 2$ in the previous example), and decreases if $b < 1$ (e.g., $b = 1/2$).

Exercise: How do the graphs of the functions $(1/2)^x$ and $2^{-x}$ compare, and why?

One very important special choice for calculus applications is called $e = 2.71828\ldots$, labeled for Swiss mathematician Leonhard Euler (pronounced “oiler”), 1707-1783. The resulting function, $y = f(x) = e^x$, is sometimes considered as “THE” exponential function, denoted $\exp(x)$. Other exponential functions $y = f(x) = b^x$ can be equivalently expressed as $y = f(x) = e^{ax}$ (via $b = e^a$), with constant $a > 0$ (growth) or $a < 0$ (decay). For example, $2^x$ can be written $e^{0.6931\,x}$, so $a = 0.6931$. But given $b$, how do we get $a$?

The inverse of the exponential function $y = b^x$ is, by definition, the logarithm function $y = \log_b(x)$. That is, for a given base $b$, the “logarithm of $x$” is equal to the exponent to which the base $b$ must be raised, in order to obtain the value $x$. For example, the statements in the following line are equivalent to those in the line below it, respectively:

$10^2 = 100$, $(37.4)^1 = 37.4$, $(9)^{1/2} = \sqrt{9} = 3$, $(98.6)^0 = 1$, $5^{-1} = \frac{1}{5^1} = 0.2$, $2^{-6} = \frac{1}{2^6} = 0.015625$

$\log_{10}(100) = 2$, $\log_{37.4}(37.4) = 1$, $\log_9(3) = 1/2$, $\log_{98.6}(1) = 0$, $\log_5(0.2) = -1$, $\log_2(0.015625) = -6$.

• The choice of base $b = 10$ results in the so-called common logarithm “$\log_{10}(x)$.” But for calculus applications, the preferred base is “Euler’s constant” $e = 2.71828\ldots$, resulting in the natural logarithm “$\log_e(x)$,” usually denoted “$\ln(x)$.” And because it is the inverse of the exponential function $e^x$, it follows that $e^{\ln(x)} = x$ and $\ln(e^x) = x$. (For example, $\ln(2) = 0.6931$ because $e^{0.6931} = 2$.) The graphs of $y = e^x$ and $y = \ln(x)$ show the typical symmetry with respect to the line $y = x$ of inverse functions; see figure. Logarithms satisfy many properties that make them extremely useful for manipulating complex algebraic expressions, among other things…

[Figure: graphs of y = e^x and y = ln(x), symmetric about the line y = x]

4. Limits and Derivatives

We saw above that as the values of $x$ grow ever larger, the values of $\frac{1}{x}$ become ever smaller. We can’t actually reach 0 exactly, but we can “sneak up” on it, forcing $\frac{1}{x}$ to become as close to 0 as we like, simply by making $x$ large enough. (For instance, we can force $\frac{1}{x} < 10^{-500}$ by making $x > 10^{500}$.) In this context, we say that 0 is a limiting value of the $\frac{1}{x}$ values, as $x$ gets arbitrarily large. A mathematically concise way to express this is a “limit statement”:

$$\lim_{x \to +\infty} \frac{1}{x} = 0.$$

[Figure: Newton, Leibniz, ~1680]

Many other limits are possible, but we now wish to consider a special kind. To motivate this, consider again the parabola example $y = f(x) = x^2$. The average rate of change between the two points $P(3, 9)$ and $Q(4, 16)$ on the graph can be calculated as the slope of the secant line connecting them, via the previous formula: $m_{sec} = \frac{\Delta y}{\Delta x} = \frac{16 - 9}{4 - 3} = 7$. Now suppose that we slide to a new point $Q(3.5, 12.25)$ on the graph, closer to $P(3, 9)$. The average rate of change is now $m_{sec} = \frac{\Delta y}{\Delta x} = \frac{12.25 - 9}{3.5 - 3} = 6.5$, the slope of the new secant line between $P$ and $Q$. If we now slide to a new point $Q(3.1, 9.61)$ still closer to $P(3, 9)$, then the new slope is $m_{sec} = \frac{\Delta y}{\Delta x} = \frac{9.61 - 9}{3.1 - 3} = 6.1$, and so on. As $Q$ approaches $P$, the slopes $m_{sec}$ of the secant lines appear to get ever closer to 6 – the slope $m_{tan}$ of the tangent line to the curve $y = x^2$ at the point $P(3, 9)$ – thus measuring the instantaneous rate of change of this function at this point $P(3, 9)$. (The same thing also happens if we approach $P$ from the left side.)

[Figure: the parabola with point P, nearby points Q, the secant lines through P and Q, and the tangent line at P]

We can actually verify this by an explicit computation: From fixed point $P(3, 9)$ to any nearby point $Q(3 + \Delta x, (3 + \Delta x)^2)$ on the graph of $y = x^2$, we have

$$m_{sec} = \frac{\Delta y}{\Delta x} = \frac{(3 + \Delta x)^2 - 9}{\Delta x} = \frac{9 + 6\,\Delta x + (\Delta x)^2 - 9}{\Delta x} = \frac{6\,\Delta x + (\Delta x)^2}{\Delta x} = \frac{\Delta x\,(6 + \Delta x)}{\Delta x} = 6 + \Delta x.$$

(We can check this formula against the $m_{sec}$ values that we already computed: if $\Delta x = 1$, then $m_{sec} = 7$; if $\Delta x = 0.5$, then $m_{sec} = 6.5$; if $\Delta x = 0.1$, then $m_{sec} = 6.1$.) As $Q$ approaches $P$ – i.e., as $\Delta x$ approaches 0 – this quantity $m_{sec} = 6 + \Delta x$ approaches the quantity $m_{tan} = 6$ as its limiting value, confirming what we initially suspected.
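The shrinking secant slopes can be tabulated directly. The following Python sketch is illustrative only: the function name `secant_slope` and the particular step sizes are assumptions, and very small steps eventually suffer from floating-point rounding.

```python
def f(x):
    return x**2

def secant_slope(func, x, dx):
    """Average rate of change of func between x and x + dx."""
    return (func(x + dx) - func(x)) / dx

# Approach P(3, 9) from the right and from the left.
for dx in (1, 0.5, 0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(f"dx = {dx:>7}: m_sec = {secant_slope(f, 3, dx):.6f}")
```

Each printed slope should match $6 + \Delta x$ up to rounding, so the values close in on 6 from both sides as the step shrinks.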

Suppose now we wish to find the instantaneous rate of change of $y = f(x) = x^2$ at some other point $P$ on the graph, say at $P(4, 16)$ or $P(-5, 25)$ or even $P(0, 0)$. We can use the same calculation as we did above: the average rate of change of $y = x^2$ between any two generic points $P(x, x^2)$ and $Q(x + \Delta x, (x + \Delta x)^2)$ on its graph, is given by

$$m_{sec} = \frac{\Delta y}{\Delta x} = \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \frac{x^2 + 2x\,\Delta x + (\Delta x)^2 - x^2}{\Delta x} = \frac{2x\,\Delta x + (\Delta x)^2}{\Delta x} = \frac{\Delta x\,(2x + \Delta x)}{\Delta x} = 2x + \Delta x.$$

As $Q$ approaches $P$ – i.e., as $\Delta x$ approaches 0 – this quantity approaches $m_{tan} = 2x$ “in the limit,” thereby defining the instantaneous rate of change of the function at the point $P$. (Note that if $x = 3$, these calculations agree with those previously done for $m_{sec}$ and $m_{tan}$.) Thus, for example, the instantaneous rate of change of the function $y = f(x) = x^2$ at the point $P(4, 16)$ is equal to $m_{tan} = 8$, at $P(-5, 25)$ is $m_{tan} = -10$, and at the origin $P(0, 0)$ is $m_{tan} = 0$, etc.

[Figure: the parabola y = x² with tangent lines at P(−5, 25), m_tan = −10; P(−2, 4), m_tan = −4; P(0, 0), m_tan = 0; P(3, 9), m_tan = 6; P(4, 16), m_tan = 8]

In principle, there is nothing that prevents us from applying these same ideas to other functions $y = f(x)$. To find the instantaneous rate of change at an arbitrary point $P$ on its graph, we first calculate the average rate of change between $P(x, f(x))$ and a nearby point $Q(x + \Delta x, f(x + \Delta x))$ on its graph, as measured by $m_{sec} = \frac{\Delta y}{\Delta x} = \frac{f(x + \Delta x) - f(x)}{\Delta x}$. As $Q$ approaches $P$ – i.e., as $\Delta x$ approaches 0 from both sides – this quantity becomes the instantaneous rate of change at $P$, defined by:

$$m_{tan} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

This object – denoted compactly by $\frac{dy}{dx}$ – is called the derivative of the function $y = f(x)$, and is also symbolized by several other interchangeable notations: $\frac{d\,f(x)}{dx}$, $\frac{d}{dx}[f(x)]$, $f'(x)$, etc. (The last is sometimes referred to as “prime notation.”) The process of calculating the derivative of a function is called differentiation. Thus, the derivative of the function $y = f(x) = x^2$ is equal to the function $\frac{dy}{dx} = f'(x) = 2x$. This can also be written more succinctly as $\frac{d(x^2)}{dx} = 2x$, $\frac{d}{dx}(x^2) = 2x$, or $dy = 2x\,dx$. The last form expresses the so-called differential $dy$ in terms of the differential $dx$, which can be used to estimate a small output difference $\Delta y$ in terms of a small input difference $\Delta x$, i.e., $\Delta y \approx 2x\,\Delta x$.

• Using methods very similar to those above, it is possible to prove that such a general rule exists for any power function $x^p$, not just $p = 2$. Namely, if $y = f(x) = x^p$, then $\frac{dy}{dx} = f'(x) = p\,x^{p-1}$, i.e.,

Power Rule: $\frac{d}{dx}(x^p) = p\,x^{p-1}$.

Examples: If $y = x^3$, then $\frac{dy}{dx} = 3x^2$. If $y = x^{1/2}$, then $\frac{dy}{dx} = \frac{1}{2}x^{-1/2}$. If $y = x^{-1}$, then $\frac{dy}{dx} = -x^{-2}$. Also note that if $y = x = x^1$, then $\frac{dy}{dx} = 1\,x^0 = 1$, as it should! (The line $y = x$ has slope $m = 1$ everywhere.)

• Again using the preceding “limit definition” of a derivative, it can be proved that if $y = f(x) = b^x$, then $\frac{dy}{dx} = f'(x) = b^x \ln(b)$. Equivalently, if $y = f(x) = e^{ax}$, then $\frac{dy}{dx} = f'(x) = a\,e^{ax}$.

Exponential Rule: $\frac{d}{dx}(b^x) = b^x \ln(b)$ – or equivalently – $\frac{d}{dx}(e^{ax}) = a\,e^{ax}$.

Examples: If $y = 2^x$, then $\frac{dy}{dx} = 2^x \ln(2) = 2^x\,(0.6931)$. If $y = 10^x$, then $\frac{dy}{dx} = 10^x \ln(10) = 10^x\,(2.3026)$.

Hence, for any positive base $b$, the derivative of the function $b^x$ is equal to the product of $b^x$ times a constant factor of $\ln(b)$ that “hangs along for the ride.” However, if $b = e$ (or equivalently, $a = 1$), then the constant multiple is just 1, i.e., the derivative of $y = e^x$ is simply $\frac{dy}{dx} = e^x\,(1) = e^x$. In other words, the derivative of $e^x$ is equal to just $e^x$ itself! This explains why base $e$ is a particularly desirable special case.

More examples: If $y = e^{x/2}$, then $\frac{dy}{dx} = \frac{1}{2}e^{x/2}$. If $y = e^{-x}$, then $\frac{dy}{dx} = (-1)\,e^{-x} = -e^{-x}$.

• Finally, exploiting the fact that exponentials and logarithms are inverse functions, we have for $x > 0$,

Logarithm Rule: $\frac{d}{dx}[\log_b(x)] = \frac{1}{x}\,\frac{1}{\ln(b)}$ ⇒ Special case ($b = e$): $\frac{d}{dx}[\ln(x)] = \frac{1}{x}$.

• One last important example is worth making explicit: let $y = f(x) = 7$, whose graph is a horizontal line. The average rate of change (i.e., slope) between any two points on this graph is $\frac{\Delta y}{\Delta x} = \frac{7 - 7}{\Delta x} = 0$, and hence the instantaneous rate of change = 0 as well. In the same way, we have the following result.

If $y = f(x) = C$ (any constant) for all $x$, then the derivative $\frac{dy}{dx} = f'(x) = 0$ for all $x$.

(However, observe that a vertical line, having equation $x = C$, has an infinite – or undefined – slope.)

Not every function has a derivative everywhere! For example, the functions $y = f(x) = |x|$, $x^{1/3}$, and $x^{-1}$ are not differentiable at $x = 0$, all for different reasons. Although the first two are continuous through the origin $(0, 0)$, the first has a V-shaped graph; a uniquely defined tangent line does not exist at the “corner.” The second graph has a vertical tangent line there, hence the slope is infinite. And as we’ve seen, the last function is undefined at the origin; $x = 0$ is not even in its domain, so any talk of a tangent line there is completely meaningless. But, many complex functions are indeed differentiable…
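The Power, Exponential, and Logarithm Rules can be compared against a numerical approximation of the limit definition. The sketch below uses a symmetric difference quotient, which is a swapped-in approximation technique rather than anything from the original notes; the helper name `derivative`, the step `h`, and the test points are all assumptions.

```python
import math

def derivative(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + h) - func(x - h)) / (2 * h)

# (description, function, derivative predicted by the rule, test point)
checks = [
    ("x^3",      lambda x: x**3,          lambda x: 3 * x**2,              2.0),
    ("x^(1/2)",  lambda x: x**0.5,        lambda x: 0.5 * x**-0.5,         4.0),
    ("x^(-1)",   lambda x: x**-1,         lambda x: -x**-2,                2.0),
    ("2^x",      lambda x: 2**x,          lambda x: 2**x * math.log(2),    1.5),
    ("e^(x/2)",  lambda x: math.exp(x/2), lambda x: 0.5 * math.exp(x/2),   1.0),
    ("ln(x)",    lambda x: math.log(x),   lambda x: 1 / x,                 4.0),
    ("log10(x)", lambda x: math.log10(x), lambda x: 1 / (x*math.log(10)),  4.0),
]

for name, func, rule, x in checks:
    approx, exact = derivative(func, x), rule(x)
    print(f"{name:9s} numeric = {approx:.6f}   rule = {exact:.6f}")
    assert math.isclose(approx, exact, rel_tol=1e-6)
```

Agreement at a handful of points is only a sanity check, not a proof, but it is a quick way to catch a misremembered rule.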

Properties of Derivatives

1. For any constant $c$, and any differentiable function $f(x)$,

   $\frac{d}{dx}[c\,f(x)] = c\,\frac{df}{dx}$, i.e., $[c\,f(x)]' = c\,f'(x)$

   Example: If $y = 5x^3$, then $\frac{dy}{dx} = 5\,(3x^2) = 15x^2$.
   Example: If $y = \frac{1}{9}x$, then $\frac{dy}{dx} = \frac{1}{9}(1) = \frac{1}{9}$.
   Example: If $y = -3e^{2x}$, then $\frac{dy}{dx} = -3\,(2e^{2x}) = -6e^{2x}$.

For any two differentiable functions $f(x)$ and $g(x)$,

2. Sum and Difference Rules

   $\frac{d}{dx}[f(x) \pm g(x)] = \frac{df}{dx} \pm \frac{dg}{dx}$, i.e., $[f(x) \pm g(x)]' = f'(x) \pm g'(x)$

   Example: If $y = x^{3/2} - 7x^4 + 10e^{-3x} - 5$, then $\frac{dy}{dx} = \frac{3}{2}x^{1/2} - 28x^3 - 30e^{-3x}$.

3. Product Rule

   $[f(x)\,g(x)]' = f'(x)\,g(x) + f(x)\,g'(x)$

   Example: If $y = x^{11}e^{6x}$, then $\frac{dy}{dx} = (11x^{10})(e^{6x}) + (x^{11})(6e^{6x}) = (11 + 6x)\,x^{10}e^{6x}$.

4. Quotient Rule

   $\left[\frac{f(x)}{g(x)}\right]' = \frac{f'(x)\,g(x) - f(x)\,g'(x)}{[g(x)]^2}$, provided $g(x) \ne 0$

   Example: If $y = \frac{e^{4x}}{x^7 + 8}$, then $\frac{dy}{dx} = \frac{(x^7 + 8)\,4e^{4x} - (7x^6)\,e^{4x}}{(x^7 + 8)^2} = \frac{e^{4x}\,(4x^7 - 7x^6 + 32)}{(x^7 + 8)^2}$.

5. Chain Rule

   $[f(g(x))]' = f'(g(x)) \times g'(x)$

   Example: If $y = (x^{2/3} + 2e^{-9x})^6$, then $\frac{dy}{dx} = 6\,(x^{2/3} + 2e^{-9x})^5\,\left(\frac{2}{3}x^{-1/3} - 18e^{-9x}\right)$.

   Example: If $y = e^{-x^2/2}$, then $\frac{dy}{dx} = e^{-x^2/2}\left(-\frac{2x}{2}\right) = -x\,e^{-x^2/2}$. The graph of this function is related to the “bell curve” of probability and statistics. Note that you cannot calculate its derivative by the “exponential rule” given above, because the exponent is a function of $x$, not just $x$ itself (or a constant multiple $ax$)!

   Example: If $y = \ln(7x^{10} + 8x^6 - 4x + 11)$, then $\frac{dy}{dx} = \frac{1}{7x^{10} + 8x^6 - 4x + 11}\,(70x^9 + 48x^5 - 4)$.

   NOTE: See below for a more detailed explanation.
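The worked examples for the Product, Quotient, and Chain Rules can be sanity-checked numerically in the same spirit. This sketch again relies on a symmetric difference quotient (an assumed helper named `derivative`, not part of the original notes) and compares it with each closed-form answer at one arbitrarily chosen point.

```python
import math

def derivative(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# (rule, original function, derivative claimed in the example)
examples = [
    ("product",
     lambda x: x**11 * math.exp(6*x),
     lambda x: (11 + 6*x) * x**10 * math.exp(6*x)),
    ("quotient",
     lambda x: math.exp(4*x) / (x**7 + 8),
     lambda x: math.exp(4*x) * (4*x**7 - 7*x**6 + 32) / (x**7 + 8)**2),
    ("chain (power)",
     lambda x: (x**(2/3) + 2*math.exp(-9*x))**6,
     lambda x: 6 * (x**(2/3) + 2*math.exp(-9*x))**5
               * ((2/3) * x**(-1/3) - 18*math.exp(-9*x))),
    ("chain (bell curve)",
     lambda x: math.exp(-x**2 / 2),
     lambda x: -x * math.exp(-x**2 / 2)),
    ("chain (logarithm)",
     lambda x: math.log(7*x**10 + 8*x**6 - 4*x + 11),
     lambda x: (70*x**9 + 48*x**5 - 4) / (7*x**10 + 8*x**6 - 4*x + 11)),
]

x0 = 0.7  # arbitrary positive test point
for name, func, claimed in examples:
    assert math.isclose(derivative(func, x0), claimed(x0), rel_tol=1e-5), name
    print(f"{name}: formula agrees with the numerical slope at x = {x0}")
```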

The Chain Rule, which can be written several different ways, bears some further explanation. It is a rule for differentiating a composition of two functions $f$ and $g$, that is, a function of a function $y = f(g(x))$. The function in the first example above can be viewed as composing the “outer” function $f(u) = u^6$, with the “inner” function $u = g(x) = x^{2/3} + 2e^{-9x}$. To find its derivative, first take the derivative of the outer function ($6u^5$, by the Power Rule given above), then multiply that by the derivative of the inner, and we get our answer. Similarly, the function in the second example can be viewed as composing the outer exponential function $f(u) = e^u$ (whose derivative, recall, is itself), with the inner power function $u = g(x) = -x^2/2$. And the last case can be seen as composing the outer logarithm function $f(u) = \ln(u)$ with the inner polynomial function $u = g(x) = 7x^{10} + 8x^6 - 4x + 11$. Hence, if $y = f(u)$, and $u = g(x)$, then $\frac{dy}{dx} = f'(u)\,g'(x)$ is another way to express this procedure.

The logic behind the Chain Rule is actually quite simple and intuitive (though a formal proof involves certain technicalities that we do not pursue here). Imagine three cars traveling at different rates of speed over a given time interval: A travels at a rate of 60 mph, B travels at a rate of 40 mph, and C travels at a rate of 20 mph. In addition, suppose that A knows how fast B is traveling, and B knows how fast C is traveling, but A does not know how fast C is traveling. Over this time span, the average rate of change of A, relative to B, is equal to the ratio of their respective distances traveled, i.e., $\frac{\Delta A}{\Delta B} = \frac{60}{40} = \frac{3}{2}$, three-halves as much. Similarly, the average rate of change of B, relative to C, is equal to the ratio $\frac{\Delta B}{\Delta C} = \frac{40}{20} = \frac{2}{1}$, twice as much. Therefore, the average rate of change of A, relative to C, can be obtained by multiplying together these two quantities, via the elementary algebraic identity $\frac{\Delta A}{\Delta C} = \frac{\Delta A}{\Delta B} \times \frac{\Delta B}{\Delta C}$, or $\frac{3}{1} = \frac{3}{2} \times \frac{2}{1}$, i.e., three times as much. (Of course, if A has direct knowledge of C, this would not be necessary, for $\frac{\Delta A}{\Delta C} = \frac{60}{20} = \frac{3}{1}$. As it is, B acts as an intermediate link in the chain, or an auxiliary function, which makes calculations easier in some contexts.) The idea behind the Chain Rule is that what is true for average rates of change also holds for instantaneous rates of change, as the time interval shrinks to 0 “in the limit,” i.e., derivatives.

With this insight, an alternate – perhaps more illuminating – equivalent way to write the Chain Rule is $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$ which, as the last three examples illustrate, specializes to a General Power Rule, General Exponential Rule, and General Logarithm Rule for differentiation, respectively:

If $y = u^p$, then $\frac{dy}{dx} = p\,u^{p-1}\,\frac{du}{dx}$. That is, $d(u^p) = p\,u^{p-1}\,du$.

If $y = e^u$, then $\frac{dy}{dx} = e^u\,\frac{du}{dx}$. That is, $d(e^u) = e^u\,du$.

If $y = \ln(u)$, then $\frac{dy}{dx} = \frac{1}{u}\,\frac{du}{dx}$. That is, $d[\ln(u)] = \frac{1}{u}\,du$.

Exercise: Recall from the earlier exercise that $y = x^x$ is not a power function. Obtain its derivative via the technique of implicit (in particular, logarithmic) differentiation. (Not covered here.)

5. Applications: Estimation, Roots and Maxima & Minima, Related Rates

As seen, for nearby points $P$ and $Q$ on the graph of a function $f(x)$, it follows that $m_{sec} \approx m_{tan}$, at least informally. That is, for a small change $\Delta x$, we have $\frac{\Delta y}{\Delta x} \approx \frac{dy}{dx} = f'(x)$, i.e., $\Delta y \approx f'(x)\,\Delta x$. Hence the first derivative $f'(x)$ can be used in a crude local estimate of the amount of change $\Delta y$ of the function $y = f(x)$.

Example: Let $y = f(x) = (x^2 - 2x + 2)\,e^x$. The change in function value from say, $f(1)$ to $f(1.03)$, can be estimated by $\Delta y \approx f'(x)\,\Delta x = (x^2 e^x)\,\Delta x$, when $x = 1$ and $\Delta x = 0.03$, i.e., $\Delta y \approx 0.03\,e = 0.0815$, to four decimal places.

Exercise: Compare this with the exact value of $\Delta y = f(1.03) - f(1)$.

NOTE: A better estimate of the difference $\Delta y = f(x + \Delta x) - f(x)$ can be obtained by adding information from the second derivative $f''(x)$ (= derivative of $f'(x)$), the third derivative $f'''(x)$, etc., to the previous formula:

$$\Delta y \approx \frac{1}{1!}f'(x)\,\Delta x + \frac{1}{2!}f''(x)\,(\Delta x)^2 + \frac{1}{3!}f'''(x)\,(\Delta x)^3 + \ldots + \frac{1}{n!}f^{(n)}(x)\,(\Delta x)^n$$

(where $1! = 1$, $2! = 2 \times 1 = 2$, $3! = 3 \times 2 \times 1 = 6$, and in general “$n$ factorial” is defined as $n! = n \times (n - 1) \times (n - 2) \times \ldots \times 1$). The formal mathematical statement of equality between $\Delta y$ and the sum of higher derivatives of $f$ is an extremely useful result known as Taylor’s Formula.

Suppose we wish to solve for the roots of the equation $f(x) = 0$, i.e., the values where the graph of $f(x)$ intersects the X-axis (also called the zeros of the function $f(x)$). Algebraically, this can be extremely tedious or even impossible, so we often turn to numerical techniques which yield computer-generated approximations. In the popular Newton-Raphson Method, we start with an initial guess $x_0$, then produce a sequence of values $x_0, x_1, x_2, x_3, \ldots$ that converges to a numerical solution, by iterating the expression $x - \frac{f(x)}{f'(x)}$. (Iteration is the simple process of repeatedly applying the same formula to itself, as in a continual “feedback loop.”)

Example: Solve for $x$: $f(x) = x^3 - 21x^2 + 135x - 220 = 0$.

We note that $f(2) = -26 < 0$, and $f(3) = +23 > 0$. Hence, by continuity, this cubic (degree 3) polynomial must have a zero somewhere between 2 and 3. Applying Newton’s Method, we iterate the formula

$$x - \frac{f(x)}{f'(x)} = x - \frac{x^3 - 21x^2 + 135x - 220}{3x^2 - 42x + 135} = \frac{2x^3 - 21x^2 + 220}{3x^2 - 42x + 135}.$$

Starting with $x_0 = 2$, we generate the sequence $x_1 = 2.412698$, $x_2 = 2.461290$, $x_3 = 2.461941$, $x_4 = 2.461941$; thereafter, the iterates remain fixed at this value 2.461941. (Check: $f(2.461941) = 1.4 \times 10^{-5}$; the small difference from 0 is due to roundoff error.)

Exercise: Try different initial values, e.g., $x_0 = 0, 1, 3, 4, 5, 6, 7$. Explain the different behaviors.
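The Newton-Raphson iteration for this example can be sketched in a few lines of Python. This is a minimal illustration, not a production root finder: the function name `newton`, the stopping tolerance `tol`, and the iteration cap `max_iter` are assumptions introduced here.

```python
def f(x):
    return x**3 - 21*x**2 + 135*x - 220

def fprime(x):
    return 3*x**2 - 42*x + 135

def newton(func, dfunc, x0, tol=1e-10, max_iter=50):
    """Iterate x -> x - func(x)/dfunc(x); return the list of iterates."""
    x = x0
    iterates = [x]
    for _ in range(max_iter):
        slope = dfunc(x)
        if slope == 0:
            # Horizontal tangent line: it never crosses the X-axis.
            raise ZeroDivisionError(f"f'(x) = 0 at x = {x}")
        x_new = x - func(x) / slope
        iterates.append(x_new)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return iterates

for k, xk in enumerate(newton(f, fprime, 2.0)):
    print(f"x{k} = {xk:.6f}")
```

Calling `newton(f, fprime, x0)` with the other starting values from the exercise is one way to explore it; note that the sketch deliberately raises an error when a starting value makes the tangent line horizontal.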

Notice the extremely rapid convergence to the root. In fact, it can be shown that the small error (between each value and the true solution) in each iteration is approximately squared in the next iteration, resulting in a much smaller error. This feature of quadratic convergence is a main reason why this is a favorite method. Why does it work at all? Suppose $P_0$ is a point on the graph of $f(x)$, whose x-coordinate $x_0$ is reasonably close to a root. Generally speaking, the tangent line at $P_0$ will then intersect the X-axis at a value much closer to the root. This value $x_1$ can then be used as the x-coordinate of a new point $P_1$ on the graph, and the cycle repeated until some predetermined error tolerance is reached. Algebraically formalizing this process results in the general formula given above.

If a function $f(x)$ has either a relative maximum (i.e., local maximum) or a relative minimum (i.e., local minimum) at some value of $x$, and if $f(x)$ is differentiable there, then its tangent line must be horizontal, i.e., slope $m_{tan} = f'(x) = 0$. This suggests that, in order to find such relative extrema (i.e., local extrema), we set the derivative $f'(x)$ equal to zero, and solve the resulting algebraic equation for the critical values of $f$, perhaps using a numerical approximation technique like Newton’s Method described above. (But beware: Not all critical values necessarily correspond to relative extrema! More on this later…)

Example (cont’d): Find and classify the critical points of $y = f(x) = x^3 - 21x^2 + 135x - 220$.

We have $f'(x) = 3x^2 - 42x + 135 = 0$. As this is a quadratic (degree 2) polynomial equation, a numerical approximation technique is not necessary. We can use the quadratic formula to solve this explicitly, or simply observe that, via factoring, $3x^2 - 42x + 135 = 3\,(x - 5)(x - 9) = 0$. Hence there are two critical values, $x = 5$ and $x = 9$, where $f' = 0$. Furthermore, since $f(5) = 55$ and $f(9) = 23$, it follows that the corresponding critical points on the graph of $f(x)$ are $(5, 55)$ and $(9, 23)$.

Once obtained, it is necessary to determine the exact nature of these critical points. Consider the first critical value, $x = 5$.

First Derivative Test. Let us now evaluate the derivative $f'(x)$ at two nearby values that bracket $x = 5$ on the left and right, say $x = 4$ and $x = 6$. We calculate that:

$m_{tan}(4) = f'(4) = +15 > 0$, which indicates that the original function $f$ is increasing at $x = 4$,
$m_{tan}(5) = f'(5) = 0$, which indicates that $f$ is neither increasing nor decreasing at $x = 5$,
$m_{tan}(6) = f'(6) = -9 < 0$, which indicates that the original function $f$ is decreasing at $x = 6$.

Hence, as we move from left to right in a local neighborhood of $x = 5$, the function $f(x)$ rises, levels off, then falls. This indicates that the point $(5, 55)$ is a relative maximum for $f$, and demonstrates an application of the “First Derivative Test” for determining the nature of critical points.

Second Derivative Test. In an alternate method, the “Second Derivative Test,” we evaluate $f''(x) = 6x - 42$ at the critical value $x = 5$, i.e., $f''(5) = -12 < 0$, which indicates that the original function $f$ is concave down (“spills”) at this value. Hence, this also shows that the point $(5, 55)$ is a relative maximum for $f$, consistent with the above.

Exercise: Show that: (1) $f'(8) < 0$, $f'(10) > 0$; (2) $f''(9) > 0$. In either case, conclude that the point $(9, 23)$ is a relative minimum for $f$.
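The steps of this example (solve $f'(x) = 0$, then apply the Second Derivative Test) can be sketched in Python as follows. The helper name `classify` and the returned labels are illustrative assumptions; the sketch also reports "inconclusive" when $f''$ vanishes, a case the test itself cannot decide.

```python
import math

def f(x):
    return x**3 - 21*x**2 + 135*x - 220

def fsecond(x):
    return 6*x - 42

# Critical values: roots of f'(x) = 3x^2 - 42x + 135 via the quadratic formula.
a, b, c = 3, -42, 135
disc = b*b - 4*a*c
critical_values = sorted([(-b - math.sqrt(disc)) / (2*a),
                          (-b + math.sqrt(disc)) / (2*a)])

def classify(x):
    """Second Derivative Test at a critical value x."""
    s = fsecond(x)
    if s < 0:
        return "relative maximum"   # concave down ("spills")
    if s > 0:
        return "relative minimum"   # concave up ("holds")
    return "inconclusive"

for x in critical_values:
    print(f"critical point ({x:g}, {f(x):g}): {classify(x)}")
```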

Finally, notice that $f''(x) = 6x - 42 = 0$ when $x = 7$, and in a local neighborhood of that value,

$f''(6) = -6 < 0$, which indicates that the original function $f$ is concave down (“spills”) at $x = 6$,
$f''(7) = 0$, which indicates that $f$ is neither concave down nor concave up at $x = 7$,
$f''(8) = +6 > 0$, which indicates that the original function $f$ is concave up (“holds”) at $x = 8$.

Hence, across $x = 7$, there is a change in concavity of the function $f(x)$. This indicates that $(7, 39)$ is a point of inflection for $f$.

The full graph of $f(x) = x^3 - 21x^2 + 135x - 220$ is shown below, using all of this information.

[Figure: graph of f, labeled “f increases, f concave down” up to (5, 55), “f decreases” through (7, 39) down to (9, 23), then “f increases, f concave up”; the root (2.461941, 0) is also marked]

As we have shown, this function has a relative maximum (i.e., local maximum) value = 55 at $x = 5$, and a relative minimum (i.e., local minimum) value = 23 at $x = 9$. But clearly, there are both higher and lower points on the graph! For example, if $x \ge 11$, then $f(x) \ge 55$; likewise, if $x \le 3$, then $f(x) \le 23$. (Why?) Therefore, this function has no absolute maximum (i.e., global maximum) value, and no absolute minimum (i.e., global minimum) value. However, if we restrict the domain to an interval that is “closed and bounded” (i.e., compact), then both absolute extrema are attained. For instance, in the interval $[4, 10]$, the relative extreme points are also the absolute extreme points, i.e., the function attains its global maximum and minimum values of 55 (at $x = 5$) and 23 (at $x = 9$), respectively. However, in the interval $[4, 12]$, the global maximum of the function is equal to 104, attained at the right endpoint $x = 12$. Similarly, in the interval $[0, 12]$, the global minimum of the function is equal to –220, attained at the left endpoint $x = 0$.
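On a closed and bounded interval, the absolute extrema of this function can be found by comparing its values at the endpoints and at any critical values inside the interval. The sketch below hard-codes the critical values 5 and 9 found earlier; the function name `extrema_on` is an assumption introduced for illustration.

```python
def f(x):
    return x**3 - 21*x**2 + 135*x - 220

CRITICAL_VALUES = (5, 9)  # roots of f'(x), from the example above

def extrema_on(a, b):
    """Return ((x_max, f_max), (x_min, f_min)) for f on the closed interval [a, b]."""
    candidates = [a, b] + [c for c in CRITICAL_VALUES if a < c < b]
    x_max = max(candidates, key=f)
    x_min = min(candidates, key=f)
    return (x_max, f(x_max)), (x_min, f(x_min))

for interval in ((4, 10), (4, 12), (0, 12)):
    (x_max, f_max), (x_min, f_min) = extrema_on(*interval)
    print(f"on {list(interval)}: max {f_max} at x = {x_max}, "
          f"min {f_min} at x = {x_min}")
```

Only endpoints and interior critical values need to be checked because $f$ is differentiable everywhere, so an extremum in the interior of the interval must occur where $f' = 0$.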

Exercise: Graph each of the following.

$f(x) = x^3 - 21x^2 + 135x - 243$
$f(x) = x^3 - 21x^2 + 135x - 265$
$f(x) = x^3 - 21x^2 + 147x - 265$

Exercise: The origin $(0, 0)$ is a critical point for both $f(x) = x^3$ and $f(x) = x^4$. (Why?) Using the tests above, formally show that it is a relative minimum of the latter, but a point of inflection of the former. Graph both of these functions.

The volume $V$ of a spherical cell is functionally related to its radius $r$ via the formula $V = \frac{4}{3}\pi r^3$. Hence, the derivative $\frac{dV}{dr} = 4\pi r^2$ naturally expresses the instantaneous rate of change of $V$ with respect to $r$. But as $V$ and $r$ are functions of time, we can relate both their rates directly, by differentiating the original relation with respect to the time variable $t$, yielding the equation $\frac{dV}{dt} = 4\pi r^2\,\frac{dr}{dt}$. Observant readers will note that the two equations are mathematically equivalent via the Chain Rule, since $\frac{dV}{dt} = \frac{dV}{dr}\,\frac{dr}{dt}$, but the second form illustrates a simple application of a general technique called related rates. That is, the “time derivative” of a functional relation between two or more variables results in a type of differential equation that relates their rates.

6. Integrals and Antiderivatives

Suppose we have a general function $y = f(x)$; for simplicity, let’s assume that the function is nonnegative (i.e., $f(x) \ge 0$) and continuous (i.e., informally, it has no breaks or jumps), as shown below. We wish to find the area under the graph of $f$, from some fixed lower value $a$, to any variable upper value $x$, in an interval $[a, x]$.

[Figure: graph of y = f(x) above the X-axis, with a, x, z, and x + Δx marked; the area F(x) over [a, x] is highlighted in blue, and the strip F(x + Δx) − F(x) of height f(z) over [x, x + Δx] is highlighted in green]

We formally define a new function

$$F(x) = \text{Area under the graph of } f \text{ in the interval } [a, x].$$

Clearly, because every value of $x$ results in one and only one area (shown highlighted above in blue), this is a function of $x$, by definition! Moreover, $F$ must also have a strong connection with $f$ itself. To see what that connection must be, consider a nearby value $x + \Delta x$. Then,

$$F(x + \Delta x) = \text{Area under the graph of } f \text{ in the interval } [a, x + \Delta x],$$

and take the difference of these two areas (highlighted above in green):

$F(x + \Delta x) - F(x)$ = Area under the graph of $f$ in the interval $[x, x + \Delta x]$
= Area of the rectangle with height $f(z)$ and width $\Delta x$ (where $z$ is some value in the interval $[x, x + \Delta x]$)
= $f(z) \, \Delta x$.

Therefore, we have

$$\frac{F(x + \Delta x) - F(x)}{\Delta x} = f(z).$$

Now take the limit of both sides as $\Delta x \to 0$. We see that the left hand side becomes the derivative of $F(x)$ (recall the definition of the derivative of a function, previously given) and, noting that $z \to x$ as $\Delta x \to 0$, we see that the right hand side $f(z)$ becomes (via continuity) $f(x)$. Hence,

$$F'(x) = f(x),$$

i.e., $F$ is an antiderivative of $f$. Therefore, we formally express…

$$F(x) = \int_a^x f(t) \, dt,$$

where the right-hand side $\int_a^x f(t) \, dt$ represents the definite integral of the function $f$ from $a$ to $x$. (In this context, $f$ is called the integrand.) More generally, if $F$ is any antiderivative of $f$, then the two functions are related via the indefinite integral:

$$\int f(x) \, dx = F(x) + C,$$

where $C$ is an arbitrary constant.

Example 1: $F(x) = \frac{1}{10} x^{10} + C$ (where $C$ is any constant) is the general antiderivative of $f(x) = x^9$, because $F'(x) = \frac{1}{10} (10 x^9) + 0 = x^9 = f(x)$. We can write this relation succinctly as $\int x^9 \, dx = \frac{1}{10} x^{10} + C$.
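The relation $F'(x) = f(x)$ and the antiderivative in Example 1 can be sanity-checked numerically. The Python sketch below is an illustration under stated assumptions rather than anything from the original notes: the helper names, the midpoint-rectangle approximation of the area, and the sample point $x = 1.2$ with lower limit $a = 0$ are all arbitrary choices. It approximates the strip area $F(x + \Delta x) - F(x)$, divides by $\Delta x$, and compares the quotient with $f(x)$ as $\Delta x$ shrinks.

```python
def area(f, lo, hi, n=10_000):
    """Approximate the area under f on [lo, hi] using n midpoint rectangles."""
    width = (hi - lo) / n
    return width * sum(f(lo + (i + 0.5) * width) for i in range(n))


def f(t):
    """Integrand from Example 1: f(t) = t^9."""
    return t**9


def F_exact(x):
    """Antiderivative from Example 1 with C = 0, which makes F(0) = 0."""
    return x**10 / 10


def difference_quotient(x, dx):
    """[F(x + dx) - F(x)] / dx, where the numerator is the strip area over [x, x + dx]."""
    return area(f, x, x + dx, n=100) / dx


if __name__ == "__main__":
    x = 1.2  # arbitrary sample point (assumption)
    print("area on [0, x]:", area(f, 0.0, x), " vs x^10/10:", F_exact(x))
    for dx in (0.1, 0.01, 0.001):
        print(f"dx = {dx}: quotient = {difference_quotient(x, dx)}, f(x) = {f(x)}")
```

The quotient equals the average height of $f$ over the strip, which is the $f(z)$ of the argument above; it approaches $f(x)$ because $f$ is continuous.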

Example 2: $F(x) = 8 e^{x/8} + C$ (where $C$ is any constant) is the general antiderivative of $f(x) = e^{x/8}$, because $F'(x) = 8 \left( \frac{1}{8} \right) e^{x/8} + 0 = e^{x/8} = f(x)$. We can write this relation succinctly as $\int e^{x/8} \, dx = 8 e^{x/8} + C$.

NOTE: Integrals possess the analogues of Properties 1 and 2 for derivatives, found on page 10. In particular, the integral of a constant multiple of a function, $c f(x)$, is equal to that constant multiple $c$, times the integral of the function $f(x)$. Also, the integral of a sum (respectively, difference) of two functions is equal to the sum (respectively, difference) of the integrals. (The integral analogue for products corresponds to a technique known as integration by parts, not reviewed here.) These are extremely important properties for the applications that follow.

Properties of Integrals

1. For any constant $c$, and any integrable function $f(x)$,
$$\int [c f(x)] \, dx = c \int f(x) \, dx$$

2. Sum and Difference Rules: For any two integrable functions $f(x)$ and $g(x)$,
$$\int [f(x) \pm g(x)] \, dx = \int f(x) \, dx \pm \int g(x) \, dx$$

From the previous two examples, it is evident that the differentiation rules for power and exponential functions can be inverted (essentially by taking the integral of both sides) to the General Power Rule, General Logarithm Rule, and General Exponential Rule for integration:

$$\int u^p \, du = \begin{cases} \dfrac{u^{p+1}}{p+1} + C, & \text{if } p \ne -1 \\ \ln |u| + C, & \text{if } p = -1 \end{cases} \qquad\qquad \int e^u \, du = e^u + C$$

NOTE: In order to use these formulas correctly, $du$ must be present in the integrand (up to a constant multiple). To illustrate…

Example 3: $\int (x^5 + 2)^9 \, 5x^4 \, dx = \dfrac{(x^5 + 2)^{10}}{10} + C$, using $\int u^9 \, du = \dfrac{u^{10}}{10} + C$.

There are two ways to solve this problem. The first is to expand out the algebraic expression in the integrand, and integrate the resulting polynomial (of degree 49) term-by-term… Yuk. The second way, as illustrated, is to recognize that if we substitute $u = x^5 + 2$, then $du = 5x^4 \, dx$, which is precisely the other factor in the integrand, as is! Therefore, in terms of the variable $u$, this is essentially just a “power rule” integration, carried out above. (To check the answer, take the derivative of the right-hand function, and verify that the original integrand is restored. Don’t forget to use the Chain Rule!)

Note that if the constant multiple 5 were absent from the original integrand, we could introduce and compensate for it. (This procedure is demonstrated in the next example.) However, if the $x^4$ were absent, or were replaced by any other function, then we would not have been able to carry out the integration exactly in the manner shown, since via Property 1 above, we can only balance constant multiples, not functions!

Example 4: $\int \dfrac{x^2}{\sqrt{1 + x^3}} \, dx = \dfrac{1}{3} \int (1 + x^3)^{-1/2} \, 3x^2 \, dx = \dfrac{1}{3} \times 2 (1 + x^3)^{1/2} + C = \dfrac{2}{3} \sqrt{1 + x^3} + C$, using $\int u^{-1/2} \, du = \dfrac{u^{1/2}}{1/2} + C$.

In this example, let $u = 1 + x^3$, so that $du = 3x^2 \, dx$. This is present in the original integrand, except for the constant multiple of 3 (which we can introduce, provided we preserve the balance via multiplication by 1/3 on the outside of the integral sign), revealing that this is again a “power rule” integral. And again, if the $x^2$ were missing from the integrand, or were replaced by any other function, then we would not be able to carry out the integration in the manner shown. Verify that the answer is correct via differentiation.

Example 5: $\int \dfrac{x^2}{1 + x^3} \, dx = \dfrac{1}{3} \int (1 + x^3)^{-1} \, 3x^2 \, dx = \dfrac{1}{3} \ln |1 + x^3| + C$, using $\int u^{-1} \, du = \ln |u| + C$.

This is very similar to the previous example, except that it is a logarithmic, not a power, integral.

Example 6: $\int z \, e^{-z^2/2} \, dz = -\int e^{-z^2/2} \, (-z) \, dz = -e^{-z^2/2} + C$, using $\int e^u \, du = e^u + C$.

In this example, if $u = -z^2/2$, then $du = -z \, dz$. This is present in the integrand, except for the constant multiple −1, which we can easily balance, and perform the subsequent exponential integration. Again, if the $z$ were missing from the integrand, we would not be able to introduce and balance for it. In fact, it can be shown that without this factor of $z$, this integral is not expressible in terms of “elementary functions.” Because it is related to the important “bell curve” of probability and statistics, the values of its corresponding definite integral are estimated and tabulated for general use.
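The check suggested in Examples 3 through 6 (differentiate the answer and confirm that the integrand is restored) can also be done numerically. The Python sketch below is a hedged illustration only: the helper names, the central-difference step size, the sample points, and the tolerances are assumptions, and a passing numerical check is evidence of a correct antiderivative, not a proof.

```python
import math


def derivative(F, x, h=1e-6):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


# Each entry pairs an integrand with the antiderivative claimed in the text (C = 0).
EXAMPLES = {
    "Example 3": (lambda x: (x**5 + 2) ** 9 * 5 * x**4,
                  lambda x: (x**5 + 2) ** 10 / 10),
    "Example 4": (lambda x: x**2 / math.sqrt(1 + x**3),
                  lambda x: (2 / 3) * math.sqrt(1 + x**3)),
    "Example 5": (lambda x: x**2 / (1 + x**3),
                  lambda x: math.log(abs(1 + x**3)) / 3),
    "Example 6": (lambda z: z * math.exp(-z**2 / 2),
                  lambda z: -math.exp(-z**2 / 2)),
}


def verify(integrand, antiderivative, points=(0.3, 0.7, 1.1)):
    """True if the numerical derivative of the antiderivative matches the integrand at every sample point."""
    return all(
        math.isclose(derivative(antiderivative, x), integrand(x),
                     rel_tol=1e-6, abs_tol=1e-9)
        for x in points
    )


if __name__ == "__main__":
    for name, (integrand, antiderivative) in EXAMPLES.items():
        print(name, "checks out:", verify(integrand, antiderivative))
```

Dropping a balancing constant, such as the factor 1/3 in Example 5, makes the same check fail, which is a quick way to catch a forgotten constant multiple.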

Finally, all these results can be summarized into one elegant statement, the Fundamental Theorem of Calculus for definite integrals:

$$\int_a^b f(x) \, dx = F(b) - F(a).$$

(Advanced techniques of integration – such as integration by parts, trigonometric substitution, partial fractions, etc. – will not be reviewed here.)

[Figure: area under the graph of $f$ from $a$ to $b$, labeled $\int_a^b f(x) \, dx$.]

Example 7: $\int_0^1 x^3 (1 - x^4)^2 \, dx$

This definite integral represents the amount of area under the curve $f(x) = x^3 (1 - x^4)^2$, from $x = 0$ to $x = 1$.

Method 1. Expand and integrate term-wise:

$$\int_0^1 x^3 (1 - 2x^4 + x^8) \, dx = \int_0^1 (x^3 - 2x^7 + x^{11}) \, dx = \left[ \frac{x^4}{4} - \frac{2x^8}{8} + \frac{x^{12}}{12} \right]_0^1 = \left[ \frac{1^4}{4} - \frac{2 (1)^8}{8} + \frac{1^{12}}{12} \right] - \left[ \frac{0^4}{4} - \frac{2 (0)^8}{8} + \frac{0^{12}}{12} \right] = \frac{1}{12} - 0 = \frac{1}{12}.$$

Method 2. Use the power function formula (if possible): If $u = 1 - x^4$, then $du = -4x^3 \, dx$, and $x^3$ is indeed present in the integrand. Recall that the x-limits of integration should also be converted to u-limits: when $x = 0$, we get $u = 1 - 0^4 = 1$; when $x = 1$, we get $u = 1 - 1^4 = 0$.

$$\int_0^1 x^3 (1 - x^4)^2 \, dx = -\frac{1}{4} \int_{x=0}^{x=1} (1 - x^4)^2 (-4x^3) \, dx = -\frac{1}{4} \int_{u=1}^{u=0} u^2 \, du = \frac{1}{4} \int_{u=0}^{u=1} u^2 \, du = \frac{1}{4} \left[ \frac{u^3}{3} \right]_0^1 = \frac{1}{4} \left[ \frac{1^3}{3} - \frac{0^3}{3} \right] = \frac{1}{12}.$$

NOTE: Numerical integration techniques, such as the Trapezoidal Rule, are sometimes used also.
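As a rough illustration of the Trapezoidal Rule mentioned in the NOTE, the Python sketch below approximates the integral of Example 7 and compares it with the exact value 1/12. The function names and the choices of subinterval count are assumptions made for this example; the rule itself is the standard composite trapezoid formula with equal-width subintervals.

```python
def trapezoid(f, a, b, n):
    """Composite Trapezoidal Rule for the integral of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))  # endpoints carry half weight
    for i in range(1, n):
        total += f(a + i * h)    # interior nodes carry full weight
    return total * h


def integrand(x):
    """Integrand of Example 7: x^3 (1 - x^4)^2."""
    return x**3 * (1 - x**4) ** 2


def substituted(u):
    """Integrand after the substitution u = 1 - x^4, limits swapped to [0, 1]: u^2 / 4."""
    return u**2 / 4


if __name__ == "__main__":
    exact = 1 / 12
    for n in (10, 100, 1000):
        approx = trapezoid(integrand, 0.0, 1.0, n)
        print(f"n = {n}: approx = {approx}, error = {abs(approx - exact)}")
    print("after substitution:", trapezoid(substituted, 0.0, 1.0, 1000))
```

Both the original integrand and the substituted one converge to the same value, which is a useful independent check that the limits of integration were converted correctly.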

7. Differential Equations

As a first example, suppose we wish to find a function $y = f(x)$ whose derivative $\frac{dy}{dx}$ is given, e.g., $\frac{dy}{dx} = x^2$. Formally, by separation of variables – $y$ on the left, and $x$ on the right – we can rewrite this ordinary differential equation (or o.d.e.) as $dy = x^2 \, dx$. In this differential form, we can now integrate both sides explicitly to obtain $y = \frac{1}{3} x^3 + C$, where $C$ is an arbitrary additive constant. Note that the “solution” therefore actually represents an entire family of functions; each function corresponds to a different value of $C$. Further specifying an initial value (or initial condition), such as $y(2) = 5$, singles out exactly one of them passing through the chosen point, in this case, $y = \frac{1}{3} x^3 + \frac{7}{3}$.

[Figure: the family of curves $y = \frac{1}{3} x^3 + C$, with the point (2, 5) marked.]

Now consider the case of finding a function $y = f(x)$ whose rate of change $\frac{dy}{dx}$ is proportional to $y$ itself, i.e., $\frac{dy}{dx} = a y$, where $a$ is a known constant of proportionality (either positive or negative). Separating variables produces $\frac{1}{y} \, dy = a \, dx$, integrating yields $\ln(y) = a x + C$, and solving gives $y = e^{ax + C} = e^{ax} e^C$, or $y = A e^{ax}$, where $A$ is an arbitrary (positive) multiplicative constant. Hence this is a family of exponential curves, either increasing for $a > 0$ (as in unrestricted population growth, illustrated) or decreasing for $a < 0$ (as in radioactive isotope decay). Specifying an initial amount $y(0) = y_0$ “when the clock starts at time zero” yields the unique solution $y = y_0 e^{ax}$.

[Figure: exponential curve with the point $(0, y_0)$ marked.]

Finally, suppose that population size $y = f(x)$ is restricted between 0 and 1, such that the rate of change $\frac{dy}{dx}$ is proportional to the product $y (1 - y)$, i.e., $\frac{dy}{dx} = a y (1 - y)$, where $a > 0$. With the initial condition $y(0) = y_0$, the solution is given by

$$y = \frac{y_0 e^{ax}}{y_0 e^{ax} + (1 - y_0)}.$$

This is known as the logistic curve, which initially resembles the exponential curve, but remains bounded as $x$ gets large.

[Figure: logistic curve with the point $(0, y_0)$ marked.]

NOTE: Many types of differential equation exist, including those that cannot be explicitly solved using “elementary” techniques. Like integration above, such equations can be solved via Euler’s Method and other, more sophisticated, numerical techniques.
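To give a flavor of Euler’s Method, the Python sketch below applies it to the three equations of this section and compares the results with the closed-form solutions given in the text. It is a minimal illustration, not the author’s own procedure: the function names, the parameter values ($a = 1$, $y_0 = 0.1$ for the logistic case; $a = -0.5$, $y_0 = 2$ for the exponential case), and the step counts are assumptions chosen only for demonstration.

```python
import math


def euler(rhs, x0, y0, x_end, n):
    """Euler's Method for dy/dx = rhs(x, y), y(x0) = y0, using n equal steps up to x_end."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * rhs(x, y)  # follow the tangent line for one step
        x += h
    return y


def cubic_exact(x):
    """Solution of dy/dx = x^2 with y(2) = 5: y = x^3/3 + 7/3."""
    return x**3 / 3 + 7 / 3


def exponential_exact(x, a, y0):
    """Solution of dy/dx = a*y with y(0) = y0: y = y0 * e^(a x)."""
    return y0 * math.exp(a * x)


def logistic_exact(x, a, y0):
    """Solution of dy/dx = a*y*(1 - y) with y(0) = y0 (the logistic curve)."""
    growth = y0 * math.exp(a * x)
    return growth / (growth + (1 - y0))


if __name__ == "__main__":
    a, y0 = 1.0, 0.1  # hypothetical parameter values
    for n in (10, 100, 1000):
        approx = euler(lambda x, y: a * y * (1 - y), 0.0, y0, 5.0, n)
        print(f"n = {n}: Euler = {approx}, exact = {logistic_exact(5.0, a, y0)}")
```

Euler’s Method is only first-order accurate, so halving the step size roughly halves the error; this is one reason the more sophisticated techniques mentioned in the NOTE are usually preferred in practice.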

8. Summary of Main Points

The instantaneous rate of change of a function $y = f(x)$ at a value of $x$ in its domain is given by its derivative $\frac{dy}{dx} = f'(x)$. This function is mathematically defined in terms of a particular “limiting value” of average rates of change over progressively smaller intervals (when that limit exists), and can be interpreted as the slope of the line tangent to the graph of $y = f(x)$. As complex functions are built up from simpler ones by taking sums, differences, products, quotients, and compositions, formulas exist for computing their derivatives as well. In particular, via the Chain Rule $\frac{d}{dx} [f(u)] = f'(u) \frac{du}{dx}$, we have:

General Power Rule: $\dfrac{d}{dx} (u^p) = p \, u^{p-1} \dfrac{du}{dx}$
General Exponential Rule: $\dfrac{d}{dx} (e^u) = e^u \dfrac{du}{dx}$
General Logarithm Rule: $\dfrac{d}{dx} [\ln(u)] = \dfrac{1}{u} \dfrac{du}{dx}$

Derivatives can be applied to estimate functions locally, find the relative extrema of functions, and relate rates of change of different but connected functions.

A function $f(x)$ has an antiderivative $F(x)$ if its derivative $\frac{dF}{dx} = f(x)$, or equivalently, $f(x) \, dx = dF$, and expressed in terms of an indefinite integral: $\int f(x) \, dx = F(x) + C$. In particular:

General Power Rule: $\int u^p \, du = \dfrac{u^{p+1}}{p+1} + C$, if $p \ne -1$
General Logarithm Rule: $\int u^{-1} \, du = \ln |u| + C$
General Exponential Rule: $\int e^u \, du = e^u + C$

The corresponding definite integral $\int_a^b f(x) \, dx = F(b) - F(a)$ is commonly interpreted as the area under the graph of $y = f(x)$ in the interval $[a, b]$, though other interpretations do exist. Other quantities that can be interpreted as definite integrals include volume, surface area, arc length, amount of work done over a path, average value, probability, and many others.

Derivatives and integrals can generally be used to analyze the dynamical behavior of complex systems, for example, via differential equations of various types. Numerical methods are often used when explicit solutions are intractable.

Suggestions for improving this document? Send them to ifischer@wisc.edu.