
dynamic programming

Definition: Solve an optimization problem by caching subproblem solutions (memoization) rather than recomputing them.

optimization problem
Definition: A computational problem in which the object is to find the best of all possible solutions. More formally, find a solution in the feasible region which has the minimum (or maximum) value of the objective function. Note: An optimization problem asks, what is the best solution? A decision problem asks, is there a solution with a certain characteristic? For instance, the traveling salesman problem is an optimization problem, while the corresponding decision problem asks if there is a Hamiltonian cycle with a cost less than some fixed amount k. From Algorithms and Theory of Computation Handbook, pages 29-20 and 34-17, Copyright © 1999 by CRC Press LLC. Appearing in the Dictionary of Computer Science, Engineering and Technology, Copyright © 2000 CRC Press LLC.

First Example
Let's begin with a simple capital budgeting problem. A corporation has $5 million to allocate to its three plants for possible expansion. Each plant has submitted a number of proposals on how it intends to spend the money. Each proposal gives the cost of the expansion (c) and the total revenue expected (r). The following table gives the proposals generated:

Table 1: Investment Possibilities

Each plant will only be permitted to enact one of its proposals. The goal is to maximize the firm's revenues resulting from the allocation of the $5 million. We will assume that any of the $5 million we don't spend is lost (you can work out how a more reasonable assumption will change the problem as an exercise).

A straightforward way to solve this is to try all possibilities and choose the best. In this case, there are only 3 × 4 × 2 = 24 ways of allocating the money. Many of these are infeasible (for instance, proposals 3, 4, and 1 for the three plants cost $6 million). Other allocations are feasible, but very poor (like proposals 1, 1, and 2, which is feasible but returns only $4 million).

Here are some disadvantages of total enumeration:
1. For larger problems the enumeration of all possible solutions may not be computationally feasible.
2. Infeasible combinations cannot be detected a priori, leading to inefficiency.
3. Information about previously investigated combinations is not used to eliminate inferior, or infeasible, combinations.

Note also that this problem cannot be formulated as a linear program, for the revenues returned are not linear functions.

One method of calculating the solution is as follows. Let's break the problem into three stages: each stage represents the money allocated to a single plant. So stage 1 represents the money allocated to plant 1, stage 2 the money to plant 2, and stage 3 the money to plant 3. We will artificially place an ordering on the stages, saying that we will first allocate to plant 1, then plant 2, then plant 3.

Each stage is divided into states. A state encompasses the information required to go from one stage to the next. In this case the states for stages 1, 2, and 3 are
• {0,1,2,3,4,5}: the amount of money spent on plant 1, represented as $x_1$,
• {0,1,2,3,4,5}: the amount of money spent on plants 1 and 2 ($x_2$), and
• {5}: the amount of money spent on plants 1, 2, and 3 ($x_3$).

Unlike linear programming, the state variables $x_1$, $x_2$, and $x_3$ do not represent decision variables: they are simply representations of a generic state in the stage. Associated with each state is a revenue. Note that to make a decision at stage 3, it is only necessary to know how much was spent on plants 1 and 2, not how it was spent. Also notice that we will want $x_3$ to be 5.

Let's try to figure out the revenues associated with each state. The only easy possibility is in stage 1, the states $x_1$. Table 2 gives the revenue associated with $x_1$.

Table 2: Stage 1 computations.

We are now ready to tackle the computations for stage 2. In this case, we want to find the best solution for both plants 1 and 2. If we want to calculate the best revenue for a given $x_2$, we simply go through all the plant 2 proposals, allocate the given amount of funds to plant 2, and use the above table to see how plant 1 will spend the remainder.

For instance, suppose we want to determine the best allocation for state $x_2 = 4$. In stage 2 we can do one of the following proposals:
1. Proposal 1 gives revenue of 0, leaves 4 for stage 1, which returns 6. Total: 6.
2. Proposal 2 gives revenue of 8, leaves 2 for stage 1, which returns 6. Total: 14.
3. Proposal 3 gives revenue of 9, leaves 1 for stage 1, which returns 5. Total: 14.
4. Proposal 4 gives revenue of 12, leaves 0 for stage 1, which returns 0. Total: 12.

The best thing to do with four units is proposal 2 for plant 2 and proposal 3 for plant 1, returning 14, or proposal 3 for plant 2 and proposal 2 for plant 1, also returning 14. In either case, the revenue for being in state $x_2 = 4$ is 14. The rest of Table 3 can be filled out similarly.

Table 3: Stage 2 computations.

We can now go on to stage 3. The only value we are interested in is $x_3 = 5$. Once again, we go through all the proposals for this stage, determine the amount of money remaining and use Table 3 to decide the value for the previous stages. So here we can do the following at plant 3:
• Proposal 1 gives revenue 0, leaves 5. Previous stages give 17. Total: 17.
• Proposal 2 gives revenue 4, leaves 4. Previous stages give 14. Total: 18.

Therefore, the optimal solution is to implement proposal 2 at plant 3, proposal 2 or 3 at plant 2, and proposal 3 or 2 (respectively) at plant 1. This gives a revenue of 18.

If you study this procedure, you will find that the calculations are done recursively. Stage 2 calculations are based on stage 1, stage 3 only on stage 2. Indeed, given you are at a state, all future decisions are made independent of how you got to the state. This is the principle of optimality and all of dynamic programming rests on this assumption.

We can sum up these calculations in the following formulas. Denote by $r(k_j)$ the revenue for proposal $k_j$ at stage j, and by $c(k_j)$ the corresponding cost. Let $f_j(x_j)$ be the revenue of state $x_j$ in stage j. Then we have the following calculations

$$f_1(x_1) = \max_{k_1 : c(k_1) \le x_1} \{ r(k_1) \}$$

and

$$f_j(x_j) = \max_{k_j : c(k_j) \le x_j} \{ r(k_j) + f_{j-1}(x_j - c(k_j)) \} \quad \text{for } j = 2, 3.$$

All we were doing with the above calculations was determining these functions.

The computations were carried out in a forward procedure. It was also possible to calculate things from the ``last'' stage back to the first stage. We could define
• $y_1$ = amount allocated to stages 1, 2, and 3,
• $y_2$ = amount allocated to stages 2 and 3, and
• $y_3$ = amount allocated to stage 3.

This defines a backward recursion. Graphically, this is illustrated in Figure 1.

Figure 1: Forward vs. Backward Recursion

Corresponding formulas are:
• Let $f_3(y_3)$ be the optimal revenue for stage 3, given $y_3$,
• $f_2(y_2)$ be the optimal revenue for stages 2 and 3, given $y_2$, and
• $f_1(y_1)$ be the optimal revenue for stages 1, 2, and 3, given $y_1$.

The recursion formulas are:

$$f_3(y_3) = \max_{k_3 : c(k_3) \le y_3} \{ r(k_3) \}$$

and

$$f_j(y_j) = \max_{k_j : c(k_j) \le y_j} \{ r(k_j) + f_{j+1}(y_j - c(k_j)) \} \quad \text{for } j = 1, 2.$$

If you carry out the calculations, you will come up with the same answer.

You may wonder why I have introduced backward recursion, particularly since the forward recursion seems more natural. In this particular case, the ordering of the stages made no difference. In other cases, though, there may be computational advantages of choosing one over another. In general, the backward recursion has been found to be more effective in most applications. Therefore, in the future, I will be presenting only the backward recursion, except in cases where I wish to contrast the two recursions.
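As a rough check on the capital budgeting numbers, the stage-by-stage recursion can be sketched in Python. The proposal data are an assumption: Table 1 is not reproduced here, so the costs and revenues are inferred from the worked stage computations in the text. The names `PROPOSALS` and `solve` are illustrative only.

```python
# (cost, revenue) of each proposal, in $ millions, one list per plant.
# Assumption: values inferred from the worked stage computations in the text.
PROPOSALS = [
    [(0, 0), (1, 5), (2, 6)],            # plant 1
    [(0, 0), (2, 8), (3, 9), (4, 12)],   # plant 2
    [(0, 0), (1, 4)],                    # plant 3
]
BUDGET = 5


def solve(plants, budget):
    """Return f, where f[x] is the best revenue obtainable from x units of
    money spread over the given plants (unspent money is lost)."""
    f = [0] * (budget + 1)                # no plants considered yet
    for proposals in plants:              # one stage per plant
        f = [
            max(r + f[x - c] for c, r in proposals if c <= x)
            for x in range(budget + 1)
        ]
    return f


# Plants taken in the order 1, 2, 3 (forward) and 3, 2, 1 (backward).
print(solve(PROPOSALS, BUDGET)[BUDGET])        # 18
print(solve(PROPOSALS[::-1], BUDGET)[BUDGET])  # 18
```

Processing the plants in either order gives the same optimum, which matches the remark that the ordering of the stages makes no difference in this example.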

A second example

Dynamic programming may look somewhat familiar. Both our shortest path algorithm and our method for CPM project scheduling have a lot in common with it. Let's look at a particular type of shortest path problem. Suppose we wish to get from A to J in the road network of Figure 2.

Figure 2: Road Network

The numbers on the arcs represent distances. Due to the special structure of this problem, we can break it up into stages. Stage 1 contains node A, stage 2 contains nodes B, C, and D, stage 3 contains nodes E, F, and G, stage 4 contains H and I, and stage 5 contains J. The states in each stage correspond just to the node names. So stage 3 contains states E, F, and G.

If we let S denote a node in stage j and let $f_j(S)$ be the shortest distance from node S to the destination J, we can write

$$f_j(S) = \min_{Z \text{ in stage } j+1} \{ c_{SZ} + f_{j+1}(Z) \}$$

where $c_{SZ}$ denotes the length of arc SZ. This gives the recursion needed to solve this problem. We begin by setting $f_5(J) = 0$. Here are the rest of the calculations:

Stage 4.

During stage 4, there are no real decisions to make: you simply go to your destination J. So you get:
• $f_4(H) = 3$ by going to J,
• $f_4(I) = 4$ by going to J.

Stage 3.

Here there are more choices. Here's how to calculate $f_3(F)$. From F you can either go to H or I. The immediate cost of going to H is 6. The following cost is $f_4(H) = 3$. The total is 9. The immediate cost of going to I is 3. The following cost is $f_4(I) = 4$ for a total of 7. Therefore, if you are ever at F, the best thing to do is to go to I. The total cost is 7, so $f_3(F) = 7$.

The next table gives all the calculations:

You now continue working back through the stages one by one, each time completely computing a stage before continuing to the preceding one. The results are:

Stage 2.

Stage 1.
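The backward recursion for the road network can be sketched as follows. Only the four arc lengths quoted in the text (F to H, F to I, H to J, I to J) are used, since Figure 2 is not reproduced here; the dictionary `ARCS` and the function `shortest` are hypothetical names for this illustration.

```python
from functools import lru_cache

# Arc lengths quoted in the text; the rest of the network is omitted
# because Figure 2 is not available.
ARCS = {
    "F": {"H": 6, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
    "J": {},
}


@lru_cache(maxsize=None)
def shortest(node, dest="J"):
    """Return (distance, next_node) for the shortest route from node to dest."""
    if node == dest:
        return 0, None
    # Immediate arc length plus the best distance from the next node onward.
    return min(
        (length + shortest(nxt, dest)[0], nxt)
        for nxt, length in ARCS[node].items()
    )


print(shortest("F"))  # (7, 'I')
```

Memoizing on the node name is what makes each state get solved once, no matter how many earlier nodes lead into it.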

Common Characteristics

There are a number of characteristics that are common to these two problems and to all dynamic programming problems. These are:
1. The problem can be divided into stages with a decision required at each stage. In the capital budgeting problem the stages were the allocations to a single plant. The decision was how much to spend. In the shortest path problem, they were defined by the structure of the graph. The decision was where to go next.
2. Each stage has a number of states associated with it. The states for the capital budgeting problem corresponded to the amount spent at that point in time. The states for the shortest path problem were the nodes reached.
3. The decision at one stage transforms one state into a state in the next stage. The decision of how much to spend gave a total amount spent for the next stage. The decision of where to go next defined where you arrived in the next stage.
4. Given the current state, the optimal decision for each of the remaining states does not depend on the previous states or decisions. In the budgeting problem, it is not necessary to know how the money was spent in previous stages, only how much was spent. In the path problem, it was not necessary to know how you got to a node, only that you did.
5. There exists a recursive relationship that identifies the optimal decision for stage j, given that stage j+1 has already been solved.
6. The final stage must be solvable by itself.

The last two properties are tied up in the recursive relationships given above.

The big skill in dynamic programming, and the art involved, is to take a problem and determine stages and states so that all of the above hold. If you can, then the recursive relationship makes finding the values relatively easy. Because of the difficulty in identifying stages and states, we will do a fair number of examples.

The Knapsack Problem

The knapsack problem is a particular type of integer program with just one constraint. Each item that can go into the knapsack has a size and a benefit. The knapsack has a certain capacity. What should go into the knapsack so as to maximize the total benefit?

As an example, suppose we have three items as shown in Table 4, and suppose the capacity of the knapsack is 5.

Table 4: Knapsack Items

The stages represent the items: we have three stages j=1,2,3. The state $y_j$ at stage j represents the total weight of items j and all following items in the knapsack. The decision at stage j is how many items j to place in the knapsack. Call this value $k_j$.

This leads to the following recursive formulas. Let $f_j(y_j)$ be the value of using $y_j$ units of capacity for items j and following, and let $w_j$ and $b_j$ be the weight and benefit of item j. Let $\lfloor a \rfloor$ represent the largest integer less than or equal to a.
• $f_3(y_3) = b_3 \lfloor y_3 / w_3 \rfloor$
• $f_j(y_j) = \max_{k_j \le \lfloor y_j / w_j \rfloor} \{ b_j k_j + f_{j+1}(y_j - w_j k_j) \}$

An Alternative Formulation

There is another formulation for the knapsack problem. This illustrates how arbitrary our definitions of stages, states, and decisions are. It also points out that there is some flexibility on the rules for dynamic programming. Our definitions required a decision at a stage to take us to the next stage (which we would already have calculated through backwards recursion). In fact, it could take us to any stage we have already calculated. This gives us a bit more flexibility in our calculations.

The recursion I am about to present is a forward recursion. For a knapsack problem, let the stages be indexed by w, the weight filled. The decision is to determine the last item added to bring the weight to w. There is just one state per stage. Let g(w) be the maximum benefit that can be gained from a w pound knapsack. Continuing to use $w_j$ and $b_j$ as the weight and benefit, respectively, for item j, the following relates g(w) to previously calculated g values:

$$g(w) = \max_{j : w_j \le w} \{ b_j + g(w - w_j) \}$$
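The forward recursion for g(w) can be sketched in a few lines of Python. The item data are partly an assumption: items 1 and 3 follow from the worked values in the text (g(1) = 30 and the optimum of 160 from two of item 1 plus one of item 3), while the weight and benefit of item 2 are a placeholder, because Table 4 is not reproduced here. The names `ITEMS`, `knapsack`, and `contents` are illustrative.

```python
# item number -> (weight, benefit).
# Assumption: item 2 is a placeholder; items 1 and 3 are inferred from the text.
ITEMS = {1: (2, 65), 2: (3, 80), 3: (1, 30)}


def knapsack(capacity, items=ITEMS):
    """Return (g, last): g[w] is the best benefit for a w pound knapsack and
    last[w] is the item added last to reach it (None if nothing fits)."""
    g = [0] * (capacity + 1)
    last = [None] * (capacity + 1)
    for w in range(1, capacity + 1):
        for j, (wj, bj) in items.items():
            if wj <= w and bj + g[w - wj] > g[w]:
                g[w] = bj + g[w - wj]
                last[w] = j
    return g, last


def contents(capacity, last, items=ITEMS):
    """Walk the 'last item added' pointers back down to an empty knapsack."""
    picked, w = [], capacity
    while w > 0 and last[w] is not None:
        picked.append(last[w])
        w -= items[last[w]][0]
    return sorted(picked)


g, last = knapsack(5)
print(g[5], contents(5, last))  # 160 [1, 1, 3]
```

Ties are broken in favour of the lowest item number here, which is one arbitrary choice among the equally good alternatives.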

Intuitively, to fill a w pound knapsack, we must end off by adding some item. If we add item j, we end up with a knapsack of size $w - w_j$ to fill. To illustrate on the above example:
• g(0) = 0
• g(1) = 30, add item 3.
• g(2) = 65, add item 1.
• g(3) = 95, add item 1 or 3.
• g(4) = 130, add item 1.
• g(5) = 160, add item 1 or 3.

This gives a maximum of 160, which is gained by adding 2 of item 1 and 1 of item 3.

Equipment Replacement

In the network homework, you already saw how to formulate and solve an equipment replacement problem using a shortest path algorithm. Let's look at an alternative dynamic programming formulation.

Suppose a shop needs to have a certain machine over the next five year period. Each new machine costs $1000. The cost of maintaining the machine during its ith year of operation is $m_i$, for i = 1, 2, 3. A machine may be kept up to three years before being traded in. The trade in value after i years is $s_i$, for i = 1, 2, 3. How can the shop minimize costs over the five year period?

Let the stages correspond to each year. The state is the age of the machine for that year. The decisions are whether to keep the machine or trade it in for a new one. Let $f_t(x)$ be the minimum cost incurred from time t to time 5, given the machine is x years old in time t.

Since we have to trade in at time 5, $f_5(x) = -s_x$.

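Now consider other time periods, writing $m_i$ for the maintenance cost in the ith year of operation and $s_i$ for the trade in value after i years. If you have a three year old machine in time t, you must trade in, so

$$f_t(3) = -s_3 + 1000 + m_1 + f_{t+1}(1)$$

If you have a two year old machine, you can either trade or keep.
• Trade costs you $-s_2 + 1000 + m_1 + f_{t+1}(1)$.
• Keep costs you $m_3 + f_{t+1}(3)$.

So the best thing to do with a two year old machine is the minimum of the two. Similarly

$$f_t(1) = \min \{ -s_1 + 1000 + m_1 + f_{t+1}(1), \; m_2 + f_{t+1}(2) \}$$

Finally, at time zero, we have to buy, so

$$f_0(0) = 1000 + m_1 + f_1(1)$$

This is solved with backwards recursion as follows:

Stage 5.

Stage 4.

Stage 3.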
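The keep-or-trade recursion can be sketched with a memoized function. The maintenance costs and trade in values used here are assumed for illustration, since the figures are not given in the text; they were picked because they reproduce the total cost of 1280 that the text reports. `MAINT`, `SALVAGE`, and `min_cost` are hypothetical names.

```python
from functools import lru_cache

NEW_COST = 1000
# Assumed values (not given in the text); they reproduce the reported 1280.
MAINT = {1: 60, 2: 80, 3: 120}       # maintenance cost in the i-th year of use
SALVAGE = {1: 800, 2: 600, 3: 500}   # trade in value after i years
HORIZON, MAX_AGE = 5, 3


@lru_cache(maxsize=None)
def min_cost(t, x):
    """Minimum cost from time t to HORIZON with a machine that is x years old."""
    if t == HORIZON:
        return -SALVAGE[x]                        # forced trade in at the end
    if x == 0:
        return NEW_COST + MAINT[1] + min_cost(t + 1, 1)   # time zero: must buy
    trade = -SALVAGE[x] + NEW_COST + MAINT[1] + min_cost(t + 1, 1)
    if x == MAX_AGE:
        return trade                              # too old to keep
    keep = MAINT[x + 1] + min_cost(t + 1, x + 1)
    return min(trade, keep)


print(min_cost(0, 0))  # 1280
```

With these assumed figures, trading and keeping tie at several states, which is consistent with the remark that there are several optimal solutions.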

Stage 2.

Stage 1.

Stage 0.

So the cost is 1280, and one solution is to trade in years 1 and 2. There are other optimal solutions.

The Traveling Salesperson Problem

We have seen that we can solve one type of integer programming (the knapsack problem) with dynamic programming. Let's try another. The traveling salesperson problem is to visit a number of cities in the minimum distance. For instance, a politician begins in New York and has to visit Miami, Dallas, and Chicago before returning to New York. How can she minimize the distance traveled? The distances are as in Table 5.

Table 5: TSP example problem.

The real problem in solving this is to define the stages, states, and decisions. One natural choice is to let stage t represent visiting t cities, and let the decision be where to go next. That leaves us with states. Imagine we chose the city we are in to be the state. We could not make the decision where to go next, for we do not know where we have gone before. Instead, the state has to include information about all the cities visited, plus the city we ended up in. So a state is represented by a pair (i,S) where S is the set of t cities already visited and i is the last city visited (so i must be in S). This turns out to be enough to get a recursion.

Write $c_{ij}$ for the distance from city i to city j, $f_t(i,S)$ for the shortest distance needed to finish the tour from state (i,S), and NY, M, D, C for New York, Miami, Dallas, and Chicago. The stage 3 calculations are
• $f_3(M, \{M,D,C\}) = c_{M,NY}$
• $f_3(D, \{M,D,C\}) = c_{D,NY}$
• $f_3(C, \{M,D,C\}) = c_{C,NY}$

For other stages, the recursion is

$$f_t(i,S) = \min_{j \ne NY, \; j \notin S} \{ c_{ij} + f_{t+1}(j, S \cup \{j\}) \}$$

You can continue with these calculations.

One important aspect of this problem is the so called curse of dimensionality. The state space here is so large that it becomes impossible to solve even moderate size problems. For instance, suppose there are 20 cities. The number of states in the 10th stage is more than a million. For 30 cities, the number of states in the 15th stage is more than a billion. And for 100 cities, the number of states at the 50th stage is more than 5,000,000,000,000,000,000,000,000,000,000. This is not the sort of problem that will go away as computers get better.

Nonadditive Recursions

Not every recursion must be additive. Here is one example where we multiply to get the recursion.

A student is currently taking three courses. It is important that he not fail all of them.

If the probability of failing French is $p_1$, the probability of failing English is $p_2$, and the probability of failing Statistics is $p_3$, then the probability of failing all of them is $p_1 p_2 p_3$. He has left himself with four hours to study. How should he minimize his probability of failing all his courses? The following gives the probability of failing each course given he studies for a certain number of hours on that subject, as shown in Table 6. (What kind of student is this?)

Table 6: Student failure probabilities.

We let stage 1 correspond to studying French, stage 2 for English, and stage 3 for Statistics. The state will correspond to the number of hours studying for that stage and all following stages. Let $f_t(x)$ be the probability of failing t and all following courses, assuming x hours are available. Denote the entries in the above table as $p_t(k)$, the probability of failing course t given k hours are spent on it.

The final stage is easy:

$$f_3(x) = p_3(x)$$

The recursion is as follows:

$$f_t(x) = \min_{0 \le k \le x} \{ p_t(k) \, f_{t+1}(x - k) \}$$

We can now solve this recursion:

Stage 3.

Stage 2.
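A multiplicative recursion of this kind can be sketched as follows. The failure probabilities are assumed for illustration, because Table 6 is not reproduced here; they are consistent with the result stated in the text (one hour on French, three on Statistics, about 29%). `P` and `min_fail_all` are hypothetical names.

```python
# Assumed failure probabilities P[course][hours studied]; Table 6 is not
# available, so these values are illustrative only.
P = {
    "French":     [0.80, 0.70, 0.65, 0.62, 0.60],
    "English":    [0.75, 0.70, 0.67, 0.65, 0.62],
    "Statistics": [0.90, 0.70, 0.60, 0.55, 0.50],
}


def min_fail_all(courses, hours):
    """Return (probability, plan): the smallest probability of failing every
    course, and the hours given to each course in the listed order."""
    last = courses[-1]
    # Final stage: all remaining hours go to the last course.
    f = {x: (P[last][x], (x,)) for x in range(hours + 1)}
    for course in reversed(courses[:-1]):
        # Probabilities multiply, so the recursion is a product, not a sum.
        f = {
            x: min(
                (P[course][k] * f[x - k][0], (k,) + f[x - k][1])
                for k in range(x + 1)
            )
            for x in range(hours + 1)
        }
    return f[hours]


prob, plan = min_fail_all(["French", "English", "Statistics"], 4)
print(round(prob, 5), plan)  # 0.28875 (1, 0, 3)
```

The only structural change from the additive examples is the `*` in place of `+` when combining the current stage with the remaining stages.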

So, the optimum way of dividing time between studying English and Statistics is to spend it all on Statistics.

Stage 1.

The overall optimal strategy is to spend one hour on French, and three on Statistics. The probability of failing all three courses is about 29%.

Stochastic Dynamic Programming

In deterministic dynamic programming, given a state and a decision, both the immediate payoff and next state are known. If we know either of these only as a probability function, then we have a stochastic dynamic program. The basic ideas of determining stages, states, decisions, and recursive formulae still hold: they simply take on a slightly different form.

Uncertain Payoffs

Consider a supermarket chain that has purchased 6 gallons of milk from a local dairy. The chain must allocate the 6 gallons to its three stores. If a store sells a gallon of milk, then the chain receives revenue of $2. Any unsold milk is worth just $.50. Unfortunately, the demand for milk is uncertain, and is given in the following table:

The goal of the chain is to maximize the expected revenue from these 6 gallons. (This is not the only possible objective, but a reasonable one.)

Note that this is quite similar to some of our previous resource allocation problems: the only difference is that the revenue is not known for certain. We can, however, determine an expected revenue for each allocation of milk to a store. For instance, the value of allocating 2 gallons to store 1 is:

We can do this for all allocations to get the following values:

We have changed what looked to be a stochastic problem into a deterministic one! We simply use the above expected values. The resulting problem is identical to our previous resource allocation problems. We have a stage for each store. The states for stage 3 are the number of gallons given to store 3 (0, 1, 2, 3, 4, 5, 6), the states for stage 2 are the number of gallons given to stores 2 and 3 (0, 1, 2, 3, 4, 5, 6) and the state for stage 1 is the number of gallons given to stores 1, 2, and 3 (6).

The decision at stage i is how many gallons to give to store i. If we let the above table be represented by $v_i(k)$ (the value of giving k gallons to store i), then the recursive formulae are

$$f_3(x) = v_3(x)$$

$$f_i(x) = \max_{0 \le k \le x} \{ v_i(k) + f_{i+1}(x - k) \}$$

If you would like to work out the values, you should get a valuation of $9.75, with one solution assigning 1 gallon to store 1, 3 gallons to store 2 and 2 gallons to store 3.

Uncertain States

A more interesting use of uncertainty occurs when the state that results from a decision is uncertain. For example, consider the following coin tossing game: a coin will be tossed 4 times. Before each toss, you can wager $0, $1, or $2 (provided you have sufficient funds). You begin with $1, and your objective is to maximize the probability you have $5 at the end of the coin tosses.

We can formulate this as a dynamic program as follows: create a stage for the decision point before each flip of the coin, and a ``final'' stage, representing the result of the final coin flip. There is a state in each stage for each possible amount you can have. For stage 1, the only state is ``1''; for each of the others, you can set it to ``0,1,2,3,4,5'' (of course, some of these states are not possible, but there is no sense in worrying too much about that). Now, if we are in stage i and bet k and we have x dollars, then with probability .5 we will have x-k dollars, and with probability .5 we will have x+k dollars next period. Let $f_i(x)$ be the probability of ending up with at least $5 given we have $x before the ith coin flip. This gives us the following recursion:

$$f_5(x) = \begin{cases} 1 & \text{if } x \ge 5 \\ 0 & \text{otherwise} \end{cases}$$

$$f_i(x) = \max_{k} \{ .5 f_{i+1}(x - k) + .5 f_{i+1}(x + k) \}$$

Note that the next state is not known for certain, but is a probabilistic mixing of states. We can still easily determine $f_4$ from $f_5$, and $f_3$ from $f_4$, and so on back to $f_1$.

Another example comes from the pricing of stock options. Suppose we have the option to buy Netscape stock at $150. We can exercise this option anytime in the next 10 days (american option, rather than a european option that could only be exercised 10 days from now). The current price of Netscape is $140. We have a model of Netscape stock movement that predicts the following: on each day, the stock will go up by $2 with probability .4, stay the same with probability .1 and go down by $2 with probability .5.
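The coin tossing recursion described earlier in this section is small enough to evaluate directly. The sketch below assumes a fair coin (the text uses probability .5 for each outcome) and treats holding $5 or more after the last toss as success; `TOSSES`, `TARGET`, `BETS`, and `win_prob` are illustrative names.

```python
from functools import lru_cache

TOSSES, TARGET, BETS = 4, 5, (0, 1, 2)


@lru_cache(maxsize=None)
def win_prob(i, x):
    """Best achievable probability of holding at least TARGET dollars after
    the last toss, given x dollars just before toss i."""
    if i > TOSSES:                       # the "final" stage
        return 1.0 if x >= TARGET else 0.0
    # The next state is a 50/50 mix of x - k and x + k for each allowed bet k.
    return max(
        0.5 * win_prob(i + 1, x - k) + 0.5 * win_prob(i + 1, x + k)
        for k in BETS
        if k <= x
    )


print(win_prob(1, 1))  # 0.1875
```

Because the value of each state is an expectation over next states, the function returns a decision value for every reachable (toss, amount) pair, not just for one path.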

Note that the overall trend is downward (probably counterfactual, of course). The value of the option if we exercise it at price x is x-150 (we will only exercise at prices above 150).

We can formulate this as a stochastic dynamic program as follows: we will have stage i for each day i, just before the exercise or keep decision. The state for each stage will be the stock price of Netscape on that day. Let $f_i(x)$ be the expected value of the option on day i given that the stock price is x. Then, the optimal decision is given by:

$$f_i(x) = \max \{ x - 150, \; .4 f_{i+1}(x+2) + .1 f_{i+1}(x) + .5 f_{i+1}(x-2) \}$$

and

$$f_{10}(x) = \max \{ 0, \; x - 150 \}$$

Given the size of this problem, it is clear that we should use a spreadsheet to do the calculations.

There is one major difference between stochastic dynamic programs and deterministic dynamic programs: in the latter, the complete decision path is known. In a stochastic dynamic program, the actual decision path will depend on the way the random aspects play out. Because of this, ``solving'' a stochastic dynamic program involves giving a decision rule for every possible state, not just along an optimal path.

``Linear'' decision making

Many decision problems (and some of the most frustrating ones) involve choosing one out of a number of choices where future choices are uncertain. For example, when getting (or not getting!) a series of job offers, you may have to make a decision on a job before knowing if another job is going to be offered to you. Here is a simplification of these types of problems:

Suppose we are trying to find a parking space near a restaurant. This restaurant is on a long stretch of road, and our goal is to park as close to the restaurant as possible. There are T spaces leading up to the restaurant, one spot right in front of the restaurant, and T after the restaurant as follows:

Each spot can either be full (with probability, say, .9) or empty (.1). As we pass a spot, we need to make a decision to take the spot or try for another (hopefully better) spot. The value for parking in spot t is $-|t|$. If we do not get a spot, then we slink away in embarrassment at large cost M. What is our optimal decision rule?

We can have a stage for each spot t. The states in each stage are either e (for empty) or o (for occupied). The decision is whether to park in the spot or not (cannot if state is o). If we let $f_t(e)$ and $f_t(o)$ be the values for each state, then we have:

$$f_t(o) = .9 f_{t+1}(o) + .1 f_{t+1}(e)$$

$$f_t(e) = \max \{ -|t|, \; .9 f_{t+1}(o) + .1 f_{t+1}(e) \}$$

$$f_T(o) = -M, \qquad f_T(e) = -T$$

In general, the optimal rule will look something like, take the first empty spot on or after spot t (where t will be negative).

Dynamic Programming
http://www.sbc.su.se/~per/molbioinfo2001/dynprog/dynamic.html

The following is an example of global sequence alignment using Needleman/Wunsch techniques. For this example, the two sequences to be globally aligned are

G A A T T C A G T T A (sequence #1)
G G A T C G A (sequence #2)

So M = 11 and N = 7 (the length of sequence #1 and sequence #2, respectively)

A simple scoring scheme is assumed where
• Si,j = 1 if the residue at position i of sequence #1 is the same as the residue at position j of sequence #2 (match score); otherwise
• Si,j = 0 (mismatch score)
• w = 0 (gap penalty)

Three steps in dynamic programming
1. Initialization
2. Matrix fill (scoring)
3. Traceback (alignment)

Initialization Step

The first step in the global alignment dynamic programming approach is to create a matrix with M + 1 columns and N + 1 rows where M and N correspond to the size of the sequences to be aligned. Since this example assumes there is no gap opening or gap extension penalty, the first row and first column of the matrix can be initially filled with 0.

Matrix Fill Step

One possible (inefficient) solution of the matrix fill step finds the maximum global alignment score by starting in the upper left hand corner in the matrix and finding the maximal score Mi,j for each position in the matrix. In order to find Mi,j for any i,j it is enough to know the score for the matrix positions to the left, above and diagonal to i,j. In terms of matrix positions, it is necessary to know Mi-1,j, Mi,j-1 and Mi-1,j-1.

For each position, Mi,j is defined to be the maximum score at position i,j; i.e.

Mi,j = MAXIMUM[
    Mi-1,j-1 + Si,j (match/mismatch in the diagonal),
    Mi,j-1 + w (gap in sequence #1),
    Mi-1,j + w (gap in sequence #2)]

Note that in the example, Mi-1,j-1 will be red, Mi,j-1 will be green and Mi-1,j will be blue.

Using this information, the score at position 1,1 in the matrix can be calculated. Since the first residue in both sequences is a G, S1,1 = 1, and by the assumptions stated at the beginning, w = 0. Thus, M1,1 = MAX[M0,0 + 1, M1,0 + 0, M0,1 + 0] = MAX[1, 0, 0] = 1.

A value of 1 is then placed in position 1,1 of the scoring matrix.

Since the gap penalty (w) is 0, the rest of row 1 and column 1 can be filled in with the value 1. Take the example of row 1. At column 2, the value is the max of 0 (for a mismatch), 0 (for a vertical gap) or 1 (horizontal gap). The rest of row 1 can be filled out similarly until we get to column 8. At this point, there is a G in both sequences (light blue). Thus, the value for the cell at row 1 column 8 is the maximum of 1 (for a match), 0 (for a vertical gap) or 1 (horizontal gap). The value will again be 1. The rest of row 1 and column 1 can be filled with 1 using the above reasoning.

Now let's look at column 2. The location at row 2 will be assigned the value of the maximum of 1 (mismatch), 1 (horizontal gap) or 1 (vertical gap). So its value is 1.

At the position column 2 row 3, there is an A in both sequences. Thus, its value will be the maximum of 2 (match), 1 (horizontal gap), 1 (vertical gap) so its value is 2.

Moving along to position column 2 row 4, its value will be the maximum of 1 (mismatch), 1 (horizontal gap), 2 (vertical gap) so its value is 2. Note that for all of the remaining positions except the last one in column 2, the choices for the value will be the exact same as in row 4 since there are no matches. The final row will contain the value 2 since it is the maximum of 2 (match), 1 (horizontal gap) and 2 (vertical gap).

Using the same techniques as described for column 2, we can fill in column 3. After filling in all of the values the score matrix is as follows:

Traceback Step

After the matrix fill step, the maximum alignment score for the two test sequences is 6. The traceback step determines the actual alignment(s) that result in the maximum score. Note that with a simple scoring algorithm such as one that is used here, there are likely to be multiple maximal alignments.

The traceback step begins in the M,N position in the matrix (the bottom right cell), i.e. the position that leads to the maximal score. In this case, there is a 6 in that location.
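The matrix fill step can be sketched in Python as follows. This is a minimal illustration of the recurrence for Mi,j with the simple scoring scheme as the default; the function name `nw_fill` and its parameters are assumptions made for this example. The matrix is stored with one row per residue of sequence #2 and one column per residue of sequence #1, so the cell the text calls Mi,j is `M[j][i]`.

```python
def nw_fill(seq1, seq2, match=1, mismatch=0, gap=0):
    """Fill the score matrix. The first row and column stay at 0, as in the text."""
    cols, rows = len(seq1) + 1, len(seq2) + 1
    M = [[0] * cols for _ in range(rows)]
    for j in range(1, rows):
        for i in range(1, cols):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            M[j][i] = max(
                M[j - 1][i - 1] + s,   # diagonal: match/mismatch
                M[j - 1][i] + gap,     # cell above: gap in sequence #1
                M[j][i - 1] + gap,     # cell to the left: gap in sequence #2
            )
    return M


M = nw_fill("GAATTCAGTTA", "GGATCGA")
print(M[-1][-1])  # 6
```

Each cell depends only on three already computed neighbors, so the fill takes time proportional to the number of cells, (M + 1) × (N + 1).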

Traceback takes the current cell and looks to the neighbor cells that could be direct predecessors. This means it looks to the neighbor to the left (gap in sequence #2), the diagonal neighbor (match/mismatch), and the neighbor above it (gap in sequence #1). The algorithm for traceback chooses as the next cell in the sequence one of the possible predecessors. In this case, the neighbors are marked in red. They are all also equal to 5.

Since the current cell has a value of 6 and the scores are 1 for a match and 0 for anything else, the only possible predecessor is the diagonal match/mismatch neighbor. If more than one possible predecessor exists, any can be chosen. This gives us a current alignment of

(Seq #1) A
         |
(Seq #2) A

So now we look at the current cell and determine which cell is its direct predecessor. In this case, it is the cell with the red 5.

The alignment as described in the above step adds a gap to sequence #2, so the current alignment is

(Seq #1) T A
           |
(Seq #2) _ A

Once again, the direct predecessor produces a gap in sequence #2. After this step, the current alignment is

(Seq #1) T T A
             |
(Seq #2) _ _ A

Continuing on with the traceback step, we eventually get to a position in column 0 row 0 which tells us that traceback is completed. One possible maximum alignment is:

Giving an alignment of:

G A A T T C A G T T A
|   |   | |   |     |
G G A _ T C _ G _ _ A

An alternate solution is:

Giving an alignment of:

G _ A A T T C A G T T A
|     |   | |   |     |
G G _ A _ T C _ G _ _ A

There are more alternative solutions each resulting in a maximal global alignment score of 6. Since this is an exponential problem, most dynamic programming algorithms will only print out a single solution.

Advanced Dynamic Programming Tutorial

If you haven't looked at an example of a simple scoring scheme, please go to the simple dynamic programming example.

The following is an example of global sequence alignment using Needleman/Wunsch techniques. For this example, the two sequences to be globally aligned are

G A A T T C A G T T A (sequence #1)
G G A T C G A (sequence #2)

So M = 11 and N = 7 (the length of sequence #1 and sequence #2, respectively)

An advanced scoring scheme is assumed where
• Si,j = 2 if the residue at position i of sequence #1 is the same as the residue at position j of sequence #2 (match score); otherwise
• Si,j = -1 (mismatch score)
• w = -2 (gap penalty)

Initialization Step

The first step in the global alignment dynamic programming approach is to create a matrix with M + 1 columns and N + 1 rows where M and N correspond to the size of the sequences to be aligned.

The first row and first column of the matrix can be initially filled with 0.

Matrix Fill Step

One possible (inefficient) solution of the matrix fill step finds the maximum global alignment score by starting in the upper left hand corner in the matrix and finding the maximal score Mi,j for each position in the matrix. In order to find Mi,j for any i,j it is enough to know the score for the matrix positions to the left, above and diagonal to i,j. In terms of matrix positions, it is necessary to know Mi-1,j, Mi,j-1 and Mi-1,j-1.

For each position, Mi,j is defined to be the maximum score at position i,j; i.e.

Mi,j = MAXIMUM[
    Mi-1,j-1 + Si,j (match/mismatch in the diagonal),
    Mi,j-1 + w (gap in sequence #1),
    Mi-1,j + w (gap in sequence #2)]

Note that in the example, Mi-1,j-1 will be red, Mi,j-1 will be green and Mi-1,j will be blue.

Using this information, the score at position 1,1 in the matrix can be calculated. Since the first residue in both sequences is a G, S1,1 = 2, and by the assumptions stated earlier, w = -2. Thus, M1,1 = MAX[M0,0 + 2, M1,0 - 2, M0,1 - 2] = MAX[2, -2, -2].

A value of 2 is then placed in position 1,1 of the scoring matrix. Note that there is also an arrow placed back into the cell that resulted in the maximum score, M[0,0].

Moving down the first column to row 2, we can see that there is once again a match in both sequences. Thus, S[1,2] = 2. So

M[1,2] = MAX[M[0,1] + 2, M[1,1] - 2, M[0,2] - 2] = MAX[0 + 2, 2 - 2, 0 - 2] = MAX[2, 0, -2] = 2.

A value of 2 is then placed in position 1,2 of the scoring matrix and an arrow is placed to point back to M[0,1], which led to the maximum score.

Looking at column 1 row 3, there is not a match in the sequences, so S[1,3] = -1.

M[1,3] = MAX[M[0,2] - 1, M[1,2] - 2, M[0,3] - 2] = MAX[0 - 1, 2 - 2, 0 - 2] = MAX[-1, 0, -2] = 0.

A value of 0 is then placed in position 1,3 of the scoring matrix and an arrow is placed to point back to M[1,2], which led to the maximum score.

We can continue filling in the cells of the scoring matrix using the same reasoning. Eventually, we get to column 3 row 2. Since there is not a match in the sequences at this position, S[3,2] = -1.

M[3,2] = MAX[M[2,1] - 1, M[3,1] - 2, M[2,2] - 2] = MAX[0 - 1, -1 - 2, 1 - 2] = MAX[-1, -3, -1] = -1.

Note that in the above case, there are two different ways to get the maximum score. In such a case, pointers are placed back to all of the cells that can produce the maximum score.

The rest of the score matrix can then be filled in. The completed score matrix will be as follows:

[Figure: completed score matrix]

Traceback Step
After the matrix fill step, the maximum global alignment score for the two sequences is 3. The traceback step will determine the actual alignment(s) that result in the maximum score.

The traceback step begins in the M,N position in the matrix, i.e. the position where both sequences are globally aligned.

Since we have kept pointers back to all possible predecessors, the traceback step is simple. At each cell, we look to see where we move next according to the pointers. To begin, the only possible predecessor is the diagonal match.

This gives us an alignment of

A
|
A

Note that the blue letters and gold arrows indicate the path leading to the maximum score.

We can continue to follow the path using a single pointer until we get to the following situation. The alignment at this point is

T C A G T T A
| |   |     |
T C _ G _ _ A

Note that there are now two possible neighbors that could result in the current score. In such a case, one of the neighbors is arbitrarily chosen.
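The traceback described above can be sketched in Python. Instead of choosing one neighbour arbitrarily, this illustrative version follows every pointer, so it lists all maximal alignments. The names fill_matrix, all_alignments and alignment_score are assumptions made for this sketch, and the pointers are recomputed from the scores rather than stored.

```python
MATCH, MISMATCH, GAP = 2, -1, -2  # scoring scheme from the text


def fill_matrix(seq1, seq2):
    """Score matrix M[i][j], i over seq1 and j over seq2, zero first row/column."""
    M = [[0] * (len(seq2) + 1) for _ in range(len(seq1) + 1)]
    for i in range(1, len(seq1) + 1):
        for j in range(1, len(seq2) + 1):
            s = MATCH if seq1[i - 1] == seq2[j - 1] else MISMATCH
            M[i][j] = max(M[i - 1][j - 1] + s, M[i][j - 1] + GAP, M[i - 1][j] + GAP)
    return M


def all_alignments(seq1, seq2, M):
    """Yield (top, bottom) for every traceback path from the last cell to M[0][0]."""
    def walk(i, j, top, bottom):
        if i == 0 and j == 0:
            yield top[::-1], bottom[::-1]  # strings were built back to front
            return
        if i > 0 and j > 0:
            s = MATCH if seq1[i - 1] == seq2[j - 1] else MISMATCH
            if M[i][j] == M[i - 1][j - 1] + s:  # diagonal predecessor
                yield from walk(i - 1, j - 1, top + seq1[i - 1], bottom + seq2[j - 1])
        # On the zero-filled border the only way back is straight along it.
        if i > 0 and (j == 0 or M[i][j] == M[i - 1][j] + GAP):  # gap in sequence #2
            yield from walk(i - 1, j, top + seq1[i - 1], bottom + "_")
        if j > 0 and (i == 0 or M[i][j] == M[i][j - 1] + GAP):  # gap in sequence #1
            yield from walk(i, j - 1, top + "_", bottom + seq2[j - 1])

    yield from walk(len(seq1), len(seq2), "", "")


def alignment_score(top, bottom):
    """Re-score an alignment column by column: match, mismatch or gap."""
    total = 0
    for a, b in zip(top, bottom):
        if a == "_" or b == "_":
            total += GAP
        else:
            total += MATCH if a == b else MISMATCH
    return total


seq1, seq2 = "GAATTCAGTTA", "GGATCGA"
M = fill_matrix(seq1, seq2)
for top, bottom in all_alignments(seq1, seq2, M):
    print(top, bottom, alignment_score(top, bottom))
```

Enumerating every path is only practical for small examples like this one; as noted earlier, the number of alternative alignments can grow exponentially, which is why most programs report a single path.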

Once the traceback is completed, it can be seen that there are only two possible paths leading to a maximal global alignment.

One possible path gives an alignment of

G A A T T C A G T T A
|   |   | |   |     |
G G A _ T C _ G _ _ A

The other possible path gives an alignment of

G A A T T C A G T T A
|   | |   |   |     |
G G A T _ C _ G _ _ A

Remembering that the scoring scheme is +2 for a match, -1 for a mismatch, and -2 for a gap, both alignments can be tested to make sure that they result in a score of 3.

G A A T T C A G T T A
|   |   | |   |     |
G G A _ T C _ G _ _ A
+ - + - + + - + - - +
2 1 2 2 2 2 2 2 2 2 2

2 - 1 + 2 - 2 + 2 + 2 - 2 + 2 - 2 - 2 + 2 = 3

G A A T T C A G T T A
|   | |   |   |     |
G G A T _ C _ G _ _ A
+ - + + - + - + - - +
2 1 2 2 2 2 2 2 2 2 2

2 - 1 + 2 + 2 - 2 + 2 - 2 + 2 - 2 - 2 + 2 = 3

so both of these alignments do indeed result in the maximal alignment score.

Dynamic programming: an introduction
by David K. Smith and PASS Maths
http://plus.maths.org/issue3/dynamic/

In this issue's article "Mathematics, marriage and finding somewhere to eat", David Smith investigated the problem of finding the best potential partner from a fixed number of potential partners using a technique known as "optimal stopping". Inevitably, mathematicians and mathematical psychologists have constructed other models of the problem.

Measuring potential partners
An alternative way of looking at the problem assumes that you know that each potential partner has a score. Helen of Troy is fabled to have had "a face that could launch a thousand ships". There's an old joke that beauty can therefore be measured: one helen is the beauty needed to launch a thousand ships; one millihelen is therefore the beauty required to launch just one ship!

[Figure: Helen. Her affair with Paris, Prince of Troy, started the Trojan wars.]

Suppose we use this scale to measure each potential partner's score from 0 millihelens up to a maximum of 1000 millihelens, with all values equally likely. We must now search for a rule which will make sure that the average score of the partner we choose is as large as possible.

The obvious way to look at this would be to think about what you would do when you met the first potential partner. Then you would compare this partner's score with what you might expect to get later on if you rejected them. Unfortunately, working out what you might expect later on is a complicated mixture of all the possible decisions you could make, and that becomes too much to work out.

Dynamic programming
Instead, a mathematical way of thinking about it is to look at what you should do at the end, if you get to that stage. So you think about the best decision with the last potential partner (which you must choose) and then the last but one and so on. This way of tackling the problem backwards is dynamic programming.

Dynamic programming was the brainchild of an American mathematician, Richard Bellman, who described the way of solving problems where you need to find the best decisions one after another. The word "programming" in the name has nothing to do with writing computer programs. Mathematicians use the word to describe a set of rules which anyone can follow to solve a problem. They do not have to be written in a computer language. In the forty-odd years since this development, the number of uses and applications of dynamic programming has increased enormously.

For example, in 1982 David Kohler used dynamic programming to analyse the best way to play the game of darts. In darts, each player must score exactly 301, starting and finishing on a double. In his paper, published in the Journal of the Operational Research Society, he acknowledges "The Wheatsheaf at Writtle and the Norfolk Hotel in Nairobi for making research facilities available to me".

Solving the potential partner problem
The dynamic programming approach to the potential partner problem starts by thinking about what happens when faced with the last partner. If you need to make a decision about potential partner number N then you must accept their score (which we'll call X_N) and live happily ever after!

When you encounter potential partner number N-1, all that you know are the value X_{N-1} and what you expect to get if you wait. You expect to get the average value of X_N, and that will be 500 because X_N varies from 0 to 1000. Common sense says that you take the better of these, so your rule will be:

If X_{N-1} is more than 500, accept that potential partner.
If not, go on to potential partner number N.

When you encounter potential partner number N-2, you know X_{N-2} and the average value of the score you will get by waiting. Half the time, waiting will mean that you accept potential partner number N-1, whose score will be between 500 and 1000, averaging 750; the other half of the time, you will pass over that potential partner and you will expect a score of 500.
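The averaging argument used at each step can be written as a short backward loop. This is a sketch under the article's assumption that scores are uniformly spread between 0 and 1000; the function name critical_values and the variable names are our own, not from the article. At each stage the threshold for accepting a partner is simply the average score expected from waiting.

```python
def critical_values(n, top=1000.0):
    """Thresholds c[1..n]: accept partner k if their score is more than c[k].

    critical_values is a hypothetical name; scores are assumed uniform
    on 0..top, as in the article.
    """
    c = [0.0] * (n + 1)   # c[n] stays 0: the last partner must be accepted
    expected = top / 2    # average score of the last partner
    for k in range(n - 1, 0, -1):
        c[k] = expected                   # threshold = what waiting is expected to give
        p_accept = 1 - c[k] / top         # chance that partner k beats the threshold
        avg_if_accept = (c[k] + top) / 2  # average score of an accepted partner k
        # Expected result from stage k onwards: accept partner k, or wait again.
        expected = p_accept * avg_if_accept + (1 - p_accept) * c[k]
    return c[1:]


print(critical_values(3))  # [625.0, 500.0, 0.0]
```

With three partners the loop performs exactly the calculation in the text: half the time an average of 750, half the time an expected 500.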

So, waiting will give you an average score of 625, and taking the better will give the rule:

If X_{N-2} is more than 625, accept that potential partner.
If not, go on to potential partner number N-1.

And so on. For each potential partner that you meet, the best set of decisions afterwards will give a critical value for comparison, even though the future is not certain. If the potential partner does better than it, choose that partner. If not, go on.

The critical values when N=10 are:

[Table: critical values for N = 10]

One of the characteristics of dynamic programming is that the solution to smaller problems is built into that of larger ones. Thus, if you wanted to know the critical values when there are only 6 potential partners, all you need to do is look at the last 6 values in the table, 800, 775 and so on. It is much simpler than starting with potential partner number 1, trying to think of all the possible sequences of decisions, and working forwards.

Acknowledgements
This article was based on material written by Dr David K. Smith, Mathematical Statistics and Operational Research Department, University of Exeter.

Dynamic Programming Algorithm (DPA) for Edit-Distance
http://www.csse.

The words `computer' and `commuter' are very similar, and a change of just one letter, p->m, will change the first word into the second. The word `sport' can be changed into `sort' by the deletion of the `p', or equivalently, `sort' can be changed into `sport' by the insertion of `p'.

The edit distance of two strings, s1 and s2, is defined as the minimum number of point mutations required to change s1 into s2, where a point mutation is one of:
1. change a letter,
2. insert a letter or
3. delete a letter

The following recurrence relations define the edit distance, d(s1,s2), of two strings s1 and s2:

d('', '') = 0               -- '' = empty string
d(s, '') = d('', s) = |s|   -- i.e. length of s
d(s1+ch1, s2+ch2)
  = min( d(s1, s2) + if ch1=ch2 then 0 else 1 fi,
         d(s1+ch1, s2) + 1,
         d(s1, s2+ch2) + 1 )

The first two rules above are obviously true, so it is only necessary to consider the last one. Here, neither string is the empty string, so each has a last character, ch1 and ch2 respectively. Somehow, ch1 and ch2 have to be explained in an edit of s1+ch1 into s2+ch2. If ch1 equals ch2, they can be matched for no penalty, i.e. 0, and the overall edit distance is d(s1,s2). If ch1 differs from ch2, then ch1 could be changed into ch2, i.e. 1, giving an overall cost d(s1,s2)+1. Another possibility is to delete ch1 and edit s1 into s2+ch2, d(s1,s2+ch2)+1. The last possibility is to edit s1+ch1 into s2 and then insert ch2, d(s1+ch1,s2)+1. There are no other alternatives. We take the least expensive, i.e. min, of these alternatives.

The recurrence relations imply an obvious ternary-recursive routine. This is not a good idea because it is exponentially slow, and impractical for strings of more than a very few characters.

Examination of the relations reveals that d(s1,s2) depends only on d(s1',s2') where s1' is shorter than s1, or s2' is shorter than s2, or both. This allows the dynamic programming technique to be used.

A two-dimensional matrix, m[0..|s1|,0..|s2|], is used to hold the edit distance values:

m[i,j] = d(s1[1..i], s2[1..j])

m[0,0] = 0
m[i,0] = i,  i=1..|s1|
m[0,j] = j,  j=1..|s2|

m[i,j] = min(m[i-1,j-1]
             + if s1[i]=s2[j] then 0 else 1 fi,
             m[i-1, j] + 1,
             m[i, j-1] + 1 ),  i=1..|s1|, j=1..|s2|

m[,] can be computed row by row. Row m[i,] depends only on row m[i-1,].
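The matrix formulation above translates almost line for line into Python. The following is a minimal sketch, with the function name edit_distance chosen for illustration; because Python strings are indexed from zero, s1[i] in the relations becomes s1[i - 1] in the code.

```python
def edit_distance(s1, s2):
    """Edit distance by the full-matrix DPA; m[i][j] = d(s1[:i], s2[:j])."""
    m = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    for i in range(1, len(s1) + 1):
        m[i][0] = i  # delete all i characters of s1[:i]
    for j in range(1, len(s2) + 1):
        m[0][j] = j  # insert all j characters of s2[:j]
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            change = 0 if s1[i - 1] == s2[j - 1] else 1
            m[i][j] = min(
                m[i - 1][j - 1] + change,  # match or change a letter
                m[i - 1][j] + 1,           # delete a letter
                m[i][j - 1] + 1,           # insert a letter
            )
    return m[len(s1)][len(s2)]


print(edit_distance("computer", "commuter"))  # 1
print(edit_distance("sport", "sort"))         # 1
```

Both calls reproduce the single point mutations described at the start of the section.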

The time complexity of this algorithm is O(|s1|*|s2|). If s1 and s2 have a `similar' length, about `n' say, this complexity is O(n^2), much better than exponential!

© L. Allison

Complexity
The time-complexity of the algorithm is O(|s1|*|s2|), i.e. O(n^2) if the lengths of both strings are about `n'. The space-complexity is also O(n^2) if the whole of the matrix is kept for a trace-back to find an optimal alignment. If only the value of the edit distance is needed, only two rows of the matrix need be allocated; they can be "recycled", and the space complexity is then O(|s1|), i.e. O(n).

Variations
The costs of the point mutations can be varied to be numbers other than 0 or 1. Linear gap-costs are sometimes used where a run of insertions (or deletions) of length `x' has a cost of `ax+b', for constants `a' and `b'. If b>0, this penalises numerous short runs of insertions and deletions.

Longest Common Subsequence
The longest common subsequence (LCS) of two sequences, s1 and s2, is a subsequence of both s1 and of s2 of maximum possible length. The more alike that s1 and s2 are, the longer is their LCS.

Other Algorithms
There are faster algorithms for the edit distance problem, and for similar problems. Some of these algorithms are fast if certain conditions hold, e.g. the strings are similar, or dissimilar, or the alphabet is large, etc. Ukkonen (1983) gave an algorithm with worst case time complexity O(n*d), and the average complexity is O(n+d^2), where n is the length of the strings, and d is their edit distance. This is fast for similar strings where d is small, i.e. when d<<n.
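The two-row space saving mentioned under Complexity can be sketched as follows. This is an illustration only, with the name edit_distance_two_rows assumed; in this version each row has one entry per character of s2 plus one, so passing the shorter string as s2 keeps the rows short.

```python
def edit_distance_two_rows(s1, s2):
    """Edit distance keeping only the previous and current matrix rows."""
    n = len(s2)
    prev = list(range(n + 1))  # row m[0, .]: m[0][j] = j
    curr = [0] * (n + 1)
    for i in range(1, len(s1) + 1):
        curr[0] = i            # m[i][0] = i
        for j in range(1, n + 1):
            change = 0 if s1[i - 1] == s2[j - 1] else 1
            curr[j] = min(prev[j - 1] + change, prev[j] + 1, curr[j - 1] + 1)
        prev, curr = curr, prev  # recycle the two rows
    return prev[n]


print(edit_distance_two_rows("computer", "commuter"))  # 1
```

Only the distance survives; without the full matrix there is nothing to trace back through, so no alignment can be recovered from this version.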

Applications

File Revision
The Unix command diff f1 f2 finds the difference between files f1 and f2, producing an edit script to convert f1 into f2. If two (or more) computers share copies of a large file F, and someone on machine-1 edits F=F.bak, making a few changes, to give a revised file, it might be very expensive and/or slow to transmit the whole revised file to machine-2. However, running diff on F.bak and the revised file will give a small edit script which can be transmitted quickly to machine-2, where the local copy of the file can be updated to equal the revised file.

diff treats a whole line as a "character" and uses a special edit-distance algorithm that is fast when the "alphabet" is large and there are few chance matches between elements of the two strings (files). In contrast, there are many chance character-matches in DNA where the alphabet size is just 4, {A,C,G,T}.

Try `man diff' to see the manual entry for diff.

Remote Screen Update Problem
If a computer program on machine-1 is being used by someone from a screen on (distant) machine-2, e.g. via rlogin etc., then machine-1 may need to update the screen on machine-2 as the computation proceeds. One approach is for the program (on machine-1) to keep a "picture" of what the screen currently is (on machine-2) and another picture of what it should become. The differences can be found (by an algorithm related to edit-distance) and the differences transmitted, saving on transmission bandwidth.

Spelling Correction
Algorithms related to the edit distance may be used in spelling correctors. If a text contains a word, w, that is not in the dictionary, a `close' word, i.e. one with a small edit distance to w, may be suggested as a correction. Transposition errors are common in written text. A transposition can be treated as a deletion plus an insertion, but a simple variation on the algorithm can treat a transposition as a single point mutation.
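One way to treat a transposition as a single point mutation is sketched below. The text does not spell out which variation it has in mind; this sketch uses the one commonly called optimal string alignment (or restricted Damerau-Levenshtein) distance, and the function name is an assumption. It adds a fourth alternative to the usual minimum: swapping two adjacent letters for a cost of 1.

```python
def edit_distance_with_transpositions(s1, s2):
    """Edit distance where swapping two adjacent letters costs 1.

    Illustrative sketch of the optimal string alignment variant; it is
    not necessarily the variation intended by the original text.
    """
    m = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    for i in range(1, len(s1) + 1):
        m[i][0] = i
    for j in range(1, len(s2) + 1):
        m[0][j] = j
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            change = 0 if s1[i - 1] == s2[j - 1] else 1
            best = min(m[i - 1][j - 1] + change, m[i - 1][j] + 1, m[i][j - 1] + 1)
            # Extra case: the last two letters of each prefix are swapped.
            if (i > 1 and j > 1
                    and s1[i - 1] == s2[j - 2]
                    and s1[i - 2] == s2[j - 1]):
                best = min(best, m[i - 2][j - 2] + 1)
            m[i][j] = best
    return m[len(s1)][len(s2)]


print(edit_distance_with_transpositions("teh", "the"))  # 1
```

Under the plain edit distance, `teh' and `the' are 2 apart (a deletion plus an insertion, or two changes); with the extra case they are 1 apart, which suits a spelling corrector better.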

Plagiarism Detection
The edit distance provides an indication of similarity that might be too close in some situations ... think about it.

Molecular Biology
The edit distance gives an indication of how `close' two strings are. Similar measures are used to compute a distance between DNA sequences (strings over {A,C,G,T}), or protein sequences (over an alphabet of 20 amino acids), for various purposes, e.g.:
1. to find genes or proteins that may have shared functions or properties
2. to infer family relationships and evolutionary trees over different organisms

Example: an example of a DNA sequence from `Genebank' can be found [here]. The simple edit distance algorithm would normally be run on sequences of at most a few thousand bases.

Speech Recognition
Algorithms similar to those for the edit-distance problem are used in some speech recognition systems: find a close match between a new utterance and one in a library of classified utterances.

Notes
1. V. I. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. Doklady Akademii Nauk SSSR 163(4) p845-848, 1965; also Soviet Physics Doklady 10(8) p707-710, Feb 1966. Discovered the basic DPA for edit distance.
2. S. B. Needleman and C. D. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Jrnl Molec. Biol. 48 p443-453, 1970. Defined a similarity score on molecular-biology sequences, with an O(n^2) algorithm that is closely related to those discussed here.
3. Hirschberg (1975) presented a method of recovering an alignment (of an LCS) in O(n^2) time but in only linear, O(n)-space; see [here].
4. E. Ukkonen. On approximate string matching. Proc. Int. Conf. on Foundations of Comp. Theory, Springer-Verlag, LNCS 158 p487-495, 1983. Worst case O(nd)-time, average case O(n+d^2)-time algorithm for edit-distance, where d is the edit-distance between the two strings.
5. See also exact, as opposed to approximate, (sub-)string [matching].
6. More research information on "the" DPA and Bioinformatics [here].
7. If your programming language does not support 2-dimensional arrays, and requires arrays or strings to be indexed from zero upwards, some home-grown address translation will be needed to program the DPA defined above.

Exercises
1. Give a DPA for the longest common subsequence problem (LCS).
2. Modify the edit distance DPA so that it treats a transposition as a single point-mutation.
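The "home-grown address translation" mentioned in the notes can be sketched as follows. Python does support nested lists, so this is purely an illustration of the idea: the matrix is stored in one flat, zero-indexed list, and a small helper (here called at, a hypothetical name) maps the pair (i, j) to a single position.

```python
def edit_distance_flat(s1, s2):
    """Edit distance DPA using a single flat list instead of a 2-D matrix."""
    rows, cols = len(s1) + 1, len(s2) + 1
    m = [0] * (rows * cols)

    def at(i, j):
        # Address translation: row i starts at offset i * cols.
        return i * cols + j

    for i in range(rows):
        m[at(i, 0)] = i
    for j in range(cols):
        m[at(0, j)] = j
    for i in range(1, rows):
        for j in range(1, cols):
            # Strings are indexed from zero, so s1[i] in the relations is s1[i - 1].
            change = 0 if s1[i - 1] == s2[j - 1] else 1
            m[at(i, j)] = min(
                m[at(i - 1, j - 1)] + change,
                m[at(i - 1, j)] + 1,
                m[at(i, j - 1)] + 1,
            )
    return m[at(rows - 1, cols - 1)]


print(edit_distance_flat("computer", "commuter"))  # 1
```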