Data Structures and Algorithm Analysis

Dr. Malek Mouhoub

Computer Science Department University of Regina Fall 2010

Malek Mouhoub, CS340 Fall 2010

1. Algorithm Analysis

• 1.1 Mathematics Review
• 1.2 Introduction to Algorithm Analysis
• 1.3 Asymptotic notation and Growth of functions
• 1.4 Case Study
• 1.5 Data Structures and Algorithm Analysis

1.1 Mathematics Review

Exponents

$X^A X^B = X^{A+B}$
$X^A / X^B = X^{A-B}$
$(X^A)^B = X^{AB}$
$X^N + X^N = 2X^N \neq X^{2N}$
$2^N + 2^N = 2^{N+1}$

Logarithms

• By default, logarithms used in this course are to the base 2.
• $X^A = B \Leftrightarrow \log_X B = A$
• $\log_A B = \dfrac{\log_C B}{\log_C A}$; $A, B, C > 0$, $A \neq 1$
• $\log AB = \log A + \log B$; $A, B > 0$
• $\log A/B = \log A - \log B$
• $\log (A^B) = B \log A$
• $\log X < X \ \forall X > 0$
• $\log 1 = 0$, $\log 2 = 1$, $\log 1{,}024 = 10$, $\log 1{,}048{,}576 = 20$

Summations

• $\sum_{i=1}^{N} a_i = a_1 + a_2 + \cdots + a_N$
• $\lim_{N \to \infty} \sum_{i=1}^{N} a_i = \sum_{i=1}^{\infty} a_i = a_1 + a_2 + \cdots$ (infinite sum)
• Linearity
  – $\sum_{i=1}^{N} (c a_i + d b_i) = c \sum_{i=1}^{N} a_i + d \sum_{i=1}^{N} b_i$
  – $\sum_{i=1}^{N} \Theta(f(i)) = \Theta\left(\sum_{i=1}^{N} f(i)\right)$
• General algebraic manipulations:
  – $\sum_{i=1}^{N} f(N) = N f(N)$
  – $\sum_{i=n_0}^{N} f(i) = \sum_{i=1}^{N} f(i) - \sum_{i=1}^{n_0 - 1} f(i)$

Summations

• Geometric series:
  – $\sum_{i=0}^{N} A^i = \dfrac{A^{N+1} - 1}{A - 1}$
  – if $0 < A < 1$ then $\sum_{i=0}^{N} A^i \le \dfrac{1}{1 - A}$
• Arithmetic series:
  – $\sum_{i=1}^{N} i = \dfrac{N(N+1)}{2} = \dfrac{N^2 + N}{2} \approx \dfrac{N^2}{2}$
  – $\sum_{i=1}^{N} i^2 = \dfrac{N(N+1)(2N+1)}{6} \approx \dfrac{N^3}{3}$
  – $\sum_{i=1}^{N} i^k \approx \dfrac{N^{k+1}}{|k+1|}$, $k \neq -1$
    ∗ if $k = -1$ then $H_N = \sum_{i=1}^{N} \dfrac{1}{i} \approx \log_e N$
    ∗ error in approx: $\gamma \approx 0.57721566$

Products

• $\prod_{i=1}^{N} a_i = a_1 \times a_2 \times \cdots \times a_N$
• $\log\left(\prod_{i=1}^{N} a_i\right) = \sum_{i=1}^{N} \log a_i$

Proving statements

• Proof by induction
  1. Proving a base case: establishing that a theorem is true for some small values.
  2. Inductive hypothesis: the theorem is assumed to be true for all cases up to some limit k.
  3. Given this assumption, show that the theorem is true for k + 1.
• Proof by counterexample: find an example showing that the theorem is not true.
• Proof by contradiction: assume that the theorem is false and show that this assumption implies that some known property is false, and hence the original assumption was erroneous.

1.2 Introduction to algorithm analysis

The boss gives the following problem to Mr Dupont, a freshly hired BSc in computer science (to test him . . . or maybe just for fun):

$T(1) = 3$
$T(2) = 10$
$T(n) = 2T(n-1) - T(n-2)$

What is $T(100)$?

Mr Dupont decides to directly code a recursive function tfn, in Java, to solve the problem:

    if (n==1) { return 3; }
    else if (n==2) { return 10; }
    else { return 2 * tfn(n-1) - tfn(n-2); }

1. First mistake: no analysis of the problem ⇒ risk of being fired!
2. Second mistake: bad choice of the programming language ⇒ increasing the risk of being fired!

    n = 1   → 3
    n = 2   → 10
    n = 3   → 17
    n = 35  → it takes 4.19 seconds
    n = 100 → waits . . . and then kills the program!

Mr Dupont then decides to use C:

    if (n==1) return 3;
    if (n==2) return 10;
    return 2 * tfn(n-1) - tfn(n-2);

    n = 35 → it takes only 1.25 seconds
    n = 50 → waits and then kills the program!

Finally, Mr Dupont decides to (experimentally) analyze the problem: he times both programs and plots the results.

[Figure: running time in seconds (logarithmic scale, 0.1 to 100) against N (25 to 40) for the Java and C programs.]

It seems that each time n increases by 1 the time increases by a factor of ≈ 1.62. At n = 40 the C program took 13.79 seconds, so for n = 100 he estimates:

$13.79 \times 1.62^{60}$ seconds ≈ 1,627,995 years!!!

• Mr Dupont remembers he has seen this kind of problem in one of the courses he has taken (data structures course).
• After consulting his course notes, Mr Dupont decides to use Dynamic Programming:

    int t[n+1];
    t[1]=3; t[2]=10;
    for (i=3; i<=n; i++)
        t[i] = 2*t[i-1] - t[i-2];
    return t[n];

Malek Mouhoub. • but for n = 10. 000 (a test that may make the boss happy .1. CS340 Fall 2010 14 . if it succeeds) a segmentation fault occurs. . Too much memory required.2 Introduction to algorithm analysis This solution provides a much better complexity in time but despite of the space complexity : • n = 100 takes only a fraction of a second. . 000.

• Mr Dupont analyses the problem again:
  – there is no reason to keep all the values, only the last 2:

    if (n==1) return 3;
    last = 3;
    current = 10;
    for (i=3; i<=n; i++) {
        temp = current;
        current = 2*current - last;
        last = temp;
    }
    return current;

  – At n = 100,000,000 it takes 3.00 seconds
  – At n = 200,000,000 it takes 5.99 seconds
  – At n = 300,000,000 it takes 8.99 seconds

How to solve such problems?
1. Analyze the problem on paper in order to find an efficient algorithm in terms of time and memory space complexity.
   (a) First look at the problem:
       T(1) = 3
       T(2) = 10
       T(3) = 17
       T(4) = 24
       T(5) = 31
       Each step increases the result by 7.
   (b) Guess: T(n) = 7n − 4
   (c) Proof by induction
2. Code:
       return 7*n-4
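The induction in step (c) is not written out on the slide. One way it might go, using the two preceding values as the inductive hypothesis, is sketched below.

```latex
\begin{align*}
\textbf{Claim: } & T(n) = 7n - 4 \text{ for all } n \ge 1.\\
\textbf{Base cases: } & T(1) = 7\cdot 1 - 4 = 3, \qquad T(2) = 7\cdot 2 - 4 = 10.\\
\textbf{Hypothesis } (k \ge 2)\textbf{: } & T(k) = 7k - 4 \text{ and } T(k-1) = 7(k-1) - 4 = 7k - 11.\\
\textbf{Step: } & T(k+1) = 2T(k) - T(k-1)\\
                & \phantom{T(k+1)} = 2(7k - 4) - (7k - 11)\\
                & \phantom{T(k+1)} = 7k + 3 = 7(k+1) - 4.
\end{align*}
```

In particular $T(100) = 7 \cdot 100 - 4 = 696$.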

An algorithm analyst might ask:
1. What makes the first program so slow?
2. How fast are the 3 programs asymptotically?
3. Is the last version really the ultimate solution?

Let us look at the recursion tree for the first program at n=4.

[Figure: recursion tree for tfn(4): the root 4 has children 3 and 2; node 3 has children 2 and 1.]

Here each circle represents one call to the routine tfn. So, for n=4 there are 5 such calls. In general, a call to tfn(n) requires a recursive call to tfn(n-1) (represented by the shaded region on the left) and a call to tfn(n-2) (shaded region on the right).

If we let f(n) represent the number of calls to compute T(n), then:

$f(n) = f(n-1) + f(n-2) + 1$
$f(1) = f(2) = 1$

This is a version of the famous Fibonacci recurrence. It is known that $f(n) \approx 1.618^n$. This agrees very well with the times we presented earlier, where each increase of n by 1 increases the time by a factor of a little under 1.62. We say such growth is exponential with asymptotic growth rate $O(1.618^n)$. This answers question (1).
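The call count can be checked experimentally. The sketch below is an illustration rather than part of the original slides: `tfnCounted` and `countCalls` are hypothetical names, the first instrumenting the naive recursion with a counter and the second evaluating the recurrence for f(n) directly.

```cpp
#include <cstdint>

// The naive recursion, instrumented so that every call increments `calls`.
long long tfnCounted(int n, std::uint64_t& calls) {
    ++calls;
    if (n == 1) return 3;
    if (n == 2) return 10;
    return 2 * tfnCounted(n - 1, calls) - tfnCounted(n - 2, calls);
}

// Evaluates f(n) = f(n-1) + f(n-2) + 1 with f(1) = f(2) = 1 iteratively.
std::uint64_t countCalls(int n) {
    if (n <= 2) return 1;             // base cases make exactly one call
    std::uint64_t prev = 1, cur = 1;  // f(1) and f(2)
    for (int i = 3; i <= n; ++i) {
        std::uint64_t next = cur + prev + 1;
        prev = cur;
        cur = next;
    }
    return cur;
}
```

For n = 4 both give 5 calls, matching the recursion tree; for n = 10 both give 109.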

In the second and third program there was a loop

    for (i=3; i<=n; i++)

This loop contained two or three assignments, a multiplication and a subtraction. We say such a loop takes O(n) time. This means that running time is proportional to n. Recall that increasing n from 100 million to 300 million increased the time from approximately 3 to approximately 9 seconds.

The last program has one multiplication and one subtraction and takes O(1) or constant time. This answers question (2).

• The answer to the last question is also NO. If the boss asked for T(123456789879876543215566340014733134213) we would get integer overflow on most computers.
• Switching to a floating point representation would be of no value since we need to maintain all the significant digits in our results.
• The only alternative is to use a method to represent and manipulate large integers.

• A natural way to represent a large integer is to use an array of integers, where each array slot stores one digit.
• The addition and subtraction require a linear-time algorithm.
• A simple algorithm for multiplication requires a quadratic-time cost.
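As an illustration of the linear-time claim, here is a minimal sketch of addition on digit arrays. The layout (decimal digits, least-significant digit first) and the name `addDigits` are assumptions made for this example; the slides only say that each slot stores one digit.

```cpp
#include <cstddef>
#include <vector>

// Adds two non-negative integers stored one decimal digit per slot,
// least-significant digit first (assumed layout). A single pass with a
// carry, so the time is linear in the length of the longer operand.
std::vector<int> addDigits(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> sum;
    int carry = 0;
    for (std::size_t i = 0; i < a.size() || i < b.size() || carry != 0; ++i) {
        int d = carry;
        if (i < a.size()) d += a[i];
        if (i < b.size()) d += b[i];
        sum.push_back(d % 10);  // digit kept in this slot
        carry = d / 10;         // carry moved to the next slot
    }
    return sum;
}
```

With this layout, 999 is stored as {9, 9, 9}, and adding {1} yields {0, 0, 0, 1}, that is, 1000. The schoolbook multiplication would call a routine like this once per digit of the multiplier, which is where the quadratic cost comes from.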

The third question, “is the last program the ultimate solution”, is more of a computer science question. A Computer Scientist might ask:
1. How do you justify counting function calls in the first case, counting array assignments in the second case, counting variable assignments in the third, and counting arithmetic operations in the last?
2. Is it really true that you can multiply two arbitrarily large numbers together in constant time?
3. Is the last program really the ultimate one?

CS questions with an engineering orientation:
• What general techniques can we use to solve computational problems?
• What data structures are best, and in what situations?
• Which models should we use to analyze algorithms in practice?
• When trying to improve the efficiency of a given program, which aspects should we focus on first?

1.3 Asymptotic notation and Growth of functions

• Running time of an algorithm almost always depends on the amount of input: more input means more time. Thus the running time, T, is a function of the amount of input, N, or T(N) = f(N) where N is in general a natural number.
• The exact value of the function depends on:
  – the speed of the machine,
  – the quality of the compiler and optimizer,
  – the quality of the program that implements the algorithm,
  – the basic fundamentals of the algorithm.
• Typically, the last item is most important.

Worst-case versus Average-case
• Worst-case running time is a bound over all inputs of a certain size N. (Guarantee)
• Average-case running time is an average over all inputs of a certain size N. (Prediction)

Θ-notation

For a given function g(n), we denote by Θ(g(n)) the set of functions

$\Theta(g(n)) = \{\, f(n) : \exists\ \text{positive constants } c_1, c_2, \text{ and } n_0 \text{ such that } 0 \le c_1 g(n) \le f(n) \le c_2 g(n) \text{ for all } n \ge n_0 \,\}$

We say that g(n) is an asymptotically tight bound for f(n).

Example: The worst-case running time of insertion sort is $T(n) = \Theta(n^2)$.

Ω-notation

For a given function g(n), we denote by Ω(g(n)) the set of functions:

$\Omega(g(n)) = \{\, f(n) : \exists\ \text{positive constants } c \text{ and } n_0 \text{ such that } 0 \le c\,g(n) \le f(n) \text{ for all } n \ge n_0 \,\}$

We say that g(n) is an asymptotic lower bound for f(n).

Big-Oh notation

For a given function g(n), we denote by O(g(n)) the set of functions:

$O(g(n)) = \{\, f(n) : \exists\ \text{positive constants } c \text{ and } n_0 \text{ such that } 0 \le f(n) \le c\,g(n) \text{ for all } n \ge n_0 \,\}$

We say that g(n) is an asymptotic upper bound for f(n).

Note that O-notation is, in general, used informally to describe asymptotically tight bounds (Θ-notation).

Big-Oh notation
• Constant: c.
• Logarithmic: dominant term is some constant times $\log N$.
• Linear: dominant term is some constant times $N$. We say $O(N)$.
• $O(N \log N)$: dominant term is some constant times $N \log N$.
• Quadratic: dominant term is some constant times $N^2$. We say $O(N^2)$.
• Cubic: dominant term is some constant times $N^3$. We say $O(N^3)$.
• Exponential: dominant term is some constant times $2^N$.

Note: Big-Oh ignores leading constants.

Dominant Term Matters
• Suppose we estimate $35N^2 + N + N^3$ with $N^3$.
• For N = 10000:
  – Actual value is 1,003,500,010,000
  – Estimate is 1,000,000,000,000
  – Error in estimate is 0.35%, which is negligible.
• For large N, dominant term is usually indicative of algorithm’s behavior.
• For small N, dominant term is not necessarily indicative of behavior, BUT, typically programs on small inputs run so fast we don’t care anyway.

Example 1: Computing the Minimum
• Minimum item in an array
  – Given an array of N items, find the smallest.
• Obvious algorithm is sequential scan.
• Running time is O(N) (linear) because we repeat a fixed amount of work for each element in the array.
• A linear algorithm is as good as we can hope for because we have to examine every element in the array, a process that requires linear time.
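A minimal sketch of the sequential scan follows. The function name `findMin` is hypothetical, and the array is assumed to be non-empty.

```cpp
#include <cstddef>
#include <vector>

// Sequential scan: one comparison per element, hence O(N) time.
// Precondition (assumed): a is not empty.
int findMin(const std::vector<int>& a) {
    int best = a[0];
    for (std::size_t i = 1; i < a.size(); ++i)
        if (a[i] < best) best = a[i];
    return best;
}
```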

Example 2: Closest Points
• Closest Points in the Plane
  – Given N points in a plane (that is, an x-y coordinate system), find the pair of points that are closest together.
• Fundamental problem in graphics.
• Solution: Calculate the distance between each pair of points, and retain the minimum distance.
• N(N − 1)/2 pairs of points, so the algorithm is quadratic.
• Better algorithms that use more subtle observations are known.
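The quadratic solution can be sketched as below. The `Point` struct and the function name are assumptions for this example, and the function returns the squared distance so that no square root is needed.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y; };

// Examines all N(N-1)/2 pairs and keeps the smallest squared distance.
// Returns +infinity when fewer than two points are given.
double closestPairSquared(const std::vector<Point>& pts) {
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < pts.size(); ++i)
        for (std::size_t j = i + 1; j < pts.size(); ++j) {
            double dx = pts[i].x - pts[j].x;
            double dy = pts[i].y - pts[j].y;
            double d2 = dx * dx + dy * dy;
            if (d2 < best) best = d2;
        }
    return best;
}
```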

Example 3: Co-linear Points in the Plane
• Co-linear points in the plane
  – Given N points in the plane, determine if any three form a straight line.
• Important in graphics: co-linear points introduce nasty degenerate cases that require special handling.
• Solution: enumerate all groups of three points; for each possible triplet of points, check if the points are co-linear. This is a cubic algorithm.
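A sketch of the cubic enumeration is given below. Integer coordinates are assumed so that the cross-product test is exact; `Pt`, `collinear` and `anyThreeCollinear` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Pt { long long x, y; };

// a, b, c lie on one line exactly when the cross product of (b - a) and
// (c - a) is zero. Exact for integer coordinates (assumed here).
bool collinear(const Pt& a, const Pt& b, const Pt& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x) == 0;
}

// Enumerates every triplet i < j < k: three nested loops, O(N^3) time.
bool anyThreeCollinear(const std::vector<Pt>& p) {
    for (std::size_t i = 0; i < p.size(); ++i)
        for (std::size_t j = i + 1; j < p.size(); ++j)
            for (std::size_t k = j + 1; k < p.size(); ++k)
                if (collinear(p[i], p[j], p[k])) return true;
    return false;
}
```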

1.4 Case Study

• Examine a problem with several different solutions.
  – Will look at four algorithms
  – Some algorithms much easier to code than others.
  – Some algorithms much easier to prove correct than others.
  – Some algorithms much, much faster (or slower) than others.

The problem
• Maximum Contiguous Subsequence Sum Problem
  – Given (possibly negative) integers $A_1, A_2, \ldots, A_N$, find (and identify the sequence corresponding to) the maximum value of $(A_i + A_{i+1} + \cdots + A_j)$.
• The maximum contiguous subsequence sum is zero if all the integers are negative.
• Examples:
  – -2, 11, -4, 13, -4, 2
  – 1, -3, 4, -2, -1, 6

Brute Force Algorithm

    int MaxSubSum1(const vector<int> & A)
    {
        int MaxSum = 0;
        for (int i=0; i<A.size(); i++)
            for (int j=i; j<A.size(); j++)
            {
                int ThisSum = 0;
                for (int k=i; k<=j; k++)
                    ThisSum += A[k];
                if (ThisSum > MaxSum)
                    MaxSum = ThisSum;
            }
        return MaxSum;
    }

Analysis
• Loop of size N inside of loop of size N inside of loop of size N means $O(N^3)$, or cubic algorithm.
• Slight over-estimate that results from some loops being of size less than N is not important.

Actual Running time
• For N = 100, actual time is 0.47 seconds on a particular computer.
• Can use this to estimate time for larger inputs:
    $T(N) = cN^3$
    $T(10N) = c(10N)^3 = 1000cN^3 = 1000\,T(N)$
• Input size increasing by a factor of 10 means that running time increases by a factor of 1,000.
• For N = 1000, estimate an actual time of 470 seconds. (Actual was 449 seconds).
• For N = 10,000, estimate 449000 seconds (a little over 5 days).

How to improve
• Remove a loop; not always possible.
• Here it is: innermost loop is unnecessary because it throws away information.
• ThisSum for next j is easily obtained from old value of ThisSum:
  – Need $A_i + A_{i+1} + \cdots + A_{j-1} + A_j$
  – Just computed $A_i + A_{i+1} + \cdots + A_{j-1}$
  – What we need is what we just computed $+ A_j$.

The Better Algorithm

    int MaxSubSum2(const vector<int> & A)
    {
        int MaxSum = 0;
        for (int i=0; i<A.size(); i++)
        {
            int ThisSum = 0;
            for (int j=i; j<A.size(); j++)
            {
                ThisSum += A[j];
                if (ThisSum > MaxSum)
                    MaxSum = ThisSum;
            }
        }
        return MaxSum;
    }

Analysis
• Same logic as before: now the running time is quadratic, or $O(N^2)$.
• As we will see, this algorithm is still usable for inputs in the tens of thousands.
• Recall that the cubic algorithm was not practical for this amount of input.

Actual running time
• For N = 100, actual time is 0.0111 seconds on the same particular computer.
• Can use this to estimate time for larger inputs:
    $T(N) = cN^2$
    $T(10N) = c(10N)^2 = 100cN^2 = 100\,T(N)$
• Input size increasing by a factor of 10 means that running time increases by a factor of 100.
• For N = 1000, estimate a running time of 1.11 seconds. (Actual was 1.12 seconds).
• For N = 10,000, estimate 111 seconds (= actual).

Linear Algorithms
• Linear algorithm would be best.
• Running time is proportional to amount of input. Hard to do better for an algorithm.
• If input increases by a factor of ten, then so does running time.

Recursive algorithm
• Use a divide-and-conquer approach.
• The maximum subsequence either
  – lies entirely in the first half
  – lies entirely in the second half
  – starts somewhere in the first half, goes to the last element in the first half, continues at the first element in the second half, ends somewhere in the second half.
• Compute all three possibilities, and use the maximum.
• First two possibilities easily computed recursively.

Computing the third case
• Idea:
  1. Find the largest sum in the first half that includes the last element in the first half.
  2. Find the largest sum in the second half that includes the first element in the second half.
  3. Add the 2 sums together.
• Implementation:
  – Easily done with two loops.
  – For maximum sum that starts in the first half and extends to the last element in the first half, use a right-to-left scan starting at the last element in the first half.
  – For the other maximum sum, do a left-to-right scan, starting at the first element in the second half.
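The slides do not show code for the recursive algorithm. The sketch below is one possible implementation of the three cases described above; the names `maxSumRec` and `maxSubSum3` are assumptions, chosen to follow the numbering of the other versions.

```cpp
#include <algorithm>
#include <vector>

// Maximum contiguous subsequence sum of a[left..right], divide and conquer.
int maxSumRec(const std::vector<int>& a, int left, int right) {
    if (left == right)                     // base case: a single element
        return a[left] > 0 ? a[left] : 0;  // an all-negative input yields 0

    int center = left + (right - left) / 2;
    int maxLeft = maxSumRec(a, left, center);        // case 1: first half
    int maxRight = maxSumRec(a, center + 1, right);  // case 2: second half

    // Case 3, left part: right-to-left scan from the last element of the first half.
    int leftBorder = 0, sum = 0;
    for (int i = center; i >= left; --i) {
        sum += a[i];
        leftBorder = std::max(leftBorder, sum);
    }
    // Case 3, right part: left-to-right scan from the first element of the second half.
    int rightBorder = 0;
    sum = 0;
    for (int j = center + 1; j <= right; ++j) {
        sum += a[j];
        rightBorder = std::max(rightBorder, sum);
    }
    return std::max({maxLeft, maxRight, leftBorder + rightBorder});
}

// Driver: an empty input has maximum sum 0.
int maxSubSum3(const std::vector<int>& a) {
    return a.empty() ? 0 : maxSumRec(a, 0, static_cast<int>(a.size()) - 1);
}
```

On the two example inputs from the problem statement this returns 20 (from 11, -4, 13) and 7 (from 4, -2, -1, 6).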

Analysis
• Let T(N) = the time for an algorithm to solve a problem of size N.
• Then T(1) = 1 (1 will be the quantum time unit; constants don’t matter).
• T(N) = 2T(N/2) + N
  – Two recursive calls, each of size N/2. The time to solve each recursive call is T(N/2) by the above definition.
  – Case three takes O(N) time; we use N, because we will throw out the constants eventually.

Bottom Line

    T(1)  = 1                                = 1 * 1
    T(2)  = 2 * T(1) + 2   = 4   = 2 * 2     = 2^1 * 2
    T(4)  = 2 * T(2) + 4   = 12  = 4 * 3     = 2^2 * 3
    T(8)  = 2 * T(4) + 8   = 32  = 8 * 4     = 2^3 * 4
    T(16) = 2 * T(8) + 16  = 80  = 16 * 5    = 2^4 * 5
    T(32) = 2 * T(16) + 32 = 192 = 32 * 6    = 2^5 * 6
    T(64) = 2 * T(32) + 64 = 448 = 64 * 7    = 2^6 * 7
    T(N)  = 2^k * (k + 1)  = N (1 + log N)   = O(N log N),  where N = 2^k
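The pattern in the table can be checked mechanically. The following sketch, with hypothetical function names, evaluates the recurrence and the closed form for powers of two.

```cpp
#include <cstdint>

// T(N) = 2 T(N/2) + N with T(1) = 1, for N a power of two.
std::uint64_t T(std::uint64_t n) {
    return n == 1 ? 1 : 2 * T(n / 2) + n;
}

// Closed form N (1 + log2 N), also for N a power of two.
std::uint64_t closedForm(std::uint64_t n) {
    std::uint64_t k = 0;
    for (std::uint64_t m = n; m > 1; m /= 2) ++k;  // k = log2 N
    return n * (k + 1);
}
```

Both functions return 448 for N = 64, and they agree for every power of two small enough that the result fits in 64 bits.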

N log N
• Any recursive algorithm that solves two half-sized problems and does linear non-recursive work to combine/split these solutions will always take O(N log N) time because the above analysis will always hold.
• This is a very significant improvement over quadratic.
• It is still not as good as O(N), but is not that far away either. There is a linear-time algorithm for this problem. The running time is clear, but the correctness is non-trivial.

1. thisSum = 0.4 Case Study The Linear-time algorithm /** Linear-time maximum contiguous subsequence. /* 4*/ if( thisSum > maxSum ) /* 5*/ maxSum = thisSum.} /* 8*/ return maxSum. /* 2*/ for( int j = 0. CS340 Fall 2010 50 . sum algorithm */ int maxSubSum4( const vector<int> & a ) { /* 1*/ int maxSum = 0. /* 6*/ else if( thisSum < 0 ) /* 7*/ thisSum = 0. j++ ) { /* 3*/ thisSum += a[ j ].} Malek Mouhoub.size( ). j < a.

The Logarithm
• Formal Definition
  – For any $B, N > 0$, $\log_B N = K$ if $B^K = N$
  – If the base B is omitted, it defaults to 2 in computer science.
• Examples:
  – $\log 32 = 5$ (because $2^5 = 32$)
  – $\log 1024 = 10$
  – $\log 1048576 = 20$
  – $\log$ 1 billion = about 30
• The logarithm grows much more slowly than $N$, and slower than $\sqrt{N}$.

Examples of the Logarithm

Bits in a binary number: how many bits are required to represent N consecutive integers?

Repeated doubling: starting from X = 1, how many times should X be doubled before it is at least as large as N?

Repeated halving: starting from X = N, if N is repeatedly halved, how many iterations must be applied to make N smaller than or equal to 1? (Halving rounds up.)

Answer to all of the above is log N (rounded up).

Why log N
• B bits represent $2^B$ integers. Thus $2^B$ is at least as big as N, so B is at least log N. Since B must be an integer, round up if needed.
• Same logic for the other examples.
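The doubling and halving counts can be computed directly, as in the sketch below. The function names are hypothetical, and `doublings` assumes N is at most 2^63 so that X does not overflow.

```cpp
#include <cstdint>

// Repeated doubling: how many times X = 1 must be doubled until X >= N.
// Assumes n <= 2^63 so that x never overflows.
int doublings(std::uint64_t n) {
    int count = 0;
    for (std::uint64_t x = 1; x < n; x *= 2) ++count;
    return count;
}

// Repeated halving, rounding up: how many halvings until N <= 1.
int halvings(std::uint64_t n) {
    int count = 0;
    while (n > 1) {
        n = (n + 1) / 2;  // halve, rounding up
        ++count;
    }
    return count;
}
```

Both return log N rounded up: 10 for N = 1000 and N = 1024, and 11 for N = 1025. The same number is the count of bits needed to represent N consecutive integers.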

Repeated Halving Principle
• An algorithm is O(log N) if it takes constant time to reduce the problem size by a constant fraction (which is usually 1/2).
• Reason: there will be log N iterations of constant work.

Static Searching
• Given an integer X and an array A, return the position of X in A or an indicator that it is not present. If X occurs more than once, return any occurrence. The array A is not altered.
• If input array is not sorted, solution is to use a sequential search. Running times:
  – Unsuccessful search: O(N), every item is examined.
  – Successful search:
    ∗ Worst case: O(N), every item is examined.
    ∗ Average case: about N/2 items are examined, which is still O(N).
• Can we do better if we know the array is sorted?

Binary Search
• Yes! Use a binary search.
• Look in the middle:
  Case 1: If X is less than the item in the middle, then look in the subarray to the left of the middle.
  Case 2: If X is greater than the item in the middle, then look in the subarray to the right of the middle.
  Case 3: If X is equal to the item in the middle, then we have a match.
  Base Case: If the subarray is empty, X is not found.
• This is logarithmic by the repeated halving principle.
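The three cases and the base case map directly onto a loop. The following is a minimal sketch, assuming the array is sorted in increasing order and using -1 as the "not present" indicator (both are choices made for this example).

```cpp
#include <vector>

// Returns an index of x in the sorted vector a, or -1 if x is not present.
int binarySearch(const std::vector<int>& a, int x) {
    int low = 0;
    int high = static_cast<int>(a.size()) - 1;
    while (low <= high) {                  // base case: empty subarray ends the loop
        int mid = low + (high - low) / 2;
        if (x < a[mid])
            high = mid - 1;                // case 1: continue in the left subarray
        else if (x > a[mid])
            low = mid + 1;                 // case 2: continue in the right subarray
        else
            return mid;                    // case 3: match
    }
    return -1;
}
```

Each iteration does constant work and halves the remaining range, so the repeated halving principle gives O(log N).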

Binary Search Continued
• Binary search is an example of a data structure implementation:
  – Insert: O(N) time per operation, because we must insert and maintain the array in sorted order.
  – Delete: O(N) time per operation, because we must slide elements that are to the right of the deleted element over one spot to maintain contiguity.
  – Find: O(log N) time per operation, via binary search.
• In this course we examine different data structures. Generally we allow Insert, Delete, and Find, but Find and Delete are usually restricted.

1.5 Data Structures and Algorithm Analysis

[Figure: examples of structures: a scalar item, a sequential vector, a linked list, an n-dimensional space, a hierarchical tree.]

• The most important property to express of any entity in a system is its type.
• In this course we use entities that are structured objects (e.g. an object that is a collection of other objects).
• When determining its type, the kinds of distinguishing properties include:
  Ordering: are elements ordered or unordered? If ordering matters, is the order partial or total? Are elements removed FIFO (queues), LIFO (stacks), or by priority (priority queues)?
  Duplicates: are duplicates allowed?
  Boundedness: is the object bounded in size or unbounded? Can the bound change or is it fixed at creation time?
  Associative access: are elements retrieved by an index or key? Is the type of the index built-in (e.g. as for sequences and arrays) or user-definable (e.g. as for symbol tables and hash tables)?
  Shape: is the structure of the object linear, hierarchical, acyclic, n-dimensional, or arbitrarily complex (e.g. graphs, forests)?

Abstract Data Type (ADT)
• Set of data together with a set of operations.
• Definition of an ADT: [what to do?]
  – Definition of data and the set of operations (functions).
• Implementation of an ADT: [how to do it?]
  – How are the objects and operations implemented.
  ⇒ use the C++ class.

Array Implementation of Lists
• Contiguous allocation of memory to store the elements of the list.
• Estimation (overestimation) of the maximum size of the list is required ⇒ waste of memory space.
• O(N) for find, constant time for findKth.
• But O(N) is required for insertion and deletion in the worst case.
  ⇒ building a list by N successive inserts would require $O(N^2)$ in the worst case.

[Figure 1: Contiguous allocation of memory to store the elements of the list. The list 34, 12, 52, 16, 22 occupies positions 1 to 5: findKth(3) = 52; find(52) = 3; remove(34) leaves 12, 52, 16, 22; insert(1, 34) gives 34, 12, 52, 16, 22 again. Costs: findKth = List[Kth] is O(1); find(X) is O(n); removeKth is O(n); remove(X) is O(n); insert(kth, X) is O(n).]

Linked Lists
• Non contiguous allocation of memory.
• Constant time for insertion and deletion (once the position in the list is known).
• O(N) for find.
• O(N) for findKth (but better time in practice if the calls to findKth are in sorted order by the argument).
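These costs can be seen in a minimal singly linked list sketch. The `Node` struct and the function names are assumptions for this example, not the course's class design.

```cpp
#include <cstddef>

struct Node {
    int data;
    Node* next;
};

// Constant-time insertion: only two links change, no elements are moved.
// Precondition (assumed): pos and fresh are valid nodes.
void insertAfter(Node* pos, Node* fresh) {
    fresh->next = pos->next;
    pos->next = fresh;
}

// findKth (1-based) follows k - 1 links from the head, so it costs O(k).
// Returns nullptr when k is 0 or the list has fewer than k nodes.
Node* findKth(Node* head, std::size_t k) {
    if (k == 0) return nullptr;
    Node* cur = head;
    for (std::size_t i = 1; cur != nullptr && i < k; ++i) cur = cur->next;
    return cur;
}
```

Starting from the list 34, 52, 16, 22 and inserting 12 after the head gives 34, 12, 52, 16, 22, after which findKth(head, 3) reaches the node holding 52.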

[Figure 2: Non contiguous allocation of memory. Parallel Data and Link arrays with slots 1 to 8 store the list 34, 12, 52, 16, 22, with Head giving the slot of the first element. Costs: printList() is O(n); find(x) is O(n); findKth(i) is O(i).]