
AST20105

Data Structures & Algorithms


Chapter 3 - Design Paradigms and Complexity Analysis
Common Algorithm Design Paradigms
Brute-Force:

● a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem statement.
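As a minimal sketch (not from the original slides), the brute-force paradigm can be illustrated by checking every pair of array elements for one that sums to a target value; the function name and parameters are illustrative assumptions.

#include <utility>
#include <vector>

// Brute force: enumerate every candidate pair (i, j) and test it.
// Returns the first pair of indices whose values sum to target, or (-1, -1).
std::pair<int, int> pairSumBruteForce(const std::vector<int>& a, int target) {
    for (std::size_t i = 0; i < a.size(); i++) {
        for (std::size_t j = i + 1; j < a.size(); j++) {
            if (a[i] + a[j] == target)   // check whether this candidate satisfies the problem
                return {static_cast<int>(i), static_cast<int>(j)};
        }
    }
    return {-1, -1};                     // no candidate satisfies the problem statement
}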
Common Algorithm Design Paradigms
Divide and Conquer:

● is an algorithm design paradigm based on multi-branched recursion;
● a divide-and-conquer algorithm works by recursively breaking down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly.
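As a small illustrative sketch (not from the slides), a divide-and-conquer maximum: the range is split in half, each half is solved recursively, and the sub-results are combined. Names are assumptions made for illustration.

#include <algorithm>
#include <vector>

// Divide and conquer: find the maximum of a[lo..hi] by splitting the range in half.
int rangeMax(const std::vector<int>& a, int lo, int hi) {
    if (lo == hi)                              // base case: a single element is simple enough
        return a[lo];
    int mid = (lo + hi) / 2;                   // divide
    int leftMax  = rangeMax(a, lo, mid);       // conquer the left half
    int rightMax = rangeMax(a, mid + 1, hi);   // conquer the right half
    return std::max(leftMax, rightMax);        // combine
}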
Common Algorithm Design Paradigms
Greedy:

● follows the problem-solving heuristic of making the locally optimal choice at each stage (e.g., using the least number of coins to represent 7 cents);
● does not usually produce an optimal solution, but nonetheless a greedy heuristic may yield locally optimal solutions that approximate a globally optimal solution in a reasonable amount of time.
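A minimal greedy coin-change sketch (an added illustration, not taken from the slides): always pick the largest coin that still fits. With denominations 5, 2 and 1 it represents 7 cents as 5 + 2.

#include <vector>

// Greedy coin change: at each step take the largest denomination that still fits.
// Returns the number of coins used (optimal for this coin system, not for every coin system).
int greedyCoinCount(int amount, const std::vector<int>& coinsDescending) {
    int count = 0;
    for (int coin : coinsDescending) {
        while (amount >= coin) {   // locally optimal choice: take this coin again
            amount -= coin;
            count++;
        }
    }
    return count;                  // e.g. greedyCoinCount(7, {5, 2, 1}) == 2
}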
Common Algorithm Design Paradigms
Dynamic Programming:

● is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions using a memory-based data structure (array, map, etc).

int fib_nth_itr(int n){
    int fprev = 1, fcurr = 1, fnext = 1;
    for (int i = 2; i <= n; i++){
        fnext = fcurr + fprev; // fcurr and fprev are already known
        fprev = fcurr;
        fcurr = fnext;
    }
    return fnext;
}
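To make the "store each subproblem's solution" idea explicit, here is a hedged top-down (memoized) variant of the same Fibonacci computation; the cache parameter and the use of std::unordered_map are illustrative choices, not from the slides.

#include <unordered_map>

// Top-down dynamic programming: each fib(n) is computed once and then cached.
int fib_nth_memo(int n, std::unordered_map<int, int>& cache) {
    if (n <= 1) return 1;                       // same base values as the iterative version
    auto it = cache.find(n);
    if (it != cache.end()) return it->second;   // subproblem already solved: reuse it
    int value = fib_nth_memo(n - 1, cache) + fib_nth_memo(n - 2, cache);
    cache[n] = value;                           // store the solution for later reuse
    return value;
}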
Algorithm (Complexity) Analysis

Computational Complexity
● Writing a working programme is not sufficient, as the programme may not be efficient.
● If an inefficient programme is applied to a huge set of data, problems may arise.
● Inefficiency may not matter when handling a small dataset, but it becomes a headache when the dataset is huge.
● To assess a programme’s efficiency, we use the term computational complexity to measure the degree of difficulty of an algorithm.
Computational Complexity
● Indicates how much effort is required to apply an algorithm (operation), or how costly it is.
● In this course, we emphasize two efficiency criteria:
○ Time
○ Space
● The factor of time is usually more important than space, so efficiency considerations usually focus on the relationship between the time elapsed and the amount of data to be processed.
Computational Complexity
● Things to be reminded about complexity measurement
○ System-dependent:
■ There is no point in comparing a single algorithm's running time on a slow computer against a supercomputer.
■ Therefore, we compare various choices of algorithm under the same system environment.
○ Language-dependent:
■ Different programming languages lead to different results of complexity measurement.
■ A compiled programme runs faster than a programme requiring interpretation (e.g. web-based).
Computational Complexity
● Things to be reminded about complexity measurement (cont’d)
○ Actual time elapsed is not important
■ Knowing exactly how much time a particular algorithm takes to handle a fixed-size set of data is not important.
■ Rather, we prefer to approximate the relationship between the size of the data, n, and the time, t, required to process it.
Study complexity
void Linear(long n){
    long count = 0;
    for(long i = 0; i < n; i++){
        count++;
    }
}

[Plot: Time (micro sec) vs Data Size (n), growing as f(n) = n]

Data Size (n)    Time (t, micro sec)
1000             1
2000             2
4000             4
16000            19
Study complexity
void Quad(long n){
    long count = 0;
    for(long i = 0; i < n; i++){
        for(long k = 0; k < n; k++)
            count++;
    }
}

[Plot: Time (micro sec) vs Data Size (n), growing as f(n) = n²]

Data Size (n)    Time (t, micro sec)
1000             1255
2000             5332
4000             19556
16000            299189
Study complexity

void Logar(long n){
    long count = 0;
    for(long i = 1; i < n; i*=2){
        s1; // s1 refers to O(1) statement operations
    }
}

[Plot: Time (micro sec) vs Data Size (n), growing as f(n) = log(n)]

Data Size (n)    Time (t, micro sec)
1000             1749
32000            2460
128000           3491
256000           3546
Study complexity
for(long i = 1; i < n; i*=2){
    for(long p = 0; p < n; p++)
        count++;
}

[Plot: # of operations vs Data Size (n), growing as f(n) = n lg(n), above the f(n) = n reference line]

Data Size (n)    # of opers
1000             10000
3000             36000
6000             78000
9000             126000
30000            450000
Study complexity
for(long i = 0; i < n; i++){
    for(long p = 1; p < n; p*=2)
        count++;
}

[Plots: # of operations vs Data Size and Time (micro sec) vs Data Size, both growing as f(n) = n lg(n), above the f(n) = n reference line]

Data Size (n)    # of opers
1000             10000
3000             36000
6000             78000
9000             126000
30000            450000
Questions!
● Do we need to plot graphs for every programme?
● What happens when programmes get more complicated?
● Are there any ways to approximate the complexity at the stage of programme design?
● Let’s take a look at the following equation:

f(n) = 2n² + n + 45

● Which term (n², n, or 45) will dominate f(n) when n becomes very large?

n²

● When n becomes large enough, the terms n and 45 become negligible.
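To see the domination concretely (a small worked check, not on the original slide): at n = 1,000 the terms are 2n² = 2,000,000 versus n = 1,000 and 45, so the n² term already accounts for well over 99% of f(n).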
Remember!
● It is difficult (and pretty pointless) to find an equation that exactly describes the growing curve of an algorithm.
● What we actually need to know is the TREND of its growth.
● To make our lives easier, we approximate the time complexity of an algorithm instead.
● We can do this because, when the size of the data, n, becomes very large, some terms become insignificant in terms of their contribution to the equation. Therefore, we focus only on the most important term.
● Since all we wish to know is the trend, the coefficient of the most important term can also be ignored.
○ For example, f1(n) = n and f2(n) = 3n both have the same trend (linear growth) regardless of their coefficients.
Complexity Notation
(Big-O, Big-𝛀, Big-ϴ)

Order of Growth
● The running time of an algorithm can be described as a function of n, f(n).
● To establish a relative order among functions for LARGE n, several asymptotic notations can be used:
○ Big-O - the class of functions f(n) that grow no faster than g(n).
○ Big-Omega (𝛀) - the class of functions f(n) that grow at least as fast as g(n).
○ Big-Theta (ϴ) - the class of functions f(n) that grow at the same rate as g(n).
○ g(n) is a function that can be generally described and understood.
■ For instance, g(n) = n, g(n) = n², g(n) = lg n, g(n) = n lg n, etc.
Asymptotic Notation: Big-O
● f(n) = O(g(n)): the growth rate of f(n) is less than or equal to the growth rate of g(n).
● There exist positive constants c and n0 such that
    f(n) ≤ c·g(n), when n ≥ n0
● g(n) is an upper bound of f(n)
● Worst-case scenario!
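A small worked check (not on the original slide), reusing the earlier function f(n) = 2n² + n + 45: choosing g(n) = n², c = 3 and n0 = 8 gives 2n² + n + 45 ≤ 3n² for all n ≥ 8 (since n² ≥ n + 45 once n ≥ 8), so f(n) = O(n²).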
Rules for Big-O
● When considering the growth rate of a function using Big-O:
○ Ignore the lower order terms
○ Ignore the coefficient of the highest order term
○ It is not necessary to specify the base of the logarithm
● Changing the base only changes the value of a logarithm by a constant factor; recall the change-of-base rule:

    log_a n = log_b n / log_b a

● If f1(n) = O(g(n)) and f2(n) = O(h(n)), then:
○ f1(n) + f2(n) = max( O(g(n)), O(h(n)) )
○ f1(n) * f2(n) = O( g(n) * h(n) )
Asymptotic Notation: Big-Omega
● f(n) = 𝛀(g(n)): the growth rate of f(n) is greater than or equal to the growth rate of g(n).
● There exist positive constants c and n0 such that
    f(n) ≥ c·g(n), when n ≥ n0
● g(n) is a lower bound of f(n)
Asymptotic Notation: Big-Theta
● f(n) = ϴ(g(n)): the growth rate of f(n) is the same as the growth rate of g(n).
● f(n) = ϴ(g(n)) if and only if:
○ f(n) = O(g(n)) and
○ f(n) = 𝛀(g(n))
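For instance (a short illustration, not on the original slide): f(n) = 2n² + n + 45 is ϴ(n²), since f(n) ≤ 3n² for n ≥ 8 (so f(n) = O(n²)) and f(n) ≥ 2n² for all n ≥ 1 (so f(n) = 𝛀(n²)).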
Common Growth Rates

[Chart: growth-rate curves of the common complexity classes plotted against data size n]
Algorithm Analysis
● Simple statement sequence (block)
○ A sequence of statements which is executed once only is O(1), constant time.
○ It does not matter how many statements are in the sequence - only that the number of statements (or the time that they take to execute) is constant for all problems.
● For example:

int x = 5;
x = 25;
cout << x << endl;

The block takes constant time, c. So, the time complexity for this block is O(1).
Algorithm Analysis
● Simple loop

for (i = 0; i < n; i++){
    //takes O(1) time
    s;
}

● where s is an O(1) sequence of statements, the time complexity is n * O(1) → O(n).

n        # of operations
1        1
2        2
3        3
4        4
n        n              → f(n) = O(n)
Algorithm Analysis
● Nested loop

for (i = 0; i < n; i++){
    for (j = 0; j < n; j++){
        //takes O(1) time
        s;
    }
}

● We have n repetitions of an O(n) sequence, giving a complexity of n * O(n) → O(n²).

n        # of operations
1        1 = 1²
2        4 = 2²
3        9 = 3²
4        16 = 4²
n        n²             → f(n) = O(n²)
Algorithm Analysis
● Simple loops in sequence

for (i = 0; i < m; i++){
    //takes O(1) time
    s1;                        // this loop is O(m)
}

for (i = 0; i < n; i++){
    //takes O(1) time
    s2;                        // this loop is O(n)
}

● The total is O(m + n); if m = n, then O(2n) → O(n).
Algorithm Analysis
Let's study the following code segment:

int count = 0;
for(long i = 1; i < n; i=i*2){
    // taking O(1) time
    s;
}

n        # of operations
1        0 = log₂(2⁰)
2        1 = log₂(2¹)
4        2 = log₂(2²)
8        3 = log₂(2³)
16       4 = log₂(2⁴)
n        log₂n          → f(n) = O(lg n)
Algorithm Analysis
Let's study the following code segment:

int count = 0;
for(long i = 0; i < n; i++){
    for(long p = 1; p < n; p*=2)
        // taking O(1) time
        s;
}

n        # of operations
1        0 = 1 × log₂(2⁰)
2        2 = 2 × log₂(2¹)
4        8 = 4 × log₂(2²)
8        24 = 8 × log₂(2³)
16       64 = 16 × log₂(2⁴)
32       160 = 32 × log₂(2⁵)
n        n × log₂n      → f(n) = O(n lg n)


Algorithm Analysis
Let's study the following code segment:

int count = 0;
for(long i = 0; i < n; i++){
    for(long p = 0; p < i; p++)
        // taking O(1) time
        s;
}

The inner loop for (p = 0; ...) gets executed i times, so the total number of operations is:

    0 + 1 + 2 + 3 + ... + (n - 1) = n(n - 1)/2

and the complexity is O(n²).
Algorithm Analysis
Let's study the following code segment:

h = n;
while(h > 0){
    for(i = 0; i < n; i++)
        s;
    h = h/2;
}

There are about log₂n iterations of the outer loop and the inner loop is O(n), so the overall complexity is O(n lg n).
Algorithm Analysis
Let's study the following code segment:

for (i = 4; i < n; i++) {
    for (j = i - 3, sum = a[i - 4]; j <= i; j++)
        sum += a[j];
    cout << "sum for subarray " << i - 4 << " through " << i << " is " << sum << endl;
}

The outer loop is executed n - 4 times. For each i, the inner loop is executed only 4 times. Therefore, the complexity is (n - 4) * O(1) → O(n).
Algorithm Analysis
Let's study the following code segment:

for (i = 0, length = 1; i < n - 1; i++) {
    for (i1 = i2 = k = i; k < n - 1 && a[k] < a[k+1]; k++, i2++)
        if (length < i2 - i1 + 1)
            length = i2 - i1 + 1;
}

● If all numbers in the array are in decreasing order, the outer loop is executed n - 1 times, and the inner loop's condition is checked only once (the body never executes). Thus, the algorithm is O(n) in this case.
● If all numbers in the array are in increasing order, the outer loop is executed n - 1 times, and the inner loop executes (n - 1 - i) times for each i ∈ {0, …, n - 2}. Thus, the algorithm is O(n²) in this case.
● We may conclude that the time complexity T(n) of the above algorithm satisfies T(n) = 𝛀(n) and T(n) = O(n²).
Algorithm Analysis
Let's study the following code segment:

int sum(int n) {
    int partialSum;
    partialSum = 0;            // Line 1: 1 operation
    for(int i=1; i<=n; i++)    // Line 2: 3n+2 operations
        partialSum += i*i*i;   // Line 3: 4 operations
    return partialSum;         // Line 4: 1 operation
}

● Lines 1 and 4: 1 unit each (assignment)                                          cost = 2
● Line 3: executed n times, each time 4 units (2 multiplications, 1 addition, 1 assignment)   cost = 4n
● Line 2: 1 for initialization, n + 1 for checking, 2n for incrementing            cost = 3n + 2
● Total: T(n) = 2 + 4n + (3n + 2) = 7n + 4 → O(n)
Algorithm Analysis
Let's study the following code segment:

for(i=0; i<n; i++)              // O(n)
    arr[i] = 0;

for(i=0; i<n; i++)              // O(n²)
    for(j=0; j<n; j++)
        arr[i] += arr[j] + i + j;

● T(n) = O(n) + O(n²) = O(n²)
Algorithm Analysis
Let's study the following code segment:

if(x == 3){
    for(i=0; i<n; i++)          // O(n)
        arr[i] = 0;
}else{
    for(i=0; i<n; i++)          // O(n²)
        for(j=0; j<n; j++)
            arr[i] += arr[j] + i + j;
}

● T(n) = max(O(n), O(n²)) = O(n²)
Time Complexity of
Recursive Function

Algorithm Analysis
Let's study the following code segment:

// recursive function
int factorial(int n) {
    if(n==0 || n==1)    // base case
        return 1;
    else                // recursive case
        return n * factorial(n-1);
}

Let T(n) be the time complexity of the algorithm:

T(n) = T(n-1) + c        // c is the constant time for the * operation
and T(n-1) = T(n-2) + c, so
T(n) = T(n-2) + 2c
     = T(n-3) + 3c
     ...
∴ T(n) = T(n-k) + kc     // for any positive k value

Let k = n:
T(n) = T(n-n) + nc = T(0) + nc
T(0) = c                 // T(0) reaches the base case
∴ T(n) = c + nc = O(n)
Algorithm Analysis - Master Theorem
● Computing the time complexity of recursive algorithms is not that intuitive, because they divide the input into one or more subproblems.
● We are going to explore how to obtain the Big-O complexity of most recursive algorithms. For that, we are going to use the Master Theorem (or master method).
Algorithm Analysis - Master Theorem
● The general forms for the Master Theorem are:
○ Divide and Conquer:    T(n) = a·T(n/b) + f(n)
○ Subtract and Conquer:  T(n) = a·T(n - b) + f(n)
Master Theorem - Subtract and Conquer
● To obtain the runtime of a recursive algorithm, you need to identify three elements:
1. a: Subproblems. How many recursive (split) calls are there?
2. b: Relative subproblem size. By how much is the input reduced? E.g., the factorial function reduces its input by 1.
3. f(n): Runtime of the work done outside the recursion, e.g. O(n) or O(1).
● The general form of the Master Theorem for subtract-and-conquer recurrences is:

    T(n) = a·T(n - b) + f(n)
Algorithm Analysis - Master Theorem

T(n) = a·T(n - b) + f(n), for n > c

for some constants c, a > 0, b > 0, k ≥ 0 and function f(n). If f(n) is O(n^k), then:

1. If a < 1 then T(n) = O(n^k)

2. If a = 1 then T(n) = O(n^(k+1))

3. If a > 1 then T(n) = O(n^k · a^(n/b))
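As a quick check against the earlier factorial analysis (an added illustration, not from the slides): factorial has a = 1 recursive call, the input shrinks by b = 1, and the non-recursive work is f(n) = O(1), i.e. O(n⁰), so k = 0. Case 2 (a = 1) gives T(n) = O(n^(0+1)) = O(n), matching the result obtained by expanding the recurrence.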
Master Theorem - Divide and Conquer
● To obtain the runtime of a recursive algorithm, you need to identify three elements:
1. a: Subproblems. How many recursive (split) calls are there?
2. n/b: Relative subproblem size. At what rate is the input reduced? E.g., binary search and merge sort cut the input in half.
3. f(n): Runtime of the work done outside the recursion, e.g. O(n) or O(1).
● The general form of the Master Theorem for divide-and-conquer recurrences is:

    T(n) = a·T(n/b) + f(n)
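As an illustrative sketch (not from the slides), binary search fits this template: one recursive call (a = 1) on half of the input (b = 2) with O(1) work outside the recursion, so T(n) = T(n/2) + O(1).

#include <vector>

// Binary search on a sorted vector: a = 1 recursive call, subproblem size n/2, f(n) = O(1).
int binarySearch(const std::vector<int>& a, int lo, int hi, int key) {
    if (lo > hi) return -1;            // base case: empty range, key not found
    int mid = lo + (hi - lo) / 2;      // O(1) work outside the recursion
    if (a[mid] == key) return mid;
    if (a[mid] < key)
        return binarySearch(a, mid + 1, hi, key);   // recurse on the right half
    return binarySearch(a, lo, mid - 1, key);       // recurse on the left half
}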
Algorithm Analysis - Master Theorem
● Once we have a, b and f(n), we can determine the runtime of the work done by the recursion (the split part). That is given by:

    n^(log_b a)

● Then we compare the runtime of the split/recursion part, n^(log_b a), with f(n). There are 3 possible cases:
1. Recursion/split runtime is higher → final runtime: ϴ(n^(log_b a))
2. Same runtime inside and outside the recursion → final runtime: ϴ(n^(log_b a) · log n)
3. Recursion/split runtime is lower → final runtime: ϴ(f(n))
Master Theorem - Divide and Conquer (in short)
Let a ≥ 1 and b > 1 be constants, let f(n) be a function, and let T(n) be a function over the positive numbers defined by the recurrence

● T(n) = a·T(n/b) + f(n).

If f(n) = ϴ(nᵈ), where d ≥ 0, then

● T(n) = ϴ(nᵈ) if a < bᵈ,
● T(n) = ϴ(nᵈ log n) if a = bᵈ,
● T(n) = ϴ(n^(log_b a)) if a > bᵈ.
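Two quick applications of this statement (added illustrations, not from the slides): binary search has a = 1, b = 2, f(n) = ϴ(1) so d = 0; since a = bᵈ (1 = 2⁰), T(n) = ϴ(log n). Merge sort has a = 2, b = 2, f(n) = ϴ(n) so d = 1; since a = bᵈ (2 = 2¹), T(n) = ϴ(n log n).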
In practice...
● To simplify our approximation:
○ we have dropped coefficients (e.g. O(¼n) → O(n));
○ we focus on the worst (Big-O) and the best (Big-𝛀) scenarios.
● In practice,
○ coefficients are important to notice, as they can make a difference between n and ¼n;
○ average cases are important as well (not covered in this syllabus); however, average-case analysis involves probability and can pose difficult computational problems.
