
Data Structures and Algorithms
Fundamental Concepts

Chapter 1
Fundamental Concepts

CONTENT
1.1. Algorithm and Complexity

1.2. Beginning example

1.3. Asymptotic Symbols

1.4. Pseudo Code

1.5. Algorithm analysis techniques


What is problem solving?

• Problem solving: the process of posing a problem and developing a computer program to solve the problem.

• A problem solution includes:
  • Algorithm: a sequence of steps that need to be taken to produce the output of the problem from the input data in finite time.
  • Data structure: how to organize input and output data storage.


Algorithm notation
Definition: An algorithm is a deterministic procedure consisting of a finite sequence of steps that must be performed to obtain an output for a given input of the problem.

For example:
• Cooking instructions
• Instructions for installing a device
• The rules of a game
• Directions from A to B
• Motorcycle repair instructions
• etc.

Algorithm notation
• Algorithm has the following characteristics:

• Input

• Output

• Precision

• Finiteness

• Uniqueness

• Generality
Data structure
• A set of data
• whose elements are related to one another in a way determined by the problem

• Representing a data structure in memory:
  • Internal storage
  • External storage

• Choosing the appropriate data structure and algorithm is very important:

Data structure + Algorithm = Program

Algorithm’s Complexity
• Evaluating the computational complexity of an algorithm means evaluating the amount of resources of all kinds that the algorithm requires.
• Commonly used resources:
  • Time
  • Memory
  • Bandwidth

• We are mainly interested in evaluating the time needed to execute the algorithm (the calculation time of the algorithm).
Algorithm computation time
Factors affecting calculation time:
- Computer
- Compiler
- Algorithm used
- Input data of the algorithm
• The values of the data affect the calculation time
• Usually, the size of the input data is the main factor that determines the computation time
  • For example: for the sorting problem → the number of elements to sort
  • For example: for the matrix multiplication problem → the total number of elements of the two matrices
Basic operation
• Definition. We call a basic operation an operation
that can be performed with time bounded by a
constant that does not depend on the data size.

• To calculate the algorithm's computation time: count the number of basic operations that must be performed (assignment, comparison, arithmetic operations, etc.).
Types of computation time
For a given input data size n,
• Best computation time:
  • the minimum time required to execute the algorithm over all input data sets of size n.
• Worst computation time:
  • the maximum time required to execute the algorithm over all input data sets of size n.
• Average computation time:
  • the average time required to execute the algorithm over a finite set of inputs of size n.
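As a concrete illustration not taken from the slides, a hypothetical linear search makes the three cases tangible: for the same size n, its cost depends on where (or whether) the searched value occurs.

/* Hypothetical illustration: linear search in an array of size n.
   Best case: x is at position 0 -> 1 comparison.
   Worst case: x is absent -> n comparisons.
   Average case (x present, uniformly at random): about n/2 comparisons. */
int linear_search(const int a[], int n, int x) {
    for (int i = 0; i < n; i++)
        if (a[i] == x)      /* basic operation: one comparison */
            return i;
    return -1;              /* not found */
}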
CONTENT
1.1. Algorithm and Complexity
1.2. Beginning example
1.3. Asymptotic Symbols
1.4. Pseudo Code
1.5. Algorithm analysis techniques
Beginning example
• Problem of finding the largest subsequence:
• Given a sequence of numbers
  a1, a2, …, an
• The sequence ai, ai+1, …, aj with 1 ≤ i ≤ j ≤ n is called a subsequence of the given sequence, and ∑_{k=i}^{j} a_k is called the weight of this subsequence.
• The problem is: find the maximum weight over all subsequences, that is, find the maximum value of ∑_{k=i}^{j} a_k.
• For simplicity, call the subsequence with the largest weight the largest subsequence.
• Ex: the given sequence is -2, 11, -4, 13, -5, 2. Largest weight?

→ the answer is 20 (the weight of the subsequence 11, -4, 13)


Direct algorithm

• Traverse all possible subsequences
  ai, ai+1, …, aj with 1 ≤ i ≤ j ≤ n
  and calculate the sum of each subsequence to find the largest one.

• The total number of possible subsequences of the given sequence is
  C(n,2) + n = n²/2 + n/2.

Direct algorithm
• Implementation
int maxSum = 0;
for (int i = 0; i < n; i++) {
for (int j = i; j < n; j++) {
int sum = 0;
for (int k = i; k <= j; k++)
sum += a[k];
if (sum > maxSum)
maxSum = sum;
}
}
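A minimal runnable wrapper around the code above, shown only as an illustration; the main function and the array initialisation are additions, using the example sequence from the previous slides.

#include <stdio.h>

int main(void) {
    int a[] = {-2, 11, -4, 13, -5, 2};   /* example sequence from the slides */
    int n = 6;
    int maxSum = 0;
    for (int i = 0; i < n; i++) {
        for (int j = i; j < n; j++) {
            int sum = 0;
            for (int k = i; k <= j; k++)   /* one addition per execution */
                sum += a[k];
            if (sum > maxSum)
                maxSum = sum;
        }
    }
    printf("Largest weight: %d\n", maxSum);   /* prints 20 */
    return 0;
}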
Direct algorithm
• Algorithm analysis: calculate the number of additions that must be performed, i.e. count how many times the line of code sum += a[k] is executed.
• Number of additions:

  ∑_{i=0}^{n-1} ∑_{j=i}^{n-1} (j − i + 1) = ∑_{i=0}^{n-1} (1 + 2 + … + (n − i)) = ∑_{i=0}^{n-1} (n − i)(n − i + 1)/2
  = (1/2) ∑_{k=1}^{n} k(k + 1) = (1/2) (∑_{k=1}^{n} k² + ∑_{k=1}^{n} k) = (1/2) (n(n + 1)(2n + 1)/6 + n(n + 1)/2)
  = n³/6 + n²/2 + n/3
Faster algorithm

• Note that the sum of the terms i to j can be obtained from the sum of the terms i to j−1 by one addition.

• We have:
  ∑_{k=i}^{j} a[k] = a[j] + ∑_{k=i}^{j-1} a[k]

• This observation allows the innermost "for" loop to be removed.
Faster algorithm (cont..1)

• Implementation:

int maxSum = a[0];
for (int i = 0; i < n; i++) {
    int sum = 0;
    for (int j = i; j < n; j++) {
        sum += a[j];
        if (sum > maxSum)
            maxSum = sum;
    }
}
Faster algorithm (cont..2)

• Algorithm analysis. Calculate the number of additions:

  ∑_{i=0}^{n-1} (n − i) = n + (n − 1) + … + 1 = n²/2 + n/2

• This number is exactly equal to the number of subsequences.
• => It seems that the resulting algorithm is very good, because each subsequence is considered exactly once.
Recursive algorithm

• Better algorithm!

• Divide and conquer technique:
  • Divide the problem to be solved into sub-problems of the same form
  • Solve each subproblem recursively
  • Combine the solutions of the subproblems to obtain the solution of the original problem.
Recursive algorithm
• Apply this technique to the problem of finding the maximum weight of subsequences:
  • Divide the given range into 2 ranges using the middle element,
  • obtaining 2 sequences of numbers with the length reduced by half
  • (referred to as the left and right subsequences).
Recursive algorithm
• To combine the solution, realize that only 1 of 3 cases can occur
respectively when the largest subsequence is located at:
• Left subsequence (left half)
• Right subsequence (right half)
• Start in the left half and end in the right half (middle).
• Denote the weight of the largest subsequence in
  • the left half by wL
  • the right half by wR
  • the middle by wM
• The weight to find is then max(wL, wR, wM).
Recursive algorithm
• Finding the weight of the largest subsequence in the left
half (wL) and right half (wR) can be done recursively.
• Find the weight wM of the largest subsequence starting
in the left half and ending in the right half:
• Calculate the weight of the largest subsequence in the left half
ending at the division point (wML) and
• Calculate the weight of the largest subsequence in the right
half starting at the division point (wMR).
• Then wM = wML + wMR.
Recursive algorithm
• m – division point of the left sequence, m+1 – division point of the right sequence

  a1, a2, …, am, am+1, am+2, …, an

• Calculate wML – the weight of the largest subsequence in the left half ending at am.
• Calculate wMR – the weight of the largest subsequence in the right half starting from am+1.
Recursive algorithm

• Calculate the weight of the largest subsequence in the left half (from a[i] to a[j]) ending at a[j]:

MaxLeft(a, i, j)
{
    maxSum = -∞; sum = 0;
    for (int k = j; k >= i; k--) {
        sum = sum + a[k];
        maxSum = max(sum, maxSum);
    }
    return maxSum;
}
Recursive algorithm
• Calculate the weight of the largest subsequence in the right half (from a[i] to a[j]) starting from a[i]:

MaxRight(a, i, j)
{
    maxSum = -∞; sum = 0;
    for (int k = i; k <= j; k++) {
        sum = sum + a[k];
        maxSum = max(sum, maxSum);
    }
    return maxSum;
}
Recursive algorithm

The recursive algorithm's outline can be described as follows:

MaxSub(a, i, j)
{
    if (i == j) return a[i];
    else {
        m = (i + j) / 2;
        wL = MaxSub(a, i, m);
        wR = MaxSub(a, m+1, j);
        wM = MaxLeft(a, i, m) + MaxRight(a, m+1, j);
        return max(wL, wR, wM);
    }
}
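A runnable C sketch of the whole divide-and-conquer algorithm, shown as an illustration only: it uses 0-based indexing, renames the routines to max_left, max_right and max_sub, and uses INT_MIN for the value written as -∞ above.

#include <stdio.h>
#include <limits.h>

/* Weight of the largest subsequence of a[i..j] that ends at a[j]. */
static int max_left(const int a[], int i, int j) {
    int maxSum = INT_MIN, sum = 0;
    for (int k = j; k >= i; k--) {
        sum += a[k];
        if (sum > maxSum) maxSum = sum;
    }
    return maxSum;
}

/* Weight of the largest subsequence of a[i..j] that starts at a[i]. */
static int max_right(const int a[], int i, int j) {
    int maxSum = INT_MIN, sum = 0;
    for (int k = i; k <= j; k++) {
        sum += a[k];
        if (sum > maxSum) maxSum = sum;
    }
    return maxSum;
}

static int max3(int x, int y, int z) {
    int m = x > y ? x : y;
    return m > z ? m : z;
}

/* Largest subsequence weight of a[i..j]. */
static int max_sub(const int a[], int i, int j) {
    if (i == j) return a[i];
    int m = (i + j) / 2;
    int wL = max_sub(a, i, m);
    int wR = max_sub(a, m + 1, j);
    int wM = max_left(a, i, m) + max_right(a, m + 1, j);
    return max3(wL, wR, wM);
}

int main(void) {
    int a[] = {-2, 11, -4, 13, -5, 2};   /* example sequence from the slides */
    printf("%d\n", max_sub(a, 0, 5));    /* prints 20 */
    return 0;
}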
Recursive algorithm
• Algorithm analysis:
• MaxLeft and MaxRight require n/2 + n/2 = n additions.
• Therefore, if we call T(n) the number of additions to be performed, we get the recursive formula:

  T(n) = 0                                    if n = 1
  T(n) = T(n/2) + T(n/2) + n = 2T(n/2) + n    if n > 1
Recursive algorithm

• We confirm that T(2k) = k.2k. We prove it by induction


• Inductive basis: If k=0 then T(20) = T(1) = 0 = 0.20.
• Inductive transfer: If k>0, suppose T(2k-1) = (k-1)2k-1 is
correct. Then
T(2k) = 2T(2k-1)+2k = 2(k-1).2k-1 + 2k = k.2k.
• Returning to the notation n, we have
T(n) = n log n .
• The results obtained are better than the second
algorithm!
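A small C check of this result, assuming only the recurrence T(n) = 2T(n/2) + n derived above: it compares T(n) with n·log2(n) for powers of two.

#include <stdio.h>

/* Number of additions of the divide-and-conquer algorithm,
   simulated directly from its recurrence T(n) = 2T(n/2) + n. */
static long T(long n) {
    if (n == 1) return 0;
    return 2 * T(n / 2) + n;
}

int main(void) {
    for (long n = 2, k = 1; n <= 1024; n *= 2, k++)
        printf("n=%4ld  T(n)=%6ld  n*log2(n)=%6ld\n", n, T(n), n * k);
    return 0;
}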
Calculation time
Time conversion table

• The following table is used to compare execution times for the common growth rates (the table itself is not reproduced here).
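A small C sketch, purely illustrative (the particular growth-rate functions and input sizes chosen here are an assumption), that prints such a comparison of operation counts:

#include <stdio.h>
#include <math.h>

/* Prints approximate operation counts for common growth rates,
   which is essentially what a time conversion table tabulates. */
int main(void) {
    double ns[] = {10, 100, 1000, 10000};
    printf("%8s %10s %12s %14s %16s\n", "n", "log2 n", "n log2 n", "n^2", "n^3");
    for (int i = 0; i < 4; i++) {
        double n = ns[i];
        printf("%8.0f %10.1f %12.0f %14.0f %16.0f\n",
               n, log2(n), n * log2(n), n * n, n * n * n);
    }
    return 0;
}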


Dynamic Programming algorithm

Algorithm development based on DP includes 3 stages:


1. Decomposition: Divide the problem to be solved into
smaller sub-problems of the same form as the original
problem.
2. Record solutions: Store the solutions of subproblems in a
table.
3. Synthesize solutions: from the solutions of smaller-sized sub-problems, build in turn the solution of the larger-sized problem, until the solution of the starting problem (which is the sub-problem with the largest size) is obtained.
Dynamic Programming algorithm

• Decomposition: let si be the weight of the largest subsequence in the sequence a1, a2, ..., ai, i = 1, 2, ..., n. Obviously sn is the value to find.
• Synthesize solutions.
• We have
  s1 = a1.
• Suppose i > 1 and sk is known for k = 1, 2, ..., i-1. We need to calculate si as the weight of the largest subsequence of the sequence
  a1, a2, ..., ai-1, ai.
Dynamic Programming algorithm
• Since the largest subsequence of this sequence either contains the element ai or does not contain it, it can only be one of 2 sequences:
  • the largest subsequence of a1, a2, ..., ai-1
  • the largest subsequence of a1, a2, ..., ai that ends at ai.
• From this we infer
  si = max {si-1, ei}, i = 2, …, n,
  where ei is the weight of the largest subsequence of a1, a2, ..., ai ending at ai.
• To calculate ei, use the following recursive formula:
  • e1 = a1;
  • ei = max {ai, ei-1 + ai}, i = 2, ..., n.
MaxSub(a)    (* dynamic programming algorithm *)
{
    smax = a[1];          (* smax – weight of the largest subsequence *)
    maxendhere = a[1];    (* maxendhere – weight of the largest subsequence ending at a[i] *)
    imax = 1;             (* imax – end position of the largest subsequence *)
    for i = 2 to n {
        u = maxendhere + a[i];
        v = a[i];
        if (u > v) maxendhere = u;
        else maxendhere = v;
        if (maxendhere > smax) then {
            smax = maxendhere;
            imax = i;
        }
    }
}
It is easy to see that the number of addition operations performed by the algorithm (the number of times the statement u = maxendhere + a[i]; is executed) is n − 1, i.e. about n.
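A runnable C sketch of the same dynamic-programming algorithm; the 0-based indexing, the main function and the example array are additions for illustration.

#include <stdio.h>

int main(void) {
    int a[] = {-2, 11, -4, 13, -5, 2};   /* example sequence from the slides */
    int n = 6;
    int smax = a[0];          /* weight of the largest subsequence seen so far */
    int maxendhere = a[0];    /* weight of the largest subsequence ending at a[i] */
    int imax = 0;             /* end position of the largest subsequence */
    for (int i = 1; i < n; i++) {
        int u = maxendhere + a[i];   /* the single addition counted in the analysis */
        int v = a[i];
        maxendhere = (u > v) ? u : v;
        if (maxendhere > smax) {
            smax = maxendhere;
            imax = i;
        }
    }
    printf("largest weight %d, ending at index %d\n", smax, imax);  /* 20, index 3 */
    return 0;
}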
CONTENT
1.1. Algorithm and Complexity
1.2. Beginning example
1.3. Asymptotic Symbols
1.4. Pseudo Code
1.5. Algorithm analysis techniques
Asymptotic Notation

Θ, Ω, O

• Used to describe the calculation time of an algorithm
• Instead of stating the exact calculation time, we say Θ(n²)
• Defined for functions taking non-negative integer values
• Used to compare the growth rates of two functions
Θ symbol
For a function g(n), the symbol Θ(g(n)) is the set of functions
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that
  0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n), for every n ≥ n0}

g(n) is an asymptotically tight estimate for f(n)


Example
10n² − 3n = Θ(n²)?

• For what values of the constants n0, c1, and c2 is the inequality in the definition true?

• Taking c1 smaller than the coefficient of the term with the highest exponent, and taking c2 larger, we have
  n² ≤ 10n² − 3n ≤ 11n², for every n ≥ 1 (c1 = 1, c2 = 11, n0 = 1).

• For polynomial functions: to compare growth rates, you only need to look at the term with the highest exponent.
Symbol O (pronounced big O)
For a given function g(n), we denote by O(g(n)) the set of functions
O(g(n)) = {f(n): there exist positive constants c and n0 such that
  f(n) ≤ c·g(n), for every n ≥ n0}

We say g(n) is an asymptotic upper bound of f(n).


Symbol Ω
For a given function g(n), we denote by Ω(g(n)) the set of functions
Ω(g(n)) = {f(n): there exist positive constants c and n0 such that
  f(n) ≥ c·g(n), for every n ≥ n0}

We say g(n) is an asymptotic lower bound of f(n).


Relationship between Θ, Ω, O

For any two functions g(n) and f(n),
  f(n) = Θ(g(n))
if and only if
  f(n) = O(g(n)) and f(n) = Ω(g(n)).
This means
  Θ(g(n)) = O(g(n)) ∩ Ω(g(n))
How to use these symbols
• Saying "the running time of this algorithm is O(f(n))" means: the worst-case running time is O(f(n)).

• "The running time is Ω(f(n))" means: the best-case running time is Ω(f(n)).
Asymptotic notation in equalities

• Used to replace lower-order (slowly growing) terms in expressions
• Example
  4n³ + 3n² + 2n + 1 = 4n³ + 3n² + Θ(n)
  = 4n³ + Θ(n²) = Θ(n³)
• In such equations, Θ(f(n)) stands for some function g(n) ∈ Θ(f(n))
• In the example above, Θ(n²) stands for 3n² + 2n + 1
Graphs of some basic functions
Similarities between comparing functions and comparing numbers

  f vs. g  ↔  a vs. b

  f(n) = O(g(n))  ≈  a ≤ b
  f(n) = Ω(g(n))  ≈  a ≥ b
  f(n) = Θ(g(n))  ≈  a = b
Properties
• Transitivity
  f(n) = Θ(g(n)) & g(n) = Θ(h(n)) ⇒ f(n) = Θ(h(n))
  f(n) = O(g(n)) & g(n) = O(h(n)) ⇒ f(n) = O(h(n))
  f(n) = Ω(g(n)) & g(n) = Ω(h(n)) ⇒ f(n) = Ω(h(n))
• Symmetry
  f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n))
• Transpose symmetry
  f(n) = O(g(n)) if and only if g(n) = Ω(f(n))
Example
      A                 B
• 5n² + 100n       3n² + 2

• log3(n²)         log2(n³)
Recall some logarithmic functions

  x^a = b  ⇔  log_x b = a

  log(ab) = log a + log b

  log_a b = log_m b / log_m a

  log(a^b) = b·log a

  a^(log n) = n^(log a)

  log^b a = (log a)^b ≠ log(a^b)

  d/dx (ln x) = 1/x
N vs lognlogn
Example
      A                 B
• A = 5n² + 100n,  B = 3n² + 2:   A ∈ Θ(B)
  A ∈ Θ(n²), n² ∈ Θ(B) ⇒ A ∈ Θ(B)

• A = log3(n²),  B = log2(n³):   A ∈ Θ(B)
  log_b a = log_c a / log_c b;
  A = 2·lg n / lg 3,  B = 3·lg n,
  A/B = 2/(3·lg 3), a constant
Example

• 2n2 = O(n3): 2n2 ≤ cn3  2 ≤ cn  c = 1 and n0= 2

• n2 = O(n2): n2 ≤ cn2  c ≥ 1  c = 1 and n0= 1

• 1000n2+1000n = O(n2):

1000n2+1000n ≤ cn2  c=1001 and n0 = 1000

• n = O(n2): n ≤ cn2  cn ≥ 1  c = 1 and n0= 1


Example
• 5n² = Ω(n)
  ∃ c, n0 such that: 0 ≤ c·n ≤ 5n² ⇒ c = 1 and n0 = 1

• 100n + 5 ≠ Ω(n²)
  Assume: ∃ c, n0 such that: 0 ≤ c·n² ≤ 100n + 5.
  We have: 100n + 5 ≤ 100n + 5n = 105n (for n ≥ 1)
  It follows: c·n² ≤ 105n ⇒ n(c·n − 105) ≤ 0
  Because n is positive ⇒ c·n − 105 ≤ 0 ⇒ n ≤ 105/c
  Contradiction: n cannot be bounded above by a constant

• n = Ω(2n), n³ = Ω(n²), n = Ω(log n)
Attention
• The values of n0 and c in the proof of an asymptotic formula are not unique

• Prove that 100n + 5 = O(n²)

  • 100n + 5 ≤ 100n + n = 101n ≤ 101n² for every n ≥ 5
    n0 = 5 and c = 101 are suitable constants

  • 100n + 5 ≤ 100n + 5n = 105n ≤ 105n² for every n ≥ 1
    n0 = 1 and c = 105 are also suitable constants

It is enough to find some constants c and n0 that satisfy the inequality in the definition of the asymptotic formula.
Some special algorithm classes

• O(1): constant
• O(log n): logarithmic
• O(n): linear
• O(n log n): superlinear
• O(n²): quadratic
• O(n³): cubic
• O(aⁿ): exponential (a > 1)
• O(nᵏ): polynomial (k ≥ 1)
CONTENT
1.1. Algorithm and Complexity
1.2. Beginning example
1.3. Asymptotic Symbols
1.4. Pseudo Code
1.5. Algorithm analysis techniques
Algorithm description: pseudo-language

• Using a specific programming language to describe an algorithm can make the description complicated and difficult to grasp.
• → instead, use:
  • Block diagrams (flowcharts)
  • Pseudo language
Block diagram
Control instructions can be:
- Instruction block
- Conditional instruction
- Repeat instruction

Flowchart symbols: begin or end, assignment instruction, input/output, condition, execution flow.

Example flowchart (parity check): Begin → Input n → R = n % 2 → "R is 0?" → if yes, output "Even"; if no, output "Odd" → continue → End.
Block of instructions

Syntax:
{
    S1;
    S2;
    S3;
}
(the statements are executed in sequence: S1 → S2 → S3)
Condition instruction
Syntax:
if (condition)
    action
(if the condition is true, the action is performed; otherwise it is skipped)
Condition instruction
Syntax:
if (B) then
    S1;
else
    S2;
(if B is true, S1 is executed; otherwise S2 is executed)
Loop instruction
Syntax:
while (B) do
    S;
(S is repeated as long as B remains true)
For loop
Syntax:
for (initialization; condition; update)
    action
(flow: initialization → test condition → if true, perform action, then update and test again; if false, exit the loop)
do-while loop
Syntax:
do
    action
while (condition)
(the action is performed first, then repeated as long as the condition remains true)
Algorithm description: pseudo-language

• Pseudo language
  • allows describing algorithms both in everyday language and with command structures similar to those of programming languages.
Algorithm description: pseudo-language

• Declare variable
integer x,y;
real u, v;
boolean a, b;
char c, d;
datatype x;
• Assignment statement
x = expression;
or
x ← expression;
Ex: x ← 1+4;
y = a*y+2;
Algorithm description: pseudo-language

• Control structure
• if condition then
sequence of instructions
else
sequence of instructions
endif;

while condition do
sequence of instructions
endwhile;
Algorithm description: pseudo-language

repeat
    sequence of instructions;
until condition;

for i = n1 to n2 [step d]
    sequence of instructions;
endfor;

Case statement:
case
    cond1: stat1;
    cond2: stat2;
    ...
    condn: statn;
endcase;

• Input-Output
read(X);   /* X can be a single variable or an array */
print(data) or print(message)
Algorithm description: pseudo-language

• Function and procedure

Function name(parameters)
begin
    variable declaration;
    statements in the body of the function;
    return (value)
end;

Procedure name(parameters)
begin
    variable declaration;
    statements in the body of the procedure;
end;

• Passing parameters: by value or by reference
• Variables: local or global
Algorithm description: pseudo-language

• Example: algorithm to find the largest element in an array A(1:n)

Function max(A(1:n))
begin
    datatype x;   /* to keep the maximum value found */
    integer i;
    x = A[1];
    for i = 2 to n do
        if x < A[i] then
            x = A[i];
        endif
    endfor;
    return (x);
end max;
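A minimal C sketch of the same function, assuming an int array and 0-based indexing (the original pseudocode is 1-indexed):

/* Returns the largest element of a[0..n-1]; assumes n >= 1.
   usage: int m = max_element(a, n); */
int max_element(const int a[], int n) {
    int x = a[0];                 /* best value found so far */
    for (int i = 1; i < n; i++)
        if (x < a[i])
            x = a[i];
    return x;
}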
Algorithm description: pseudo-language

• Example: the two-variable content swap algorithm

Procedure swap(x, y)
begin
    temp = x;
    x = y;
    y = temp;
end swap;
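In C, this swap only has an effect for the caller if the parameters are passed by reference; a minimal sketch using pointers:

/* Swaps the contents of two int variables; the addresses are passed,
   so the change is visible to the caller (pass by reference).
   usage: int a = 1, b = 2; swap(&a, &b); */
void swap(int *x, int *y) {
    int temp = *x;
    *x = *y;
    *y = temp;
}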
Algorithm description: pseudo-language

• Example: find a prime number greater than the positive integer n.
• First, we build a function Is_prime to check whether a positive integer m is prime or not.
• Using this function, we build an algorithm to solve the given problem.
• If m = a*b with 1 < a, b < m, then at least one of the factors a, b does not exceed √m.
  → m is prime if it has no divisors among the positive integers from 2 to √m.
Algorithm description: pseudo-language

• Algorithm to check whether a positive integer is prime or not.
• Input: positive integer m.
• Output: true if m is prime, false otherwise.

function Is_prime(m)
begin
    i = 2;
    while (i*i <= m) and (m mod i ≠ 0) do i = i + 1;
    Is_prime = i > sqrt(m);
end Is_prime;
Algorithm description: pseudo-language

• Algorithm to find a prime number greater than the positive integer n.
• The algorithm uses Is_prime as a subroutine.
• Input: positive integer n.
• Output: m – a prime number greater than n.

procedure Large_Prime(n)
begin
    m = n + 1;
    while not Is_prime(m) do m = m + 1;
end;

• Since the set of prime numbers is infinite, the Large_Prime algorithm terminates.
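A runnable C sketch of both routines; the m <= 1 guard and the main function are additions not present in the pseudocode.

#include <stdio.h>
#include <stdbool.h>

/* true if m is prime: no divisor among 2..sqrt(m). */
bool is_prime(int m) {
    if (m <= 1) return false;           /* added guard, not in the pseudocode */
    int i = 2;
    while (i * i <= m && m % i != 0)
        i = i + 1;
    return i * i > m;                   /* no divisor found up to sqrt(m) */
}

/* Smallest prime greater than n; terminates because there are infinitely many primes. */
int large_prime(int n) {
    int m = n + 1;
    while (!is_prime(m))
        m = m + 1;
    return m;
}

int main(void) {
    printf("%d\n", large_prime(100));   /* prints 101 */
    return 0;
}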
CONTENT
1.1. Algorithm and Complexity
1.2. Beginning example
1.3. Asymptotic Symbols
1.4. Pseudo Code
1.5. Algorithm analysis techniques
Basic techniques for analyzing algorithm complexity

Sequential structure.
• Suppose P and Q are two segments of the algorithm,
  • each can be a single command or a sub-algorithm.
• Time(P), Time(Q): the calculation times of P and Q respectively.

• Sequence rule: the calculation time required by "P; Q", meaning P is executed first, followed by Q, is
  Time(P; Q) = Time(P) + Time(Q),
  or in Theta notation:
  Time(P; Q) = Θ(max(Time(P), Time(Q))).
For loop

for i = 1 to m do P(i);

• Suppose the execution time of P(i) is t(i).
• Then the execution time of the for loop is

  ∑_{i=1}^{m} t(i)
Typical statement
• Definition: A typical statement is a statement that is
executed at least as often as any other statement in
the algorithm.
• If we assume that the execution time of each
statement is bounded by a constant
• => To evaluate the calculation time, you can count
the number of times a typical command is executed
Example: FibIter

function Fibiter(n)
begin
    i := 0; j := 1;
    for k := 1 to n do
    begin
        j := j + i;   /* typical instruction */
        i := j - i;
    end;
    Fibiter := j;
end;

• The number of times the typical statement is executed is n.
• The calculation time of the algorithm is O(n).
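A direct C translation of Fibiter (the function name in lower case and the int return type are choices made here; overflow for large n is ignored):

/* Iterative Fibonacci computation, following the pseudocode above.
   The typical statement j = j + i is executed n times, so the time is O(n).
   usage: int f = fibiter(10); */
int fibiter(int n) {
    int i = 0, j = 1;
    for (int k = 1; k <= n; k++) {
        j = j + i;
        i = j - i;
    }
    return j;
}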
Example: Algorithm 1 of the beginning example

int maxSum = 0;
for (int i = 0; i < n; i++) {
    for (int j = i; j < n; j++) {
        int sum = 0;
        for (int k = i; k <= j; k++)
            sum += a[k];
        if (sum > maxSum)
            maxSum = sum;
    }
}

Select the typical statement sum += a[k].
=> The calculation time of the algorithm is evaluated as O(n³).
Questions?
