You are on page 1of 79

ALGORITHM ANALYSIS

COURSE CODE: COSC3094


CHAPTER ONE
Introduction and elementary data
structures

10/24/2021 Chapter 1 by: Kokeb A. 2


Introduction
A program is written in order to solve a problem.

A solution to a problem actually consists of two things:

A way to organize the data

Sequence of steps to solve the problem

• The way data are organized in a computer’s memory is said to be Data Structure and

the sequence of computational steps to solve a problem is said to be an algorithm.

• Therefore, a program is data structures + algorithms.

10/24/2021 Chapter 1 by: Kokeb A. 3


Introduction to Data Structures
• Abstraction: The first step to solve the problem is obtaining one’s own abstract view, or model,

of the problem.
This process of modeling is called abstraction.

The model defines an abstract view to the problem.

This implies that the model focuses only on problem related stuff and that a programmer tries

to define the properties of the problem.

These properties include:


The data which are affected and

The operations that are involved in the problem.


10/24/2021 Chapter 1 by: Kokeb A. 4
Data Structure
Is the structural representation of logical relationship between elements of data.

It affects the design of both the structural and function aspect of the program.

Classification of Data Structure


1. Primitive data structure:- basic data structure and they are directly operated upon by the machine

instruction. (integer, float, character)

2. Non primitive data structure: - is more sophisticated data structure emphasizing on structuring of a

group of homogeneous and heterogeneous data items.(array, structure, list, tree)

10/24/2021 Chapter 1 by: Kokeb A. 5


Abstract Data Types
ADT is an abstraction of a data structure.

ADT specifies:
What can be stored in the Abstract Data Type
What operations can be done on/by the Abstract Data Type.

e.g. Model of employees of an organization:

This ADT:
Stores employees with their relevant attributes and discarding irrelevant attributes.
Supports hiring, firing, retiring … operations.

10/24/2021 Chapter 1 by: Kokeb A. 6


A data structure is a language construct that the programmer has
defined in order to implement an abstract data type.
There are lots of formalized and standard Abstract data types such as
Stacks, Queues, Trees, etc.
Do all characteristics need to be modeled?
Not at all
It depends on the scope of the model
It depends on the reason for developing the model

10/24/2021 Chapter 1 by: Kokeb A. 7


Abstraction
Abstraction is a process of classifying characteristics as relevant and irrelevant
for the particular purpose at hand and ignoring the irrelevant ones.

How do data structures model the world or some part of the world?
The value held by a data structure represents some specific characteristic of the world

The characteristic being modeled restricts the possible values held by a data structure

The characteristic being modeled restricts the possible operations to be performed on


the data structure.

10/24/2021 Chapter 1 by: Kokeb A. 8


Algorithms
Is a well-defined computational procedure that takes some value or a set of
values as input and produces some value or a set of values as output.

Data structures model the static part of the world.

Algorithms model the dynamic part of the world.

An algorithm transforms data structures from one state to another state in
two ways:
An algorithm may change the value held by a data structure

An algorithm may change the data structure itself

10/24/2021 Chapter 1 by: Kokeb A. 9


Alg…
The quality of a data structure is related to its ability to successfully model the
characteristics of the world.

Similarly, the quality of an algorithm is related to its ability to successfully


simulate the changes in the world.

The quality of data structure and algorithms is determined by their ability to


work together well.

Correct data structures lead to simple and efficient algorithms

Correct algorithms lead to accurate and efficient data structures.


10/24/2021 Chapter 1 by: Kokeb A. 10
Properties of an algorithm
Finiteness: Algorithm must complete after a finite number of steps.

Definiteness: Each step must be clearly defined, having one and only one

interpretation. At each point in computation, one should be able to tell exactly


what happens next.

Sequence: Each step must have a unique defined preceding and succeeding step.
The first step (start step) and last step (halt step) must be clearly noted.

Feasibility: It must be possible to perform each instruction.

Correctness: It must compute correct answer for all possible legal inputs.

10/24/2021 Chapter 1 by: Kokeb A. 11


Properties of an algorithm Cont.…
Language Independence: It must not depend on any one programming language.

Completeness: It must solve the problem completely.

Effectiveness: It must be possible to perform each step exactly and in a finite amount of
time.

Efficiency: It must solve with the least amount of computational resources such as time
and space.

Generality: Algorithm should be valid on all possible inputs.

Input/output: There must be a specified number of input values, and one or more result
values.
10/24/2021 Chapter 1 by: Kokeb A. 12
Algorithm Analysis Concepts
 It’s a process of predicting the resource requirement of algorithms in a given environment. (Time and Space
required)

 The main resources are:


 Running Time

 Memory Usage

 Communication Bandwidth

 Running time is usually treated as the most important since computational time is the most precious
resource in most problem domains.

 There are two approaches to measure the efficiency of algorithms:


 Empirical: Programming competing algorithms and trying them on different instances.

 Theoretical: Determining the quantity of resources required mathematically (Execution time, memory space, etc.)
needed by each algorithm.Chapter 1
10/24/2021 by: Kokeb A. 13
Algorithm Analysis Concepts Cont…
It is difficult to use actual clock-time as a consistent measure of an algorithm’s
efficiency, because clock-time can vary based:
Specific processor speed

Current processor load

Specific data for a particular run of the program


Input Size

Input Properties

Operating Environment

• Accordingly, we can analyze an algorithm according to the number of operations


required, rather than according to an absolute amount of time involved.
10/24/2021 Chapter 1 by: Kokeb A. 14
Complexity Analysis
Is the systematic study of the cost of computation, measured either in time units or in
operations performed, or in the amount of storage space required.

There are two things to consider:


 Time Complexity: Determine the approximate number of operations required to solve a problem of size n.

 Space Complexity: Determine the approximate memory required to solve a problem of size n.

Complexity analysis involves two distinct phases:


 Algorithm Analysis: Analysis of the algorithm or data structure to produce a function T (n) that describes the
algorithm in terms of the operations performed in order to measure the complexity of the algorithm.

 Order of Magnitude Analysis: Analysis of the function T (n) to determine the general complexity category to
which it belongs.

10/24/2021 Chapter 1 by: Kokeb A. 15


Analysis Rules
We assume an arbitrary time unit.

Execution of one of the following operations takes time 1:


Assignment Operation

Single Input/output Operation

Single Boolean Operations

Single Arithmetic Operations

Function Return

Running time of a selection statement (if, switch) is the time for the condition evaluation +

the maximum of the running times for the individual clauses in the selection.

10/24/2021 Chapter 1 by: Kokeb A. 16


Analysis Rules Cont…
Loops: Running time for a loop is equal to the running time for the
statements inside the loop * number of iterations.
The total running time of a statement inside a group of nested loops is the running
time of the statements multiplied by the product of the sizes of all the loops.
For nested loops, analyze inside out.

Always assume that the loop executes the maximum number of iterations possible.

Running time of a function call is 1 for setup + the time for any parameter
calculations + the time required for the execution of the function body.

Chapter 1
10/24/2021 17
by: Kokeb A.
Eg.
Examples:
1. int count()
{
int k=0;
cout<< “Enter an integer”; T (n)= 1+1+1+(1+n+1+n)+2n+1 = 4n+6 =
cin>>n; O(n)
for (i=0;i<n;i++)
k=k+1;
return 0;
}

Time Units to Compute


-------------------------------------------------
1 for the assignment statement: int k=0
1 for the output statement.
1 for the input statement.
In the for loop:
1 assignment, n+ 1 test, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------

10/24/2021 Chapter 1 by: Kokeb A. 18


Eg.
int total(int n)
{
int sum=0;
for (int i=1;i<=n;i++) T (n)= 1+ (1+n+1+n)+2n+1 = 4n+4 = O(n)
sum=sum+1;
return sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int sum=0
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------

10/24/2021 Chapter 1 by: Kokeb A. 19


Formal Approach to Analysis
For Loops: Formally
For loop translates to a summation. The index and bounds of the summation
are the same as the index and bounds of the for loop.

There is 1 addition per iteration of the loop, hence N additions in


total.

10/24/2021 Chapter 1 by: Kokeb A. 20


Formal Approach to Analysis Cont…
Nested Loops: Formally
Nested for loops translate into multiple summations, one for each for loop.

Again, count the number of additions. The outer summation is for the
outer for loop

10/24/2021 Chapter 1 by: Kokeb A. 21


Formal Approach to Analysis Cont…
Consecutive Statements: Formally
Add the running times of the separate blocks of your code

Conditionals: Formally
If (test) s1 else s2: Compute the maximum of the running time for s1 and s2.

10/24/2021 Chapter 1 by: Kokeb A. 22


Formal Approach to Analysis Cont…

Example: Suppose we have hardware capable of executing 106 instructions per second.
How long would it take to execute an algorithm whose complexity function was:
T (n) = 2n2 on an input size of n= 108?
The total number of operations to be performed would be T (108):
T(108) = 2*(108)2 =2*1016
The required number of seconds required would be given by T(108)/106 so:
Running time =2* 1016 /106 = 2* 1010
The number of seconds per day is 86,400 so this is about 231,480 days (634 years).
Chapter 1
10/24/2021 23
by: Kokeb A.
Measures of Time
In order to determine the running time of an algorithm it is possible to define three functions
Tbest(n), Tavg(n) and Tworst(n) as the best, the average and the worst case running time of the
algorithm respectively.

Average Case (Tavg): The amount of time the algorithm takes on an "average" set of inputs.

Worst Case (Tworst): The amount of time the algorithm takes on the worst possible set of inputs.

Best Case (Tbest): The amount of time the algorithm takes on the smallest possible set of inputs.

We are interested in the worst-case time, since it provides a bound for all input – this is called the
“Big-Oh” estimate.

10/24/2021 Chapter 1 by: Kokeb A. 24


Asymptotic Analysis
Asymptotic analysis is concerned with how the running time of an algorithm
increases with the size of the input in the limit, as the size of the input increases
without bound.

There are five notations used to describe a running time function.

These are:
 Big-Oh Notation (O)

 Big-Omega Notation ()

 Theta Notation ()

 Little-o Notation (o)

 Little-Omega Notation ()

10/24/2021 Chapter 1 by: Kokeb A. 25


The Big-Oh Notation
• Big-Oh notation is a way of comparing algorithms and is used for computing the
complexity of algorithms; i.e., the amount of time that it takes for computer program to
run.

• It’s only concerned with what happens for a very large value of n.

• Therefore only the largest term in the expression (function) is needed.

• For example, if the number of operations in an algorithm is n2–n, n is insignificant


compared to n2 for large values of n.

• Hence the n term is ignored. Of course, for small 1 values of n, it may be important.
However, Big-Oh is mainly concerned with large values of n.
10/24/2021 Chapter 1 by: Kokeb A. 26
Big-Oh Notation Cont…
Formal Definition: f (n)= O (g (n)) if there exist c, k ∊ ℛ+ such that for all n≥ k, f
(n) ≤ c.g (n).

Examples: The following points are facts that you can use for Big-Oh problems:
1<=n for all n>=1

n<= n2 for all n>=1

2n <=n! for all n>=4

log2n<=n for all n>=2

n<=nlog2n for all n>=2

10/24/2021 Chapter 1 by: Kokeb A. 27


Big-Oh Notation Cont…
f(n)=10n+5 and g(n)=n. Show that f(n) is O(g(n)).

To show that f(n) is O(g(n)) we must show that constants c and k such that
f(n) <=c.g(n) for all n>=k

Or 10n+5<=c.n for all n>=k

Try c=15. Then we need to show that 10n+5<=15n

Solving for n we get: 5<5n or 1<=n.

So f(n) =10n+5 <=15.g(n) for all n>=1.

(c=15,k=1).

10/24/2021 Chapter 1 by: Kokeb A. 28


Big-Oh Notation Cont…
f(n) = 3n2 +4n+1. Show that f(n)=O(n2).
4n <=4n2 for all n>=1 and 1<= n2 for all n>=1
3n2 +4n+1<=3n2 +4n2 + n2 for all n>=1
<=8n2 for all n>=1

So we have shown that f(n)<=8n2 for all n>=1


Therefore, f (n) is O(n2) (c=8,k=1)

10/24/2021 Chapter 1 by: Kokeb A. 29


Typical Orders
Here is a table of some typical cases.

This uses logarithms to base 2, but these are simply proportional to


logarithms in other base.
N O(1) O(log n) O(n) O(n log n) O(n2) O(n3)
1 1 1 1 1 1 1
2 1 1 2 2 4 8
4 1 2 4 8 16 64
8 1 3 8 24 64 512
16 1 4 16 64 256 4,096
1024 1 10 1,024 10,240 1,048,576 1,073,741,824

10/24/2021 Chapter 1 by: Kokeb A. 30


Typical Orders Cont…

 Demonstrating that a function f(n) is big-O of a function g(n) requires that we find specific

constants c and k for which the inequality holds (and show that the inequality does in fact hold).

 Big-O expresses an upper bound on the growth rate of a function, for sufficiently large values of

n.

 An upper bound is the best algorithmic solution that has been found for a problem.

“What is the best that we know we can do?”

 Exercise:

 f(n) = (3/2)n2+(5/2)n+3

 Show that f(n)= O(n2)

In simple words, f (n) =O(g(n))


10/24/2021 Chapter 1
means that the growth rate of f(n)by:isKokeb
lessA.than or equal to g(n). 31
Big-O Theorems
For all the following theorems, assume that f(n) is a function of n and that k
is an arbitrary constant.

Theorem 1: k is O(1)

Theorem 2: A polynomial is O(the term containing the highest power of n).

Polynomial’s growth rate is determined by the leading term


If f(n) is a polynomial of degree d, then f(n) is O(nd)

In general, f(n) is big-O of the dominant term of f(n).

10/24/2021 Chapter 1 by: Kokeb A. 32


Big-O Theorems Cont…

Theorem 3: k*f(n) is O(f(n))


Constant factors may be ignored
 E.g. f(n) =7n4+3n2+5n+1000 is O(n4)
Theorem 4(Transitivity): If f(n) is O(g(n))and g(n) is O(h(n)),
then f(n) is O(h(n)).
Theorem 5: For any base b, logb(n) is O(logn).
All logarithms grow at the same rate
 logbn is O(logdn) b, d > 1

10/24/2021 Chapter 1 by: Kokeb A. 33


Big-Omega Notation
Just as O-notation provides an asymptotic upper bound on a function,  notation provides an
asymptotic lower bound.

Formal Definition: A function f(n) is  ( g (n)) if there exist constants c and k ∊ ℛ+ such that

f(n) >=c. g(n) for all n>=k.

f(n)= ( g (n)) means that f(n) is greater than or equal to some constant multiple of g(n) for all
values of n greater than or equal to some k.

Example: If f(n) =n2, then f(n)= ( n)

In simple terms, f(n)= ( g (n)) means that the growth rate of f(n) is greater than or equal to g(n).

10/24/2021 Chapter 1 by: Kokeb A. 34


Theta Notation
• A function f (n) belongs to the set of (g(n)) if there exist positive
constants c1 and c2 such that it can be sandwiched between c1.g(n)
and c2.g(n), for sufficiently large values of n.

• Formal Definition: A function f (n) is (g(n)) if it is both O( g(n) ) and


( g(n) ). In other words, there exist constants c1, c2, and k >0 such
that c1.g (n)<=f(n)<=c2. g(n) for all n >= k

• If f(n)= (g(n)), then g(n) is an asymptotically tight bound for f(n).

10/24/2021 Chapter 1 by: Kokeb A. 35


Theta Notation Cont…
 In simple terms, f(n)= (g(n)) means that f(n) and g(n) have the same rate of growth.

 Example:

 If f(n)=2n+1, then f(n) = (n)

 f(n) =2n2 then

 f(n)=O(n4)

 f(n)=O(n3)

 f(n)=O(n2)

 All these are technically correct, but the last expression is the best and tight one. Since 2n2 and n2 have

the same growth rate, it canChapter


10/24/2021
be written
1
as f(n)= (n2). by: Kokeb A. 36
Little-o Notation
Big-Oh notation may or may not be asymptotically tight, for example:

2n2 = O(n2)=O(n3)

f(n)=o(g(n)) means for all c>0 there exists some k>0 such that f(n)<c.g(n) for all
n>=k. Informally, f(n)=o(g(n)) means f(n) becomes insignificant relative to g(n) as n
approaches infinity.

Example: f(n)=3n+4 is o(n2)

In simple terms, f(n) has less growth rate compared to g(n).

g(n)= 2n2 g(n) =o(n3), O(n2), g(n) is not o(n2).


10/24/2021 Chapter 1 by: Kokeb A. 37
Little-Omega ( notation)
• Little-omega () notation is to big-omega () notation as little-o
notation is to Big-Oh notation. We use  notation to denote a lower
bound that is not asymptotically tight.

• Formal Definition: f(n)= (g(n)) if there exists a constant no>0 such


that 0<= c. g(n)<f(n) for all n>=k.

• Example: 2n2 = (n) but it’s not (n2).

10/24/2021 Chapter 1 by: Kokeb A. 38


Relational Properties of the Asymptotic Notations
Transitivity
if f(n)= (g(n)) and g(n)= (h(n)) then f(n)= (h(n)),

if f(n)=O(g(n)) and g(n)= O(h(n)) then f(n)=O(h(n)),

if f(n)= (g(n)) and g(n)= (h(n)) then f(n)= (h(n)),

if f(n)=o(g(n)) and g(n)= o(h(n)) then f(n)=o(h(n)), and

if f(n)= (g(n)) and g(n)= (h(n)) then f(n)= (h(n)).

10/24/2021 Chapter 1 by: Kokeb A. 39


Relational Properties of the Asymptotic Notations
Symmetry
f(n)= (g(n)) if and only if g(n)= (f(n)).

Transpose symmetry
f(n)=O(g(n)) if and only if g(n)= (f(n)),

f(n)=o(g(n)) if and only if g(n)= (f(n)).

Reflexivity
f(n)= (f(n)),

f(n)=O(f(n)),

f(n)= (f(n)).

10/24/2021 Chapter 1 by: Kokeb A. 40


Review of elementary data structures
Stacks, Queues and Lists are elementary data structures fundamental to implementing
any program requiring data storage and retrieval.
ADT Mathematical Model Operations
Stack Push(S, Data)
Pop(S)
Makenull(S)
Empty(S)
Top(S)

The mathematical model of a Stack is LIFO. Data placed in the Stack is accessed through one path.
The standard operations on Stack are:
Push(s, data): push ‘data’ in stack ‘s’
POP(S): withdraw next available data from stack ‘S’
Makenull(S): clear stack ‘S’ of all data
Empty(S):return ‘true’ if Stack ‘s’ is empty and ‘false’ otherwise.
Top(S): view the next available data on stack ‘S’
 All operation can be implemented in O(1) time/

10/24/2021 Chapter 1 by: Kokeb A. 41


Review of elementary data structures
ADT Mathematical Model Operations

Queue Enqueue(Q)
Dequeue(Q)
Front(Q)
Rear(Q)
Makenull(Q)

The mathematical model of Queue is FIFO. Data placed in the Queue goes through one path, while data
withdrawn goes t;hrough another path. The next available data is the first data placed in the queue.
The standard operations on queue are:
Enqueue(Q, data): put ‘data’ in the rear path of queue ‘Q’
Dequeue(Q): withdraw next available data from front path of queue ‘Q’
Front(Q): view the next available data on queue ‘Q’
Rear(Q): view the last data entered on queue ‘Q’
Makenull(Q): clear queue ‘Q’ of all data

 All operations can be implemented in O(1) time.

10/24/2021 Chapter 1 by: Kokeb A. 42


Review of elementary data structures
TREES:
A tree is a finite set of one or more nodes such that:
(i) there is a specially designated node called the root;
(ii) the remaining nodes are partitioned into n>=0 disjoint sets Tl, ...,Tn where
each of these sets is a tree. Tl, ... , Tn are called the subtrees of the root.

10/24/2021 Chapter 1 by: Kokeb A. 43


TREES
Binary Tree:

A binary tree is an important type of tree structure which occurs very often.

It is characterized by the fact that any node can have at most two children, i.e. there is
no node with degree greater than two.

For binary trees we distinguish between the subtree on the left and on the right,
whereas for trees the order of the subtrees was irrelevant.

Furthermore a binary tree is allowed to have zero nodes while a tree must have at least
one node.

Thus a binary tree is really a different object than a tree.

10/24/2021 Chapter 1 by: Kokeb A. 44


Binary Tree

A binary tree is a finite set of nodes which is either empty or consists


of a root and two disjoint binary trees called the left and right
subtrees.

10/24/2021 Chapter 1 by: Kokeb A. 45


Heap and Heap Sort(Assignment1)
In this section we study a way of structuring data which permits one to insert elements

into a set and also to find the largest element efficiently.

A data structure which provides for these two operations is called a priority queue.

Many algorithms need to make use of priority queues and so an efficient way to

implement these operations will be very useful.

We might first consider using a queue since inserting new elements would be very

efficient.

But finding the largest element would necessitate a scan of the entire queue.

10/24/2021 Chapter 1 by: Kokeb A. 46


Heap cont…
A second suggestion would be to use a sorted list which is stored sequentially.

But an insertion could require moving all of the items in the list.

What we want is a data structure which allows both operations to be done

efficiently.

A heap is a complete binary tree with the property that the value at each

node is at least as large as the values at its children (if they exist).

The largest element is at the root of the heap.


10/24/2021 Chapter 1 by: Kokeb A. 47
Heap cont…

Heap has the following two properties:


The value of each node is greater than or equal to the
values of both of its children.
The tree is perfectly balanced, and the leaves in the last
level are all in the leftmost positions.

10/24/2021 Chapter 1 by: Kokeb A. 48


Which one of the following is correct heap?

10/24/2021 Chapter 1 by: Kokeb A. 49


Heap cont…
Heaps can be easily implemented using arrays. For example, the
second heap in Figure a above can be stored using the following array:

22 14 17 5 13 16 2 2

The two conditions for defining a heap ensure that the array has no
gaps (i.e. spaces where there could be a node but isn’t).

10/24/2021 Chapter 1 by: Kokeb A. 50


Heap cont…
What is the use of a heap? One common application is to implement
a priority queue.

Recall that priority queues are required to dequeue the element with
the highest priority.

In a heap, the highest valued element is guaranteed to be at the root


node, so accessing this element is very fast.

10/24/2021 Chapter 1 by: Kokeb A. 51


Enqueueing and dequeueing operations can be specified as follows:

heapEnqueue(el):
Put el at the end of the heap (i.e. in the next free space)
While el is not the root and el > parent(el)
Swap el with its parent

heapDequeue():
Extract the element from the root
Move the element at the last leaf to the root
p = the root
While p is not a leaf and p < any of its children
Swap p with its larger child

10/24/2021 Chapter 1 by: Kokeb A. 52


Heap cont…
In the heapEnqueue() algorithm, the new item is added directly after the last leaf
in the heap.

Because this may violate the first heap condition, the new node is continually
swapped with its parent until the heap condition is satisfied again.

In the heapDequeue() operation, the root element (i.e. the one with the highest
priority) is extracted, then the last leaf node is moved to the root.

Because this will also violate the first heap condition, it is continually swapped
with its larger child until it the condition is satisfied again.

10/24/2021 Chapter 1 by: Kokeb A. 53


Heap cont…
If the elements are distinct, then the root contains the largest item.

The relation greater than or equal to may be reversed so that the


parent node contains a value as small as or smaller than its children.

In this case the root contains the smallest element.

But clinging to historical tradition we will assume that the larger


values are closer to the root.

10/24/2021 Chapter 1 by: Kokeb A. 54


Heap cont…
It is possible to take any binary tree containing values for which an ordering
exists and move these values around so that the shape of the tree is
preserved and the heap property is satisfied.

However, it is more often the case that we are given n items, say n integers
and we are free to choose whatever shape binary tree seems most desirable.

In this case the complete binary tree is chosen and represented sequentially.

This is why in the definition of heap we insist that a complete binary tree is
used.
10/24/2021 Chapter 1 by: Kokeb A. 55
Heap cont…

A binary tree and a heap that preserves the tree's shape

10/24/2021 Chapter 1 by: Kokeb A. 56


Heap Sort
• The most well known example of the use of a heap arises in its application
to sorting.

• A conceptually simple sorting strategy is one which continually removes the


maximum value from the remaining unsorted elements.

• A straightforward implementation of this idea leads to an algorithm whose


worst case time is O(n2).

• A heap allows the maximum element to be found and deleted in O(log n)


time thus yielding a sorting method whose worst case time is O(n log n ).
10/24/2021 Chapter 1 by: Kokeb A. 57
Graph
Graphs can be represented using either static or dynamic data
structures.

One common representation is to use a two-dimensional array called


an adjacency matrix.

An adjacency matrix of a graph G=(V, E) is a |V| by |V| array where


each element aij of the array is defined as

10/24/2021 Chapter 1 by: Kokeb A. 58


Graph Cont…

Notice that this matrix is symmetrical about the leading diagonal.

This is because the edges have no direction: an edge from node A to node B is
the same as an edge from node B to node A.

If the graph were a digraph then this would not be the case.

Adjacency matrices can also be used to represent multigraphs, in which case


each array element will store the number of edges connecting the two vertices.

To represent a weighted graph, the array elements would contain the weights
of their respective edges.

10/24/2021 Chapter 1 by: Kokeb A. 59


Graph cont…

Examples of graphs: (a-c) simple graphs; (d) a complete graph; (e) a multigraph;
(f) a pseudograph; (g) a digraph; (h) a weighted graph

10/24/2021 Chapter 1 by: Kokeb A. 60


Graph cont…
• An example of adjacency matrix representation is shown in following
Figure.

Graph representations using matrices

10/24/2021 Chapter 1 by: Kokeb A. 61


Graph cont…
• An alternative array representation, also shown in the above Figure, is the incidence matrix.

• An incidence matrix of graph G=(V, E) is a |V| by |E| array, defined as

• Finally, graphs can also be implemented using dynamic data structures.

• Each GraphVertex consists of a name and a linked list of edges. Each EdgeNode contains a
pointer to a GraphNode (i.e. an edge) and a link to the next edge emanating from this
vertex.

• Therefore there is no limit to how many edges a vertex can have.

10/24/2021 Chapter 1 by: Kokeb A. 62


Set Representation
• A set can representation using list notation.
• The members of the set are listed explicitly.
• If the set has an infinite number of members then an Ellipsis can be
used.
• For example, S={1,3,5,7} is a finite set, where A={1,3,5,7….} is an
infinite set represented in a list form.

Chapter 1
10/24/2021 63
by: Kokeb A.
Set operation
• There are many set operations that are quite useful for algorithm
study. Some of the set operations are as follows.
• 1. Intersection:- this operation returns the common elements of a set.
• Example. A={1,3,5,7}
• B={3}
A intersection B={3}.

Chapter 1
10/24/2021 64
by: Kokeb A.
Hashing
• Hashing is a technique used for storing and retrieving keys in a rapid
manner.
• It is a concept of distributing keys in an array called a hash table using
predetermined function called a hash function.

Chapter 1
10/24/2021 65
by: Kokeb A.
.

Hash Functions

 Mid-Square
 Division (Modulus)
 Folding
 Extraction:

10/24/2021 Chapter 1 by: Kokeb A. 66


Mid-Square Function:
• The mid-square hash function determines the home bucket for a key
• by squaring the key and then using an appropriate number of bits from
• the middle of the square to obtain the bucket address.
• ex)
• We assume the key = 8125 , and hashing table has 1000
buckets.
(8125) 2  66015625
so address is “156” or “015”

Chapter 1
10/24/2021 67
by: Kokeb A.
Division
• The home bucket is obtained by using the modulo (%) operator. The
• key x is divided by some number M, and the remainder is used as the
• home bucket for x.
• f(x) = x mod M
ex)

Chapter 1
10/24/2021 68
by: Kokeb A.
Folding
In this method the key k is partitioned into several parts, all but
possibly the last being of the same length. These partitions are then
added together to obtain the hash address for k.

There are two ways of carrying out this addition. (1) Shift (2) Boundary
ex1)
We assume the key = 12320324111220 , and hashing table has 1000 buckets.
123|203|241|112|20
(1) 123+203+241+112+020=699
(2) 123+302+241+211+020=897

69
Folding
ex2)

(1) Shift (2) Boundary


70
Extraction:
• Extraction: In the extraction method, only a part of the key is used to generate the
address.

For the key 123456789, this method might use the first four digits (1234), or the last
four (6789), or the first two and last two (1289).

Extraction methods can be satisfactory so long as the omitted portion of the key is not
significant in distinguishing the keys.

For example, at MTU University many student ID numbers begin with the letters
“SCIR”, so the first four letters can be omitted and the following numbers used to
generate the key using one of the other hash function techniques.
10/24/2021 Chapter 1 by: Kokeb A. 71
Collision Resolution
If the hash function being used is not a perfect hash function (which is
usually the case), then the problem of collisions will arise.

Collisions occur when two keys hash to the same address.

The chance of collisions occurring can be reduced by choosing the right


hash function, or by increasing the size of the table, but it can never be
completely eliminated.

For this reason, any hashing system should adopt a collision resolution
strategy.
10/24/2021 Chapter 1 by: Kokeb A. 72
Collision Handling
Linear Probing(Linear Open Addressing)
When the overflow occurs, we search the hash table buckets in the
order (H(x)+1, H(x)+2…), until the hash table is full or reaching the
first unfilled bucket.

ex) 0 10 0 10 0 10
1 1 1
Insert 55 Insert 25
2 75 2 75 2 75
3 3 55 3 55
4 4 4 25
5 43 5 43 5 43

6 6 6
73
Chaining

 In chaining, each address in the table refers to a list, or chain, of data values.
 If a collision occurs the new data is simply added to end of the chain.
 Provided that the lists do not become very long, chaining is an efficient technique.
0 10 0 10
Ex.
1 1
Insert 25
Insert 55
2 75 2 75
55
3 3
25
4 4
5 43 5 43

6 6
74
Quadratic Probing

When the overflow occurs, we search the hash table buckets by using

ex) Key k, hash function H


 22
1st search : H(k)
12

H(x), overflow 2nd search : (H(k)+12)%b


12

3th search : (H(k)-12)%b


 22
4th search : (H(k)+22)%b
5th search : (H(k)-22)%b
Nth search : (H(k)±((B-1)/2)2)%b

75
Question1:

Suppose the hashing function f(x) = x mod 11 is used to hash a list of input
value (in the given order) into a hash table implemented by the array
bucked[0],bucket[1],…bucket[10]. The inputs are
10,100,32,45,126,3,24,200,and 53. Each bucket can hold only one number.
Overflow is resolved by quadratic probing, which examines buckets f(x),
(f(x)+ ) mod 11,and (f(x)- ) mod 11, i=1 to 5.Show2 the final contents in
bucket[0] to bucket[10].
2
i
i

76
Ans: 0 32
1 100
2 45
3 3
4
5 126
6 24
7
8 53
9 200
10 10

77
Question2:

For each hash table below, show the result of


inserting the following sequence of key values, in
the given order , into an initially empty hash table
of that type: 26,17,20,9,34,32,15,21.
In both cases , assume a hash table size of 11 and
a hash function h(x) = x mod 11.
(1)Static hash table that uses chaining
(2)Hash table that uses linear probing

78
(1) (2)
0 0 32
1 →34 1 34

2 2 21
3
3
4 26
4 →26→15
5 15
5 →17
6 17
6
7
7
8
8
9 20
9 →20→9 10 9
10 →32→21 79

You might also like