
Graduate Studies Program

Term: Fall 2022/2023

Computing

Lecture 2
Linear Dynamic Data Structures (Draft)
These lecture notes have been compiled from different resources;
I would like to thank the authors who made them available for use.

1
Lecture Outline

✓Dynamic Data Structures


➢Linked Lists
➢Stacks
➢Queues
✓Hash Tables
✓Hashing and Searching Techniques

2
Dynamic Data Structures
✓Dynamic data structures
➢Grow and shrink at execution time.
✓Linear Data Structures
➢Linked Lists:
✓“Lined up in a row”
✓Insertions and removals can occur anywhere in the list
➢Stacks: Insertions and removals only at top
✓Push and pop
➢Queues: Insertions made at back, removals from front
✓Non-Linear Data Structures
➢Trees, including binary trees:
✓Facilitate high-speed searching and sorting of data
✓Efficient elimination of duplicate items

3
Structures
◼ Data of various types or multiple data items of the same type
can be attributed to a single object.
◼ Structured programming languages provide a construct that
allows us to give a collective name to the object that we create.
◼ In C/C++ this construct is known as a structure and is
introduced with the keyword struct.
◼ For example, a person can possess both a name and an age,
which would be stored in different kinds of variables -- a char
array for the name and probably an int for the age.
◼ In mathematics, a point on a graph would be represented by
numbers for the coordinates, both of which could be integers or
floats.
◼ For example, we could use xcoord and ycoord for the
coordinates of a point.

4
Structures
◼ The definition below groups four characteristics of a rectangle
into a single struct called rectangle.

struct rectangle {
float L; /* length */
float W; /* width */
float A; /* Area */
float P; /* Perimeter */
};

◼ This definition creates a new programmer-defined data type; the
definition itself does not reserve any memory.
◼ Consequently, no data can be stored in rectangle itself.
◼ To get memory allocated, we need to declare a variable of
this type, such as: struct rectangle rect;

5
Dynamic Memory Allocation
✓Self-Referential Class
➢Contains a reference member to an object of the same class type
✓E.g.:
class Node
{
    private int data;  // data stored in this node
    private Node next; // self-reference to the next node
}
✓Reference can be used to link objects of the same type together
✓Dynamic data structures require dynamic memory allocation
➢Ability to obtain memory when needed
➢Release memory when not needed any more
➢Uses new operator
✓Ex: Node nodeToAdd = new Node(10);

[Figure: two self-referential class objects (nodes), with data values 15 and 10, linked together through their next references.]
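The linked nodes in the figure above can be sketched in Java; the Node class follows the slide, while the buildChain helper is an assumption added for illustration:

```java
// A self-referential class: each Node holds data and a reference
// to another object of the same type.
class Node {
    int data;
    Node next; // self-reference: links this node to the next one

    Node(int data) {
        this.data = data;
        this.next = null; // no next node yet
    }
}

class NodeDemo {
    // Build the two-node chain from the figure: 15 -> 10
    static Node buildChain() {
        Node first = new Node(15);
        Node second = new Node(10);
        first.next = second; // link the first node to the second
        return first;
    }
}
```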


6
Linked Lists
✓ Linked List:
➢ Linear collection of self-referential nodes connected by links
✓Nodes: class objects of linked-lists
✓Programs access linked lists through a reference to first node
✓Subsequent nodes accessed by link-reference members
✓Last node’s link set to null to indicate end of list
✓Nodes can hold data of any type
✓Nodes created dynamically
[Figure: a linked list storing the characters 'H', 'e', …, 'o'; firstNode is the reference to the first node, and the last node's link is null.]

7
Linked Lists
✓Linked lists - similar to arrays, however:
➢Arrays have a fixed size
➢Linked lists have no limit to size
✓More nodes can be added as program executes
✓Insert At Front
[Figure: (a) list 7 → 11, with a new ListNode containing 12; (b) the new node's link is aimed at the old first node, and firstNode now references the new node. Result: 12 → 7 → 11.]

8
Linked Lists
◼ Insert At Back

[Figure: (a) list 12 → 7 → 11, with a new ListNode containing 5; (b) the old last node's link is aimed at the new node, and lastNode now references it. Result: 12 → 7 → 11 → 5.]
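The two insert operations above can be sketched with a minimal singly linked list (class and method names are assumptions for this sketch):

```java
// Minimal singly linked list supporting insertAtFront and insertAtBack,
// mirroring the two figures above.
class ListNode {
    int data;
    ListNode next;
    ListNode(int data) { this.data = data; }
}

class LinkedListSketch {
    ListNode firstNode; // reference to first node
    ListNode lastNode;  // reference to last node

    // Insert at front: aim the new node at the old first node,
    // then make firstNode reference the new node.
    void insertAtFront(int value) {
        ListNode node = new ListNode(value);
        if (firstNode == null) {        // empty list
            firstNode = lastNode = node;
        } else {
            node.next = firstNode;
            firstNode = node;
        }
    }

    // Insert at back: aim the old last node at the new node,
    // then make lastNode reference the new node.
    void insertAtBack(int value) {
        ListNode node = new ListNode(value);
        if (lastNode == null) {         // empty list
            firstNode = lastNode = node;
        } else {
            lastNode.next = node;
            lastNode = node;
        }
    }

    // Collect values front-to-back for inspection.
    String toStringChain() {
        StringBuilder sb = new StringBuilder();
        for (ListNode n = firstNode; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.data);
        }
        return sb.toString();
    }
}
```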

9
Linked Lists
◼ Remove From Front

[Figure: (a) list 12 → 7 → 11 → 5; (b) firstNode is advanced to the second node (7), and the removed node (12) is returned as removeItem. Result: 7 → 11 → 5.]

10
Linked Lists
◼ Remove From Back

[Figure: (a) list 12 → 7 → 11 → 5; (b) lastNode is moved back to the next-to-last node (11), whose link is set to null; the removed node (5) is returned as removeItem. Result: 12 → 7 → 11.]
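The two remove operations above can be sketched as follows (names are assumptions; note that removing from the back of a singly linked list requires walking to the next-to-last node, so it is O(n)):

```java
// Minimal singly linked list supporting removal from both ends,
// mirroring the two figures above.
class SinglyLinkedList {
    static class ListNode {
        int data;
        ListNode next;
        ListNode(int data) { this.data = data; }
    }

    ListNode firstNode, lastNode;

    void insertAtBack(int value) {
        ListNode node = new ListNode(value);
        if (lastNode == null) {
            firstNode = lastNode = node;
        } else {
            lastNode.next = node;
            lastNode = node;
        }
    }

    // Remove from front: advance firstNode to the second node
    // and return the removed value.
    int removeFromFront() {
        ListNode removeItem = firstNode;
        firstNode = firstNode.next;
        if (firstNode == null) lastNode = null; // list became empty
        return removeItem.data;
    }

    // Remove from back: walk to the next-to-last node, null out its
    // link, and return the removed value.
    int removeFromBack() {
        ListNode removeItem = lastNode;
        if (firstNode == lastNode) {            // single node
            firstNode = lastNode = null;
        } else {
            ListNode n = firstNode;
            while (n.next != lastNode) n = n.next;
            n.next = null;
            lastNode = n;
        }
        return removeItem.data;
    }
}
```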

11
Stacks
◼ A stack is an Abstract Data Type (ADT), commonly used in most
programming languages.
◼ Stack – a special version of a linked list:
◼ Last-in, first-out (LIFO) data structure:
◼ Takes and releases new nodes only at top
◼ Stack ADT allows all data operations at one end only. At any
given time, we can only access the top element of a stack.
◼ A stack supports two primary operations
◼ Push: adds new entry (node) to top of stack
◼ Pop: removes top entry (node) from stack
◼ Can be used for:
◼ Storing return addresses
◼ Storing local variables
◼ …

12
PUSH Operation
◼ Push operation involves a series of steps −
◼ Step 1 − Check whether the stack is full.
◼ Step 2 − If the stack is full, produce an overflow error and exit.
◼ Step 3 − If the stack is not full, increment top to point to the next
empty space.
◼ Step 4 − Add the data element to the stack location where top is
pointing.
◼ Step 5 − Return success.
◼ A simple algorithm for Push operation can be derived as follows:
begin procedure push: stack, data
   if stack is full
      return overflow
   endif
   top ← top + 1
   stack[top] ← data
end procedure

13
Pop operation
◼ A Pop operation may involve the following steps:
◼ Step 1 − Check whether the stack is empty.
◼ Step 2 − If the stack is empty, produce an underflow error and exit.
◼ Step 3 − If the stack is not empty, access the data element at
which top is pointing.
◼ Step 4 − Decrease the value of top by 1.
◼ Step 5 − Return success.

begin procedure pop: stack
   if stack is empty
      return underflow
   endif
   data ← stack[top]
   top ← top - 1
   return data
end procedure
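The push/pop pseudocode above maps directly onto an array-based stack; a minimal Java sketch (class name and fixed capacity are assumptions):

```java
// Array-based stack following the push/pop pseudocode above.
// top == -1 means the stack is empty.
class ArrayStack {
    private final int[] stack;
    private int top = -1;

    ArrayStack(int capacity) { stack = new int[capacity]; }

    boolean isFull()  { return top == stack.length - 1; }
    boolean isEmpty() { return top == -1; }

    // Push: check for overflow, increment top, store the element.
    boolean push(int data) {
        if (isFull()) return false;   // overflow
        top = top + 1;
        stack[top] = data;
        return true;
    }

    // Pop: check for underflow, read the top element, decrement top.
    Integer pop() {
        if (isEmpty()) return null;   // underflow
        int data = stack[top];
        top = top - 1;
        return data;
    }
}
```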

14
Queues
◼ Queue is an abstract data structure, somewhat similar to
Stacks.
◼ Unlike stacks, a queue is open at both its ends. One end is
always used to insert data (enqueue) and the other is used to
remove data (dequeue).
◼ Queue follows First-In-First-Out methodology, i.e., the data
item stored first will be accessed first.
◼ Queue: First-in, first-out (FIFO) data structure
◼ Nodes added to tail, removed from head

◼ Many computer applications:


◼ Printer spooling
◼ Information packets on networks
◼ …
15
Basic Operations
◼ enqueue() − add (store) an item to the queue.
◼ dequeue() − remove (access) an item from the queue.
◼ The following steps should be taken to enqueue (insert) data into a
queue:
◼ Step 1 − Check if the queue is full.
◼ Step 2 − If the queue is full, produce overflow error and exit.
◼ Step 3 − If the queue is not full, increment rear pointer to point to
the next empty space.
◼ Step 4 − Add data element to the queue location where the rear is
pointing.
◼ Step 5 − Return success.

16
Algorithm for enqueue operation
procedure enqueue(data)
   if queue is full
      return overflow
   endif
   rear ← rear + 1
   queue[rear] ← data
   return true
end procedure

◼ The following steps are taken to perform the dequeue operation:
◼ Step 1 − Check if the queue is empty.
◼ Step 2 − If the queue is empty, produce underflow error and exit.
◼ Step 3 − If the queue is not empty, access the data where front is pointing.
◼ Step 4 − Increment front pointer to point to the next available data
element.
◼ Step 5 − Return success.

17
Algorithm for dequeue operation
procedure dequeue
   if queue is empty
      return underflow
   endif
   data ← queue[front]
   front ← front + 1
   return data
end procedure
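A runnable sketch of the enqueue/dequeue pseudocode above (names are assumptions; a circular buffer is used so freed slots can be reused, a detail the pseudocode omits):

```java
// Array-based queue following the enqueue/dequeue pseudocode above.
class ArrayQueue {
    private final int[] queue;
    private int front = 0; // index of the next element to dequeue
    private int rear = -1; // index of the most recently enqueued element
    private int count = 0;

    ArrayQueue(int capacity) { queue = new int[capacity]; }

    boolean isFull()  { return count == queue.length; }
    boolean isEmpty() { return count == 0; }

    // Enqueue: check for overflow, advance rear, store at rear.
    boolean enqueue(int data) {
        if (isFull()) return false;            // overflow
        rear = (rear + 1) % queue.length;      // wrap around the array
        queue[rear] = data;
        count++;
        return true;
    }

    // Dequeue: check for underflow, read at front, advance front.
    Integer dequeue() {
        if (isEmpty()) return null;            // underflow
        int data = queue[front];
        front = (front + 1) % queue.length;    // wrap around the array
        count--;
        return data;
    }
}
```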

18
Hash Tables and Dictionaries

A dictionary consists of key/element pairs in which the key is used to look up the element.

Example                     Key              Element
English Dictionary          Word             Definition
Student Records             Student Number   Rest of record: Name, …
Symbol Table in Compiler    Variable Name    Variable's Address in Memory

Ordered Dictionary: elements stored in sorted order by key
Unordered Dictionary: elements not stored in sorted order
19
Dictionary as a Function
Given a key, return an element:
Key (domain: the type of the keys) → Element (range: the type of the elements)

20
Hashing
◼ Hashing is a technique that converts a range of key values into a
range of indexes of an array (the hash table).
◼ We use the modulo operator to compute the index: index = key % (table size).
◼ Consider a hash table of size 20, into which the following items
are to be stored. Items are in the (key, value) format:
(1,20), (2,70), (42,80), (4,25), (12,44), (14,32), (17,11), (13,78), (37,98)

Sr. No.   Key   Hash           Array Index
1         1     1 % 20 = 1     1
2         2     2 % 20 = 2     2
3         42    42 % 20 = 2    2
4         4     4 % 20 = 4     4
5         12    12 % 20 = 12   12
6         14    14 % 20 = 14   14
7         17    17 % 20 = 17   17
8         13    13 % 20 = 13   13
9         37    37 % 20 = 17   17
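The hash computation in the table above can be checked with a small sketch (class name is an assumption):

```java
// Division-method hash function: index = key % tableSize,
// as used in the table above (tableSize = 20).
class HashIndexDemo {
    static int hash(int key, int tableSize) {
        return key % tableSize;
    }
}
```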
21
Linear Probing
◼ Two different values collide when they produce the same hash
index.
◼ Handling collisions involves storing the new value elsewhere in
the hash table.
◼ In such a case, we can search the next empty location in the
array by looking into the next cell until we find an empty cell.
This technique is called linear probing.
◼ If h(k) is full, examine (h(k) + 1) % N, then (h(k) + 2) % N, then
..., (h(k) + N - 1) % N.

◼ Linear probing leads to clustering.


◼ Quadratic probing spreads out successive probes.
◼ (h(k) + i²) % N for 0 ≤ i < N.

22
Linear Probing

S.N   Key   Hash           Array Index   Array Index After Linear Probing
1     1     1 % 20 = 1     1             1
2     2     2 % 20 = 2     2             2
3     42    42 % 20 = 2    2             3
4     4     4 % 20 = 4     4             4
5     12    12 % 20 = 12   12            12
6     14    14 % 20 = 14   14            14
7     17    17 % 20 = 17   17            17
8     13    13 % 20 = 13   13            13
9     37    37 % 20 = 17   17            18
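The placements in the "after linear probing" column above can be reproduced with a small sketch (class name is an assumption):

```java
// Linear probing insert: start at key % size and scan forward,
// with wraparound, until an empty slot is found.
class LinearProbingTable {
    private final Integer[] slots; // null means the slot is empty

    LinearProbingTable(int size) { slots = new Integer[size]; }

    // Returns the index where the key was stored, or -1 if the table is full.
    int insert(int key) {
        int h = key % slots.length;
        for (int i = 0; i < slots.length; i++) {
            int index = (h + i) % slots.length; // probe with wraparound
            if (slots[index] == null) {
                slots[index] = key;
                return index;
            }
        }
        return -1; // table full
    }
}
```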

23
Hash Table with Collision

h( k ) = k mod m
where k is the key and m is the size of the table

24
Collisions and their Resolution
◼ A collision occurs when two different keys hash to the
same value
◼ E.g. For TableSize = 17, the keys 18 and 35 hash to the same
value
◼ 18 mod 17 = 1 and 35 mod 17 = 1
◼ Cannot store both data records in the same slot in
array!
◼ Two different methods for collision resolution:
◼ Open Hashing (Separate Chaining): Use a dictionary
data structure (such as a linked list) to store multiple
items that hash to the same slot. Separate chaining =
Open hashing.
◼ Closed Hashing (or probing): search for empty slots
using a second function and store item in first empty slot
that is found. Closed hashing = Open addressing.
25
Collision Resolution Schemes: Chaining
The hash table is an array of linked lists.

Insert keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 using h(k) = k mod m with m = 10.

[Figure: slots 0–9; 0 → {0}, 1 → {81, 1}, 4 → {64, 4}, 5 → {25}, 6 → {36, 16}, 9 → {49, 9}; slots 2, 3, 7, 8 empty.]

Notes:
◼ As before, elements would be associated with the keys
◼ We're using the hash function h(k) = k mod m
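The chained table in the figure above can be sketched as an array of linked lists (class and method names are assumptions):

```java
import java.util.LinkedList;

// Separate chaining: the hash table is an array of linked lists.
// Colliding keys are simply appended to the chain at their slot.
class ChainedHashTable {
    private final LinkedList<Integer>[] table;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int m) {
        table = new LinkedList[m];
        for (int i = 0; i < m; i++) table[i] = new LinkedList<>();
    }

    void insert(int key) {
        table[key % table.length].addFirst(key); // new keys go to the chain head
    }

    boolean contains(int key) {
        return table[key % table.length].contains(key);
    }

    int chainLength(int slot) { return table[slot].size(); }
}
```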

26
Collision Resolution Strategies: Open Addressing
All elements stored in the hash table itself (the array). If a
collision occurs, try alternate cells until empty cell is found.

Three Resolution Strategies:
◼ Linear Probing
◼ Quadratic Probing
◼ Double Hashing

All of these try cells h(k,0), h(k,1), h(k,2), …, h(k, m-1),
where h(k,i) = ( h(k) + f(i) ) mod m, with f(0) = 0.

The function f is the collision resolution strategy and the
function h is the original hash function.

27
Linear Probing

Function f is linear. Typically, f(i) = i.
So, h( k, i ) = ( h(k) + i ) mod m.
Offsets: 0, 1, 2, …, m-1.
With H = h( k ), we try the following cells with wraparound:
H, H + 1, H + 2, H + 3, …

[Figure: an empty hash table with slots 0–9.]

28
Rehashing
Problem with both chaining & probing:
When the table gets too full, the average search
time degrades from O(1) toward O(n).

Solution: Create a larger table and then rehash all
the elements into the new table.
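A minimal rehashing sketch, assuming a doubling growth policy and a 0.5 load-factor threshold (both choices are assumptions, in line with the probing guideline later in these notes):

```java
// Rehashing: when the load factor would exceed the threshold,
// allocate a larger table and re-insert every element into it.
class RehashingTable {
    private Integer[] slots = new Integer[5];
    private int count = 0;

    int capacity() { return slots.length; }

    void insert(int key) {
        // Rehash before the insert would push the load factor above 0.5.
        if ((double) (count + 1) / slots.length > 0.5) rehash();
        place(slots, key);
        count++;
    }

    // Linear-probing placement into a given array.
    private static void place(Integer[] arr, int key) {
        int h = key % arr.length;
        while (arr[h] != null) h = (h + 1) % arr.length;
        arr[h] = key;
    }

    // Create a larger table and rehash all elements into it; every
    // key's slot must be recomputed because the table size changed.
    private void rehash() {
        Integer[] bigger = new Integer[slots.length * 2];
        for (Integer key : slots) {
            if (key != null) place(bigger, key);
        }
        slots = bigger;
    }
}
```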

29
Choosing Hash Functions

A good hash function must be O(1) and must
distribute keys evenly.
Division Method Hash Function for Integer Keys:
h(k) = k mod m
Hash Function for String Keys?

30
Requirement: Prime Table Size for Division
Method Hash Functions
If the table size is not prime, the number of alternative locations can
be severely reduced, since the hash position is a value mod
the table size.

Example: Table Size 16, with Quadratic Probing

Offsets i² mod 16:
1² mod 16 = 1
2² mod 16 = 4
3² mod 16 = 9
4² mod 16 = 0
5² mod 16 = 9
6² mod 16 = 4
7² mod 16 = 1

Only four distinct offsets (0, 1, 4, 9) occur, so the probe
sequence reaches very few alternative locations.

31
Important Factors for Designing Hash Tables

To Minimize Collisions:
◼ Make the table size, m, a prime number not near
a power of two if using a division method hash
function
◼ Use a load factor, λ = n / m, that’s appropriate
for the implementation.
◼ 1.0 or less for chaining ( i.e., n ≤ m ).
◼ 0.5 or less for linear or quadratic probing or
double hashing ( i.e., n ≤ m / 2 )

32
Collision Resolution Comparison
Let n = number of elements in hash table
Let m = hash table size
Let λ = n / m (the load factor, i.e., the average number of
elements stored in a chain)

Recommended Load Factor:
Chaining: λ ≤ 1.0
Linear or Quadratic Probing: λ ≤ 0.5 (half full)
Double Hashing: λ ≤ 0.5 (half full)

Note: If a table using quadratic probing is more than half full, it is not
guaranteed that an empty cell will be found

33
Linear Probing (insert 12)

12 mod 11 = 1 (12 = 1 × 11 + 1), but slot 1 already holds key 1;
slots 2 (24) and 3 (14) are also occupied, so 12 is stored in the
first empty slot, slot 4.

[Figure: table of size 11 before – 0: 42, 1: 1, 2: 24, 3: 14, 5: 16, 6: 28, 7: 7, 9: 31, 10: 9; after – 12 stored in slot 4.]

34
Search with linear probing (Search 15)

15 mod 11 = 4 (15 = 1 × 11 + 4); slot 4 holds 12, so probing
continues through slots 5 (16), 6 (28), and 7 (7) until the empty
slot 8 is reached, at which point the search stops: NOT FOUND!

[Figure: table of size 11 – 0: 42, 1: 1, 2: 24, 3: 14, 4: 12, 5: 16, 6: 28, 7: 7, 8: empty, 9: 31, 10: 9.]

35
Collision Resolution by Closed Hashing

◼ Given an item X, try cells h0(X), h1(X), h2(X), …, hi(X)
◼ hi(X) = (Hash(X) + F(i)) mod TableSize
◼ Define F(0) = 0

◼ F is the collision resolution function. Some possibilities:
◼ Linear: F(i) = i
◼ Quadratic: F(i) = i²
◼ Double Hashing: F(i) = i · Hash2(X)

36
Closed Hashing I: Linear Probing

◼ Main Idea: When a collision occurs, scan down the
array one cell at a time looking for an empty cell
◼ hi(X) = (Hash(X) + i) mod TableSize (i = 0, 1, 2, …)
◼ Compute the hash value and increment it until a free
cell is found

37
Linear Probing Example
insert(14): 14 % 7 = 0 → slot 0 empty → store in slot 0 (1 probe)
insert(8):  8 % 7 = 1  → slot 1 empty → store in slot 1 (1 probe)
insert(21): 21 % 7 = 0 → slots 0 and 1 full → store in slot 2 (3 probes)
insert(2):  2 % 7 = 2  → slot 2 full → store in slot 3 (2 probes)

[Figure: table of size 7 after the four inserts – 0: 14, 1: 8, 2: 21, 3: 2, slots 4–6 empty.]
38
Drawbacks of Linear Probing

◼ Works until the array is full, but as the number of items N
approaches TableSize (λ → 1), access time
approaches O(N)
◼ Very prone to cluster formation (as in our
example)
◼ If a key hashes anywhere into a cluster, finding a
free cell involves going through the entire cluster –
and making it grow!
◼ Primary clustering – clusters grow when keys hash
to values close to each other
◼ Can have cases where the table is empty except for a
few clusters
◼ Does not satisfy the good hash function criterion of
distributing keys uniformly

39
Closed Hashing II: Quadratic Probing

◼ Main Idea: Spread out the search for an empty slot by
incrementing by i² instead of i

◼ hi(X) = (Hash(X) + i²) % TableSize

h0(X) = Hash(X) % TableSize
h1(X) = (Hash(X) + 1) % TableSize
h2(X) = (Hash(X) + 4) % TableSize
h3(X) = (Hash(X) + 9) % TableSize

40
Quadratic Probing Example
insert(14): 14 % 7 = 0 → slot 0 empty → store in slot 0 (1 probe)
insert(8):  8 % 7 = 1  → slot 1 empty → store in slot 1 (1 probe)
insert(21): 21 % 7 = 0 → slot 0 full, (0+1) % 7 = 1 full, (0+4) % 7 = 4 empty → store in slot 4 (3 probes)
insert(2):  2 % 7 = 2  → slot 2 empty → store in slot 2 (1 probe)

[Figure: table of size 7 after the four inserts – 0: 14, 1: 8, 2: 2, 4: 21, slots 3, 5, 6 empty.]
41
Problem With Quadratic Probing
After the four inserts above (0: 14, 1: 8, 2: 2, 4: 21), consider insert(7):

insert(7): 7 % 7 = 0 → slot 0 full, (0+1) % 7 = 1 full, (0+4) % 7 = 4 full,
(0+9) % 7 = 2 full, (0+16) % 7 = 2 full, (0+25) % 7 = 4 full, (0+36) % 7 = 1 full, …

The offsets i² mod 7 only ever produce 0, 1, 2, and 4, so the probe
sequence keeps revisiting the same occupied slots and insert(7) never
finds an empty cell, even though slots 3, 5, and 6 are free.

probes: 1 1 3 1 ??
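The failure above can be demonstrated directly; this sketch (class name is an assumption) gives up after TableSize probes and reports -1:

```java
// Quadratic probing: probe (h + i*i) % size for i = 0, 1, 2, …
// Unlike linear probing, the sequence is not guaranteed to visit
// every slot, so insertion can fail even in a non-full table.
class QuadraticProbingTable {
    private final Integer[] slots; // null means the slot is empty

    QuadraticProbingTable(int size) { slots = new Integer[size]; }

    // Returns the slot used, or -1 if no empty slot was found
    // after TableSize probes.
    int insert(int key) {
        int h = key % slots.length;
        for (int i = 0; i < slots.length; i++) {
            int index = (h + i * i) % slots.length;
            if (slots[index] == null) {
                slots[index] = key;
                return index;
            }
        }
        return -1; // probe sequence exhausted without finding a free cell
    }
}
```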
42
