Lecture 2
Linear Dynamic Data Structures (Draft)
These lecture notes have been compiled from different resources.
I’d like to thank the authors who made them available for use.
Lecture Outline
Dynamic Data Structures
✓Dynamic data structures
➢Grow and shrink at execution time.
✓Linear Data Structures
➢Linked Lists:
✓“Lined up in a row”
✓Insertions and removals can occur anywhere in the list
➢Stacks: Insertions and removals only at top
✓Push and pop
➢Queues: Insertions made at back, removals from front
✓Non-Linear Data Structures
➢Trees, including binary trees:
✓Facilitate high-speed searching and sorting of data
✓Efficient elimination of duplicate items
Structures
◼ Data of various types or multiple data items of the same type
can be attributed to a single object.
◼ Structured programming languages provide a construct that
lets us give a single name to the object that we create.
◼ In C/C++ this construct is known as a structure and is
introduced with the keyword struct.
◼ For example, a person can possess both a name and an age,
which would be stored in different kinds of variables -- a char
array for the name and probably an int for the age.
◼ In mathematics, a point on a graph would be represented by
numbers for the coordinates, both of which could be integers or
floats.
◼ For example, we could use xcoord and ycoord for the
coordinates of a point.
Structures
◼ The definition below groups four characteristics of a rectangle
into a single struct called rectangle.
struct rectangle {
    float L; /* length */
    float W; /* width */
    float A; /* Area */
    float P; /* Perimeter */
};
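As a minimal sketch of how such a struct might be used, the C++ fragment below fills in the area and perimeter from the length and width. The helper name makeRectangle is an assumption made for illustration; it is not part of the lecture.

```cpp
#include <cassert>

// Same four fields as the struct rectangle defined above.
struct rectangle {
    float L; /* length */
    float W; /* width */
    float A; /* Area */
    float P; /* Perimeter */
};

// Hypothetical helper: builds a rectangle and derives area and perimeter.
rectangle makeRectangle(float length, float width) {
    assert(length >= 0.0f && width >= 0.0f);
    rectangle r;
    r.L = length;
    r.W = width;
    r.A = length * width;          // area = length x width
    r.P = 2.0f * (length + width); // perimeter = 2 x (length + width)
    return r;
}
```

Calling makeRectangle(3.0f, 4.0f) yields a rectangle whose A field is 12 and whose P field is 14.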
Dynamic Memory Allocation
✓Self-Referential Class
➢Contains a reference member to an object of the same class type
✓E.g.: class Node
{
    private int data;
    private Node next; // self-reference to node
    …
}
✓Reference can be used to link objects of the same type together
✓Dynamic data structures require dynamic memory allocation
➢Ability to obtain memory when needed
➢Release memory when not needed any more
➢Uses new operator
✓Ex: Node nodeToAdd = new Node(10);
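[Figure: two linked nodes holding 15 and 10; a linked list of characters H, e, …, o with firstNode and lastNode references, the final node holding a null link]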
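The ideas above (a self-referential node, obtaining memory with new, and releasing it when it is no longer needed) can be sketched in C++ as follows. This is only an illustration: the function name sumOfTwoLinkedNodes is hypothetical, and unlike the Java/C#-style snippet above, C++ requires the explicit delete shown here.

```cpp
#include <cassert>

// Self-referential node: `next` points to another Node (or nullptr).
class Node {
public:
    explicit Node(int value) : data(value), next(nullptr) {}
    int data;
    Node* next; // self-reference to a node of the same type
};

// Hypothetical demo: allocates two nodes, links them, sums their data,
// then releases the memory again.
int sumOfTwoLinkedNodes() {
    Node* first = new Node(15);     // obtain memory when needed
    Node* nodeToAdd = new Node(10);
    first->next = nodeToAdd;        // link objects of the same type together
    assert(nodeToAdd->next == nullptr); // last node carries the null link

    int sum = 0;
    for (Node* p = first; p != nullptr; p = p->next) {
        sum += p->data;
    }

    delete nodeToAdd;               // release memory when not needed any more
    delete first;
    return sum;
}
```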
Linked Lists
✓Linked lists - similar to arrays, however:
➢Arrays are a fixed size
➢Linked lists have no limit to size
✓More nodes can be added as program executes
✓Insert At Front
[Figure: (a) firstNode refers to the list 7, 11 and a new ListNode holds 12; (b) the new ListNode 12 is linked in at the front of the list]
Linked Lists
◼ Insert At Back
[Figure: before and after views of the list 12, 7, 11 with the node 5 added at the back]
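Insert At Front and Insert At Back can be sketched in C++ as below. This is a minimal sketch, not the lecture's own implementation: the class name List and the helper toVector are assumptions, while firstNode, lastNode and ListNode follow the labels in the figures.

```cpp
#include <vector>

struct ListNode {
    int data;
    ListNode* next;
};

// Minimal singly linked list that tracks both ends of the list.
class List {
public:
    List() : firstNode(nullptr), lastNode(nullptr) {}
    ~List() {
        while (firstNode != nullptr) {   // free every node
            ListNode* old = firstNode;
            firstNode = firstNode->next;
            delete old;
        }
    }
    List(const List&) = delete;
    List& operator=(const List&) = delete;

    // The new node points at the old first node and becomes the first node.
    void insertAtFront(int value) {
        ListNode* node = new ListNode{value, firstNode};
        firstNode = node;
        if (lastNode == nullptr) lastNode = node; // list was empty
    }

    // The old last node points at the new node, which becomes the last node.
    void insertAtBack(int value) {
        ListNode* node = new ListNode{value, nullptr};
        if (lastNode == nullptr) firstNode = node; // list was empty
        else lastNode->next = node;
        lastNode = node;
    }

    // Hypothetical helper: copies the list contents, front to back.
    std::vector<int> toVector() const {
        std::vector<int> out;
        for (ListNode* p = firstNode; p != nullptr; p = p->next) {
            out.push_back(p->data);
        }
        return out;
    }

private:
    ListNode* firstNode;
    ListNode* lastNode;
};
```

Inserting 11, 7 and 12 at the front and then 5 at the back reproduces the list 12, 7, 11, 5 from the figures. Both operations take constant time because the list keeps a reference to each end.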
Linked Lists
◼ Remove From Front
[Figure: the list 12, 7, 11, 5 before and after the removal from the front; removeItem refers to the removed node]
Linked Lists
◼ Remove From Back
[Figure: the list 12, 7, 11, 5 before and after the removal from the back; removeItem refers to the removed node]
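Remove From Front and Remove From Back can be sketched in C++ as below. This is a sketch under assumptions: the class name List, the bool return value and the out-parameter removedItem are illustrative choices, not taken from the lecture.

```cpp
struct ListNode {
    int data;
    ListNode* next;
};

class List {
public:
    List() : firstNode(nullptr), lastNode(nullptr) {}
    ~List() { int ignored; while (removeFromFront(ignored)) {} }
    List(const List&) = delete;
    List& operator=(const List&) = delete;

    bool isEmpty() const { return firstNode == nullptr; }

    void insertAtBack(int value) {
        ListNode* node = new ListNode{value, nullptr};
        if (lastNode == nullptr) firstNode = node;
        else lastNode->next = node;
        lastNode = node;
    }

    // Unlinks the first node; returns false if the list is empty.
    bool removeFromFront(int& removedItem) {
        if (isEmpty()) return false;
        ListNode* old = firstNode;
        removedItem = old->data;
        if (firstNode == lastNode) firstNode = lastNode = nullptr;
        else firstNode = firstNode->next;
        delete old;
        return true;
    }

    // Unlinks the last node; needs a walk to find the node before it.
    bool removeFromBack(int& removedItem) {
        if (isEmpty()) return false;
        ListNode* old = lastNode;
        removedItem = old->data;
        if (firstNode == lastNode) {
            firstNode = lastNode = nullptr;
        } else {
            ListNode* current = firstNode;
            while (current->next != lastNode) current = current->next;
            current->next = nullptr; // new last node gets the null link
            lastNode = current;
        }
        delete old;
        return true;
    }

private:
    ListNode* firstNode;
    ListNode* lastNode;
};
```

Removing from the front takes constant time. Removing from the back of a singly linked list takes time proportional to the list length, because the node before the last one has to be found first.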
Stacks
◼ A stack is an Abstract Data Type (ADT), commonly used in most
programming languages.
◼ Stack – a special version of a linked list:
◼ Last-in, first-out (LIFO) data structure:
◼ Takes and releases new nodes only at top
◼ Stack ADT allows all data operations at one end only. At any
given time, we can only access the top element of a stack.
◼ A stack is used for the following two primary operations
◼ Push: adds new entry (node) to top of stack
◼ Pop: removes top entry (node) from stack
◼ Can be used for:
◼ Storing return addresses
◼ Storing local variables
◼ …
PUSH Operation
◼ The push operation involves a series of steps −
◼ Step 1 − Checks if the stack is full.
◼ Step 2 − If the stack is full, produces an error and exits.
◼ Step 3 − If the stack is not full, increments top to point to the next
empty space.
◼ Step 4 − Adds the data element to the stack location where top is
pointing.
◼ Step 5 − Returns success.
◼ A simple algorithm for Push operation can be derived as follows:
begin procedure push: stack, data
    if stack is full
        return null
    endif
    top ← top + 1
    stack[top] ← data
end procedure
Pop operation
◼ A pop operation may involve the following steps:
◼ Step 1 − Checks if the stack is empty.
◼ Step 2 − If the stack is empty, produces an error and exits.
◼ Step 3 − If the stack is not empty, accesses the data element at
which top is pointing.
◼ Step 4 − Decreases the value of top by 1.
◼ Step 5 − Returns success.
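The push algorithm and the pop steps above can be sketched in C++ as an array-based stack. This is a minimal sketch: the class name ArrayStack, the capacity MAXSIZE = 8 and the bool return values (standing in for "error" and "success") are assumptions made for illustration.

```cpp
// Fixed-capacity stack; `top` is the index of the top element, -1 when empty.
class ArrayStack {
public:
    static constexpr int MAXSIZE = 8; // assumed capacity for this sketch

    ArrayStack() : top(-1) {}

    bool isEmpty() const { return top == -1; }
    bool isFull() const { return top == MAXSIZE - 1; }

    // Steps 1-5 of push: fail if full, else advance top and store the data.
    bool push(int data) {
        if (isFull()) return false;
        top = top + 1;
        stack[top] = data;
        return true;
    }

    // Steps 1-5 of pop: fail if empty, else read the top element and lower top.
    bool pop(int& data) {
        if (isEmpty()) return false;
        data = stack[top];
        top = top - 1;
        return true;
    }

private:
    int stack[MAXSIZE];
    int top;
};
```

Pushing 1, 2, 3 and then popping returns 3, 2, 1, which is the last-in, first-out order described earlier.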
Queues
◼ Queue is an abstract data structure, somewhat similar to
Stacks.
◼ Unlike stacks, a queue is open at both its ends. One end is
always used to insert data (enqueue) and the other is used to
remove data (dequeue).
◼ Queue follows First-In-First-Out methodology, i.e., the data
item stored first will be accessed first.
◼ Queue: First-in, first-out (FIFO) data structure
◼ Nodes added to tail, removed from head
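◼ An enqueue operation may involve the following steps:
◼ Step 1 − Checks if the queue is full.
◼ Step 2 − If the queue is full, produces an overflow error and exits.
◼ Step 3 − If the queue is not full, increments rear to point to the next
empty space.
◼ Step 4 − Adds the data element to the queue location where rear is
pointing.
◼ Step 5 − Returns success.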
Algorithm for enqueue operation
procedure enqueue(data)
    if queue is full
        return overflow
    endif
    rear ← rear + 1
    queue[rear] ← data
    return true
end procedure
Algorithm for dequeue operation
procedure dequeue
    if queue is empty
        return underflow
    endif
    data ← queue[front]
    front ← front + 1
    return true
end procedure
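The enqueue and dequeue procedures above translate almost line by line into C++. The sketch below assumes a fixed capacity (MAXSIZE = 8), the class name ArrayQueue, and bool return values in place of overflow and underflow; none of these names come from the lecture.

```cpp
// Array-based queue that follows the pseudocode literally:
// `rear` and `front` only ever move forward.
class ArrayQueue {
public:
    static constexpr int MAXSIZE = 8; // assumed capacity for this sketch

    ArrayQueue() : front(0), rear(-1) {}

    bool isEmpty() const { return front > rear; }
    bool isFull() const { return rear == MAXSIZE - 1; }

    // Adds at the rear; returns false on overflow.
    bool enqueue(int data) {
        if (isFull()) return false;
        rear = rear + 1;
        queue[rear] = data;
        return true;
    }

    // Removes from the front; returns false on underflow.
    bool dequeue(int& data) {
        if (isEmpty()) return false;
        data = queue[front];
        front = front + 1;
        return true;
    }

private:
    int queue[MAXSIZE];
    int front;
    int rear;
};
```

Because the indexes never wrap around, slots freed by dequeue are not reused, so this version reports overflow once rear reaches the end of the array. A circular buffer is the usual way to avoid that, but it goes beyond the pseudocode shown here.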
Hash Tables and Dictionaries
A dictionary consists of key/element pairs in which the key is used
to look up the element.
Ordered Dictionary: Elements stored in sorted order by key
Unordered Dictionary: Elements not stored in sorted order

Example                    Key              Element
English Dictionary         Word             Definition
Student Records            Student Number   Rest of record: Name, …
Symbol Table in Compiler   Variable Name    Variable’s Address in Memory
Dictionary as a Function
Given a key, return an element
Key (domain: type of the keys) → Element (range: type of the elements)
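For illustration, the C++ standard library offers both kinds of dictionary: std::map keeps its elements sorted by key, while std::unordered_map does not. The sketch below uses invented sample data (the student numbers and names are not from the lecture), and the function names are hypothetical.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Ordered dictionary: iterating a std::map visits keys in sorted order.
std::vector<int> sortedStudentNumbers() {
    std::map<int, std::string> records; // key: student number, element: name
    records[3071] = "Cara";             // made-up sample data
    records[1004] = "Adam";
    records[2210] = "Bea";

    std::vector<int> keys;
    for (const auto& entry : records) keys.push_back(entry.first);
    return keys;
}

// Dictionary as a function: given a key, return the element
// (an empty string here when the key is absent).
std::string lookUp(const std::unordered_map<std::string, std::string>& dict,
                   const std::string& key) {
    auto it = dict.find(key);
    return it == dict.end() ? std::string() : it->second;
}
```

sortedStudentNumbers returns 1004, 2210, 3071 regardless of insertion order. An std::unordered_map gives no such ordering guarantee, which is the trade-off for its hash-based lookup.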
Hashing
◼ Hashing is a technique to convert a range of key values into a
range of indexes of an array.
◼ We're going to use the modulo operator to map key values to array indexes.
◼ Consider an example of a hash table of size 20, in which the following
items are to be stored. Items are in the (key, value) format:
(1,20), (2,70), (42,80), (4,25), (12,44), (14,32), (17,11), (13,78), (37,98)
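As a worked illustration, the sketch below applies key mod 20 to each key in the item list above. The function names hashCode and lectureIndexes are assumptions made for this example.

```cpp
#include <vector>

// Division-method hash: maps a non-negative key to an index in [0, size).
int hashCode(int key, int size) {
    return key % size;
}

// Array index of every key from the item list above, for table size 20.
std::vector<int> lectureIndexes() {
    const int SIZE = 20;
    const int keys[] = {1, 2, 42, 4, 12, 14, 17, 13, 37};
    std::vector<int> indexes;
    for (int key : keys) indexes.push_back(hashCode(key, SIZE));
    return indexes;
}
```

The resulting indexes are 1, 2, 2, 4, 12, 14, 17, 13, 17. Keys 2 and 42 both map to index 2, and keys 17 and 37 both map to index 17, so this item list already produces two collisions.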
Linear Probing
Hash Table with Collision
h( k )
return k mod m
where k is the key and m is the size of the table
Collisions and their Resolution
◼ A collision occurs when two different keys hash to the
same value
◼ E.g. For TableSize = 17, the keys 18 and 35 hash to the same
value
◼ 18 mod 17 = 1 and 35 mod 17 = 1
◼ Cannot store both data records in the same slot in
array!
◼ Two different methods for collision resolution:
◼ Open Hashing (Separate Chaining): Use a dictionary
data structure (such as a linked list) to store multiple
items that hash to the same slot. Separate chaining =
Open hashing.
◼ Closed Hashing (or probing): search for empty slots
using a second function and store item in first empty slot
that is found. Closed hashing = Open addressing.
Collision Resolution Schemes: Chaining
The hash table is an array
of linked lists
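A minimal C++ sketch of chaining, assuming non-negative integer keys and the division-method hash k mod m. The class name ChainedHashTable and its member names are illustrative, not taken from the lecture.

```cpp
#include <list>
#include <vector>

// Hash table as an array (vector) of linked lists, one list per slot.
class ChainedHashTable {
public:
    explicit ChainedHashTable(int tableSize) : slots(tableSize) {}

    // Appends the key to the chain of its slot unless it is already there.
    void insert(int key) {
        std::list<int>& chain = slots[hash(key)];
        for (int k : chain) {
            if (k == key) return;
        }
        chain.push_back(key);
    }

    // Searches only the chain of the slot the key hashes to.
    bool contains(int key) const {
        const std::list<int>& chain = slots[hash(key)];
        for (int k : chain) {
            if (k == key) return true;
        }
        return false;
    }

    int chainLength(int slot) const {
        return static_cast<int>(slots[slot].size());
    }

private:
    // Assumes key >= 0.
    int hash(int key) const { return key % static_cast<int>(slots.size()); }

    std::vector<std::list<int>> slots;
};
```

With a table size of 17, the colliding keys 18 and 35 from the earlier example both end up in the chain of slot 1, and both remain retrievable.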
Collision Resolution Strategies: Open Addressing
All elements are stored in the hash table itself (the array). If a
collision occurs, try alternate cells until an empty cell is found.
◼ Quadratic Probing
◼ Double Hashing
Linear Probing
Rehashing
Problem with both chaining & probing:
When the table gets too full, the average search
time degrades from O(1) to O(n).
Choosing Hash Functions
Requirement: Prime Table Size for Division
Method Hash Functions
If the table size is not prime, the number of alternative locations can
be severely reduced, since the hash position is a value mod
the table size.
Example with table size 16 and h(k) = 0, computing (h(k) + Offset) mod 16:
(0 + 1) mod 16 = 1
(0 + 4) mod 16 = 4
(0 + 9) mod 16 = 9
(0 + 16) mod 16 = 0
(0 + 25) mod 16 = 9
(0 + 36) mod 16 = 4
(0 + 49) mod 16 = 1
…
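The effect can be checked with a few lines of C++. The sketch below counts how many distinct slots the squared offsets 0, 1, 4, 9, … can reach for a given table size. The function name distinctQuadraticSlots is hypothetical, and 17 is chosen here only as a nearby prime for comparison.

```cpp
#include <set>

// Counts the distinct slots (h + i*i) mod m visited for i = 0 .. m-1.
// Offsets repeat with period m, so larger i cannot add new slots.
int distinctQuadraticSlots(int h, int m) {
    std::set<int> seen;
    for (int i = 0; i < m; ++i) {
        seen.insert((h + i * i) % m);
    }
    return static_cast<int>(seen.size());
}
```

For h = 0, a table of size 16 yields only 4 reachable slots (0, 1, 4 and 9), whereas a table of the prime size 17 yields 9.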
Important Factors for Designing Hash Tables
To Minimize Collisions:
◼ Make the table size, m, a prime number not near
a power of two if using a division method hash
function
◼ Use a load factor, λ = n / m, that’s appropriate
for the implementation.
◼ 1.0 or less for chaining ( i.e., n ≤ m ).
◼ 0.5 or less for linear or quadratic probing or
double hashing ( i.e., n ≤ m / 2 )
Collision Resolution Comparison
Let n = number of elements in hash table
Let m = hash table size
Let λ = n / m (the load factor, i.e., the average number of
elements stored in a chain)
Note: If a table using quadratic probing is more than half full, it is not
guaranteed that an empty cell will be found
Linear Probing (insert 12)
12 = 1 x 11 + 1
12 mod 11 = 1
Slots 1, 2 and 3 are occupied, so 12 is stored in slot 4 (- marks an empty slot).

Index   Before   After
0       42       42
1       1        1
2       24       24
3       14       14
4       -        12
5       16       16
6       28       28
7       7        7
8       -        -
9       31       31
10      9        9
Search with linear probing (Search 15)
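15 = 1 x 11 + 4
15 mod 11 = 4
The search probes slots 4, 5, 6 and 7, then reaches the empty slot 8 (- marks an empty slot).

Index   Key
0       42
1       1
2       24
3       14
4       12
5       16
6       28
7       7
8       -      NOT FOUND!
9       31
10      9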
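Inserting 12 and searching for 15 in this table of size 11 can be sketched in C++ as below. It assumes non-negative keys and uses -1 as the marker for an empty slot; the names EMPTY, insertLinear, searchLinear and lectureTable are all assumptions made for illustration.

```cpp
#include <vector>

const int EMPTY = -1; // assumed marker for an unused slot

// Inserts with linear probing; returns the slot used, or -1 if the table is full.
int insertLinear(std::vector<int>& table, int key) {
    const int m = static_cast<int>(table.size());
    for (int i = 0; i < m; ++i) {
        const int slot = (key % m + i) % m;
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}

// Searches with linear probing; returns the slot holding the key, or -1.
// Reaching an empty slot means the key cannot be in the table.
int searchLinear(const std::vector<int>& table, int key) {
    const int m = static_cast<int>(table.size());
    for (int i = 0; i < m; ++i) {
        const int slot = (key % m + i) % m;
        if (table[slot] == EMPTY) return -1;
        if (table[slot] == key) return slot;
    }
    return -1;
}

// The table from the slides before 12 is inserted (slots 4 and 8 are empty).
std::vector<int> lectureTable() {
    return {42, 1, 24, 14, EMPTY, 16, 28, 7, EMPTY, 31, 9};
}
```

On lectureTable(), insertLinear places 12 in slot 4. A following searchLinear for 15 starts at slot 4, passes 12, 16, 28 and 7, and stops at the empty slot 8 without finding the key.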
Collision Resolution by Closed Hashing
◼ Quadratic: F(i) = i^2
Closed Hashing I: Linear Probing
Linear Probing Example
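Table size 7; each column shows the table after the insert named in its heading (- marks an empty slot).

Index     insert(14)   insert(8)   insert(21)   insert(2)
          14%7 = 0     8%7 = 1     21%7 = 0     2%7 = 2
0         14           14          14           14
1         -            8           8            8
2         -            -           21           21
3         -            -           -            2
4         -            -           -            -
5         -            -           -            -
6         -            -           -            -
probes:   1            1           3            2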
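The probe counts in this example can be reproduced with a short C++ sketch. It assumes non-negative keys and -1 as the empty-slot marker; the names EMPTY and insertCountingProbes are hypothetical.

```cpp
#include <vector>

const int EMPTY = -1; // assumed marker for an unused slot

// Inserts with linear probing and returns how many slots were examined,
// or 0 if every slot is occupied.
int insertCountingProbes(std::vector<int>& table, int key) {
    const int m = static_cast<int>(table.size());
    for (int i = 0; i < m; ++i) {
        const int slot = (key % m + i) % m; // home slot, then the next ones
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return i + 1;
        }
    }
    return 0;
}
```

Starting from an empty table of size 7, inserting 14, 8, 21 and 2 takes 1, 1, 3 and 2 probes, and the keys end up in slots 0, 1, 2 and 3. Key 2 is pushed out of its home slot by 21, which is how clusters of occupied slots grow under linear probing.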
Drawbacks of Linear Probing
Quadratic Probing Example
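Table size 7; each column shows the table after the insert named in its heading (- marks an empty slot).

Index     insert(14)   insert(8)   insert(21)   insert(2)
          14%7 = 0     8%7 = 1     21%7 = 0     2%7 = 2
0         14           14          14           14
1         -            8           8            8
2         -            -           -            2
3         -            -           -            -
4         -            -           21           21
5         -            -           -            -
6         -            -           -            -
probes:   1            1           3            1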
Problem With Quadratic Probing
Table size 7; each column shows the table after the insert named in its heading (- marks an empty slot).

Index     insert(14)   insert(8)   insert(21)   insert(2)   insert(7)
          14%7 = 0     8%7 = 1     21%7 = 0     2%7 = 2     7%7 = 0
0         14           14          14           14          14
1         -            8           8            8           8
2         -            -           -            2           2
3         -            -           -            -           -
4         -            -           21           21          21
5         -            -           -            -           -
6         -            -           -            -           -
probes:   1            1           3            1           ??
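The failing insert(7) can be reproduced with a small C++ sketch of quadratic probing, F(i) = i^2. It assumes non-negative keys and -1 as the empty-slot marker. The names EMPTY and insertQuadratic are hypothetical, and giving up after m probes is a choice made for this sketch, justified because (h + i^2) mod m repeats with period m.

```cpp
#include <vector>

const int EMPTY = -1; // assumed marker for an unused slot

// Inserts with quadratic probing and returns the number of probes used,
// or 0 if no empty slot was reachable within m probes.
int insertQuadratic(std::vector<int>& table, int key) {
    const int m = static_cast<int>(table.size());
    for (int i = 0; i < m; ++i) {
        const int slot = (key % m + i * i) % m; // offsets 0, 1, 4, 9, ...
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return i + 1;
        }
    }
    return 0; // probe sequence only revisits occupied slots
}
```

On an empty table of size 7, inserting 14, 8, 21 and 2 takes 1, 1, 3 and 1 probes. Inserting 7 then fails: its probe sequence visits only slots 0, 1, 4 and 2, all occupied, even though slots 3, 5 and 6 are still empty. The table is more than half full at that point (4 of 7 slots), which matches the earlier note about quadratic probing.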