INFO125
Lecture 14:
Physical Data Organization (part 2)
Mehdi Elahi
University of Bergen (UiB)
Memory
✤ Access speed
✤ Reliability
Operations on Files
✤ Operations for locating and accessing file records vary
from system to system.
✤ Open ✤ Delete
✤ Reset ✤ Modify
✤ FindNext
File Header
✤ Heap files
✤ Sorted files
✤ Hashing
Heap Files
1 Record
3
Heap Files
✤ Advantage:
✤ Disadvantage:
✤ Deletion is expensive
Heap Files
Sorted files
…
Sorted Files
✤ If the ordering field is also a key field of the file (that is, the field is
guaranteed to have a unique value in each record), then the
field is called the ordering key.
…
Sorted files
✤ Advantages:
✤ reading the records becomes efficient
✤ finding the next record from the current one usually
requires no additional block accesses
✤ using a search condition based on the value of the ordering
key field results in faster access with binary search
✤ Disadvantages:
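Returning to the binary-search advantage listed above, here is a minimal sketch of how a lookup on the ordering key might proceed. It assumes the file is modelled as a Python list of blocks, each block being a non-empty list of (key, payload) records sorted on the key. The names `find_record` and `blocks`, and the sample data, are hypothetical and not from the lecture.

```python
def find_record(blocks, key):
    """Binary search over the blocks of a file sorted on its ordering key.

    Returns (record or None, number of block accesses).
    """
    lo, hi = 0, len(blocks) - 1
    accesses = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]            # reading a block counts as one access
        accesses += 1
        if key < block[0][0]:          # key precedes the first record in the block
            hi = mid - 1
        elif key > block[-1][0]:       # key follows the last record in the block
            lo = mid + 1
        else:
            for record in block:       # the block is in memory: scan it
                if record[0] == key:
                    return record, accesses
            return None, accesses      # key falls in this block's range but is absent
    return None, accesses


# Hypothetical file: 8 blocks of 4 records each, keys 0, 2, 4, ..., 62.
blocks = [[(k, f"rec{k}") for k in range(b * 8, b * 8 + 8, 2)] for b in range(8)]
record, accesses = find_record(blocks, 42)
print(record, accesses)
```

With 8 blocks, a lookup should need only a few block accesses, whereas a heap file would have to be scanned block by block.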
✤ Internal Hashing
✤ External Hashing
Internal Hashing
✤ Example:
✤ Example:
i. h(K) = eK*M
iv. h(K) = K / M
Quiz
i. h(K) = eK*M
iv. h(K) = K / M
Alternative Methods
✤ Example:
(A) 641
(B) 666
(C) 0
(D) 654
Answer
(A) 641
(D) 654
Collision
Other Hash Methods
✤ Chaining
✤ Open addressing
✤ Multiple hashing
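To make the first two methods concrete, here is a small sketch of chaining and of open addressing (with linear probing) using h(K) = K mod M. The table size M = 7 and the keys are hypothetical values chosen so that every key collides. Multiple hashing, which would apply a second hash function on a collision, is not shown.

```python
M = 7  # number of slots in the hash table (hypothetical value)


def h(key):
    return key % M


def insert_chaining(table, key):
    """Chaining: every slot keeps a list of all keys that hash to it."""
    table[h(key)].append(key)


def insert_open_addressing(table, key):
    """Open addressing with linear probing: use the next free slot."""
    for step in range(M):
        slot = (h(key) + step) % M
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


keys = [10, 24, 3, 17]                 # all four keys hash to slot 3 (mod 7)

chained = [[] for _ in range(M)]
for k in keys:
    insert_chaining(chained, k)

probed = [None] * M
slots = [insert_open_addressing(probed, k) for k in keys]

print(chained[3])   # all colliding keys share one chain
print(slots)        # each colliding key moved on to the next free slot
```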
Other Hash Methods
Pointing to
here
✤ A big picture
Dynamic Hashing
✤ The hashing scheme described so far is called static hashing
because a fixed number of buckets m is allocated.
✤ h(K) = K mod m
Directory (global depth 2): 00, 01, 10, 11
Buckets: one per directory entry, each with local depth 2
✤ global depth
✤ local depth
h(K) = K mod M
Quiz (part A)
Numbers to insert: 4, 6, 7, 9, 10, 22, 24, 16, 31
Directory (global depth 2): 00, 01, 10, 11
Buckets: one per directory entry, each with local depth 2
Answer
Directory (global depth 2) and buckets (each with local depth 2):
00: 4, 24, 16
01: 9
10: 6, 22, 10
11: 7, 31
More numbers to insert: 20, 26
Add more numbers by applying extensible hashing!
Answer
More numbers to insert: 26
Directory (global depth 2) and buckets (each with local depth 2):
00: 4, 24, 16, 20
01: 9
10: 6, 22, 10
11: 7, 31
Now there is overflow in bucket 00, and the global depth is equal to the local depth.
So we need to double the size of the directory!
Answer
Double the directory! The global depth becomes 3.
Directory: 000, 001, 010, 011, 100, 101, 110, 111
Split the overflowing bucket (4, 24, 16, 20): it and the new bucket get local depth 3, while the other buckets keep local depth 2.
Answer
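More numbers to insert: 26
Directory (global depth 3): 000, 001, 010, 011, 100, 101, 110, 111
Buckets:
4, 24, 16, 20 (local depth 3, about to be split)
9 (local depth 2)
6, 22, 10 (local depth 2)
7, 31 (local depth 2)
new, empty bucket (local depth 3)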
Answer
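More numbers to insert: 26
h(num) = num mod 8
Directory (global depth 3) and buckets:
000 (0): 24, 16 (local depth 3)
001 (1) and 101 (5): 9 (local depth 2)
010 (2) and 110 (6): 6, 22, 10 (local depth 2)
011 (3) and 111 (7): 7, 31 (local depth 2)
100 (4): 4, 20 (local depth 3)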
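Answer
h(num) = num mod 8
Directory (global depth 3) and buckets:
000 (0): 24, 16 (local depth 3)
001 (1) and 101 (5): 9 (local depth 2)
010 (2) and 110 (6): 6, 22, 10, 26 (local depth 2)
011 (3) and 111 (7): 7, 31 (local depth 2)
100 (4): 4, 20 (local depth 3)
Now there is overflow in the bucket holding 6, 22, 10, 26, but the global depth is more than the local depth, so only this bucket is split and the directory keeps its size.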
Answer
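All numbers are inserted.
h(num) = num mod 8
Directory (global depth 3) and buckets:
000 (0): 24, 16 (local depth 3)
001 (1) and 101 (5): 9 (local depth 2)
010 (2): 10, 26 (local depth 3)
011 (3) and 111 (7): 7, 31 (local depth 2)
100 (4): 4, 20 (local depth 3)
110 (6): 6, 22 (local depth 3)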
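The quiz above can be replayed with a short sketch of extensible hashing. It is an illustration under stated assumptions, not the lecture's own implementation: the directory is indexed by the low-order bits of the key (K mod 2^depth, matching h(num) = num mod 8), and the bucket capacity is assumed to be 3, inferred from the fact that the fourth key in a bucket causes overflow in the quiz. The class and attribute names are hypothetical.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.keys = []


class ExtendibleHash:
    def __init__(self, global_depth=2, capacity=3):
        self.global_depth = global_depth
        self.capacity = capacity          # assumed bucket capacity
        self.directory = [Bucket(global_depth) for _ in range(2 ** global_depth)]

    def _index(self, key):
        # Directory entry = the global_depth low-order bits of the key.
        return key % (2 ** self.global_depth)

    def insert(self, key):
        bucket = self.directory[self._index(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        # Overflow: double the directory only if local depth == global depth.
        if bucket.local_depth == self.global_depth:
            self.directory = self.directory + self.directory
            self.global_depth += 1
        self._split(bucket)
        self.insert(key)                  # retry after the split

    def _split(self, bucket):
        bucket.local_depth += 1
        new_bucket = Bucket(bucket.local_depth)
        high_bit = 1 << (bucket.local_depth - 1)
        # Entries whose newly significant bit is 1 now point to the new bucket.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = new_bucket
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:                # redistribute the old keys
            self.directory[self._index(k)].keys.append(k)


table = ExtendibleHash(global_depth=2, capacity=3)
for number in [4, 6, 7, 9, 10, 22, 24, 16, 31, 20, 26]:
    table.insert(number)

for i, b in enumerate(table.directory):
    print(f"{i:03b}: {b.keys} (local depth {b.local_depth})")
```

Inserting 20 doubles the directory because the overflowing bucket's local depth equals the global depth; inserting 26 only splits a bucket because its local depth is smaller than the global depth.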
Linear hashing
✤ Example:
✤ Where j = 0, 1, 2, …
l = r / ( bfr * n )
✤ where:
Buckets
1
Linear Hashing
1
Linear Hashing Meta-info
i=1
✤ Example: n=2
r=0
r=4
Numbers to insert: 8, 13, 10, 15, 19, 22
hi(K) = K mod M
Buckets:
0: (empty)
1: (empty)
Linear Hashing Meta-info
i=1
n=2
r=1
r=4
Numbers to insert: 8, 13, 10, 15, 19, 22
h1(8) = 8 mod 2 = 0
Buckets:
0: 8
1: (empty)
Linear Hashing Meta-info
i=1
n=2
r=2
r=4
Numbers to insert: 13, 10, 15, 19, 22
h1(13) = 13 mod 2 = 1
Buckets:
0: 8
1: 13
Linear Hashing Meta-info
i=1
n=2
r=3
r=4
Numbers to insert: 10, 15, 19, 22
h1(10) = 10 mod 2 = 0
Buckets:
0: 8, 10
1: 13
Linear Hashing Meta-info
i=1
n=2
r=3
r=4
l = r / ( bfr * n ) = 0.75
Linear Hashing Meta-info
i=1
n=2
r=4
Numbers to insert: 15, 19, 22
h1(15) = 15 mod 2 = 1
Buckets:
0: 8, 10
1: 13, 15
Linear Hashing Meta-info
i=1
n=2
r=4
l = r / ( bfr * n ) = 1.0
Linear Hashing
✤ So what to do?
i=1
n=2
r=4
Buckets:
0: 8, 10
1: 13, 15
We should split bucket 0!
Linear Hashing Meta-info
i=2
n=3
r=4
Buckets:
0: 8, 10 (split!)
1: 13, 15
2: (new, empty)
We should split bucket 0!
Linear Hashing Meta-info
i=2
n=3
r=4
Buckets:
0: 8, 10 (split!), addressed with h2(K) = K mod 2M
1: 13, 15, addressed with h1(K) = K mod 1M
2: (new, empty), addressed with h2(K) = K mod 2M
Linear Hashing Meta-info
i=2
n=3
r=4
Numbers to insert: 8, 13, 10, 15, 19, 22
h2(8) = 8 mod 4 = 0
h2(10) = 10 mod 4 = 2
Buckets:
0: 8
1: 13, 15
2: 10
Linear Hashing Meta-info
i=2
n=3
r=4
Numbers to insert: 8, 13, 10, 15, 19, 22
Buckets:
0: 8
1: 13, 15
2: 10
Linear Hashing Meta-info
i=2
n=3
r=4
Numbers to insert: 19, 22
h1(19) = 19 mod 2 = 1
Buckets:
0: 8
1: 13, 15
2: 10
Linear Hashing Meta-info
i=2
n=3
r=5
r=4
Buckets:
0: 8
1: 13, 15 (full; 19 does not fit)
2: 10
No more space in bucket 1! This is overflow!
Linear Hashing
?
Linear Hashing
Buckets
0
Pointer
2
Linear Hashing Meta-info
i=2
n=3
r=5
r=4
i=2
n=4
r=5
r=4
Buckets:
0: 8
1: 13, 15, with 19 in overflow (split!)
2: 10
3: (new, empty)
We should split bucket 1!
Linear Hashing Meta-info
i=2
n=4
r=5
r=4
Numbers to insert: 8, 13, 10, 15, 19, 22
h2(13) = 13 mod 4 = 1
h2(15) = 15 mod 4 = 3
h2(19) = 19 mod 4 = 3
Buckets:
0: 8
1: 13
2: 10
3: 15, 19
Linear Hashing Meta-info
i=2
n=4
r=6
r=4
Numbers to insert: 8, 13, 10, 15, 19, 22
h2(22) = 22 mod 4 = 2
Buckets:
0: 8
1: 13
2: 10, 22
3: 15, 19
Linear Hashing Meta-info
i=2
n=4
r=6
r=4
Numbers to insert: 8, 13, 10, 15, 19, 22
Buckets:
0: 8
1: 13
2: 10, 22
3: 15, 19
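The whole walk-through can be replayed with a short sketch of linear hashing. It rests on several assumptions that the slides do not state outright: a bucket holds bfr = 2 records (which gives the load factors 0.75 and 1.0 shown earlier), a split is triggered whenever the load factor r / (bfr * n) exceeds 0.8 (a threshold chosen only because it reproduces the splits in the example), and an overfull bucket simply grows, standing in for an overflow chain. The address is K mod 2^i, falling back to K mod 2^(i-1) when that bucket does not exist yet, which matches h2(K) = K mod 4 and h1(K) = K mod 2 above. All names are hypothetical.

```python
class LinearHashFile:
    def __init__(self, bucket_capacity=2, max_load=0.8):
        self.bfr = bucket_capacity        # records per bucket (assumed)
        self.max_load = max_load          # split threshold (assumed)
        self.i = 1                        # number of address bits in use
        self.buckets = [[], []]           # n = 2 buckets to start with
        self.r = 0                        # number of records stored

    def _address(self, key):
        a = key % (1 << self.i)           # h_i(K) = K mod 2^i
        if a >= len(self.buckets):        # that bucket does not exist yet
            a = key % (1 << (self.i - 1)) # fall back to h_(i-1)(K)
        return a

    def load_factor(self):
        return self.r / (self.bfr * len(self.buckets))

    def insert(self, key):
        self.buckets[self._address(key)].append(key)
        self.r += 1
        if self.load_factor() > self.max_load:
            self._split()

    def _split(self):
        n = len(self.buckets)
        self.buckets.append([])           # new bucket gets number n
        if n + 1 > (1 << self.i):         # more buckets than i bits can address
            self.i += 1
        old_index = n - (1 << (self.i - 1))   # bucket that shares the low bits
        old_keys = self.buckets[old_index]
        self.buckets[old_index] = []
        for k in old_keys:                # rehash only the split bucket
            self.buckets[self._address(k)].append(k)


lh = LinearHashFile()
for number in [8, 13, 10, 15, 19, 22]:
    lh.insert(number)
    print(number, lh.buckets, f"i={lh.i} n={len(lh.buckets)} r={lh.r}")
```

Note that buckets are split in linear order (0, then 1), regardless of which bucket actually overflowed: 19 overflows bucket 1, and it is the next split in sequence that happens to relieve it.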
Linear Hashing
We can alternatively write the bucket numbers in binary:
Buckets:
00: 8
01: 13
10: 10, 22
11: 15, 19
Linear Hashing
We can alternatively write the numbers in binary as well:
Buckets:
00: 1000
01: 1101
10: 1010, 10110
11: 1111, 10011
Linear Hashing
It seems the number of bits taken from the right (i) is indicative of the number of buckets (n)!
n = 2^i
Buckets:
00: 1000
01: 1101
10: 1010, 10110
11: 1111, 10011
Linear Hashing
It seems the number of bits taken from the right (i) is indicative of the number of buckets (n)!
log2(n) = i
Buckets:
00: 1000
01: 1101
10: 1010, 10110
11: 1111, 10011
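The observation about the rightmost bits can be checked directly. The sketch below takes the final bucket contents from the example and confirms that the i low-order bits of each key equal its bucket number. The relation log2(n) = i is exact here only because n = 4 is a power of two; in the intermediate state with n = 3 buckets the example still used i = 2. The dictionary `buckets` is just a hypothetical way of writing down the final state.

```python
import math

# Final state of the example: bucket number -> keys stored in it.
buckets = {0: [8], 1: [13], 2: [10, 22], 3: [15, 19]}

n = len(buckets)
i = int(math.log2(n))                  # i = log2(n), valid since n is a power of two

matches = []
for number, keys in buckets.items():
    for key in keys:
        low_bits = key & ((1 << i) - 1)    # keep only the i rightmost bits
        matches.append(low_bits == number)
        print(f"{key:>2} = {key:05b} -> bucket {number:0{i}b}")
```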
Linear Hashing
Link: youtu.be/h37Jhr21ByQ
Next Lecture
✤ Introduction to NoSQL