Normalization | 161
Introduction
We know that data is stored in the form of records. Every record has a key field. This key field is used to recognize the record uniquely. Indexing is a technique which is used to retrieve records from the database files on the basis of some key field. But sometimes it becomes difficult to search all the index values and reach the desired data. At that time, hashing is used. It is a technique to directly search the location of desired data on the disk without using an index structure. This chapter deals with these two techniques.
Indexing
The main aim behind designing a database is to store, insert, update and delete data, so that one can easily get quick access to any data of the database. As a database is of large size, searching for even one record will take time. Moreover, even the smallest transaction will take time to perform its action. In order to reduce the time spent in transactions, indexes are used. The concept behind indexes used in a DBMS is similar to the one used in books. Indexing is a data structure technique to efficiently retrieve records from the database files based on the attributes on which the indexing has been done.
Indexing mechanisms are used to speed up access to desired data. E.g. author catalog in a library.
Secondary key index: duplicates may exist. It may be generated from a field which is a candidate key and has a unique value in every record, or a non-key with duplicate values.
Address-index: forbes ave, main str
123 smith main str
234 jones forbes ave
345 tomson main str
456 stevens forbes ave
567 smith forbes ave
Secondary key index: typically, with 'postings lists'
[Figure: postings lists for the index entries forbes ave and main str, pointing into the same data file as above]
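The secondary key index with postings lists can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the in-memory list stands in for the data file, list positions stand in for record pointers, and the names build_secondary_index and lookup are hypothetical.

```python
from collections import defaultdict

# Data file from the example above: (id, name, address).
records = [
    (123, "smith", "main str"),
    (234, "jones", "forbes ave"),
    (345, "tomson", "main str"),
    (456, "stevens", "forbes ave"),
    (567, "smith", "forbes ave"),
]

def build_secondary_index(rows, column):
    """Map each value of `column` to a postings list of row positions."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return dict(index)

def lookup(index, rows, key):
    """Return every record whose indexed field equals `key`."""
    return [rows[pos] for pos in index.get(key, [])]

# The address is a non-key field, so one index entry can point to many records.
address_index = build_secondary_index(records, column=2)
forbes = lookup(address_index, records, "forbes ave")
```

Because duplicates may exist, each index entry holds a list of pointers rather than a single pointer.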
An index file consists of records (called index entries) of the form:
search-key | pointer
Search key: an attribute or set of attributes used to look up records in a file.
Index files are typically much smaller than the original file.
Two basic kinds of indices:
o Ordered indices: search keys are stored in sorted order.
o Hash indices: search keys are distributed uniformly across "buckets" using a "hash function".
Indexing is of the following types:
o Primary key index: on a primary key (no duplicates). It is generally defined on an ordered data file. The data file is ordered on a primary key (key field).
o Clustering (= sparse) index: a clustering index is defined on an ordered data file. Records are physically sorted on that key (and not all key values are needed in the index). E.g. a clustering/sparse index on ssn.
o Non-clustering (= dense) index: the opposite.
Ordered indexing is of two types.
Sparse index: contains index records for only some search-key values. In a sparse index, index records are not created for every search key. An index record here contains a search key and an actual pointer to the data on the disk. To search a record, we first proceed by the index record and reach the actual location of the data. If the record is not found at that location, the system performs a sequential search from there to find the desired data.
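The sparse index lookup described above can be sketched in Python. This is a minimal illustration, assuming one index entry per block; the keys, the block size and the name sparse_lookup are made up for the example.

```python
import bisect

# Data file ordered on the search key and stored in fixed-size blocks.
# The keys and the block size are assumptions chosen for illustration.
BLOCK_SIZE = 3
data_file = [(k, f"record-{k}") for k in (10, 20, 30, 40, 50, 60, 70, 80)]
blocks = [data_file[i:i + BLOCK_SIZE] for i in range(0, len(data_file), BLOCK_SIZE)]

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys = [block[0][0] for block in blocks]

def sparse_lookup(key):
    """Find the last index entry <= key, then scan that block sequentially."""
    slot = bisect.bisect_right(sparse_keys, key) - 1
    if slot < 0:
        return None  # key is smaller than every indexed key
    for k, record in blocks[slot]:
        if k == key:
            return record
    return None  # key falls inside the block's range but is not stored
```

Only three index entries cover eight records here, which is why a sparse index stays much smaller than the data file.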
B-trees
A B-tree is an example of multilevel indexing. Record pointers are present at the leaf nodes as well as at the internal nodes.
[Figure: multilevel index in which an outer index points to the primary index, which points to the data blocks]
Storing record pointers in the internal nodes leaves room for fewer keys per node, which leads to an increase in the number of levels in the B-tree. This results in increasing the search time of a record.
A B+ tree eliminates the above drawback by storing data pointers only at the leaf nodes of the tree. A B+ tree is a balanced search tree: all leaf nodes remain at the same height. Apart from this, the leaf nodes are linked using a linked list; therefore, a B+ tree can support random access as well as sequential access.
A B/B+ tree is a rooted tree satisfying the following properties:
o All paths from root to leaf are of the same length.
o Each node that is not a root or a leaf has between $\lceil n/2 \rceil$ and $n$ children.
[Figure: B+-tree for account file (n = 3)]
Advantage of B+-tree index files: the index automatically reorganizes itself with small, local changes in the face of insertions and deletions. Reorganization of the entire file is not required to maintain performance.
B+-tree for account file (n = 5):
o Leaf nodes must have between 2 and 4 values ($\lceil (n-1)/2 \rceil$ and $n-1$, with $n = 5$).
o Non-leaf nodes other than the root must have between 3 and 5 children ($\lceil n/2 \rceil$ and $n$, with $n = 5$).
o The root must have at least 2 children.
The search keys in a node are ordered: $K_1 < K_2 < K_3 < \dots < K_{n-1}$.
Leaf Nodes in B+-Trees
Non-Leaf Nodes in B+-Trees
Non-leaf nodes form a multi-level sparse index on the leaf nodes. For a non-leaf node with m pointers:
o All the search keys in the subtree to which $P_1$ points are less than $K_1$.
o For $2 \le i \le m-1$, all the search keys in the subtree to which $P_i$ points have values greater than or equal to $K_{i-1}$ and less than $K_i$.
To find records with search-key value k, at each node:
1. Examine the node for the smallest search-key value > k.
2. If such a value exists, assume it is $K_i$. Then follow $P_i$ to the child node.
If there are K search-key values in the file, the path is no longer than $\lceil \log_{\lceil n/2 \rceil}(K) \rceil$.
With 1 million search-key values and $n = 100$, at most $\lceil \log_{50}(1{,}000{,}000) \rceil = 4$ nodes are accessed in a lookup.
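The occupancy bounds and the path-length bound stated above can be checked with a few lines of Python. This is a sketch only; the function names are hypothetical, and the path length is computed with integer arithmetic to avoid floating-point rounding in the logarithm.

```python
import math

def leaf_value_bounds(n):
    """Minimum and maximum number of values in a leaf: ceil((n-1)/2) and n-1."""
    return math.ceil((n - 1) / 2), n - 1

def nonleaf_child_bounds(n):
    """Minimum and maximum number of children of a non-leaf, non-root node."""
    return math.ceil(n / 2), n

def max_path_length(num_keys, n):
    """Upper bound ceil(log base ceil(n/2) of num_keys) on the root-to-leaf path."""
    fanout = math.ceil(n / 2)
    if fanout < 2:
        raise ValueError("n must be at least 3")
    levels, reach = 0, 1
    while reach < num_keys:   # smallest `levels` with fanout**levels >= num_keys
        reach *= fanout
        levels += 1
    return levels

print(leaf_value_bounds(5))             # (2, 4)
print(nonleaf_child_bounds(5))          # (3, 5)
print(max_path_length(1_000_000, 100))  # 4
```

The last call reproduces the example in the text: with a minimum fanout of 50, four levels already cover 6,250,000 keys.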
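INSERTION OF KEY 'K'
Find the leaf node in which the search-key value would appear. If the search-key value is already there in the leaf node, the record is added to the file and, if necessary, a pointer is inserted. Otherwise, the search-key value is added to the leaf L such that the keys are in order. If L overflows, it is split.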
[Figure: B+-tree before and after insertion of "Clearview"]
B+-Tree File Organization
The index file degradation problem is solved by using B+-tree indices. The leaf nodes in a B+-tree file organization store records, instead of pointers. Since records are larger than pointers, the maximum number of records that can be stored in a leaf node is less than the number of pointers in a non-leaf node. Leaf nodes are still required to be half full. Insertion and deletion are handled in the same way as insertion and deletion of entries in a B+-tree index.
B-Tree Index Files
Similar to B+-tree, but B-tree allows search-key values to appear only once, which eliminates redundant storage of search keys.
[Figure: B-tree (above) and B+-tree (below) on same data]
Hashing
In hashing, the address of a data item is obtained by computing a function on the search key value. Hashing is an effective technique to calculate the direct location of a data record on the disk without using the index structure. Hashing uses hash functions with search keys as parameters to generate the address of a data record.
A hash file stores data in bucket format. A bucket is considered a unit of storage. A bucket typically stores one complete disk block, which in turn can store one or more records.
A hash function h is a function from the set of all search-key values K to the set of all bucket addresses B.
o We choose a number of buckets to correspond to the number of search key values we will have stored in the database.
o To perform a lookup on a search key value $K_i$, we compute $h(K_i)$ and search the bucket with that address.
o If two search keys i and j map to the same address, because $h(K_i) = h(K_j)$, then the bucket at the address obtained will contain records with both search key values.
o In this case we will have to check the search key value of every record in the bucket to get the ones we want.
The distribution of keys over buckets should be uniform. If, for example, names are hashed on their first letter, many more names will be mapped to "A" than to "X".
Typical hash functions perform some operation on the internal binary machine representations of characters in a key.
o For example, compute the sum, modulo # of buckets, of the binary representations of characters of the search key.
o Using this method for 10 buckets (assuming the ith character in the alphabet is represented by integer i), every search key is mapped to one of the bucket addresses 0 to 9.
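The character-sum method for 10 buckets can be sketched as follows. This is an illustration, not a definitive implementation: the sample keys are assumptions chosen for the example, non-letters are simply ignored, and the helper name find is hypothetical.

```python
NUM_BUCKETS = 10

def h(search_key, num_buckets=NUM_BUCKETS):
    """Sum the alphabet positions of the letters (a=1 ... z=26), modulo the bucket count."""
    total = sum(ord(ch) - ord("a") + 1 for ch in search_key.lower() if "a" <= ch <= "z")
    return total % num_buckets

# Sample search keys (assumed for illustration) placed into their buckets.
buckets = [[] for _ in range(NUM_BUCKETS)]
for name in ("Perryridge", "Round Hill", "Brighton"):
    buckets[h(name)].append(name)

def find(name):
    """Different keys can share a bucket, so every record in it is checked."""
    return [rec for rec in buckets[h(name)] if rec == name]

print(h("Perryridge"))   # 5  (letter positions sum to 125)
print(h("Round Hill"))   # 3  (letter positions sum to 113)
print(h("Brighton"))     # 3  (letter positions sum to 93)
```

"Round Hill" and "Brighton" land in the same bucket, which is exactly the case where the search key of every record in the bucket has to be compared.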
In static hashing, a given search key always produces the same bucket address, such as address 3 here. There will be no change in the bucket address: the output address shall always be the same for this function. In static hashing, the number of data buckets in memory remains constant throughout.
Closed Hashing
When buckets are full, a new data bucket is allocated for the same hash result and is linked after the previous one. This mechanism is known as overflow chaining.
Now, in the example given for open hashing, a new bucket is inserted at the end of bucket 110 and is linked to it.
[Figure: closed hashing. Search keys pass through the hash function to data buckets holding records such as 100 Agra, 101 Lucknow, 102 Varanasi and 103 Mathura; a new record goes into an overflow bucket linked to bucket 110]
Operations
o Insertion - When a new record is inserted into the table, an address is generated for the new record based on the hash key, and the record is stored in that location.
o Searching - When a record needs to be searched, the same hash function retrieves the address of the bucket where the data is stored.
o Deletion - To delete a record, we first fetch the record which is supposed to be deleted. Then we delete the record at that address in memory.
o Update - To update a record, we first search for it using the hash function, and then the data record is updated.
If we want to insert some new record into the file but the bucket at the address generated by the hash function is already full, the situation is known as bucket overflow.
The problem with static hashing is that it does not expand or shrink dynamically as the size of the database grows or shrinks. Dynamic hashing helps us to overcome the problem of bucket overflow. Dynamic hashing provides a mechanism in which data buckets are added and removed dynamically and on demand. Dynamic hashing is also known as extended hashing. In dynamic hashing, the hash function is made to produce a large number of values, and only a few are used initially.
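The static hashing operations and overflow chaining described above can be sketched in Python. This is a minimal sketch under stated assumptions: the hash function is taken to be key mod number of buckets, the bucket capacity of 2 is arbitrary, and the class and method names are hypothetical.

```python
class Bucket:
    """A fixed-capacity data bucket with a link to an overflow bucket."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.records = {}      # search key -> record
        self.overflow = None   # next bucket in the chain, if any

class StaticHashFile:
    def __init__(self, num_buckets=5, capacity=2):
        self.num_buckets = num_buckets   # stays constant in static hashing
        self.capacity = capacity
        self.buckets = [Bucket(capacity) for _ in range(num_buckets)]

    def _chain(self, key):
        """Yield the primary bucket for `key`, then its overflow buckets."""
        bucket = self.buckets[key % self.num_buckets]  # assumed hash: key mod N
        while bucket is not None:
            yield bucket
            bucket = bucket.overflow

    def _holder(self, key):
        """Return the bucket currently storing `key`, or None."""
        return next((b for b in self._chain(key) if key in b.records), None)

    def insert(self, key, record):
        chain = list(self._chain(key))
        target = self._holder(key) or next(
            (b for b in chain if len(b.records) < b.capacity), None)
        if target is None:
            # Every bucket in the chain is full: link a new overflow bucket.
            target = chain[-1].overflow = Bucket(self.capacity)
        target.records[key] = record

    def search(self, key):
        bucket = self._holder(key)
        return bucket.records[key] if bucket else None

    def update(self, key, record):
        bucket = self._holder(key)
        if bucket is None:
            return False
        bucket.records[key] = record
        return True

    def delete(self, key):
        bucket = self._holder(key)
        if bucket is None:
            return False
        del bucket.records[key]
        return True

table = StaticHashFile(num_buckets=5, capacity=2)
for key in (103, 108, 113):   # all three keys hash to address 3
    table.insert(key, f"record-{key}")
```

The third insert finds bucket 3 full, so a new bucket is chained to it while the number of primary buckets stays at 5.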
Operations
o Querying - Look at the depth value of the hash index and use those bits to compute the bucket address.
o Update - Perform a query as above and update the data.
o Deletion - Perform a query to locate the desired data and delete it.
In the above example, an index IDX_STUDENTS is created on the 'STUDENTS' table. This index contains only one column of the table, 'NAME'. So, it is a Simple Index.
o Unique Index: an index in which duplicate values are prohibited in the indexed columns.
In the above example, an index is created for the 'PERSONS' table with the name PERSONS_INDEX. It includes two columns of the table, 'FIRST_NAME' and 'LAST_NAME'. This is nothing but a Composite Duplicate Index.
Dropping the Index
The process of removing an index related to a table is called dropping the index. The DROP INDEX command is used for the same. The syntax is as follows:
DROP INDEX <IndexName>;
When an index is defined, the data of the column is stored in ascending order by default. If it is to be sorted in descending order, then the 'DESC' keyword is used after the <ColumnName> at the time of defining the index, as shown in the following example:
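A minimal sketch of these index commands, run through Python's built-in sqlite3 module. Several things here are assumptions rather than part of the text: the SQLite dialect, the column types, the ID column and the helper index_names. Other database systems may use a different DROP INDEX form (MySQL, for instance, also requires the table name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PERSONS (ID INTEGER, FIRST_NAME TEXT, LAST_NAME TEXT)")

# Composite index on two columns; LAST_NAME is kept in descending order.
conn.execute("CREATE INDEX PERSONS_INDEX ON PERSONS (FIRST_NAME, LAST_NAME DESC)")

def index_names(connection, table):
    """List the indexes SQLite has recorded for `table`."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = ?",
        (table,),
    )
    return [name for (name,) in rows]

before = index_names(conn, "PERSONS")   # the index exists after CREATE INDEX
conn.execute("DROP INDEX PERSONS_INDEX")
after = index_names(conn, "PERSONS")    # and is gone after DROP INDEX
```

Dropping the index removes only the index structure; the rows of the PERSONS table are left untouched.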