Indexing Techniques

Storage Technology: Topic 4

Introduction to Database Systems

1

Indexes

An index on a collection of records speeds
up selections on the search key fields.
– Any subset of the fields of a record can be the
search key for an index on the collection.

An index is a collection of index entries.
– Retrieve all entries k* with key value k
– Retrieve all entries k* between two key values
– Retrieve entries in search key order

Introduction to Database Systems

2

Alternatives for Data Entry k* in
Index

Three alternatives:
 Data record with search key value k

issue : how much data repetition?
Issue: is this simply a fancy file format?

 <k, rid of data record with search key value k>
 <k, list of rids of data records with search key
k>

Our focus: alternative 2.
– Examples of indexing techniques: B+ trees,
hash-based structures

Introduction to Database Systems

3

Index Classification:
Clustering

Clustered vs. unclustered: If order of data
records is the same as, or ``close to’’,
order of data entries, then called clustered
index.
– At most one independent clustered index.
– Cost of retrieving data through index varies
greatly based on whether index is clustered or
not! Why?
– Usually, clustering desired for sorted access.

Introduction to Database Systems

4

Clustered vs. Unclustered
Index

Title: es_f52.fig
Creator: /s/transfig-3.1.1/exe/fig2dev Version 3.1 Patchlevel 1
CreationDate: Wed Oct 11 19:11:29 1995

Introduction to Database Systems

Title: es_f62.fi g
Creator: /s/transfi g-3.1.1/exe/fi g2dev Version 3.1 Patchlevel 1
CreationDate: Wed Oct 11 19:11:11 1995

5

Sparse Clustering

Dense vs. Sparse: If
there is at least one
data entry per
search key value (in
some data record),
then dense.

Title: l3_f1.fig
Creator: /s/transfig-3.1.1/exe/fig2dev Version 3.1 Patchlevel 1
CreationDate: Wed Sep 6 17:49:58 1995

– Every sparse index is
clustered!

Introduction to Database Systems

6

Primary/Secondary Indexes

Definition 1: Primary == Clustered
Definition 2: Primary == search key
contains primary key of the relation

We will use Definition 2

Introduction to Database Systems

7

Tree-Structured Indexing

Tree-structured indexing techniques
support both range searches and
equality searches
``Find all students with gpa > 3.0’’
– If data is in sorted file, use binary search.

Title: l45_f2.fig
Simple
idea: Create an `index’ file.
Creator: fig2dev Version 3.1 Patchlevel 1
CreationDate: Thu Sep 14 08:24:31 1995

 Can do binary search on (smaller) index file!
Introduction to Database Systems

8

ISAM

Title: l45_f1.fig
C reator: /s/transfig-3.1.1/exe/fig2dev Version 3.1 Patchlevel 1
C reationDate: Wed Sep 6 18:49:46 1995

Index file may still be quite large. But we
can apply the idea repeatedly!

Title: l45_f3.fig
Creator: fig2dev Version 3.1 Patchlevel 1
CreationDate: Fri Sep 15 05:26:52 1995

 Leaf pages contain data entries.
Introduction to Database Systems

9

Comments on ISAM


Title: l45_f32.fig
Creator: /s/transfig-3.1.1/exe/fi
CreationDate: Wed Sep 6 18:

File creation: Leaf pages allocated sequentially,
sorted by search key; then index pages
allocated, then space for overflow pages.
Index entries: <search key value, page id>; they
`direct’ search for data entries, which are in leaf pages.
Search: Start at root; use key comparisons to go to leaf.
Cost
log F N ; F = # entries/index pg, N = # leaf pgs
Insert: Find leaf data entry belongs to, and put it there.
Delete: Find and remove from leaf; if empty overflow
page, de-allocate.

Static tree structure: inserts/deletes affect only leaf

Introduction to Database Systems

10

Example ISAM Tree

Each node can hold 2 entries; no need
for `next-leaf-page’ pointers. (Why?)

Title: l45_f4.fi g
Cre ator: /s/transfi g-3.1.1/exe/fi g2dev Version 3.1 Patchlevel 1
Cre ationDate: Wed Sep 6 18:56:21 1995

Introduction to Database Systems

11

After Inserting 23*, 48*, 41*,
42* ...
Title: l45_f5.fig
Creator: fig2dev Version 3.1 Patchlevel 1
CreationDate: Fri Sep 15 09:11:57 1995

Introduction to Database Systems

12

... Then Deleting 42*,
51*,
97*
Title: l45_f6.fi g
Cre ator: /s/transfi g-3.1.1/exe/fi g2dev Version 3.1 Patchlevel 1
Cre ationDate: Wed Sep 6 18:56:59 1995

 Note that 51* appears in index levels, but not in lea
Introduction to Database Systems

13