VIEWS

Prof. Navneet Goyal
Department of Computer Science & I nformation Systems
BI TS, Pilani
Topics
 Query Modification
 View Materialization
 Which Views to Materialize?
 How to exploit Materialized
Views to answer queries?
 View Maintenance

View Modification
(Evaluate On Demand)
CREATE VIEW RegionalSales(category,sales,state)
AS SELECT P.category, S.sales, L.state
FROM Products P, Sales S, Locations L
WHERE P.pid=S.pid AND S.locid=L.locid
SELECT R.category, R.state, SUM(R.sales)
FROM RegionalSales AS R GROUP BY R.category, R.state
SELECT R.category, R.state, SUM(R.sales)
FROM (SELECT P.category, S.sales, L.state
FROM Products P, Sales S, Locations L
WHERE P.pid=S.pid AND S.locid=L.locid) AS R
GROUP BY R.category, R.state
View
Query
Modified
Query
View Materialization
(Precomputation)
 Suppose we precompute RegionalSales and
store it with a clustered B+ tree index on
[category,state,sales].
 Then, previous query can be answered by an
index-only scan.
SELECT R.state, SUM(R.sales)
FROM RegionalSales R
WHERE R.category=“Laptop”
GROUP BY R.state
SELECT R.state, SUM(R.sales)
FROM RegionalSales R
WHERE R. state=“Wisconsin”
GROUP BY R.category
Index on precomputed view
is great!
Index is less useful (must
scan entire leaf level).
Materialized Views
 A view whose tuples are stored in the
database is said to be materialized
 Provides fast access, like a (very high-
level) cache.
 Need to maintain the view as the
underlying tables change.
 Ideally, we want incremental view
maintenance algorithms.
 Close relationship to Data
Warehousing, OLAP,
Issues in View
Materialization
 What views should we materialize, and
what indexes should we build on the
precomputed results?
 Given a query and a set of materialized
views, can we use the materialized views
to answer the query?
 How frequently should we refresh
materialized views to make them
consistent with the underlying tables?
(And how can we do this incrementally?)
View Materialization:
Example
SELECT P.Category, SUM(S.sales)
FROM Product P, Sales S
WHERE P.pid=S.pid
GROUP BY P.Category
SELECT L.State, SUM(S.sales)
FROM Location L, Sales S
WHERE L.locid=S.locid
GROUP BY L.State
Both queries
require us to
join the Sales
table with
another table &
aggregate the
result
How can we use
materialization
to speed up
these queries?
View Materialization:
Example
 Pre-compute the two joins involved
( product & sales & Location & sales)
 Pre-compute each query in its entirety
 OR let us define the following view:
CREATE VIEW TOTALSALES (pid, lid, total)
AS Select S.pid, S.locid, SUM(S.sales)
FROM Sales S
GROUP BY S.pid, S.locid
View Materialization:
Example
 The View TOTALSALES can be
materialized & used instead os Sales in
our two example queries
SELECT P.Category, SUM(T.Total)
FROM Product P, TOTALSALES T
WHERE P.pid=T.pid
GROUP BY P.Category
SELECT L.State, SUM(T.Total)
FROM Location L, TOTALSALES T
WHERE L.locid=T.locid
GROUP BY L.State
View Maintenance
 A materialized view is said to be refreshed
when it is made consistent with changes ot
its underlying tables
 Often referred to as VIEW MAINTENANCE
 Two issues:
 HOW do we refresh a view when an underlying
table is refreshed? Can we do it incrementally?
 WHEN should we refresh a view in response to a
change in the underlying table?
View Maintenance
 The task of keeping a materialized view up-to-date with
the underlying data is known as materialized view
maintenance
 Materialized views can be maintained by recomputation on
every update
 A better option is to use incremental view maintenance
 Changes to database relations are used to compute
changes to materialized view, which is then updated
 View maintenance can be done by
 Manually defining triggers on insert, delete, and update of
each relation in the view definition
 Manually written code to update the view whenever database
relations are updated
 Supported directly by the database
View Maintenance
 Two steps:
 Propagate: Compute changes to view when data
changes.
 Refresh: Apply changes to the materialized view
table.
 Maintenance policy: Controls when we do
refresh.
 Immediate: As part of the transaction that
modifies the underlying data tables. (+
Materialized view is always consistent; - updates
are slowed)
 Deferred: Some time later, in a separate
transaction. (- View becomes inconsistent; + can
scale to maintain many views without slowing
updates)
Deferred Maintenance
Three flavors:
 Lazy: Delay refresh until next query on view;
then refresh before answering the query.
 Periodic (Snapshot): Refresh periodically.
Queries possibly answered using outdated
version of view tuples. Widely used, especially
for asynchronous replication in distributed
databases, and for warehouse applications.
 Event-based: E.g., Refresh after a fixed number
of updates to underlying data tables.
View Maintenance:
Incremental Algorithms
 Recomputing the view when an
underlying table is modified –
straightforward approach
 Not feasible to do so for all changes
made
 Ideally algorithms for refreshing a
view should be incremental
 Cost of refresh is proportional to
the extent of the change
View Maintenance:
Incremental Algorithms
 Note that a given row in the
materialized view can appear many
times (duplicates are not eliminated)
 Main idea behind incremental
algorithms is to efficiently compute
changes to the rows of the view
• New rows
• Changes to count associated with a row
• A row is deleted if its count becomes 0
View Maintenance
 The changes (inserts and deletes) to a relation
or expressions are referred to as its differential
 Set of tuples inserted to and deleted from r are denoted
i
r
and d
r
 To simplify our description, we only consider
inserts and deletes
 We replace updates to a tuple by deletion of the tuple
followed by insertion of the update tuple
 We describe how to compute the change to the
result of each relational operation, given
changes to its inputs
 We then outline how to handle relational algebra
expressions
Join Operation
 Consider the materialized view v = r s and an
update to r
 Let rold and rnew denote the old and new states
of relation r
 Consider the case of an insert to r:
 We can write rnew s as (rold  ir) s
 And rewrite the above to (rold s)  (ir s)
 But (rold s) is simply the old value of the materialized
view, so the incremental change to the view is just ir s
 Thus, for inserts vnew = vold (ir s)
 Similarly for deletes vnew = vold – (dr s)
Selection & Projection
Operations
 Selection: Consider a view v = 

(r).
 v
new
= v
old


(i
r
)
 v
new
= v
old
- 

(d
r
)
 Projection is a more difficult operation
 R = (A,B), and r(R) = { (a,2), (a,3)}
 
A
(r) has a single tuple (a).
 If we delete the tuple (a,2) from r, we should not delete the
tuple (a) from 
A
(r), but if we then delete (a,3) as well, we
should delete the tuple
 For each tuple in a projection 
A
(r) , we will keep a count
of how many times it was derived
 On insert of a tuple to r, if the resultant tuple is already in

A
(r) we increment its count, else we add a new tuple with
count = 1
 On delete of a tuple from r, we decrement the count of the
corresponding tuple in 
A
(r)
• if the count becomes 0, we delete the tuple from 
A
(r)
Aggregate Operations
 count : v =
A
g
count(B)
(r)
( count of the attribute B, after grouping r by attribute A)
 When a set of tuples i
r
is inserted
• For each tuple t in i
r
, if the group t.A is present
in v, we increment its count, else we add a new
tuple (t.A, 1) with count = 1
 When a set of tuples d
r
is deleted
• for each tuple t in i
r
.
we look for the group t.A in
v, and subtract 1 from the count for the group.
• If the count becomes 0, we delete from v the tuple
for the group t.A
Example
 Relation account grouped by branch-name:
branch_name
g
sum(balance)
(account)
branch_name account_number balance
Perryridge
Perryridge
Brighton
Brighton
Redwood
A-102
A-201
A-217
A-215
A-222
400
900
750
750
700
branch_name sum(balance)
Perryridge
Brighton
Redwood
1300
1500
700
Aggregate Operations
 sum: v =
A
g
sum (B)
(r)

 We maintain the sum in a manner similar to count,
except we add/subtract the B value instead of
adding/subtracting 1 for the count
 Additionally we maintain the count in order to
detect groups with no tuples. Such groups are
deleted from v
• Cannot simply test for sum = 0 (why?)
 To handle the case of avg, we maintain the
sum and count aggregate values separately,
and divide at the end
Aggregate Operations
 min, max: v =
A
g
min (B)
(r).
 Handling insertions on r is
straightforward.
 Maintaining the aggregate values min
and max on deletions may be more
expensive.

We have to look at the
other tuples of r that are in the same
group to find the new minimum

Example
Test
g
Min(marks)
(student_record))
TEST IDNO Marks
T1
T1
T2
T2
T3
A-102
A-103
A-102
A-103
A-104
15
20
25
10
10
TEST Min(marks)
T1
T2
15
10