You are on page 1of 7

UB CSE Database Courses

CSE 562 CSE 462


Database
Database Systems Concepts

Introduction CSE 562


Database
Systems

Some slides are based or modified from originals by


Database Systems: The Complete Book,
Pearson Prentice Hall 2nd Edition
©2008 Garcia-Molina, Ullman, and Widom CSE 636 CSE 7xx CSE 7xx
Data Dr. Chomicki’s Dr. Petropoulos’
Integration Seminar Seminar

UB CSE 562 Spring 2010 UB CSE 562 Spring 2010 2

Resources Goals of the Course

•  Instructor: Dr. Michalis Petropoulos •  Study key DBMS design issues:


Office Hours: Tue & Thu @ 2-3pm –  Query Languages, Indexing, Query Compilation, Query
Location: 210 Bell Hall Execution, Query Optimization
•  TA: Xinglei Zhu –  Recovery, Concurrency and other transaction
Office Hours: Wednesday @ 12-2pm management issues
Location: 329 Bell Hall –  Data warehousing, Column-Oriented DBMS
•  Web Page –  Database as a Service (DaaS), Cloud DBMS
http://www.cse.buffalo.edu/~mpetropo/CSE562-SP10/ •  Ensure that:
•  Newsgroup –  You are comfortable using a DBMS
sunyab.cse.562
–  You acquire hands-on experience by implementing the
internal components of a relational database engine

UB CSE 562 Spring 2010 3 UB CSE 562 Spring 2010 4

1
Course Evaluation Reading Material

•  Homework Assignments: 15% Textbook


•  Midterm Exam: 20% (in class) •  Database Systems: The Complete Book (2nd Edition)
–  by Garcia-Molina, Ullman and Widom
•  Final Exam: 30% (in class) Also Recommended
•  Project: 35% •  Database System Concepts (5th Edition)
–  Teams of 2 –  by Avi Silberschatz, Henry F. Korth and S. Sudarshan
•  Database Management Systems (3rd Edition)
–  by Ramakrishnan and Gehrke
•  Late Submission Policy:
•  Foundations of Databases
–  Submissions less than 24 hours late will be penalized
–  by Abiteboul, Hull and Vianu
10%
–  Submissions more than 24 hours but less than 48
hours late will be penalized 25%

UB CSE 562 Spring 2010 5 UB CSE 562 Spring 2010 6

Prerequisites Let’s Get Started!

•  Chapters 2 through 8 of the textbook: •  Isn’t Implementing a Database System Simple?


–  Relational Model, Entity/Relationship (E/R) Model
–  Constraints, Functional Dependencies
–  Views, Triggers
–  Database query languages (Relational Algebra, SQL) Relations Statements Results
•  Solid background in Data Structures and
Algorithms
•  Significant programming experience in Java
•  Some knowledge of Compilers

•  Curiosity! You should ask a lot of questions!

UB CSE 562 Spring 2010 7 UB CSE 562 Spring 2010 8

2
Let’s Get Started! Megatron 3000 Implementation Details

•  Relations stored in files (ASCII)


–  e.g., relation Movie
Wild # Lynch # Winger
Sky # Berto # Winger
Reds # Beatty # Beatty

•  Directory file (ASCII)


Movie # Title # STR # Director # STR # Actor # STR …
Schedule # Theater # STR # Title # STR …

UB CSE 562 Spring 2010 9 UB CSE 562 Spring 2010 10

Megatron 3000 Sample Sessions Megatron 3000 Sample Sessions

% MEGATRON3000 & select *


Welcome to MEGATRON 3000! from Movie #
&
Title Director Actor
… Wild Lynch Winger
Sky Berto Winger
& quit Reds Beatty Beatty
% Tango Berto Brando
Tango Berto Winger
Tango Berto Snyder

&

UB CSE 562 Spring 2010 11 UB CSE 562 Spring 2010 12

3
Megatron 3000 Sample Sessions Megatron 3000

•  To execute select * from Movie where condition


& select Theater, Movie.Title (1) Read dictionary to get Movie attributes
from Movie, Schedule (2) Read Movie file, for each line:
where Movie.Title = Schedule.Title
(a) Check condition
AND Actor = “Winger” #
(b) If OK, display
Theater Title
Odeon Wild
Forum Sky

&

UB CSE 562 Spring 2010 13 UB CSE 562 Spring 2010 14

What’s wrong with the Megatron 3000


Megatron 3000
DBMS?

•  To execute •  Tuple layout on disk


select Theater, Movie.Title –  Change string from ‘Cat’ to ‘Cats’ and we have to
from Movie, Schedule rewrite file
where Movie.Title = Schedule.Title –  Deletions are expensive
AND Actor = “Winger”
(1) Read dictionary to get Movie, Schedule attributes
(2) Read Movie file, for each line:
(a) Read Schedule file, for each line:
(i) Create join tuple
(ii) Check condition
(iii) Display if OK

UB CSE 562 Spring 2010 15 UB CSE 562 Spring 2010 16

4
What’s wrong with the Megatron 3000 What’s wrong with the Megatron 3000
DBMS? DBMS?

•  Search expensive; no indexes •  Brute force query processing


–  Cannot find tuple with given key quickly For example:
–  Always have to read full relation select Theater, Movie.Title
from Movie, Schedule
where Movie.Title=Schedule.Title
and Actor = “Winger”
•  Much better if
–  Use index to select tuples with “Winger” first
–  Use index to find theaters where qualified titles play
•  Or
–  Sort both relations on Title and merge-join
•  Exploit cache and buffers

UB CSE 562 Spring 2010 17 UB CSE 562 Spring 2010 18

What’s wrong with the Megatron 3000 What’s wrong with the Megatron 3000
DBMS? DBMS?

•  Concurrency control & recovery •  Security


•  No reliability •  Interoperation with other systems
–  Can lose data •  Consistency enforcement
–  Can leave operations half done

UB CSE 562 Spring 2010 19 UB CSE 562 Spring 2010 20

5
Course Topics Course Topics

•  Hardware Aspects •  Concurrency Control


•  Physical Organization Structure Correctness, locks, deadlocks…
Records in blocks, dictionary, buffer management,… •  On-Line Analytical Processing
•  Indexing Map-Reduce,…
B-Trees, hashing,… •  Database as a Service (DaaS)
•  Query Processing •  Cloud DBMS
parsing, rewriting, logical/physical operators,
algebraic and cost-based optimization,
semantic optimization…
•  Crash Recovery
Failures, stable storage,…

UB CSE 562 Spring 2010 21 UB CSE 562 Spring 2010 22

Database System Architecture Example Journey of a Query


Query Processing Transaction Management Initial logical
SELECT Theater query plan
SQL Query FROM Movie M, Schedule S Parse/
Calls from Transactions (read,write)
WHERE M.Title = S.Title Convert
Parser AND M.Actor=“Winger”
relational Transaction
algebra Manager
Query View
Definitions Query Rewriter
Rewriter Concurrency Lock Yet Another logical applies Algebraic Rewriting
and Statistics &
Catalogs & Controller Table query plan
Optimizer System Data Another logical
query
query plan
execution Recovery
plan
Manager Query
Execution Buffer Rewriter
Engine Manager
Log
Data + Indexes
Cont’d
UB CSE 562 Spring 2010 23 UB CSE 562 Spring 2010 24

6
Example Journey of a Query (cont’d) The Journey of a Query (cont’d)

ActorIndex
Title Director Actor
Algebraic Wild Lynch Winger
Optimization Sky Berto Winger
Winger Reds Beatty Beatty
Tango Berto Brando
TitleIndex Tango Berto Winger
Tango Berto Snyder
Cost-based
What are the rules used for the
Optimization
transformation of the query?
EXECUTION ENGINE How is the table arranged on
How do we evaluate the cost of Query the disk?
a possible execution plan? Execution find “Winger” tuples using ActorIndex
Plan Are tuples with the same
for each “Winger” tuple
Index on Actor and Title Actor value clustered?
find tuples t with the same title
Unsorted tables
Tables >> Memory using TitleIndex What is the exact structure of
project the attribute Actor of t the index (tree, hash,…)?

UB CSE 562 Spring 2010 25 UB CSE 562 Spring 2010 26

You might also like