You are on page 1of 31

CSE 3207Database Systems

Lecture 1: Introduction

CSE Program
The School of EE & Computing
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Introduction
INDEX
▶ Databases
• Database System consists
Intro. • Database Management System (DBMS) and
Applications
Purpose • Database which contains raw facts and data about
▶ View of Data
the structure of tables
Data Abstraction.
Instances and Schemas • Both are interdependent
Data Models
▶ Languages • Database management system (DBMS)
• Collection of programs
DDL
DML
SQL
▶ DB Design
• Manages structure and controls access to data
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Introduction
INDEX
▶ Databases
• Data are usually stored in a database.
Intro.
Applications
• To implement a database and to manage its
Purpose
▶ View of Data
contents, you need a
Data Abstraction. database management system (DBMS).
Instances and Schemas
Data Models • The DBMS serves as the intermediary
▶ Languages
DDL between the user and the database.
• The database contains the data you have
DML
SQL
▶ DB Design
▶ DB Engine
collected and “data about data,” known as
Storage Manager metadata.
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Database Applications
INDEX
▶ Databases
• Examples
Intro. • Banking: transactions

Applications
Purpose
Airlines: reservations, schedules
▶ View of Data • Universities: registration, grades
Data Abstraction.
Instances and Schemas • Sales: customers, products, purchases
Data Models
▶ Languages
• Online retailers: order tracking, customized
DDL recommendations
DML
SQL
• Manufacturing: production, inventory, orders, supply
▶ DB Design chain
▶ DB Engine • Human resources: employee records, salaries, tax
Storage Manager
Query Processing
deductions
Transaction Manager
▶ DB • Databases can be very large.
Architecture
▶ DB Users
▶ History • Databases touch all aspects of our lives
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

University Database
INDEX
▶ Databases
• Application program examples
Intro. • Add new students, instructors, and courses
Applications
Purpose • Register students for courses, and generate class
▶ View of Data
rosters
Data Abstraction.
Instances and Schemas • Assign grades to students, compute grade point
Data Models
▶ Languages
averages (GPA) and generate transcripts
DDL
DML
• In the early days, database applications were
SQL built directly on top of file systems
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Drawbacks of File Systems


INDEX
▶ Databases
• Data redundancy and inconsistency
Intro. • Multiple file formats, duplication of information in
Applications
different files
Purpose
▶ View of Data
• Difficulty in accessing data
Data Abstraction.
Instances and Schemas • Need to write a new program to carry out each new
Data Models
task
▶ Languages
DDL • Data isolation
DML
SQL • Multiple files and formats
▶ DB Design
▶ DB Engine • Integrity problems
Storage Manager
• Integrity constraints (e.g., account balance > 0)
Query Processing
Transaction Manager become “buried” in program code rather than being
▶ DB stated explicitly
Architecture
▶ DB Users
▶ History • Hard to add new constraints or change existing ones
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Drawbacks of File Systems


INDEX
▶ Databases
• Atomicity of updates
Intro. • Failures may leave database in an inconsistent state
Applications
with partial updates carried out
Purpose
▶ View of Data • Example: Transfer of funds from one account to
Data Abstraction. another should either complete or not happen at all
Instances and Schemas
Data Models • Concurrent access by multiple users
▶ Languages
DDL
• Concurrent access needed for performance
DML • Uncontrolled concurrent accesses can lead to
SQL
▶ DB Design
inconsistencies
▶ DB Engine • Example: Two people reading a balance (say 100) and
Storage Manager updating it by withdrawing money (say 50 each) at the same
Query Processing time
Transaction Manager
▶ DB • Security problems
Architecture
▶ DB Users
▶ History
• Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Levels of Abstraction
INDEX
▶ Databases
• Physical level: describes how a record (e.g.,
Intro. instructor) is stored.
Applications
Purpose • Logical level: describes data stored in database,
▶ View of Data
Data Abstraction.
and the relationships among the data.
Instances and Schemas type instructor = record
Data Models
▶ Languages
ID : string;
DDL name : string;
DML dept_name : string;
SQL
▶ DB Design
salary : integer;
▶ DB Engine end;
Storage Manager
Query Processing
• View level: application programs hide details of
Transaction Manager data types. Views can also hide information (such
▶ DB
Architecture
▶ DB Users as an employee’s salary) for security purposes.
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Architecture
INDEX
▶ Databases
Intro.
Applications
Purpose
▶ View of Data
Data Abstraction.
Instances and Schemas
Data Models
▶ Languages
DDL
DML
SQL
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Instances and Schemas


INDEX • Similar to types and variables in programming languages
▶ Databases
Intro.
• Logical Schema – the overall logical structure of the database
Applications • Example: The database consists of information about a set of
Purpose customers and accounts in a bank and the relationship between them
▶ View of Data Analogous to type information of a variable in a program
Data Abstraction.
Instances and Schemas
• Physical schema– the overall physical structure of the
Data Models database
▶ Languages
DDL
• Instance – the actual content of the database at a particular
DML point in time
SQL • Analogous to the value of a variable
▶ DB Design
▶ DB Engine
• Physical Data Independence – the ability to modify the physical
Storage Manager
schema without changing the logical schema
Query Processing • Applications depend on the logical schema
Transaction Manager • In general, the interfaces between the various levels and components
▶ DB
Architecture
should be well defined so that changes in some parts do not seriously
▶ DB Users
influence others.
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Data Models
INDEX • A collection of tools for describing
▶ Databases
Intro.
• Data
Applications • Data relationships
Purpose
• Data semantics
▶ View of Data
Data Abstraction.
• Data constraints
Instances and Schemas
Data Models
• Relational model
▶ Languages
• Entity-Relationship data model (mainly for database
DDL
DML design)
• Object-based data models (Object-oriented and
SQL
▶ DB Design
▶ DB Engine Object-relational)
Storage Manager
Query Processing • Semi structured data model (XML)
• Other older models:
Transaction Manager
▶ DB
Architecture
▶ DB Users • Network model
▶ History • Hierarchical model
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Relational Model
INDEX
▶ Databases
• All the data is stored in various tables.
Intro.
Applications
• Example of tabular data in the relational
Columns
Purpose
▶ View of Data
model
Data Abstraction.
Instances and Schemas
Data Models Rows
▶ Languages
DDL
DML
SQL
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

A Sample Relational Database


INDEX
▶ Databases
Intro.
Applications
Purpose
▶ View of Data
Data Abstraction.
Instances and Schemas
Data Models
▶ Languages
DDL
DML
SQL
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Data Definition Language (DDL)


INDEX • Specification notation for defining the database schema
▶ Databases
Intro.
• Example: create table instructor (
Applications ID char(5),
Purpose namevarchar(20),
▶ View of Data dept_name varchar(20),
Data Abstraction.
salary numeric(8,2))
Instances and Schemas
Data Models • DDL compiler generates a set of table templates stored
▶ Languages
DDL
in a data dictionary
DML • Data dictionary contains metadata (i.e., data about data)
SQL
▶ DB Design • Database schema
▶ DB Engine • Integrity constraints
Storage Manager • Primary key (ID uniquely identifies instructors)
Query Processing
Transaction Manager
• Authorization
▶ DB • Who can access what
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Data Manipulation Language


INDEX
▶ Databases
• Language for accessing and manipulating the
Intro. data organized by the appropriate data model
Applications
Purpose • DML also known as query language
▶ View of Data
Data Abstraction. • Two classes of languages
• Pure – used for proving properties about
Instances and Schemas
Data Models
▶ Languages computational power and for optimization
DDL
DML
• Relational Algebra
SQL • Tuple relational calculus
▶ DB Design
• Domain relational calculus
▶ DB Engine
Storage Manager • Commercial – used in commercial systems
Query Processing
• SQL is the most widely used commercial language
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

SQL
INDEX
▶ Databases
• The most widely used commercial language
Intro.
Applications
• SQL is NOT a Turing machine equivalent
Purpose
▶ View of Data
language
Data Abstraction.
Instances and Schemas
• To be able to compute complex functions SQL
Data Models is usually embedded in some higher-level
▶ Languages
DDL language
• Application programs generally access
DML
SQL
▶ DB Design
▶ DB Engine
databases through one of
Storage Manager • Language extensions to allow embedded SQL
Query Processing
Transaction Manager • Application program interface (e.g., ODBC/JDBC)
▶ DB
Architecture
▶ DB Users
which allow SQL queries to be sent to a database
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Database Design
INDEX
▶ Databases
• Logical Design – Deciding on the database
Intro. schema. Database design requires that we find a
Applications
Purpose
“good” collection of relation schemas.
▶ View of Data • Business decision – What attributes should we
Data Abstraction.
Instances and Schemas
record in the database?
Data Models • Computer Science decision – What relation schemas
▶ Languages
DDL
should we have and how should the attributes be
DML distributed among the various relation schemas?
• Physical Design – Deciding on the physical layout
SQL
▶ DB Design
▶ DB Engine
Storage Manager
of the database
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Database Design
INDEX
▶ Databases
• Is there any problem with this relation?
Intro.
Applications
Purpose
▶ View of Data
Data Abstraction.
Instances and Schemas
Data Models
▶ Languages
DDL
DML
SQL
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Design Approaches
INDEX
▶ Databases
• Need to come up with a methodology to
Intro. ensure that each of the relations in the
Applications
Purpose database is “good”
▶ View of Data
Data Abstraction. • Two ways of doing so:
Instances and Schemas
Data Models
• Entity Relationship Model (Chapter 7)
▶ Languages • Models an enterprise as a collection of entities and
DDL
relationships
DML
SQL • Represented diagrammatically by an entity-relationship
▶ DB Design diagram:
▶ DB Engine
Storage Manager
• Normalization Theory (Chapter 8)
Query Processing • Formalize what designs are bad, and test for them
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Object-Relational Data Models


INDEX
▶ Databases
• Relational model: flat, “atomic” values
Intro.
Applications
• Object Relational Data Models
Purpose
▶ View of Data
• Extend the relational data model by including
Data Abstraction. object orientation and constructs to deal with
Instances and Schemas added data types.
Data Models
▶ Languages • Allow attributes of tuples to have complex types,
DDL including non-atomic values such as nested
DML
SQL relations.
▶ DB Design
• Preserve relational foundations, in particular the
▶ DB Engine
Storage Manager
declarative access to data, while extending
Query Processing modeling power.
• Provide upward compatibility with existing
Transaction Manager
▶ DB
Architecture
▶ DB Users
relational languages.
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Extensible Markup Language


INDEX
▶ Databases
• Defined by the WWW Consortium (W3C)
Intro.
Applications
• Originally intended as a document mark-up
Purpose
▶ View of Data
language not a database language
Data Abstraction.
Instances and Schemas
• The ability to specify new tags, and to create
Data Models nested tag structures made XML a great way
▶ Languages
DDL to exchange data, not just documents
• XML has become the basis for all new
DML
SQL
▶ DB Design
▶ DB Engine
generation data interchange formats.
Storage Manager
Query Processing
• A wide variety of tools is available for parsing,
Transaction Manager
▶ DB
browsing and querying XML documents/data
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Database Engine
INDEX
▶ Databases
• Storage manager
Intro.
Applications
• Query processing
• Transaction manager
Purpose
▶ View of Data
Data Abstraction.
Instances and Schemas
Data Models
▶ Languages
DDL
DML
SQL
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Storage Management
INDEX
▶ Databases
• Storage manager is a program module that
Intro. provides the interface between the low-level
Applications
Purpose
data stored in the database and the application
▶ View of Data
Data Abstraction.
programs and queries submitted to the system.
Instances and Schemas
Data Models
• The storage manager is responsible to the
▶ Languages following tasks:
DDL
DML
• Interaction with the OS file manager
SQL
• Efficient storing, retrieving and updating of data
▶ DB Design
▶ DB Engine • Issues:
Storage Manager
Query Processing • Storage access
Transaction Manager
▶ DB • File organization
Architecture
▶ DB Users
▶ History
• Indexing and hashing
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Query Processing
INDEX
▶ Databases
1. Parsing and translation
Intro.
Applications
2. Optimization
Purpose
▶ View of Data 3. Evaluation
Data Abstraction.
Instances and Schemas
Data Models
▶ Languages
DDL
DML
SQL
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Query Processing
INDEX
▶ Databases
• Alternative ways of evaluating a given query
Intro. • Equivalent expressions
Applications
Purpose • Different algorithms for each operation
▶ View of Data
Data Abstraction. • Cost difference between a good and a bad
Instances and Schemas
Data Models
way of evaluating a query can be enormous
▶ Languages
DDL • Need to estimate the cost of operations
DML
SQL
• Depends critically on statistical information about
▶ DB Design relations which the database must maintain
▶ DB Engine
Storage Manager
• Need to estimate statistics for intermediate
Query Processing results to compute cost of complex expressions
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Transaction Management
INDEX • What if the system fails?
▶ Databases
Intro. • What if more than one user is concurrently updating
Applications
Purpose the same data?
▶ View of Data
Data Abstraction.
• A transaction is a collection of operations that
Instances and Schemas performs a single logical function in a database
Data Models
▶ Languages
application
DDL
DML
• Transaction-management component ensures that
SQL the database remains in a consistent (correct) state
▶ DB Design
▶ DB Engine
despite system failures (e.g., power failures and
Storage Manager
operating system crashes) and transaction failures.
Query Processing
Transaction Manager
• Concurrency-control manager controls the
▶ DB
Architecture
▶ DB Users
interaction among the concurrent transactions, to
▶ History ensure the consistency of the database.
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

Database Architecture
INDEX
▶ Databases
• The architecture of a database systems is
Intro. greatly influenced by the underlying computer
Applications
Purpose system on which the database is running:
▶ View of Data
Data Abstraction.
• Centralized
Instances and Schemas • Client-server
Data Models
▶ Languages • Parallel (multi-processor)
DDL
DML
• Distributed
SQL
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

DB Users and Administrators


INDEX
▶ Databases
Intro.
Applications
Purpose
▶ View of Data
Data Abstraction.
Instances and Schemas
Data Models
▶ Languages
DDL
DML
SQL
▶ DB Design
▶ DB Engine Database
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

DB Users and Administrators


INDEX
▶ Databases
Intro.
Applications
Purpose
▶ View of Data
Data Abstraction.
Instances and Schemas
Data Models
▶ Languages
DDL
DML
SQL
▶ DB Design
▶ DB Engine
Storage Manager
Query Processing
Transaction Manager
▶ DB
Architecture
▶ DB Users
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

History of DB Systems
INDEX
▶ Databases
• 1950s and early 1960s:
Intro. • Data processing using magnetic tapes for storage
Applications
• Tapes provided only sequential access
Purpose
▶ View of Data • Punched cards for input
Data Abstraction.
Instances and Schemas • Late 1960s and 1970s:
Data Models
▶ Languages
• Hard disks allowed direct access to data
DDL • Network and hierarchical data models in widespread
DML
SQL
use
▶ DB Design • Ted Codd defines the relational data model
▶ DB Engine
• Would win the ACM Turing Award for this work
Storage Manager
Query Processing
• IBM Research begins System R prototype
Transaction Manager • UC Berkeley begins Ingres prototype
▶ DB
Architecture
▶ DB Users
• High-performance (for the era) transaction processing
▶ History
Course CSE 3207: Database Systems Lesson Introduction to Database Systems

History of DB Systems
INDEX • 1980s:
▶ Databases
Intro.
• Research relational prototypes evolve into commercial systems
Applications • SQL becomes industrial standard
Purpose • Parallel and distributed database systems
▶ View of Data
• Object-oriented database systems
Data Abstraction.
Instances and Schemas • 1990s:
Data Models
▶ Languages
• Large decision support and data-mining applications
DDL • Large multi-terabyte data warehouses
DML • Emergence of Web commerce
SQL
▶ DB Design • Early 2000s:
▶ DB Engine • XML and XQuery standards
Storage Manager
• Automated database administration
Query Processing
Transaction Manager • Later 2000s:
▶ DB
Architecture
▶ DB Users
• Giant data storage systems
▶ History • Google BigTable, Yahoo PNuts, Amazon, ..

You might also like