Lecture 5: SQL
Instructor: Henry Kalisti
Data Definition Language
• May change:
• The schema for each relation.
• The domain of values associated with each
attribute.
• Integrity constraints
• The set of indices to be maintained for each
relation.
• Security and authorization information for
each relation.
• The physical storage structure of each
relation on disk.
Domain Types in SQL
• char(n). Fixed length character string, with user-
specified length n.
• varchar(n). Variable length character strings,
with user-specified maximum length n.
• int. Integer (a finite subset of the integers that is
machine-dependent).
• smallint. Small integer (a machine-dependent
subset of the integer domain type).
• numeric(p, d). Fixed point number, with user-
specified precision of p digits, with d digits to the
right of decimal point.
Domain Types in SQL
• real, double precision. Floating point and double-
precision floating point numbers, with machine-
dependent precision.
• float(n). Floating point number, with user-
specified precision of at least n digits.
• Null values are allowed in all the domain types.
Declaring an attribute to be not null prohibits null
values for that attribute.
• create domain construct in SQL-92 creates user-
defined domain types
Integrity Constraints in Create Table
• not null
• primary key (A1, ..., An)
• foreign key (A1, ..., An) references s(B1, ..., Bn)
• check (P), where P is a predicate
Example: Declare branch_name as the primary key for branch and
ensure that the values of assets are non-negative.
create table branch
(branch_name char(15),
branch_city char(30),
assets integer,
primary key (branch_name),
check (assets >= 0))
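The effect of these constraints can be tried out with Python's built-in sqlite3 module. This is only a sketch: the sample rows and the helper try_insert are hypothetical, and the column names use underscores because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE branch (
        branch_name CHAR(15),
        branch_city CHAR(30),
        assets      INTEGER,
        PRIMARY KEY (branch_name),
        CHECK (assets >= 0)
    )
""")
conn.execute("INSERT INTO branch VALUES ('Downtown', 'Brooklyn', 900000)")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO branch VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Violates check (assets >= 0).
negative_assets_ok = try_insert(("Uptown", "Queens", -5))
# Violates the primary key: 'Downtown' already exists.
duplicate_key_ok = try_insert(("Downtown", "Boston", 100))

row_count = conn.execute("SELECT COUNT(*) FROM branch").fetchone()[0]
```

Both rejected rows leave the table unchanged, so only the first row remains.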
DML
• Read or change the content of the database
Data in SQL
1. Atomic types, a.k.a. data types
2. Tables built from atomic types
Data Types in SQL
• Characters:
CHAR(20) -- fixed length
VARCHAR(40) -- variable length
• Numbers:
BIGINT, INT, SMALLINT, TINYINT
REAL, FLOAT -- differ in precision
MONEY
• Times and dates:
DATE
DATETIME -- SQL Server
• Others... All are simple
Tables in SQL
[Figure: the Product table, annotated with its table name, attribute names, and tuples (rows).]
Tables Explained
• A tuple = a record
• Restriction: all attributes are of atomic type
• A table = a set of tuples
• Like a list…
• …but it is unordered: no first(), no next(), no
last().
• No nested tables, only flat tables are allowed !
• We will see later how to decompose complex
structures into multiple flat tables
Tables Explained
• The schema of a table is the table name and its
attributes:
• Product(PName, Price, Category, Manufacturer)
SQL Query
SELECT attributes
FROM relations (possibly multiple, joined)
WHERE conditions (selections)
Simple SQL Query
Input table: Product
SELECT *
FROM Product
WHERE category='Gadgets'
Result ("selection"):
PName       Price   Category  Manufacturer
Gizmo       $19.99  Gadgets   GizmoWorks
Powergizmo  $29.99  Gadgets   GizmoWorks
Simple SQL Query
[Figure: the Product table and the output schema of the query.]
Selections
What goes in the WHERE clause:
• x = y, x < y, x <= y, etc
• For numbers, they have the usual meanings
• For CHAR and VARCHAR: lexicographic
ordering
• Expected conversion between CHAR and
VARCHAR
• For dates and times, what you expect...
• Pattern matching on strings: s LIKE p (next)
The LIKE operator
• s LIKE p: pattern matching on strings
• p may contain two special symbols:
• % = any sequence of characters
• _ = any single character
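The two wildcards can be tried on the Product table used in these slides. The sketch below uses Python's sqlite3 module; the patterns are illustrative choices, and note that SQLite's LIKE ignores case for ASCII letters by default, which other systems may not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT)"
)
conn.executemany("INSERT INTO Product VALUES (?, ?, ?, ?)", [
    ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
    ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
    ("SingleTouch", 149.99, "Photography", "Canon"),
    ("MultiTouch", 203.99, "Household", "Hitachi"),
])

def names_like(pattern):
    """Return the product names matching the LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT PName FROM Product WHERE PName LIKE ? ORDER BY PName", (pattern,)
    )
    return [name for (name,) in rows]

# % stands for any sequence of characters, including the empty one.
contains_izmo = names_like("%izmo%")
ends_in_touch = names_like("%Touch")
# _ stands for exactly one character, so only a five-letter name can match.
one_char_then_izmo = names_like("_izmo")
```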
Eliminating duplicates with DISTINCT:
SELECT DISTINCT category
FROM Product
Result:
Category
Gadgets
Photography
Household
Ordering the Results
SELECT Category
FROM Product
ORDER BY PName
?
Product
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
Ordering the Results
SELECT DISTINCT category
Category
Gadgets
Compare to:
SELECT DISTINCT category
FROM Product
ORDER BY PName
?
Joins in SQL
• Connect two or more tables:
PName        Price
SingleTouch  $149.99
Joins
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
SELECT Country
FROM Product, Company
WHERE Manufacturer=CName AND Category='Gadgets'
Joins in SQL
Product
PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi
Company
CName       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan
SELECT Country
FROM Product, Company
WHERE Manufacturer=CName AND Category='Gadgets'
Result:
Country
??
??
What is the problem? What's the solution?
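Running the join on the two tables shown makes the problem visible. A sketch with Python's sqlite3 module, using the slide's rows with prices stored as plain numbers; DISTINCT is one possible fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT);
    CREATE TABLE Company (CName TEXT, StockPrice INTEGER, Country TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 19.99, 'Gadgets', 'GizmoWorks'),
        ('Powergizmo', 29.99, 'Gadgets', 'GizmoWorks'),
        ('SingleTouch', 149.99, 'Photography', 'Canon'),
        ('MultiTouch', 203.99, 'Household', 'Hitachi');
    INSERT INTO Company VALUES
        ('GizmoWorks', 25, 'USA'),
        ('Canon', 65, 'Japan'),
        ('Hitachi', 15, 'Japan');
""")

join_sql = """
    SELECT {distinct} Country
    FROM Product, Company
    WHERE Manufacturer = CName AND Category = 'Gadgets'
"""

# One output row per matching (Product, Company) pair, so the country repeats.
with_duplicates = [c for (c,) in conn.execute(join_sql.format(distinct=""))]
# DISTINCT collapses the repeated rows.
without_duplicates = [c for (c,) in conn.execute(join_sql.format(distinct="DISTINCT"))]
```

Both gadgets are made by GizmoWorks, so the plain join returns USA twice.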
Joins
Product (pname, price, category, manufacturer)
Purchase (buyer, seller, store, product)
Person(persname, phoneNumber, city)
SELECT DISTINCT pname, address
FROM Person, Company
WHERE worksfor = cname
Which address?
Answer (store)
Tuple Variables
General rule:
tuple variables introduced automatically by the system:
SELECT Product.name
FROM Product AS Product
WHERE Product.price > 100
Doesn’t work when Product occurs more than once:
In that case the user needs to define variables explicitly.
Renaming Columns
Product
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Meaning (Semantics) of SQL Queries
SELECT a1, a2, …, ak
FROM R1 AS x1, R2 AS x2, …, Rn AS xn
WHERE Conditions
1. Nested loops:
Answer = {}
for x1 in R1 do
  for x2 in R2 do
    …..
      for xn in Rn do
        if Conditions
          then Answer = Answer ∪ {(a1,…,ak)}
return Answer
2. Parallel assignment
Answer = {}
for all assignments x1 in R1, …, xn in Rn do
  if Conditions then Answer = Answer ∪ {(a1,…,ak)}
return Answer
Looking for R ∩ (S ∪ T)
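The nested-loop reading can be written out directly. In this Python sketch the function evaluate and its parameters are hypothetical names; relations are lists of tuples, and itertools.product enumerates every combination of one tuple per relation, which is what the nested loops do. Like the pseudocode, it collects answers in a set, so duplicates disappear (a real SQL engine keeps them unless DISTINCT is given).

```python
from itertools import product as all_assignments

def evaluate(relations, condition, select):
    """Nested-loop semantics: test every combination of one tuple per relation."""
    answer = set()                           # Answer = {}
    for xs in all_assignments(*relations):   # x1 in R1, ..., xn in Rn
        if condition(*xs):                   # if Conditions
            answer.add(select(*xs))          # add the selected attributes
    return answer

# Rows are (PName, Price, Category, Manufacturer) and (CName, StockPrice, Country).
Product = [
    ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
    ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
    ("SingleTouch", 149.99, "Photography", "Canon"),
    ("MultiTouch", 203.99, "Household", "Hitachi"),
]
Company = [
    ("GizmoWorks", 25, "USA"),
    ("Canon", 65, "Japan"),
    ("Hitachi", 15, "Japan"),
]

# SELECT Country FROM Product, Company
# WHERE Manufacturer = CName AND Category = 'Gadgets'
countries = evaluate(
    [Product, Company],
    condition=lambda p, c: p[3] == c[0] and p[2] == "Gadgets",
    select=lambda p, c: (c[2],),
)
```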
Advanced SQL
Part 2
Outline
Union, Intersection, Difference
(SELECT name
FROM Person
WHERE City='Seattle')
UNION
(SELECT name
FROM Person, Purchase
WHERE buyer=name AND store='The Bon')
Similarly, you can use INTERSECT and EXCEPT.
You must have the same attribute names (otherwise: rename).
Conserving Duplicates
(SELECT name
FROM Person
WHERE City='Seattle')
UNION ALL
(SELECT name
FROM Person, Purchase
WHERE buyer=name AND store='The Bon')
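The difference between the two operators shows up as soon as one person satisfies both branches. A sketch with Python's sqlite3 module; the Person and Purchase rows are hypothetical, and the parentheses around each SELECT are dropped because SQLite does not accept them in a compound query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (name TEXT, city TEXT);
    CREATE TABLE Purchase (buyer TEXT, store TEXT);
    INSERT INTO Person VALUES ('Ann', 'Seattle'), ('Bob', 'Portland'), ('Cat', 'Seattle');
    INSERT INTO Purchase VALUES ('Ann', 'The Bon'), ('Bob', 'The Bon');
""")

query = """
    SELECT name FROM Person WHERE city = 'Seattle'
    {op}
    SELECT name FROM Person, Purchase
    WHERE buyer = name AND store = 'The Bon'
"""

# UNION removes duplicates: Ann lives in Seattle and shops at The Bon, listed once.
union_names = sorted(n for (n,) in conn.execute(query.format(op="UNION")))
# UNION ALL conserves duplicates: Ann appears once per branch.
union_all_names = sorted(n for (n,) in conn.execute(query.format(op="UNION ALL")))
```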
Subqueries
SELECT Purchase.product
FROM Purchase, Person
WHERE buyer = name AND ssn = '123456789'
This is equivalent to the previous one when the ssn is a key
and ‘123456789’ exists in the database;
otherwise they are different.
Subqueries Returning
Relations
Find companies that manufacture products bought by Joe Blow.
SELECT Company.name
FROM Company, Product
WHERE Company.name = Product.maker
  AND Product.name IN
      (SELECT Purchase.product
       FROM Purchase
       WHERE Purchase.buyer = 'Joe Blow');
Here the subquery returns a set of values: no more
runtime errors.
Subqueries Returning
Relations
Equivalent to:
SELECT Company.name
FROM Company, Product, Purchase
WHERE Company.name = Product.maker
AND Product.name = Purchase.product
AND Purchase.buyer = 'Joe Blow'
SELECT name
FROM Product
WHERE price > ALL (SELECT price
                   FROM Purchase
                   WHERE maker='Gizmo-Works')
Correlated Queries
Movie (title, year, director, length)
Find movies whose title appears more than once.
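One possible correlated formulation of this query, sketched with Python's sqlite3 module. The movie rows are invented, and the EXISTS form is an assumption chosen because SQLite supports it; the inner query is correlated because it refers to x, the tuple variable of the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie (title TEXT, year INTEGER, director TEXT, length INTEGER);
    INSERT INTO Movie VALUES
        ('Alpha', 1950, 'Director One', 100),
        ('Alpha', 2001, 'Director Two', 120),
        ('Beta',  1999, 'Director One', 95);
""")

# For each outer tuple x, the subquery is re-evaluated with x.title and x.year:
# it looks for another movie with the same title made in a different year.
repeated_titles = [t for (t,) in conn.execute("""
    SELECT DISTINCT x.title
    FROM Movie AS x
    WHERE EXISTS (SELECT *
                  FROM Movie AS y
                  WHERE y.title = x.title AND y.year <> x.year)
""")]
```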
Existential/Universal
Conditions
Product ( pname, price, company)
Company( cname, city)
Find all companies s.t. some of their products have price < 100
Existential: easy! :-)
Find all companies s.t. all of their products have price < 100
Universal: hard! :-(
Existential/Universal
Conditions
1. Find the other companies: i.e. s.t. some product ≥ 100
SELECT DISTINCT Company.cname
FROM Company
WHERE Company.cname IN (SELECT Product.company
                        FROM Product
                        WHERE Product.price >= 100)
2. Find all companies s.t. all their products have price < 100
SELECT DISTINCT Company.cname
FROM Company
WHERE Company.cname NOT IN (SELECT Product.company
                            FROM Product
                            WHERE Product.price >= 100)
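The two steps can be checked on a small instance. A sketch with Python's sqlite3 module; the companies A, B, C and their prices are hypothetical. Company C has no products at all, so it satisfies the universal condition vacuously.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (cname TEXT, city TEXT);
    CREATE TABLE Product (pname TEXT, price REAL, company TEXT);
    INSERT INTO Company VALUES ('A', 'Seattle'), ('B', 'Boston'), ('C', 'Austin');
    INSERT INTO Product VALUES
        ('a1', 50, 'A'), ('a2', 80, 'A'),
        ('b1', 50, 'B'), ('b2', 150, 'B');
""")

template = """
    SELECT DISTINCT Company.cname
    FROM Company
    WHERE Company.cname {op} (SELECT Product.company
                              FROM Product
                              WHERE Product.price >= 100)
    ORDER BY Company.cname
"""

# Step 1: companies with some product priced at 100 or more.
some_expensive = [c for (c,) in conn.execute(template.format(op="IN"))]
# Step 2: the complement, i.e. companies all of whose products cost less than 100.
all_cheap = [c for (c,) in conn.execute(template.format(op="NOT IN"))]
```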
Aggregation
SELECT Avg(price)
FROM Product
WHERE maker='Toyota'
Aggregation: Count
SELECT Count(*)
FROM Product
WHERE year > 1995
Better:
First compute the FROM-WHERE clauses (date >
'9/1') then GROUP BY product:
Then, aggregate
Product  TotalSales
Bagel    $29.75
Banana   $12.48
For every product, what is the total sales and max quantity sold?
SELECT product, Sum(price * quantity) AS SumSales,
       Max(quantity) AS MaxQuantity
FROM Purchase
GROUP BY product
Product  SumSales  MaxQuantity
Banana   $12.48    17
Bagel    $29.75    20
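A runnable sketch of this query with Python's sqlite3 module. The Purchase rows are hypothetical, chosen so that the totals match the figures in the result table, and prices are stored as integer cents (an assumption) to avoid floating-point rounding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchase (product TEXT, price INTEGER, quantity INTEGER)")
conn.executemany("INSERT INTO Purchase VALUES (?, ?, ?)", [
    ("Bagel", 85, 15),    # price in cents
    ("Banana", 52, 7),
    ("Banana", 52, 17),
    ("Bagel", 85, 20),
])

# One output row per product: both aggregates are computed within each group.
per_product = conn.execute("""
    SELECT product,
           SUM(price * quantity) AS SumSales,
           MAX(quantity)         AS MaxQuantity
    FROM Purchase
    GROUP BY product
    ORDER BY product
""").fetchall()
```

Here 2975 and 1248 cents correspond to $29.75 and $12.48.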
HAVING Clause
General form of Grouping and
Aggregation
SELECT S
FROM R1,…,Rn
WHERE C1
GROUP BY a1,…,ak
HAVING C2
Evaluation steps:
1. Compute the FROM-WHERE part, obtain a table with all attributes in
R1,…,Rn
2. Group by the attributes a1,…,ak
3. Compute the aggregates in C2 and keep only groups satisfying C2
4. Compute aggregates in S and return the result
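The four evaluation steps can be followed on a small example. A sketch with Python's sqlite3 module; the Purchase rows, the integer-cent prices, and the thresholds are hypothetical choices for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchase (product TEXT, price INTEGER, quantity INTEGER)")
conn.executemany("INSERT INTO Purchase VALUES (?, ?, ?)", [
    ("Bagel", 85, 15),
    ("Banana", 52, 7),
    ("Banana", 52, 17),
    ("Bagel", 85, 20),
])

# 1. FROM-WHERE keeps the rows with quantity > 10 (drops the Banana row with 7).
# 2. GROUP BY forms one group per product.
# 3. HAVING keeps groups whose summed quantity exceeds 30 (Bagel: 35, Banana: 17).
# 4. The aggregate in the SELECT list is computed for the surviving groups.
big_sellers = conn.execute("""
    SELECT product, SUM(price * quantity) AS TotalSales
    FROM Purchase
    WHERE quantity > 10
    GROUP BY product
    HAVING SUM(quantity) > 30
""").fetchall()
```

WHERE filters individual rows before grouping, while HAVING filters whole groups after it.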
Aggregation
Author(login,name)
Document(url, title)
Wrote(login,url)
Mentions(url,word)
Grouping vs. Nested Queries
• Find all authors who wrote at least 10 documents:
• Attempt 1: with nested queries (this is SQL by a novice)
SELECT DISTINCT Author.name
FROM Author
WHERE (SELECT count(Wrote.url)
       FROM Wrote
       WHERE Author.login=Wrote.login)
      >= 10
Grouping vs. Nested Queries
• Find all authors who wrote at least 10 documents:
• Attempt 2: SQL style (with GROUP BY)
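The same style applied to a related question, authors whose documents mention more than 10000 distinct words:
SELECT Author.name
FROM Author, Wrote, Mentions
WHERE Author.login=Wrote.login AND Wrote.url=Mentions.url
GROUP BY Author.name
HAVING count(distinct Mentions.word) > 10000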
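For the question of authors with at least a given number of documents, a GROUP BY formulation might look like the sketch below, written with Python's sqlite3 module. The rows are hypothetical and the threshold is passed as a parameter (2 here instead of 10) so the example stays small. Grouping by login as well as name is a design choice that keeps two authors with the same name apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Author (login TEXT, name TEXT);
    CREATE TABLE Wrote (login TEXT, url TEXT);
    INSERT INTO Author VALUES ('ann', 'Ann'), ('bob', 'Bob');
    INSERT INTO Wrote VALUES
        ('ann', 'doc1'), ('ann', 'doc2'), ('ann', 'doc3'),
        ('bob', 'doc4');
""")

def prolific_authors(min_docs):
    """Names of authors who wrote at least min_docs documents."""
    rows = conn.execute("""
        SELECT Author.name
        FROM Author, Wrote
        WHERE Author.login = Wrote.login
        GROUP BY Author.login, Author.name
        HAVING COUNT(Wrote.url) >= ?
        ORDER BY Author.name
    """, (min_docs,))
    return [name for (name,) in rows]

at_least_two = prolific_authors(2)
at_least_one = prolific_authors(1)
```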
Null Values
• If x= NULL then 4*(3-x)/7 is still NULL
Null Values
• C1 AND C2 = min(C1, C2)
• C1 OR C2 = max(C1, C2)
• NOT C1 = 1 – C1
SELECT *
FROM Person
WHERE (age < 25) AND
      (height > 6 OR weight > 190)
E.g. age=20, height=NULL, weight=200
Rule in SQL: include only tuples that yield TRUE
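The min/max rules can be evaluated by hand for the example tuple. The Python sketch below assumes the usual numeric encoding FALSE = 0, UNKNOWN = 0.5, TRUE = 1, which the rules rely on but which is not spelled out here; the helper names are hypothetical, and None stands for NULL.

```python
import operator

FALSE, UNKNOWN, TRUE = 0.0, 0.5, 1.0   # assumed encoding of the three truth values

def compare(op, x, y):
    """A comparison involving NULL (None) is UNKNOWN; otherwise TRUE or FALSE."""
    if x is None or y is None:
        return UNKNOWN
    return TRUE if op(x, y) else FALSE

def and3(c1, c2):
    return min(c1, c2)

def or3(c1, c2):
    return max(c1, c2)

def not3(c1):
    return 1 - c1

def where_clause(person):
    """(age < 25) AND (height > 6 OR weight > 190)"""
    return and3(
        compare(operator.lt, person["age"], 25),
        or3(compare(operator.gt, person["height"], 6),
            compare(operator.gt, person["weight"], 190)),
    )

heavy = {"age": 20, "height": None, "weight": 200}
light = {"age": 20, "height": None, "weight": 150}

heavy_result = where_clause(heavy)   # TRUE AND (UNKNOWN OR TRUE)
light_result = where_clause(light)   # TRUE AND (UNKNOWN OR FALSE)
```

Only the first tuple yields TRUE, so only it would be included in the answer.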
Null Values
Unexpected behavior:
SELECT *
FROM Person
WHERE age < 25 OR age >= 25
Some Persons are not included!
Null Values
Can test for NULL explicitly:
• x IS NULL
• x IS NOT NULL
SELECT *
FROM Person
WHERE age < 25 OR age >= 25 OR age IS NULL
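The effect of the explicit test can be checked on a table with one NULL age. A sketch with Python's sqlite3 module; the three Person rows are hypothetical, and None is how sqlite3 represents NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO Person VALUES (?, ?)", [
    ("Ann", 20),
    ("Bob", 30),
    ("Cat", None),   # age is NULL
])

# Both comparisons are UNKNOWN for Cat, so the row is not returned.
without_test = [n for (n,) in conn.execute(
    "SELECT name FROM Person WHERE age < 25 OR age >= 25 ORDER BY name")]

# Adding the explicit test brings the NULL row back.
with_test = [n for (n,) in conn.execute(
    "SELECT name FROM Person "
    "WHERE age < 25 OR age >= 25 OR age IS NULL ORDER BY name")]
```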
Same as:
SELECT Product.name, Purchase.store
FROM Product, Purchase
WHERE Product.name = Purchase.prodName
But Products that never sold will be lost!
Outerjoins
Left outer joins in SQL:
Product(name, category)
Purchase(prodName, store)
Product: Name, Category
Purchase: ProdName, Store
Result:
Name      Store
Gizmo     Wiz
Camera    Ritz
Camera    Wiz
OneClick  NULL
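A left outer join that produces this result, sketched with Python's sqlite3 module. The Purchase rows are read off the result table; the Product categories are assumptions, since the input rows are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (name TEXT, category TEXT);
    CREATE TABLE Purchase (prodName TEXT, store TEXT);
    INSERT INTO Product VALUES
        ('Gizmo', 'gadget'), ('Camera', 'Photo'), ('OneClick', 'Photo');
    INSERT INTO Purchase VALUES
        ('Gizmo', 'Wiz'), ('Camera', 'Ritz'), ('Camera', 'Wiz');
""")

# Ordinary join: OneClick was never sold, so it has no matching Purchase row.
inner = conn.execute("""
    SELECT Product.name, Purchase.store
    FROM Product, Purchase
    WHERE Product.name = Purchase.prodName
    ORDER BY Product.name, Purchase.store
""").fetchall()

# Left outer join: every Product row is kept, padded with NULL (None) if unmatched.
outer = conn.execute("""
    SELECT Product.name, Purchase.store
    FROM Product LEFT OUTER JOIN Purchase
         ON Product.name = Purchase.prodName
    ORDER BY Product.name, Purchase.store
""").fetchall()
```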
Outer Joins
Modifying the Database
Three kinds of modifications
• Insertions
• Deletions
• Updates
Insertions
General form:
camera - -
Insertion: an Example
camera 200 -
UPDATE PRODUCT
SET price = price/2
WHERE Product.name IN
(SELECT product
FROM Purchase
WHERE Date = 'Oct, 25, 1999');
Data Definition in SQL
So far we have seen the Data Manipulation Language, DML
Next: Data Definition Language (DDL)
Data types:
Defines the types.
• Create tables
• Delete tables
• Modify table schema
• Characters:
• CHAR(20) -- fixed length
• VARCHAR(40) -- variable length
• Numbers:
• INT, REAL plus variations
• Times and dates:
• DATE, TIME (Pointbase)
Creating Tables
Example:
CREATE TABLE Person(
    name VARCHAR(30),
    social_security_number INT,
    age SMALLINT,
    city VARCHAR(30),
    gender BIT(1),
    birthdate DATE
);
Deleting or Modifying a Table
Deleting:
Example: DROP TABLE Person; Exercise with care!!
SELECT *
FROM Person
WHERE name = 'Smith'
Sequential scan of the file Person may take long
Indexes
Syntax:
Creating Indexes
Helps in:
SELECT *
FROM Person
WHERE age = 55 AND city = 'Seattle'
and even in:
SELECT *
FROM Person
WHERE age = 55
But not in:
SELECT *
FROM Person
WHERE city = 'Seattle'
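These three cases are consistent with an index on the attribute pair (age, city), which is the assumption made in the sketch below; the index name doubleindex and the Person columns are hypothetical. It uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN to ask which queries would use the index. Other systems may plan differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (name TEXT, age INTEGER, city TEXT)")
# Assumed index: age first, then city.
conn.execute("CREATE INDEX doubleindex ON Person (age, city)")

def plan(sql):
    """Return SQLite's query plan for sql as one string (detail is the last column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

# Both indexed attributes are constrained: the index applies.
plan_both = plan("SELECT * FROM Person WHERE age = 55 AND city = 'Seattle'")
# Only the leading attribute is constrained: a prefix of the index still applies.
plan_age = plan("SELECT * FROM Person WHERE age = 55")
# Only the second attribute is constrained: entries are ordered by age first,
# so the index offers no direct way to find a city.
plan_city = plan("SELECT * FROM Person WHERE city = 'Seattle'")

uses_index = {
    "age and city": "doubleindex" in plan_both,
    "age only": "doubleindex" in plan_age,
    "city only": "doubleindex" in plan_city,
}
```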
The Index Selection Problem
• We are given a workload = a set of SQL queries
plus how often they run
• What indexes should we build to speed up the
workload ?
• FROM/WHERE clauses → favor an index
• INSERT/UPDATE clauses → discourage an index
• Index selection = normally done by people,
recently done automatically
Defining Views
Views are relations, except that they are not physically stored.
What Happens When We Query a
View ?
SELECT name, Seattle_view.store
FROM Seattle_view, Product
WHERE Seattle_view.product = Product.name AND
      Product.category = 'shoes'
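When a view is queried, the system can substitute the view's definition into the query. The sketch below, using Python's sqlite3 module, compares the query on the view with a hand-expanded version over the base tables. The view definition, the table columns, and all rows are assumptions, since the definition is not shown here; the view is named Seattle_view because a hyphen is not valid in an unquoted identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (name TEXT, city TEXT);
    CREATE TABLE Purchase (buyer TEXT, seller TEXT, store TEXT, product TEXT);
    CREATE TABLE Product (name TEXT, category TEXT);
    INSERT INTO Person VALUES ('Ann', 'Seattle'), ('Bob', 'Portland');
    INSERT INTO Purchase VALUES
        ('Ann', 'Joe', 'The Bon', 'Sneaker'),
        ('Bob', 'Joe', 'Wiz', 'Sneaker'),
        ('Ann', 'Joe', 'Wiz', 'Gizmo');
    INSERT INTO Product VALUES ('Sneaker', 'shoes'), ('Gizmo', 'gadgets');

    -- Assumed definition: purchases made by people who live in Seattle.
    CREATE VIEW Seattle_view AS
        SELECT buyer, seller, product, store
        FROM Person, Purchase
        WHERE Person.city = 'Seattle' AND Person.name = Purchase.buyer;
""")

# Query written against the view, which is not stored as a table.
via_view = conn.execute("""
    SELECT name, Seattle_view.store
    FROM Seattle_view, Product
    WHERE Seattle_view.product = Product.name AND
          Product.category = 'shoes'
""").fetchall()

# The same query with the view replaced by its definition.
expanded = conn.execute("""
    SELECT Product.name, Purchase.store
    FROM Person, Purchase, Product
    WHERE Person.city = 'Seattle' AND Person.name = Purchase.buyer AND
          Purchase.product = Product.name AND Product.category = 'shoes'
""").fetchall()
```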
Updating Views
How can I insert a tuple into a table that doesn’t exist?