Database Cheatsheet

Introduction to Database 1
Chapter 1
A DBMS (database management system) is specialized software for efficiently managing large amounts of
mostly structured data. A DBMS provides
a data model, which specifies a logical structure of the data, called a schema
a high-level query language
an efficient persistent storage system
transaction management, where a transaction is an atomic unit of work
Data Model
...is a collection of conceptual tools for describing
data
relationships among various data
data semantics
consistency constraints
It provides users with an abstract view of the data. There are data models like
relational model
entity-relationship data model, which is mainly for database design
object-based data models
semi-structured data model
et cetera
Relational Model
Instances and Schemas

A schema is a description of the structure of the database.
logical schema: the logical structure of the database
conceptual schema, e.g., ER model, UML
implementation schema, e.g., relational model
physical schema, e.g., storage id, format, index
An instance is the actual content of the database at a particular point in time.
Levels of Abstraction
Physical, logical, and view level
Data Independence
ability to make physical or logical level changes without affecting application programs
DDL: Data Definition Language

...provides facilities to define the relation schema, e.g.
create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2)
);
It provides facilities to specify integrity constraints
primary key, unique, not null, foreign key constraints
domain constraints
It provides facilities for authorization
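A minimal sketch of DDL-level integrity constraints in action, using Python's built-in sqlite3 module as a stand-in DBMS. The NOT NULL on name, the CHECK on salary, the sample rows, and the helper try_insert are assumptions added for illustration; they are not part of the instructor schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE instructor (
        ID        CHAR(5),
        name      VARCHAR(20) NOT NULL,             -- assumed constraint
        dept_name VARCHAR(20),
        salary    NUMERIC(8,2) CHECK (salary > 0),  -- assumed domain constraint
        PRIMARY KEY (ID)
    )
""")
conn.execute("INSERT INTO instructor VALUES ('10101', 'Kim', 'Physics', 65000)")

def try_insert(row):
    """Hypothetical helper: report whether the DBMS accepts the row."""
    try:
        conn.execute("INSERT INTO instructor VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert(("10101", "Lee", "Music", 40000)),   # duplicate primary key
    try_insert(("22222", None, "Music", 40000)),    # NULL in a NOT NULL column
    try_insert(("33333", "Park", "Music", -1)),     # violates the CHECK
    try_insert(("44444", "Choi", "Music", 40000)),  # satisfies every constraint
]
row_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```

The point of the sketch is that the constraints live in the schema, so the DBMS rejects bad rows no matter which application sends them.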
DML: Data Manipulation Language

...is the "query language"! A DML is either procedural or declarative. SQL (Structured Query Language)
is a declarative DML. It is not as powerful as a Turing machine, so it is usually embedded in some host language.
......
Chapter 2
Terms
A relation schema R(A1, ..., An) gives the relation name R and its attributes A1, ..., An, like the header of a table.
logical design of a database
A relation instance r(R) defined over schema R is a set of rows
a snapshot of the data in the database at a given instant in time
A tuple is an element of a relation, aka a row in a table
A binary relation over sets X and Y is a subset of the Cartesian product X × Y
A relation is a set of tuples
Keys
A key K ⊆ R = {A1, ..., An}
K is a superkey if the values for K uniquely identify a tuple in every possible relation instance r(R)
A candidate key is a minimal superkey
One of the candidate keys is designated to be the primary key
A foreign key refers to another relation’s primary key.

Referential integrity constraint: A value of an attribute in a relation must be the value of an
attribute in another relation.
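The key definitions above can be sketched in plain Python. A superkey is a statement about every possible instance, so checking one instance can only refute a key, never prove it. The functions is_superkey and candidate_keys and the sample rows are hypothetical illustrations.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this one instance agree on all of attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, schema):
    """Minimal attribute sets that are superkeys of this instance."""
    found = []
    for size in range(1, len(schema) + 1):
        for attrs in combinations(schema, size):
            if any(set(k) <= set(attrs) for k in found):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found

schema = ("ID", "name", "dept_name")
instructor = [
    {"ID": "1", "name": "Kim", "dept_name": "Physics"},
    {"ID": "2", "name": "Kim", "dept_name": "Music"},
    {"ID": "3", "name": "Lee", "dept_name": "Music"},
]
keys = candidate_keys(instructor, schema)
```

On this sample data, (name, dept_name) also comes out as a key, but only by accident: a second Kim in Music would break it. That is why keys are declared from the meaning of the data rather than inferred from a snapshot.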
Relational Algebra
Basic Operators
selection σ
σp(r), where p is the selection predicate
relation consisting of the rows of r that satisfy the predicate
projection Π
ΠA1 ,...,Ak (r) where Ai are attribute names
relation of k listed columns
since relations are sets, duplicate rows are removed
cartesian product ×
one result tuple for each possible pair of tuples, one from each relation
If an attribute of the same name exists in both relations, we need to distinguish them.
join r ⋈θ s = σθ(r × s)
union ∪, difference −
relations must have the same arity, i.e., same number of attributes
...and the attribute domains must be compatible
rename ρ
ρx (E), where E is a relational algebra expression
A result of E, by default, does not have a name to be referred to by.
Can also be used as ρx(A1,...,Ak) (E)
Apart from join, which is defined above via selection and cartesian product, these are the fundamental
operators of relational algebra: none of them can be written in terms of the others.
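A rough sketch of selection, projection, cartesian product, and theta join in plain Python, with a relation modelled as a list of dicts without duplicates. The function names, the name-prefixing scheme for distinguishing attributes, and the sample data are all assumptions made for illustration.

```python
def select(r, p):
    """sigma_p(r): the rows of r that satisfy predicate p."""
    return [t for t in r if p(t)]

def project(r, attrs):
    """Pi_attrs(r): keep the listed columns and drop duplicate rows."""
    out = []
    for t in r:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

def product(r, s, r_name, s_name):
    """r x s: one row per pair; attributes are prefixed to keep equal names apart."""
    return [
        {**{f"{r_name}.{a}": v for a, v in t.items()},
         **{f"{s_name}.{a}": v for a, v in u.items()}}
        for t in r
        for u in s
    ]

def theta_join(r, s, r_name, s_name, theta):
    """Theta join computed literally as sigma_theta(r x s)."""
    return select(product(r, s, r_name, s_name), theta)

instructor = [
    {"ID": 1, "name": "Kim", "dept_name": "Physics"},
    {"ID": 2, "name": "Lee", "dept_name": "Music"},
    {"ID": 3, "name": "Park", "dept_name": "Physics"},
]
teaches = [
    {"ID": 1, "course_id": "PHY-101"},
    {"ID": 3, "course_id": "PHY-101"},
    {"ID": 3, "course_id": "PHY-102"},
]

physics = select(instructor, lambda t: t["dept_name"] == "Physics")
depts = project(instructor, ["dept_name"])  # duplicate "Physics" is removed
pairs = product(instructor, teaches, "instructor", "teaches")
joined = theta_join(instructor, teaches, "instructor", "teaches",
                    lambda t: t["instructor.ID"] == t["teaches.ID"])
```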
Extended Operators
duplicate-elimination δ
extended projection πB+C→X
aggregation AVG, SUM, etc
grouping γ
grouping_attributes γ aggregation (r)
dept_name γ avg(salary) as avg_salary (instructor)
...but do we need to project this: Π dept_name, avg_salary ()?

sorting τ
outer join ⟕ (left), ⟖ (right), ⟗ (full)
assignment ←
A query can be written as a sequential program consisting of a series of assignments.
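A small sketch of the grouping operator γ in plain Python, assuming the common textbook definition in which the result schema is the grouping attributes followed by the aggregate columns. Under that assumption the result already has exactly the columns dept_name and avg_salary, so an extra projection onto them would change nothing. The function group_aggregate and the sample data are hypothetical.

```python
from collections import defaultdict

def group_aggregate(r, group_attrs, agg_attr, agg_fn, out_name):
    """Group r by group_attrs and apply agg_fn to agg_attr within each group."""
    groups = defaultdict(list)
    for t in r:
        groups[tuple(t[a] for a in group_attrs)].append(t[agg_attr])
    return [
        {**dict(zip(group_attrs, key)), out_name: agg_fn(values)}
        for key, values in groups.items()
    ]

def avg(values):
    return sum(values) / len(values)

instructor = [
    {"name": "Kim", "dept_name": "Physics", "salary": 90000},
    {"name": "Lee", "dept_name": "Music", "salary": 40000},
    {"name": "Park", "dept_name": "Physics", "salary": 70000},
]

# dept_name γ avg(salary) as avg_salary (instructor)
dept_avg = group_aggregate(instructor, ["dept_name"], "salary", avg, "avg_salary")
```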
Chapter 3
DDL in SQL
We have data types like:
char(n): fixed-length string
varchar(n): string with maximum length n
int: integer, machine-dependent
smallint: small integer, machine-dependent
numeric(p, d): aka decimal, fixed-point number with p digits in total, d of them after the decimal point
e.g. numeric(3,2) can store 3.24
real, double precision: floating point, machine-dependent
We create tables like:
CREATE TABLE table_name (
attr0_name attr0_type,
attr1_name attr1_type NOT NULL,
...
attrn_name attrn_type,
PRIMARY KEY (attrp0_name, ..., attrpm_name),
FOREIGN KEY (attrf0_name, ..., attrfk_name) REFERENCES other_table
)
We alter tables like:
ALTER TABLE r
ADD attr_name attr_type
or:
ALTER TABLE r
DROP attr_name
We drop tables like:
... ROBERT');
DROP TABLE students; --
I bet you are already familiar with SELECT FROM WHERE. Remember to use the DISTINCT or ALL keyword
to explicitly eliminate or keep duplicate rows; ALL is the default.
String Operations
We have the LIKE operator: % matches zero or more of any characters, _ matches any single character, and
an escape character can be specified with the ESCAPE keyword. In MySQL, LIKE is case-insensitive by default.
We can also concatenate with the || operator, convert case with the UPPER() and LOWER() functions, get the
length or extract a substring with functions such as LEN() and SUBSTRING() (function names vary by DBMS), etc.
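The string operations above can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the course table, its rows, the helper titles, and the choice of ! as escape character are made up, and SQLite spells the length and substring functions LENGTH() and SUBSTR(). SQLite's LIKE is also case-insensitive for ASCII letters by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (title TEXT)")
conn.executemany(
    "INSERT INTO course VALUES (?)",
    [("Intro to Databases",), ("Data_Mining",), ("100% Python",), ("Datamining",)],
)

def titles(where):
    """Hypothetical helper: titles matching a WHERE condition, sorted."""
    sql = f"SELECT title FROM course WHERE {where} ORDER BY title"
    return [row[0] for row in conn.execute(sql)]

starts_with_data = titles("title LIKE 'Data%'")                # % = any run of characters
one_wildcard = titles("title LIKE 'Data_ining'")               # _ = exactly one character
literal_underscore = titles("title LIKE 'Data!_%' ESCAPE '!'") # !_ = a real underscore
literal_percent = titles("title LIKE '%!%%' ESCAPE '!'")       # !% = a real percent sign
lowercase_pattern = titles("title LIKE 'data%'")               # case-insensitive in SQLite

functions = conn.execute(
    "SELECT 'Data' || 'base', UPPER('sql'), LOWER('SQL'), LENGTH('sql'), SUBSTR('database', 1, 4)"
).fetchone()
```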
Clauses
WHERE is a clause.
HAVING applies to each group, while WHERE applies to each tuple before forming groups.
ORDER BY can order the tuples of the result by one or more attributes. You can specify ASC(default)
or DESC per attribute, like:
ORDER BY dept_name ASC, gpa DESC
GROUP BY is a clause to be used with aggregate functions. The result can only contain aggregate values
and/or the grouping attribute(s).
WITH is a clause to define a temporary relation, called a common table expression (CTE).
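A sketch of how these clauses combine in one query, run through Python's built-in sqlite3 module. The instructor rows, the thresholds, and the CTE name dept_avg are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT, dept_name TEXT, salary REAL)")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?)", [
    ("Kim", "Physics", 90000), ("Park", "Physics", 70000),
    ("Lee", "Music", 40000),
    ("Choi", "History", 62000), ("Han", "History", 60000),
])

rows = conn.execute("""
    WITH dept_avg (dept_name, avg_salary) AS (   -- temporary relation (CTE)
        SELECT dept_name, AVG(salary)
        FROM instructor
        WHERE salary > 50000                     -- filters tuples before grouping
        GROUP BY dept_name
        HAVING COUNT(*) >= 2                     -- filters whole groups
    )
    SELECT dept_name, avg_salary
    FROM dept_avg
    ORDER BY avg_salary DESC
""").fetchall()
```

WHERE removes Lee before any group is formed, so the Music group never exists. HAVING then keeps only groups with at least two remaining tuples.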
Clause Predicates
BETWEEN operator like:
WHERE gpa BETWEEN 2.7 AND 3.7
Row constructor, like:
SELECT name, course_id
FROM instructor, teaches
WHERE (instructor.ID, dept_name) = (teaches.ID, 'History')
...honestly not seeing much point here.
Set Operations
UNION, INTERSECT, and EXCEPT (set difference) eliminate duplicates (DISTINCT) by default. Specify the ALL
keyword to keep them.
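A short sketch of the set operations using Python's built-in sqlite3 module. The fall and spring tables and the helper ids are made up for illustration. SQLite itself accepts ALL only with UNION, so INTERSECT ALL and EXCEPT ALL are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fall (course_id TEXT);
    CREATE TABLE spring (course_id TEXT);
    INSERT INTO fall VALUES ('CS-101'), ('CS-347'), ('PHY-101');
    INSERT INTO spring VALUES ('CS-101'), ('CS-101'), ('CS-315');
""")

def ids(op):
    """Hypothetical helper: combine fall and spring with the given set operator."""
    sql = f"SELECT course_id FROM fall {op} SELECT course_id FROM spring ORDER BY course_id"
    return [row[0] for row in conn.execute(sql)]

union = ids("UNION")          # duplicates removed
union_all = ids("UNION ALL")  # duplicates kept
intersect = ids("INTERSECT")
difference = ids("EXCEPT")    # in fall but not in spring
```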
Subqueries
In the following SQL query:
SELECT a1, a2, ..., an
FROM r1, r2, ..., rm
WHERE p
ai can be replaced by a subquery that generates a single value, aka scalar subquery
ri can be replaced by any valid subquery
p can be replaced with an expression of the form "attribute <operation> (subquery)"
Set Comparison Operator

SOME/ALL: comparison holds true for some/all rows in the subquery
EXISTS: subquery is not empty
UNIQUE: subquery contains no duplicate rows
...but is there even an SQL variant actually supporting this?
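A sketch of the three subquery positions (in SELECT, in FROM, in WHERE) run through Python's built-in sqlite3 module. The tables and rows are assumptions for illustration. SQLite does not implement SOME/ALL or UNIQUE, so the WHERE example uses EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (ID INTEGER PRIMARY KEY, name TEXT, dept_name TEXT, salary REAL);
    CREATE TABLE teaches (ID INTEGER, course_id TEXT);
    INSERT INTO instructor VALUES
        (1, 'Kim', 'Physics', 90000), (2, 'Lee', 'Music', 40000), (3, 'Park', 'Physics', 70000);
    INSERT INTO teaches VALUES (1, 'PHY-101'), (1, 'PHY-102'), (3, 'PHY-101');
""")

# Scalar subquery in SELECT: yields exactly one value per outer row.
course_counts = conn.execute("""
    SELECT name, (SELECT COUNT(*) FROM teaches t WHERE t.ID = i.ID)
    FROM instructor i
    ORDER BY name
""").fetchall()

# Subquery in FROM: its result is used like any other relation.
rich_depts = conn.execute("""
    SELECT dept_name, avg_salary
    FROM (SELECT dept_name, AVG(salary) AS avg_salary
          FROM instructor
          GROUP BY dept_name)
    WHERE avg_salary > 50000
""").fetchall()

# EXISTS in WHERE: true when the correlated subquery is not empty.
teaching = conn.execute("""
    SELECT name
    FROM instructor i
    WHERE EXISTS (SELECT 1 FROM teaches t WHERE t.ID = i.ID)
    ORDER BY name
""").fetchall()
```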
Modification of the Database

INSERT, DELETE, and UPDATE, like...
INSERT INTO takes (attribute, names, optionally)
VALUES (attribute, values, in_order);
INSERT INTO takes
SELECT studentID, courseID, 'Spring'
FROM student, course
WHERE ...;
UPDATE instructor
SET salary = salary * 1.05
WHERE ...;
UPDATE instructor
SET salary = CASE
WHEN salary <= 100000 THEN salary * 1.05
ELSE salary * 1.03
END;
DELETE FROM instructor; -- no WHERE clause: deletes every row!
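A runnable sketch of the modification statements, using Python's built-in sqlite3 module. The rows, the senior table, the 110000 and 50000 thresholds, and the helper salaries are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("CREATE TABLE senior (ID TEXT, name TEXT)")  # hypothetical target table

conn.executemany(
    "INSERT INTO instructor (ID, name, salary) VALUES (?, ?, ?)",
    [("1", "Kim", 100000), ("2", "Lee", 120000), ("3", "Park", 40000)],
)

# One UPDATE with CASE, so each row is raised exactly once.
conn.execute("""
    UPDATE instructor
    SET salary = CASE
        WHEN salary <= 100000 THEN salary * 1.05
        ELSE salary * 1.03
    END
""")

# INSERT ... SELECT: the inserted rows come from a query.
conn.execute("INSERT INTO senior SELECT ID, name FROM instructor WHERE salary > 110000")

def salaries():
    """Hypothetical helper: name -> salary, rounded to cents."""
    query = "SELECT name, salary FROM instructor"
    return {name: round(salary, 2) for name, salary in conn.execute(query)}

after_update = salaries()
senior = conn.execute("SELECT name FROM senior").fetchall()

conn.execute("DELETE FROM instructor WHERE salary < 50000")  # removes matching rows only
after_delete = sorted(salaries())

conn.execute("DELETE FROM instructor")                       # no WHERE: removes every row
remaining = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```

The CASE form matters: two separate UPDATE statements, one per salary band, could raise the same row twice if the first raise pushes it into the other band.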
Chapter 4
Join
A join is either an inner join or a left/right/full outer join, depending on how unmatched (dangling) tuples are
treated.
The join condition can be given with ON predicate, NATURAL, or USING (attrs).
student NATURAL JOIN takes; -- joins on all common attribute names, here studentID
instructor JOIN teaches USING(courseID); -- every instructor who ever opened the course
That was the inner join, the default. An outer join, in addition to the result of the inner join, keeps tuples
that have no match. LEFT/RIGHT/FULL OUTER JOIN keep the unmatched tuples from the
left/right/both operand relations, padding the missing attributes with NULL.
For an inner join, JOIN ... ON predicate is equivalent to putting the predicate in WHERE. For an outer join
they differ: ON decides which tuples match, while WHERE filters the result afterwards.
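A sketch comparing the join forms with Python's built-in sqlite3 module. The student and takes rows and the helper q are assumptions for illustration. Only LEFT OUTER JOIN is shown, since older SQLite versions lack RIGHT and FULL. The last two queries show where ON and WHERE stop being interchangeable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (studentID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE takes (studentID INTEGER, courseID TEXT);
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cho');
    INSERT INTO takes VALUES (1, 'CS-101'), (1, 'CS-347'), (2, 'CS-101');
""")

def q(sql):
    """Hypothetical helper: run a query and return all rows."""
    return conn.execute(sql).fetchall()

natural = q("SELECT name, courseID FROM student NATURAL JOIN takes ORDER BY name, courseID")
using = q("SELECT name, courseID FROM student JOIN takes USING (studentID) ORDER BY name, courseID")

# Outer join: Cho takes nothing, yet is kept with courseID = NULL (None in Python).
left = q("""
    SELECT name, courseID
    FROM student LEFT OUTER JOIN takes USING (studentID)
    ORDER BY name, courseID
""")

# Condition in ON: students without a CS-347 row are still kept, padded with NULL.
on_filter = q("""
    SELECT name, courseID
    FROM student LEFT OUTER JOIN takes
      ON student.studentID = takes.studentID AND takes.courseID = 'CS-347'
    ORDER BY name
""")

# Same condition in WHERE: applied after the join, so the padded rows are dropped.
where_filter = q("""
    SELECT name, courseID
    FROM student LEFT OUTER JOIN takes
      ON student.studentID = takes.studentID
    WHERE takes.courseID = 'CS-347'
    ORDER BY name
""")
```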
View
A view is a virtual relation, i.e., its contents are not stored; only the defining query is.
CREATE VIEW viewname AS (query);
But can we modify a view to modify the stored relation(s)? Most SQL implementations allow updates only
on simple views, where
FROM: only one relation
SELECT: only attribute names of the relation
no expressions, aggregates, or DISTINCT specification
must contain every not-null attribute
no GROUP BY or HAVING clause
Then there is the materialized view, a view that is physically stored. How do we keep it up-to-date?
Periodic reconstruction: unacceptable for applications requiring up-to-date data
Incremental maintenance: only recompute parts that are affected by the changes of underlying base tables
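A sketch contrasting a virtual view with a stored copy, using Python's built-in sqlite3 module. SQLite has no materialized views, so the physics_snapshot table below only emulates one. All table, view, and helper names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT);
    INSERT INTO instructor VALUES ('1', 'Kim', 'Physics'), ('2', 'Lee', 'Music');

    -- Virtual view: only the defining query is stored.
    CREATE VIEW physics_view AS
        SELECT ID, name FROM instructor WHERE dept_name = 'Physics';

    -- Emulated materialized view: a physically stored result of the same query.
    CREATE TABLE physics_snapshot AS
        SELECT ID, name FROM instructor WHERE dept_name = 'Physics';
""")

def names(relation):
    """Hypothetical helper: names currently visible through a relation."""
    return [row[0] for row in conn.execute(f"SELECT name FROM {relation} ORDER BY name")]

conn.execute("INSERT INTO instructor VALUES ('3', 'Park', 'Physics')")

view_after = names("physics_view")          # recomputed from the base table
snapshot_after = names("physics_snapshot")  # stale until it is maintained

# Periodic reconstruction: empty the stored copy and rebuild it from scratch.
conn.executescript("""
    DELETE FROM physics_snapshot;
    INSERT INTO physics_snapshot
        SELECT ID, name FROM instructor WHERE dept_name = 'Physics';
""")
rebuilt = names("physics_snapshot")
```

Incremental maintenance would instead add only Park's row to the stored copy, typically via triggers or DBMS support, rather than recomputing the whole query.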
Transaction
A: atomicity
C: consistency
I: isolation
D: durability
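A minimal sketch of atomicity with Python's built-in sqlite3 module: either both updates of a transfer take effect, or neither does. The account table, the CHECK constraint, and the functions transfer and balances are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))

ok = transfer("A", "B", 30)       # both updates are committed together
after_ok = balances()
failed = transfer("A", "B", 500)  # the debit violates the CHECK, so the credit is undone
after_failed = balances()
```

The second transfer credits B first and only then fails on A. The rollback removes the credit again, which is the atomicity property. The CHECK keeping balances non-negative is an example of the consistency the transaction must preserve.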
Integrity Constraints
On a single relation: NOT NULL, PRIMARY KEY, UNIQUE (= superkey), CHECK (predicate)
Referential integrity: e.g., if a row in the instructor table has value 'Biology' for attribute deptName,
then there must exist a row in the department table whose deptName is 'Biology'.
Foreign key constraint: every value in R.A must appear in the primary key of S; A is called a foreign key.
NULL is allowed in A unless it is declared NOT NULL.
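The foreign key rule can be sketched with Python's built-in sqlite3 module. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON. The table definitions, rows, and the helper attempt are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE department (deptName TEXT PRIMARY KEY);
    CREATE TABLE instructor (
        ID       TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        deptName TEXT,                     -- nullable foreign key
        FOREIGN KEY (deptName) REFERENCES department (deptName)
    );
    INSERT INTO department VALUES ('Biology');
""")

def attempt(sql, params=()):
    """Hypothetical helper: 'ok' if the statement is accepted, else 'rejected'."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

outcomes = {
    "existing department": attempt("INSERT INTO instructor VALUES ('1', 'Kim', ?)", ("Biology",)),
    "NULL department": attempt("INSERT INTO instructor VALUES ('2', 'Lee', ?)", (None,)),
    "unknown department": attempt("INSERT INTO instructor VALUES ('3', 'Park', ?)", ("Physics",)),
    "delete referenced row": attempt("DELETE FROM department WHERE deptName = 'Biology'"),
}
```

The constraint is checked in both directions: a referencing row cannot point at a missing department, and a referenced department cannot be deleted while a row still points at it.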
