G PRAKASH
Semi-structured Data Model
Domain Constraint
Referential Integrity
Relational Databases
A relational database is based on the relational model and uses a collection of tables to represent both
data and the relationships among those data
Tables
It also includes a DDL and a DML
Relational model is an example of record-based model
Database structured in fixed-format records of several types
Each table consists of a particular record type
table
select instructor.name from instructor where instructor.dept_name = 'History';
Queries may involve information from more than one table.
select instructor.ID, department.dept_name from instructor, department where
instructor.dept_name= department.dept_name and department.budget > 95000;
Data-Definition Language
SQL provides a rich DDL that allows one to define tables, integrity constraints,
assertions, etc
create table department (dept_name char (20), building char (15), budget number
(12,2));
Database Access from Application Programs
SQL does not support actions such as
Input from users,
output to displays,
Such actions must be written in a host language, such as C, C++, or Java, with
embedded SQL queries that access the data in the database
To access the database, DML statements need to be executed from the host
language. 2 ways:
By providing an application program interface – Open Database Connectivity
(ODBC) – commonly used API standard & Java Database Connectivity (JDBC)
standard
By extending the host language syntax to embed DML calls within the host language
program – a preprocessor, called the DML precompiler, converts the DML statements
to normal procedure calls in the host language.
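The API route can be sketched in a few lines. This is only an illustration, assuming Python as the host language and its standard-library sqlite3 module in place of ODBC or JDBC; the in-memory database and the two sample rows are hypothetical.

```python
import sqlite3

# Hypothetical in-memory database standing in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (ID varchar(5) primary key, "
    "name varchar(20), dept_name varchar(20))"
)
conn.executemany(
    "insert into instructor values (?, ?, ?)",
    [("10101", "Srinivasan", "Comp. Sci."), ("32343", "El Said", "History")],
)

# The host language obtains the input and passes it to the DML statement
# as a parameter through the API.
dept = "History"
history_names = [
    row[0]
    for row in conn.execute(
        "select name from instructor where dept_name = ?", (dept,)
    )
]

# Output to the display is also the host language's job, not SQL's.
print(history_names)
conn.close()
```

The query text is ordinary SQL; everything around it (input, output, control flow) is handled by the host language.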
Database Design
Designed to manage large bodies of information
These large bodies of information do not exist in isolation
Database design mainly involves the design of the database schema
Design Process
Initial phase is to characterize fully the data needs of the prospective database users;
Outcome => user requirements specification.
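Next, the designer chooses a data model and, by applying the concepts of the chosen data
model, translates the requirements into a conceptual schema of the database
The designer decides “what attributes” to capture in the database & “how to group” these attributes to form the various tables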
A relationship is an association among several entities.
Routine maintenance.
RELATIONAL DATABASES
Natural join
In general, the natural join operation on two relations matches tuples whose values
are the same on all attribute names that are common to both relations.
Cartesian product operation
combines tuples from two relations, but unlike the join operation, its result
contains all pairs of tuples from the two relations, regardless of whether their
attribute values match
As relations are sets, we can perform normal set operations on relations
Union operation
performs a set union of two “similarly structured” tables
Intersection operation
Set difference operation
create table department (dept_name varchar (20), building varchar (15), budget
number (12,2), primary key (dept_name));
create table course (course_id varchar (7), title varchar (50),
dept_name varchar (20), credits number (2,0), primary key (course_id),
foreign key (dept_name) references department);
create table instructor (ID varchar (5), name varchar (20) not null, dept_name
varchar (20), salary number (8,2), primary key (ID), foreign key (dept_name)
references department);
To find instructor names and course identifiers for instructors in the Computer
Science department
select name, course_id from instructor, teaches where instructor.ID = teaches.ID
and instructor.dept_name = 'Comp. Sci.';
In general, the meaning of an SQL query can be understood as follows
1. Generate a Cartesian product of the relations listed in the from clause
2. Apply the predicates specified in the where clause on the result of Step 1.
3. For each tuple in the result of Step 2, output the attributes (or results of
expressions) specified in the select clause.
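The three steps above can be traced by hand on a tiny example. The sketch below uses plain Python lists as stand-ins for the relations; the sample rows are illustrative assumptions, and a real DBMS would normally choose a more efficient evaluation plan that produces the same result.

```python
from itertools import product

# Illustrative sample rows: instructor is (ID, name, dept_name),
# teaches is (ID, course_id).
instructor = [("10101", "Srinivasan", "Comp. Sci."), ("15151", "Mozart", "Music")]
teaches = [("10101", "CS-101"), ("15151", "MU-199")]

# Step 1: Cartesian product of the relations listed in the from clause.
pairs = list(product(instructor, teaches))

# Step 2: apply the where-clause predicates to every combined tuple.
matching = [
    (i, t) for i, t in pairs
    if i[0] == t[0] and i[2] == "Comp. Sci."
]

# Step 3: output the attributes named in the select clause.
result = [(i[1], t[1]) for i, t in matching]
print(result)
```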
Natural join (visited again)
Cartesian product: concatenates each tuple of the first relation with every tuple of the second
Natural join: considers only those pairs of tuples with the same value on those attributes that appear in
the schemas of both relations
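The difference is easy to see on a small data set. The following sketch assumes SQLite (through Python's sqlite3 module) as the engine and uses made-up sample rows; instructor and teaches share only the attribute ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID varchar(5), name varchar(20));
create table teaches (ID varchar(5), course_id varchar(8));
insert into instructor values ('10101', 'Srinivasan'), ('15151', 'Mozart');
insert into teaches values ('10101', 'CS-101'), ('10101', 'CS-315'),
                           ('15151', 'MU-199');
""")

# Cartesian product: every instructor tuple paired with every teaches tuple.
cartesian = conn.execute("select * from instructor, teaches").fetchall()

# Natural join: only pairs that agree on the common attribute ID,
# and the common attribute appears once in the result.
natural = conn.execute(
    "select * from instructor natural join teaches order by course_id"
).fetchall()

print(len(cartesian), len(natural))
conn.close()
```

With 2 instructor rows and 3 teaches rows, the Cartesian product has 2 x 3 rows of 4 columns each, while the natural join keeps only the matching pairs with 3 columns.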
SQL NUMERIC FUNCTIONS
Mod(N,M)
Power(m,n)
Sqrt(n)
Round(m)
SQL AGGREGATE FUNCTIONS
MIN(column_name)
MAX(column_name)
COUNT(column_name)
AVG(column_name)
SUM(column_name)
SOME DATE/TIME FUNCTIONS AND QUERIES
Select sysdate from dual;
Select current_date from dual;
LIKE '_r%' => Finds any values that have "r" in the second position
LIKE '%or%' => Finds any values that have "or" in any position
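The two patterns above can be tried out as follows. This is a small sketch assuming SQLite via Python's sqlite3 module; the table and the names in it are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (name varchar(20))")
conn.executemany(
    "insert into instructor values (?)",
    [("Brandt",), ("Crick",), ("Gold",), ("Morris",), ("Mozart",)],
)

# '_' matches exactly one character, '%' matches any run of characters.
second_r = [r[0] for r in conn.execute(
    "select name from instructor where name like '_r%' order by name")]
contains_or = [r[0] for r in conn.execute(
    "select name from instructor where name like '%or%' order by name")]

print(second_r, contains_or)
conn.close()
```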
SQL IN Operator
The IN operator allows you to specify multiple values in a WHERE clause.
SELECT column_name(s) FROM table_name
WHERE column_name IN (value1, value2, ...);
SELECT column_name(s) FROM table_name
WHERE column_name IN (SELECT STATEMENT);
The SQL BETWEEN Operator
The BETWEEN operator selects values within a given range, inclusive of both endpoints. The values can
be numbers, text, or dates.
SELECT column_name(s) FROM table_name
WHERE column_name BETWEEN value1 AND value2;
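The IN and BETWEEN forms above can be exercised on a small table. The sketch assumes SQLite via Python's sqlite3 module, and the instructor rows are illustrative; the salaries 60000 and 95000 are chosen to show that BETWEEN includes both endpoints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (name varchar(20), dept_name varchar(20), "
    "salary numeric(8,2))"
)
conn.executemany("insert into instructor values (?, ?, ?)", [
    ("Srinivasan", "Comp. Sci.", 65000),
    ("Mozart", "Music", 40000),
    ("Einstein", "Physics", 95000),
    ("El Said", "History", 60000),
])

# IN with an explicit list of values.
in_rows = [r[0] for r in conn.execute(
    "select name from instructor "
    "where dept_name in ('Music', 'Physics') order by name")]

# BETWEEN with both endpoints present in the data.
between_rows = [r[0] for r in conn.execute(
    "select name from instructor "
    "where salary between 60000 and 95000 order by name")]

print(in_rows, between_rows)
conn.close()
```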
The SQL ORDER BY clause
used to sort the data in ascending or descending order, based on one or more
columns
Syntax:
SELECT column-list FROM table_name [WHERE condition] [ORDER BY
column1 [ASC | DESC], column2 [ASC | DESC], .. columnN [ASC | DESC]];
SQL> SELECT * FROM CUSTOMERS ORDER BY NAME;
SQL> SELECT * FROM CUSTOMERS ORDER BY NAME DESC;
The SQL GROUP BY clause
used in collaboration with the SELECT statement to arrange identical data
into groups
GROUP BY clause follows the WHERE clause in a SELECT statement
and precedes the ORDER BY clause
SELECT column1, column2 FROM table_name WHERE [ conditions ]
GROUP BY column1, column2 ORDER BY column1, column2
If you want to know the total salary for each customer name, then the
GROUP BY query would be as follows:
SQL> SELECT NAME, SUM(SALARY) FROM CUSTOMERS GROUP BY NAME;
NAME SUM(SALARY)
Hardik 8500.00
kaushik 8500.00
Komal 4500.00
Muffy 10000.00
Ramesh 3500.00
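A smaller version of this GROUP BY query can be run as a sketch. It assumes SQLite via Python's sqlite3 module, and the four CUSTOMERS rows are hypothetical; they are not the rows behind the result table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table CUSTOMERS (ID int, NAME varchar(20), SALARY decimal(18,2))"
)
# Hypothetical rows: the name Ramesh appears twice, so its salaries are summed.
conn.executemany("insert into CUSTOMERS values (?, ?, ?)", [
    (1, "Ramesh", 2000),
    (2, "Ramesh", 1500),
    (3, "Komal", 4500),
    (4, "Hardik", 8500),
])

# Rows with the same NAME form one group; SUM is computed per group.
totals = conn.execute(
    "SELECT NAME, SUM(SALARY) FROM CUSTOMERS GROUP BY NAME ORDER BY NAME"
).fetchall()

print(totals)
conn.close()
```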
SQL - Using Joins
used to combine records from two or more tables in a database
A JOIN is a means for combining fields from two tables by using values
common to each
Consider the following tables: CUSTOMERS & ORDERS
table1.common_field = table2.common_field;
SQL> SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS LEFT JOIN
ON table1.common_field = table2.common_field;
SQL> SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS FULL
an EQUIJOIN.
Syntax: SELECT table1.column, table2.column FROM table1 JOIN
table2 USING (join_column1, join_column2…);
table1, table2 are the name of the tables participating in joining.
The natural join syntax contains the NATURAL keyword, the JOIN…USING
syntax does not.
An error occurs if the NATURAL and USING keywords occur in the same
join clause.
The JOIN…USING clause allows one or more equijoin columns to be specified in
parentheses after the USING keyword.
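A short sketch of JOIN…USING, next to a left outer join on the same column, is given below. It assumes SQLite via Python's sqlite3 module; the instructor and teaches tables and their rows are illustrative and are not the CUSTOMERS and ORDERS tables mentioned earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID varchar(5), name varchar(20));
create table teaches (ID varchar(5), course_id varchar(8));
insert into instructor values ('10101', 'Srinivasan'), ('33456', 'Gold');
insert into teaches values ('10101', 'CS-101');
""")

# Equijoin on the column named after USING; unmatched instructors are dropped.
using_rows = conn.execute(
    "select name, course_id from instructor join teaches using (ID)"
).fetchall()

# Left outer join keeps every instructor; missing course_id values become NULL.
left_rows = conn.execute(
    "select name, course_id from instructor left join teaches using (ID) "
    "order by name"
).fetchall()

print(using_rows, left_rows)
conn.close()
```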
SQL - UNIONS CLAUSE
used to combine the results of two or more SELECT statements without returning any
duplicate rows
To use this UNION clause, each SELECT statement must have
The same number of columns selected
The same number of column expressions
The same data type and
Have them in the same order
Syntax
SELECT column1 [, column2 ] FROM table1 [, table2 ] [WHERE condition] UNION
SELECT column1 [, column2 ] FROM table1 [, table2 ] [WHERE condition]
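The duplicate elimination performed by UNION can be checked against UNION ALL, which keeps duplicates. The sketch assumes SQLite via Python's sqlite3 module, with a hypothetical section table in which CS-101 is offered in both semesters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table section (course_id varchar(8), semester varchar(6), year numeric(4,0));
insert into section values ('CS-101', 'Fall', 2009), ('CS-347', 'Fall', 2009),
                           ('CS-101', 'Spring', 2010), ('FIN-201', 'Spring', 2010);
""")

fall = "select course_id from section where semester = 'Fall' and year = 2009"
spring = "select course_id from section where semester = 'Spring' and year = 2010"

# UNION removes the duplicate CS-101; UNION ALL keeps both copies.
union_rows = [r[0] for r in conn.execute(
    fall + " union " + spring + " order by course_id")]
union_all_rows = [r[0] for r in conn.execute(
    fall + " union all " + spring + " order by course_id")]

print(union_rows, union_all_rows)
conn.close()
```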
Find all the courses taught in both the Fall 2009 and Spring
2010 semesters?
Using intersect
(select course_id from section where semester = 'Fall' and year = 2009)
intersect
(select course_id from section where semester = 'Spring' and year = 2010);
Using the in connective
Write the subquery
select distinct course_id from section where semester = 'Fall' and year =
2009 and course_id in (select course_id from section
where semester = 'Spring' and year = 2010);
Find all the courses taught in the Fall 2009 semester but not in the
Spring 2010 semester? Exercise, Hint: use ‘not in’
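The intersect and in formulations of the "both semesters" query can be compared on a small data set. The sketch assumes SQLite via Python's sqlite3 module and a hypothetical section table; the "not in" exercise is left open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table section (course_id varchar(8), semester varchar(6), year numeric(4,0));
insert into section values ('CS-101', 'Fall', 2009), ('CS-347', 'Fall', 2009),
                           ('CS-101', 'Spring', 2010), ('FIN-201', 'Spring', 2010);
""")

# Formulation 1: set intersection of the two semester queries.
with_intersect = [r[0] for r in conn.execute("""
    select course_id from section where semester = 'Fall' and year = 2009
    intersect
    select course_id from section where semester = 'Spring' and year = 2010
""")]

# Formulation 2: membership test against a nested subquery.
with_in = [r[0] for r in conn.execute("""
    select distinct course_id from section
    where semester = 'Fall' and year = 2009
      and course_id in (select course_id from section
                        where semester = 'Spring' and year = 2010)
""")]

print(with_intersect, with_in)
conn.close()
```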
Set Comparison
select name from instructor where salary > some (select salary from
instructor where dept_name = 'Biology');
SQL also allows < some, <= some, >= some, = some, and <> some
comparisons.
Find the names of all instructors that have a salary value greater
than that of each instructor in the Biology department?
The construct > all corresponds to the phrase “greater than all.”
As it does for some, SQL also allows < all, <= all, >= all, = all, and <>
all comparisons
select name from instructor where salary > all (select salary from
instructor where dept_name = 'Biology');
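The two comparisons can be tried on sample data with one caveat: SQLite, used here through Python's sqlite3 module, does not implement the some and all keywords. The sketch therefore uses an equivalent rewrite that holds when the subquery is non-empty: "> some" is the same as "greater than the minimum", and "> all" is the same as "greater than the maximum". The rows, including the second Biology instructor, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (name varchar(20), dept_name varchar(20), "
    "salary numeric(8,2))"
)
conn.executemany("insert into instructor values (?, ?, ?)", [
    ("Crick", "Biology", 72000),
    ("Darwin", "Biology", 60000),   # hypothetical second Biology instructor
    ("Einstein", "Physics", 95000),
    ("Mozart", "Music", 40000),
    ("Wu", "Finance", 65000),
])

# salary > some (Biology salaries)  is rewritten as  salary > min(Biology salaries)
greater_than_some = [r[0] for r in conn.execute("""
    select name from instructor
    where salary > (select min(salary) from instructor
                    where dept_name = 'Biology')
    order by name
""")]

# salary > all (Biology salaries)  is rewritten as  salary > max(Biology salaries)
greater_than_all = [r[0] for r in conn.execute("""
    select name from instructor
    where salary > (select max(salary) from instructor
                    where dept_name = 'Biology')
    order by name
""")]

print(greater_than_some, greater_than_all)
conn.close()
```

On a system that supports the keywords, the queries from the slides can be used unchanged.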
E.g. Student_ID attribute for a specific student entity refers to only one
student ID.
Multi-valued attribute: can have multiple values for a single entity.
E.g. Phone_number attribute
A relationship is an association among several entities;
When an Entity is related to another Entity, they are said to have a
relationship
A relationship set is a set of relationships of the same type.
It is a mathematical relation on n ≥ 2 (possibly non-distinct) entity sets. If E1, E2,
. . . , En are entity sets, then a relationship set R is a subset of
{(e1, e2, . . . , en) | e1 ∈ E1, e2 ∈ E2, . . . , en ∈ En}
where (e1, e2, . . . , en) is a relationship
The association between entity sets is referred to as participation; i.e., the
entity sets E1, E2, . . . , En participate in relationship set R
A relationship instance in an E-R schema represents an association between
the named entities in the real-world enterprise that is being modeled
A relationship may also have attributes called descriptive attributes
Consider a relationship set advisor with entity sets instructor and student.
We could associate the attribute date with that relationship to specify the date
when an instructor became the advisor of a student
A relationship instance in a given relationship set must be uniquely
identifiable from its participating entities, without using the descriptive
attributes.
We cannot represent multiple dates by multiple relationship instances between
the same instructor and a student, since the relationship instances would not be
uniquely identifiable using only the participating entities
It is possible to have more than one relationship set involving the same entity
sets (Exercise?)
Depending upon the number of entities involved, a degree is assigned to
relationships
For example, if 2 entities are involved, it is said to be Binary relationship, if
3 entities are involved, it is said to be Ternary relationship, and so on.
Constraints
Mapping Cardinalities
Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity
can be associated via a relationship set
Most useful in describing binary relationship sets
For a binary relationship set R between entity sets A and B, the mapping
cardinality must be one of the following: one-to-one, one-to-many, many-to-one, or many-to-many
Participation Constraints
The participation of an entity set E in a relationship set R is said to be total if every
entity in E participates in at least one relationship in R
Partial – if only some entities in E participate in relationships in R
Keys
The primary key of an entity set allows us to distinguish among the various entities of
the set
Keys also help to identify relationships uniquely, and thus distinguish relationships
from each other
Similar mechanism needed to distinguish among the various relationships of a
relationship set.
Let R be a relationship set involving entity sets E1, E2, . . . , En.
Let primary key(Ei ) denote the set of attributes that forms the primary key for entity
set Ei
The composition of the primary key for a relationship set depends on the
set of attributes associated with the relationship set R.
If the relationship set R has no attributes associated with it, then the set of
attributes
primary-key (E1) ∪ primary-key (E2) ∪ · · · ∪ primary-key (En)
describes an individual relationship in set R.
If the relationship set R has attributes a1, a2, . . . , am associated with it,
then the set of attributes
primary-key (E1) ∪ primary-key (E2) ∪ · · · ∪ primary-key (En) ∪ {a1, a2, . . . , am}
describes an individual relationship in set R.
The structure of the primary key for the relationship set depends on the
mapping cardinality of the relationship set
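After choosing the entity sets and their corresponding attributes, relationship sets
among the various entities are formed. These relationship sets may result in a situation where attributes in the various
entity sets are redundant and need to be removed from the original entity sets.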
Consider the entity sets instructor and department:
Model the fact that each instructor has an associated department using a relationship
set inst_dept relating instructor and department
The attribute dept_name is the primary key for the entity set department, so it is redundant in the entity
set instructor and needs to be removed.
Removing the attribute “dept_name” is rather unintuitive
When we create a relational schema from the E-R diagram, the attribute
dept_name gets added to the relation instructor, but only if each
instructor has at most one associated department.
If an instructor has more than one associated department, the relationship
between instructors and departments is recorded in a separate relation
inst_dept
Treating the connections between “instructor” and “department”
uniformly as a relationship makes the logical relationship explicit
Helps avoid an early assumption that each instructor is associated with
only one department
Similarly the student entity set is related to the department entity set
through the relationship set student_dept and thus there is no need for a
dept_name attribute in student.
A good entity-relationship design does not contain redundant attributes
Basic Structure
An E-R diagram consists of the following major components:
Rectangles divided into two parts
Diamonds represent relationship sets
Double diamonds represent identifying relationship sets linked to weak entity sets
Mapping Cardinality
One-to-one:
Draw a directed line from the relationship set
advisor to both entity sets instructor and student
One-to-many:
Draw a directed line from the relationship set
advisor to the entity set instructor and an
undirected line to the entity set student
Many-to-one:
The line between advisor and student has a cardinality constraint of 1..1,
meaning the minimum and the maximum cardinality are both 1
What does the cardinality limit 1..* specify? (Exercise)
Complex Attributes
Roles
Roles in E-R diagrams are indicated by
labeling the lines that connect diamonds to
rectangles.
Nonbinary Relationship Sets
Weak Entity Sets
Suppose we create a relationship set sec_course between entity sets section and
course.
An entity set that does not have sufficient attributes to form a primary key is
termed a weak entity set
An entity set that has a primary key is termed a strong entity set.
For a weak entity set to be meaningful, it must be associated with another entity
set, called the identifying or owner entity set.
The discriminator of a weak entity set is underlined with a dashed line
The identifying entity set is said to own the weak entity set that it identifies
The primary key of the entity set serves as the primary key of the resulting schema
E.g., entity set ‘student’ from the E-R diagram with 3 attributes: ID, name, tot_cred =>
student (ID, name, tot_cred)
Composite attributes are handled by creating a separate attribute for each of the
component attributes
We do not create a separate attribute for the composite attribute itself.
‘name’ attribute in ‘instructor’ relation => the schema generated for instructor
instructor (ID, first_name, middle_name, last_name, street_number, street_name,
apt_number, city, state, zip_code, date_of_birth)
Multivalued attributes are treated differently from other attributes
Attributes in an E-R diagram generally map directly into attributes for the appropriate
relation schemas
Multivalued attributes, however, are an exception => new relation schemas are created
for these attributes
For a multivalued attribute M, we create a relation schema R with an attribute A that
corresponds to M and attributes corresponding to the primary key of the entity set or
relationship set of which M is an attribute
In addition, we create a foreign-key constraint on the relation schema created from the
multivalued attribute, with the attribute generated from the primary key of the entity set
referencing the relation generated from the entity set.
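The rule for a multivalued attribute can be sketched for a phone_number attribute of instructor. This is an illustration only, assuming SQLite via Python's sqlite3 module; the relation name instructor_phone, the column sizes, and the phone numbers are assumptions. SQLite enforces foreign keys only after "pragma foreign_keys = on".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only if asked
conn.executescript("""
create table instructor (ID varchar(5) primary key, name varchar(20) not null);
-- One row per (instructor, phone number) pair; all attributes form the primary key.
create table instructor_phone (
    ID varchar(5),
    phone_number varchar(15),
    primary key (ID, phone_number),
    foreign key (ID) references instructor
);
""")
conn.execute("insert into instructor values ('22222', 'Einstein')")
conn.executemany(
    "insert into instructor_phone values (?, ?)",
    [("22222", "555-0101"), ("22222", "555-0102")],
)

phones = [r[0] for r in conn.execute(
    "select phone_number from instructor_phone "
    "where ID = '22222' order by phone_number")]

# A phone number for a non-existent instructor violates the foreign key.
try:
    conn.execute("insert into instructor_phone values ('99999', '555-0199')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(phones, orphan_rejected)
conn.close()
```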
Derived attributes are not explicitly represented in the relational data model
Representation of Weak Entity Sets
Let A be a weak entity set with attributes a1, a2, . . . , am. Let B be the strong entity set
on which A depends
Let the primary key of B consist of attributes b1, b2, . . . , bn
We represent the entity set A by a relation schema called A with one attribute for each
member of the set {a1, a2, . . . , am} ∪ {b1, b2, . . . , bn}
For schemas derived from a weak entity set, the combination of the primary key of the
strong entity set and the discriminator of the weak entity set serves as the primary key
of the schema
In addition to creating a primary key, we also create a foreign-key constraint on the
relation A, specifying that the attributes b1, b2, . . . , bn reference the primary key of
the relation B.
Considering the weak entity set ‘section’
The primary key of the course entity set, on which section depends, is
course_id
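Thus, we represent section by a schema with the following attributes:
section (course_id, sec_id, semester, year)
We also create a foreign-key constraint on the section schema, with the attribute course_id
referencing the primary key of the course schema, and the integrity
constraint “on delete cascade”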
When a referential-integrity constraint is violated, the normal procedure is to reject the action
that caused the violation
However, a foreign key clause can specify that if a delete or update action on the referenced
relation violates the constraint, then, instead of rejecting the action, the system must take
steps to change the tuple in the referencing relation to restore the constraint
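The effect of "on delete cascade" on the weak entity set section can be observed directly. The sketch assumes SQLite via Python's sqlite3 module (with foreign-key enforcement switched on), and the course and section rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # required for SQLite to enforce the constraint
conn.executescript("""
create table course (course_id varchar(8) primary key, title varchar(50));
create table section (
    course_id varchar(8),
    sec_id varchar(8),
    semester varchar(6),
    year numeric(4,0),
    primary key (course_id, sec_id, semester, year),
    foreign key (course_id) references course on delete cascade
);
insert into course values ('CS-101', 'Intro. to Computer Science'),
                          ('CS-347', 'Database System Concepts');
insert into section values ('CS-101', '1', 'Fall', 2009),
                           ('CS-101', '1', 'Spring', 2010),
                           ('CS-347', '1', 'Fall', 2009);
""")

# Deleting the course is not rejected; its sections are deleted along with it.
conn.execute("delete from course where course_id = 'CS-101'")
remaining = conn.execute(
    "select course_id, sec_id, semester, year from section"
).fetchall()

print(remaining)
conn.close()
```

Without the cascade clause, the same delete would be rejected because sections would be left referring to a course that no longer exists.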
For a binary many-to-one or one-to-many relationship set, the primary key of the entity
set on the “many” side of the relationship set serves as the primary key.
For an n-ary relationship set without any arrows on its edges, the union of the primary
key-attributes from the participating entity sets becomes the primary key
For an n-ary relationship set with an arrow on one of its edges, the primary keys of the
entity sets not on the “arrow” side of the relationship set serve as the primary key for
the schema
How to create foreign-key constraints on the relation schema R?
For each entity set Ei related to relationship set R, we create a foreign-key constraint
from relation schema R, with the attributes of R that were derived from primary-key
attributes of Ei referencing the primary key of the relation schema representing Ei .
Consider the relationship set advisor in the E-R diagram
Example:
Since the relationship set has no attributes, the advisor schema has two attributes, the
primary keys of instructor and student.
Since both attributes have the same name, we rename them i_ID and s_ID
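Since the advisor relationship set is many-to-one from student to instructor the primary
key for the advisor relation schema is s_ID
We also create two foreign-key constraints on the advisor relation, with attribute i_ID referencing the
primary key of instructor and attribute s_ID referencing the primary key of student.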
The schemas derived from a relationship set are depicted as follows:
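The advisor schema described above can be written out as a table definition. This is a sketch under assumptions: SQLite via Python's sqlite3 module, illustrative column sizes, and made-up instructor and student rows. Making s_ID the primary key is what enforces "at most one advisor per student".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")
conn.executescript("""
create table instructor (ID varchar(5) primary key, name varchar(20));
create table student (ID varchar(5) primary key, name varchar(20));
-- Many-to-one from student to instructor: s_ID alone is the primary key.
create table advisor (
    s_ID varchar(5) primary key,
    i_ID varchar(5),
    foreign key (i_ID) references instructor (ID),
    foreign key (s_ID) references student (ID)
);
insert into instructor values ('10101', 'Srinivasan'), ('45565', 'Katz');
insert into student values ('00128', 'Zhang'), ('12345', 'Shankar');
insert into advisor values ('00128', '45565'), ('12345', '45565');
""")

# A second advisor for the same student violates the primary key on s_ID.
try:
    conn.execute("insert into advisor values ('00128', '10101')")
    second_advisor_rejected = False
except sqlite3.IntegrityError:
    second_advisor_rejected = True

# One instructor may still advise many students.
advisees = [r[0] for r in conn.execute(
    "select s_ID from advisor where i_ID = '45565' order by s_ID")]

print(second_advisor_rejected, advisees)
conn.close()
```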
Primary key for weak entity set section is {course_id, sec_id, semester, year}
Includes primary key of strong entity set
Since sec_course has no descriptive attributes, the sec_course schema has
attributes course_id, sec_id, semester, and year.
The schema for the entity set section includes the attributes course_id,
sec_id, semester, and year (among others).
Every (course_id, sec_id, semester, year) combination in a
sec_course relation would also be present in the relation on
schema section, and vice versa.
Combination of Schemas
Consider a many-to-one relationship set AB from entity set A to entity set B.
instructor (ID, name, salary) => A
department (dept_name, building, budget) => B
inst_dept(ID, dept_name)
The inst_dept and instructor schemas can be combined; the resulting schema
instructor (ID, name, dept_name, salary) has these attributes.
Similarly, we can combine the schemas for the relationship sets
stud_dept
course_dept
sec_class
sec_time_slot
In the case of one-to-one relationships, the relation schema for the relationship set can be
combined with the schemas for either of the entity sets