
database

Done By :
Aisha AL-Hadam
Supervisor :
Dr. Adnan
A database-management system (DBMS)
is a collection of interrelated data and a set of programs to
access those data. The collection of data, usually referred to as
the database, contains information relevant to an enterprise.
The primary goal of a DBMS is to provide a way to store and
retrieve database information that is both convenient and
efficient.
1.2 Purpose of Database Systems

• A major purpose of a database system is to provide users with an abstract view of the data. That is, the
system hides certain details of how the data are stored and maintained.
1.3 View of Data

• Data Models
• Relational Model.
• Entity-Relationship Model
• Semi-structured Data Model.
• Object-Based Data Model.

• Relational Data Model


• Data Abstraction
• Physical level.
• Logical level.
• View level.
• Instances and Schemas
Database Languages
• Data-Definition Language (DDL)
• We specify the storage structure and access methods used by the database system by a set of statements in a special type of DDL called a data storage and definition language.
• The SQL Data-Definition Language (SQL DDL)
• SQL provides a rich DDL that allows one to define tables with data types and integrity constraints.
• Data-Manipulation Language
A data-manipulation language (DML) is a language that enables users to access or manipulate data as organized by the appropriate data model.
• The SQL Data-Manipulation Language
The SQL query language is nonprocedural. A query takes as input several tables (possibly only one) and always returns a single table.
• For instance, the following query finds the instructor ID and department name of all instructors associated with a department with a budget of more than $95,000.
select instructor.ID, department.dept_name
from instructor, department
where instructor.dept_name = department.dept_name and
department.budget > 95000;
• Database Access from Application Programs
• Non-procedural query languages such as SQL are not as powerful as a universal Turing machine; that is, there are some computations that are possible using a general-purpose programming language but are not possible using SQL.
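As a minimal sketch of database access from an application program, the budget query above can be issued from Python. This assumes the standard-library sqlite3 module as a stand-in for a full database server; the sample rows and the helper name instructors_in_rich_departments are made up for illustration.

```python
import sqlite3

# In-memory database holding a tiny, made-up sample of the university schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dept_name text primary key, building text, budget numeric);
    create table instructor (ID text primary key, name text, dept_name text, salary numeric);
    insert into department values ('Physics', 'Watson', 70000), ('Finance', 'Painter', 120000);
    insert into instructor values ('22222', 'Einstein', 'Physics', 95000),
                                  ('12121', 'Wu', 'Finance', 90000);
""")

def instructors_in_rich_departments(connection, min_budget):
    """Return (ID, dept_name) pairs for departments whose budget exceeds min_budget."""
    query = """
        select instructor.ID, department.dept_name
        from instructor, department
        where instructor.dept_name = department.dept_name
          and department.budget > ?
    """
    # The ? placeholder lets the host program supply the threshold at run time.
    return connection.execute(query, (min_budget,)).fetchall()

rows = instructors_in_rich_departments(conn, 95000)
print(rows)
```

The SQL text states what is wanted, while the surrounding Python handles the parts SQL cannot express, such as printing or further computation on the result.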
Database Design

• Database systems are designed to manage large bodies of information. These large bodies of information do
not exist in isolation. They are part of the operation of some enterprise whose end product may be
information from the database or may be some device or service for which the database plays only a
supporting role. Database design mainly involves the design of the database schema. The design of a
complete database application environment that meets the needs of the enterprise being modeled requires
attention to a broader set of issues.
• A database system is partitioned into modules that deal with each of the responsibilities of the overall system.
The functional components of a database system can be broadly divided into the storage manager, the query
processor components, and the transaction management component.
• The storage manager
• The storage manager is the component of a database system that provides the interface between the low-level
data stored in the database and the application programs and queries submitted to the system. The storage
manager is responsible for the interaction with the file manager.
• The query processor
• The query processor is important because it helps the database system to simplify and facilitate access to
data.
• The transaction manager
• The transaction manager is important because it allows application developers to treat a sequence of database
accesses as if they were a single unit that either happens in its entirety or not at all.
Database and Application Architecture

• The figure summarizes how different types of users interact with a database, and how the different
components of a database engine are connected to each other.
• The centralized architecture shown in Figure 1.3 is applicable to shared-memory server architectures, which
have multiple CPUs and exploit parallel processing, but all the CPUs access a common shared memory.
Database Users and Administrators

• A primary goal of a database system is to retrieve information from and store new information in the
database. People who work with a database can be categorized as database users or database administrators.
• Database Users and User Interfaces
• Database Administrator
RELATIONAL LANGUAGES

• The relational model uses a collection of tables to represent both data and the relationships among those data.
• Structure of Relational Databases
• A relational database consists of a collection of tables, each of which is assigned a unique name.
• For example, consider the instructor table, which stores information about instructors. The table has four column headers: ID,
name, dept_name, and salary. Each row of this table records information about an instructor, consisting of the
instructor's ID, name, dept_name, and salary. Similarly, the course table stores information about courses,
consisting of a course_id, title, dept_name, and credits, for each course.
• Note that each instructor is identified by the value of the column ID, while each course is identified by the
value of the column course_id.

• Database Schema
• The database schema is the logical design of the database.
Keys

• A superkey is a set of one or more attributes that, taken collectively, allow us to identify uniquely a tuple in the relation.
• For example, the ID attribute of the relation instructor is sufficient to distinguish one instructor tuple from another. Thus, ID is a superkey. The
name attribute of instructor, on the other hand, is not a superkey, because several instructors might have the same name.
• A candidate key is a minimal superkey, that is, a set of attributes that forms a superkey, but none of whose proper subsets is a superkey. One of the candidate
keys of a relation is chosen as its primary key.
• Primary key: the term used to denote a candidate key that is chosen by the database designer as the principal means of identifying tuples within a relation.
• A foreign-key constraint from attribute(s) A of relation r1 to the primary key B of relation r2 states that the value of A for each tuple in r1 must
also be the value of B for some tuple in r2. The relation r1 is called the referencing relation, and r2 is called the referenced
relation.
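The effect of primary-key and foreign-key constraints can be sketched with a small experiment. This is an illustration only, assuming Python's sqlite3 module (SQLite checks foreign keys only after pragma foreign_keys = on); the rows and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    create table department (
        dept_name text primary key,
        building  text
    );
    create table instructor (
        ID        text primary key,
        name      text,
        dept_name text references department (dept_name)
    );
    insert into department values ('Physics', 'Watson');
    insert into instructor values ('22222', 'Einstein', 'Physics');
""")

def try_insert(row):
    """Attempt an insert into instructor and report whether the constraints allowed it."""
    try:
        conn.execute("insert into instructor values (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Same ID as an existing tuple: violates the primary key.
duplicate_id = try_insert(("22222", "Gold", "Physics"))
# dept_name has no matching tuple in the referenced relation department.
dangling_ref = try_insert(("33333", "Gold", "Biology"))
# A repeated name is accepted, because name is not a superkey.
same_name = try_insert(("33333", "Einstein", "Physics"))
```

Here instructor is the referencing relation and department is the referenced relation.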
Schema Diagrams

• A database schema, along with primary key and foreign-key constraints, can be depicted by schema diagrams.
Figure 2.9 shows the schema diagram for our university organization. Each relation appears as a box, with the
relation name at the top in blue and the attributes listed inside the box.
Relational Query Languages

• A query language is a language in which a user requests information from the database.
• These languages are usually on a level higher than that of a standard programming language.
• Query languages can be categorized as imperative, functional, or declarative. In an imperative query language,
the user instructs the system to perform a specific sequence of operations on the database to compute the
desired result; such languages usually have a notion of state variables, which are updated in the course of the
computation.
The Relational Algebra

• The relational algebra consists of a set of operations that take one or two relations as input and produce a new relation as their result.
• The Select Operation
• The Project Operation
• Composition of Relational Operations
• The Cartesian-Product Operation
• The Join Operation
• Set Operations
• The Assignment Operation
• The Rename Operation
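To make the idea of operations that take relations as input and return a relation concrete, here is a small sketch of select, project, Cartesian product, and join in plain Python. It models a relation as a list of dictionaries, which is an assumption made for illustration and not how a database engine stores data; the sample tuples and function names are made up.

```python
# Relations are modeled as lists of dicts (attribute name -> value).
instructor = [
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"ID": "45565", "name": "Katz", "dept_name": "Comp. Sci.", "salary": 75000},
]
department = [
    {"dept_name": "Physics", "building": "Watson"},
    {"dept_name": "Finance", "building": "Painter"},
]

def select(relation, predicate):
    """Select: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Project: keep only the listed attributes and eliminate duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result

def cartesian_product(r, s, r_name, s_name):
    """Cartesian product: pair every tuple of r with every tuple of s.
    Attributes are prefixed with the relation name to keep them distinct."""
    return [
        {**{f"{r_name}.{k}": v for k, v in t.items()},
         **{f"{s_name}.{k}": v for k, v in u.items()}}
        for t in r for u in s
    ]

def join(r, s, r_name, s_name, predicate):
    """Join: a Cartesian product followed by a selection."""
    return select(cartesian_product(r, s, r_name, s_name), predicate)

# Composition of operations: the output of one is the input of the next.
high_paid = project(select(instructor, lambda t: t["salary"] > 80000), ["name"])
located = project(
    join(instructor, department, "instructor", "department",
         lambda t: t["instructor.dept_name"] == t["department.dept_name"]),
    ["instructor.name", "department.building"],
)
```

Because every function returns a relation in the same representation, the calls compose freely, which is the property the relational algebra relies on.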
Introduction to SQL

• SQL Data Definition


• The set of relations in a database is specified using a data-definition language (DDL).
• The SQL DDL allows specification of not only a set of relations, but also information about each relation,
including:
• The schema for each relation.
• The types of values associated with each attribute.
• The integrity constraints.
• The set of indices to be maintained for each relation.
• The security and authorization information for each relation.
• The physical storage structure of each relation on disk.
Basic Types

• The SQL standard supports a variety of built-in types, including:


• char(n)
• varchar(n)
• int
• smallint
• real, double precision.
• float(n)
Basic Structure of SQL Queries

• The basic structure of an SQL query consists of three clauses: select, from, and where.
• Queries on a Single Relation
• Ex:
select name
from instructor
where dept_name = 'Comp. Sci.' and salary > 70000;
• Queries on Multiple Relations
• Ex:
select name, instructor.dept_name, building
from instructor, department
where instructor.dept_name = department.dept_name;
Additional Basic Operations

• The Rename Operation
• Ex:
select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID;
• String Operations
• Ex:
select dept_name
from department
where building like '%Watson%';
• Attribute Specification in the Select Clause
• Ex:
select instructor.*
from instructor, teaches
where instructor.ID = teaches.ID;
• Ordering the Display of Tuples
• Ex:
select name
from instructor
where dept_name = 'Physics'
order by name;
• Where-Clause Predicates
• Ex:
select name
from instructor
where salary between 90000 and 100000;
Set Operations
• The Union Operation
• Ex:
To find the set of all courses taught either in Fall 2017 or in Spring 2018, or both, we write the following query.
(select course_id
from section
where semester = 'Fall' and year = 2017)
union
(select course_id
from section
where semester = 'Spring' and year = 2018);
• The Intersect Operation
• Ex:
To find the set of all courses taught in both the Fall 2017 and Spring 2018, we write:
(select course_id
from section
where semester = 'Fall' and year = 2017)
intersect
(select course_id
from section
where semester = 'Spring' and year = 2018);
• The Except Operation
• Ex:
To find all courses taught in the Fall 2017 semester but not in the Spring 2018 semester, we write:
(select course_id
from section
where semester = 'Fall' and year = 2017)
except
(select course_id
from section
where semester = 'Spring' and year = 2018);
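The three set operations can be compared side by side on a small sample. The sketch below assumes Python's sqlite3 module and made-up section rows; SQLite does not accept parentheses around the operands of a set operation, so they are omitted here, and the helper courses is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table section (course_id text, sec_id text, semester text, year integer);
    insert into section values
        ('CS-101',  '1', 'Fall',   2017),
        ('CS-347',  '1', 'Fall',   2017),
        ('CS-101',  '1', 'Spring', 2018),
        ('FIN-201', '1', 'Spring', 2018);
""")

FALL_2017 = "select course_id from section where semester = 'Fall' and year = 2017"
SPRING_2018 = "select course_id from section where semester = 'Spring' and year = 2018"

def courses(set_operator):
    """Combine the two queries with union, intersect, or except; return sorted course IDs."""
    sql = f"{FALL_2017} {set_operator} {SPRING_2018}"
    return sorted(row[0] for row in conn.execute(sql))

either = courses("union")        # taught in Fall 2017, Spring 2018, or both
both = courses("intersect")      # taught in both semesters
fall_only = courses("except")    # taught in Fall 2017 but not in Spring 2018
```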
Null Values

• Null values present special problems in relational operations, including arithmetic operations, comparison
operations, and set operations. The result of an arithmetic expression (involving, for example, +, −, ∗, or ∕) is
null if any of the input values is null.
• For example, if a query has an expression r.A+ 5, and r.A is null for a particular tuple, then the expression
result must also be null for that tuple.
• SQL therefore treats as unknown the result of any comparison involving a null value (other than predicates is
null and is not null, which are described later in this section). This creates a third logical value in addition to
true and false.
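A short experiment illustrates both rules: arithmetic on null yields null, and a comparison with null is unknown. This is a sketch assuming Python's sqlite3 module, where SQL null appears as None; the relation r with attribute A follows the example above, and its two rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table r (A integer);
    insert into r values (10), (null);
""")

# Arithmetic on null yields null (None in Python).
sums = [row[0] for row in conn.execute("select A + 5 from r order by rowid")]

# A comparison with null is unknown, so the where clause rejects that tuple
# for the predicate and for its negation alike.
greater = conn.execute("select count(*) from r where A > 5").fetchone()[0]
not_greater = conn.execute("select count(*) from r where not (A > 5)").fetchone()[0]

# Only the "is null" predicate finds the tuple.
nulls = conn.execute("select count(*) from r where A is null").fetchone()[0]
```

The tuple with a null A is counted by neither of the two complementary predicates, which is the practical consequence of the third logical value.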
Aggregate Functions

• Aggregate functions are functions that take a collection (a set or multiset) of values as input and return a single value. SQL offers five standard built-in
aggregate functions:
• Average: avg
• Minimum: min
• Maximum: max
• Total: sum
• Count: count

• Basic Aggregation
• Ex: Consider the query "Find the average salary of instructors in the Computer Science department." We write this query as follows:
select avg (salary)
from instructor
where dept_name = 'Comp. Sci.';
• Aggregation with Grouping
• Ex: As an illustration, consider the query "Find the average salary in each department." We write this query as follows:
select dept_name, avg (salary) as avg_salary
from instructor
group by dept_name;
• The Having Clause
• Ex: Predicates in the having clause are applied after groups have been formed, so aggregate functions may be used in the having clause. To keep only departments whose average salary exceeds 42000, we express the query in SQL as follows:
select dept_name, avg (salary) as avg_salary
from instructor
group by dept_name
having avg (salary) > 42000;
• Aggregation with Null and Boolean Values
• Ex: Assume that some tuples in the instructor relation have a null value for salary. Consider the following query to total all salary amounts:
select sum (salary)
from instructor;
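The four aggregation examples can be run together on a small sample to see how grouping, having, and null values interact. This sketch assumes Python's sqlite3 module; the instructor rows, including the one with a null salary, are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID text primary key, name text, dept_name text, salary numeric);
    insert into instructor values
        ('10101', 'Srinivasan', 'Comp. Sci.', 65000),
        ('45565', 'Katz',       'Comp. Sci.', 75000),
        ('15151', 'Mozart',     'Music',      40000),
        ('99999', 'Newhire',    'Music',      null);
""")

# Basic aggregation over a single department.
cs_avg = conn.execute(
    "select avg(salary) from instructor where dept_name = 'Comp. Sci.'"
).fetchone()[0]

# Aggregation with grouping: one result tuple per department.
per_dept = conn.execute(
    "select dept_name, avg(salary) as avg_salary from instructor "
    "group by dept_name order by dept_name"
).fetchall()

# The having predicate is applied to each group after the groups are formed.
above_42000 = conn.execute(
    "select dept_name, avg(salary) as avg_salary from instructor "
    "group by dept_name having avg(salary) > 42000"
).fetchall()

# sum skips null values instead of turning the whole total into null.
total = conn.execute("select sum(salary) from instructor").fetchone()[0]
```

Note that the null salary is ignored by avg and sum, so the Music average is computed from the one known salary.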
Nested Subqueries

• SQL provides a mechanism for nesting subqueries. A subquery is a select-from-where expression that is nested within another query. A
common use of subqueries is to perform tests for set membership, make set comparisons, and determine set cardinality by nesting
subqueries in the where clause.

• Set Membership
• Ex: Earlier, we tested membership in a one-attribute relation. It is also possible to test for membership in an arbitrary relation in SQL. For example, we can write the query "find the total number of (distinct) students who have taken course sections taught by the instructor with ID 10101" as follows:
select count (distinct ID)
from takes
where (course_id, sec_id, semester, year) in (select course_id, sec_id, semester, year
from teaches
where teaches.ID = '10101');
• Set Comparison
• Ex: As an example of set comparisons, consider the query "Find the departments that have the highest average salary." We begin by writing a query to find all average salaries, and then nest it as a subquery of a larger query that finds those departments for which the average salary is greater than or equal to all average salaries:
select dept_name
from instructor
group by dept_name
having avg (salary) >= all (select avg (salary)
from instructor
group by dept_name);
• Test for Empty Relations
• Ex: Recall the SQL query to "find the total number of (distinct) students who have taken course sections taught by the instructor with ID 10101". That query used a tuple constructor syntax that is not supported by some databases. An alternative way to write the query, using the exists construct, is as follows:
select count (distinct ID)
from takes
where exists (select course_id, sec_id, semester, year
from teaches
where teaches.ID = '10101'
and takes.course_id = teaches.course_id
and takes.sec_id = teaches.sec_id
and takes.semester = teaches.semester
and takes.year = teaches.year);
• Test for the Absence of Duplicate Tuples
• Ex: We can test for the existence of duplicate tuples in a subquery by using the not unique construct. To illustrate this construct, consider the query "Find all courses that were offered at least twice in 2017", written as follows:
select T.course_id
from course as T
where not unique (select R.course_id
from section as R
where T.course_id = R.course_id and R.year = 2017);
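The exists form of the query can be tried on a small sample. This is a sketch assuming Python's sqlite3 module; the teaches and takes rows and the helper students_taught_by are made up, and the instructor ID is passed as a parameter instead of being written into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table teaches (ID text, course_id text, sec_id text, semester text, year integer);
    create table takes   (ID text, course_id text, sec_id text, semester text, year integer);
    insert into teaches values ('10101', 'CS-101',  '1', 'Fall', 2017),
                               ('10101', 'CS-347',  '1', 'Fall', 2017),
                               ('22222', 'PHY-101', '1', 'Fall', 2017);
    insert into takes values ('00128', 'CS-101',  '1', 'Fall', 2017),
                             ('00128', 'CS-347',  '1', 'Fall', 2017),
                             ('12345', 'CS-101',  '1', 'Fall', 2017),
                             ('44553', 'PHY-101', '1', 'Fall', 2017);
""")

def students_taught_by(instructor_id):
    """Count distinct students who took a section taught by the given instructor."""
    query = """
        select count(distinct ID)
        from takes
        where exists (select 1
                      from teaches
                      where teaches.ID = ?
                        and takes.course_id = teaches.course_id
                        and takes.sec_id = teaches.sec_id
                        and takes.semester = teaches.semester
                        and takes.year = teaches.year)
    """
    return conn.execute(query, (instructor_id,)).fetchone()[0]

count_10101 = students_taught_by("10101")
count_22222 = students_taught_by("22222")
```

Student 00128 took two sections taught by instructor 10101 but is counted once, because the outer query counts distinct IDs.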
Nested Subqueries (cont.)

• Subqueries in the From Clause
• Ex: We can give the subquery result relation a name, and rename the attributes, using the as clause, as illustrated below.
select dept_name, avg_salary
from (select dept_name, avg (salary)
from instructor
group by dept_name)
as dept_avg (dept_name, avg_salary)
where avg_salary > 42000;
• The With Clause
• Ex: The with clause provides a way of defining a temporary relation whose definition is available only to the query in which the with clause occurs. Consider the following query, which finds those departments with the maximum budget.
with max_budget (value) as
(select max(budget)
from department)
select budget
from department, max_budget
where department.budget = max_budget.value;
• Scalar Subqueries
• Ex: SQL allows subqueries to occur wherever an expression returning a value is permitted, provided the subquery returns only one tuple containing a single attribute; such subqueries are called scalar subqueries. For example, a subquery can be used in the select clause as illustrated in the following example that lists all departments along with the number of instructors in each department:
select dept_name,
(select count(*)
from instructor
where department.dept_name = instructor.dept_name)
as num_instructors
from department;
• Scalar Without a From Clause
• Ex: As an example, suppose we wish to find the average number of sections taught (regardless of year or semester) per instructor, with sections taught by multiple instructors counted once per instructor. We need to count the number of tuples in teaches to find the total number of sections taught and count the number of tuples in instructor to find the number of instructors. Then a simple division gives us the desired result. One might write this as:
(select count (*) from teaches) / (select count (*) from instructor);
Modification of the Database

• We have restricted our attention until now to the extraction of information from the database. Now, we show
how to add, remove, or change information with SQL.

• Deletion
delete from r
where P;
• Insertion
insert into course
values ('CS-437', 'Database Systems', 'Comp. Sci.', 4);
• Updates
update instructor
set salary = salary * 1.05;
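The three kinds of modification can be exercised in sequence on a small sample. This sketch assumes Python's sqlite3 module; the starting rows are made up, and the delete predicate (dept_name = 'Music') is a hypothetical choice for the predicate P in the general form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table course (course_id text primary key, title text, dept_name text, credits integer);
    create table instructor (ID text primary key, name text, dept_name text, salary numeric);
    insert into instructor values ('10101', 'Srinivasan', 'Comp. Sci.', 60000),
                                  ('15151', 'Mozart',     'Music',      40000);
""")

# Insertion: add one tuple to course.
conn.execute("insert into course values ('CS-437', 'Database Systems', 'Comp. Sci.', 4)")

# Update: no where clause, so every instructor receives the 5 percent raise.
conn.execute("update instructor set salary = salary * 1.05")

# Deletion: remove only the tuples that satisfy the predicate.
conn.execute("delete from instructor where dept_name = 'Music'")
conn.commit()

courses = conn.execute("select course_id, credits from course").fetchall()
remaining = conn.execute("select name, salary from instructor").fetchall()
```

A delete without a where clause would remove every tuple but keep the relation itself.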
Intermediate SQL

• Join Expressions

• The Natural Join
Consider the following SQL query, which computes for each student the set of courses a student has taken:
select name, course_id
from student, takes
where student.ID = takes.ID;
• Join Conditions
Consider the following query, which has a join expression containing the on condition:
select *
from student join takes on student.ID = takes.ID;
• Outer Joins
Suppose we wish to list all students along with the courses they have taken. The following SQL query may appear to retrieve the required information, but it omits any student who has taken no courses, which is the case an outer join handles:
select *
from student natural join takes;
• Join Types and Conditions
The default join type, when the join clause is used without the outer prefix, is the inner join. Thus,
select *
from student join takes using (ID);
is equivalent to:
select *
from student inner join takes using (ID);
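The difference between the inner forms and an outer join shows up as soon as one student has no takes tuples. This sketch assumes Python's sqlite3 module and made-up rows; the left outer join query is added here for contrast and is not one of the queries quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student (ID text primary key, name text);
    create table takes (ID text, course_id text);
    insert into student values ('00128', 'Zhang'), ('70557', 'Snow');
    insert into takes values ('00128', 'CS-101'), ('00128', 'CS-347');
""")

# Join with an explicit on condition (an inner join by default).
inner = conn.execute(
    "select name, course_id from student join takes on student.ID = takes.ID "
    "order by name, course_id"
).fetchall()

# Natural join: matches on the attributes the two relations share (here, ID).
natural = conn.execute(
    "select name, course_id from student natural join takes order by name, course_id"
).fetchall()

# Left outer join: keeps students without any takes tuple, padding with null.
left_outer = conn.execute(
    "select name, course_id from student left outer join takes on student.ID = takes.ID "
    "order by name, course_id"
).fetchall()
```

Snow appears only in the outer-join result, paired with a null course_id.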
Views

• View Definition
We define a view in SQL by using the create view command. To define a view, we must give the view a name and must state the query that computes the view. The form of the create view command is:
create view v as <query expression>;
• Using Views in SQL Queries
We can find all Physics courses offered in the Fall 2017 semester in the Watson building by writing:
select course_id
from physics_fall_2017
where building = 'Watson';
• Materialized Views
Certain database systems allow view relations to be stored, but they make sure that, if the actual relations used in the view definition change, the view is kept up-to-date. Such views are called materialized views.
• Update of a View
Suppose the view faculty, which we saw earlier, is made available to a clerk. Since we allow a view name to appear wherever a relation name is allowed, the clerk can write:
insert into faculty
values ('30765', 'Green', 'Music');
Transactions

• A transaction consists of a sequence of query and/or update statements. The SQL standard specifies that a
transaction begins implicitly when an SQL statement is executed. One of the following SQL statements must
end the transaction:
• Commit work commits the current transaction; that is, it makes the updates performed by the transaction
become permanent in the database. After the transaction is committed, a new transaction is automatically
started.
• Rollback work causes the current transaction to be rolled back; that is, it undoes all the updates performed by
the SQL statements in the transaction. Thus, the database state is restored to what it was before the first
statement of the transaction was executed.
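The all-or-nothing behavior of commit and rollback can be sketched as follows. This assumes Python's sqlite3 module in autocommit mode, where a transaction must be opened with an explicit begin (unlike the implicit start described above), and where SQLite spells the statements commit and rollback. The budget-transfer scenario, the rows, and the function names are hypothetical.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# begin / commit / rollback statements below are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table department (dept_name text primary key, budget integer)")
conn.execute("insert into department values ('Physics', 100), ('Music', 50)")

def budgets():
    """Return the current budgets as a dict keyed by department name."""
    return dict(conn.execute("select dept_name, budget from department"))

def transfer(source, target, amount, fail=False):
    """Move budget between departments as one unit: both updates happen or neither."""
    conn.execute("begin")
    try:
        conn.execute("update department set budget = budget - ? where dept_name = ?",
                     (amount, source))
        if fail:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("update department set budget = budget + ? where dept_name = ?",
                     (amount, target))
        conn.execute("commit")    # make both updates permanent
    except Exception:
        conn.execute("rollback")  # undo the partial update
        raise

transfer("Physics", "Music", 30)
after_commit = budgets()

try:
    transfer("Physics", "Music", 30, fail=True)
except RuntimeError:
    pass
after_rollback = budgets()
```

After the failed transfer the budgets are unchanged, because the rollback restored the state that held before the first statement of that transaction.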
Integrity Constraints

• Integrity constraints ensure that changes made to the database by authorized users do not result in a loss of
data consistency. Thus, integrity constraints guard against accidental damage to the database. This is in
contrast to security constraints, which guard against access to the database by unauthorized users.
• Examples of integrity constraints are:
• An instructor name cannot be null.
• No two instructors can have the same instructor ID.
• Every department name in the course relation must have a matching department name in the department
relation.
• The budget of a department must be greater than $0.00.
Integrity Constraints (cont.)

• Constraints on a Single Relation
The allowed integrity constraints include
• not null
• unique
• check(<predicate>)
• Not Null Constraint
In cases such as this, we wish to forbid null values, and we can do so by restricting the domain of the attributes name and budget to exclude null values, by declaring it as follows:
name varchar(20) not null
budget numeric(12,2) not null
• Unique Constraint
SQL also supports an integrity constraint:
unique (Aj1, Aj2, …, Ajm)
• The Check Clause
As another example, consider the following:
create table section
(course_id varchar (8),
sec_id varchar (8),
semester varchar (6),
year numeric (4,0),
building varchar (15),
room_number varchar (7),
time_slot_id varchar (4),
primary key (course_id, sec_id, semester, year),
check (semester in ('Fall', 'Winter', 'Spring', 'Summer')));

• Referential Integrity
The definition of the course table has a declaration "foreign key (dept_name) references department".
• Assigning Names to Constraints
For example, if we wish to assign the name minsalary to the check constraint on the salary attribute of instructor (see Figure 4.9), we would modify the declaration for salary to:
salary numeric(8,2), constraint minsalary check (salary >
• Integrity Constraint Violation During a Transaction
Transactions may consist of several steps, and integrity constraints may be violated temporarily after one step, but a later step may remove the violation.
• Complex Check Conditions and Assertions
We could specify the following referential-integrity constraint on the relation section:
check (time_slot_id in (select time_slot_id from time_slot))
SQL Data Types and Schemas

• Date and Time Types in SQL
Date and time values can be specified like this:
date '2018-04-25'
time '09:30:00'
timestamp '2018-04-25 10:29:01.45'
• Default Values
create table student
(ID varchar (5),
name varchar (20) not null,
dept_name varchar (20),
tot_cred numeric (3,0) default 0,
primary key (ID));
• Large-Object Types
For example, we may declare attributes
book_review clob(10KB)
image blob(10MB)
movie blob(2GB)
• Type Conversion and Formatting Functions
select cast(ID as numeric(5)) as inst_id
from instructor
order by inst_id

• User-Defined Types
The create type clause can be used to define new types. For example, the statements:
create type Dollars as numeric(12,2) final;
create type Pounds as numeric(12,2) final;
• Generating Unique Key Values
insert into instructor (name, dept_name, salary)
values ('Newprof', 'Comp. Sci.', 100000);
• Create Table Extensions
The following statement creates a table t1 containing the results of a query.
create table t1 as
(select *
from instructor
where dept_name = 'Music')
with data;
• Schemas, Catalogs, and Environments
For example,
catalog5.univ_schema.course
Index Definition in SQL

• Indices are important for efficient processing of queries, as well as for efficient enforcement of integrity
constraints. Although not part of the SQL standard, SQL commands for creation of indices are supported by
most database systems.
Authorization

• We may assign a user several forms of authorizations on parts of the database. Authorizations on data include:
• Authorization to read data.
• Authorization to insert new data.
• Authorization to update data.
• Authorization to delete data.
Authorization (cont.)
• Granting and Revoking of Privileges
The basic form of this statement is:
grant <privilege list>
on <relation name or view name>
to <user/role list>;
• Roles
Roles can be granted to users, as well as to other roles, as these statements show:
create role dean;
grant instructor to dean;
grant dean to Satoshi;
• Authorization on Views
This view can be defined in SQL as follows:
create view geo_instructor as
(select *
from instructor
where dept_name = 'Geology');
• Authorizations on Schema
The following grant statement allows user Mariano to create relations that reference the key dept_name of the department relation as a foreign key:
grant references (dept_name) on department to Mariano;
• Transfer of Privileges
To grant Amit the select privilege on department and allow Amit to grant this privilege to others, we write:
grant select on department to Amit with grant option;
• Revoking of Privileges
The following revoke statement revokes only the grant option, rather than the actual select privilege:
revoke grant option for select on department from Amit;
• Row-Level Authorization
We would specify the following predicate to be associated with the takes relation:
ID = sys_context ('USERENV', 'SESSION_USER')
Advanced SQL

• Accessing SQL from a Programming Language


• A database programmer must have access to a general-purpose programming language for at least two
reasons:
1. Not all queries can be expressed in SQL, since SQL does not provide the full expressive power of a general-purpose
language. That is, there exist queries that can be expressed in a language such as C, Java, or Python that
cannot be expressed in SQL. To write such queries, we can embed SQL within a more powerful language.
2. Nondeclarative actions—such as printing a report, interacting with a user, or sending the results of a query to
a graphical user interface—cannot be done from within SQL.
• There are two approaches to accessing SQL from a general-purpose programming language:
• Dynamic SQL: A general-purpose program can connect to and communicate with a database server using a
collection of functions (for procedural languages) or methods (for object-oriented languages).
• Embedded SQL: Like dynamic SQL, embedded SQL provides a means by which a program can interact
with a database server.
Accessing SQL from a Programming Language

• JDBC
The JDBC standard defines an application program interface (API) that Java programs can use to connect to database servers. (The word JDBC was originally an abbreviation for Java Database Connectivity, but the full form is no longer used.)
• Database Access from Python
The Python Database API used in the program is implemented by drivers for many databases, but unlike with JDBC, there are minor differences in the API across different drivers, in particular in the parameters to the connect() function.
• ODBC
The Open Database Connectivity (ODBC) standard defines an API that applications can use to open a connection with a database, send queries and updates, and get back results. Applications such as graphical user interfaces, statistics packages, and spreadsheets can make use of the same ODBC API to connect to any database server that supports ODBC.
• Embedded SQL
The SQL standard defines embeddings of SQL in a variety of programming languages, such as C, C++, Cobol, Pascal, Java, PL/I, and Fortran. A language in which SQL queries are embedded is referred to as a host language, and the SQL structures permitted in the host language constitute embedded SQL.
Functions and Procedures

• Declaring and Invoking SQL Functions and Procedures
This function can be used in a query that returns names and budgets of all departments with more than 12 instructors:
select dept_name, budget
from department
where dept_count(dept_name) > 12;
• Language Constructs for Procedures and Functions
The syntax for while statements and repeat statements is:
while boolean expression do
sequence of statements;
end while
repeat
sequence of statements;
until boolean expression
end repeat
• External Language Routines
External procedures and functions can be specified in this way (note that the exact syntax depends on the specific database system you use):
create procedure dept_count_proc( in dept_name varchar(20),
out count integer)
language C
external name '/usr/avi/bin/dept_count_proc'
create function dept_count (dept_name varchar(20))
returns integer
language C
external name '/usr/avi/bin/dept_count'
Triggers
• A trigger is a statement that the system executes automatically as a side effect of a modification to the database. To define a
trigger, we must:
• Specify when a trigger is to be executed. This is broken up into an event that causes the trigger to be checked and a condition
that must be satisfied for trigger execution to proceed.
• Specify the actions to be taken when the trigger executes.

• Need for Triggers
Triggers can be used to implement certain integrity constraints that cannot be specified using the constraint mechanism of SQL. Triggers are also useful mechanisms for alerting humans or for starting certain tasks automatically when certain conditions are met.
• Triggers in SQL
We now consider how to implement triggers in SQL. The syntax we present here is defined by the SQL standard, but most databases implement nonstandard versions of this syntax. Although the syntax we present here may not be supported on such systems, the concepts we describe are applicable across implementations.
• When Not to Use Triggers
There are many good uses for triggers, such as those we have just seen in Section 5.3.2, but some uses are best handled by alternative techniques. For example, we could implement the on delete cascade feature of a foreign-key constraint by using a trigger instead of using the cascade feature.
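The event, condition, and action of a trigger can be seen in a small sketch. It assumes Python's sqlite3 module and uses SQLite's own trigger syntax, which is one of the nonstandard variants; the trigger name credits_earned, the scenario (adding a course's credits to tot_cred when a passing grade is recorded), and the rows are chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student (ID text primary key, name text, tot_cred integer default 0);
    create table course (course_id text primary key, credits integer);
    create table takes (ID text, course_id text, grade text);

    -- Event: an update of grade on takes.
    -- Condition: the when clause (a passing grade replaces a null or failing one).
    -- Action: the update between begin and end.
    create trigger credits_earned after update of grade on takes
    for each row
    when new.grade is not null and new.grade <> 'F'
         and (old.grade is null or old.grade = 'F')
    begin
        update student
        set tot_cred = tot_cred +
            (select credits from course where course.course_id = new.course_id)
        where student.ID = new.ID;
    end;

    insert into student (ID, name) values ('00128', 'Zhang');
    insert into course values ('CS-101', 4);
    insert into takes values ('00128', 'CS-101', null);
""")

def total_credits(student_id):
    """Return the tot_cred value currently stored for the student."""
    row = conn.execute("select tot_cred from student where ID = ?", (student_id,)).fetchone()
    return row[0]

before = total_credits("00128")
# The application only records the grade; the trigger maintains tot_cred.
conn.execute("update takes set grade = 'A' where ID = '00128' and course_id = 'CS-101'")
after = total_credits("00128")
```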
Recursive Queries

• Transitive Closure Using Iteration
One way to write the preceding query is to use iteration: First find those courses that are a direct prerequisite of CS-347, then those courses that are a prerequisite of all the courses under the first set, and so on. This iterative process continues until we reach an iteration where no courses are added.
• Recursion in SQL
It is rather inconvenient to specify transitive closure using iteration. There is an alternative approach, using recursive view definitions, that is easier to use.
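Both approaches can be sketched on a small prereq relation. This assumes Python's sqlite3 module, whose SQLite engine supports with recursive; the prerequisite rows are made up, and the function names and the temporary relation name rec_prereq are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table prereq (course_id text, prereq_id text);
    insert into prereq values ('CS-347', 'CS-319'), ('CS-319', 'CS-315'),
                              ('CS-319', 'CS-101'), ('CS-315', 'CS-190'),
                              ('CS-190', 'CS-101'), ('BIO-301', 'BIO-101');
""")

def prereqs_by_iteration(course_id):
    """Transitive closure by iteration: repeat until an iteration adds no courses."""
    found = set()
    frontier = {course_id}
    while frontier:
        new_courses = set()
        for course in frontier:
            rows = conn.execute(
                "select prereq_id from prereq where course_id = ?", (course,))
            new_courses.update(row[0] for row in rows)
        frontier = new_courses - found   # only courses not seen before
        found |= frontier
    return found

def prereqs_by_recursion(course_id):
    """The same closure expressed as a single recursive query."""
    query = """
        with recursive rec_prereq (prereq_id) as (
            select prereq_id from prereq where course_id = ?
            union
            select prereq.prereq_id
            from rec_prereq, prereq
            where rec_prereq.prereq_id = prereq.course_id
        )
        select prereq_id from rec_prereq
    """
    return {row[0] for row in conn.execute(query, (course_id,))}

iterative = prereqs_by_iteration("CS-347")
recursive = prereqs_by_recursion("CS-347")
```

The recursive form uses union, not union all, so duplicates are dropped and the evaluation stops once no new courses appear, mirroring the stopping rule of the iterative version.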
Advanced Aggregation Features

• Ranking
Finding the position of a value within a set is a common operation. For instance, we may wish to assign students a rank in class based on their grade-point average (GPA):
select ID, rank() over (order by (GPA) desc) as s_rank
from student_grades;
• Windowing
Window queries compute an aggregate function over ranges of tuples. This is useful, for example, to compute an aggregate of a fixed range of time; the time range is called a window.
select year, avg(num_credits)
over (order by year rows 3 preceding)
as avg_total_credits
from tot_credits;
• Pivoting
Consider an application where a shop wants to find out what kinds of clothes are popular. Let us suppose that clothes are characterized by their item name, color, and size, and that we have a relation sales with the schema:
sales (item_name, color, clothes_size, quantity)
• Rollup and Cube
SQL supports generalizations of the group by construct using the rollup and cube operations, which allow multiple group by queries to be run in a single query, with the result returned as a single relation. Consider again our retail shop example and the relation:
sales (item_name, color, clothes_size, quantity)
Database Design Using the E-R Model

• Overview of the Design Process

Design Phases

• Initial phase: characterize fully the data needs of the prospective database users.
• Second phase: choosing a data model.
  • Applying the concepts of the chosen data model.
  • Translating these requirements into a conceptual schema of the database.
  • A fully developed conceptual schema indicates the functional requirements of the enterprise.
  • Describe the kinds of operations (or transactions) that will be performed on the data.
• Final phase: moving from an abstract data model to the implementation of the database.
  • Logical design: deciding on the database schema. Database design requires that we find a "good" collection of relation schemas.
    • Business decision: What attributes should we record in the database?
    • Computer science decision: What relation schemas should we have and how should the attributes be distributed among the various relation schemas?
  • Physical design: deciding on the physical layout of the database.
Overview of the Design Process(CONT.)
• Design Alternatives
• A major part of the database design process is deciding how to represent in the design the various types of “things”
such as people, places, products, and the like.
• In designing a database schema, we must ensure that we avoid two major pitfalls:

1. Redundancy: A bad design may repeat information. For example, if we store the course identifier and title of a course with each course offering, the title would be stored redundantly (i.e., multiple times, unnecessarily) with each course offering. It would suffice to store only the course identifier with each course offering, and to associate the title with the course identifier only once, in a course entity.
2. Incompleteness: A bad design may make certain aspects of the enterprise difficult or impossible to model. For example, suppose that, as in case (1) above, we only had entities corresponding to course offering, without having an entity
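The redundancy pitfall can be made concrete with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the table names offering_redundant, course, and offering, and the sample rows, are assumptions made up for illustration. The first table repeats the title with every offering, while the second design stores it once in a course table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Redundant design: the title is repeated with every offering.
    create table offering_redundant (
        course_id text, title text, semester text, year integer
    );
    -- Alternative: the title is stored once, in a course entity.
    create table course (course_id text primary key, title text);
    create table offering (
        course_id text references course(course_id),
        semester text, year integer
    );
""")

# Sample data, invented for illustration.
offerings = [("CS-101", "Fall", 2017), ("CS-101", "Spring", 2018), ("CS-101", "Fall", 2018)]
title = "Intro. to Computer Science"

conn.executemany(
    "insert into offering_redundant values (?, ?, ?, ?)",
    [(cid, title, sem, yr) for cid, sem, yr in offerings],
)
conn.execute("insert into course values (?, ?)", ("CS-101", title))
conn.executemany("insert into offering values (?, ?, ?)", offerings)

# How many times is the title physically stored in each design?
redundant_copies = conn.execute(
    "select count(title) from offering_redundant").fetchone()[0]
single_copies = conn.execute("select count(title) from course").fetchone()[0]

# The title is still available for every offering through a join.
joined = conn.execute("""
    select o.course_id, c.title, o.semester, o.year
    from offering o join course c on o.course_id = c.course_id
    order by o.year, o.semester
""").fetchall()
```

In the second design a title change touches one row in course rather than one row per offering.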
The Entity-Relationship Model

The entity-relationship (E-R) data model was developed to facilitate database design by allowing specification of an enterprise schema that represents the overall logical structure of a database.

Relationship Sets
A relationship is an association among several entities. For example, we can define a relationship advisor that associates instructor Katz with student Shankar. This relationship specifies that Katz is an advisor to student Shankar. A relationship set is a set of relationships of the same type.

Entity Sets
An entity is a “thing” or “object” in the real world that is distinguishable from all other objects. For example, each person in a university is an entity. An entity has a set of properties, and the values for some set of properties must uniquely identify an entity.
• Complex Attributes
• For each attribute, there is a set of permitted values, called the domain, or value set, of that attribute. The domain of attribute course_id might be the set of all text strings of a certain length. Similarly, the domain of attribute semester might be strings from the set {Fall, Winter, Spring, Summer}.
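One way to see a domain at work is to let the database enforce it. The sketch below uses Python's sqlite3 module and SQL check constraints; the table name section and the length limit of 8 characters for course_id are assumptions chosen for illustration, not values given in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table section (
        -- Assumed domain: text strings of at most 8 characters.
        course_id text check (length(course_id) <= 8),
        -- Domain taken from the text: one of four semester names.
        semester  text check (semester in ('Fall', 'Winter', 'Spring', 'Summer'))
    )
""")

# A value inside the domain is accepted.
conn.execute("insert into section values ('CS-101', 'Fall')")

# A value outside the domain violates the check constraint.
try:
    conn.execute("insert into section values ('CS-101', 'Autumn')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("select course_id, semester from section").fetchall()
```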
Mapping Cardinalities
Mapping cardinalities, or cardinality ratios, express the number of entities to which
another entity can be associated via a relationship set. Mapping cardinalities are most useful in describing binary relationship
sets, although they can contribute to the description of relationship sets that involve more than two entity sets. For a binary
relationship set R between entity sets A and B, the mapping cardinality
must be one of the following:
• One-to-one. An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A. (See Figure 6.9a.)
• One-to-many. An entity in A is associated with any number (zero or more) of entities in B. An entity in B, however, can be associated with at most one entity in A. (See Figure 6.9b.)
• Many-to-one. An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number (zero or more) of entities in A. (See Figure 6.10a.)
• Many-to-many. An entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A.
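The definitions above can be checked mechanically on a sample instance. The following is a minimal sketch in Python, assuming a binary relationship set is given as a collection of (a, b) pairs; the function name mapping_cardinality and the sample names are hypothetical. It reports the most restrictive cardinality that the given instance satisfies, and labels the case where neither side is restricted as many-to-many. A mapping cardinality is a constraint on all legal instances, so a single instance can only show which constraints are not violated.

```python
from collections import Counter

def mapping_cardinality(pairs):
    """Classify a binary relationship set, given as (a, b) pairs, from A to B."""
    pairs = set(pairs)
    per_a = Counter(a for a, _ in pairs)  # number of B entities linked to each a
    per_b = Counter(b for _, b in pairs)  # number of A entities linked to each b
    a_at_most_one = all(n <= 1 for n in per_a.values())
    b_at_most_one = all(n <= 1 for n in per_b.values())
    if a_at_most_one and b_at_most_one:
        return "one-to-one"
    if b_at_most_one:
        return "one-to-many"   # each b has at most one a; an a may have many b
    if a_at_most_one:
        return "many-to-one"   # each a has at most one b; a b may have many a
    return "many-to-many"

# Sample instance: instructor Katz advises two students (names are illustrative).
advisor = {("Katz", "Shankar"), ("Katz", "Zhang")}
from_instructor = mapping_cardinality(advisor)
from_student = mapping_cardinality((s, i) for i, s in advisor)
```

Viewed from instructor to student the sample is one-to-many; with the pairs reversed it is many-to-one.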
Primary Key

• We must have a way to specify how entities within a given entity set and relationships within a given relationship set are distinguished.

Entity Sets
Conceptually, individual entities are distinct; from a database perspective, however, the differences among them must be expressed in terms of their attributes.

Relationship Sets
We need a mechanism to distinguish among the various relationships of a relationship set.

Weak Entity Sets
Consider a section entity, which is uniquely identified by a course identifier, semester, year, and section identifier. Section entities are related to course entities. Suppose we create a relationship set sec_course between entity sets section and course.
Removing Redundant Attributes in Entity Sets

• When we design a database using the E-R model, we usually start by identifying those entity sets that should be included. For example, in the university organization we have discussed thus far, we decided to include such entity sets as student and instructor. Once the entity sets are decided upon, we must choose the appropriate attributes. These attributes are supposed to represent the various values we want to capture in the database. In the university organization, we decided that for the instructor entity set, we will include the attributes ID, name, dept_name, and salary. We could have added the attributes phone_number, office_number, home_page, and others. The choice of what attributes to include is up to the designer, who has a good understanding of the structure of the enterprise. Once the entities and their corresponding attributes are chosen, the relationship sets among the various entities are formed. These relationship sets may result in a situation where attributes in the various entity sets are redundant and need to be removed from the original entity sets. To illustrate, consider the entity sets instructor and department:
• The entity set instructor includes the attributes ID, name, dept_name, and salary, with ID forming the primary key.
• The entity set department includes the attributes dept_name, building, and budget, with dept_name forming the primary key.
• Each instructor's department is modeled by a relationship set between instructor and department. The attribute dept_name appears in both entity sets; because it is the primary key of department, it is redundant in instructor and should be removed.
Reducing E-R Diagrams to Relational Schemas
• Both the E-R model and the relational database model are abstract, logical representations of real-world enterprises. Because
the two models employ similar design principles, we can convert an E-R design into a relational design. For each entity set and
for each relationship set in the database design, there is a unique relation schema to which we assign the name of the
corresponding entity set or relationship set.

Representation of Strong Entity Sets
Let E be a strong entity set with only simple descriptive attributes a1, a2,…, an. We represent this entity with a schema called E with n distinct attributes.

Representation of Strong Entity Sets with Complex Attributes
When a strong entity set has nonsimple attributes, things are a bit more complex.

Representation of Weak Entity Sets
Let A be a weak entity set with attributes a1, a2,…, am. Let B be the strong entity set on which A depends. Let the primary key of B consist of attributes b1, b2,…, bn. We represent the entity set A by a relation schema called A with one attribute for each member of the set:
{a1, a2,…, am} ∪ {b1, b2,…, bn}
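The rule for weak entity sets is a plain set union, which a few lines of Python can illustrate. This is a sketch only: the helper name weak_entity_schema is hypothetical, and the attribute names sec_id, semester, year, and course_id are assumed spellings for the section and course example used earlier.

```python
def weak_entity_schema(weak_attrs, owner_primary_key):
    """Attributes of the relation schema for a weak entity set A:
    A's own attributes a1..am plus the primary key b1..bn of the
    strong entity set B on which A depends."""
    schema = list(weak_attrs)
    for attr in owner_primary_key:
        if attr not in schema:  # a union keeps each attribute once
            schema.append(attr)
    return schema

# section is the weak entity set; course is the strong entity set it depends on.
section_schema = weak_entity_schema(
    ["sec_id", "semester", "year"],  # attributes of section (assumed names)
    ["course_id"],                   # primary key of course (assumed name)
)
```

The result lists the four attributes that together identify a section: sec_id, semester, year, and course_id.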

Representation of Relationship Sets
Let R be a relationship set, let a1, a2,…, am be the set of attributes formed by the union of the primary keys of each of the entity sets participating in R, and let the descriptive attributes (if any) of R be b1, b2,…, bn. We represent this relationship set by a relation schema called R with one attribute for each member of the set:
{a1, a2,…, am} ∪ {b1, b2,…, bn}

Redundancy of Schemas
A relationship set linking a weak entity set to the corresponding strong entity set is treated specially.

Combination of Schemas
Consider a many-to-one relationship set AB from entity set A to entity set B. Using our relational-schema construction algorithm outlined previously,
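As an illustration of the rule for relationship sets, the sketch below builds a relation for the advisor relationship between instructor and student, using Python's sqlite3 module. Several details are assumptions: both entity sets are given a primary key named ID, so the two keys are renamed i_ID and s_ID in the advisor relation; advisor has no descriptive attributes; the relationship is treated as many-to-many, so the pair of keys forms the primary key; and the ID values are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    create table instructor (ID text primary key, name text);
    create table student    (ID text primary key, name text);
    -- Schema for relationship set advisor: the union of the primary keys
    -- of the participating entity sets (renamed, since both are called ID).
    create table advisor (
        i_ID text references instructor(ID),
        s_ID text references student(ID),
        primary key (i_ID, s_ID)  -- assumes a many-to-many relationship
    );
""")

conn.execute("insert into instructor values ('45565', 'Katz')")
conn.execute("insert into student values ('12345', 'Shankar')")
conn.execute("insert into advisor values ('45565', '12345')")

# A relationship must refer to existing entities on both sides.
try:
    conn.execute("insert into advisor values ('99999', '12345')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

advisor_columns = [row[1] for row in conn.execute("pragma table_info(advisor)")]
advisor_rows = conn.execute("select i_ID, s_ID from advisor").fetchall()
```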
Extended E-R Features

Specialization
An entity set may include subgroupings of entities that are distinct in some way from other entities in the set. For instance, a subset of entities within an entity set may have attributes that are not shared by all the entities in the entity set. The E-R model provides a means for representing these distinctive entity groupings.

Generalization
The refinement from an initial entity set into successive levels of entity subgroupings represents a top-down design process in which distinctions are made explicit.

Attribute Inheritance
A crucial property of the higher- and lower-level entities created by specialization and generalization is attribute inheritance.
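Attribute inheritance means that a lower-level entity set carries the attributes of its higher-level entity set in addition to its own. Class inheritance in Python gives a rough analogy, sketched below; the entity sets Person, Student, and Employee and their attributes are assumptions chosen for illustration, and a class hierarchy is only a loose stand-in for an E-R specialization.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:            # higher-level entity set
    ID: str
    name: str

@dataclass
class Student(Person):   # lower-level entity set: adds its own attribute
    tot_cred: int

@dataclass
class Employee(Person):  # another lower-level entity set
    salary: int

# Each specialization inherits ID and name from Person.
student_attrs = [f.name for f in fields(Student)]
employee_attrs = [f.name for f in fields(Employee)]

shankar = Student(ID="12345", name="Shankar", tot_cred=32)  # invented sample values
```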

Constraints on Specializations
To model an enterprise more accurately, the database designer may choose to place certain constraints on a particular generalization/specialization.

Aggregation
One limitation of the E-R model is that it cannot express relationships among relationships.

Reduction to Relation Schemas
• Representation of Generalization
• Representation of Aggregation
Entity-Relationship Design Issues

• Common Mistakes in E-R Diagrams


A common mistake when creating E-R models is the use of the primary key of an entity
set as an attribute of another entity set, instead of using a relationship.
• Use of Entity Sets versus Attributes
• Use of Entity Sets versus Relationship Sets
• Binary versus n-ary Relationship Sets
Relationships in databases are often binary. Some relationships that appear to be nonbinary could actually be better represented by several binary relationships.
Alternative Notations for Modeling Data

A diagrammatic representation of the data model of an application is a very important part of designing a
database schema. Creation of a database schema requires not only data modeling experts, but also
domain experts who know the requirements of the application but may not be familiar with data
modeling. An intuitive diagrammatic representation is particularly important since it eases
communication of information between these groups of experts.
• Alternative E-R Notations
• The Unified Modeling Language (UML)
Other Aspects of Database Design

Our extensive discussion of schema design in this chapter may create the false impression that schema design is
the only component of a database design. There are indeed several other considerations that we address more
fully in subsequent chapters, and survey briefly here.
• Functional Requirements
All enterprises have rules on what kinds of functionality are to be supported by an enterprise application. These
could include transactions that update the data, as well as queries to view data in a desired fashion. In addition
to planning the functionality, designers have to plan the interfaces to be built to support the functionality.
• Data Flow, Workflow
Database applications are often part of a larger enterprise application that interacts not only with the database
system but also with various specialized applications.
• Schema Evolution
Database design is usually not a one-time activity. The needs of an organization evolve continually, and the data
that it needs to store also evolve correspondingly.
