Database Solutions (2
Chapter 1 Introduction - Review questions
1.1 List four examples of database systems other than those listed in Section 1.1.
Some examples could be:
A system that maintains component part details for a car manufacturer;
An advertising company keeping details of all clients and adverts placed with them;
A training company keeping course information and participants' details;
An organization maintaining all sales order information.
1.2 Discuss the meaning of each of the following terms:
(a) data
For end users, this constitutes all the different values connected with the various
objects/entities that are of concern to them.
(b) database
A shared collection of logically related data (and a description of this data), designed to meet
the information needs of an organization.
(c) database management system
A software system that enables users to define, create, and maintain the database and
provides controlled access to this database.
(d) application program
A computer program that interacts with the database by issuing an appropriate request
(typically an SQL statement) to the DBMS.
(e) data independence
This is essentially the separation of underlying file structures from the programs that operate
on them, also called program-data independence.
(f) views
A view is a virtual table that does not necessarily exist in the database but is generated by the
DBMS from the underlying base tables whenever it's accessed. Views present only a subset of the
database that is of particular interest to a user. Views can be customized, for example, field
names may change, and they also provide a level of security by preventing users from seeing
certain data.
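The behaviour of a view can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the Staff table, its columns, and the StaffB001 view are hypothetical and not taken from the text. The view renames columns, hides the salary column, and reflects changes to the base table each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table, used only for illustration.
    CREATE TABLE Staff (
        staffNo TEXT NOT NULL PRIMARY KEY,
        name    TEXT NOT NULL,
        branch  TEXT NOT NULL,
        salary  INTEGER NOT NULL
    );
    INSERT INTO Staff VALUES ('S1', 'Ann',  'B001', 30000);
    INSERT INTO Staff VALUES ('S2', 'Bob',  'B002', 24000);
    INSERT INTO Staff VALUES ('S3', 'Cara', 'B001', 18000);

    -- The view exposes a subset of rows and columns, with renamed fields.
    CREATE VIEW StaffB001 AS
        SELECT staffNo AS id, name AS fullName
        FROM Staff
        WHERE branch = 'B001';
""")

# Column names seen by a user of the view (salary is not among them).
view_columns = [d[0] for d in conn.execute("SELECT * FROM StaffB001").description]
view_rows = conn.execute("SELECT id, fullName FROM StaffB001 ORDER BY id").fetchall()

# The view stores no data of its own: a new base-table row shows up through it.
with conn:
    conn.execute("INSERT INTO Staff VALUES ('S4', 'Dev', 'B001', 21000)")
rows_after_insert = conn.execute("SELECT id FROM StaffB001 ORDER BY id").fetchall()

print(view_columns, view_rows, rows_after_insert)
```

Because the view is evaluated on request, no separate copy of the data has to be kept in step with the base table.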
1.3 Describe the main characteristics of the database approach.
Focus is now on the data first, and then the applications. The structure of the data is now kept
separate from the programs that operate on the data. This is held in the system catalog or data
dictionary. Programs can now share data, which is no longer fragmented. There is also a reduction in
redundancy, and achievement of program-data independence.
1.4 Describe the five components of the DBMS environment and discuss how they relate
to each other.
(1) Hardware: The computer system(s) that the DBMS and the application programs run on.
This can range from a single PC, to a single mainframe, to a network of computers.
(2) Software: The DBMS software and the application programs, together with the operating
system, including network software if the DBMS is being used over a network.
(3) Data: The data acts as a bridge between the hardware and software components and
the human components. As we've already said, the database contains both the operational
data and the meta-data (the 'data about data').
(4) Procedures: The instructions and rules that govern the design and use of the database. This
may include instructions on how to log on to the DBMS, make backup copies of the database,
and how to handle hardware or software failures.
(5) People: This includes the database designers, database administrators (DBAs),
application programmers, and the end-users.
1.5 Describe the problems with the traditional two-tier client-server architecture and
discuss how these problems were overcome with the three-tier client-server
architecture.
In the mid-1990s, as applications became more complex and potentially could be deployed to
hundreds or thousands of end-users, the client side of this architecture gave rise to two
problems:
A 'fat' client, requiring considerable resources on the client's computer to run effectively
(resources include disk space, RAM, and CPU power).
A significant client side administration overhead.
By 1995, a new variation of the traditional two-tier client-server model appeared to solve these
problems, called the three-tier client-server architecture. This new architecture proposed
three layers, each potentially running on a different platform:
(1) The user interface layer, which runs on the end-user's computer (the client).
(2) The business logic and data processing layer. This middle tier runs on a server and is often
called the application server. One application server is designed to serve multiple clients.
(3) A DBMS, which stores the data required by the middle tier. This tier may run on a separate
server called the database server.
The three-tier design has many advantages over the traditional two-tier design, such as:
A 'thin' client, which requires less expensive hardware.
Simplified application maintenance, as a result of centralizing the business logic for many
end-users into a single application server. This eliminates the concerns of software
distribution that are problematic in the traditional two-tier client-server architecture.
Added modularity, which makes it easier to modify or replace one tier without affecting the
other tiers.
Easier load balancing, again as a result of separating the core business logic from the
database functions. For example, a Transaction Processing Monitor (TPM) can be used to
reduce the number of connections to the database server. (A TPM is a program that
controls data transfer between clients and servers in order to provide a consistent
environment for Online Transaction Processing (OLTP).)
An additional advantage is that the three-tier architecture maps quite naturally to the Web
environment, with a Web browser acting as the 'thin' client, and a Web server acting as the
application server. The three-tier client-server architecture is illustrated in Figure 1.4.
1.6 Describe the functions that should be provided by a modern full-scale multi-user
DBMS.
Data Storage, Retrieval and Update
A User-Accessible Catalog
Transaction Support
Concurrency Control Services
Recovery Services
Authorization Services
Support for Data Communication
Integrity Services
Services to Promote Data Independence
Utility Services
1.7 Of the functions described in your answer to Question 1.6, which ones do you think
would not be needed in a standalone PC DBMS? Provide justification for your
answer.
Concurrency Control Services - only single user.
Authorization Services - only single user, but may be needed if different individuals are to use
the DBMS at different times.
Utility Services - limited in scope.
Support for Data Communication - only standalone system.
1.8 Discuss the advantages and disadvantages of DBMSs.
Some advantages of the database approach include control of data redundancy, data
consistency, sharing of data, and improved security and integrity. Some disadvantages include
complexity, cost, reduced performance, and higher impact of a failure.
Chapter 2 The Relational Model - Review questions
2.1 Discuss each of the following concepts in the context of the relational data model:
(a) relation
A table with columns and rows.
(b) attribute
A named column of a relation.
(c) domain
The set of allowable values for one or more attributes.
(d) tuple
A record of a relation.
(e) relational database
A collection of normalized tables.
2.2 Discuss the properties of a relational table.
A relational table has the following properties:
The table has a name that is distinct from all other tables in the database.
Each cell of the table contains exactly one value. (For example, it would be wrong to store
several telephone numbers for a single branch in a single cell. In other words, tables don't
contain repeating groups of data. A relational table that satisfies this property is said to be
normalized or in first normal form.)
Each column has a distinct name.
The values of a column are all from the same domain.
The order of columns has no significance. In other words, provided a column name is moved
along with the column values, we can interchange columns.
Each record is distinct; there are no duplicate records.
The order of records has no significance, theoretically.
2.3 Discuss the differences between the candidate keys and the primary key of a
table. Explain what is meant by a foreign key. How do foreign keys of tables relate
to candidate keys? Give examples to illustrate your answer.
The primary key is the candidate key that is selected to identify tuples uniquely within a
relation. A foreign key is an attribute or set of attributes within one relation that matches the
candidate key of some (possibly the same) relation.
2.4 What does a null represent?
A null represents a value for a column that is currently unknown or is not applicable for this
record.
2.5 Define the two principal integrity rules for the relational model. Discuss why it is
desirable to enforce these rules.
Entity integrity: In a base table, no column of a primary key can be null.
Referential integrity: If a foreign key exists in a table, either the foreign key value must
match a candidate key value of some record in its home table or the foreign key value must be
wholly null.
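As a rough illustration of the two rules, the sketch below uses Python's sqlite3 module with hypothetical Branch and Staff tables (the names and data are assumptions). Two SQLite-specific points matter here: foreign keys are checked only after PRAGMA foreign_keys = ON, and primary key columns are declared NOT NULL explicitly because SQLite does not always imply it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- Hypothetical tables: Branch is the home table of Staff.branchNo.
    CREATE TABLE Branch (
        branchNo TEXT NOT NULL PRIMARY KEY,
        city     TEXT NOT NULL
    );
    CREATE TABLE Staff (
        staffNo  TEXT NOT NULL PRIMARY KEY,           -- entity integrity
        name     TEXT NOT NULL,
        branchNo TEXT REFERENCES Branch(branchNo)     -- referential integrity
    );
    INSERT INTO Branch VALUES ('B001', 'London');
""")

INSERT_STAFF = "INSERT INTO Staff VALUES (?, ?, ?)"


def try_insert(params):
    """Return True if the DBMS accepts the row, False if an integrity rule rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(INSERT_STAFF, params)
        return True
    except sqlite3.IntegrityError:
        return False


null_primary_key = try_insert((None, "Ann", "B001"))   # violates entity integrity
matching_fk = try_insert(("S1", "Ann", "B001"))        # matches a Branch record
null_fk = try_insert(("S2", "Bob", None))              # wholly null foreign key is allowed
dangling_fk = try_insert(("S3", "Cara", "B999"))       # no such branch

print(null_primary_key, matching_fk, null_fk, dangling_fk)
```

Enforcing the rules in the DBMS, rather than in each application program, means no program can leave a staff record without an identifier or pointing at a branch that does not exist.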
Chapter 3 SQL and QBE - Review questions
3.1 What are the two major components of SQL and what function do they serve?
A data definition language (DDL) for defining the database structure.
A data manipulation language (DML) for retrieving and updating data.
3.2 Explain the function of each of the clauses in the SELECT statement. What
restrictions are imposed on these clauses?
FROM specifies the table or tables to be used;
WHERE filters the rows subject to some condition;
GROUP BY forms groups of rows with the same column value;
HAVING filters the groups subject to some condition;
SELECT specifies which columns are to appear in the output;
ORDER BY specifies the order of the output.
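The roles of the six clauses listed above can be seen in one query. The sketch below runs it through Python's sqlite3 module against a hypothetical Staff table (the table, columns, and data are assumptions made for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (
        staffNo  TEXT NOT NULL PRIMARY KEY,
        branchNo TEXT NOT NULL,
        position TEXT NOT NULL,
        salary   INTEGER NOT NULL
    )
""")
with conn:
    conn.executemany(
        "INSERT INTO Staff VALUES (?, ?, ?, ?)",
        [
            ("S1", "B001", "Manager", 40000),
            ("S2", "B001", "Assistant", 12000),
            ("S3", "B001", "Assistant", 15000),
            ("S4", "B002", "Supervisor", 18000),
            ("S5", "B003", "Assistant", 9000),
            ("S6", "B003", "Supervisor", 24000),
            ("S7", "B002", "Manager", 30000),
        ],
    )

QUERY = """
    SELECT branchNo, COUNT(*) AS numStaff, SUM(salary) AS totalSalary  -- output columns
    FROM Staff                       -- table to use
    WHERE position <> 'Manager'      -- filter rows before grouping
    GROUP BY branchNo                -- one group per branch
    HAVING COUNT(*) > 1              -- keep only groups with more than one row
    ORDER BY branchNo                -- order of the output
"""
result = conn.execute(QUERY).fetchall()
print(result)
```

Branch B002 has only one non-manager, so its group is removed by HAVING, while the managers themselves were already removed by WHERE before any groups were formed.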
3.3 What restrictions apply to the use of the aggregate functions within the SELECT
statement? How do nulls affect the aggregate functions?
An aggregate function can be used only in the SELECT list and in the HAVING clause.
Apart from COUNT(*), each function eliminates nulls first and operates only on the remaining
non-null values. COUNT(*) counts all the rows of a table, regardless of whether nulls or
duplicate values occur.
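A small experiment makes the treatment of nulls concrete. The sketch below uses Python's sqlite3 module and a hypothetical Staff table with one null salary and one duplicated salary (all names and values are assumptions). It also shows SQLite rejecting an aggregate function placed directly in a WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT NOT NULL PRIMARY KEY, salary INTEGER)")
with conn:
    conn.executemany(
        "INSERT INTO Staff VALUES (?, ?)",
        [("S1", 12000), ("S2", None), ("S3", 18000), ("S4", 18000)],
    )

count_all, count_salary, count_distinct, total, average = conn.execute("""
    SELECT COUNT(*),               -- counts every row, including the null one
           COUNT(salary),          -- nulls eliminated first
           COUNT(DISTINCT salary), -- nulls and duplicates eliminated
           SUM(salary),
           AVG(salary)             -- divides by the number of non-null values
    FROM Staff
""").fetchone()

# An aggregate function is not allowed in a WHERE clause.
try:
    conn.execute("SELECT staffNo FROM Staff WHERE salary > AVG(salary)")
    aggregate_in_where_rejected = False
except sqlite3.Error:
    aggregate_in_where_rejected = True

print(count_all, count_salary, count_distinct, total, average)
```

Note that the average is taken over three values, not four: treating the null as zero would give a different, misleading result.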
3.4 Explain how the GROUP BY clause works. What is the difference between the
WHERE and HAVING clauses?
SQL first applies the WHERE clause. Then it conceptually arranges the table based on the
grouping column(s). Next, it applies the HAVING clause and finally orders the result according to
the ORDER BY clause.
WHERE filters rows subject to some condition; HAVING filters groups subject to some
condition.
3.5 What is the difference between a subquery and a join? Under what circumstances
would you not be able to use a subquery?
With a subquery, the columns specified in the SELECT list are restricted to one table. Thus,
you cannot use a subquery if the SELECT list contains columns from more than one table.
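The distinction can be illustrated with two queries that select the same staff. In this sketch (Python's sqlite3 module; the Branch and Staff tables and their data are hypothetical), the subquery version can list only Staff columns, whereas the join version can also list the city from Branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT NOT NULL PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE Staff (
        staffNo  TEXT NOT NULL PRIMARY KEY,
        name     TEXT NOT NULL,
        branchNo TEXT NOT NULL REFERENCES Branch(branchNo)
    );
    INSERT INTO Branch VALUES ('B001', 'London'), ('B002', 'Glasgow');
    INSERT INTO Staff VALUES ('S1', 'Ann', 'B001'), ('S2', 'Bob', 'B002'),
                             ('S3', 'Cara', 'B001');
""")

# Subquery: the output columns all come from Staff; Branch is used only to filter.
subquery_rows = conn.execute("""
    SELECT staffNo, name
    FROM Staff
    WHERE branchNo IN (SELECT branchNo FROM Branch WHERE city = 'London')
    ORDER BY staffNo
""").fetchall()

# Join: needed as soon as the output must include a column from Branch as well.
join_rows = conn.execute("""
    SELECT s.staffNo, s.name, b.city
    FROM Staff s JOIN Branch b ON s.branchNo = b.branchNo
    WHERE b.city = 'London'
    ORDER BY s.staffNo
""").fetchall()

print(subquery_rows, join_rows)
```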
3.6 What is QBE and what is the relationship between QBE and SQL?
QBE is an alternative, graphical-based, 'point-and-click' way of querying the database, which is
particularly suited for queries that are not too complex, and can be expressed in terms of a few
tables. QBE has acquired the reputation of being one of the easiest ways for non-technical
users to obtain information from the database.
QBE queries are converted into their equivalent SQL statements before transmission to the
DBMS server.
Chapter 4 Database Systems Development Lifecycle - Review questions
4.1 Describe what is meant by the term 'software crisis'.
The past few decades have witnessed a dramatic rise in the number of software applications.
Many of these applications proved to be demanding, requiring constant maintenance. This
maintenance involved correcting faults, implementing new user requirements, and modifying the
software to run on new or upgraded platforms. With so much software around to support, the
effort spent on maintenance began to absorb resources at an alarming rate. As a result, many
major software projects were late, over budget, and the software produced was unreliable,
difficult to maintain, and performed poorly. This led to what has become known as the 'software
crisis'. Although this term was first used in the late 1960s, more than 30 years later, the crisis
is still with us. As a result, some people now refer to the software crisis as the 'software
depression'.
4.2 Discuss the relationship between the information systems lifecycle and the database
system development lifecycle.
An information system is the resources that enable the collection, management, control, and
dissemination of data/information throughout a company. The database is a fundamental
component of an information system. The lifecycle of an information system is inherently linked
to the lifecycle of the database that supports it.
Typically, the stages of the information systems lifecycle include: planning, requirements
collection and analysis, design (including database design), prototyping, implementation, testing,
conversion, and operational maintenance. As a database is a fundamental component of the
larger company-wide information system, the database system development lifecycle is
inherently linked with the information systems lifecycle.
4.3 Briefly describe the stages of the database system development lifecycle.
See Figure 4.1 Stages of the database system development lifecycle.
Database planning is the management activities that allow the stages of the database system
development lifecycle to be realized as efficiently and effectively as possible.
System definition involves identifying the scope and boundaries of the database system
including its major user views. A user view can represent a job role or business application area.
Requirements collection and analysis is the process of collecting and analyzing information
about the company that is to be supported by the database system, and using this information
to identify the requirements for the new system.
There are three approaches to dealing with multiple user views, namely the centralized
approach, the view integration approach, and a combination of both. The centralized approach
involves collating the users' requirements for different user views into a single list of
requirements. A data model representing all the user views is created during the database
design stage. The view integration approach involves leaving the users' requirements for each
user view as separate lists of requirements. Data models representing each user view are
created and then merged at a later stage of database design.
Database design is the process of creating a design that will support the company's mission
statement and mission objectives for the required database. This stage includes the logical and
physical design of the database.
The aim of DBMS selection is to select a system that meets the current and future
requirements of the company, balanced against costs that include the purchase of the DBMS
product and any additional software/hardware, and the costs associated with changeover and
training.
Application design involves designing the user interface and the application programs that use
and process the database. This stage involves two main activities: transaction design and user
interface design.
Prototyping involves building a working model of the database system, which allows the
designers or users to visualize and evaluate the system.
Implementation is the physical realization of the database and application designs.
Data conversion and loading involves transferring any existing data into the new database and
converting any existing applications to run on the new database.
Testing is the process of running the database system with the intent of finding errors.
Operational maintenance is the process of monitoring and maintaining the system following
installation.
4.4 Describe the purpose of creating a mission statement and mission objectives for the
required database during the database planning stage.
The mission statement defines the major aims of the database system, while each mission
objective identifies a particular task that the database must support.
4.5 Discuss what a user view represents when designing a database system.
A user view defines what is required of a database system from the perspective of a particular
job (such as Manager or Supervisor) or business application area (such as marketing, personnel,
or stock control).
4.6 Compare and contrast the centralized approach and view integration approach to
managing the design of a database system with multiple user views.
An important activity of the requirements collection and analysis stage is deciding how to deal
with the situation where there is more than one user view. There are three approaches to
dealing with multiple user views:
the centralized approach,
the view integration approach, and
a combination of both approaches.
Centralized approach
Requirements for each user view are merged into a single list of requirements for the new
database system. A logical data model representing all user views is created during the database
design stage.
The centralized approach involves collating the requirements for different user views into a
single list of requirements. A data model representing all user views is created in the database
design stage. A diagram representing the management of user views 1 to 3 using the centralized
approach is shown in Figure 4.4. Generally, this approach is preferred when there is a significant
overlap in requirements for each user view and the database system is not overly complex.
See Figure 4.4 The centralized approach to managing multiple user views 1 to 3.
View integration approach
Requirements for each user view remain as separate lists. Data models representing each user
view are created and then merged later during the database design stage.
The view integration approach involves leaving the requirements for each user view as separate
lists of requirements. We create data models representing each user view. A data model that
represents a single user view is called a local logical data model. We then merge the local data
models to create a global logical data model representing all user views of the company.
A diagram representing the management of user views 1 to 3 using the view integration
approach is shown in Figure 4.5. Generally, this approach is preferred when there are significant
differences between user views and the database system is sufficiently complex to justify
dividing the work into more manageable parts.
See Figure 4.5 The view integration approach to managing multiple user views 1 to 3.
For some complex database systems it may be appropriate to use a combination of both the
centralized and view integration approaches to managing multiple user views. For example, the
requirements for two or more user views may be first merged using the centralized approach
and then used to create a local logical data model. (Therefore in this situation the local data
model represents not just a single user view but the number of user views merged using the
centralized approach.) The local data models representing one or more user views are then
merged using the view integration approach to form the global logical data model representing
all user views.
4.7 Explain why it is necessary to select the target DBMS before beginning the physical
database design phase.
Database design is made up of two main phases called logical and physical design. During logical
database design, we identify the important objects that need to be represented in the database
and the relationships between these objects. During physical database design, we decide how
the logical design is to be physically implemented (as tables) in the target DBMS. Therefore it is
necessary to have selected the target DBMS before we are able to proceed to physical
database design.
See Figure 4.1 Stages of the database system development lifecycle.
4.8 Discuss the two main activities associated with application design.
The database and application design stages are parallel activities of the database system
development lifecycle. In most cases, we cannot complete the application design until the design
of the database itself has taken place. On the other hand, the database exists to support the
applications, and so there must be a flow of information between application design and
database design.
The two main activities associated with the application design stage are the design of the user
interface and the design of the application programs that use and process the database.
We must ensure that all the functionality stated in the requirements specifications is present in
the application design for the database system. This involves designing the interaction between
the user and the data, which we call transaction design. In addition to designing how the
required functionality is to be achieved, we have to design an appropriate user interface to the
database system.
4.9 Describe the potential benefits of developing a prototype database system.
The purpose of developing a prototype database system is to allow users to use the prototype to
identify the features of the system that work well, or are inadequate, and if possible to suggest
improvements or even new features for the database system. In this way, we can greatly clarify
the requirements and evaluate the feasibility of a particular system design. Prototypes should
have the major advantage of being relatively inexpensive and quick to build.
4.10 Discuss the main activities associated with the implementation stage.
The database implementation is achieved using the Data Definition Language (DDL) of the
selected DBMS or a graphical user interface (GUI), which provides the same functionality while
hiding the low-level DDL statements. The DDL statements are used to create the database
structures and empty database files. Any specified user views are also implemented at this
stage.
The application programs are implemented using the preferred third or fourth generation
language (3GL or 4GL). Parts of these application programs are the database transactions,
which we implement using the Data Manipulation Language (DML) of the target DBMS, possibly
embedded within a host programming language, such as Visual Basic (VB), VB.net, Python, Delphi,
C, C++, C#, Java, COBOL, Fortran, Ada, or Pascal. We also implement the other components of
the application design such as menu screens, data entry forms, and reports. Again, the target
DBMS may have its own fourth generation tools that allow rapid development of applications
through the provision of non-procedural query languages, reports generators, forms generators,
and application generators.
Security and integrity controls for the application are also implemented. Some of these controls
are implemented using the DDL, but others may need to be defined outside the DDL using, for
example, the supplied DBMS utilities or operating system controls.
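The split between DDL and embedded DML can be sketched with Python as the host language and SQLite as a stand-in target DBMS. Everything named here (the Account table, the AccountNumbers view, the transfer function) is a hypothetical example, not something defined in the text. The DDL creates an empty structure, a user view, and a CHECK constraint acting as an integrity control. The DML is wrapped in a transaction that is rolled back if the constraint is violated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create the (empty) database structures, an integrity control, and a user view.
conn.executescript("""
    CREATE TABLE Account (
        accountNo TEXT NOT NULL PRIMARY KEY,
        balance   INTEGER NOT NULL CHECK (balance >= 0)
    );
    CREATE VIEW AccountNumbers AS SELECT accountNo FROM Account;
""")
rows_after_ddl = conn.execute("SELECT COUNT(*) FROM Account").fetchone()[0]

# DML embedded in the host language: load some rows.
with conn:
    conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A1", 100), ("A2", 50)])


def transfer(source, target, amount):
    """A database transaction: both updates succeed together or neither is kept."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE accountNo = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE accountNo = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = transfer("A1", "A2", 30)    # within the available balance
rejected = transfer("A1", "A2", 500)   # would make A1 negative, so it is rolled back
balances = conn.execute(
    "SELECT accountNo, balance FROM Account ORDER BY accountNo"
).fetchall()
print(rows_after_ddl, accepted, rejected, balances)
```

In the rejected transfer the credit to A2 is applied first and then undone by the rollback, which is why the final balances show only the first transfer.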
4.11 Describe the purpose of the data conversion and loading stage.
This stage is required only when a new database system is replacing an old system. Nowadays,
it's common for a DBMS to have a utility that loads existing files into the new database. The
utility usually requires the specification of the source file and the target database, and then
automatically converts the data to the required format of the new database files. Where
applicable, it may be possible for the developer to convert and use application programs from
the old system for use by the new system.
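What such a loading utility does can be sketched in a few lines of Python. The legacy file layout, the Staff table, and the convert function below are all hypothetical: the old system is assumed to export CSV with day/month/year dates and salaries containing thousands separators, and the new database is assumed to want ISO dates and integer salaries.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical export from the old system (inline here so the sketch is self-contained).
LEGACY_CSV = """staff_no,name,date_joined,salary
S1,Ann Beech,13/02/1998,"12,000"
S2,David Ford,01/11/2001,"18,500"
"""

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (
        staffNo    TEXT NOT NULL PRIMARY KEY,
        name       TEXT NOT NULL,
        dateJoined TEXT NOT NULL,     -- stored as an ISO 8601 date
        salary     INTEGER NOT NULL
    )
""")


def convert(record):
    """Convert one legacy record to the format required by the new table."""
    joined = datetime.strptime(record["date_joined"], "%d/%m/%Y").date().isoformat()
    salary = int(record["salary"].replace(",", ""))
    return (record["staff_no"], record["name"], joined, salary)


reader = csv.DictReader(io.StringIO(LEGACY_CSV))
with conn:  # load everything in one transaction
    conn.executemany("INSERT INTO Staff VALUES (?, ?, ?, ?)", (convert(r) for r in reader))

loaded = conn.execute("SELECT * FROM Staff ORDER BY staffNo").fetchall()
print(loaded)
```

Loading inside a single transaction means a bad record part-way through the file leaves the new table empty rather than half loaded.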
4.12 Explain the purpose of testing the database system.
Before going live, the newly developed database system should be thoroughly tested. This is
achieved using carefully planned test strategies and realistic data so that the entire testing
process is methodically and rigorously carried out. Note that in our definition of testing we have
not used the commonly held view that testing is the process of demonstrating that faults are
not present. In fact, testing cannot show the absence of faults; it can show only that software
faults are present. If testing is conducted successfully, it will uncover errors in the application
programs and possibly the database structure. As a secondary benefit, testing demonstrates
that the database and the application programs appear to be working according to their
specification and that performance requirements appear to be satisfied. In addition, metrics
collected from the testing stage provide a measure of software reliability and software
quality.
As with database design, the users of the new system should be involved in the testing
process. The ideal situation for system testing is to have a test database on a separate
hardware system, but often this is not available. If real data is to be used, it is essential to
have backups taken in case of error.
Testing should also cover usability of the database system. Ideally, an evaluation should be
conducted against a usability specification. Examples of criteria that can be used to conduct the
evaluation include (Sommerville, 2000):
Learnability - How long does it take a new user to become productive with the system?
Performance - How well does the system response match the user's work practice?
Robustness - How tolerant is the system of user error?
Recoverability - How good is the system at recovering from user errors?
Adaptability - How closely is the system tied to a single model of work?
Some of these criteria may be evaluated in other stages of the lifecycle. After testing is
complete, the database system is ready to be 'signed off' and handed over to the users.
4.13 What are the main activities associated with the operational maintenance stage?
In this stage, the database system now moves into a maintenance stage, which involves the
following activities:
Monitoring the performance of the database system. If the performance falls below an
acceptable level, the database may need to be tuned or reorganized.
Maintaining and upgrading the database system (when required). New requirements are
incorporated into the database system through the preceding stages of the lifecycle.
Chapter 5 Database Administration and Security - Review questions
5.1 Define the purpose and tasks associated with data administration and database
administration.
Data administration is the management and control of the corporate data, including database
planning, development and maintenance of standards, policies and procedures, and logical
database design.
Database administration is the management and control of the physical realization of the
corporate database system, including physical database design and implementation, setting
security and integrity controls, monitoring system performance, and reorganizing the database
as necessary.
5.2 Compare and contrast the main tasks carried out by the DA and DBA.
The Data Administrator (DA) and Database Administrator (DBA) are responsible for managing
and controlling the activities associated with the corporate data and the corporate database,
respectively. The DA is more concerned with the early stages of the lifecycle, from planning
through to logical database design. In contrast, the DBA is more concerned with the later
stages, from application/physical database design to operational maintenance. Depending on the
size and complexity of the organization and/or database system the DA and DBA can be the
responsibility of one or more people.
5.3 Explain the purpose and scope of database security.
Security considerations do not only apply to the data held in a database. Breaches of security
may affect other parts of the system, which may in turn affect the database. Consequently,
database security encompasses hardware, software, people, and data. Implementing security
effectively requires appropriate controls, which are defined in specific mission objectives for the
system. This need for security, while often having been neglected or overlooked in the past, is
now increasingly recognized by organizations. This turn-around is due to the
increasing amounts of crucial corporate data being stored on computer and the acceptance that
any loss or unavailability of this data could be potentially disastrous.
5.4 List the main types of threat that could affect a database system, and for each,
describe the possible outcomes for an organization.
Figure 5.1 A summary of the potential threats to computer systems.
5.5 Explain the following in terms of providing security for a database:
authorization; views; backup and recovery; integrity constraints; encryption; RAID.
Authorization
Authorization is the granting of a right or privilege that enables a subject to have legitimate
access to a system or a system's object. Authorization controls can be built into the software,
and govern not only what database system or object a specified user can access, but also what
the user may do with it. The process of authorization involves authentication of a subject
requesting access to an object, where 'subject' represents a user or program and 'object'
represents a database table, view, procedure, trigger, or any other object that can be created
within the database system.
Views
A view is a virtual table that does not necessarily exist in the database but can be produced
upon request by a particular user, at the time of request. The view mechanism provides a
powerful and flexible security mechanism by hiding parts of the database from certain users.
The user is not aware of the existence of any columns or rows that are missing from the view. A
view can be defined over several tables with a user being granted the appropriate privilege to
use it, but not to use the base tables. In this way, using a view is more restrictive than simply
having certain privileges granted to a user on the base table(s).
Backup and recovery
Backup is the process of periodically taking a copy of the database and log file (and possibly
programs) onto offline storage media. A DBMS should provide backup facilities to assist with
the recovery of a database following failure. To keep track of database transactions, the DBMS
maintains a special file called a log file (or journal) that contains information about all updates
to the database. It is always advisable to make backup copies of the database and log file at
regular intervals and to ensure that the copies are in a secure location. In the event of a failure
that renders the database unusable, the backup copy and the details captured in the log file are
used to restore the database to the latest possible consistent state. Journaling is the process
of keeping and maintaining a log file (or journal) of all changes made to the database to enable
recovery to be undertaken effectively in the event of a failure.
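The interplay of backup copy and log file can be sketched with Python's sqlite3 module, whose Connection.backup method copies one database into another. The log below is a deliberately simplified, hypothetical stand-in: it is just a Python list of the statements committed after the backup, whereas a real DBMS log file records far more detail. The Staff table and its data are also assumptions.

```python
import sqlite3

# The live database.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE Staff (staffNo TEXT NOT NULL PRIMARY KEY, name TEXT NOT NULL)")
with live:
    live.execute("INSERT INTO Staff VALUES ('S1', 'Ann')")

# Periodic backup: copy the whole database to a separate store.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simplified log (journal) of every update committed after the backup was taken.
log = []


def logged_execute(sql, params):
    """Apply an update to the live database and record it in the log."""
    with live:
        live.execute(sql, params)
    log.append((sql, params))


logged_execute("INSERT INTO Staff VALUES (?, ?)", ("S2", "Bob"))
logged_execute("UPDATE Staff SET name = ? WHERE staffNo = ?", ("Anne", "S1"))

# A failure renders the live database unusable.
live.close()

# Recovery: restore the backup copy, then reapply the logged updates in order.
recovered = sqlite3.connect(":memory:")
backup.backup(recovered)
with recovered:
    for sql, params in log:
        recovered.execute(sql, params)

backup_rows = backup.execute("SELECT staffNo, name FROM Staff ORDER BY staffNo").fetchall()
recovered_rows = recovered.execute(
    "SELECT staffNo, name FROM Staff ORDER BY staffNo"
).fetchall()
print(backup_rows, recovered_rows)
```

The backup alone would restore only the state at the time it was taken. Replaying the log brings the database forward to the last committed update.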
Integrity constraints
Integrity constraints contribute to maintaining a secure database system by preventing data
from becoming invalid, and hence giving misleading or incorrect results.
Encryption
Is the encoding of the data by a special algorithm that renders the data unreadable by any
program without the decryption key. If a database system holds particularly sensitive data, it
may be deemed necessary to encode it as a precaution against possible external threats or
attempts to access it. Some DBMSs provide an encryption facility for this purpose. The DBMS
can access the data (after decoding it), although there is degradation in performance because
of the time taken to decode it. Encryption also protects data transmitted over communication
lines. There are a number of techniques for encoding data to conceal the information; some are
termed irreversible and others reversible. Irreversible techniques, as the name implies, do not
permit the original data to be known. However, the data can be used to obtain valid statistical
information. Reversible techniques are more commonly used. To transmit data securely over
insecure networks requires the use of a cryptosystem, which includes:
an encryption key to encrypt the data (plaintext);
an encryption algorithm that, with the encryption key, transforms the plaintext into ciphertext;
a decryption key to decrypt the ciphertext;
a decryption algorithm that, with the decryption key, transforms the ciphertext back into
plaintext.
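The four parts of a cryptosystem can be seen in a toy reversible scheme. The sketch below uses a one-time XOR key, which is a swapped-in technique chosen for brevity and not one named in the text. The function names and the sample plaintext are hypothetical, and the code is for illustration only, not for protecting real data.

```python
import secrets

def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Encryption algorithm: XOR each plaintext byte with the matching key byte."""
    if len(key) < len(plaintext):
        raise ValueError("key must be at least as long as the plaintext")
    return bytes(p ^ k for p, k in zip(plaintext, key))

def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """Decryption algorithm: XOR is its own inverse, so the same step decodes."""
    return encrypt(ciphertext, key)

plaintext = b"salary=46000"
# Symmetric scheme: the encryption key and the decryption key are the same value.
key = secrets.token_bytes(len(plaintext))
ciphertext = encrypt(plaintext, key)
recovered = decrypt(ciphertext, key)
```

In this symmetric sketch the same key plays both roles; in an asymmetric cryptosystem the encryption key and decryption key would differ.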
Redundant Array of Independent Disks (RAID)
RAID works by having a large disk array comprising an arrangement of several independent disks
that are organized to improve reliability and at the same time increase performance. The
hardware that the DBMS is running on must be fault-tolerant, meaning that the DBMS should
continue to operate even if one of the hardware components fails. This suggests having
redundant components that can be seamlessly integrated into the working system whenever
there are one or more component failures. The main hardware components that should be
fault-tolerant include disk drives, disk controllers, CPU, power supplies, and cooling fans. Disk
drives are the most vulnerable components, with the shortest times between failures of any of
the hardware components. One solution is the use of RAID technology.
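One common way a disk array achieves this reliability is parity: an extra block holds the XOR of the data blocks, so any single lost block can be rebuilt from the rest. The sketch below assumes this parity approach (as used in some RAID levels) purely for illustration; the text above does not say which RAID level is meant, and the `parity` function and block contents are hypothetical.

```python
from functools import reduce

def parity(blocks):
    """XOR equal-length blocks byte by byte to form a parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, each holding one 4-byte block.
disks = [b"DATA", b"BASE", b"DBMS"]
parity_disk = parity(disks)

# Simulate losing the second disk and rebuilding it from the survivors plus parity.
survivors = [disks[0], disks[2], parity_disk]
rebuilt = parity(survivors)
```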
Chapter 6 Fact-Finding - Review questions
6.1 Briefly describe what the process of fact-finding attempts to achieve for a database
developer.
Fact-finding is the formal process of using techniques such as interviews and questionnaires to
collect facts about systems, requirements, and preferences.
The database developer uses fact-finding techniques at various stages throughout the database
systems lifecycle to capture the necessary facts to build the required database system. The
necessary facts cover the business and the users of the database system, including the
terminology, problems, opportunities, constraints, requirements, and priorities. These facts are
captured using fact-finding techniques.
6.2 Describe how fact-finding is used throughout the stages of the database system
development lifecycle.
There are many occasions for fact-finding during the database system development lifecycle.
However, fact-finding is particularly crucial to the early stages of the lifecycle, including the
database planning, system definition, and requirements collection and analysis stages. It's during
these early stages that the database developer learns about the terminology, problems,
opportunities, constraints, requirements, and priorities of the business and the users of the
system. Fact-finding is also used during database design and the later stages of the lifecycle,
but to a lesser extent. For example, during physical database design, fact-finding becomes
technical as the developer attempts to learn more about the DBMS selected for the database
system. Also, during the final stage, operational maintenance, fact-finding is used to determine
whether a system requires tuning to improve performance or further development to include new
requirements.
6.3 For each stage of the database system development lifecycle identify examples of the
facts captured and the documentation produced.
6.4 A database developer normally uses several fact-finding techniques during a single
database project. The five most commonly used techniques are examining
documentation, interviewing, observing the business in operation, conducting research,
and using questionnaires. Describe each fact-finding technique and identify the
advantages and disadvantages of each.
Examining documentation can be useful when you're trying to gain some insight as to how the
need for a database arose. You may also find that documentation can be helpful to provide
information on the business (or part of the business) associated with the problem. If the
problem relates to the current system there should be documentation associated with that
system. Examining documents, forms, reports, and files associated with the current system is a
good way to quickly gain some understanding of the system.
Interviewing is the most commonly used, and normally most useful, fact-finding technique. You
can interview to collect information from individuals face-to-face. There can be several
objectives to using interviewing such as finding out facts, checking facts, generating user
interest and feelings of involvement, identifying requirements, and gathering ideas and opinions.
Observation is one of the most effective fact-finding techniques you can use to understand a
system. With this technique, you can either participate in, or watch a person perform, activities
to learn about the system. This technique is particularly useful when the validity of data
collected through other methods is in question or when the complexity of certain aspects of the
system prevents a clear explanation by the end-users.
A useful fact-finding technique is to research the application and problem. Computer trade
journals, reference books, and the Internet are good sources of information. They can provide
you with information on how others have solved similar problems, plus you can learn whether or
not software packages exist to solve your problem.
Another fact-finding technique is to conduct surveys through questionnaires. Questionnaires
are special-purpose documents that allow you to gather facts from a large number of people
while maintaining some control over their responses. When dealing with a large audience, no
other fact-finding technique can tabulate the same facts as efficiently.
6.5 Describe the purpose of defining a mission statement and mission objectives for a
database system.
The mission statement defines the major aims of the database system. Those driving the
database project within the business (such as the Director and/or owner) normally define the
mission statement. A mission statement helps to clarify the purpose of the database project
and provides a clearer path towards the efficient and effective creation of the required
database system.
Once the mission statement is defined, the next activity involves identifying the mission
objectives. Each mission objective should identify a particular task that the database must
support. The assumption is that if the database supports the mission objectives then the
mission statement should be met. The mission statement and objectives may be accompanied
with additional information that specifies, in general terms, the work to be done, the resources
with which to do it, and the money to pay for it all.
6.6 What is the purpose of the systems definition stage?
The purpose of the system definition stage is to identify the scope and boundary of the
database system and its major user views. Defining the scope and boundary of the database
system helps to identify the main types of data mentioned in the interviews and a rough guide as
to how this data is related. A user view represents the requirements that should be supported
by a database system as defined by a particular job role (such as Manager or Assistant) or
business application area (such as video rentals or stock control).
6.7 How do the contents of a users' requirements specification differ from a systems
specification?
There are two main documents created during the requirements collection and analysis stage,
namely the users' requirements specification and the systems specification.
The users' requirements specification describes in detail the data to be held in the database
and how the data is to be used.
The systems specification describes any features to be included in the database system such as
the required performance and the levels of security.
6.8 Describe one approach to deciding whether to use centralized, view integration, or a
combination of both when developing a database system for multiple user views.
One way to help you make a decision whether to use the centralized, view integration, or a
combination of both approaches to manage multiple user views is to examine the overlap in terms
of the data used between the user views identified during the system definition stage.
It's difficult to give precise rules as to when it's appropriate to use the centralized or view
integration approaches. As the database developer, you should base your decision on an
assessment of the complexity of the database system and the degree of overlap between the
various user views. However, whether you use the centralized or view integration approach or a
mixture of both to build the underlying database, ultimately you need to create the original user
views for the working database system.
Chapter 7 Entity-Relationship Modeling - Review questions
7.1 Describe what entities represent in an ER model and provide examples of entities with
a physical or conceptual existence.
Entity is a set of objects with the same properties, which are identified by a user or company as
having an independent existence. Each object, which should be uniquely identifiable within the
set, is called an entity occurrence. An entity has an independent existence and can represent
objects with a physical (or 'real') existence or objects with a conceptual (or 'abstract')
existence.
7.2 Describe what relationships represent in an ER model and provide examples of unary,
binary, and ternary relationships.
Relationship is a set of meaningful associations among entities. As with entities, each association
should be uniquely identifiable within the set. A uniquely identifiable association is called a
relationship occurrence. Each relationship is given a name that describes its function. For
example, the Actor entity is associated with the Role entity through a relationship called Plays,
and the Role entity is associated with the Video entity through a relationship called Features.
The entities involved in a particular relationship are referred to as participants. The number of
participants in a relationship is called the degree and indicates the number of entities involved
in a relationship. A relationship of degree one is called unary, which is commonly referred to as
a recursive relationship. A unary relationship describes a relationship where the same entity
participates more than once in different roles. An example of a unary relationship is Supervises,
which represents an association of staff with a supervisor where the supervisor is also a
member of staff. In other words, the Staff entity participates twice in the Supervises
relationship; the first participation as a supervisor, and the second participation as a member of
staff who is supervised (supervisee). See Figure 7.5 for a diagrammatic representation of the
Supervises relationship.
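One way a unary (recursive) relationship such as Supervises might later be realized in tables is a column that refers back to the same table. The sketch below uses Python's sqlite3 module; the column names `staffNo`, `name`, and `supervisorNo` and the sample rows are assumptions made for this example, not details taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """
    CREATE TABLE Staff (
        staffNo      TEXT PRIMARY KEY,
        name         TEXT NOT NULL,
        supervisorNo TEXT REFERENCES Staff(staffNo)  -- Staff in the supervisor role
    )
    """
)
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [
        ("S1", "Supervisor A", None),
        ("S2", "Assistant B", "S1"),
        ("S3", "Assistant C", "S1"),
    ],
)
# Staff participates twice in the join: once as supervisee, once as supervisor.
rows = conn.execute(
    """
    SELECT supervisee.name, supervisor.name
    FROM Staff AS supervisee
    JOIN Staff AS supervisor ON supervisee.supervisorNo = supervisor.staffNo
    ORDER BY supervisee.staffNo
    """
).fetchall()
```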
A relationship of degree two is called binary.
A relationship of a degree higher than binary is called a complex relationship. A relationship of
degree three is called ternary. An example of a ternary relationship is Registers with three
participating entities, namely Branch, Staff, and Member. The purpose of this relationship is to
represent the situation where a member of staff registers a member at a particular branch,
allowing for members to register at more than one branch, and members of staff to move
between branches.
Figure 7.4 Example of a ternary relationship called Registers.
7.3 Describe what attributes represent in an ER model and provide examples of simple,
composite, single-value, multi-value, and derived attributes.
An attribute is a property of an entity or a relationship.
Attributes represent what we want to know about entities. For example, a Video entity may
be described by the catalogNo, title, category, dailyRental, and price attributes. These
attributes hold values that describe each video occurrence, and represent the main source of
data stored in the database.
Simple attribute is an attribute composed of a single component. Simple attributes cannot be
further subdivided. Examples of simple attributes include the category and price attributes for
a video.
Composite attribute is an attribute composed of multiple components. Composite attributes can
be further divided to yield smaller components with an independent existence. For example, the
name attribute of the Member entity with the value 'Don Nelson' can be subdivided into fName
('Don') and lName ('Nelson').
Single-valued attribute is an attribute that holds a single value for an entity occurrence. The
majority of attributes are single-valued for a particular entity. For example, each occurrence of
the Video entity has a single value for the catalogNo attribute (for example, 207132), and
therefore the catalogNo attribute is referred to as being single-valued.
Multi-valued attribute is an attribute that holds multiple values for an entity occurrence. Some
attributes have multiple values for a particular entity. For example, each occurrence of the
Video entity may have multiple values for the category attribute (for example, 'Children' and
'Comedy'), and therefore the category attribute in this case would be multi-valued. A
multi-valued attribute may have a set of values with specified lower and upper limits. For
example, the category attribute may have between one and three values.
Derived attribute is an attribute that represents a value that is derivable from the value of a
related attribute, or set of attributes, not necessarily in the same entity. Some attributes may
be related for a particular entity. For example, the age of a member of staff (age) is derivable
from the date of birth (DOB) attribute, and therefore the age and DOB attributes are related.
We refer to the age attribute as a derived attribute, the value of which is derived from the
DOB attribute.
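The idea of a derived attribute can be sketched in Python: only the date of birth is stored, and age is computed on request. The `Staff` class, its field names, and the sample dates are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Staff:
    name: str
    dob: date  # stored attribute (DOB)

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob rather than stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)

member = Staff("Example Person", date(1980, 6, 15))
age_before_birthday = member.age(date(2024, 6, 14))
age_on_birthday = member.age(date(2024, 6, 15))
```

Because age is never stored, it cannot drift out of step with DOB as time passes.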
7.4 Describe what multiplicity represents for a relationship.
Multiplicity is the number of occurrences of one entity that may relate to a single occurrence of
an associated entity.
7.5 What are business rules and how does multiplicity model these constraints?
Multiplicity constrains the number of entity occurrences that relate to other entity
occurrences through a particular relationship. Multiplicity is a representation of the policies
established by the user or company, and is referred to as a business rule. Ensuring that all
appropriate business rules are identified and represented is an important part of modeling a
company.
The multiplicity for a binary relationship is generally referred to as one-to-one (1:1),
one-to-many (1:*), or many-to-many (*:*). Examples of the three types of relationships include:
A member of staff manages a branch.
A branch has members of staff.
Actors play in videos.
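The three examples above can be sketched as table definitions to show how each multiplicity might eventually be enforced. This is only one possible mapping, written with Python's sqlite3 module; the table `Manages`, the linking table `PlaysIn`, the key column names, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY);
    -- 1:* Branch Has Staff: each member of staff belongs to one branch.
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY,
        branchNo TEXT NOT NULL REFERENCES Branch(branchNo)
    );
    -- 1:1 Staff Manages Branch: UNIQUE stops one person managing two branches,
    -- and the primary key stops a branch having two managers.
    CREATE TABLE Manages (
        branchNo TEXT PRIMARY KEY REFERENCES Branch(branchNo),
        staffNo  TEXT NOT NULL UNIQUE REFERENCES Staff(staffNo)
    );
    CREATE TABLE Actor (actorNo TEXT PRIMARY KEY);
    CREATE TABLE Video (catalogNo TEXT PRIMARY KEY);
    -- *:* Actors play in Videos: a linking table holds one row per pairing.
    CREATE TABLE PlaysIn (
        actorNo   TEXT REFERENCES Actor(actorNo),
        catalogNo TEXT REFERENCES Video(catalogNo),
        PRIMARY KEY (actorNo, catalogNo)
    );
    INSERT INTO Branch VALUES ('B1'), ('B2');
    INSERT INTO Staff VALUES ('S1', 'B1'), ('S2', 'B1');
    INSERT INTO Manages VALUES ('B1', 'S1');
    """
)

def same_manager_for_second_branch_allowed():
    """Try to make S1 manage a second branch; the 1:1 rule should reject it."""
    try:
        conn.execute("INSERT INTO Manages VALUES ('B2', 'S1')")
        return True
    except sqlite3.IntegrityError:
        return False

staff_at_b1 = conn.execute(
    "SELECT COUNT(*) FROM Staff WHERE branchNo = 'B1'"
).fetchone()[0]
second_branch_allowed = same_manager_for_second_branch_allowed()
```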
7.6 How does multiplicity represent both the cardinality and the participation constraints
on a relationship?
Multiplicity actually consists of two separate constraints known as cardinality and participation.
Cardinality describes the number of possible relationships for each participating entity.
Participation determines whether all or only some entity occurrences participate in a
relationship. The cardinality of a binary relationship is what we have been referring to as
one-to-one, one-to-many, and many-to-many. A participation constraint represents whether all
entity occurrences are involved in a particular relationship (mandatory participation) or only
some (optional participation). The cardinality and participation constraints for the Staff Manages
Branch relationship are shown in Figure 7.11.
7.7 Provide an example of a relationship with attributes.
An example of a relationship with an attribute is the relationship called PlaysIn, which
associates the Actor and Video entities. We may wish to record the character played by an
actor in a given video. This information is associated with the PlaysIn relationship rather than
the Actor or Video entities. We create an attribute called character to store this information
and assign it to the PlaysIn relationship, as illustrated in Figure 7.12. Note, in this figure the
character attribute is shown using the symbol for an entity; however, to distinguish between a
relationship with an attribute and an entity, the rectangle representing the attribute is
associated with the relationship using a dashed line.
Figure 7.12 A relationship called PlaysIn with an attribute called character.
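To make the idea concrete, the sketch below stores the character attribute on the PlaysIn relationship itself rather than on Actor or Video. It uses Python's sqlite3 module; the key columns, names, titles, and character values are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Actor (actorNo TEXT PRIMARY KEY, actorName TEXT);
    CREATE TABLE Video (catalogNo TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE PlaysIn (
        actorNo   TEXT REFERENCES Actor(actorNo),
        catalogNo TEXT REFERENCES Video(catalogNo),
        character TEXT,            -- attribute of the relationship itself
        PRIMARY KEY (actorNo, catalogNo)
    );
    INSERT INTO Actor VALUES ('A1', 'Actor One');
    INSERT INTO Video VALUES ('V1', 'Video One'), ('V2', 'Video Two');
    INSERT INTO PlaysIn VALUES ('A1', 'V1', 'Detective'), ('A1', 'V2', 'Pilot');
    """
)
# The same actor has a different character value for each video played in.
characters = conn.execute(
    "SELECT title, character FROM PlaysIn JOIN Video USING (catalogNo) "
    "WHERE actorNo = 'A1' ORDER BY catalogNo"
).fetchall()
```

The character value depends on the pairing of an actor with a video, which is why it belongs to neither entity alone.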
7.8 Describe how strong and weak entities differ and provide an example of each.
We can classify entities as being either strong or weak. A strong entity is not dependent on the
existence of another entity for its primary key. A weak entity is partially or wholly dependent
on the existence of another entity, or entities, for its primary key. For example, as we can
distinguish one actor from all other actors and one video from all other videos without the
existence of any other entity, Actor and Video are referred to as being strong entities. In
other words, the Actor and Video entities are strong because they have their own primary keys.
An example of a weak entity is Role, which represents characters played by actors in videos.
If we are unable to uniquely identify one Role entity occurrence from another without the
existence of the Actor and Video entities, then Role is referred to as being a weak entity. In
other words, the Role entity is weak because it has no primary key of its own.
Figure 7.6 Diagrammatic representation of attributes for the Video, Role, and Actor entities.
Strong entities are sometimes referred to as parent, owner, or dominant entities and weak
entities as child, dependent, or subordinate entities.
7.9 Describe how fan and chasm traps can occur in an ER model and how they can be
resolved.
Fan and chasm traps are two types of connection traps that can occur in ER models. The traps
normally occur due to a misinterpretation of the meaning of certain relationships. In general, to
identify connection traps we must ensure that the meaning of a relationship (and the business
rule that it represents) is fully understood and clearly defined. If we don't understand the
relationships we may create a model that is not a true representation of the 'real world'.
A fan trap may occur when two entities have 1:* relationships that fan out from a third entity,
but the two entities should have a direct relationship between them to provide the necessary
information. A fan trap may be resolved through the addition of a direct relationship between
the two entities that were originally separated by the third entity.
A chasm trap may occur when an ER model suggests the existence of a relationship between
entities, but the pathway does not exist between certain entity occurrences. More specifically,
a chasm trap may occur where there is a relationship with optional participation that forms part
of the pathway between the entities that are related. Again, a chasm trap may be resolved by
the addition of a direct relationship between the two entities that were originally related
through a pathway that included optional participation.
Chapter 8 Normalization - Review questions
8.1 Discuss how normalization may be used in database design.
Normalization can be used in database design in two ways: the first is to use normalization as a
bottom-up approach to database design; the second is to use normalization in conjunction with
ER modeling.
Using normalization as a bottom-up approach involves analyzing the associations between
attributes and, based on this analysis, grouping the attributes together to form tables that
represent entities and relationships. However, this approach becomes difficult with a large
number of attributes, where it's difficult to establish all the important associations between
the attributes. Alternatively, you can use a top-down approach to database design. In this
approach, we use ER modeling to create a data model that represents the main entities and
relationships. We then translate the ER model into a set of tables that represents this data.
It's at this point that we use normalization to check whether the tables are well designed.
8.2 Describe the types of update anomalies that may occur on a table that has redundant
data.
Tables that have redundant data may have problems called update anomalies, which are
classified as insertion, deletion, or modification anomalies. See Figure 8.1 for an example of a
table with redundant data called StaffBranch. There are two main types of insertion anomalies,
which we illustrate using this table.
Insertion anomalies
(1) To insert the details of a new member of staff located at a given branch into the
StaffBranch table, we must also enter the correct details for that branch. For example, to
insert the details of a new member of staff at branch B002, we must enter the correct
details of branch B002 so that the branch details are consistent with values for branch
B002 in other records of the StaffBranch table. The data shown in the StaffBranch table
is also shown in the Staff and Branch tables shown in Figure 8.2. These tables do have
redundant data and do not suffer from this potential inconsistency, because for each staff
member we only enter the appropriate branch number into the Staff table. In addition, the
details of branch B002 are recorded only once in the database as a single record in the
Branch table.
(2) To insert details of a new branch that currently has no members of staff into the
StaffBranch table, it's necessary to enter nulls into the staff-related columns, such as
staffNo. However, as staffNo is the primary key for the StaffBranch table, attempting to
enter nulls for staffNo violates entity integrity, and is not allowed. The design of the tables
shown in Figure 8.2 avoids this problem because new branch details are entered into the
Branch table separately from the staff details. The details of staff ultimately located at a
new branch can be entered into the Staff table at a later date.
Deletion anomalies
If we delete a record from the StaffBranch table that represents the last member of staff
located at a branch, the details about that branch are also lost from the database. For example,
if we delete the record for staff Art Peters (S0415) from the StaffBranch table, the details
relating to branch B003 are lost from the database. The design of the tables in Figure 8.2
avoids this problem because branch records are stored separately from staff records and only
the column branchNo relates the two tables. If we delete the record for staff Art Peters
(S0415) from the Staff table, the details on branch B003 in the Branch table remain
unaffected.
Modification anomalies
If we want to change the value of one of the columns of a particular branch in the StaffBranch
table, for example the telephone number for branch B001, we must update the records of all
staff located at that branch. If this modification is not carried out on all the appropriate
records of the StaffBranch table, the database will become inconsistent. In this example,
branch B001 would have different telephone numbers in different staff records.
The above examples illustrate that the Staff and Branch tables of Figure 8.2 have more
desirable properties than the StaffBranch table of Figure 8.1. In the following sections, we
examine how normal forms can be used to formalize the identification of tables that have
desirable properties from those that may potentially suffer from update anomalies.
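The modification and deletion anomalies can be reproduced on a cut-down table. The sketch below uses Python's sqlite3 module; the three-column `StaffBranch` table, the staff and branch numbers, and the telephone numbers are simplified, hypothetical stand-ins and not the contents of the figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE StaffBranch (staffNo TEXT PRIMARY KEY, branchNo TEXT, telNo TEXT);
    INSERT INTO StaffBranch VALUES
        ('S1', 'B1', '555-0100'),
        ('S2', 'B1', '555-0100'),
        ('S3', 'B2', '555-0200');
    """
)

# Modification anomaly: update the branch telephone number in only ONE
# of the staff records for branch B1.
conn.execute("UPDATE StaffBranch SET telNo = '555-0199' WHERE staffNo = 'S1'")
numbers_for_b1 = conn.execute(
    "SELECT COUNT(DISTINCT telNo) FROM StaffBranch WHERE branchNo = 'B1'"
).fetchone()[0]

# Deletion anomaly: removing the last member of staff at B2 also removes
# every trace of branch B2.
conn.execute("DELETE FROM StaffBranch WHERE staffNo = 'S3'")
b2_rows = conn.execute(
    "SELECT COUNT(*) FROM StaffBranch WHERE branchNo = 'B2'"
).fetchone()[0]
```

With separate Staff and Branch tables, the telephone number would be held in a single Branch record, so neither problem could arise.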
8.3 Describe the characteristics of a table that violates first normal form (1NF) and then
describe how such a table is converted to 1NF.
The rule for first normal form (1NF) is a table in which the intersection of every column and
record contains only one value. In other words, a table that contains more than one atomic value
in the intersection of one or more columns for one or more records is not in 1NF. A non-1NF
table can be converted to 1NF by restructuring the original table: the column with the
multiple values is removed, along with a copy of the primary key, to create a new table. See
Figure 8.4 for an example of this approach. The advantage of this approach is that the resultant
tables may be in normal forms later than 1NF.
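The conversion just described can be sketched in a few lines of Python: the multi-valued column is moved, together with a copy of the primary key, into a new table. The function `to_first_normal_form` and the sample video rows are hypothetical and are not taken from the figure.

```python
def to_first_normal_form(rows, key, multi_valued):
    """Split a multi-valued column out of a table.

    Returns (base, extracted): the original rows without the multi-valued
    column, and a new table with one (key, value) row per value.
    """
    base, extracted = [], []
    for row in rows:
        base.append({c: v for c, v in row.items() if c != multi_valued})
        for value in row[multi_valued]:
            extracted.append({key: row[key], multi_valued: value})
    return base, extracted

# Non-1NF table: the category column holds more than one value per record.
videos = [
    {"catalogNo": "V1", "title": "Video One", "category": ["Children", "Comedy"]},
    {"catalogNo": "V2", "title": "Video Two", "category": ["Action"]},
]
video_table, category_table = to_first_normal_form(videos, "catalogNo", "category")
```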
8.4 What is the minimal normal form that a relation must satisfy? Provide a definition for
this normal form.
Only first normal form (1NF) is critical in creating appropriate tables for relational databases.
All the subsequent normal forms are optional. However, to avoid the update anomalies discussed
in Section 8.2, it's normally recommended that you proceed to third normal form (3NF).
First normal form (1NF) is a table in which the intersection of every column and record
contains only one value.
8.5 Describe an approach to converting a first normal form (1NF) table to second normal
form (2NF) table(s).
Second normal form applies only to tables with composite primary keys, that is, tables with a
primary key composed of two or more columns. A 1NF table with a single column primary key is
automatically in at least 2NF.
A second normal form (2NF) table is a table that is already in 1NF and in which the values in
each non-primary-key column can be worked out from the values in all the columns that make up
the primary key.
A table in 1NF can be converted into 2NF by removing the columns that can be worked out from
only part of the primary key. These columns are placed in a new table along with a copy of the
part of the primary key that they can be worked out from.
8.6 Describe the characteristics of a table in second normal form (2NF).
Second normal form (2NF) is a table that is already in 1NF and in which the values in each
non-primary-key column can only be worked out from the values in all the columns that make up
the primary key.
8.7 Describe what is meant by full functional dependency and describe how this type of
dependency relates to 2NF. Provide an example to illustrate your answer.
The formal definition of second normal form (2NF) is a table that is in first normal form and
every non-primary-key column is fully functionally dependent on the primary key. Full functional
dependency indicates that if A and B are columns of a table, B is fully functionally dependent on
A, if B is not dependent on any subset of A. If B is dependent on a subset of A, this is referred
to as a partial dependency. If a partial dependency exists on the primary key, the table is not in
2NF. The partial dependency must be removed for a table to achieve 2NF.
See Section 8.4 for an example.
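A partial dependency can be looked for mechanically in sample data. The Python sketch below tests whether any proper subset of a composite key already determines a column. The functions `determines` and `partial_dependencies`, the column names, and the rows are all hypothetical. Note that sample rows can only suggest a dependency; whether it truly holds depends on the meaning of the data.

```python
from itertools import combinations

def determines(rows, lhs, rhs):
    """True if, in these rows, equal values on lhs always imply equal values on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        if seen.setdefault(left, row[rhs]) != row[rhs]:
            return False
    return True

def partial_dependencies(rows, key, column):
    """Proper, non-empty subsets of the key that already determine the column."""
    return [
        subset
        for size in range(1, len(key))
        for subset in combinations(key, size)
        if determines(rows, subset, column)
    ]

# Hypothetical table with the composite primary key (staffNo, branchNo).
rows = [
    {"staffNo": "S1", "branchNo": "B1", "branchAddress": "1 Main St", "hoursPerWeek": 10},
    {"staffNo": "S1", "branchNo": "B2", "branchAddress": "2 High St", "hoursPerWeek": 20},
    {"staffNo": "S2", "branchNo": "B1", "branchAddress": "1 Main St", "hoursPerWeek": 30},
]
key = ("staffNo", "branchNo")
address_partials = partial_dependencies(rows, key, "branchAddress")
hours_partials = partial_dependencies(rows, key, "hoursPerWeek")
```

Here branchAddress depends on branchNo alone, a partial dependency, so this table would not be in 2NF, while hoursPerWeek needs the whole key and is fully functionally dependent on it.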
8.8 Describe the characteristics of a table in third normal form (3NF).
Third normal form (3NF) is a table that is already in 1NF and 2NF, and in which the values in
all non-primary-key columns can be worked out from only the primary key (or candidate key)
column(s) and no other columns.
8.9 Describe what is meant by transitive dependency and describe how this type of
dependency relates to 3NF. Provide an example to illustrate your answer.
The formal definition for third normal form (3NF) is a table that is in first and second normal
forms and in which no non-primary-key column is transitively dependent on the primary key.
Transitive dependency is a type of functional dependency that occurs when a particular type of
relationship holds between columns of a table. For example, consider a table with columns A, B,
and C. If B is functionally dependent on A (A → B) and C is functionally dependent on B (B → C),
then C is transitively dependent on A via B (provided that A is not functionally dependent on B
or C). If a transitive dependency exists on the primary key, the table is not in 3NF. The
transitive dependency must be removed for a table to achieve 3NF.
See Section 8.5 for an example.
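Given a list of declared functional dependencies, the A → B → C pattern can be searched for directly. The Python sketch below handles only single-column dependencies; the function `transitive_dependencies`, the column names, and the dependencies listed are assumptions made for this example.

```python
def transitive_dependencies(fds, key):
    """Return (via, column) pairs where key -> via -> column.

    fds: set of (determinant, dependent) pairs, each a single column name.
    Pairs are skipped when via or column determines the key, matching the
    proviso that A must not be functionally dependent on B or C.
    """
    found = []
    for left, via in sorted(fds):
        if left != key or (via, key) in fds:
            continue
        for left2, column in sorted(fds):
            if left2 == via and column != key and (column, key) not in fds:
                found.append((via, column))
    return found

# Hypothetical dependencies for a table whose primary key is staffNo.
fds = {
    ("staffNo", "name"),
    ("staffNo", "branchNo"),
    ("branchNo", "branchAddress"),
    ("branchNo", "telNo"),
}
result = transitive_dependencies(fds, "staffNo")
```

Both branchAddress and telNo depend on staffNo only via branchNo, so moving them to a separate table keyed on branchNo would remove the transitive dependencies.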
Chapter 9 Logical Database Design - Step 1 - Review questions
9.1 Describe the purpose of a design methodology.
A design methodology is a structured approach that uses procedures, techniques, tools, and
documentation aids to support and facilitate the process of design.
9.2 Describe the main phases involved in database design.
Database design is made up of two main phases: logical and physical database design.
Logical database design is the process of constructing a model of the data used in a company
based on a specific data model, but independent of a particular DBMS and other physical
considerations. In the logical database design phase we build the logical representation of the
database, which includes identification of the important entities and relationships, and then
translate this representation to a set of tables. The logical data model is a source of information
for the physical design phase, providing the physical database designer with a vehicle for making
tradeoffs that are very important to the design of an efficient database.
Physical database design is the process of producing a description of the implementation of
the database on secondary storage; it describes the base tables, file organizations, and indexes
used to achieve efficient access to the data, and any associated integrity constraints and
security restrictions. In the physical database design phase we decide how the logical design is
to be physically implemented in the target relational DBMS. This phase allows the designer to
make decisions on how the database is to be implemented. Therefore, physical design is tailored
to a specific DBMS.
9.3 Identify important factors in the success of database design.
The following are important factors to the success of database design:
Work interactively with the users as much as possible.
Follow a structured methodology throughout the data modeling process.
Employ a data-driven approach.
Incorporate structural and integrity considerations into the data models.
Use normalization and transaction validation techniques in the methodology.
Use diagrams to represent as much of the data models as possible.
Use a database design language (DBDL).
Build a data dictionary to supplement the data model diagrams.
Be willing to repeat steps.
9.4 Discuss the important role played by users in the process of database design.
Users play an essential role in confirming that the logical database design is meeting their
requirements. Logical database design is made up of two steps and at the end of each step
(Steps 1.9 and 2.5) users are required to review the design and provide feedback to the
designer. Once the logical database design has been 'signed off' by the users the designer can
continue to the physical database design stage.
9.5 Discuss the main activities associated with each step of the logical database design
methodology.
The logical database design phase of the methodology is divided into two main steps.
In Step 1 we create a data model and check that the data model has minimal redundancy and
is capable of supporting user transactions. The output of this step is the creation of a
logical data model, which is a complete and accurate representation of the company (or part
of the company) that is to be supported by the database.
In Step 2 we map the ER model to a set of tables. The structure of each table is checked
using normalization. Normalization is an effective means of ensuring that the tables are
structurally consistent, logical, with minimal redundancy. The tables are also checked to
ensure that they are capable of supporting the required transactions. The required integrity
constraints on the database are also defined.
A.- !iscuss the main activities associated with each step of the physical database desi"n
1hysical database design is divided into six main steps:
Step 5 involves the design of the base tables and integrity constraints using the available
functionality of the target D,-S$
Step 7 involves choosing the file organi#ations and indexes for the base tables$ .ypically&
D,-Ss provide a number of alternative file organi#ations for data& !ith the exception of 14
D,-Ss& !hich tend to have a fixed storage structure$
Database Solutions (2
Step 5 involves the design of the user views originally identified in the requirements
analysis and collection stage of the database system development lifecycle.
Step 6 involves designing the security measures to protect the data from unauthorized
access.
Step 7 considers relaxing the normalization constraints imposed on the tables to improve
the overall performance of the system. This is a step that you should undertake only if
necessary, because of the inherent problems involved in introducing redundancy while still
maintaining consistency.
Step 8 is an ongoing process of monitoring and tuning the operational system to identify and
resolve any performance problems resulting from the design and to implement new or
changing requirements.
9.7 Discuss the purpose of Step 1 of logical database design.
The purpose of Step 1 is to build a logical data model of the data requirements of a company (or part
of a company) to be supported by the database.
Each logical data model comprises:
attributes and attribute domains,
primary keys and alternate keys,
integrity constraints.
The logical data model is supported by documentation, including a data dictionary and ER
diagrams, which you'll produce throughout the development of the model.
9.8 Identify the main tasks associated with Step 1 of logical database design.
Step 1 Create and check ER model
Step 1.1 Identify entities
Step 1.2 Identify relationships
Step 1.3 Identify and associate attributes with entities or relationships
Step 1.4 Determine attribute domains
Step 1.5 Determine candidate, primary, and alternate key attributes
Step 1.6 Specialize/Generalize entities (optional step)
Step 1.7 Check model for redundancy
Step 1.8 Check model supports user transactions
Step 1.9 Review model with users
9.9 Discuss an approach to identifying entities and relationships from a users' requirements
specification.
Identifying entities
One method of identifying entities is to examine the users' requirements specification. From
this specification, you can identify nouns or noun phrases that are mentioned (for example,
staff number, staff name, catalog number, title, daily rental rate, purchase price). You should
also look for major objects such as people, places, or concepts of interest, excluding those
nouns that are merely qualities of other objects.
For example, you could group staff number and staff name with an entity called Staff and group
catalog number, title, daily rental rate, and purchase price with an entity called Video.
An alternative way of identifying entities is to look for objects that have an existence in their
own right. For example, Staff is an entity because staff exists whether or not you know their
names, addresses, and salaries. If possible, you should get the user to assist with this activity.
Identifying relationships
Having identified the entities, the next step is to identify all the relationships that exist
between these entities. When you identify entities, one method is to look for nouns in the users'
requirements specification. Again, you can use the grammar of the requirements specification to
identify relationships. Typically, relationships are indicated by verbs or verbal expressions. For
example:
Branch Has Staff
Branch IsAllocated VideoForRent
VideoForRent IsPartOf RentalAgreement
The fact that the users' requirements specification records these relationships suggests that
they are important to the users, and should be included in the model.
Take great care to ensure that all the relationships that are either explicit or implicit in the
users' requirements specification are noted. In principle, it should be possible to check each pair
of entities for a potential relationship between them, but this would be a daunting task for a
large system comprising hundreds of entities. On the other hand, it's unwise not to perform
some such check. However, missing relationships should become apparent when you check the
model supports the transactions that the users require. On the other hand, it is possible that an
entity can have no relationship with other entities in the database but still play an important
part in meeting the user's requirements.
9.10 Discuss an approach to identifying attributes from a users' requirements
specification and the association of attributes with entities or relationships.
In a similar way to identifying entities, look for nouns or noun phrases in the users' requirements
specification. The attributes can be identified where the noun or noun phrase is a property,
quality, identifier, or characteristic of one of the entities or relationships that you've previously
identified.
By far the easiest thing to do when you've identified an entity or a relationship in the users'
requirements specification is to consider "What information are we required to hold on . . .?".
The answer to this question should be described in the specification. However, in some cases,
you may need to ask the users to clarify the requirements. Unfortunately, they may give you
answers that also contain other concepts, so users' responses must be carefully considered.
9.11 Discuss an approach to checking a data model for redundancy. Give an example to
illustrate your answer.
There are three approaches to identifying whether a data model suffers from redundancy:
(1) re-examining one-to-one (1:1) relationships;
(2) removing redundant relationships;
(3) considering the time dimension when assessing redundancy.
However, to answer this question you need only describe one approach. We describe approach (1).
An example of approach (1)
In the identification of entities, you may have identified two entities that represent the same
object in the company. For example, you may have identified two entities named Branch and
Outlet that are actually the same; in other words, Branch is a synonym for Outlet. In this case,
the two entities should be merged together. If the primary keys are different, choose one of
them to be the primary key and leave the other as an alternate key.
9.12 Describe two approaches to checking that a logical data model supports the
transactions required by the user.
The two possible approaches to ensuring that the logical data model supports the required
transactions are:
(1) Describing the transaction
Using the first approach, you check that all the information (entities, relationships, and their
attributes) required by each transaction is provided by the model, by documenting a description
of each transaction's requirements.
(2) Using transaction pathways
The second approach to validating the data model against the required transactions involves
representing the pathway taken by each transaction directly on the ER diagram. Clearly, the
more transactions that exist, the more complex this diagram would become, so for readability
you may need several such diagrams to cover all the transactions.
9.13 Identify and describe the purpose of the documentation generated during Step 1 of
logical database design.
Document entities
The data dictionary describes the entities including the entity name, description, aliases, and
Figure 9.2 Extract from the data dictionary for the Branch user views of StayHome showing a
description of entities.
ER diagrams
Throughout the database design phase, ER diagrams are used whenever necessary, to help build
up a picture of what you're attempting to model. Different people use different notations for
ER diagrams. In this book, we've used the latest object-oriented notation called UML (Unified
Modeling Language), but other notations perform a similar function.
Document relationships
As you identify relationships, assign them names that are meaningful and obvious to the user,
and also record relationship descriptions and the multiplicity constraints in the data dictionary.
Figure 9.7 Extract from the data dictionary for the Branch user views of StayHome
showing descriptions of relationships.
Document attributes
As you identify attributes, assign them names that are meaningful and obvious to the user.
Where appropriate, record the following information for each attribute:
attribute name and description;
data type and length;
any aliases that the attribute is known by;
whether the attribute must always be specified (in other words, whether the attribute
allows or disallows nulls);
whether the attribute is multi-valued;
whether the attribute is composite, and if so, which simple attributes make up the
composite attribute;
whether the attribute is derived and, if so, how it should be computed;
default values for the attribute (if specified).
Figure 9.8 Extract from the data dictionary for the Branch user views of StayHome
showing descriptions of attributes.
Document attribute domains
As you identify attribute domains, record their names and characteristics in the data
dictionary. Update the data dictionary entries for attributes to record their domain in place of
the data type and length information.
Document candidate, primary, and alternate keys
Record the identification of candidate, primary, and alternate keys (when available) in the data
dictionary.
Figure 9.10 Extract from the data dictionary for the Branch user views of StayHome showing
attributes with primary and alternate keys identified.
Document entities
You now have a logical data model that represents the database requirements of the company
(or part of the company). The logical data model is checked to ensure that the model supports
the required transactions. This process creates documentation that ensures that all the
information (entities, relationships, and their attributes) required by each transaction is
provided by the model, by documenting a description of each transaction's requirements.
An alternative approach to validating the data model against the required transactions involves
representing the pathway taken by each transaction directly on the ER diagram. Clearly, the
more transactions that exist, the more complex this diagram would become, so for readability
you may need several such diagrams to cover all the transactions.
Chapter 10 Logical Database Design - Step 2 - Review questions
10.1 Describe the main purpose and tasks of Step 2 of the logical database design
methodology.
To create tables for the logical data model and to check the structure of the tables.
The tasks involved in Step 2 are:
Step 2.1 Create tables
Step 2.2 Check table structures using normalization
Step 2.3 Check tables support user transactions
Step 2.4 Check business rules
Step 2.5 Review logical database design with users
10.2 Describe the rules for creating tables that represent:
(a) strong and weak entities;
(b) one-to-many (1:*) binary relationships;
(c) one-to-many (1:*) recursive relationships;
(d) one-to-one (1:1) binary relationships;
(e) one-to-one (1:1) recursive relationships;
(f) many-to-many (*:*) binary relationships;
(g) complex relationships;
(h) multi-valued attributes.
Give examples to illustrate your answers.
Examples are provided throughout the description of Step 2.1 in Chapter 10.
10.3 Discuss how the technique of normalization can be used to check the structure of
the tables created from the ER model and supporting documentation.
The purpose of the technique of normalization is to examine the groupings of columns in each table
created in Step 2.1. You check the composition of each table using the rules of normalization, to
avoid unnecessary duplication of data.
You should ensure that each table created is in at least third normal form (3NF). If you identify
tables that are not in 3NF, this may indicate that part of the ER model is incorrect, or that you
have introduced an error while creating the tables from the model. If necessary, you may need
to restructure the data model and/or tables.
10.4 Discuss one approach that can be used to check that the tables support the
transactions required by the users.
One approach to checking that the tables support a transaction is to examine the transaction's
data requirements to ensure that the data is present in one or more tables. Also, if a
transaction requires data in more than one table you should check that these tables are linked
through the primary key/foreign key mechanism.
10.5 Discuss what business rules represent. Give examples to illustrate your answers.
Business rules are the constraints that you wish to impose in order to protect the database
from becoming incomplete, inaccurate, or inconsistent. Although you may not be able to
implement some business rules within the DBMS, this is not the question here. At this stage, you
are concerned only with high-level design, that is, specifying what business rules are required
irrespective of how this might be achieved. Having identified the business rules, you will have a
logical data model that is a complete and accurate representation of the organization (or part of
the organization) to be supported by the database. If necessary, you could produce a physical
database design from the logical data model, for example, to prototype the system for the user.
We consider the following types of business rules:
required data,
column domain constraints,
entity integrity,
referential integrity,
other business rules.
10.5 Describe the alternative strategies that can be applied if there is a child record
referencing a parent record that we wish to delete.
If a record of the parent table is deleted, referential integrity is lost if there is a child record
referencing the deleted parent record. In other words, referential integrity is lost if the
deleted branch currently has one or more members of staff working at it. There are several
strategies you can consider in this case:
NO ACTION: Prevent a deletion from the parent table if there are any referencing child
records. In our example, 'You cannot delete a branch if there are currently members of staff
working there'.
CASCADE: When the parent record is deleted, automatically delete any referencing child
records. If any deleted child record also acts as a parent record in another relationship then
the delete operation should be applied to the records in this child table, and so on in a cascading
manner. In other words, deletions from the parent table cascade to the child table. In our
example, 'Deleting a branch automatically deletes all members of staff working there'. Clearly, in
this situation, this strategy would not be wise.
SET NULL: When a parent record is deleted, the foreign key values in all related child
records are automatically set to null. In our example, 'If a branch is deleted, indicate that the
current branch for those members of staff previously working there is unknown'. You can only
consider this strategy if the columns comprising the foreign key can accept nulls, as defined in
Step 1.3.
SET DEFAULT: When a parent record is deleted, the foreign key values in all related child
records are automatically set to their default values. In our example, 'If a branch is deleted,
indicate that the current assignment of members of staff previously working there is being
assigned to another (default) branch'. You can only consider this strategy if the columns
comprising the foreign key have default values, as defined in Step 1.3.
NO CHECK: When a parent record is deleted, do nothing to ensure that referential integrity
is maintained. This strategy should only be considered in extreme circumstances.
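The first four strategies can be tried out directly in a DBMS that supports referential actions. The following is a minimal sketch using Python's built-in sqlite3 module; the table layout, the sample keys 'B000', 'B001' and 'S1', and the helper name delete_branch_with are all assumptions made for illustration, not part of the StayHome design.

```python
import sqlite3

def delete_branch_with(action):
    """Delete a parent Branch row under the given ON DELETE action.

    Returns "blocked" if the DBMS refuses the delete, otherwise the
    remaining (staffNo, branchNo) rows of the child table.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE Staff ("
        " staffNo TEXT PRIMARY KEY,"
        " branchNo TEXT DEFAULT 'B000'"
        f" REFERENCES Branch(branchNo) ON DELETE {action})"
    )
    # B000 plays the role of the (hypothetical) default branch.
    conn.executemany("INSERT INTO Branch VALUES (?)", [("B000",), ("B001",)])
    conn.execute("INSERT INTO Staff VALUES ('S1', 'B001')")
    try:
        conn.execute("DELETE FROM Branch WHERE branchNo = 'B001'")
    except sqlite3.IntegrityError:
        return "blocked"
    return conn.execute("SELECT staffNo, branchNo FROM Staff").fetchall()

for action in ("NO ACTION", "CASCADE", "SET NULL", "SET DEFAULT"):
    print(action, "->", delete_branch_with(action))
```

NO CHECK has no counterpart in this sketch; leaving foreign key enforcement switched off would be the closest equivalent in SQLite.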
10.6 Discuss what business rules represent. Give examples to illustrate your answers.
Finally, you consider constraints known as business rules. Business rules should be represented
as constraints on the database to ensure that only permitted updates to tables governed by
'real world' transactions are allowed. For example, StayHome has a business rule that prevents
a member from renting more than 10 videos at any one time.
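One possible way to enforce a rule of this kind inside the DBMS is a trigger. The sketch below uses Python's sqlite3 module; the Rental table, its columns, the trigger name, and the limit constant MAX_RENTALS (set to 10 here) are assumptions for illustration only, and other DBMSs offer different mechanisms.

```python
import sqlite3

MAX_RENTALS = 10  # assumed limit on videos a member may hold at one time

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per video currently on rent to a member.
conn.execute("CREATE TABLE Rental (memberNo TEXT, videoNo TEXT)")
conn.execute(f"""
    CREATE TRIGGER rental_limit BEFORE INSERT ON Rental
    WHEN (SELECT COUNT(*) FROM Rental
          WHERE memberNo = NEW.memberNo) >= {MAX_RENTALS}
    BEGIN
        SELECT RAISE(ABORT, 'rental limit reached');
    END
""")

def rent(member_no, video_no):
    """Try to record a rental; return False if the business rule rejects it."""
    try:
        conn.execute("INSERT INTO Rental VALUES (?, ?)", (member_no, video_no))
        return True
    except sqlite3.DatabaseError:
        return False
```

Calling rent repeatedly for the same member succeeds until that member holds MAX_RENTALS videos and is rejected afterwards; other members are unaffected.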
Chapter 11 Enhanced Entity-Relationship Modeling - Review questions
11.1 Describe what a superclass and a subclass represent.
A superclass is an entity that includes one or more distinct groupings of its occurrences, which
need to be represented in a data model. A subclass is a distinct grouping of occurrences of an
entity, which needs to be represented in a data model.
11.2 Describe the relationship between a superclass and its subclass.
The relationship between a superclass and any one of its subclasses is one-to-one (1:1) and is
called a superclass/subclass relationship. For example, Staff/Manager forms a
superclass/subclass relationship. Each member of a subclass is also a member of the superclass
but has a distinct role.
11.3 Describe and illustrate using an example the process of attribute inheritance.
An entity occurrence in a subclass represents the same 'real world' object as in the superclass.
Hence, a member of a subclass inherits those attributes associated with the superclass, but
may also have subclass-specific attributes. For example, a member of the SalesPersonnel
subclass has subclass-specific attributes, salesArea, vehLicenseNo, and carAllowance, and all
the attributes of the Staff superclass, namely staffNo, name, position, salary, and branchNo.
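Attribute inheritance behaves much like class inheritance in a programming language. As a rough analogy only, the sketch below models the Staff superclass and SalesPersonnel subclass as Python dataclasses; the attribute names come from the example above, while the data types are assumptions.

```python
from dataclasses import dataclass, fields

@dataclass
class Staff:
    # Superclass attributes (types are assumed for illustration).
    staffNo: str
    name: str
    position: str
    salary: float
    branchNo: str

@dataclass
class SalesPersonnel(Staff):
    # Subclass-specific attributes, added to the inherited ones.
    salesArea: str
    vehLicenseNo: str
    carAllowance: float

inherited = [f.name for f in fields(Staff)]
all_attrs = [f.name for f in fields(SalesPersonnel)]
specific = [a for a in all_attrs if a not in inherited]

print("inherited:", inherited)
print("subclass-specific:", specific)
```

Every SalesPersonnel object carries the five Staff attributes plus its own three, which mirrors the way a subclass occurrence inherits the attributes of its superclass.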
11.4 What are the main reasons for introducing the concepts of superclasses and
subclasses into an EER model?
There are two important reasons for introducing the concepts of superclasses and subclasses
into an ER model. The first reason is that it avoids describing similar concepts more than once,
thereby saving you time and making the ER model more readable. The second reason is that it
adds more semantic information to the design in a form that is familiar to many people. For
example, the assertions that 'Manager IS-A member of staff' and 'van IS-A type of vehicle'
communicate significant semantic content in an easy-to-follow form.
11.5 Describe what a shared subclass represents.
A subclass is an entity in its own right and so it may also have one or more subclasses. A subclass
with more than one superclass is called a shared subclass. In other words, a member of a shared
subclass must be a member of the associated superclasses. As a consequence, the attributes of
the superclasses are inherited by the shared subclass, which may also have its own additional
attributes. This process is referred to as multiple inheritance.
11.6 Describe and contrast the process of specialization with the process of
generalization.
Specialization is the process of maximizing the differences between members of an entity by
identifying their distinguishing characteristics. Specialization is a top-down approach to defining
a set of superclasses and their related subclasses. The set of subclasses is defined on the basis
of some distinguishing characteristics of the entities in the superclass. When we identify a
subclass of an entity, we then associate attributes specific to the subclass (where necessary),
and also identify any relationships between the subclass and other entities or subclasses (where
necessary).
Generalization is the process of minimizing the differences between entities by identifying
their common features. The process of generalization is a bottom-up approach, which results in
the identification of a generalized superclass from the original subclasses. The process of
generalization can be viewed as the reverse of the specialization process.
11.7 Describe the two main constraints that apply to a specialization/generalization
relationship.
There are two constraints that may apply to a superclass/subclass relationship called
participation constraints and disjoint constraints.
Participation constraint determines whether every occurrence in the superclass must
participate as a member of a subclass. A participation constraint may be mandatory or optional.
A superclass/subclass relationship with a mandatory participation specifies that every entity
occurrence in the superclass must also be a member of a subclass. A superclass/subclass
relationship with optional participation specifies that a member of a superclass need not belong
to any of its subclasses.
Disjoint constraint describes the relationship between members of the subclasses and indicates
whether it's possible for a member of a superclass to be a member of one, or more than one,
subclass. The disjoint constraint only applies when a superclass has more than one subclass. If
the subclasses are disjoint, then an entity occurrence can be a member of only one of the
subclasses. To represent a disjoint superclass/subclass relationship, an 'Or' is placed next to the
participation constraint within the curly brackets. If subclasses of a
specialization/generalization are not disjoint (called nondisjoint), then an entity occurrence may
be a member of more than one subclass. The participation and disjoint constraints of
specialization/generalization are distinct, giving the following four categories: mandatory and
nondisjoint, optional and nondisjoint, mandatory and disjoint, and optional and disjoint.
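The four categories can be illustrated by inspecting sample occurrences. The sketch below is a hypothetical helper, classify, that reports which combination a given set of occurrences is consistent with; it describes the data it is given rather than the constraint a designer declares, and the Secretary subclass and the identifiers S1 to S3 are invented for the example.

```python
def classify(superclass, subclasses):
    """Return (participation, disjointness) for sets of entity identifiers.

    superclass -- set of identifiers of all superclass occurrences
    subclasses -- dict mapping subclass name to its set of identifiers
    """
    members = [m for sub in subclasses.values() for m in sub]
    covered = set(members)
    # Mandatory: every superclass occurrence appears in some subclass.
    participation = "mandatory" if superclass <= covered else "optional"
    # Disjoint: no identifier appears in more than one subclass.
    disjointness = "disjoint" if len(members) == len(covered) else "nondisjoint"
    return participation, disjointness

staff = {"S1", "S2", "S3"}
print(classify(staff, {"Manager": {"S1"}, "Secretary": {"S2"}}))
print(classify(staff, {"Manager": {"S1", "S2"}, "SalesPersonnel": {"S2", "S3"}}))
```

In the first call S3 belongs to no subclass and no member is shared, so the occurrences fit optional and disjoint; in the second every member is covered and S2 appears twice, which fits mandatory and nondisjoint.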
Chapter 12 Physical Database Design - Step 3 - Review questions
12.1 Explain the difference between logical and physical database design. Why might
these tasks be carried out by different people?
Logical database design is independent of implementation details, such as the specific
functionality of the target DBMS, application programs, programming languages, or any other
physical considerations. The output of this process is a logical data model that includes a set of
relational tables together with supporting documentation, such as a data dictionary. These
represent the sources of information for the physical design process, and they provide you with
a vehicle for making trade-offs that are so important to an efficient database design.
Whereas logical database design is concerned with the what, physical database design is
concerned with the how. In particular, the physical database designer must know how the
computer system hosting the DBMS operates, and must also be fully aware of the functionality
of the target DBMS. As the functionality provided by current systems varies widely, physical
design must be tailored to a specific DBMS. However, physical database design is not an
isolated activity; there is often feedback between physical, logical, and application design. For
example, decisions taken during physical design to improve performance, such as merging tables
together, might affect the logical data model.
12.2 Describe the inputs and outputs of physical database design.
The inputs are the logical data model and the data dictionary. The outputs are the base tables,
integrity rules, file organization specified, secondary indexes determined, user views and
security mechanisms.
12.3 Describe the purpose of the main steps in the physical design methodology
presented in this chapter.
Step 3 produces a relational database schema from the logical data model, which defines the
base tables, integrity rules, and how to represent derived data.
12.4 Describe the types of information required to design the base tables.
You will need to know:
how to create base tables;
whether the system supports the definition of primary keys, foreign keys, and alternate
keys;
whether the system supports the definition of required data (that is, whether the system
allows columns to be defined as NOT NULL);
whether the system supports the definition of domains;
whether the system supports relational integrity rules;
whether the system supports the definition of business rules.
12.5 Describe how you would handle the representation of derived data in the
database. Give an example to illustrate your answer.
From a physical database design perspective, whether a derived column is stored in the
database or calculated every time it's needed is a trade-off. To decide, you should calculate:
the additional cost to store the derived data and keep it consistent with the data from
which it is derived, and
the cost to calculate it each time it's required,
and choose the less expensive option subject to performance constraints.
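The comparison can be made concrete with rough figures. The sketch below is a deliberately simplified cost model: the function name cheaper_option, its parameters, and the idea of measuring both costs in the same arbitrary unit per day are all assumptions, and a real decision would also weigh storage space and the performance constraints mentioned above.

```python
def cheaper_option(reads_per_day, writes_per_day, cost_to_compute, cost_to_maintain):
    """Compare two ways of handling a derived column.

    calculate -- recompute the value on every read
    store     -- keep a stored copy and update it on every write
                 to the data it is derived from
    Costs are in the same arbitrary unit (for example, milliseconds).
    """
    compute_cost = reads_per_day * cost_to_compute
    store_cost = writes_per_day * cost_to_maintain
    return "store" if store_cost < compute_cost else "calculate"

# A value read often but rarely changed favors storing it.
print(cheaper_option(reads_per_day=1000, writes_per_day=10,
                     cost_to_compute=5, cost_to_maintain=20))
# A value changed often but rarely read favors calculating it.
print(cheaper_option(reads_per_day=10, writes_per_day=1000,
                     cost_to_compute=5, cost_to_maintain=20))
```

For instance, a count of the videos currently rented by each member could be stored if it is displayed far more often than rentals are recorded, and calculated on demand otherwise.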
Chapter 13 Physical Database Design - Step 4 - Review questions
13.1 Describe the purpose of Step 4 in the database design methodology.
Step 4 determines the file organizations for the base tables. This takes account of the nature
of the transactions to be carried out, which also determine where secondary indexes will be of
use.
13.2 Discuss the purpose of analyzing the transactions that have to be supported and
describe the type of information you would collect and analyze.
You can't make meaningful physical design decisions until you understand in detail the
transactions that have to be supported. In analyzing the transactions, you're attempting to
identify performance criteria, such as:
the transactions that run frequently and will have a significant impact on performance;
the transactions that are critical to the operation of the business;
the times of the day/week when there will be a high demand made on the database (called
the peak load).
You'll use this information to identify the parts of the database that may cause performance
problems. At the same time, you need to identify the high-level functionality of the
transactions, such as the columns that are updated in an update transaction or the columns that
are retrieved in a query. You'll use this information to select appropriate file organizations and
indexes.
13.3 When would you not add any indexes to a table?
(1) Do not index small tables. It may be more efficient to search the table in memory than to
store an additional index structure.
(2) Avoid indexing a column or table that is frequently updated.
(3) Avoid indexing a column if the query will retrieve a significant proportion (for example,
25%) of the records in the table, even if the table is large. In this case, it may be more
efficient to search the entire table than to search using an index.
(4) Avoid indexing columns that consist of long character strings.
13.4 Discuss some of the main reasons for selecting a column as a potential candidate
for indexing. Give examples to illustrate your answer.
(1) In general, index the primary key of a table if it's not a key of the file organization.
Although the SQL standard provides a clause for the specification of primary keys as
discussed in Step 3.1 covered in the last chapter, note that this does not guarantee that
the primary key will be indexed in some RDBMSs.
(2) Add a secondary index to any column that is heavily used for data retrieval. For example,
add a secondary index to the Member table based on the column lName, as discussed above.
(3) Add a secondary index to a foreign key if there is frequent access based on it. For
example, you may frequently join the VideoForRent and Branch tables on the column
branchNo (the branch number). Therefore, it may be more efficient to add a secondary
index to the VideoForRent table based on branchNo.
(4) Add a secondary index on columns that are frequently involved in:
(a) selection or join criteria;
(b) ORDER BY;
(c) GROUP BY;
(d) other operations involving sorting (such as UNION or DISTINCT).
(5) Add a secondary index on columns involved in built-in functions, along with any columns used
to aggregate the built-in functions. For example, to find the average staff salary at each
branch, you could use the following SQL query:
SELECT branchNo, AVG(salary)
FROM Staff
GROUP BY branchNo;
From the previous guideline, you could consider adding an index to the branchNo column by
virtue of the GROUP BY clause. However, it may be more efficient to consider an index on
both the branchNo column and the salary column. This may allow the DBMS to perform the
entire query from data in the index alone, without having to access the data file. This is
sometimes called an index-only plan, as the required response can be produced using only
data in the index.
(6) As a more general case of the previous guideline, add a secondary index on columns that
could result in an index-only plan.
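Whether a particular DBMS actually chooses an index-only plan can usually be checked by asking it for its query plan. The sketch below does this with Python's sqlite3 module, where such an index is reported as a covering index; the column types and the index name idx_staff_branch_salary are assumptions, and other DBMSs expose their plans through different commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, branchNo TEXT, salary REAL)"
)
# Composite index on the grouping column followed by the aggregated column.
conn.execute("CREATE INDEX idx_staff_branch_salary ON Staff (branchNo, salary)")

query = "SELECT branchNo, AVG(salary) FROM Staff GROUP BY branchNo"

# The last column of each EXPLAIN QUERY PLAN row holds the readable detail.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
print(plan)
```

If the plan mentions a covering index, the query can be answered from the index alone; dropping salary from the index and rerunning the sketch shows how the plan changes.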
13.5 Having identified a column as a potential candidate, under what circumstances would
you decide against indexing it?
Having drawn up your 'wish-list' of potential indexes, consider the impact of each of these on
update transactions. If the maintenance of the index is likely to slow down important update
transactions, then consider dropping the index from the list.
Chapter 14 Physical Database Design - Steps 5 and 6 - Review questions
14.1 Describe the purpose of the main steps in the physical design methodology
presented in this chapter.
Step 5 designs the user views for the database implementation. Step 6 designs the security
mechanisms for the database implementation. This includes designing the access rules on the
base relations.
14.2 Discuss the difference between system security and data security.
System security covers access and use of the database at the system level, such as a username
and password. Data security covers access and use of database objects (such as tables and
views) and the actions that users can have on the objects.
14.3 Describe the access control facilities of SQL.
Each database user is assigned an authorization identifier by the Database Administrator
(DBA); usually, the identifier has an associated password, for obvious security reasons. Every
SQL statement that is executed by the DBMS is performed on behalf of a specific user. The
authorization identifier is used to determine which database objects that user may reference,
and what operations may be performed on those objects. Each object that is created in SQL
has an owner, who is identified by the authorization identifier. By default, the owner is the only
person who may know of the existence of the object and perform any operations on the object.
Privileges are the actions that a user is permitted to carry out on a given base table or view.
For example, SELECT is the privilege to retrieve data from a table and UPDATE is the privilege
to modify records of a table. When a user creates a table using the SQL CREATE TABLE
statement, he or she automatically becomes the owner of the table and receives full privileges
for the table. Other users initially have no privileges on the newly created table. To give them
access to the table, the owner must explicitly grant them the necessary privileges using the
SQL GRANT statement. A WITH GRANT OPTION clause can be specified with the GRANT
statement to allow the receiving user(s) to pass the privilege(s) on to other users. Privileges can
be revoked using the SQL REVOKE statement.
When a user creates a view with the CREATE VIEW statement, he or she automatically
becomes the owner of the view, but does not necessarily receive full privileges on the view. To
create the view, a user must have SELECT privilege to all the tables that make up the view.
However, the owner will only get other privileges if he or she holds those privileges for every
table in the view.
14.3 Describe the security features of Microsoft Access 2002.
Access provides a number of security features including the following two methods:
(a) setting a password for opening a database (system security);
(b) user-level security, which can be used to limit the parts of the database that a user can
read or update (data security).
In addition to the above two methods of securing a Microsoft Access database, other security
features include:
Encryption/decryption: encrypting a database compacts a database file and makes it
indecipherable by a utility program or word processor. This is useful if you wish to transmit
a database electronically or when you store it on a floppy disk or compact disc. Decrypting a
database reverses the encryption.
Preventing users from replicating a database, setting passwords, or setting startup options;
Securing VBA code: this can be achieved by setting a password that you enter once per
session or by saving the database as an MDE file, which compiles the VBA source code
before removing it from the database. Saving the database as an MDE file also prevents
users from modifying forms and reports without requiring them to specify a log on password
or without you having to set up user-level security.
Chapter 15 Physical Database Design – Step 7 – Review questions
15.1 Describe the purpose of Step 7 in the database design methodology.
Step 7 considers relaxing the normalization constraints imposed on the logical data model to
improve the overall performance of the system.
15.2 Explain the meaning of denormalization.
Formally, the term denormalization refers to a change to the structure of a base table, such
that the new table is in a lower normal form than the original table. However, we also use the
term more loosely to refer to situations where we combine two tables into one new table, where
the new table is in the same normal form but contains more nulls than the original tables.
15.3 Discuss when it may be appropriate to denormalize a table. Give examples to
illustrate your answer.
There are no fixed rules for determining when to denormalize tables. Some of the more common
situations for considering denormalization to speed up frequent or critical transactions are:
Step 7.1.1 Combining one-to-one (1:1) relationships
Step 7.1.2 Duplicating nonkey columns in one-to-many (1:*) relationships to reduce joins
Step 7.1.3 Duplicating foreign key columns in one-to-many (1:*) relationships to reduce joins
Step 7.1.4 Duplicating columns in many-to-many (*:*) relationships to reduce joins
Step 7.1.5 Introducing repeating groups
Step 7.1.6 Creating extract tables
Step 7.1.7 Partitioning tables
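As a rough illustration of one of these situations, duplicating a nonkey column in a
one-to-many (1:*) relationship to reduce joins, the sketch below uses Python's built-in
sqlite3 module. The Branch and Staff tables, their columns, and the sample rows are
hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized design: the branch city is stored only in Branch.
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE Staff  (staffNo TEXT PRIMARY KEY, name TEXT NOT NULL,
                         branchNo TEXT NOT NULL REFERENCES Branch(branchNo));
    INSERT INTO Branch VALUES ('B001', 'Seattle'), ('B002', 'Portland');
    INSERT INTO Staff  VALUES ('S1', 'Ann', 'B001'), ('S2', 'Bob', 'B002');
""")

# Before denormalization: finding the city for each member of staff needs a join.
joined = conn.execute("""
    SELECT s.staffNo, b.city
    FROM Staff s JOIN Branch b ON s.branchNo = b.branchNo
    ORDER BY s.staffNo
""").fetchall()

# Denormalize: duplicate the nonkey column city into Staff (hypothetical branchCity).
conn.executescript("""
    ALTER TABLE Staff ADD COLUMN branchCity TEXT;
    UPDATE Staff SET branchCity =
        (SELECT city FROM Branch WHERE Branch.branchNo = Staff.branchNo);
""")

# After denormalization: the same answer is available without a join.
no_join = conn.execute(
    "SELECT staffNo, branchCity FROM Staff ORDER BY staffNo"
).fetchall()
```

The price of the faster read is that the duplicated column has to be kept in step with
Branch whenever a city changes, which is why such redundancy should be introduced in a
controlled way.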
15.4 Describe the two main approaches to partitioning and discuss when each may be an
appropriate way to improve performance. Give examples to illustrate your answer.
Horizontal partitioning: Distributing the records of a table across a number of (smaller) tables.
Vertical partitioning: Distributing the columns of a table across a number of (smaller) tables
(the primary key is duplicated to allow the original table to be reconstructed).
Partitions are particularly useful in applications that store and analyze large amounts of data.
For example, let's suppose there are hundreds of thousands of records in the VideoForRent
table that are held indefinitely for analysis purposes. Searching for a particular record at a
branch could be quite time consuming; however, we could reduce this time by horizontally
partitioning the table, with one partition for each branch.
There may also be circumstances where we frequently examine particular columns of a very
large table and it may be appropriate to vertically partition the table into those columns that
are frequently accessed together and another vertical partition for the remaining columns (with
the primary key replicated in each partition to allow the original table to be reconstructed).
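The two approaches can be sketched with Python's built-in sqlite3 module. This is only an
illustration under assumptions: the VideoForRent columns, the sample rows, and the
partition table names (VideoForRent_B001, Video_hot, Video_cold) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VideoForRent (videoNo TEXT PRIMARY KEY, branchNo TEXT,
                               available TEXT, notes TEXT);
    INSERT INTO VideoForRent VALUES
        ('V1', 'B001', 'Y', 'long text 1'),
        ('V2', 'B002', 'N', 'long text 2'),
        ('V3', 'B001', 'N', 'long text 3');
""")

# Horizontal partitioning: one smaller table per branch, each with all the columns.
branches = [row[0] for row in conn.execute(
    "SELECT DISTINCT branchNo FROM VideoForRent ORDER BY branchNo")]
for b in branches:
    part = f"VideoForRent_{b}"  # safe here only because the sample branch numbers are plain identifiers
    conn.execute(f"CREATE TABLE {part} (videoNo TEXT PRIMARY KEY, branchNo TEXT, "
                 "available TEXT, notes TEXT)")
    conn.execute(f"INSERT INTO {part} SELECT * FROM VideoForRent WHERE branchNo = ?", (b,))

# Vertical partitioning: frequently accessed columns in one table, the rest in another,
# with the primary key videoNo repeated in both.
conn.executescript("""
    CREATE TABLE Video_hot  AS SELECT videoNo, branchNo, available FROM VideoForRent;
    CREATE TABLE Video_cold AS SELECT videoNo, notes FROM VideoForRent;
""")

# The repeated primary key lets us reconstruct the original table with a join.
rebuilt = conn.execute("""
    SELECT h.videoNo, h.branchNo, h.available, c.notes
    FROM Video_hot h JOIN Video_cold c ON h.videoNo = c.videoNo
    ORDER BY h.videoNo
""").fetchall()
original = conn.execute("SELECT * FROM VideoForRent ORDER BY videoNo").fetchall()
b001 = conn.execute("SELECT videoNo FROM VideoForRent_B001 ORDER BY videoNo").fetchall()
```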
Chapter 16 Physical Database Design – Step 8 – Review questions
16.1 Describe the purpose of the main steps in the physical design methodology
presented in this chapter.
Step 8 monitors the database application systems and improves performance by making
amendments to the design as appropriate.
16.2 What factors can be used to measure efficiency?
There are a number of factors that we may use to measure efficiency:
• Transaction throughput: this is the number of transactions processed in a given time
interval. In some systems, such as airline reservations, high transaction throughput is
critical to the overall success of the system.
• Response time: this is the elapsed time for the completion of a single transaction. From a
user's point of view, you want to minimize response time as much as possible. However,
there are some factors that influence response time that you may have no control over,
such as system loading or communication times. You can shorten response time by:
- reducing contention and wait times, particularly disk I/O wait times;
- reducing the amount of time resources are required;
- using faster components.
• Disk storage: this is the amount of disk space required to store the database files. You
may wish to minimize the amount of disk storage used.
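To make the first two measures concrete, the following minimal sketch times a batch of
transactions in Python. The run_transaction function is a hypothetical stand-in for real
database work; only the way throughput and response time are derived is the point.

```python
import time

def run_transaction(n):
    """Hypothetical stand-in for a real database transaction."""
    return sum(range(n))

def measure(workload):
    """Return (throughput in transactions per second, list of response times in seconds)."""
    response_times = []
    start = time.perf_counter()
    for n in workload:
        t0 = time.perf_counter()
        run_transaction(n)
        # Response time: elapsed time for the completion of a single transaction.
        response_times.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    # Throughput: number of transactions processed in the given time interval.
    return len(workload) / elapsed, response_times

throughput, times = measure([10_000] * 50)
avg_response = sum(times) / len(times)
```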
16.3 Discuss how the four basic hardware components interact and affect system
main memory
disk I/O
Each of these resources may affect other system resources. Equally well, an improvement in
one resource may effect an improvement in other system resources. For example:
• Adding more main memory should result in less paging. This should help avoid CPU
• More effective use of main memory may result in less disk I/O.
16.4 How should you distribute data across disks?
Figure 16.1 illustrates the basic principles of distributing the data across disks:
• The operating system files should be separated from the database files.
• The main database files should be separated from the index files.
• The recovery log file, if available and if used, should be separated from the rest of the
Figure 16.1 Typical disk configuration.
16.5 What is RAID technology and how does it improve performance and reliability?
RAID originally stood for Redundant Array of Inexpensive Disks, but more recently the 'I' in
RAID has come to stand for Independent. RAID works on having a large disk array comprising
an arrangement of several independent disks that are organized to increase performance and at
the same time improve reliability.
Performance is increased through data striping: the data is segmented into equal-size
partitions (the striping unit), which are transparently distributed across multiple disks. This
gives the appearance of a single large, very fast disk where in actual fact the data is
distributed across several smaller disks. Striping improves overall I/O performance by allowing
multiple I/Os to be serviced in parallel. At the same time, data striping also balances the load
among disks. Reliability is improved through storing redundant information across the disks
using a parity scheme or an error-correcting scheme. In the event of a disk failure, the
redundant information can be used to reconstruct the contents of the failed disk.
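The idea of striping plus parity can be sketched in a few lines of Python. This is a toy
model under assumptions, not a description of any particular RAID level: data is dealt
round-robin onto three in-memory "disks", a single dedicated parity block is computed with
XOR, and the function names are our own.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def stripe(data, n_data_disks, unit):
    """Deal equal-size striping units round-robin onto the data disks and compute parity."""
    stripe_size = n_data_disks * unit
    padded = data + b"\x00" * (-len(data) % stripe_size)  # pad to a whole number of stripes
    disks = [bytearray() for _ in range(n_data_disks)]
    for i in range(0, len(padded), unit):
        disks[(i // unit) % n_data_disks] += padded[i:i + unit]
    disks = [bytes(d) for d in disks]
    return disks, xor_blocks(disks)

def rebuild(disks, parity, failed):
    """Reconstruct the contents of one failed disk from the survivors and the parity."""
    survivors = [d for i, d in enumerate(disks) if i != failed]
    return xor_blocks(survivors + [parity])

data = b"ABCDEFGHIJKL"
disks, parity = stripe(data, n_data_disks=3, unit=2)
recovered = rebuild(disks, parity, failed=1)
```

Because consecutive striping units sit on different disks, a large read can be serviced by
all three disks in parallel, and any single lost disk can be recomputed from the others.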
Chapter 19 Current and Emerging Trends – Review questions
19.1 Discuss the general characteristics of advanced database applications.
• Design data is characterized by a large number of types, each with a small number of
instances. Conventional databases are typically the opposite.
• Designs may be very large, perhaps consisting of millions of parts, often with many
interdependent subsystem designs.
• The design is not static but evolves through time. When a design change occurs, its
implications must be propagated through all design representations. The dynamic
nature of design may mean that some actions cannot be foreseen at the beginning.
• Updates are far-reaching because of topological or functional relationships,
tolerances, and so on. One change is likely to affect a large number of design
• Often, many design alternatives are being considered for each component, and the
correct version for each part must be maintained. This involves some form of version
control and configuration management.
• There may be hundreds of staff involved with the design, and they may work in
parallel on multiple versions of a large design. Even so, the end product must be
consistent and coordinated. This is sometimes referred to as cooperative
19.2 Discuss why the weaknesses of the relational data model and relational DBMSs may
make them unsuitable for advanced database applications.
Poor representation of 'real world' entities
Normalization generally leads to the creation of tables that do not correspond to entities in the
'real world'. The fragmentation of a 'real world' entity into many tables, with a physical
representation that reflects this structure, is inefficient leading to many joins during query
Semantic overloading
The relational model has only one construct for representing data and relationships between
data, namely the table. For example, to represent a many-to-many (*:*) relationship between two
entities A and B, we create three tables, one to represent each of the entities A and B, and one
to represent the relationship. There is no mechanism to distinguish between entities and
relationships, or to distinguish between different kinds of relationship that exist between
entities. For example, a 1:* relationship might be Has, Supervises, Manages, and so on. If such
distinctions could be made, then it might be possible to build the semantics into the operations.
It is said that the relational model is semantically overloaded.
Poor support for business rules
In Section 2.3, we introduced the concepts of entity and referential integrity, and in Section
2.2.1 we introduced domains, which are also types of business rules. Unfortunately, many
commercial systems do not fully support these rules, and it's necessary to build them into the
applications. This, of course, is dangerous and can lead to duplication of effort and, worse still,
inconsistencies. Furthermore, there is no support for other types of business rules in the
relational model, which again means they have to be built into the DBMS or the application.
Limited operations
The relational model has only a fixed set of operations, such as set and record-oriented
operations, operations that are provided in the SQL specification. However, SQL currently does
not allow new operations to be specified. Again, this is too restrictive to model the behavior of
many 'real world' objects. For example, a GIS application typically uses points, lines, line groups,
and polygons, and needs operations for distance, intersection, and containment.
Difficulty handling recursive queries
Atomicity of data means that repeating groups are not allowed in the relational model. As a
result, it's extremely difficult to handle recursive queries: that is, queries about relationships
that a table has with itself (directly or indirectly). To overcome this problem, SQL can be
embedded in a high-level programming language, which provides constructs to facilitate
iteration. Additionally, many RDBMSs provide a report writer with similar constructs. In either
case, it is the application rather than the inherent capabilities of the system that provides the
required functionality.
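The workaround described above, where the host language supplies the iteration, might
look like the following sketch. It uses Python with the built-in sqlite3 module in place
of embedded SQL in a 3GL; the Staff table, its supervisorNo column, and the sample rows are
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A table with a relationship to itself: each member of staff may have a supervisor.
    CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, supervisorNo TEXT);
    INSERT INTO Staff VALUES ('S1', NULL), ('S2', 'S1'), ('S3', 'S1'),
                             ('S4', 'S2'), ('S5', 'S4');
""")

def all_subordinates(conn, staff_no):
    """Everyone supervised, directly or indirectly, by staff_no (assumes no cycles)."""
    found = []
    frontier = [staff_no]
    while frontier:  # the iteration is provided by the application, not by the query
        boss = frontier.pop()
        rows = conn.execute(
            "SELECT staffNo FROM Staff WHERE supervisorNo = ?", (boss,)).fetchall()
        for (sub,) in rows:
            found.append(sub)
            frontier.append(sub)
    return sorted(found)

subs_of_s2 = all_subordinates(conn, "S2")
subs_of_s1 = all_subordinates(conn, "S1")
```

Each pass issues an ordinary non-recursive query; it is the loop in the application that
follows the relationship to any depth.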
Impedance mismatch
In Section 3.1.1, we noted that until the most recent version of the standard SQL lacked
computational completeness. To overcome this problem and to provide additional flexibility, the
SQL standard provides embedded SQL to help develop more complex database applications.
However, this approach produces an impedance mismatch because we are mixing different
programming paradigms:
(1) SQL is a declarative language that handles rows of data, whereas a high-level language
such as 'C' is a procedural language that can handle only one row of data at a time.
(2) SQL and 3GLs use different models to represent data. For example, SQL provides the
built-in data types Date and Interval, which are not available in traditional programming
languages. Thus, it's necessary for the application program to convert between the two
representations, which is inefficient, both in programming effort and in the use of
runtime resources. Furthermore, since we are using two different type systems, it's not
possible to automatically type check the application as a whole.
The latest release of the SQL standard, SQL3, addresses some of the above deficiencies
with the introduction of many new features, such as the ability to define new data types and
operations as part of the data definition language, and the addition of new constructs to make
the language computationally complete.
19.3 Explain what is meant by a DDBMS, and discuss the motivation in providing such a
A Distributed Database Management System (DDBMS) consists of a single logical database
that is split into a number of fragments. Each fragment is stored on one or more computers
(replicas) under the control of a separate DBMS, with the computers connected by a
communications network. Each site is capable of independently processing user requests that
require access to local data (that is, each site has some degree of local autonomy) and is also
capable of processing data stored on other computers in the network.
19.4 Compare and contrast a DDBMS with distributed processing. Under what
circumstances would you choose a DDBMS over distributed processing?
Distributed processing: a centralized database that can be accessed over a computer network.
The key point with the definition of a distributed DBMS is that the system consists of data
that is physically distributed across a number of sites in the network. If the data is centralized,
even though other users may be accessing the data over the network, we do not consider this to
be a distributed DBMS, simply distributed processing.
19.5 Discuss the advantages and disadvantages of a DDBMS.
Reflects organizational structure: Many organizations are naturally distributed over several
locations. It's natural for databases used in such an application to be distributed over these
Improved shareability and local autonomy: The geographical distribution of an organization can
be reflected in the distribution of the data; users at one site can access data stored at other
sites. Data can be placed at the site close to the users who normally use that data. In this way,
users have local control of the data, and they can consequently establish and enforce local
policies regarding the use of this data.
Improved availability: In a centralized DBMS, a computer failure terminates the operations of
the DBMS. However, a failure at one site of a DDBMS, or a failure of a communication link
making some sites inaccessible, does not make the entire system inoperable.
Improved reliability: As data may be replicated so that it exists at more than one site, the
failure of a node or a communication link does not necessarily make the data inaccessible.
Improved performance: As the data is located near the site of 'greatest demand', and given the
inherent parallelism of DDBMSs, it may be possible to access the database faster
than if we had a remote centralized database. Furthermore, since each site handles only a part
of the entire database, there may not be the same contention for CPU and I/O services as
characterized by a centralized DBMS.
Economics: It's generally accepted that it costs much less to create a system of smaller
computers with the equivalent power of a single large computer. This makes it more
cost-effective for corporate divisions and departments to obtain separate computers. It's also much
more cost-effective to add workstations to a network than to update a mainframe system.
Modular growth: In a distributed environment, it's much easier to handle expansion. New sites
can be added to the network without affecting the operations of other sites. This flexibility
allows an organization to expand relatively easily.
Complexity: A DDBMS that hides the distributed nature from the user and provides an
acceptable level of performance, reliability, and availability is inherently more complex than a
centralized DBMS. Replication also adds an extra level of complexity, which if not handled
adequately, will lead to degradation in availability, reliability, and performance compared with
the centralized system, and the advantages we cited above will become disadvantages.
Cost: Increased complexity means that we can expect the procurement and maintenance costs
for a DDBMS to be higher than those for a centralized DBMS. Furthermore, a DDBMS requires
additional hardware to establish a network between sites. There are ongoing communication
costs incurred with the use of this network. There are also additional manpower costs to manage
and maintain the local DBMSs and the underlying network.
Security: In a centralized system, access to the data can be easily controlled. However, in a
DDBMS not only does access to replicated data have to be controlled in multiple locations, but
the network itself has to be made secure. In the past, networks were regarded as an insecure
communication medium. Although this is still partially true, significant developments have been
made recently to make networks more secure.
Integrity control more difficult: Enforcing integrity constraints generally requires access to a
large amount of data that defines the constraint, but is not involved in the actual update
operation itself. In a DDBMS, the communication and processing costs that are required to
enforce integrity constraints may be prohibitive.
Lack of standards: Although DDBMSs depend on effective communication, we are only now
starting to see the appearance of standard communication and data access protocols. This lack
of standards has significantly limited the potential of DDBMSs. There are also no tools or
methodologies to help users convert a centralized DBMS into a distributed DBMS.
Lack of experience: General-purpose DDBMSs have not been widely accepted, although many of
the protocols and problems are well understood. Consequently, we do not yet have the same level
of experience in industry as we have with centralized DBMSs. For a prospective adopter of this
technology, this may be a significant deterrent.
Database design more complex: Besides the normal difficulties of designing a centralized
database, the design of a distributed database has to take account of fragmentation of data,
allocation of fragments to specific sites, and data replication.
19.6 Describe the expected functionality of a replication server.
At its basic level, we expect a distributed data replication service to be capable of copying data
from one database to another, synchronously or asynchronously. However, there are many other
functions that need to be provided, such as:
• Specification of replication schema: The system should provide a mechanism to allow a
privileged user to specify the data and objects to be replicated.
• Subscription mechanism: The system should provide a mechanism to allow a privileged
user to subscribe to the data and objects available for replication.
• Initialization mechanism: The system should provide a mechanism to allow for the
initialization of a target replica.
• Scalability: The service should be able to handle the replication of both small and large
volumes of data.
• Mapping and transformation: The service should be able to handle replication across
different DBMSs and platforms. This may involve mapping and transforming the data
from one data model into a different data model, or the data in one data type to a
corresponding data type in another DBMS.
• Object replication: It should be possible to replicate objects other than data. For
example, some systems allow indexes and stored procedures (or triggers) to be
replicated.
• Easy administration: It should be easy for the DBA to administer the system and to
check the status and monitor the performance of the replication system components.
19.7 Compare and contrast the different ownership models for replication. Give examples
to illustrate your answer.
Ownership relates to which site has the privilege to update the data. The main types of
ownership are master/slave, workflow, and update-anywhere (sometimes referred to as
peer-to-peer or symmetric replication).
Master/slave ownership
With master/slave ownership, asynchronously replicated data is owned by one site, the master
or primary site, and can be updated by only that site. Using a 'publish-and-subscribe' metaphor,
the master site (the publisher) makes data available. Other sites 'subscribe' to the data owned
by the master site, which means that they receive read-only copies on their local systems.
Potentially, each site can be the master site for non-overlapping data sets. However, there can
only ever be one site that can update the master copy of a particular data set, and so update
conflicts cannot occur between sites.
A master site may own the data in an entire table, in which case other sites subscribe to
read-only copies of that table. Alternatively, multiple sites may own distinct fragments of the
table, and other sites then subscribe to read-only copies of the fragments. This type of
replication is also known as asymmetric replication.
Workflow ownership
Like master/slave ownership, this model avoids update conflicts while at the same time providing
a more dynamic ownership model. Workflow ownership allows the right to update replicated data
to move from site to site. However, at any one moment, there is only ever one site that may
update that particular data set. A typical example of workflow ownership is an order processing
system, where the processing of orders follows a series of steps, such as order entry, credit
approval, invoicing, shipping, and so on. In a centralized DBMS, applications of this nature access
and update the data in one integrated database: each application updates the order data in
sequence when, and only when, the state of the order indicates that the previous step has been
Update-anywhere (symmetric replication) ownership
The two previous models share a common property: at any given moment, only one site may
update the data; all other sites have read-only access to the replicas. In some environments,
this is too restrictive. The update-anywhere model creates a peer-to-peer environment where
multiple sites have equal rights to update replicated data. This allows local sites to function
autonomously, even when other sites are not available.
Shared ownership can lead to conflict scenarios and the replication architecture has to be
able to employ a methodology for conflict detection and resolution. A simple mechanism to
detect conflict within a single table is for the source site to send both the old and new values
(before- and after-images) for any records that have been updated since the last refresh. At
the target site, the replication server can check each record in the target database that has
also been updated against these values. However, consideration has to be given to detecting
other types of conflict such as violation of referential integrity between two tables. There have
been many mechanisms proposed for conflict resolution, but some of the most common are:
earliest/latest timestamps, site priority, and holding for manual resolution.
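The before- and after-image check can be sketched as follows. This is a minimal
illustration under assumptions: the target replica is modelled as a Python dictionary, a
conflict is taken to mean that the target's current value no longer matches the source's
before-image, and conflicting changes are simply held back (for example, for manual
resolution). The function and variable names are our own.

```python
def apply_update(target, key, before, after):
    """Apply one replicated change; return True if applied, False if a conflict is detected.

    If the target's current value differs from the source's before-image, the target
    record has also been updated since the last refresh, so the change is not applied.
    """
    if target.get(key) != before:
        return False
    target[key] = after
    return True

# Target replica: V2 was changed locally to 'rented' since the last refresh.
target_db = {"V1": "available", "V2": "rented"}

# Changes sent by the source site as (key, before-image, after-image).
changes = [
    ("V1", "available", "rented"),  # target still matches the before-image: applied
    ("V2", "available", "lost"),    # target no longer matches: conflict
]

conflicts = [change for change in changes if not apply_update(target_db, *change)]
```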
19.8 Give a definition of an OODBMS. What are the advantages and disadvantages of an
OODBMS?
OODM: A (logical) data model that captures the semantics of objects supported in
object-oriented programming.
OODB: A persistent and sharable collection of objects defined by an OODM.
OODBMS: The manager of an OODB.
19.9 Give a definition of an ORDBMS. What are the advantages and disadvantages of an
ORDBMS?
Thus, there is no single extended relational model; rather, there are a variety of these models,
whose characteristics depend upon the way and the degree to which extensions were made.
However, all the models do share the same basic relational tables and query language, all
incorporate some concept of 'object', and some have the ability to store methods (or procedures
or triggers) as well as data in the database.
19.10 Give a definition of a data warehouse. Discuss the benefits of implementing a data
warehouse.
Data warehouse: a consolidated/integrated view of corporate data drawn from disparate
operational data sources and a range of end-user access tools capable of supporting simple to
highly complex queries to support decision-making.
• Potential high return on investment
• Competitive advantage
• Increased productivity of corporate decision-makers
19.11 Describe the characteristics of the data held in a data warehouse.
The data held in a data warehouse is described as being subject-oriented, integrated,
time-variant, and non-volatile (Inmon, 1993).
• Subject-oriented as the warehouse is organized around the major subjects of the
organization (such as customers, products, and sales) rather than the major application
areas (such as customer invoicing, stock control, and product sales). This is reflected in
the need to store decision-support data rather than application-oriented data.
• Integrated because of the coming together of source data from different
organization-wide applications systems. The source data is often inconsistent using for example,
different data types and/or formats. The integrated data source must be made
consistent to present a unified view of the data to the users.
• Time-variant because data in the warehouse is only accurate and valid at some point in
time or over some time interval. The time-variance of the data warehouse is also shown in
the extended time that the data is held, the implicit or explicit association of time with
all data, and the fact that the data represents a series of snapshots.
• Non-volatile as the data is not updated in real-time but is refreshed from operational
systems on a regular basis. New data is always added as a supplement to the database,
rather than a replacement. The database continually absorbs this new data, incrementally
integrating it with the previous data.
1A.1 !iscuss how data marts differ from data warehouses and identify the main reasons
for implementin" a data mart.
A data mart holds a subset of the data in a data !arehouse normally in the form of summary
data relating to a particular department or business area such as -areting or 4ustomer
Services$ .he data mart can be stand/alone or lined centrally to the corporate data !arehouse$
As a data !arehouse gro!s larger& the ability to serve the various needs of the organi#ation may
Database Solutions (2
be compromised$ .he popularity of data marts stems from the fact that corporate data
!arehouses proved difficult to build and use$
19.13 Discuss what online analytical processing (OLAP) is and how OLAP differs from data
Online analytical processing (OLAP): The dynamic synthesis, analysis, and consolidation of large
volumes of multi-dimensional data. The key characteristics of OLAP applications include
multi-dimensional views of data, support for complex calculations, and time intelligence.
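As a small illustration of consolidation over multi-dimensional data, the sketch below
rolls a handful of sales facts up along different combinations of dimensions. The facts,
the three dimensions (product, region, quarter), and the consolidate function are all
hypothetical; a real OLAP tool would do this over far larger volumes.

```python
from collections import defaultdict

# Each fact is (product, region, quarter, amount).
sales = [
    ("video", "east", "Q1", 100),
    ("video", "east", "Q2", 150),
    ("video", "west", "Q1", 80),
    ("dvd",   "east", "Q1", 200),
    ("dvd",   "west", "Q2", 120),
]

def consolidate(facts, dims):
    """Total the amount over the chosen dimensions (0 = product, 1 = region, 2 = quarter)."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dims)
        totals[key] += fact[3]
    return dict(totals)

by_product_region = consolidate(sales, dims=(0, 1))  # a product-by-region view
by_quarter = consolidate(sales, dims=(2,))           # a view along the time dimension
```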
19.14 Describe OLAP applications and identify the characteristics of such applications.
An essential requirement of all OLAP applications is the ability to provide users with
just-in-time (JIT) information, which is necessary to make effective decisions about an organization's
strategic directions.
19.15 Discuss how data mining can realize the value of a data warehouse.
Simply storing information in a data warehouse does not provide the benefits an organization is
seeking. To realize the value of a data warehouse, it's necessary to extract the knowledge
hidden within the warehouse. However, as the amount and complexity of the data in a data
warehouse grows, it becomes increasingly difficult, if not impossible, for business analysts to
identify trends and relationships in the data using simple query and reporting tools. Data mining
is one of the best ways to extract meaningful trends and patterns from huge amounts of data.
Data mining discovers information within data warehouses that queries and reports cannot
effectively reveal.
19.16 Why would we want to dynamically generate web pages from data held in the
operational database? List some general requirements for web-database integration.
An HTML/XML document stored in a file is an example of a static Web page: the content of the
document does not change unless the file itself is changed. On the other hand, the content of a
dynamic Web page is generated each time it's accessed. As a result, a dynamic Web page can
have features that are not found in static pages, such as:
• It can respond to user input from the browser. For example, returning data requested by
the completion of a form or the results of a database query.
• It can be customized by and for each user. For example, once a user has specified some
preferences when accessing a particular site or page (such as area of interest or level of
expertise), this information can be retained and information returned appropriate to
these preferences.
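A dynamic page of this kind can be sketched as a function that queries the database on
every request and builds the HTML from the result. The example uses Python's built-in
sqlite3 and html modules; the Video table, its contents, and the render_page function are
hypothetical, and a real application would sit behind a Web server.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Video (title TEXT, category TEXT);
    INSERT INTO Video VALUES ('Alpha', 'Action'), ('Beta', 'Comedy'),
                             ('Gamma & Co', 'Action');
""")

def render_page(conn, category):
    """Build the page for one request; the content reflects the database at call time."""
    rows = conn.execute(
        "SELECT title FROM Video WHERE category = ? ORDER BY title",
        (category,)).fetchall()
    # Escape values so that data from the database cannot break the markup.
    items = "".join(f"<li>{escape(title)}</li>" for (title,) in rows)
    return f"<html><body><h1>{escape(category)}</h1><ul>{items}</ul></body></html>"

page = render_page(conn, "Action")  # responds to the user's chosen category

# The same request after the data changes returns different content.
conn.execute("INSERT INTO Video VALUES ('Delta', 'Action')")
page_later = render_page(conn, "Action")
```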
Not in any ranked order, the requirements are as follows:
• The ability to access valuable corporate data in a secure manner.
• Data and vendor independent connectivity to allow freedom of choice in the selection of
the DBMS now and in the future.
• The ability to interface to the database independent of any proprietary Web browser or
Web server.
• A connectivity solution that takes advantage of all the features of an organization's
• An open-architecture approach to allow interoperability with a variety of systems and
• A cost-effective solution that allows for scalability, growth, and changes in strategic
directions, and helps reduce the costs of developing and maintaining applications.
• Support for transactions that span multiple HTTP requests.
• Support for session- and application-based authentication.
• Acceptable performance.
• Minimal administration overhead.
• A set of high-level productivity tools to allow applications to be developed, maintained,
and deployed with relative ease and speed.
19.17 What is XML and discuss the approaches for managing XML-based data.
XML: a meta-language (a language for describing other languages) that enables designers to
create their own customized tags to provide functionality not available with HTML.
It's anticipated that there will be two main models that will exist: data-centric and
document-centric. In a data-centric model, XML is used as the storage and interchange format
for data that is structured, appears in a regular order, and is most likely to be machine
processed instead of read by a human. In a data-centric model, the fact that the data is stored
and transferred as XML is incidental and other formats could also have been used. In this case,
the data could be stored in a relational, object-relational, or object-oriented DBMS. For
example, Oracle has completely integrated XML into its Oracle 9i system. XML can be stored as
entire documents using the data types XMLType or CLOB/BLOB (Character/Binary Large
Object) or can be decomposed into its constituent elements and stored that way. The Oracle
query language has also been extended to permit searching of XML-based content.
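The second storage option, decomposing a data-centric XML document into its constituent
elements, can be sketched generically as follows. This is not Oracle's mechanism: it uses
Python's built-in xml.etree.ElementTree and sqlite3 modules, and the document structure,
the Video table, and its columns are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A regular, machine-oriented document: every <video> has the same structure.
doc = """<videos>
  <video no="V1"><title>Alpha</title><price>4.50</price></video>
  <video no="V2"><title>Beta</title><price>3.00</price></video>
</videos>"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Video (videoNo TEXT PRIMARY KEY, title TEXT, price REAL)")

# Decompose each <video> element into one row of a relational table.
for video in ET.fromstring(doc).iter("video"):
    conn.execute(
        "INSERT INTO Video VALUES (?, ?, ?)",
        (video.get("no"), video.findtext("title"), float(video.findtext("price"))))

rows = conn.execute("SELECT * FROM Video ORDER BY videoNo").fetchall()
```

Once decomposed, the data can be queried with ordinary SQL, which is why the XML format
itself is incidental in the data-centric model.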
In a document-centric model, the documents are designed for human consumption (for
example, books, newspapers, and email). Due to the nature of this information, much of the data
will be irregular or incomplete, and its structure may change rapidly or unpredictably.
Unfortunately, relational, object-relational, and object-oriented DBMSs do not handle data of
this nature particularly well. Content management systems are an important tool for handling
these types of documents. Underlying such a system, you may now find a native XML database: