BSC 203 Database Management System
LEARNING OUTCOMES: By the end of this course students should be able to:
COURSE CONTENT
UNIT 1: UNDERSTANDING THE MAIN ISSUES RELATED TO DATABASE
SYSTEM IN GENERAL
1.1 Definition of Database
1.2 Conventional file based system and database approach
1.3 The traditional File based Approach
1.4 The role of DBMS
1.5 Advantage and Disadvantages of DBMS
1.6 Components of DBMS
1.6.1 Functions of DBMS
1.6.2 Physical and logical structures
1.6.3 Three Level Architecture
1.6.4 Logical and Physical Data Independence
Data Models
Designing a database properly is fundamental to establishing a database that meets the
needs of the users. Data models capture the nature of and relationships among data and
are used at different levels of abstraction as a database is conceptualized and designed.
A database management system (DBMS) is a software system that enables the use of a
database approach.
3. DBMS A DBMS is a software system that is used to create, maintain, and provide
controlled access to user databases.
6. User interface The user interface includes languages, menus, and other facilities
by which users interact with various system components, such as data modeling
and design tools, application programs, the DBMS, and the repository. User
interfaces are illustrated throughout this text.
7. Data and database administrators Data administrators are persons who are
responsible for the overall management of data resources in an organization.
Database administrators are responsible for physical database design and for
managing technical issues in the database environment.
8. System developers System developers are persons such as systems analysts and
programmers who design new application programs.
9. End users End users are persons throughout the organization who add, delete,
and modify data in the database and who request or receive information from it.
All user interactions with the database must be routed through the DBMS.
An advantage of the database management approach is that the DBMS helps to create
an environment in which end users have better access to more and better-managed
data.
Such access makes it possible for end users to respond quickly to changes in their
environment.
The more users access the data, the greater the risks of data security breaches.
Corporations invest considerable amounts of time, effort, and money to ensure that
corporate data are used properly. A DBMS provides a framework for better
enforcement of data privacy and security policies.
Data inconsistency exists when different versions of the same data appear in
different places. For example, data inconsistency exists when a company’s sales
department stores a sales representative’s name as “Bill Brown” and the company’s
personnel department stores that same person’s name as “William G. Brown,” or
when the company’s regional sales office shows the price of a product as $45.95
and its national sales office shows the same product’s price as $43.95. The
probability of data inconsistency is greatly reduced in a properly designed
database.
The DBMS makes it possible to produce quick answers to ad hoc queries. From a
database perspective, a query is a specific request issued to the DBMS for data
manipulation—for example, to read or update the data. Simply put, a query is a
question, and an ad hoc query is a spur-of-the-moment question. The DBMS sends
back an answer (called the query result set) to the application. For example, end
users, when dealing with large amounts of sales data, might want quick answers to
questions (ad hoc queries) such as:
- What was the dollar volume of sales by product during the past six months?
- What is the sales bonus figure for each of our salespeople during the past three
months?
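As a rough illustration of the first question above, the sketch below runs an ad hoc
SQL query through Python's built-in sqlite3 module. The sales table, its columns, and
the sample rows are hypothetical and were invented for this example; a real sales
database would look different.

```python
import sqlite3

def sales_volume_by_product(conn, since):
    """Return (product, total) rows for sales on or after the date `since`."""
    query = """
        SELECT product, SUM(amount) AS total
        FROM sales
        WHERE sale_date >= ?
        GROUP BY product
        ORDER BY product
    """
    return conn.execute(query, (since,)).fetchall()

# Hypothetical sample data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount REAL, sale_date TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Widget", 100.0, "2024-01-15"),
        ("Widget", 50.0, "2024-03-02"),
        ("Gadget", 75.0, "2024-02-20"),
        ("Gadget", 20.0, "2023-05-01"),  # outside the requested period
    ],
)

# The list returned here plays the role of the query result set.
result_set = sales_volume_by_product(conn, "2024-01-01")
print(result_set)
```

The application only states the question; how the rows are located and summed is left
to the DBMS.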
Better-managed data and improved data access make it possible to generate better-
quality information, on which better decisions are based. The quality of the
information generated depends on the quality of the underlying data. Data quality
is a comprehensive approach to promoting the accuracy, validity, and timeliness of
the data. While the DBMS does not guarantee data quality, it provides a framework
to facilitate data quality initiatives.
The availability of data, combined with the tools that transform data into usable
information, empowers end users to make quick, informed decisions that can make
the difference between success and failure in the global economy.
Although the database system yields considerable advantages over previous data
management approaches, database systems do carry significant disadvantages. For
example:
1. Increased costs
2. Management complexity
Database systems interface with many different technologies and have a significant
impact on a company’s resources and culture. The changes introduced by the
adoption of a database system must be properly managed to ensure that they help
advance the company’s objectives. Given the fact that database systems hold
crucial company data that are accessed from multiple sources, security issues must
be assessed constantly.
3. Maintaining currency
To maximize the efficiency of the database system, you must keep your system
current. Therefore, you must perform frequent updates and apply the latest
patches and security measures to all components.
Components of DBMS
A DBMS has several components, each performing a significant task in the
database management system environment. Below is a list of components within
the database and its environment.
Software
This is the set of programs used to control and manage the overall database. This
includes the DBMS software itself, the Operating System, the network software
being used to share the data among users, and the application programs used to
access data in the DBMS.
Hardware
This consists of a set of physical electronic devices such as computers, I/O devices,
and storage devices. It provides the interface between computers and real-world
systems.
Data
DBMS exists to collect, store, process and access data, the most important
component. The database contains both the actual or operational data and the
metadata.
Procedures
These are the documented instructions and rules that govern how to design, use,
and run the database. They guide the users who operate and manage it.
Query Processor
This transforms the user queries into a series of low level instructions. This reads
the online user’s query and translates it into an efficient series of operations in a
form capable of being sent to the run time data manager for execution.
Data Manager
Also called the cache manager, this is responsible for handling data in the
database and for providing a recovery mechanism that allows the system to recover
the data after a failure.
Database Engine
The core service for storing, processing, and securing data, this provides controlled
access and rapid transaction processing to address the requirements of the most
demanding data consuming applications. It is often used to create relational
databases for online transaction processing or online analytical processing data.
Data Dictionary
This is a reserved space within a database used to store information about the
database itself. A data dictionary is a set of read-only tables and views containing
information about the data used in the enterprise, to ensure that the database
representation of the data follows one standard as defined in the dictionary.
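To make the idea of a data dictionary concrete, here is a minimal sketch using SQLite
through Python's sqlite3 module. SQLite keeps its catalog in a built-in table named
sqlite_master and exposes column details through PRAGMA table_info. The employee
table is a hypothetical example, and other DBMS products organize their dictionaries
differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's built-in catalog of tables, indexes, and views.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info lists each column as (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(tables)
print(columns)
```

The program never stored these descriptions itself; the DBMS recorded them when the
table was created.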
Report Writer
Also referred to as the report generator, it is a program that extracts information
from one or more files and presents the information in a specified format. Most
report writers allow the user to select records that meet certain conditions and to
display selected fields in rows and columns, or also format the data into different
charts.
Functions of DBMS
1. Data Dictionary Management
Data dictionary management is one of the most important functions of a database
management system.
DBMS stores definitions of the data elements and their relationships (metadata) in
a data dictionary.
So, all programs that access the data in the database work through the DBMS.
The DBMS uses the data dictionary to look up the required data component
structures and relationships which relieves you from coding such complex
relationships in each program.
In other words, the DBMS provides data abstraction and removes structural and
data dependence from the system.
2. Data Storage Management
The DBMS creates and manages the complex structures required for data storage,
thus relieving you from the difficult task of defining and programming the physical
data characteristics.
A modern DBMS system provides storage not only for the data, but also for related
data entry forms or screen definitions, report definitions, data validation rules,
procedural code, structures to handle video and picture formats, and so on.
The DBMS transforms entered data into the required data structures. The DBMS
relieves you of the chore of making a distinction between the logical data format
and the physical data format. That is, the DBMS formats the physically retrieved
data to make it conform to the user’s logical expectations.
4. Security Management
5. Multi User Access Control
To provide data integrity and data consistency, the DBMS uses sophisticated
algorithms to ensure that multiple users can access the database concurrently
without compromising the integrity of the database.
The DBMS provides backup and data recovery to ensure data safety and integrity.
Current DBMS systems provide special utilities that allow the DBA to perform
routine and special backup and restore procedures. Recovery management deals
with the recovery of the database after a failure, such as a bad sector in the disk or a
power failure. Such capability is critical to preserving the database’s integrity.
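A backup and restore cycle can be sketched with Python's sqlite3 module, whose
Connection.backup method copies a whole database into another connection. The
account table and the simulated failure are hypothetical; production systems rely on
the backup utilities of their own DBMS.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
source.execute("INSERT INTO account VALUES (1, 250.0)")
source.commit()

# Connection.backup copies the whole database into another connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate a damaging change, then read the saved state from the backup.
source.execute("DELETE FROM account")
source.commit()

remaining = source.execute("SELECT COUNT(*) FROM account").fetchone()[0]
recovered = backup_copy.execute("SELECT id, balance FROM account").fetchall()
print(remaining, recovered)
```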
The DBMS promotes and enforces integrity rules, thus minimizing data
redundancy and maximizing data consistency.
The data relationships stored in the data dictionary are used to enforce data
integrity. Ensuring data integrity is especially important in transaction-oriented
database systems.
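The following sketch shows a DBMS rejecting rows that break integrity rules. It uses
SQLite through Python's sqlite3 module; the department and employee tables and
their constraints are hypothetical. Note that SQLite checks foreign keys only after
PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        salary REAL CHECK (salary >= 0),
        department_id INTEGER NOT NULL REFERENCES department(id)
    )
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")

def try_insert(row):
    """Attempt an insert; return True if the DBMS accepted the row."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

accepted_valid = try_insert((1, "Ada", 5000.0, 1))
accepted_orphan = try_insert((2, "Bob", 4000.0, 99))   # no department 99 exists
accepted_negative = try_insert((3, "Cy", -1.0, 1))     # violates the CHECK rule
print(accepted_valid, accepted_orphan, accepted_negative)
```

Because the rules live in the schema, every program that writes to the table is held
to them without repeating the checks in application code.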
The DBMS provides data access through a query language. A query language is a
non procedural language—one that lets the user specify what must be done without
having to specify how it is to be done.
Structured Query Language (SQL) is the de facto query language and data access
standard supported by the majority of DBMS vendors.
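The difference between saying what is wanted and spelling out how to get it can be
sketched as follows, using SQLite through Python's sqlite3 module. The product table
and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [("pen", 1.5), ("book", 12.0), ("lamp", 30.0)])

# Non-procedural: state WHAT is wanted; the DBMS decides how to find it.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM product WHERE price > 10 ORDER BY name")]

# Procedural: spell out HOW, by fetching every row and filtering by hand.
procedural = []
for name, price in conn.execute("SELECT name, price FROM product"):
    if price > 10:
        procedural.append(name)
procedural.sort()

print(declarative, procedural)
```

Both approaches give the same answer, but only the first leaves the choice of access
strategy to the DBMS.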
9. Database Communication Interfaces
- End users can generate answers to queries by filling in screen forms through their
preferred Web browser.
C/SIDE - Client/Server Integrated Development Environment
C/AL - Client/Server Application Language
Description.
C/AL is the programming language used within the development environment for
Microsoft Dynamics NAV; the development environment itself is called C/SIDE. C/AL
is a database-specific programming language, used primarily to retrieve, insert,
and modify records in the Dynamics NAV database.
In this section, you will learn how the information in your application is structured.
When you use a database, you are not usually concerned with where each piece of data is
stored, or what size it is. You just want to be sure that when you refer to a name, for
example, the correct value is returned. This is why the C/SIDE database system provides
a conceptual representation of data that does not include many details about how the
data is stored. An abstract data model is used for this conceptual representation. This
data model uses logical concepts (such as objects, their properties, and their
relationships), which are much easier to understand.
This section distinguishes between the logical and the physical database. For this topic,
the logical database is the structure of the data and the relationships between different
pieces of information. There is no information about how these structures and relations
are implemented. For the physical database, this topic describes how the structures in
the logical database and the search paths between them are implemented. The term
database means the logical database, unless indicated otherwise.
What the user sees as a coherent set of information in the C/SIDE database system can
be stored in several physical disk files, but this is transparent to the user. The following
illustration shows how one logical database can be physically stored on three hard disks
but still comprise a single (logical) database.
The following illustration shows the logical database.
Figure 12-2 Segments, Extents, and Data Blocks Within a Tablespace
At the finest level of granularity, Oracle Database stores data in data blocks. One logical
data block corresponds to a specific number of bytes of physical disk space, for
example, 2 KB. Data blocks are the smallest units of storage that Oracle Database can
use or allocate.
An extent is a set of logically contiguous data blocks allocated for storing a specific type
of information. In Figure 12-2, the 24 KB extent has 12 data blocks, while the 72 KB
extent has 36 data blocks.
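The block counts quoted above follow directly from the 2 KB block size used in the
example. A small sketch of the arithmetic (the function name is an assumption made
for illustration and is not part of Oracle Database):

```python
BLOCK_SIZE_KB = 2  # block size used in the example above

def blocks_in_extent(extent_size_kb, block_size_kb=BLOCK_SIZE_KB):
    """Number of whole data blocks that make up an extent of the given size."""
    blocks, remainder = divmod(extent_size_kb, block_size_kb)
    if remainder:
        raise ValueError("an extent must be a whole number of data blocks")
    return blocks

print(blocks_in_extent(24))  # 12
print(blocks_in_extent(72))  # 36
```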
A segment is a set of extents allocated for a specific database object, such as a table.
For example, the data for the employees table is stored in its own data segment,
whereas each index for employees is stored in its own index segment. Every database
object that consumes storage consists of a single segment.
Each segment belongs to one and only one tablespace. Thus, all extents for a
segment are stored in the same tablespace. Within a tablespace, a segment can
include extents from multiple data files, as shown in Figure 12-2. For example, one
extent for a segment may be stored in users01.dbf, while another is stored in
users02.dbf. A single extent can never span data files.
The levels form a three-level architecture that includes an external, a conceptual, and an
internal level. The way users perceive the data is called the external level. The way the
DBMS and the operating system perceive the data is the internal level, where the data
is actually stored using data structures and files. The conceptual level offers both the
mapping and the desired independence between the external and internal levels.
What is Database Architecture?
The architecture of a DBMS depends on its design and can be of the following types:
• Centralized
• Decentralized
• Hierarchical
DBMS architecture can be seen as either single tier or multi-tier. An architecture having
n-tier splits the entire system into related but independent n modules that can be
independently customized, changed, altered, or replaced.
The architecture of a database system is very much influenced by the primary computer
system on which the database system runs. Database systems can be centralized, or
client-server, where one server machine executes work on behalf of multiple client
machines. Database systems can also be designed to exploit parallel computer
architectures. Distributed databases span multiple geographically separated machines.
The Three Tier Architecture
A 3-tier application is an application program that is structured into three major parts;
each of them is distributed to a different place or places in a network. These 3 divisions
are as follows:
• The workstation or presentation layer
• The business or application logic layer
• The database layer, together with the programming related to managing it
The Database Task Group (DBTG), appointed by the Conference on Data Systems
Languages (CODASYL), developed and published a proposal for a standard
vocabulary and architecture for database systems in 1971. The Standards Planning
and Requirements Committee of the American National Standards Institute (ANSI)
Committee on Computers and Information Processing developed and published a
similar vocabulary and architecture in 1975.
The intension of a database should not be changed once it has been defined. This
is because a small change in the intension may require many changes to the data
stored in the database. The extension of a database is populated after the
intension has been finished; that is, data is stored in the database only once the
database structure has been defined. The extension must follow the rules defined
in the intension.
Schemas are used to store definitions of the structures of databases. A schema can
describe anything from a single entity to the whole organization. The three-level
architecture defines different schemas, stored at different levels, to isolate the
details of each level from the others.
Examples:
The Conceptual Level Schema represents the entire / whole database. Conceptual
schema describes the records and relationship included in the Conceptual view. It
also contains the method of deriving the objects in the conceptual view from the
objects in the internal view.
The Internal Level Schema indicates how the data will be stored and describes the
data structures and access methods used by the database. It contains the
definitions of stored records, the methods of representing the data fields, and
the access aids used.
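One way to picture the levels is with a base table and views, sketched here with
SQLite through Python's sqlite3 module. Treating the base table as the conceptual
level and each view as an external schema is a simplification made for illustration;
the staff table and the view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one base table describing the whole (hypothetical) staff record.
conn.execute(
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, salary REAL, phone TEXT)")
conn.execute("INSERT INTO staff VALUES (1, 'Ada', 5000.0, '555-0100')")

# External level: each user group sees only its own view of the data.
conn.execute("CREATE VIEW payroll_view AS SELECT id, name, salary FROM staff")
conn.execute("CREATE VIEW directory_view AS SELECT name, phone FROM staff")

payroll = conn.execute("SELECT * FROM payroll_view").fetchall()
directory = conn.execute("SELECT * FROM directory_view").fetchall()

# The internal level (pages, files, access structures) is handled by SQLite itself
# and never appears in this program.
print(payroll)
print(directory)
```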
1.6.4 Logical and Physical Data Independence
If a database system is not multi-layered, then it becomes difficult to make any changes
in the database system. Database systems are designed in multi-layers as we learnt
earlier.
Data Independence
A database system normally contains a lot of data in addition to users’ data. For
example, it stores data about data, known as metadata, to locate and retrieve data easily.
It is rather difficult to modify or update a set of metadata once it is stored in the
database. But as a DBMS expands, it needs to change over time to satisfy the
requirements of the users. If all of the data were interdependent, making such changes
would be a tedious and highly complex job.
Metadata itself follows a layered architecture, so that when we change data at one layer,
it does not affect the data at another level. This data is independent but mapped to each
other.
Logical Data Independence
Logical data is data about the database; that is, it stores information about how data is
managed inside. An example is a table (relation) stored in the database together with all
the constraints applied to that relation.
Logical data independence is a mechanism that decouples this logical description from the
actual data stored on the disk. If we make changes to the table format, the data residing
on the disk should not change.
Physical Data Independence
All the schemas are logical, and the actual data is stored in bit format on the disk.
Physical data independence is the power to change the physical data without impacting
the schema or logical data.
For example, in case we want to change or upgrade the storage system itself −
suppose we want to replace hard-disks with SSD − it should not have any impact
on the logical data or schemas.
Logical Data Independence: Logical data independence is the ability to modify the
conceptual schema without altering external schemas or application programs.
Changes to the conceptual schema may include the addition or deletion of entities,
attributes, or relationships, and should be possible without altering existing
external schemas or rewriting application programs.
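A minimal sketch of logical data independence, using SQLite through Python's sqlite3
module: an attribute is added to the base table while a program that reads through a
view keeps working unchanged. The customer table, the view, and the
application_query function are hypothetical stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("CREATE VIEW customer_names AS SELECT id, name FROM customer")

def application_query():
    """Stands in for an application program that only knows the external view."""
    return conn.execute("SELECT id, name FROM customer_names").fetchall()

before = application_query()

# Change the conceptual schema: add a new attribute to the base table.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")

after = application_query()
print(before, after)
```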
Physical Data Independence: Physical data independence is the ability to modify the
internal schema without altering the conceptual schema or application programs.
Changes to the internal schema might include:
* Using new storage devices.
* Using different data structures.
* Switching from one access method to another.
* Using different file organizations or storage structures.
* Modifying indexes.
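One of the internal-schema changes listed above, modifying indexes, can be sketched
with SQLite through Python's sqlite3 module. The orders table and the index name are
hypothetical. The point is only that the query text and its result stay the same
after the access path changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "Ada", 20.0), (2, "Bob", 35.0), (3, "Ada", 15.0)])

QUERY = "SELECT id, total FROM orders WHERE customer = ? ORDER BY id"

before = conn.execute(QUERY, ("Ada",)).fetchall()

# Internal-schema change: add an index, that is, a new access path.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

after = conn.execute(QUERY, ("Ada",)).fetchall()
print(before, after)
```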
Physical Independence: The logical scheme stays unchanged even though the
storage space or type of some data is changed for reasons of optimisation or
reorganisation. The external schema does not change either; only the internal schema
changes, because the physical storage has been reorganized. Physical data independence
is present in most database and file environments, in which details such as the hardware
storage encoding, the exact location of data on disk, and the merging of records are
hidden from the user.
Logical Independence: The external scheme may stay unchanged for most changes of the
logical scheme. This is especially desirable because the application software does not
need to be modified or newly translated.
UNIT 2: DESCRIBING THE DATABASE ANALYSIS AND DESIGN
TECHNIQUES
Introduction
1. Database design is a technique that involves the analysis, design, description, and
specification of data designed for automated business data processing. This technique
uses models to enhance communication between developers and customers.
2. Data models and supporting descriptions are the tools used in database design.
These tools become the deliverables that result from applying database design. There
are two primary objectives for developing these deliverables. The first objective is
to produce documentation that describes a customer’s perspective of data and the
relationships among this data. The second objective is to produce documentation that
describes the customer organization's environment, operations and data needs. In
accomplishing these objectives, the following deliverables result:
Decision Analysis and Description Forms
Task Analysis and Description Forms
Task/Data Element Usage Matrix
Data Models
Entity-Attribute Lists
Data Definition Lists
Physical Database Specifications Document
3. Consider a database approach if one or more of the following conditions exist in
the user environment:
Multiple applications are to be supported by the system.
Multiple processes or activities use multiple data sources.
Multiple data sources are used in the reports produced.
The data, from the data definitions, are known to be in existing
database(s).
The development effort is to enhance the capabilities of an existing
database.
4. If it appears that conditions would support database development, then
undertake the activities of logical database analysis and design. When the logical
schema and sub schemas are completed they are translated into their physical
counterparts. Then the physical sub schemas are supplied as part of the data
specifications for program design. The exact boundary between the last stages of
logical design and the first stages of physical analysis is difficult to assess because of
the lack of standard terminology. However, there seems to be general agreement that
logical design encompasses a DBMS-independent view of data and that physical
design results in a specification for the database structure, as it is to be physically
stored. The design step between these two that produces a schema that can be
processed by a DBMS is called implementation design.
5. Do not limit database development considerations to providing random access or
ad hoc query capabilities for the system. However, even if conditions appear to
support database development, postpone the decision to implement or not
implement a DBMS until after completing a thorough study of the current
environment. This study must clarify any alternatives that may or may not be
preferable to DBMS implementation.
Database planning is the management of activities that permit the stages of the database system
development life cycle to be realized as efficiently and effectively as possible.
Database planning must be integrated with the overall IS strategy of the organization.
Database Design
This is the process of creating a design that will support the enterprise’s mission
statement and mission objectives for the required database system. Two main
approaches to the design of a database are followed. These are:
bottom-up and
top-down
A more appropriate strategy for the design of complex databases is the top-down
approach, which starts with the development of data models that hold a few high-level
entities and relationships and then applies successive top-down refinements to identify
lower-level entities, relationships, and the associated attributes. The top-down
approach can be understood better using the concepts of the Entity-Relationship (ER)
model, beginning with the identification of the entities, and the relationships between
them, that are of interest to the organization.
Database Administration
A DBMS normally provides various utilities to aid database administration, including
utilities for loading data into a database and for monitoring the system. The utilities
that allow system monitoring give information on matters such as query execution
strategy. The Database Administrator (DBA) can use this information to tune the system
for better performance, for example by creating additional indexes to speed up queries,
by altering storage structures, or by combining or splitting tables.
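As a small illustration of this kind of tuning, the sketch below asks SQLite for its
query execution strategy before and after an index is created, using the EXPLAIN
QUERY PLAN statement through Python's sqlite3 module. The orders table and the index
name are hypothetical, and the exact wording of the plan text varies between SQLite
versions, so no output is shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")

def query_plan(sql):
    """Return the human-readable steps SQLite reports for a statement."""
    # Each EXPLAIN QUERY PLAN row ends with a text description of one step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

SQL = "SELECT total FROM orders WHERE customer = 'Ada'"

plan_before = query_plan(SQL)

# Tuning step: create an additional index to speed up lookups by customer.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

plan_after = query_plan(SQL)

print(plan_before)
print(plan_after)
```

After the index exists, the reported plan mentions idx_orders_customer, which shows
that the DBMS has switched to the new access path.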
The monitoring process continues throughout the life of a database system and in time
may lead to restructuring of the database for satisfying the changing requirements.
These changes ultimately provide information on the likely evolution of the system and
the future resources that may be needed. This, together with knowledge of proposed
new applications, enables the DBA to engage in capacity planning and to notify or alert
senior staff to adjust plans accordingly. If the DBMS lacks certain utilities, the
DBA can either develop the required utilities in-house or purchase additional vendor
tools based on the requirement.
Overview
Database design theory is a topic that many people avoid learning for lack of time. Many
others attempt to learn it, but give up because of the dry, academic treatment it is
usually given by most authors and teachers. But if creating databases is part of your job,
then you're treading on thin ice if you don't have a good solid understanding of
relational database design theory.
This article begins with an introduction to relational database design theory, including a
discussion of keys, relationships, integrity rules, and the often-dreaded "Normal
Forms." Following the theory, I present a practical step-by-step approach to good
database design.
The relational database model was conceived by E. F. Codd in 1969, then a researcher at
IBM. The model is based on branches of mathematics called set theory and predicate
logic. The basic idea behind the relational model is that a database consists of a series of
unordered tables (or relations) that can be manipulated using non-procedural
operations that return tables. This model was in vast contrast to the more traditional
database theories of the time that were much more complicated, less flexible and
dependent on the physical storage methods of the data.
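The idea of unordered tables manipulated by operations that return tables can be
sketched in plain Python. The Employee relation and the select and project helpers
below are hypothetical illustrations of two such operations, not a full relational
algebra.

```python
from typing import NamedTuple

class Employee(NamedTuple):
    """One row of the hypothetical Employee relation."""
    emp_id: int
    name: str
    dept: str

# A relation is an unordered set of rows, so a frozenset is a natural fit.
employees = frozenset({
    Employee(1, "Ada", "Sales"),
    Employee(2, "Bob", "IT"),
    Employee(3, "Cy", "Sales"),
})

def select(relation, predicate):
    """Restriction: keep the rows that satisfy the predicate."""
    return frozenset(t for t in relation if predicate(t))

def project(relation, *attributes):
    """Projection: keep only the named columns; duplicate rows collapse."""
    return frozenset(tuple(getattr(t, a) for a in attributes) for t in relation)

sales = select(employees, lambda t: t.dept == "Sales")
departments = project(employees, "dept")
print(sorted(sales), sorted(departments))
```

Each operation takes a relation and returns a relation, so results can be fed
straight into further operations.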
Note: It is commonly thought that the word relational in the relational model comes
from the fact that you relate together tables in a relational database. Although this is a
convenient way to think of the term, it's not accurate. Instead, the word relational has its
roots in the terminology that Codd used to define the relational model. The table in
Codd's writings was actually referred to as a relation (a related set of information). In
fact, Codd (and other relational database theorists) use the terms relations, attributes
and tuples where most of us use the more common terms tables, columns and rows,
respectively (or the more physical—and thus less preferable for discussions of database
design theory—files, fields and records).
The relational model can be applied to both databases and database management
systems (DBMS) themselves. The relational fidelity of database programs can be
compared using Codd's 12 rules (since Codd's seminal paper on the relational model, the
number of rules has been expanded to more than 300) for determining how DBMS products
conform to the relational model. When compared with other database management
programs, Microsoft Access fares quite well in terms of relational fidelity. Still, it has a
long way to go before it meets all twelve rules completely.
Fortunately, you don't have to wait until Microsoft Access is perfect in a relational sense
before you can benefit from the relational model. The relational model can also be
applied to the design of databases, which is the subject of the remainder of this article.
When designing a database, you have to make decisions regarding how best to take
some system in the real world and model it in a database. This consists of deciding
which tables to create, what columns they will contain, as well as the relationships
between the tables. While it would be nice if this process was totally intuitive and
obvious, or even better automated, this is simply not the case. A well-designed database
takes time and effort to conceive, build and refine.
The benefits of a database that has been designed according to the relational model are
numerous. Some of them are:
Since much of the information is stored in the database rather than in the
application, the database is somewhat self-documenting.
The goal of this article is to explain the basic principles behind relational database
design and demonstrate how to apply these principles when designing a database using
Microsoft Access. This article is by no means comprehensive and certainly not
definitive. Many books have been written on database design theory; in fact, many
careers have been devoted to its study. Instead, this article is meant as an informal
introduction to database design theory for the database developer.
Note: While the examples in this article are centered around Microsoft Access
databases, the discussion also applies to database development using the Microsoft
Visual Basic® programming system, the Microsoft FoxPro® database management
system, and the Microsoft SQL Server™ client-server database management system.
3.1 E-R Model
One of the most difficult aspects of database design is that designers, programmers,
and end users tend to view data and its use in different ways. Unless everyone involved
gains a common understanding that reflects how the enterprise operates, the design you
produce will fail to meet the users' requirements. To ensure that you get a precise
understanding of the nature of the data and how it is used by the enterprise, you need a
common model for communication that is non-technical, free of ambiguities, and easily
readable by both technical and non-technical members. The ER (Entity Relationship)
Model was developed for this purpose, and it is represented by the ER diagram. In this
chapter you will learn about the ER diagram and how it works.
ER modeling is a top-down approach to database design that begins with identifying the
important data, called entities, and the relationships between the data that must be
represented in the model. Database designers can then add more details, such as the
information they want to hold about the entities and relationships (the attributes) and
any constraints on the entities, relationships, and attributes. ER modeling is an
important technique for any database designer to master and forms the basis of the
methodology.
• Entity type: It is a group of objects with the same properties that are identified by the
enterprise as having an independent existence. The basic concept of the ER
model is the entity type that is used to represent a group of ‘objects’ in the ‘real
world’ with the same properties. An entity type has an independent existence
within a database.
• Entity occurrence: A uniquely identifiable object of an entity type.
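The distinction between an entity type and its occurrences maps loosely onto a class
and its instances. The sketch below is only an analogy; the Staff type and its
properties are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Staff:
    """Entity type: every Staff object has the same set of properties."""
    staff_no: str   # uniquely identifies an occurrence
    name: str
    position: str

# Entity occurrences: uniquely identifiable objects of the Staff entity type.
occurrences = [
    Staff("S1", "Ada Obi", "Manager"),
    Staff("S2", "Bob Eze", "Assistant"),
]

# Index the occurrences by their identifier to confirm each one is unique.
by_staff_no = {s.staff_no: s for s in occurrences}
print(by_staff_no["S2"])
```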
Diagrammatic Representation of Entity Types
Each entity type is shown as a rectangle labeled with the name of the entity, which is
normally a singular noun.
What is Relationship Type?
A relationship type is a set of associations between one or more participating entity
types. Each relationship type is given a name that describes its function.
Here is a diagram showing how relationships are formed in a database.
Note that multi-valued attributes are represented using a double ellipse.
Relationships
Relationships are represented by a diamond-shaped box. All the entities (rectangles)
participating in a relationship are connected to it by lines.
Figure 1. The best choice for primary key for tblCustomer would be CustomerId.
Candidate keys for tblCustomer might include CustomerId, (LastName + FirstName),
Phone#, (Address, City, State), and (Address + ZipCode). Following Pascal's guidelines,
you would rule out the last three candidates because addresses and phone numbers can
change fairly frequently. The choice between CustomerId and the name composite key is
less obvious and would involve tradeoffs. How likely would a customer's name change
(e.g., marriages cause names to change)? Will misspelling of names be common? How
likely will two customers have the same first and last names? How familiar will
CustomerId be to users? There's no right answer, but most developers favor numeric
primary keys because names do sometimes change and because searches and sorts on
numeric columns are more efficient than those on text columns in Microsoft Access (and
most other databases).
Counter columns in Microsoft Access make good primary keys, especially when you're
having trouble coming up with good candidate keys and no existing arbitrary
identification number is already in place. Don't use a counter column if you'll sometimes
need to renumber the values (you won't be able to) or if you require an alphanumeric
code (Microsoft Access supports only long integer counter values). Also, counter
columns only make sense for tables on the one side of a one-to-many relationship (see
the discussion of relationships in the next section).
Note: In many situations, it is best to use some sort of arbitrary static whole number
(e.g., employee ID, order ID, a counter column, etc.) as a primary key rather than a
descriptive text column. This avoids the problem of misspellings and name changes.
Also, don't use real numbers as primary keys since they are inexact.
Foreign Keys and Domains
Although primary keys are a function of individual tables, if you created databases that
consisted of only independent and unrelated tables, you'd have little need for them.
Primary keys become essential, however, when you start to create relationships that join
together multiple tables in a database. A foreign key is a column in a table used to
reference a primary key in another table.
Continuing the example presented in the last section, let's say that you choose
CustomerId as the primary key for tblCustomer. Now define a second table, tblOrder, as
shown in Figure 2.
One-to-Many Relationships
Two tables are related in a one-to-many (1—M) relationship if for every row in the first
table, there can be zero, one, or many rows in the second table, but for every row in the
second table there is exactly one row in the first table. For example, each order for a
pizza delivery business can have multiple items. Therefore, tblOrder is related to
tblOrderDetails in a one-to-many relationship (see Figure 4). The one-to-many
relationship is also referred to as a parent-child or master-detail relationship. One-to-
many relationships are the most commonly modeled relationship.
Figure 4. There can be many detail lines for each order in the pizza delivery business, so
tblOrder and tblOrderDetails are related in a one-to-many relationship.
One-to-many relationships are also used to link base tables to information stored in
lookup tables. For example, tblPatient might have a short one-letter DischargeDiagnosis
code, which can be linked to a lookup table, tlkpDiagCode, to get more complete
Diagnosis descriptions (stored in DiagnosisName). In this case, tlkpDiagCode is related
to tblPatient in a one-to-many relationship (i.e., one row in the lookup table can be used
in zero or more rows in the patient table).
Many-to-Many Relationships
Two tables are related in a many-to-many (M—M) relationship when for every row in
the first table, there can be many rows in the second table, and for every row in the
second table, there can be many rows in the first table. Many-to-many relationships
can't be directly modeled in relational database programs, including Microsoft Access.
These types of relationships must be broken into multiple one-to-many relationships.
For example, a patient may be covered by multiple insurance plans and a given
insurance company covers multiple patients. Thus, the tblPatient table in a medical
database would be related to the tblInsurer table in a many-to-many relationship. In
order to model the relationship between these two tables, you would create a third,
linking table, perhaps called tblPtInsurancePgm that would contain a row for each
insurance program under which a patient was covered (see Figure 5). Then, the many-
to-many relationship between tblPatient and tblInsurer could be broken into two one-
to-many relationships (tblPatient would be related to tblPtInsurancePgm and tblInsurer
would be related to tblPtInsurancePgm in one-to-many relationships).
Figure 5. A linking table, tblPtInsurancePgm, is used to model the many-to-many
relationship between tblPatient and tblInsurer.
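The linking-table idea can be sketched with Python's standard sqlite3 module. This uses SQLite rather than Microsoft Access, an assumption made only so the example runs anywhere. Only the three table names come from the text; the column names (PatientId, PatientName, InsurerId, InsurerName) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPatient (PatientId INTEGER PRIMARY KEY, PatientName TEXT NOT NULL);
CREATE TABLE tblInsurer (InsurerId INTEGER PRIMARY KEY, InsurerName TEXT NOT NULL);
-- Linking table: one row per insurance program covering a patient.
CREATE TABLE tblPtInsurancePgm (
    PatientId INTEGER NOT NULL REFERENCES tblPatient(PatientId),
    InsurerId INTEGER NOT NULL REFERENCES tblInsurer(InsurerId),
    PRIMARY KEY (PatientId, InsurerId)
);
INSERT INTO tblPatient VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO tblInsurer VALUES (10, 'Acme Health'), (20, 'Zenith Care');
INSERT INTO tblPtInsurancePgm VALUES (1, 10), (1, 20), (2, 10);
""")

# One patient, many insurers: follow the link from patient 1 to tblInsurer.
insurers_for_ann = [row[0] for row in conn.execute("""
    SELECT i.InsurerName
    FROM tblPtInsurancePgm AS l
    JOIN tblInsurer AS i ON i.InsurerId = l.InsurerId
    WHERE l.PatientId = 1
    ORDER BY i.InsurerName""")]

# One insurer, many patients: follow the link from insurer 10 to tblPatient.
patients_of_acme = [row[0] for row in conn.execute("""
    SELECT p.PatientName
    FROM tblPtInsurancePgm AS l
    JOIN tblPatient AS p ON p.PatientId = l.PatientId
    WHERE l.InsurerId = 10
    ORDER BY p.PatientName""")]

print(insurers_for_ann, patients_of_acme)
```

Each side of the many-to-many relationship is reached through an ordinary one-to-many join against the linking table.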
In Microsoft Access, you specify relationships using the Edit | Relationships command.
In addition, you can create ad-hoc relationships at any point, using queries.
3.3 Normalization
As mentioned earlier, when designing databases you are faced with a series of choices.
How many tables will there be and what will they represent? Which columns will go in
which tables? What will the relationships between the tables be? The answer to each of
these questions lies in something called normalization. Normalization is the process of
simplifying the design of a database so that it achieves the optimum structure.
Normalization theory gives us the concept of normal forms to assist in achieving the
optimum structure. The normal forms are a linear progression of rules that you apply to
your database, with each higher normal form achieving a better, more efficient design.
The normal forms are:
First Normal Form
Second Normal Form
Third Normal Form
Boyce Codd Normal Form
Fourth Normal Form
Fifth Normal Form
Normalization is a way of structuring data so that it can be queried and manipulated using a
"universal data sub-language" grounded in first-order logic (a phrase quoted from Codd) and so
that modification anomalies are prevented.
Purpose of Normalization
In plain English:
1. To reduce the space needed to describe your data. (don’t duplicate the Countries of the world,
just use a number to represent each one)
2. To prevent arbitrary and artificial data absurdities between related items (meals contain
food, “food” does not contain meals).
3. To restrict how many of something can be related to something else. (a student can take
many classes, a class can have many students, a class may only have one primary teacher, an
exam may only have one grade, and so on)
Melvin 32 Marketing
Melvin 32 Sales
1 Monitor Apple
2 Monitor Samsung
3 Scanner HP
1 Monitor
2 Scanner
3 Head phone
Brand table:
brandID  brand
1        Apple
2        Samsung
3        HP
4        JBL
Products Brand table:
pbID  productID  brandID
1     1          1
2     1          2
3     2          3
4     3          4
Decomposition
1. Lossy Decomposition
2. Lossless Join Decomposition
Lossy Decomposition :
"The decomposition of relation R into R1 and R2 is lossy when the join of R1 and R2 does
not yield the same relation as in R."
One of the disadvantages of decomposition into two or more relational schemes (or
tables) is that some information may be lost when the original relation or table is
reconstructed.
Consider that we have table STUDENT with three attributes: roll_no, sname and
department.
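STUDENT:
Roll_no  Sname    Dept
111      parimal  COMPUTER
222      parimal  ELECTRICAL
This relation is decomposed into two relations, No_name and Name_dept:
No_name:
Roll_no  Sname
111      parimal
222      parimal
Name_dept:
Sname    Dept
parimal  COMPUTER
parimal  ELECTRICAL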
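In lossy decomposition, spurious tuples are generated when a natural join is applied to
the relations in the decomposition. Joining No_name and Name_dept on Sname gives:
stu_joined :
Roll_no  Sname    Dept
111      parimal  COMPUTER
111      parimal  ELECTRICAL
222      parimal  COMPUTER
222      parimal  ELECTRICAL
Lossless Join Decomposition :
"The decomposition of relation R into R1 and R2 is lossless when the join of R1 and R2
yields the same relation as in R."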
A relational table is decomposed (or factored) into two or more smaller tables in such a
way that the designer can recover the precise content of the original table by joining
the decomposed parts. This is called lossless-join (or non-additive join) decomposition.
The lossless-join decomposition is always defined with respect to a specific set F of
dependencies.
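Consider that we have table STUDENT with three attributes: roll_no, sname and
department.
STUDENT :
Roll_no  Sname    Dept
111      parimal  COMPUTER
222      parimal  ELECTRICAL
This relation is decomposed into two relations, Stu_name and Stu_dept:
Stu_name:
Roll_no  Sname
111      parimal
222      parimal
Stu_dept :
Roll_no  Dept
111      COMPUTER
222      ELECTRICAL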
Now, when these two relations are joined on the common column 'roll_no', the resultant
relation will look like stu_joined.
stu_joined :
Roll_no  Sname    Dept
111      parimal  COMPUTER
222      parimal  ELECTRICAL
In lossless decomposition, no spurious tuples are generated when a natural join is
applied to the relations in the decomposition.
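The difference between the two decompositions can be checked mechanically. The sketch below assumes SQLite (through Python's sqlite3 module) as the engine and lower-cases the table names from the STUDENT example. It builds both decompositions as projections of the original table and compares each natural join with the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (roll_no INTEGER, sname TEXT, dept TEXT);
INSERT INTO student VALUES (111, 'parimal', 'COMPUTER'),
                           (222, 'parimal', 'ELECTRICAL');
-- Lossy split: the shared column sname does not identify a student.
CREATE TABLE no_name   AS SELECT roll_no, sname FROM student;
CREATE TABLE name_dept AS SELECT sname, dept FROM student;
-- Lossless split: the shared column roll_no identifies a student.
CREATE TABLE stu_name AS SELECT roll_no, sname FROM student;
CREATE TABLE stu_dept AS SELECT roll_no, dept FROM student;
""")

original = set(conn.execute("SELECT roll_no, sname, dept FROM student"))
lossy = set(conn.execute(
    "SELECT roll_no, sname, dept FROM no_name NATURAL JOIN name_dept"))
lossless = set(conn.execute(
    "SELECT roll_no, sname, dept FROM stu_name NATURAL JOIN stu_dept"))

# Tuples that appear after the join but were never in STUDENT.
spurious = lossy - original
print(sorted(spurious))
```

The lossy join returns four tuples, two of them spurious, while the lossless join returns exactly the original two.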
Figure 6. tblOrder1 violates First Normal Form because the data stored in the Items
column is not atomic.
You'd have a difficult time retrieving information from this table, because too much
information is being stored in the Items field. Think how difficult it would be to create a
report that summarized purchases by item.
1NF also prohibits the presence of repeating groups, even if they are stored in composite
(multiple) columns. For example, the same table might be improved upon by replacing
the single Items column with six columns: Quant1, Item1, Quant2, Item2, Quant3,
Item3 (see Figure 7).
Figure 7. A better, but still flawed, version of the Orders table, tblOrder2. The repeating
groups of information violate First Normal Form.
While this design has divided the information into multiple fields, it's still problematic.
For example, how would you go about determining the quantity of hammers ordered by
all customers during a particular month? Any query would have to search all three Item
columns to determine if a hammer was purchased and then sum over the three quantity
columns. Even worse, what if a customer ordered more than three items in a single
order? You could always add additional columns, but where would you stop? Ten items,
twenty items? Say that you decided that a customer would never order more than
twenty-five items in any one order and designed the table accordingly. That means you
would be using 50 columns to store the item and quantity information per record, even
for orders that only involved one or two items. Clearly this is a waste of space. And
someday, someone would want to order more than 25 items.
Tables in 1NF do not have the problems of tables containing repeating groups. The table
in Figure 8, tblOrder3, is 1NF since each column contains one value and there are no
repeating groups of columns. In order to attain 1NF, I have added a column,
OrderItem#. The primary key of this table is a composite key made up of OrderId and
OrderItem#.
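A minimal sketch of a 1NF order table, assuming SQLite through Python's sqlite3 module. The column OrderItemNo stands in for OrderItem# (the # character would need quoting), and the Quantity and Item columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblOrder3 (
    OrderId     INTEGER NOT NULL,
    OrderItemNo INTEGER NOT NULL,   -- stands in for OrderItem#
    Quantity    INTEGER NOT NULL,
    Item        TEXT    NOT NULL,
    PRIMARY KEY (OrderId, OrderItemNo)   -- composite key
);
INSERT INTO tblOrder3 VALUES
    (1, 1, 5,  'hammer'),
    (1, 2, 3,  'screwdriver'),
    (2, 1, 2,  'hammer'),
    (2, 2, 10, 'nail box'),
    (2, 3, 1,  'saw'),
    (2, 4, 4,  'wrench');           -- a fourth item needs no new column
""")

# One atomic Item column means one simple filter, however many items an order has.
hammers = conn.execute(
    "SELECT SUM(Quantity) FROM tblOrder3 WHERE Item = 'hammer'").fetchone()[0]
items_in_order_2 = conn.execute(
    "SELECT COUNT(*) FROM tblOrder3 WHERE OrderId = 2").fetchone()[0]
print(hammers, items_in_order_2)
```

The hammer question that required searching three Item columns in tblOrder2 becomes a single WHERE clause here.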
Figure 13. Specifying a relationship with referential integrity between the tblCustomer
and tblOrder tables using the Edit | Relationships command. Updates of CustomerId in
tblCustomer will be cascaded to tblOrder. Deletions of rows in tblCustomer will be
disallowed if rows in tblOrder would be orphaned.
Note: When you wish to implement referential integrity in Microsoft Access, you must
perform one additional step outside of the Edit | Relationships dialog: in table design,
you must set the Required property for the foreign key column to Yes. Otherwise,
Microsoft Access will allow your users to enter a Null foreign key value, thus violating
strict referential integrity.
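The same rules can be expressed in SQL. The sketch below assumes SQLite through Python's sqlite3 module, not Microsoft Access. ON UPDATE CASCADE, ON DELETE RESTRICT, and NOT NULL play the roles of the cascade-update setting, the disallowed delete, and the Required property. The LastName column and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE tblCustomer (CustomerId INTEGER PRIMARY KEY, LastName TEXT NOT NULL);
CREATE TABLE tblOrder (
    OrderId    INTEGER PRIMARY KEY,
    CustomerId INTEGER NOT NULL          -- NOT NULL: no Null foreign keys
        REFERENCES tblCustomer(CustomerId)
        ON UPDATE CASCADE ON DELETE RESTRICT
);
INSERT INTO tblCustomer VALUES (1, 'Jones');
INSERT INTO tblOrder VALUES (100, 1);
""")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Changing the parent key is cascaded to the child row.
conn.execute("UPDATE tblCustomer SET CustomerId = 2 WHERE CustomerId = 1")
cascaded = conn.execute(
    "SELECT CustomerId FROM tblOrder WHERE OrderId = 100").fetchone()[0]

# Deleting the customer would orphan order 100, so it is refused.
delete_blocked = rejected("DELETE FROM tblCustomer WHERE CustomerId = 2")
# A Null foreign key is refused because the column is NOT NULL.
null_fk_blocked = rejected("INSERT INTO tblOrder VALUES (101, NULL)")
print(cascaded, delete_blocked, null_fk_blocked)
```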
This is a highly simplified explanation of database normalization, and the process can be
studied much more extensively. After working with databases for some time you will tend to
create normalized databases automatically, because it is logical and practical.
What is SQL?
Although SQL is an ANSI (American National Standards Institute) standard, there are
different versions of the SQL language.
However, to be compliant with the ANSI standard, they all support at least the major
commands (such as SELECT, UPDATE, DELETE, INSERT, WHERE) in a similar
manner.
Note: Most of the SQL database programs also have their own proprietary extensions in
addition to the SQL standard!
To build a web site that shows data from a database, you will need:
RDBMS
RDBMS stands for Relational Database Management System.
RDBMS is the basis for SQL, and for all modern database systems such as MS SQL
Server, IBM DB2, Oracle, MySQL, and Microsoft Access.
The data in RDBMS is stored in database objects called tables. A table is a collection of
related data entries and it consists of columns and rows.
Example
SELECT * FROM Customers;
Every table is broken up into smaller entities called fields. The fields in the Customers
table consist of CustomerID, CustomerName, ContactName, Address, City, PostalCode
and Country. A field is a column in a table that is designed to maintain specific
information about every record in the table.
A record, also called a row, is each individual entry that exists in a table. For example,
there are 91 records in the above Customers table. A record is a horizontal entity in a
table.
A column is a vertical entity in a table that contains all information associated with a
specific field in a table.
SQL Syntax
Properly defining the fields in a table is important to the overall optimization of your
database. You should use only the type and size of field you really need to use. For
example, do not define a field 10 characters wide, if you know you are only going to use
2 characters. These types of fields (or columns) are also referred to as data types, after
the type of data you will be storing in those fields.
MySQL uses many different data types broken into three categories −
Numeric
Date and Time
String Types.
INT − A normal-sized integer that can be signed or unsigned. If signed, the allowable range
is from -2147483648 to 2147483647. If unsigned, the allowable range is from 0 to
4294967295. You can specify a width of up to 11 digits.
TINYINT − A very small integer that can be signed or unsigned. If signed, the allowable
range is from -128 to 127. If unsigned, the allowable range is from 0 to 255. You can specify
a width of up to 4 digits.
SMALLINT − A small integer that can be signed or unsigned. If signed, the allowable range
is from -32768 to 32767. If unsigned, the allowable range is from 0 to 65535. You can
specify a width of up to 5 digits.
MEDIUMINT − A medium-sized integer that can be signed or unsigned. If signed, the
allowable range is from -8388608 to 8388607. If unsigned, the allowable range is from 0 to
16777215. You can specify a width of up to 9 digits.
BIGINT − A large integer that can be signed or unsigned. If signed, the allowable range is
from -9223372036854775808 to 9223372036854775807. If unsigned, the allowable range
is from 0 to 18446744073709551615. You can specify a width of up to 20 digits.
FLOAT(M,D) − A single-precision floating-point number. You can optionally define the
display length (M) and the number of decimals (D); if they are omitted, values are stored to
the limits of the hardware. With the FLOAT(p) form, a precision of 0 to 23 gives a 4-byte
single-precision column. UNSIGNED is accepted but only disallows negative values; it and the
(M,D) form are deprecated in recent MySQL versions.
DOUBLE(M,D) − A double-precision floating-point number stored in 8 bytes. The display
length (M) and the number of decimals (D) are optional. With the FLOAT(p) form, a precision
of 24 to 53 gives a double-precision column. REAL is a synonym for DOUBLE.
DECIMAL(M,D) − An exact fixed-point number, suited to values such as monetary amounts.
M is the total number of digits and D is the number of digits after the decimal point. Both
are optional; DECIMAL with no arguments is treated as DECIMAL(10,0). NUMERIC is a
synonym for DECIMAL.
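The integer ranges listed above follow directly from each type's storage size. The short Python sketch below assumes the usual MySQL storage sizes of 1, 2, 3, 4, and 8 bytes and two's-complement encoding. The helper name int_range is hypothetical.

```python
# Storage size in bytes of each MySQL integer type.
INT_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}

def int_range(type_name, signed=True):
    """Return (min, max) for an integer type of the given storage size."""
    bits = 8 * INT_BYTES[type_name]
    if signed:
        # One bit is used for the sign.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

for name in INT_BYTES:
    print(name, int_range(name), int_range(name, signed=False))
```

Choosing the smallest type whose range covers the expected values is the numeric counterpart of not defining a 10-character field for 2 characters.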
String Types
Although the numeric and date types are fun, most data you'll store will be in a string
format. This list describes the common string datatypes in MySQL.
CHAR(M) − A fixed-length string between 1 and 255 characters in length (for example
CHAR(5)), right-padded with spaces to the specified length when stored. Defining a length
is not required, but the default is 1.
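VARCHAR(M) − A variable-length string. Older MySQL versions limit M to 255 characters;
since MySQL 5.0.3 the length can be up to 65,535, subject to the maximum row size and the
character set. For example, VARCHAR(25). You must define a length when creating a
VARCHAR field.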
BLOB or TEXT − A field with a maximum length of 65535 characters. BLOBs are "Binary
Large Objects" and are used to store large amounts of binary data, such as images or other
types of files. Fields defined as TEXT also hold large amounts of data. The difference
between the two is that the sorts and comparisons on the stored data are case sensitive on
BLOBs and are not case sensitive in TEXT fields. You do not specify a length with BLOB
or TEXT.
TINYBLOB or TINYTEXT − A BLOB or TEXT column with a maximum length of 255
characters. You do not specify a length with TINYBLOB or TINYTEXT.
MEDIUMBLOB or MEDIUMTEXT − A BLOB or TEXT column with a maximum length
of 16777215 characters. You do not specify a length with MEDIUMBLOB or MEDIUMTEXT.
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1 1D
In this tutorial we will use the well-known Northwind sample database (included in MS
Access and MS SQL Server).
The table above contains five records (one for each customer) and seven columns
(CustomerID, CustomerName, ContactName, Address, City, PostalCode, and Country).
SQL Statements
Most of the actions you need to perform on a database are done with SQL statements.
The following SQL statement selects all the records in the "Customers" table:
Example
SELECT * FROM Customers;
In this tutorial we will teach you all about the different SQL statements.
Semicolon is the standard way to separate each SQL statement in database systems that
allow more than one SQL statement to be executed in the same call to the server.
In this tutorial, we will use semicolon at the end of each SQL statement.
SQL SELECT Statement
SELECT Syntax
SELECT column1, column2, ...
FROM table_name;
Here, column1, column2, ... are the field names of the table you want to select data from. If
you want to select all the fields available in the table, use the following syntax:
SELECT * FROM table_name;
Demo Database
Below is a selection from the "Customers" table in the Northwind sample database:
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1
Example
SELECT CustomerName, City FROM Customers;
SELECT * Example
The following SQL statement selects all the columns from the "Customers" table:
Example
SELECT * FROM Customers;
We've set a primary key (artistID), and ensured that some fields will not be
null/blank.
The DROP Command
The DROP command is used to drop a table from the database. When
dropped, all the data goes with it; however, for this lesson we are only
concerned with tweaking the structure.
The syntax for the command is quite simple, but very powerful!
Now if we want to drop our artist table (maybe we want to start over with a
new design), the following statements can be used:
We've now successfully updated the size of the columns and ensured that
the genre column is required.
Remove a Column
Like DROP TABLE, dropping a column/field will remove the data from that
field! Right now we are only concerned with the structure, but it is important
to remember to proceed with caution if you are dropping columns from a
live database! In the following example, we will remove the sub-genre we
created earlier:
SQL, 'Structured Query Language', is a programming language designed to
manage data stored in relational databases. SQL operates through simple,
declarative statements. This keeps data accurate and secure, and helps
maintain the integrity of databases, regardless of size.
SQL CREATE DATABASE Statement
The CREATE DATABASE statement is used to create a new SQL database.
Syntax
CREATE DATABASE databasename;
Example
CREATE DATABASE testDB;
SQL DROP DATABASE Statement
Syntax
DROP DATABASE databasename;
Example
DROP DATABASE testDB;
Tip: Make sure you have admin privilege before dropping any database. Once a
database is dropped, you can check it in the list of databases with the following
SQL command: SHOW DATABASES;
SQL CREATE TABLE Statement
Syntax
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
....
);
The column parameters specify the names of the columns of the table.
The datatype parameter specifies the type of data the column can hold (e.g.
varchar, integer, date, etc.).
Example
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
The LastName, FirstName, Address, and City columns are of type varchar and
will hold characters, and the maximum length for these fields is 255 characters.
Tip: The empty "Persons" table can now be filled with data with the
SQL INSERT INTO statement.
A copy of an existing table can also be created using CREATE TABLE.
The new table gets the same column definitions. All columns or specific columns
can be selected.
If you create a new table using an existing table, the new table will be filled
with the existing values from the old table.
Syntax
CREATE TABLE new_table_name AS
SELECT column1, column2,...
FROM existing_table_name
WHERE ....;
Description
You can also use the SQL CREATE TABLE AS statement to create a table from an existing
table by copying the existing table's columns.
It is important to note that when creating a table in this way, the new table will be populated with
the records from the existing table (based on the SELECT Statement).
Syntax
The syntax for the CREATE TABLE AS statement when copying all of the columns in SQL is:
CREATE TABLE new_table
AS (SELECT * FROM old_table);
Example
Let's look at an example that shows how to create a table by copying all columns from another
table.
For Example:
CREATE TABLE suppliers
AS (SELECT * FROM companies);
This would create a new table called suppliers that included all columns from
the companies table.
If there were records in the companies table, then the new suppliers table would also contain the
records selected by the SELECT statement.
Syntax
The syntax for the CREATE TABLE AS statement copying the selected columns is:
Example
Let's look at an example that shows how to create a table by copying selected columns from
another table.
For Example:
This would create a new table called suppliers, but the new table would only include the
specified columns from the companies table.
Again, if there were records in the companies table, then the new suppliers table would also
contain the records selected by the SELECT statement.
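Both forms of CREATE TABLE AS can be tried out with Python's sqlite3 module. This is a sketch under assumptions: SQLite is the engine, and the company_id, name, and city columns, the supplier_names table, and the sample rows are hypothetical. Only the suppliers and companies table names come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (company_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
INSERT INTO companies VALUES (1, 'Acme', 'Lagos'), (2, 'Globex', 'Accra');

-- Copy all columns and all rows.
CREATE TABLE suppliers AS SELECT * FROM companies;

-- Copy selected columns, and only the rows matching the WHERE clause.
CREATE TABLE supplier_names AS
    SELECT company_id, name FROM companies WHERE city = 'Lagos';
""")

suppliers = conn.execute(
    "SELECT * FROM suppliers ORDER BY company_id").fetchall()
# PRAGMA table_info returns one row per column; index 1 is the column name.
name_columns = [row[1] for row in conn.execute("PRAGMA table_info(supplier_names)")]
supplier_names = conn.execute("SELECT * FROM supplier_names").fetchall()
print(suppliers, name_columns, supplier_names)
```

In SQLite the copy carries over column names and data but not constraints such as PRIMARY KEY, so any constraints needed on the new table have to be declared separately.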
Syntax
The syntax for the CREATE TABLE AS statement copying columns from multiple tables is:
Example
Let's look at an example that shows how to create a table by copying selected columns from
multiple tables.
For Example:
SQL DROP TABLE Statement
The DROP TABLE statement is used to drop an existing table in a database.
Syntax
DROP TABLE table_name;
Note: Be careful before dropping a table. Deleting a table will result in loss of
complete information stored in the table!
Example
DROP TABLE Shippers;
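SQL TRUNCATE TABLE
The TRUNCATE TABLE statement is used to delete the data inside a table, but not
the table itself.
Syntax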
TRUNCATE TABLE table_name;
SQL ALTER TABLE Statement
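The ALTER TABLE statement is used to add, delete, or modify columns in an
existing table.
The ALTER TABLE statement is also used to add and drop various constraints on
an existing table.
To add a column to a table, use the following syntax:
ALTER TABLE table_name
ADD column_name datatype;
To delete a column from a table, use the following syntax (some database systems
don't allow deleting a column):
ALTER TABLE table_name
DROP COLUMN column_name;
To change the data type of a column in a table, use the following syntax.
SQL Server / MS Access:
ALTER TABLE table_name
ALTER COLUMN column_name datatype;
MySQL:
ALTER TABLE table_name
MODIFY COLUMN column_name datatype;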
ALTER TABLE Persons
ADD DateOfBirth date;
Notice that the new column, "DateOfBirth", is of type date and is going to hold a
date. The data type specifies what type of data the column can hold. For a
complete reference of all the data types available in MS Access, MySQL, and
SQL Server, go to our complete Data Types reference.
ALTER TABLE Persons
ALTER COLUMN DateOfBirth year;
Notice that the "DateOfBirth" column is now of type year and is going to hold a
year in a two- or four-digit format.
ALTER TABLE Persons
DROP COLUMN DateOfBirth;
The "Persons" table will now look like this:
SQL Constraints
SQL constraints are used to specify rules for data in a table.
Syntax
CREATE TABLE table_name (
column1 datatype constraint,
column2 datatype constraint,
column3 datatype constraint,
....
);
Constraints are used to limit the type of data that can go into a table. This
ensures the accuracy and reliability of the data in the table. If there is any
violation between the constraint and the data action, the action is aborted.
Constraints can be column level or table level. Column level constraints apply to
a column, and table level constraints apply to the whole table.
COMMANDS
ALTER TABLE
ALTER TABLE table_name ADD column datatype;
AND
SELECT column_name(s)
FROM table_name
WHERE column_1 = value_1
AND column_2 = value_2;
AS
SELECT column_name AS 'Alias'
FROM table_name;
AS is a keyword in SQL that allows you to rename a column or table using
an alias.
AVG
SELECT AVG(column_name)
FROM table_name;
AVG() is an aggregate function that returns the average value for a numeric
column.
BETWEEN
SELECT column_name(s)
FROM table_name
WHERE column_name BETWEEN value_1 AND value_2;
COUNT
SELECT COUNT(column_name)
FROM table_name;
CREATE TABLE
CREATE TABLE table_name (column_1 datatype, column_2 datatype, column_3 datatype);
DELETE
DELETE FROM table_name WHERE some_column = some_value;
DELETE statements are used to remove rows from a table.
GROUP BY
SELECT COUNT(*)
FROM table_name
GROUP BY column_name;
GROUP BY is a clause in SQL that is typically used with aggregate functions. It is
used in collaboration with the SELECT statement to arrange identical data
into groups.
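As an illustration of GROUP BY with an aggregate function, the sketch below counts customers per country. It assumes SQLite through Python's sqlite3 module and a cut-down, hypothetical Customers table with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerName TEXT, Country TEXT);
INSERT INTO Customers VALUES
    ('Customer A', 'UK'),
    ('Customer B', 'UK'),
    ('Customer C', 'Mexico');
""")

# Rows with the same Country fall into one group; COUNT(*) is computed per group.
counts = conn.execute("""
    SELECT Country, COUNT(*)
    FROM Customers
    GROUP BY Country
    ORDER BY Country""").fetchall()
print(counts)
```

Selecting the grouping column alongside the aggregate is what makes the result readable, since each count is labelled with its group.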
INNER JOIN
SELECT column_name(s) FROM table_1
JOIN table_2
ON table_1.column_name = table_2.column_name;
An inner join will combine rows from different tables if the join condition is
true.
INSERT
INSERT INTO table_name (column_1, column_2, column_3) VALUES (value_1, 'value_2', value_3);
LIKE
SELECT column_name(s)
FROM table_name
WHERE column_name LIKE pattern;
LIMIT
SELECT column_name(s)
FROM table_name
LIMIT number;
LIMIT is a clause that lets you specify the maximum number of rows the result
set will have.
MAX
SELECT MAX(column_name)
FROM table_name;
MIN
SELECT MIN(column_name)
FROM table_name;
OR
SELECT column_name
FROM table_name
WHERE column_name = value_1
OR column_name = value_2;
OR is an operator that filters the result set to only include rows where either
condition is true.
ORDER BY
SELECT column_name
FROM table_name
ORDER BY column_name ASC|DESC;
ORDER BY is a clause that indicates you want to sort the result set by a
particular column either alphabetically or numerically.
OUTER JOIN
SELECT column_name(s) FROM table_1
LEFT JOIN table_2
ON table_1.column_name = table_2.column_name;
An outer join will combine rows from different tables even if the join
condition is not met. Every row in the left table is returned in the result set,
and if the join condition is not met, then NULL values are used to fill in the
columns from the right table.
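The contrast between the two joins is easiest to see on a small data set. The sketch below assumes SQLite through Python's sqlite3 module and uses hypothetical Persons and Orders tables in which one person has no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (PersonID INTEGER PRIMARY KEY, LastName TEXT);
CREATE TABLE Orders  (OrderID INTEGER PRIMARY KEY, OrderNumber INTEGER, PersonID INTEGER);
INSERT INTO Persons VALUES (1, 'Hansen'), (2, 'Svendson'), (3, 'Pettersen');
-- Person 1 deliberately has no orders in this sample.
INSERT INTO Orders VALUES (1, 77895, 3), (2, 44678, 3), (3, 22456, 2);
""")

# Inner join: only persons with at least one matching order appear.
inner = conn.execute("""
    SELECT p.LastName, o.OrderNumber
    FROM Persons p
    JOIN Orders o ON p.PersonID = o.PersonID
    ORDER BY o.OrderID""").fetchall()

# Left outer join: every person appears; missing orders show up as NULL (None).
left = conn.execute("""
    SELECT p.LastName, o.OrderNumber
    FROM Persons p
    LEFT JOIN Orders o ON p.PersonID = o.PersonID
    ORDER BY p.PersonID, o.OrderID""").fetchall()
print(inner)
print(left)
```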
ROUND
SELECT ROUND(column_name, integer)
FROM table_name;
SELECT
SELECT column_name FROM table_name;
SELECT statements are used to fetch data from a database. Every query will
begin with SELECT.
SELECT DISTINCT
SELECT DISTINCT column_name FROM table_name;
SUM
SELECT SUM(column_name)
FROM table_name;
UPDATE
UPDATE table_name
SET some_column = some_value
WHERE some_column = some_value;
WHERE
SELECT column_name(s)
FROM table_name
WHERE column_name operator value;
WHERE is a clause that indicates you want to filter the result set to include
only rows where the following condition is true.
SQL PRIMARY KEY Constraint
The PRIMARY KEY constraint uniquely identifies each record in a table.
Primary keys must contain UNIQUE values, and cannot contain NULL values.
A table can have only one primary key, which may consist of single or multiple
fields.
MySQL:
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
PRIMARY KEY (ID)
);
SQL Server / Oracle / MS Access:
CREATE TABLE Persons (
ID int NOT NULL PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
To allow naming of a PRIMARY KEY constraint, and for defining a PRIMARY KEY
constraint on multiple columns, use the following SQL syntax:
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
CONSTRAINT PK_Person PRIMARY KEY (ID,LastName)
);
Note: In the example above there is only ONE PRIMARY KEY (PK_Person).
However, the VALUE of the primary key is made up of TWO COLUMNS (ID +
LastName).
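What a two-column primary key does and does not allow can be sketched as follows. This assumes SQLite through Python's sqlite3 module; the sample rows, including the ages, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (
    ID int NOT NULL,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    Age int,
    CONSTRAINT PK_Person PRIMARY KEY (ID, LastName)
);
INSERT INTO Persons VALUES (1, 'Hansen', 'Ola', 30);
-- Same ID but a different LastName: the (ID, LastName) pair is still unique.
INSERT INTO Persons VALUES (1, 'Svendson', 'Tove', 23);
""")

# Repeating an existing (ID, LastName) pair violates the primary key.
try:
    conn.execute("INSERT INTO Persons VALUES (1, 'Hansen', 'Kari', 20)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]
print(duplicate_rejected, row_count)
```

Uniqueness is enforced on the combination of columns, not on each column separately, which is why the second insert succeeds.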
SQL PRIMARY KEY on ALTER TABLE
To create a PRIMARY KEY constraint on the "ID" column when the table is
already created, use the following SQL:
ALTER TABLE Persons
ADD PRIMARY KEY (ID);
To allow naming of a PRIMARY KEY constraint, and for defining a PRIMARY KEY
constraint on multiple columns, use the following SQL syntax:
ALTER TABLE Persons
ADD CONSTRAINT PK_Person PRIMARY KEY (ID,LastName);
Note: If you use the ALTER TABLE statement to add a primary key, the primary
key column(s) must already have been declared to not contain NULL values
(when the table was first created).
MySQL:
ALTER TABLE Persons
DROP PRIMARY KEY;
SQL Server / Oracle / MS Access:
ALTER TABLE Persons
DROP CONSTRAINT PK_Person;
SQL FOREIGN KEY Constraint
A FOREIGN KEY is a key used to link two tables together.
A FOREIGN KEY is a field (or collection of fields) in one table that refers to the
PRIMARY KEY in another table.
The table containing the foreign key is called the child table, and the table
containing the candidate key is called the referenced or parent table.
"Persons" table:
PersonID  LastName   FirstName
1         Hansen     Ola
2         Svendson   Tove
3         Pettersen  Kari
"Orders" table:
OrderID  OrderNumber  PersonID
1        77895        3
2        44678        3
3        22456        2
4        24562        1
Notice that the "PersonID" column in the "Orders" table points to the "PersonID"
column in the "Persons" table.
The "PersonID" column in the "Persons" table is the PRIMARY KEY in the
"Persons" table.
The "PersonID" column in the "Orders" table is a FOREIGN KEY in the "Orders"
table.
The FOREIGN KEY constraint is used to prevent actions that would destroy links
between tables.
The FOREIGN KEY constraint also prevents invalid data from being inserted into
the foreign key column, because it has to be one of the values contained in the
table it points to.
MySQL:
CREATE TABLE Orders (
OrderID int NOT NULL,
OrderNumber int NOT NULL,
PersonID int,
PRIMARY KEY (OrderID),
FOREIGN KEY (PersonID) REFERENCES Persons(PersonID)
);
SQL Server / Oracle / MS Access:
CREATE TABLE Orders (
OrderID int NOT NULL PRIMARY KEY,
OrderNumber int NOT NULL,
PersonID int FOREIGN KEY REFERENCES Persons(PersonID)
);
To allow naming of a FOREIGN KEY constraint, and for defining a FOREIGN KEY
constraint on multiple columns, use the following SQL syntax:
CREATE TABLE Orders (
OrderID int NOT NULL,
OrderNumber int NOT NULL,
PersonID int,
PRIMARY KEY (OrderID),
CONSTRAINT FK_PersonOrder FOREIGN KEY (PersonID)
REFERENCES Persons(PersonID)
);
ALTER TABLE Orders
ADD FOREIGN KEY (PersonID) REFERENCES Persons(PersonID);
To allow naming of a FOREIGN KEY constraint, and for defining a FOREIGN KEY
constraint on multiple columns, use the following SQL syntax:
ALTER TABLE Orders
ADD CONSTRAINT FK_PersonOrder
FOREIGN KEY (PersonID) REFERENCES Persons(PersonID);
DROP a FOREIGN KEY Constraint
To drop a FOREIGN KEY constraint, use the following SQL:
MySQL:
ALTER TABLE Orders
DROP FOREIGN KEY FK_PersonOrder;
SQL Server / Oracle / MS Access:
ALTER TABLE Orders
DROP CONSTRAINT FK_PersonOrder;
( id number(5),
dept char(10),
age number(2),
salary number(10),
location char(10)
);
3. SQL UNIQUE Constraint
The UNIQUE constraint ensures that all values in a column are different.
Both the UNIQUE and PRIMARY KEY constraints provide a guarantee for
uniqueness for a column or set of columns.
However, you can have many UNIQUE constraints per table, but only one
PRIMARY KEY constraint per table.
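A small sketch of a table that carries one PRIMARY KEY and an additional UNIQUE constraint, assuming Python's built-in sqlite3 module with an in-memory SQLite database (SQLite is an assumption, not a system named in this text). The constraint name UC_Name and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Persons (
        ID int NOT NULL,
        LastName varchar(255) NOT NULL,
        FirstName varchar(255),
        Age int,
        PRIMARY KEY (ID),
        CONSTRAINT UC_Name UNIQUE (LastName, FirstName)
    )
""")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Persons VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, "Hansen", "Ola", 30)),      # accepted
    try_insert((2, "Hansen", "Tove", 23)),     # accepted: same LastName, different FirstName
    try_insert((3, "Hansen", "Ola", 41)),      # rejected by the UNIQUE constraint
    try_insert((1, "Pettersen", "Kari", 20)),  # rejected by the PRIMARY KEY
]
print(results)
```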
SQL Server / Oracle / MS Access:
CREATE TABLE Persons (
ID int NOT NULL UNIQUE,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
MySQL:
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
UNIQUE (ID)
);
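To create a UNIQUE constraint on the "ID" column when the table already exists,
use the following SQL:
ALTER TABLE Persons
ADD UNIQUE (ID);
To name a UNIQUE constraint, and to define a UNIQUE constraint on multiple
columns, use the following SQL syntax:
ALTER TABLE Persons
ADD CONSTRAINT UC_Person UNIQUE (ID,LastName);
To drop a UNIQUE constraint, use the following SQL:
MySQL:
ALTER TABLE Persons
DROP INDEX UC_Person;
SQL Server / Oracle / MS Access:
ALTER TABLE Persons
DROP CONSTRAINT UC_Person;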
4. SQL CHECK Constraint
The CHECK constraint is used to limit the value range that can be placed in a
column.
If you define a CHECK constraint on a single column it allows only certain values
for this column.
If you define a CHECK constraint on a table it can limit the values in certain
columns based on values in other columns in the row.
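The sketch below shows a table-level CHECK that involves two columns. It assumes Python's built-in sqlite3 module with an in-memory SQLite database (an assumption, not a system named in this text) and a "Persons" table that includes a "City" column. The helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Persons (
        ID int NOT NULL,
        LastName varchar(255) NOT NULL,
        FirstName varchar(255),
        Age int,
        City varchar(255),
        CONSTRAINT CHK_PersonAge CHECK (Age>=18 AND City='Sandnes')
    )
""")


def try_insert(row):
    """Return True if the row passes the CHECK constraint, False otherwise."""
    try:
        conn.execute("INSERT INTO Persons VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, "Hansen", "Ola", 30, "Sandnes")),     # accepted
    try_insert((2, "Svendson", "Tove", 17, "Sandnes")),  # rejected: Age is below 18
    try_insert((3, "Pettersen", "Kari", 40, "Oslo")),    # rejected: City is not 'Sandnes'
]
print(results)
```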
MySQL:
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
CHECK (Age>=18)
);
SQL Server / Oracle / MS Access:
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int CHECK (Age>=18)
);
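To create a CHECK constraint on the "Age" column when the table already exists,
use the following SQL:
ALTER TABLE Persons
ADD CHECK (Age>=18);
To name a CHECK constraint, and to define a CHECK constraint on multiple
columns, use the following SQL syntax (this assumes the "Persons" table also has
a "City" column):
ALTER TABLE Persons
ADD CONSTRAINT CHK_PersonAge CHECK (Age>=18 AND City='Sandnes');
To drop a CHECK constraint, use the following SQL:
SQL Server / Oracle / MS Access:
ALTER TABLE Persons
DROP CONSTRAINT CHK_PersonAge;
MySQL:
ALTER TABLE Persons
DROP CHECK CHK_PersonAge;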
The DEFAULT constraint is used to set a default value for a column.
The default value will be added to all new records if no other value is specified.
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
City varchar(255) DEFAULT 'Sandnes'
);
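A minimal sketch of the DEFAULT behaviour, assuming Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text). The table is a shortened version of "Persons".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Persons (
        ID int NOT NULL,
        LastName varchar(255) NOT NULL,
        City varchar(255) DEFAULT 'Sandnes'
    )
""")
# No City given: the default value is stored.
conn.execute("INSERT INTO Persons (ID, LastName) VALUES (1, 'Hansen')")
# City given explicitly: the default is not used.
conn.execute("INSERT INTO Persons (ID, LastName, City) VALUES (2, 'Svendson', 'Oslo')")

cities = [row[0] for row in conn.execute("SELECT City FROM Persons ORDER BY ID")]
print(cities)
```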
The DEFAULT constraint can also be used to insert system values, by using
functions such as GETDATE() (a SQL Server function):
CREATE TABLE Orders (
ID int NOT NULL,
OrderNumber int NOT NULL,
OrderDate date DEFAULT GETDATE()
);
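To create a DEFAULT constraint on the "City" column when the table already exists,
use the following SQL:
MySQL:
ALTER TABLE Persons
ALTER City SET DEFAULT 'Sandnes';
Standard SQL form:
ALTER TABLE Persons
ALTER COLUMN City SET DEFAULT 'Sandnes';
Oracle:
ALTER TABLE Persons
MODIFY City DEFAULT 'Sandnes';
To drop a DEFAULT constraint, use the following SQL:
MySQL:
ALTER TABLE Persons
ALTER City DROP DEFAULT;
Standard SQL form:
ALTER TABLE Persons
ALTER COLUMN City DROP DEFAULT;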
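Auto-increment allows a unique number to be generated automatically when a new
record is inserted into a table.
Often this is the primary key field that we would like to be created automatically
every time a new record is inserted.
In MySQL, to let the AUTO_INCREMENT sequence start with another value, use the
following SQL statement: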
ALTER TABLE Persons AUTO_INCREMENT=100;
To insert a new record into the "Persons" table, we will NOT have to specify a
value for the "ID" column (a unique value will be added automatically):
INSERT INTO Persons (FirstName,LastName)
VALUES ('Lars','Monsen');
The SQL statement above would insert a new record into the "Persons" table.
The "ID" column would be assigned a unique value. The "FirstName" column
would be set to "Lars" and the "LastName" column would be set to "Monsen".
SQL Server:
CREATE TABLE Persons (
ID int IDENTITY(1,1) PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
The MS SQL Server uses the IDENTITY keyword to perform an auto-increment
feature.
In the example above, the starting value for IDENTITY is 1, and it will increment
by 1 for each new record.
Tip: To specify that the "ID" column should start at value 10 and increment by
5, change it to IDENTITY(10,5).
MS Access (which uses the AUTOINCREMENT keyword):
CREATE TABLE Persons (
ID Integer PRIMARY KEY AUTOINCREMENT,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
Tip: To specify that the "ID" column should start at value 10 and increment by
5, change the autoincrement to AUTOINCREMENT(10,5).
The COMMIT command is the transactional command used to save the changes made by a
transaction to the database. The COMMIT command saves all changes made to the database
since the last COMMIT or ROLLBACK command.
The syntax for the COMMIT command is as follows.
COMMIT;
Example
Consider the CUSTOMERS table having the following records −
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Following is an example which would delete those records from the table
which have age = 25 and then COMMIT the changes in the database.
Thus, two rows from the table would be deleted and the SELECT statement
would produce the following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
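The delete-then-commit sequence described above can be sketched as follows. This assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text); conn.commit() issues the COMMIT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID int, NAME text, AGE int, ADDRESS text, SALARY real)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ramesh", 32, "Ahmedabad", 2000.00),
        (2, "Khilan", 25, "Delhi", 1500.00),
        (3, "kaushik", 23, "Kota", 2000.00),
        (4, "Chaitali", 25, "Mumbai", 6500.00),
        (5, "Hardik", 27, "Bhopal", 8500.00),
        (6, "Komal", 22, "MP", 4500.00),
        (7, "Muffy", 24, "Indore", 10000.00),
    ],
)
conn.commit()

conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 25")
conn.commit()    # COMMIT: the deletion is now permanent
conn.rollback()  # nothing is left to undo after the COMMIT

remaining_ids = [row[0] for row in conn.execute("SELECT ID FROM CUSTOMERS ORDER BY ID")]
print(remaining_ids)
```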
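The ROLLBACK command is the transactional command used to undo changes that have
not already been saved to the database. It can only undo changes made since the
last COMMIT or ROLLBACK command was issued.
Example
Consider the CUSTOMERS table having the following records: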
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Following is an example, which would delete those records from the table
which have the age = 25 and then ROLLBACK the changes in the database.
Thus, the delete operation would not impact the table and the SELECT
statement would produce the following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
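The delete-then-rollback sequence can be sketched in the same way. This assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text); conn.rollback() issues the ROLLBACK. Only the ID, NAME and AGE columns are kept to shorten the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID int, NAME text, AGE int)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?)",
    [
        (1, "Ramesh", 32), (2, "Khilan", 25), (3, "kaushik", 23), (4, "Chaitali", 25),
        (5, "Hardik", 27), (6, "Komal", 22), (7, "Muffy", 24),
    ],
)
conn.commit()

conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 25")
# Inside the open transaction the two rows are already gone.
ids_before_rollback = [row[0] for row in conn.execute("SELECT ID FROM CUSTOMERS ORDER BY ID")]
conn.rollback()  # ROLLBACK: the uncommitted deletion is undone
ids_after_rollback = [row[0] for row in conn.execute("SELECT ID FROM CUSTOMERS ORDER BY ID")]
print(ids_before_rollback, ids_after_rollback)
```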
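A SAVEPOINT is a point in a transaction to which you can roll the transaction back
without rolling back the entire transaction. The syntax for the SAVEPOINT command
is as follows.
SAVEPOINT SAVEPOINT_NAME;
This command serves only to create a SAVEPOINT among the transactional
statements. The ROLLBACK command is used to undo a group of transactions.
The syntax for rolling back to a SAVEPOINT is as follows.
ROLLBACK TO SAVEPOINT_NAME;
Following is an example where you plan to delete three different records
from the CUSTOMERS table. You want to create a SAVEPOINT before each
delete, so that you can ROLLBACK to any SAVEPOINT at any time to return
the appropriate data to its original state.
Example
Consider the CUSTOMERS table having the following records: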
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Now that the three deletions have taken place, let us assume that you have
changed your mind and decided to ROLLBACK to the SAVEPOINT that you
identified as SP2. Because SP2 was created after the first deletion, the last
two deletions are undone −
Notice that only the first deletion took place since you rolled back to SP2.
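The SP1/SP2/SP3 scenario above can be sketched as follows. This assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text). The connection is opened with isolation_level=None so that BEGIN, SAVEPOINT and COMMIT are issued explicitly, and only the ID and NAME columns are kept.

```python
import sqlite3

# isolation_level=None: no implicit transactions, we send BEGIN/COMMIT ourselves
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE CUSTOMERS (ID int, NAME text)")
cur.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?)",
    [(1, "Ramesh"), (2, "Khilan"), (3, "kaushik"), (4, "Chaitali"),
     (5, "Hardik"), (6, "Komal"), (7, "Muffy")],
)

cur.execute("BEGIN")
cur.execute("SAVEPOINT SP1")
cur.execute("DELETE FROM CUSTOMERS WHERE ID = 1")
cur.execute("SAVEPOINT SP2")
cur.execute("DELETE FROM CUSTOMERS WHERE ID = 2")
cur.execute("SAVEPOINT SP3")
cur.execute("DELETE FROM CUSTOMERS WHERE ID = 3")

cur.execute("ROLLBACK TO SP2")  # undoes the deletions made after SP2 (IDs 2 and 3)
cur.execute("COMMIT")           # keeps the first deletion (ID 1)

remaining_ids = [row[0] for row in cur.execute("SELECT ID FROM CUSTOMERS ORDER BY ID")]
print(remaining_ids)
```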
The RELEASE SAVEPOINT command is used to remove a SAVEPOINT that you have created.
Once a SAVEPOINT has been released, you can no longer use the
ROLLBACK command to undo transactions performed since the last
SAVEPOINT.
The SELECT statement is used to select data from a database.
SELECT Syntax
SELECT column1, column2, ...
FROM table_name;
Here, column1, column2, ... are the field names of the table you want to select
data from. If you want to select all the fields available in the table, use the
following syntax:
SELECT * FROM table_name;
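Demo Database
Below is a selection from the "Customers" table in the Northwind sample database:
CustomerID  CustomerName     ContactName   Address          City    PostalCode
4           Around the Horn  Thomas Hardy  120 Hanover Sq.  London  WA1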
SELECT * Example
The following SQL statement selects all the columns from the "Customers"
table:
Example
SELECT * FROM Customers;
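The ALTER TABLE statement is used to add, delete, or modify columns in an existing
table.
The ALTER TABLE statement is also used to add and drop various constraints on
an existing table.
To add a column to a table, use the following syntax:
ALTER TABLE table_name
ADD column_name datatype;
To delete a column in a table, use the following syntax (notice that some
database systems don't allow deleting a column):
ALTER TABLE table_name
DROP COLUMN column_name;
To change the data type of a column in a table, use the following syntax:
SQL Server / MS Access:
ALTER TABLE table_name
ALTER COLUMN column_name datatype;
MySQL / Oracle (prior to version 10G):
ALTER TABLE table_name
MODIFY COLUMN column_name datatype;
Oracle 10G and later:
ALTER TABLE table_name
MODIFY column_name datatype;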
To add a column named "DateOfBirth" to the "Persons" table, use the following SQL:
ALTER TABLE Persons
ADD DateOfBirth date;
Notice that the new column, "DateOfBirth", is of type date and is going to hold a
date. The data type specifies what type of data the column can hold.
To change the data type of the "DateOfBirth" column, use the following SQL:
ALTER TABLE Persons
ALTER COLUMN DateOfBirth year;
Notice that the "DateOfBirth" column is now of type year and is going to hold a
year in a two- or four-digit format.
To delete the "DateOfBirth" column from the "Persons" table, use the following SQL:
ALTER TABLE Persons
DROP COLUMN DateOfBirth;
The UPDATE statement is used to modify the existing records in a table.
UPDATE Syntax
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
Note: Be careful when updating records in a table! Notice the WHERE clause in
the UPDATE statement. The WHERE clause specifies which record(s) should
be updated. If you omit the WHERE clause, all records in the table will be
updated!
Demo Database
Below is a selection from the "Customers" table in the Northwind sample
database:
UPDATE Table
The following SQL statement updates the first customer (CustomerID = 1) with
a new contact person and a new city.
Example
UPDATE Customers
SET ContactName = 'Alfred Schmidt', City= 'Frankfurt'
WHERE CustomerID = 1;
The selection from the "Customers" table will now look like this:
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1
The following SQL statement will update the contactname to "Juan" for all
records where country is "Mexico":
Example
UPDATE Customers
SET ContactName='Juan'
WHERE Country='Mexico';
The selection from the "Customers" table will now look like this:
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1
Update Warning!
Be careful when updating records. If you omit the WHERE clause, ALL records
will be updated!
Example
UPDATE Customers
SET ContactName='Juan';
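The effect of the WHERE clause can be checked by counting the rows each UPDATE touches. The sketch assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text); the four sample rows are a shortened, illustrative "Customers" table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID int, ContactName text, City text, Country text)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [
        (1, "Maria Anders", "Berlin", "Germany"),
        (2, "Ana Trujillo", "México D.F.", "Mexico"),
        (3, "Antonio Moreno", "México D.F.", "Mexico"),
        (4, "Thomas Hardy", "London", "UK"),
    ],
)

# With a WHERE clause only the matching rows change.
cur = conn.execute("UPDATE Customers SET ContactName='Juan' WHERE Country='Mexico'")
rows_with_where = cur.rowcount

# Without a WHERE clause every row in the table changes.
cur = conn.execute("UPDATE Customers SET ContactName='Juan'")
rows_without_where = cur.rowcount

contact_names = {row[0] for row in conn.execute("SELECT ContactName FROM Customers")}
print(rows_with_where, rows_without_where, contact_names)
```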
4.2.6 Deleting the table rows
The DELETE statement is used to delete existing records in a table.
DELETE Syntax
DELETE FROM table_name
WHERE condition;
Note: Be careful when deleting records in a table! Notice the WHERE clause in
the DELETE statement. The WHERE clause specifies which record(s) should
be deleted. If you omit the WHERE clause, all records in the table will be
deleted!
Demo Database
Below is a selection from the "Customers" table in the Northwind sample
database:
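CustomerID  CustomerName        ContactName         Address          City    PostalCode
4           Around the Horn     Thomas Hardy        120 Hanover Sq.  London  WA1
5           Berglunds snabbköp  Christina Berglund  Berguvsvägen 8   Luleå   S-95
The following SQL statement deletes the customer "Alfreds Futterkiste" from the
"Customers" table: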
Example
DELETE FROM Customers
WHERE CustomerName='Alfreds Futterkiste';
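The "Customers" table will now look like this:
CustomerID  CustomerName        ContactName         Address          City    PostalCode
4           Around the Horn     Thomas Hardy        120 Hanover Sq.  London  WA1
5           Berglunds snabbköp  Christina Berglund  Berguvsvägen 8   Luleå   S-95
It is possible to delete all rows in a table without deleting the table itself.
The table structure, attributes, and indexes remain intact:
DELETE FROM table_name;
or, in MS Access:
DELETE * FROM table_name;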
4.3 Queries
4.4 Partial listing of Table contents
The SELECT DISTINCT statement is used to return only distinct (different) values.
Inside a table, a column often contains many duplicate values, and sometimes
you only want to list the different (distinct) values.
Demo Database
Below is a selection from the "Customers" table in the Northwind sample
database:
CustomerID  CustomerName     ContactName   Address          City    PostalCode  Country
4           Around the Horn  Thomas Hardy  120 Hanover Sq.  London  WA1 1DP     UK
SELECT Example
The following SQL statement selects all (and duplicate) values from the
"Country" column in the "Customers" table:
Example
SELECT Country FROM Customers;
Now, let us use the DISTINCT keyword with the above SELECT statement and
see the result.
Example
SELECT DISTINCT Country FROM Customers;
The following SQL statement lists the number of different (distinct) customer
countries:
Example
SELECT COUNT(DISTINCT Country) FROM Customers;
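The three queries above can be compared side by side in a short sketch. It assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text), with a one-column, illustrative "Customers" table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Country text)")
conn.executemany(
    "INSERT INTO Customers VALUES (?)",
    [("Germany",), ("Mexico",), ("Mexico",), ("UK",), ("Germany",)],
)

# Every value, duplicates included.
all_countries = [row[0] for row in conn.execute("SELECT Country FROM Customers")]
# Each value only once (sorted here so the order is predictable).
distinct_countries = [
    row[0] for row in conn.execute("SELECT DISTINCT Country FROM Customers ORDER BY Country")
]
# The number of different values.
distinct_count = conn.execute("SELECT COUNT(DISTINCT Country) FROM Customers").fetchone()[0]
print(len(all_countries), distinct_countries, distinct_count)
```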
Note: COUNT(DISTINCT column_name) is not supported in Microsoft Access databases.
Here is a workaround for MS Access:
Example
SELECT Count(*) AS DistinctCountries
FROM (SELECT DISTINCT Country FROM Customers);
4.5 Logical operators AND, OR and NOT
The WHERE clause can be combined with the AND, OR, and NOT operators.
The AND and OR operators are used to filter records based on more than one
condition:
The AND operator displays a record if all the conditions separated by AND
are TRUE.
The OR operator displays a record if any of the conditions separated by
OR is TRUE.
The NOT operator displays a record if the condition is NOT TRUE.
AND Syntax
SELECT column1, column2, ...
FROM table_name
WHERE condition1 AND condition2 AND condition3 ...;
OR Syntax
SELECT column1, column2, ...
FROM table_name
WHERE condition1 OR condition2 OR condition3 ...;
NOT Syntax
SELECT column1, column2, ...
FROM table_name
WHERE NOT condition;
Demo Database
Below is a selection from the "Customers" table in the Northwind sample
database:
AND Example
The following SQL statement selects all fields from "Customers" where country
is "Germany" AND city is "Berlin":
Example
SELECT * FROM Customers
WHERE Country='Germany' AND City='Berlin';
OR Example
The following SQL statement selects all fields from "Customers" where city is
"Berlin" OR "München":
Example
SELECT * FROM Customers
WHERE City='Berlin' OR City='München';
NOT Example
The following SQL statement selects all fields from "Customers" where country
is NOT "Germany":
Example
SELECT * FROM Customers
WHERE NOT Country='Germany';
Combining AND, OR and NOT
The following SQL statement selects all fields from "Customers" where country
is "Germany" AND city must be "Berlin" OR "München" (use parentheses to form
complex expressions):
Example
SELECT * FROM Customers
WHERE Country='Germany' AND (City='Berlin' OR City='München');
The following SQL statement selects all fields from "Customers" where country
is NOT "Germany" and NOT "USA":
Example
SELECT * FROM Customers
WHERE NOT Country='Germany' AND NOT Country='USA';
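The sketch below shows why the parentheses matter: AND is evaluated before OR, so the same conditions without parentheses select a different set of rows. It assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text). The row "Sample Trading" and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName text, City text, Country text)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("Alfreds Futterkiste", "Berlin", "Germany"),
        ("Frankenversand", "München", "Germany"),
        ("Around the Horn", "London", "UK"),
        ("Sample Trading", "Berlin", "USA"),  # hypothetical row that exposes the difference
    ],
)


def names(where):
    """Run a fixed WHERE clause (not user input) and return the matching names."""
    sql = "SELECT CustomerName FROM Customers WHERE " + where + " ORDER BY CustomerName"
    return [row[0] for row in conn.execute(sql)]


with_parens = names("Country='Germany' AND (City='Berlin' OR City='München')")
# Read as (Country='Germany' AND City='München') OR City='Berlin'
without_parens = names("Country='Germany' AND City='München' OR City='Berlin'")
neither = names("NOT Country='Germany' AND NOT Country='USA'")
print(with_parens, without_parens, neither)
```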
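SQL Arithmetic Operators
Operator   Description
+          Add
-          Subtract
*          Multiply
/          Divide
%          Modulo
SQL Bitwise Operators
Operator   Description
&          Bitwise AND
|          Bitwise OR
^          Bitwise exclusive OR
SQL Comparison Operators
Operator   Description
=          Equal to
>          Greater than
<          Less than
>=         Greater than or equal to
<=         Less than or equal to
<>         Not equal to
SQL Compound Operators
Operator   Description
+=         Add equals
-=         Subtract equals
*=         Multiply equals
/=         Divide equals
%=         Modulo equals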
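The INSERT INTO statement is used to insert new records in a table.
It is possible to write the INSERT INTO statement in two ways.
The first way specifies both the column names and the values to be inserted:
INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);
If you are adding values for all the columns of the table, you do not need to
specify the column names in the SQL query. However, make sure the order of
the values is in the same order as the columns in the table. The INSERT INTO
syntax would be as follows:
INSERT INTO table_name
VALUES (value1, value2, value3, ...);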
Demo Database
Below is a selection from the "Customers" table in the Northwind sample
database:
CustomerID  CustomerName          ContactName      Address                      City      PostalCode
89          White Clover Markets  Karl Jablonski   305 - 14th Ave. S. Suite 3B  Seattle   98128
90          Wilman Kala           Matti Karttunen  Keskuskatu 45                Helsinki  21240
The following SQL statement inserts a new record in the "Customers" table:
Example
INSERT INTO Customers (CustomerName, ContactName, Address, City,
PostalCode, Country)
VALUES ('Cardinal', 'Tom B. Erichsen', 'Skagen 21', 'Stavanger', '4006', 'Norway');
The selection from the "Customers" table will now look like this:
89 White Clover Markets Karl Jablonski 305 - 14th Ave. S. Suite 3B Seattle 98128
Did you notice that we did not insert any number into the CustomerID
field?
The CustomerID column is an auto-increment field and will be generated
automatically when a new record is inserted into the table.
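A minimal sketch of an auto-generated key, assuming Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text). The shortened "Customers" table and its second row are illustrative. In SQLite an INTEGER PRIMARY KEY AUTOINCREMENT column plays the role of the auto-increment field, and cursor.lastrowid reports the value that was generated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,
        CustomerName text,
        City text,
        Country text
    )
""")

insert_sql = "INSERT INTO Customers (CustomerName, City, Country) VALUES (?, ?, ?)"
# No CustomerID is supplied in either INSERT.
first_id = conn.execute(insert_sql, ("Cardinal", "Stavanger", "Norway")).lastrowid
second_id = conn.execute(insert_sql, ("Wilman Kala", "Helsinki", "Finland")).lastrowid
print(first_id, second_id)
```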
The following SQL statement will insert a new record, but only insert data in the
"CustomerName", "City", and "Country" columns (CustomerID will be updated
automatically):
Example
INSERT INTO Customers (CustomerName, City, Country)
VALUES ('Cardinal', 'Stavanger', 'Norway');
The selection from the "Customers" table will now look like this:
89 White Clover Markets Karl Jablonski 305 - 14th Ave. S. Suite 3B Seattle 98128
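The DROP TABLE statement is used to drop an existing table in a database.
Syntax
DROP TABLE table_name;
Note: Be careful before dropping a table. Deleting a table results in the loss of
all information stored in the table!
Example
The following SQL statement drops the existing table "Shippers":
DROP TABLE Shippers;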
The TRUNCATE TABLE statement is used to delete the data inside a table, but not
the table itself.
Syntax
TRUNCATE TABLE table_name;
The PRIMARY KEY constraint uniquely identifies each record in a table.
Primary keys must contain UNIQUE values, and cannot contain NULL values.
A table can have only one primary key, which may consist of a single field or
multiple fields.
MySQL:
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
PRIMARY KEY (ID)
);
SQL Server / Oracle / MS Access:
CREATE TABLE Persons (
ID int NOT NULL PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
To allow naming of a PRIMARY KEY constraint, and for defining a PRIMARY KEY
constraint on multiple columns, use the following SQL syntax:
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
CONSTRAINT PK_Person PRIMARY KEY (ID,LastName)
);
Note: In the example above there is only ONE PRIMARY KEY (PK_Person).
However, the VALUE of the primary key is made up of TWO COLUMNS (ID +
LastName).
To create a PRIMARY KEY constraint on the "ID" column when the table already
exists, use the following SQL:
ALTER TABLE Persons
ADD PRIMARY KEY (ID);
To allow naming of a PRIMARY KEY constraint, and for defining a PRIMARY KEY
constraint on multiple columns, use the following SQL syntax:
ALTER TABLE Persons
ADD CONSTRAINT PK_Person PRIMARY KEY (ID,LastName);
Note: If you use the ALTER TABLE statement to add a primary key, the primary
key column(s) must already have been declared to not contain NULL values
(when the table was first created).
To drop a PRIMARY KEY constraint, use the following SQL:
MySQL:
ALTER TABLE Persons
DROP PRIMARY KEY;
SQL Server / Oracle / MS Access:
ALTER TABLE Persons
DROP CONSTRAINT PK_Person;
A FOREIGN KEY is a field (or collection of fields) in one table that refers to the
PRIMARY KEY in another table.
The table containing the foreign key is called the child table, and the table
containing the primary key is called the referenced or parent table.
Look at the following two tables:
"Persons" table:
PersonID   LastName    FirstName
1          Hansen      Ola
2          Svendson    Tove
3          Pettersen   Kari
"Orders" table:
OrderID   OrderNumber   PersonID
1         77895         3
2         44678         3
3         22456         2
4         24562         1
4.8: More Complex Queries and SQL functions
4.8.1 Ordering a listing
The ORDER BY keyword is used to sort the result-set in ascending or descending
order.
The ORDER BY keyword sorts the records in ascending order by default. To sort
the records in descending order, use the DESC keyword.
ORDER BY Syntax
SELECT column1, column2, ...
FROM table_name
ORDER BY column1, column2, ... ASC|DESC;
Demo Database
Below is a selection from the "Customers" table in the Northwind sample
database:
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1
ORDER BY Example
The following SQL statement selects all customers from the "Customers" table,
sorted by the "Country" column:
Example
SELECT * FROM Customers
ORDER BY Country;
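ORDER BY DESC Example
The following SQL statement selects all customers from the "Customers" table,
sorted in descending order by the "Country" column:
Example
SELECT * FROM Customers
ORDER BY Country DESC;
ORDER BY Several Columns Example
The following SQL statement selects all customers from the "Customers" table,
sorted by the "Country" and the "CustomerName" columns. It orders by Country, and
rows with the same Country are ordered by CustomerName:
Example
SELECT * FROM Customers
ORDER BY Country, CustomerName;
The following SQL statement selects all customers from the "Customers" table,
sorted ascending by the "Country" column and descending by the "CustomerName"
column:
Example
SELECT * FROM Customers
ORDER BY Country ASC, CustomerName DESC;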
4.8.3 Numeric functions in SQL
Function   Description
ATAN       Returns the arc tangent of a number or the arc tangent of n and
CEIL       Returns the smallest integer value that is greater than or equal to a number
CEILING    Returns the smallest integer value that is greater than or equal to a number
FLOOR      Returns the largest integer value that is less than or equal to a number
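The GROUP BY statement groups rows that have the same values into summary rows,
like "find the number of customers in each country". It is often used with
aggregate functions (COUNT, MAX, MIN, SUM, AVG) to group the result-set by one or
more columns.
GROUP BY Syntax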
SELECT column_name(s)
FROM table_name
WHERE condition
GROUP BY column_name(s)
ORDER BY column_name(s);
Demo Database
Below is a selection from the "Customers" table in the Northwind sample
database:
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1
The following SQL statement lists the number of customers in each country:
Example
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;
The following SQL statement lists the number of customers in each country,
sorted high to low:
Example
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
ORDER BY COUNT(CustomerID) DESC;
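The grouped count can be run end to end in a short sketch. It assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text), and the six rows are illustrative rather than taken from the Northwind data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID int, Country text)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [(1, "Germany"), (2, "Mexico"), (3, "Mexico"), (4, "UK"), (5, "Germany"), (6, "Mexico")],
)

# One summary row per country, largest group first.
per_country = conn.execute("""
    SELECT COUNT(CustomerID), Country
    FROM Customers
    GROUP BY Country
    ORDER BY COUNT(CustomerID) DESC
""").fetchall()
print(per_country)
```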
Demo Database
Below is a selection from the "Orders" table in the Northwind sample database:
OrderID  CustomerID  EmployeeID  OrderDate   ShipperID
10248    90          5           1996-07-04  3
10249    81          6           1996-07-05  1
10250    34          4           1996-07-08  2
And a selection from the "Shippers" table:
ShipperID  ShipperName
1          Speedy Express
2          United Package
3          Federal Shipping
GROUP BY With JOIN Example
The following SQL statement lists the number of orders sent by each shipper:
Example
SELECT Shippers.ShipperName, COUNT(Orders.OrderID) AS NumberOfOrders
FROM Orders
LEFT JOIN Shippers ON Orders.ShipperID = Shippers.ShipperID
GROUP BY ShipperName;
The MIN() function returns the smallest value of the selected column.
The MAX() function returns the largest value of the selected column.
MIN() Syntax
SELECT MIN(column_name)
FROM table_name
WHERE condition;
MAX() Syntax
SELECT MAX(column_name)
FROM table_name
WHERE condition;
Demo Database
Below is a selection from the "Products" table in the Northwind sample
database:
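ProductID  ProductName  SupplierID  CategoryID  Unit
1          Chais        1           1           10 boxes x 20 b
2          Chang        1           1           24 - 12 oz bottl
MIN() Example
The following SQL statement finds the price of the cheapest product:
Example
SELECT MIN(Price) AS SmallestPrice
FROM Products;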
MAX() Example
The following SQL statement finds the price of the most expensive product:
Example
SELECT MAX(Price) AS LargestPrice
FROM Products;
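The COUNT() function returns the number of rows that match a specified criterion.
The AVG() function returns the average value of a numeric column.
The SUM() function returns the total sum of a numeric column.
COUNT() Syntax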
SELECT COUNT(column_name)
FROM table_name
WHERE condition;
AVG() Syntax
SELECT AVG(column_name)
FROM table_name
WHERE condition;
SUM() Syntax
SELECT SUM(column_name)
FROM table_name
WHERE condition;
Demo Database
Below is a selection from the "Products" table in the Northwind sample
database:
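ProductID  ProductName  SupplierID  CategoryID  Unit
1          Chais        1           1           10 boxes x 20 b
2          Chang        1           1           24 - 12 oz bottl
COUNT() Example
The following SQL statement finds the number of products:
Example
SELECT COUNT(ProductID)
FROM Products;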
AVG() Example
The following SQL statement finds the average price of all products:
Example
SELECT AVG(Price)
FROM Products;
Demo Database
Below is a selection from the "OrderDetails" table in the Northwind sample
database:
OrderDetailID  OrderID  ProductID  Quantity
1              10248    11         12
2              10248    42         10
3              10248    72         5
4              10249    14         9
5              10249    51         40
SUM() Example
The following SQL statement finds the sum of the "Quantity" fields in the
"OrderDetails" table:
Example
SELECT SUM(Quantity)
FROM OrderDetails;
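All five aggregate functions can be run in a single query, as in the sketch below. It assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text); the product rows and prices are illustrative values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductID int, ProductName text, Price real)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [(1, "Chais", 18.0), (2, "Chang", 19.0), (3, "Aniseed Syrup", 10.0), (4, "Sample Product", 21.0)],
)

smallest, largest, how_many, average, total = conn.execute("""
    SELECT MIN(Price), MAX(Price), COUNT(ProductID), AVG(Price), SUM(Price)
    FROM Products
""").fetchone()
print(smallest, largest, how_many, average, total)
```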
In SQL, a view is a virtual table based on the result-set of an SQL statement.
A view contains rows and columns, just like a real table. The fields in a view are
fields from one or more real tables in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present
the data as if the data were coming from one single table.
Note: A view always shows up-to-date data! The database engine recreates the
data, using the view's SQL statement, every time a user queries a view.
The view "Current Product List" lists all active products (products that are not
discontinued) from the "Products" table. The view is created with the following
SQL:
CREATE VIEW [Current Product List] AS
SELECT ProductID, ProductName
FROM Products
WHERE Discontinued = No;
We can query the view above as follows:
SELECT * FROM [Current Product List];
Another view in the Northwind sample database selects every product in the
"Products" table with a unit price higher than the average unit price:
Another view in the Northwind database calculates the total sale for each
category in 1997. Note that this view selects its data from another view called
"Product Sales for 1997":
We can also add a condition to the query. Let's see the total sale only for the
category "Beverages":
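A view of the "higher than average price" kind described above can be sketched as follows. This assumes Python's built-in sqlite3 module and an in-memory SQLite database (an assumption, not a system named in this text). The view name AboveAveragePrice, the helper view_names and the product rows are hypothetical. The second query shows that a view is re-evaluated each time it is read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName text, Price real)")
conn.executemany(
    "INSERT INTO Products (ProductName, Price) VALUES (?, ?)",
    [("Chais", 18.0), ("Chang", 19.0), ("Aniseed Syrup", 10.0), ("Sample Product", 21.0)],
)
conn.execute("""
    CREATE VIEW AboveAveragePrice AS
    SELECT ProductName, Price
    FROM Products
    WHERE Price > (SELECT AVG(Price) FROM Products)
""")


def view_names():
    """Read the view like an ordinary table."""
    rows = conn.execute("SELECT ProductName FROM AboveAveragePrice ORDER BY ProductName")
    return [row[0] for row in rows]


before = view_names()  # average price is 17.0
conn.execute("INSERT INTO Products (ProductName, Price) VALUES ('Deluxe Item', 100.0)")
after = view_names()   # average price is now 33.6, so the view's contents change
print(before, after)
```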
Now we want to add the "Category" column to the "Current Product List" view.
We will update the view with the following SQL:
CREATE OR REPLACE VIEW [Current Product List] AS
SELECT ProductID, ProductName, Category
FROM Products
WHERE Discontinued = No;
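SQL Injection
SQL injection is a code injection technique that might destroy your database.
SQL injection usually occurs when you ask a user for input, such as a user id, and
the user instead gives you an SQL statement that you unknowingly run on your
database.
Look at the following example, which creates a SELECT statement by adding a
variable (txtUserId) to a select string. The variable is fetched from user input
(getRequestString):
Example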
txtUserId = getRequestString("UserId");
txtSQL = "SELECT * FROM Users WHERE UserId = " + txtUserId;
The rest of this chapter describes the potential dangers of using user input in
SQL statements.
If there is nothing to prevent a user from entering "wrong" input, the user can
enter some "smart" input like this:
UserId: 105 OR 1=1
Then, the SQL statement will look like this:
SELECT * FROM Users WHERE UserId = 105 OR 1=1;
The SQL above is valid and will return ALL rows from the "Users" table,
since OR 1=1 is always TRUE.
Does the example above look dangerous? What if the "Users" table contains
names and passwords?
A hacker might get access to all the user names and passwords in a database,
by simply inserting 105 OR 1=1 into the input field.
Here is an example of a user login on a web site:
Username: John Doe
Password: myPass
Example
uName = getRequestString("username");
uPass = getRequestString("userpassword");
sql = 'SELECT * FROM Users WHERE Name ="' + uName + '" AND Pass ="' +
uPass + '"'
Result
SELECT * FROM Users WHERE Name ="John Doe" AND Pass ="myPass"
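A hacker might get access to user names and passwords in a database by simply
inserting " or ""=" into the user name or password text box:
User Name: " or ""="
Password: " or ""="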
The code at the server will create a valid SQL statement like this:
Result
SELECT * FROM Users WHERE Name ="" or ""="" AND Pass ="" or ""=""
The SQL above is valid and will return all rows from the "Users" table, since OR
""="" is always TRUE.
The SQL statement below will return all rows from the "Users" table, then delete
the "Suppliers" table.
Example
SELECT * FROM Users; DROP TABLE Suppliers
Example
txtUserId = getRequestString("UserId");
txtSQL = "SELECT * FROM Users WHERE UserId = " + txtUserId;
User id: 105; DROP TABLE Suppliers
Result
SELECT * FROM Users WHERE UserId = 105; DROP TABLE Suppliers;
Use SQL Parameters for Protection
To protect a web site from SQL injection, you can use SQL parameters.
SQL parameters are values that are added to an SQL query at execution time,
in a controlled manner.
The SQL engine checks each parameter to ensure that it is correct for its
column. Parameter values are treated literally, and not as part of the SQL to be executed.
Another Example
txtNam = getRequestString("CustomerName");
txtAdd = getRequestString("Address");
txtCit = getRequestString("City");
txtSQL = "INSERT INTO Customers (CustomerName,Address,City) Values(@0,@1,@2)";
db.Execute(txtSQL,txtNam,txtAdd,txtCit);
Examples
The following examples show how to build parameterized queries in some
common web languages.
SELECT STATEMENT IN ASP.NET:
txtUserId = getRequestString("UserId");
sql = "SELECT * FROM Customers WHERE CustomerId = @0";
command = new SqlCommand(sql);
command.Parameters.AddWithValue("@0",txtUserId);
command.ExecuteReader();
INSERT INTO STATEMENT IN ASP.NET:
txtNam = getRequestString("CustomerName");
txtAdd = getRequestString("Address");
txtCit = getRequestString("City");
txtSQL = "INSERT INTO Customers (CustomerName,Address,City) Values(@0,@1,@2)";
command = new SqlCommand(txtSQL);
command.Parameters.AddWithValue("@0",txtNam);
command.Parameters.AddWithValue("@1",txtAdd);
command.Parameters.AddWithValue("@2",txtCit);
command.ExecuteNonQuery();
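The same idea can be sketched in Python with the built-in sqlite3 module, where the placeholder is written as ? rather than @0. The Customers rows, the addresses, and the helper names find_customer and add_customer are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        CustomerId   INTEGER PRIMARY KEY,
        CustomerName TEXT,
        Address      TEXT,
        City         TEXT
    );
    INSERT INTO Customers (CustomerName, Address, City) VALUES
        ('Alfreds Futterkiste', 'Example Street 1', 'Berlin'),
        ('Ana Trujillo Emparedados y helados', 'Example Avenue 2', 'Mexico City');
""")

def find_customer(customer_id):
    # The value is bound to the ? placeholder; it never becomes SQL text.
    sql = "SELECT CustomerId, CustomerName FROM Customers WHERE CustomerId = ?"
    return conn.execute(sql, (customer_id,)).fetchall()

def add_customer(name, address, city):
    sql = "INSERT INTO Customers (CustomerName, Address, City) VALUES (?, ?, ?)"
    with conn:  # commits on success, rolls back if the statement raises
        conn.execute(sql, (name, address, city))

normal_lookup = find_customer(1)
injection_lookup = find_customer("1 OR 1=1")  # compared as a literal string

# A classic attack string is stored as plain data, not executed.
add_customer("Robert'); DROP TABLE Customers;--", "Example Road 3", "London")
row_count = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]

print(normal_lookup, injection_lookup, row_count)
```

The injected text matches no CustomerId, and the Customers table still exists after the insert, because parameter values are never parsed as SQL.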
Indexes are used to retrieve data from the database more quickly than a scan of the
whole table would. Users cannot see the indexes; they are used only to speed up searches and queries.
Note: Updating a table with indexes takes more time than updating a table
without (because the indexes also need an update). So, only create indexes on
columns that will be frequently searched against.
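CREATE INDEX syntax (duplicate values are allowed):
CREATE INDEX index_name
ON table_name (column1, column2, ...);
CREATE UNIQUE INDEX syntax (duplicate values are not allowed):
CREATE UNIQUE INDEX index_name
ON table_name (column1, column2, ...);
The following statement creates an index named "idx_lastname" on the "LastName"
column of the "Persons" table:
CREATE INDEX idx_lastname
ON Persons (LastName);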
If you want to create an index on a combination of columns, you can list the
column names within the parentheses, separated by commas:
CREATE INDEX idx_pname
ON Persons (LastName, FirstName);
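As a rough check that an index is actually used, the sketch below creates idx_lastname in SQLite (through Python's sqlite3 module) and asks the engine for its query plan. The Persons rows and the unique index name idx_fullname_unique are hypothetical, and the exact wording of the plan text differs between SQLite versions, so only the index name is inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (
        PersonID  INTEGER PRIMARY KEY,
        LastName  TEXT,
        FirstName TEXT
    );
    INSERT INTO Persons (LastName, FirstName) VALUES
        ('Hansen', 'Ola'), ('Svendson', 'Tove'), ('Pettersen', 'Kari');
    CREATE INDEX idx_lastname ON Persons (LastName);
""")

# Ask SQLite how it would run a search on the indexed column.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Persons WHERE LastName = ?", ("Hansen",)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan_rows)  # last column is the description
uses_index = "idx_lastname" in plan_text

# A UNIQUE index additionally rejects duplicate (LastName, FirstName) pairs.
conn.execute(
    "CREATE UNIQUE INDEX idx_fullname_unique ON Persons (LastName, FirstName)"
)
try:
    conn.execute("INSERT INTO Persons (LastName, FirstName) VALUES ('Hansen', 'Ola')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(plan_text)
print(uses_index, duplicate_rejected)
```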
The DROP INDEX statement is used to delete an index in a table. Its syntax depends on the DBMS:
MS Access:
DROP INDEX index_name ON table_name;
SQL Server:
DROP INDEX table_name.index_name;
DB2/Oracle:
DROP INDEX index_name;
MySQL:
ALTER TABLE table_name
DROP INDEX index_name;
SQL JOIN
A JOIN clause is used to combine rows from two or more tables, based on a related column
between them.
Let's look at a selection from the "Orders" table:
10308 2 1996-0
10309 37 1996-0
10310 77 1996-0
Then, look at a selection from the "Customers" table:
CustomerID CustomerName C
1 Alfreds Futterkiste M
OrderID CustomerName
10308 Ana Trujillo Emparedados y helados
SQL INNER JOIN Keyword
The INNER JOIN keyword selects records that have matching values in both tables.
INNER JOIN Syntax
SELECT column_name(s)
FROM table1
INNER JOIN table2 ON table1.column_name = table2.column_name;
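The INNER JOIN syntax can be exercised with a minimal sketch in Python's sqlite3 module. The tables are cut down to the columns needed for the join; the OrderID and CustomerID values follow the selections shown in this section, and everything else is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES
        (1, 'Alfreds Futterkiste'),
        (2, 'Ana Trujillo Emparedados y helados');
    INSERT INTO Orders VALUES (10308, 2), (10309, 37), (10310, 77);
""")

# Only orders whose CustomerID also exists in Customers survive the join.
matched = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName
    FROM Orders
    INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
    ORDER BY Orders.OrderID
""").fetchall()

print(matched)
```

Orders 10309 and 10310 are dropped because customers 37 and 77 are not in this small Customers selection, and customer 1 is dropped because it has no order.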
Demo Database
In this tutorial we will use the well-known Northwind sample database.
Below is a selection from the "Orders" table:
10308 2 7 1996-
10309 37 3 1996-
10310 77 8 1996-
And a selection from the "Customers" table:
Below is a selection from the "Customers" table:
CustomerID CustomerName ContactName Address
10308 2 7 1996-
10309 37 3 1996-
10310 77 8 1996-
And a selection from the "Employees" table:
1 Davolio Nancy 12
2 Fuller Andrew 2/
3 Leverling Janet 8/
CustomerName
Alfreds Futterkiste
Note: The FULL OUTER JOIN keyword returns all the rows from the left table (Customers),
and all the rows from the right table (Orders). If there are rows in "Customers" that do not have
matches in "Orders", or if there are rows in "Orders" that do not have matches in "Customers",
those rows will be listed as well.
A self JOIN is a regular join, but the table is joined with itself.
The following SQL statement matches customers that are from the same city:
Example
SELECT A.CustomerName AS CustomerName1,
B.CustomerName AS CustomerName2, A.City
FROM Customers A, Customers B
WHERE A.CustomerID <> B.CustomerID
AND A.City = B.City
ORDER BY A.City;
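The self join above can be run as-is against a small table. The sketch below uses Python's sqlite3 module with three invented customers; note that each matching pair appears twice, once in each order, because the condition A.CustomerID <> B.CustomerID holds in both directions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, City TEXT);
    -- Hypothetical rows for illustration only.
    INSERT INTO Customers VALUES
        (1, 'Customer A', 'Berlin'),
        (2, 'Customer B', 'London'),
        (3, 'Customer C', 'London');
""")

same_city_pairs = conn.execute("""
    SELECT A.CustomerName AS CustomerName1,
           B.CustomerName AS CustomerName2, A.City
    FROM Customers A, Customers B
    WHERE A.CustomerID <> B.CustomerID
      AND A.City = B.City
    ORDER BY A.City
""").fetchall()

for pair in same_city_pairs:
    print(pair)
```

Replacing <> with < in the WHERE clause would list each pair only once, if that is what the report needs.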
The database server is in charge of data concurrency and data consistency. Data concurrency
allows the simultaneous access of the same data by many users, while data consistency gives
each user a consistent view of the database.
Without adequate concurrency and consistency control, data can be changed improperly,
compromising integrity of your database. If you want to write applications that can work with
different kinds of database servers, you must adapt the program logic to the behavior of the
database servers, regarding concurrency and consistency management. This requires good
knowledge of multiuser database application programming, transactions, locking mechanisms,
isolation levels and wait mode. If you are not familiar with these concepts, carefully read the
documentation of each database server that covers this subject.
Usually, database servers set exclusive locks on rows that are modified or deleted inside a
transaction. These locks are held until the end of the transaction to control concurrent access to
that data. Some database servers implement row versioning (before modifying a row, the server
makes a copy of the original row). This technique allows readers to see a consistent copy of the
rows that are updated during a transaction not yet committed. When the isolation level is high
(REPEATABLE READ) or when using a SELECT FOR UPDATE statement, the database
server sets shared locks on fetched rows, to prevent other users from changing the rows fetched
by the reader. These locks are held until the end of the transaction. Some database servers allow
read locks to be held regardless of the transactions (WITH HOLD cursor option), but this is not a
standard.
Programs accessing the database can change transaction parameters such as the isolation level or
lock wait mode. To write portable applications, you must use a configuration that produces the
same behavior on every database engine.
The recommended programming pattern regarding transactions is as follows:
• The database must support transactions; this is usually the case.
• Transactions must be as short as possible (a few seconds).
• The isolation level must be at least COMMITTED READ.
• The wait mode for locks must be WAIT or WAIT n (lock timeout).
To write portable SQL applications, programmers use the BEGIN WORK, COMMIT WORK
and ROLLBACK WORK instructions described in this section to delimit transaction blocks and
define concurrency parameters with SET ISOLATION and SET LOCK MODE. These
instructions are part of the language syntax. At runtime, the database driver generates the
appropriate SQL commands to be used with the target database server. This allows you to use the
same source code for different kinds of database servers.
If you initiate a transaction with a BEGIN WORK statement, you must issue a COMMIT WORK
at the end of the transaction. If one of the SQL statements in the transaction fails, you typically
issue a ROLLBACK WORK to force the database server to cancel any modifications that the
transaction made to the database. If you do not issue a BEGIN WORK statement to start a
transaction, each statement executes within its own transaction. These single-statement
transactions do not require either a BEGIN WORK statement or a COMMIT WORK statement.
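The begin / commit / rollback pattern can be sketched with Python's sqlite3 module. SQLite spells the statements BEGIN, COMMIT and ROLLBACK (without WORK), and the Accounts table, the CHECK constraint and the transfer function are assumptions made for this illustration.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the ones we issue ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Accounts ("
    " Name TEXT PRIMARY KEY,"
    " Balance INTEGER NOT NULL CHECK (Balance >= 0))"
)
conn.execute("INSERT INTO Accounts VALUES ('A', 100), ('B', 50)")

def transfer(src, dst, amount):
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Accounts SET Balance = Balance + ? WHERE Name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE Accounts SET Balance = Balance - ? WHERE Name = ?", (amount, src)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint: undo the credit as well.
        conn.execute("ROLLBACK")
        return False

def balances():
    return dict(conn.execute("SELECT Name, Balance FROM Accounts"))

first_ok = transfer("A", "B", 30)     # succeeds
second_ok = transfer("A", "B", 500)   # fails and is rolled back
final_balances = balances()

print(first_ok, second_ok, final_balances)
```

The failed transfer leaves no trace: the credit to B that had already been applied is cancelled together with the rejected debit.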
Recent database engines support transaction savepoints, which allow you to set markers in the
current transaction in order to roll back to a specific point without canceling the complete
transaction. The transaction savepoint instructions SAVEPOINT, ROLLBACK TO
SAVEPOINT and RELEASE SAVEPOINT are part of the language syntax and can be directly
used in the code.
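A minimal savepoint sketch, again using SQLite through Python's sqlite3 module as a stand-in for the target server. The Log table and the savepoint name before_step_2 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("CREATE TABLE Log (Entry TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO Log VALUES ('step 1')")

conn.execute("SAVEPOINT before_step_2")                # set a marker
conn.execute("INSERT INTO Log VALUES ('step 2')")
conn.execute("ROLLBACK TO SAVEPOINT before_step_2")    # undo only step 2
conn.execute("RELEASE SAVEPOINT before_step_2")        # discard the marker

conn.execute("INSERT INTO Log VALUES ('step 3')")
conn.execute("COMMIT")

entries = [row[0] for row in conn.execute("SELECT Entry FROM Log ORDER BY rowid")]
print(entries)
```

Only the work done after the savepoint is cancelled; step 1 and step 3 are committed together as one transaction.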
Some database servers do not support Data Definition Language (DDL) statements (like
CREATE TABLE) inside transactions, and some automatically commit the transaction when
such a statement is executed. Therefore, it is strongly recommended that you avoid DDL
statements inside transactions.
A transaction that processes many rows can exceed the limits that your operating system or the
database server configuration imposes on the maximum number of simultaneous locks. Include a
limited number of SQL operations in a transaction block.
5.1.1 Properties of Transaction
A transaction is a very small unit of a program and it may contain several low-level tasks. A
transaction in a database system must maintain Atomicity, Consistency, Isolation,
and Durability − commonly known as ACID properties − in order to ensure accuracy,
completeness, and data integrity.
Atomicity − This property states that a transaction must be treated as an atomic unit, that
is, either all of its operations are executed or none. There must be no state in a database
where a transaction is left partially completed. States should be defined either before the
execution of the transaction or after the execution/abortion/failure of the transaction.
Consistency − The database must remain in a consistent state after any transaction. No
transaction should have any adverse effect on the data residing in the database. If the
database was in a consistent state before the execution of a transaction, it must remain
consistent after the execution of the transaction as well.
Durability − The database should be durable enough to hold all its latest updates even if
the system fails or restarts. If a transaction updates a chunk of data in a database and
commits, then the database will hold the modified data. If a transaction commits but the
system fails before the data could be written on to the disk, then that data will be
updated once the system springs back into action.
Isolation − In a database system where more than one transaction is being executed
simultaneously and in parallel, the property of isolation states that each transaction
will be carried out and executed as if it were the only transaction in the system. No
transaction will affect the existence of any other transaction.
Serializability
When multiple transactions are being executed by the operating system in a multiprogramming
environment, the instructions of one transaction may be interleaved with those of
some other transaction.
To resolve this problem, we allow parallel execution of a transaction schedule only if its transactions
are either serializable or have some equivalence relation among them.
5.1.2 Database Architecture
The levels form a three-level architecture that includes an external, a conceptual, and an internal
level. The way users recognize the data is called the external level. The way the DBMS and the
operating system distinguish the data is the internal level, where the data is actually stored using
the data structures and files. The conceptual level offers both the mapping and the desired
independence between the external and internal levels.
What is Database Architecture?
The architecture of a DBMS depends on its design and can be of the following types:
• Centralized
• Decentralized
• Hierarchical
DBMS architecture can be seen as either single-tier or multi-tier. An n-tier architecture
splits the entire system into n related but independent modules that can be independently
customized, changed, altered, or replaced.
The architecture of a database system is very much influenced by the primary computer system
on which the database system runs. Database systems can be centralized, or client-server, where
one server machine executes work on behalf of multiple client machines. Database systems can
also be designed to exploit parallel computer architectures. Distributed databases span multiple
geographically separated machines.
The Three Tier Architecture
A 3-tier application is an application program that is structured into three major parts; each of
them is distributed to a different place or places in a network. These 3 divisions are as follows:
• The workstation or presentation layer
• The business or application logic layer
• The database layer, together with the programming related to managing it
Concurrency Control
Definition
Concurrency control is a database management system (DBMS) concept that is used to address conflicts with
the simultaneous accessing or altering of data that can occur in a multi-user system. Concurrency control,
when applied to a DBMS, is meant to coordinate simultaneous transactions while preserving data integrity. [1]
In short, concurrency control governs multi-user access to the database.
Illustrative Example
To illustrate the concept of concurrency control, consider two travelers who go to electronic kiosks at the same
time to purchase a train ticket to the same destination on the same train. There's only one seat left in the coach,
but without concurrency control, it's possible that both travelers will end up purchasing a ticket for that one
seat. However, with concurrency control, the database wouldn't allow this to happen. Both travelers would still
be able to access the train seating database, but concurrency control would preserve data accuracy and allow
only one traveler to purchase the seat.
This example also illustrates the importance of addressing this issue in a multi-user database. Obviously, one
could quickly run into problems with the inaccurate data that can result from several transactions occurring
simultaneously and writing over each other. The following section provides strategies for implementing
concurrency control.
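Pessimistic Locking: This strategy locks an entity so that other users are limited or prevented from
altering it while it is in use. There are two kinds of lock: write locks and read locks.
With a write lock, everyone but the holder of the lock is prevented from reading, updating, or deleting the entity.
With a read lock, other users can read the entity, but no one except for the lock holder can update or delete it. [2]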
Optimistic Locking: This strategy can be used when instances of simultaneous transactions, or collisions, are
expected to be infrequent. [2] In contrast with pessimistic locking, optimistic locking doesn't try to prevent the
collisions from occurring. Instead, it aims to detect these collisions and resolve them on the chance occasions
when they occur. [2]
Pessimistic locking provides a guarantee that database changes are made safely. However, it becomes less
viable as the number of simultaneous users or the number of entities involved in a transaction increases because
the potential for having to wait for a lock to release will increase. [2]
Optimistic locking can alleviate the problem of waiting for locks to release, but then users have the potential to
experience collisions when attempting to update the database.
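One common way to detect collisions under optimistic locking is a version column that every update checks and increments. The sketch below applies that idea to the train-seat example using Python's sqlite3 module; the Seats table, the Version column and the function names are assumptions, not something prescribed by these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Seats ("
    " SeatId INTEGER PRIMARY KEY,"
    " Passenger TEXT,"
    " Version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO Seats VALUES (1, NULL, 0)")
conn.commit()

def read_seat(seat_id):
    # No lock is taken here; the reader just remembers the version it saw.
    return conn.execute(
        "SELECT Passenger, Version FROM Seats WHERE SeatId = ?", (seat_id,)
    ).fetchone()

def book_seat(seat_id, passenger, version_seen):
    # The update only applies if nobody changed the row since it was read.
    with conn:
        cursor = conn.execute(
            "UPDATE Seats SET Passenger = ?, Version = Version + 1 "
            "WHERE SeatId = ? AND Version = ?",
            (passenger, seat_id, version_seen),
        )
    return cursor.rowcount == 1  # False means a collision was detected

# Both travelers look at the seat before either one books it.
_, version_for_alice = read_seat(1)
_, version_for_bob = read_seat(1)

alice_booked = book_seat(1, "Alice", version_for_alice)
bob_booked = book_seat(1, "Bob", version_for_bob)   # stale version
final_passenger, final_version = read_seat(1)

print(alice_booked, bob_booked, final_passenger, final_version)
```

The second traveler's update matches no row, so the application can report the collision and let that user pick another seat.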
Lock Problems:
Deadlock:
When dealing with locks, two problems can arise, the first of which is deadlock. Deadlock refers to a
particular situation where two or more processes are each waiting for another to release a resource, or more
than two processes are waiting for resources in a circular chain. Deadlock is a common problem in
multiprocessing where many processes share a specific type of mutually exclusive resource. Some computers,
usually those intended for the time-sharing and/or real-time markets, are often equipped with a hardware lock,
or hard lock, which guarantees exclusive access to processes, forcing serialization. Deadlocks are particularly
disconcerting because there is no general solution to avoid them.
A fitting analogy of the deadlock problem could be a situation like when you go to unlock your car door and
your passenger pulls the handle at the exact same time, leaving the door still locked. If you have ever been in a
situation where the passenger is impatient and keeps trying to open the door, it can be very frustrating.
Basically you can get stuck in an endless cycle, and since both actions cannot be satisfied, deadlock occurs.
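The "circular chain" of waiting can be made concrete with a wait-for graph, in which an edge from T1 to T2 means that T1 is waiting for a resource held by T2; a deadlock exists exactly when the graph contains a cycle. The following Python sketch is one simple way to look for such a cycle. The function name find_deadlock and the transaction names are hypothetical, and real database servers use their own detection schemes.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a waiting cycle, or None if there is none.

    waits_for maps each transaction to the transactions it is waiting on.
    """
    finished = set()  # nodes already fully explored or on the current path

    def visit(node, path):
        if node in path:
            return path[path.index(node):]   # the cycle closes here
        if node in finished:
            return None
        finished.add(node)
        for blocker in waits_for.get(node, ()):
            cycle = visit(blocker, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

two_way = find_deadlock({"T1": ["T2"], "T2": ["T1"]})
three_way = find_deadlock({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]})
no_deadlock = find_deadlock({"T1": ["T2"], "T2": ["T3"], "T3": []})

print(two_way, three_way, no_deadlock)
```

Once a cycle is found, a server typically picks one transaction in it as the victim and rolls it back so that the others can proceed.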
Livelock:
Livelock is a special case of resource starvation. A livelock is similar to a deadlock, except that the states of
the processes involved constantly change with regard to one another while never progressing. The general
definition only states that a specific process is not progressing. For example, the system keeps selecting the
same transaction for rollback causing the transaction to never finish executing. Another livelock situation can
come about when the system is deciding which transaction gets a lock and which waits in a conflict situation.
An illustration of livelock occurs when numerous people arrive at a four way stop, and are not quite sure who
should proceed next. If no one makes a solid decision to go, and all the cars just keep creeping into the
intersection afraid that someone else will possibly hit them, then a kind of livelock can happen.
Basic Timestamping:
Basic timestamping is a concurrency control mechanism that eliminates deadlock. This method doesn’t use
locks to control concurrency, so it is impossible for deadlock to occur. According to this method a unique
timestamp is assigned to each transaction, usually showing when it was started. This effectively allows an age
to be assigned to transactions and an order to be assigned. Data items have both a read-timestamp and a write-
timestamp. These timestamps are updated each time the data item is read or updated respectively.
Problems arise in this system when a transaction tries to read a data item which has been written by a younger
transaction. This is called a late read. This means that the data item has changed since the initial transaction
start time, and the solution is to roll back the transaction and restart it with a new timestamp. Another problem
occurs when a transaction tries to write a data item which has been read by a younger transaction. This is called
a late write. This means that the data item has been read by another transaction since the start time of the
transaction that is altering it. The solution for this problem is the same as for the late read problem: the
transaction must be rolled back and restarted with a new timestamp.[3]
Adhering to the rules of the basic timestamping process allows the transactions to be serialized and a
chronological schedule of transactions can then be created. Timestamping may not be practical in the case of
larger databases with high levels of transactions. A large amount of storage space would have to be dedicated
to storing the timestamps in these cases.
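The late read and late write rules can be sketched in a few lines of Python. This is a simplified, single-threaded model under stated assumptions: timestamps are plain integers, the class and function names are invented, and a late write is also rejected when a younger transaction has already written the item.

```python
class Rollback(Exception):
    """Signals that the requesting transaction must restart with a new timestamp."""

class DataItem:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # timestamp of the youngest transaction that read the item
        self.write_ts = 0   # timestamp of the youngest transaction that wrote the item

def read(item, ts):
    if ts < item.write_ts:
        raise Rollback("late read")       # a younger transaction already wrote it
    item.read_ts = max(item.read_ts, ts)
    return item.value

def write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:
        raise Rollback("late write")      # a younger transaction already used it
    item.value = value
    item.write_ts = ts

def attempt(operation, *args):
    try:
        operation(*args)
        return "ok"
    except Rollback:
        return "rollback"

x = DataItem(100)
outcomes = [
    attempt(write, x, 2, 150),   # T2 writes x
    attempt(read, x, 1),         # older T1 reads x: late read
    attempt(read, x, 3),         # younger T3 reads x
    attempt(write, x, 2, 175),   # T2 writes after T3 has read: late write
]
print(outcomes, x.value)
```

No locks are taken anywhere, which is why deadlock cannot occur; the cost is that transactions arriving too late are restarted instead of waiting.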
When you execute multiple transactions simultaneously, extra care should be taken to avoid
inconsistency in the results that the transactions produce. This care is mandatory especially when
two or more transactions are working (reading or writing) on the same database items (data
objects).
For example, one transaction is transferring money from account A to account B while the other
transaction is withdrawing money from account A. These two transactions should not be
permitted to execute in an interleaved fashion the way transactions working on different data
items can. We need to execute such transactions serially (one after the other).
If we do not take care with concurrent transactions that deal with the same data items, we
would end up with the following problems:
5.2.2 Serialization and Recoverability
Serializability in Database
A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n
transactions. Every serializable schedule is consistent, i.e., it does not suffer from the anomalies caused by
read-write (RW), write-read (WR), or write-write (WW) conflicts.
The concept of serializability of schedules is used to identify which schedules are correct
when transaction executions have interleaving of their operations in the schedules. Serializable
schedules are always considered to be correct when concurrent transactions are executing.
Difference between serial schedule and a serializable schedule
• The main difference between a serial schedule and a serializable schedule is that a serial
schedule allows no concurrency, whereas a serializable schedule does.
• In a serial schedule, if two transactions are submitted at the same time and no interleaving
of operations is permitted, there are only two possible outcomes:
    • Execute all the operations of transaction T1 (in sequence) followed by all
    the operations of transaction T2 (in sequence).
    • Execute all the operations of transaction T2 (in sequence) followed by all
    the operations of transaction T1 (in sequence).
• In a serializable schedule, if two transactions execute at the same time and
interleaving of operations is allowed, there are many possible orders in
which the system can execute the individual operations of the transactions.
• In a serializable schedule, the concurrent execution must be equivalent to some serial
schedule. This is why such schedules are always considered correct, even though the
transactions interleave their operations.
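One standard way to test a schedule, which these notes do not describe, is the precedence-graph test for conflict serializability: draw an edge from Ti to Tj whenever an operation of Ti comes before a conflicting operation of Tj (same item, different transactions, at least one write), and accept the schedule if the graph has no cycle. The Python sketch below assumes a schedule is written as a list of (transaction, operation, item) tuples; the function names are hypothetical.

```python
def precedence_edges(schedule):
    """Collect Ti -> Tj edges for every pair of conflicting operations."""
    edges = set()
    for i, (t1, op1, item1) in enumerate(schedule):
        for t2, op2, item2 in schedule[i + 1:]:
            if t1 != t2 and item1 == item2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges

def is_conflict_serializable(schedule):
    edges = precedence_edges(schedule)
    remaining = {t for t, _, _ in schedule}
    # Repeatedly remove transactions that no remaining transaction must precede.
    while remaining:
        free = {
            n for n in remaining
            if not any(dst == n and src in remaining for src, dst in edges)
        }
        if not free:
            return False   # every remaining node has a predecessor: a cycle
        remaining -= free
    return True

serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
interleaved_ok = [("T1", "R", "A"), ("T2", "R", "B"), ("T1", "W", "A"), ("T2", "W", "B")]
lost_update = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]

results = [is_conflict_serializable(s) for s in (serial, interleaved_ok, lost_update)]
print(results)
```

The second schedule interleaves its operations yet is accepted because the two transactions touch different items, while the third is rejected because each transaction would have to come before the other.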
Example of Serializable Schedule
Let us consider a schedule S.
Let us consider three schedules S1, S2, and S3. We have to check whether or not they are
serializable with S.
METHODS OF TEACHING
Lectures
Group discussions
Tutorials
Exercises
ASSESSMENT METHODS
Written assignments, Presentations.
Test 20%
Practical Communication Assignment 20%
PRESCRIBED READINGS
Connolly T.M (2005) Database Systems: A Practical Approach to Design,
Implementation and Management, Addison Wesley Publications
Fred R, Mary B (2005) Modern Database Management, Prentice Hall
Publications
RECOMMENDED READINGS
Date C. J. (2004) An Introduction to Database Systems, Addison Wesley
Publications