You are on page 1of 24

Ministry of Secondary Education Republic of Cameroon

Peace – Work – Fatherland


& Quality International School
QIS TKC - YAOUNDE School Year 2014/2015
Department of Computer Science

DATABASE FUNDAMENTALS
Class: Comp. Sc A/L By: NGANFOR JESPA

The term database refers to a collection of related data from which the users can efficiently
retrieve the desired information. In addition to the storage and retrieval of data, certain other
operations can also be performed on a database. These operations include adding, updating
and deleting data. All these operations on a database are performed using a database
management system (DBMS). Essentially, a DBMS is a computerized record-keeping system.

In this topic we will be introduced to the basic terminology used in a database management
system (such as normalization, entities, attributes, keys, relational database management
systems, structured query language).

Table of Contents
I. INTRODUCTION TO DATABASES ...............................................................................2
II. DATABASE MANAGEMENT SYSTEM (DBMS)......................................................4
III. DATABASE MODELS..................................................................................................4
IV. DATA BASE ABSTRACTION LEVELS .....................................................................6
V. DATABASE USERS......................................................................................................7
VI. CENTRLIZED DATABASE vs DISTRIBUTED DATABASE ...................................7
VII. ENTITY-RELATION MODEL......................................................................................8
VIII. RELATIONAL DATABASE ...................................................................................10
IX. DATA INTEGRITY .....................................................................................................15
X. DATABASE NORMALIZATION...............................................................................15
XI. INTRODUCTION TO QUERIES ................................................................................18

Page 1
Topic: Database Fundamentals By NGANFOR JESPA

I. INTRODUCTION TO DATABASES

I.1 Definition and examples


A database can be summarily described as a repository for data. It is a collection of non-
redundant data which can be shared by different application systems. Although databases are
generally computerized, instances of non-computerized (paper-based) databases from
everyday life can be cited in abundance. A dictionary, a phone book, a collection of recipes
and a TV guide are all common examples of non-computerized databases. The examples of
computerized databases include customer files, search engines, books catalogue, equipment
inventories and sales transactions.

I.2 Computerized database VS non-computerized database


Computerized database Paper-based database
Can hold vast amount of data Limited by the physical space available
Very fast to find a specific record Can take a while to manually search through all
of the record
Can easily search for a specific criteria e.g. Difficult to search for a specific criteria; every
“All the student who live in Ntarikon” record would have to be manually look at
Data can be sorted into ascending or Difficult to sort data on more than one criteria
descending order on multiple criteria
The database can be kept secure by the use The only security will be locking up the record
of password
Can easily update or amend a record Changes have to be done manually. Record can
look messy if scribbled out
Easy to make a back-up in case of data lost Difficult to make a back-up because every page
would have to be rewritten or photocopied

I.3 Database terminologies


I.3.1 Data and Information

Data can be anything such as a number, a person's name, images, sounds and so on. Hence,
data can be defined as a set of isolated and unrelated raw facts (represented by values), which
have little or no meaning because they lack a context for evaluation (e.g. ‘Monica’, ‘36’,
‘Chief’ …). When the data are processed and converted into a meaningful and useful form, it
is known as information. Hence, information can be defined as a set of organized and
validated collection of data. For example, 'Monica is 35 years old and she is a chef'.

Strictly speaking, data refer to the values physically recorded in the database, whereas
information refers to the conclusion or meaning drawn out of it. With respect to database,
these terms are synonymous.

I.3.2 Data validation

Data validation is the process of ensuring that a program operates on clean, correct and
useful data. It uses routines, often called "validation rules" "validation constraints" or

Page 2
Topic: Database Fundamentals By NGANFOR JESPA

"Check routines", that checks for correctness, meaningfulness, and security of data that
are input to the system.

I.3.3 Metadata

Metadata is literally "data about data." This term refers to information about data itself --
perhaps the origin, size, formatting or other characteristics of a data item. In the database
field, metadata is essential to understanding and interpreting the contents of a data warehouse
(central repositories of integrated data from one or more disparate sources).

I.3.4 Data Dictionary

A data dictionary is a collection of descriptions of the data objects or items in a data model
for the benefit of programmers and others who need to refer to them. When developing
programs that use the data model, a data dictionary can be consulted to understand where a
data item fits in the structure, what values it may contain, and basically what the data item
means in real-world terms.

I.3.5 Field, record, file

In the structure of a database, the smallest component under which data is entered is the field.
All fields in the same database have unique names, several data fields make up a record,
several records make up a file, and several files make up a database.

A File is a named collection of logically related multiple records. Depending on the database
software, a table can also be referred to as a table.

Example: in a database maintaining information about employee,

- The fields can be Code, Deptt, Name, Address, City and Phone (see Figure 1).
- Fields Code, Deptt, Name, Address, City and Phone for a particular employee form a
record. Figure 1 contains five records (0101–0109) and each record has six fields.
- A collection of all the employee records of a company form employee table.

Figure1. Fields and Records in a Table

Page 3
Topic: Database Fundamentals By NGANFOR JESPA

II. DATABASE MANAGEMENT SYSTEM (DBMS)

A DBMS is a collection of programs that manages the database structure and controls access
to the data stored in the database. The DBMS serves as the intermediary between the end
user and the database by translating user requests into the complex computer code. The end
user interacts with the DBMS through an application program.

Data Model: A data model is a representation of a real world situation about which data is
to be collected and stored in a database. A data model depicts the dataflow and logical
interrelationships among different data elements.

Database system: Database system is a general term that refers to the combination of a
database, a DBMS and a data model. A database system consists of Data (the database),
Software, Hardware and Users. It allows users to Store, Update, Retrieve, Organize and
Protect their data.

Figure 2: a database system

Some DBMS examples include MySQL, PostgreSQL, Microsoft Access, SQL Server,
FileMaker, Oracle, RDBMS, dBase, Clipper, and FoxPro.

III. DATABASE MODELS

A data model is a collection of concepts and rules for the description of the structure of the
database. Structure of the database means the data types, the constraints and the relationships
for the description or storage of data respectively. Database models may be grouped into two
categories:
• conceptual model focuses on the logical nature of the data representation and is
concerned with what is represented in the database; conceptual model include Entity-
Relationship (ER) Model and Object-Oriented (OO) Model
• Implementation model emphases on how information is represented in the database or on
how the data structures are implemented to represent what is modelled. Implementation
models include the hierarchical database model, the network database model, and the
relational database model.

Page 4
Topic: Database Fundamentals By NGANFOR JESPA

III.1 Hierarchical Database Model


The hierarchical data model organizes the
data in a tree-like structure. Each child node
(also known as dependents) can have only
one parent node.
The main advantage of the hierarchical data
model is that the data access is quite
predictable in structure, and therefore, both
retrieval and updates can be highly optimized
by a DBMS. However, the main drawback of
this model is that the links are 'hard coded'
into the data structure, that is, the links are Figure 3 : The hierarchical database model
permanently established and cannot be
modified.

III.2 Network Database Model


Using records and sets, this model uses a one-to-
many relationship approach for the data records.
Multiple branches are allocated for lower-level
structures and branches that are then connected
by multiple nodes, which represent higher-level
structures within the information. This database
modelling method provides an efficient way to
retrieve information and organize the data so that
it can be looked at multiple ways, providing a
means of increasing business performance and
reaction time. This is a viable model for
planning road, train, or utility networks, but it
is difficult to alter the database because
Figure 4 : The Network database model
information entered can alter the entire database.

III.3 Relational Database


The relational data model represents the database as a collection of simple two-dimensional
tables called relations. The rows of a relation are referred to as tipples and the columns are
referred to as attributes.

Page 5
Topic: Database Fundamentals By NGANFOR JESPA

Fig 5: a relational database

III.4 Object-oriented Database Model


The object-oriented model is a relatively new data model and provides an outlook for the
future database models. An object-oriented database stores and maintains objects. An object
is an item that can contain both the data and the procedures that manipulate the data. For
example, a student object might contain not only data about a student's name, roll number and
address, but also procedures on some tasks such as printing the student record or calculating
the student's tuition fees. Like the other models, the object model assumes that objects can
conceptually be collected together into meaningful groups known as classes.

III.5: Object Relational Model:


These models have created an entirely new type of database, which combines database design
with application program to solve specific technical problems while leveraging the best of
both worlds. To date, object databases still need to be refined to achieve greater
standardization. Real world applications of this model often include technical or scientific
fields, such as engineering and molecular biology.

IV. DATA BASE ABSTRACTION LEVELS

Three – level architecture for database system is proposed to archive the characteristics of the
database approach. The goal of this architecture is separate the applications and the physical
database so the actual details of how data is organized are hided from the users.

Figure 6: Three- level database Architecture

Page 6
Topic: Database Fundamentals By NGANFOR JESPA

1) External level or view: View level part may be considered the "who" part of the
picture. In this highest level, there exists a number of views which of is defined a part
of the actual database. Each view is provided for a user or a group of users so that it
helps in simplified the interaction between the user and system.
2) Conceptual (logical) level: At conceptual level, the emphasis lies on the "what" part
of the picture. It describes the logical structure of the whole database. The entire
database is described using simple logical concepts such as objects, their properties
or relationships. Thus the complexity of the implementation detail of the data with be
hided from the users (abstraction).
3) Internal (physical) level: Physical level emphasizes the "how" and "where" parts of
the data storage. It describes where the data is actually stored, how is it stored and
how to access it.

V. DATABASE USERS

A database user is a principal at the database level. Every database user is a member of
the public role.

End user: People whose jobs require access to database for querying, updating and
generating report.

Application developer: Write software to allow end users to interface with the database
system. This kind of user need to familiar with the DBMSs to accomplish their task.

Database systems programmer: Writes the database software itself

Database Administrator (DBA): A person or a group of people who is responsible for


designing & managing the database system. i.e. authorizing the access to the database,
monitoring its use and managing all the resource to support the use of the whole database
system

VI. CENTRALIZED DATABASE vs DISTRIBUTED DATABASE

VI.1 Centralized database:


In a centralized database, all the data of
an organization is stored in a single place
such as a mainframe computer or a
server. Users in remote locations access
the data through the Wide Area Network
(WAN) using the application programs
provided to access the data. The
centralized database (the mainframe or
the server) should be able to satisfy all

Figure 7: A centralized database

Page 7
Topic: Database Fundamentals By NGANFOR JESPA

the requests coming to the system, therefore could easily become a bottleneck. But since all
the data reside in a single place it easier to maintain and back up data. Furthermore, it is
easier to maintain data integrity, because once data is stored in a centralized database,
outdated data is no longer available in other places

VI.2 Distributed database


A distributed database is a d a t a b a s e
t h a t consists of two or more files located at
different sites on a computer network.
Because the database is distributed, different
users can access it without interfering with
one another in a transparent manner. Users
access the data in a distributed database by
accessing the WAN, but the user has the
impression that the access of data is done on
his local machine. To keep a distributed
database up to date, it uses the replication and
duplication processes.
Figure 8: A distributed database

VII. ENTITY-RELATION MODEL

The conceptual model can be represented using Entity-Relationship model (E-R model).
The E-R model views the real world as a set of basic objects (known as entities), their
characteristics (known as attributes) and associations among these objects (known as
relationships).

VII.1 Entity, Attributes and relationship

1) An entity is any object in the system that we want to model and store information
about. Entities are usually recognizable concepts, either concrete or abstract, such as
person, places, things, or events which have relevance to the database.
2) An attribute is an item of information which is stored about an entity. There exist
different types of entity

(a) Simple and composite attributes. A simple attribute is the smallest semantic unit of
data, which are atomic (no internal structure). A composite attribute can be subdivided
into parts, e.g., address (street, city, state, zip).
(b) Single and multivalued attributes. Single attributes have a single value for a particular
entity. Multivalued attributes have multiple values of an attribute for a particular entity;
e.g., degrees or courses that a student can have or take.

3) Relationship is an association, dependency or link between two or more.

Page 8
Topic: Database Fundamentals By NGANFOR JESPA

Example: In a business enterprise, entity may be Product, Representative and Customer. The
attributes of product can be Name and Price. The attribute of representative is Name, Region
and Phone. The attributes of Customer are Name, City and Age. The relationship between
customer and product is represented by “Buys”, and the relationship between Representative
and product is represented by “Sells”

Figure 9: Entity, attribute and relationship


VII.2 Entity-Relationship diagram (ERD)
An entity-relationship diagram (ERD) is a graphical representation of entities and their
relationships.

Price Name
Phone Name
Region
City

Name Representative Sells Product Buys Customer

Age
Date

Figure 10. Entities, Attributes and Relationship


• Rectangles represent entity types
• Ellipses represent attributes
• Diamonds represent relationship types
• Lines link attributes to entity types and entity types to relationship types
• Primary key attributes are underlined
VII.3 Types of Relationship
Even though a relationship may involve more than two entities, the most commonly
encountered relationships are binary, involving exactly two entities. Generally, such binary
relationships are of three types and called cardinality: one-to-one, one-to-many and many-
to-many.
a) One-to-one Relationship (1:1)

Page 9
Topic: Database Fundamentals By NGANFOR JESPA

One-to-one relationships occur when each entry in the first table has one, and only one,
counterpart in the second table. One-to-one relationships are rarely used because it is often
more efficient to simply put all of the information in a single table.
E.g. if a man only marries one woman and a woman only marries one man, it is a one-to-
one (1:1) relationship.

Fig 5: One-to-One
b) One-to-many Relationship (1: M)
One-to-many relationships are the most common type of database relationship. They
occur when each record in the first table corresponds to one or more records in the second
table but each record in the second table corresponds to only one record in the first table.
For example, the relationship between a Teachers table and a Students table in an
elementary school database would likely be a one-to-many relationship, because each
student has only one teacher, but each teacher may have multiple students.

Fig 6: One-to-Many
The crowbar represents the many occurrences.
c) Many-to-many Relationship (M:M)
Many-to-many relationships occur when each record in the first table corresponds to one or
more records in the second table and each record in the second table corresponds to one or
more records in the first table. For example, one teacher teaches many students and a student
is taught by many teachers.

Fig 7 Many-to-Many relationship


VIII. RELATIONAL DATABASE
The relational database model is used in most of today's commercial databases. It is used
since the early 80ies and was developed 1970 by E. F. Codd. The relational database model
is based on a mathematical concept where relations are interpreted as tables.

VIII.1 Concept of the relational model


In contrast to the entity-relationship-model (ERM) which is a conceptual data model, the
relational model is a logical data model. The relational model is not about abstract objects
but defines how data should be represented in a specific database management system. The

Page 10
Topic: Database Fundamentals By NGANFOR JESPA

goal of a logical data model is to arrange the data in such a form that it is consistent, non-
redundant and supports operations for data manipulation.
The main organization unit in a relational data model is the relation. A relation can be
represented as a table but the definition of the relation is not necessarily equal to the
definition of the table and vice versa.

VIII.2 Relational model terminologies


The terminologies used in relational model are quite different from the one used in ERM. The
table below shows some terms used in relational model

ER Model Relational Model Database Traditional Programmer

Entity Relation Table File

Entity Instance Tuple Row Record

Attribute Attribute Column Field

Identifier Key Key Key (or link)

VIII.3 Database Keys


Keys are, as their name suggests, a key part of a relational database and a vital part of the
structure of a table. They ensure each record within a table can be uniquely identified by one
or a combination of fields within the table. They help enforce integrity and help identify the
relationship between tables. There are three main types of keys, candidate keys, primary keys
and foreign keys. There is also an alternative key or secondary key that can be used, as the
name suggests, as a secondary or alternative key to the primary key
a) Super Key
A Super key is any combination of fields within a table that uniquely identifies each record
within that table.
b) Candidate key
In a table, there can be more than one field that can uniquely identify each record. All such
fields are known as candidate keys. One of these candidate keys is chosen as a primary key;
the other keys that are not chosen as primary key are known as alternate keys or secondary
keys.

c) Primary Key:
A field or a set of fields that uniquely identify each record in a table is known as a primary
key. This implies that no two records in the relation can have same value for the primary key.
For example, an employee number uniquely identifies a member of staff within a company.

Page 11
Topic: Database Fundamentals By NGANFOR JESPA

An IP address uniquely addresses a PC on the internet. A primary key is mandatory. That is,
each entity occurrence must have a value for its primary key.
d) Foreign Key:
A field of a table that references the primary key of another table is referred to as foreign
key. Figure 13.3 illustrates how a foreign key constraint is related to a primary key constraint.
Here, the field Item_Code in the PURCHASE table references the field Item_Code in the
ITEM relation. Thus, the attribute Item_Code in the PURCHASE relation is the foreign key.

Figure 8. Foreign Key


e) Simple Key
Any of the keys described before (i.e. primary, secondary or foreign) may have one or more
attributes. A simple key consists of a single attribute to uniquely identify an entity
occurrence.
f) Compound Key
A compound key consists of more than one attribute to uniquely identify an entity
occurrence. Each attribute, which makes up the key, is also a simple key in its own right.
For example, we have an entity named enrolment, which holds the courses on which a
student is enrolled. In this scenario a student is allowed to enroll on more than one
course. This has a compound key of both student number and course number, which is
required to uniquely identify a student on a particular course.

Fig 9: a compound key


Student number and course number combined is a compound primary key for the
enrolment entity.
g) Composite Key
A composite key consists of more than one attribute to uniquely identify an entity
occurrence. This differs from a compound key in that one or more of the attributes, which
make up the key, are not simple keys in their own right.

Page 12
Topic: Database Fundamentals By NGANFOR JESPA

For example, you have a database holding your CD collection. One of the entities is called
tracks, which holds details of the tracks on a CD. This has a composite key of CD name,
track number.

Fig 10: a composite key


CD name in the track entity is a simple key, linking to the CD entity, but track number is not
a simple key in its own right.

Application exercise
For each of the following entities, list possible primary keys. Then, suggest secondary keys,
if any: Student, Course, Unit, Result, Classroom, Lecturer, Department, and Attendance

VIII.3 Transforming an ERM to a relational model scheme


VIII.3.1 Entity to Relation Conversion

 General steps:

- Entities – In general, each entity will be converted directly to a relation.


- The attributes of the entity become the attributes of the Relation.
- The Identifier of the Entity becomes a Key of the Relation.
- Relationships will be mapped as “Foreign Keys”. There are a number of variations
to this step that will be described in more details below.

VIII.3.2 Representing Relationships – One to Many (1:M)


To represent a 1:M relationship, take the primary key of the parent table (on the “1” side)
and insert it as a foreign key into the child table (on the “M” side). It may make sense to
rename the foreign key to reflect its relationship to the table you are inserting it into.

Page 13
Topic: Database Fundamentals By NGANFOR JESPA

VIII.3.3 Representing Relationships – Many to Many

There is no direct representation of a M:N relationship in the relational model. You will need
to turn each M:N relationship between two entities into a separate relation (table) of its
own. This relation will usually have as its own primary key the combination of two foreign
keys – each of these will be the primary key of one of the relations involved in this
relationship.

VIII.3.4 Representing Relationships – One to One

The key or one relation is stored in the second relation.

Page 14
Topic: Database Fundamentals By NGANFOR JESPA

NB: If a relationship has attributes then they need to go into a table. Where to put them
depends on the type of the relationship. In a 1:1 or 1:M relationship, put them the same place
the foreign key goes (on the M side in 1:M). In a M:N relationship, put them in the new table
you create for the relationship.

IX. DATA INTEGRITY


Data integrity refers to the accuracy and consistency of stored data, indicated by an absence
of any alteration in data between two updates of a data record. The concept of data integrity
ensures that all data in a database can be traced and connected to other data. This ensures that
everything is recoverable and searchable. Having a single, well-defined and well-controlled
data integrity system increases stability, performance, reusability and maintainability.
The following three integrity constraints are used in a relational database structure to achieve
data integrity:
 Entity Integrity: The rule states that every table must have its own primary key and
that each has to be unique and not null.
 Referential Integrity: This is the concept of foreign keys. The rule states that the
foreign key value can be in two states. The first state is that the foreign key
value would refer to a primary key value of another table, or it can be null.
 Domain Integrity: This states that all columns in a relational database are in
a defined domain.

X. DATABASE NORMALIZATION

Normalization is a process in which data attributes within a data model are organized to
increase the cohesion of entity types. In other words, the goal of data normalization is to
reduce and even eliminate data redundancy, an important consideration for application
developers. There are two goals of the normalization process:
- eliminating redundant data (for example, storing the same data in more than one
table) and
- ensuring data dependencies make sense (only storing related data in a table).
Both of these are worthy goals as they reduce the amount of space a database consumes and
ensure that data is logically stored.

X.1 Dependencies
In order to be able to normalize a relation, we must first understand the concept of
dependency between attributes within a relation. There exist various types of dependencies:

1) Functional dependency: If A and B are attributes of relation R, B is functionally


dependent on A (denoted A --> B), if each value of A in R is associated with exactly
one value of B in R.
2) Full functional dependency: We talk about full functional dependency if attribute B
is functional dependent on A, if A is a composite primary key and B is not already
functional dependent on parts of A.

Page 15
Topic: Database Fundamentals By NGANFOR JESPA

3) Transitive dependency: If A determines B and B determines C then C is determined


by (dependent on) A. We write A --> B and B --> C then A --> C.

Examples: Let’s consider the following relation

X.2 The Normal Forms


The database community has developed a series of guidelines for ensuring that databases are
normalized. These are referred to as normal forms and are numbered from one (the lowest
form of normalization, referred to as first normal form or 1NF) through five (fifth normal
form or 5NF). In practical applications, you'll often see 1NF , 2NF , and 3NF along with the
occasional 4NF. Fifth normal form is very rarely seen and won't be discussed in this article.

X.2.1 First normal form (1NF)

A table is in first normal form (1NF) if a relation cannot have repeating fields or groups (no
field must have more than one value): To do it, we have to:
a) Eliminate duplicative columns from the
same table.
b) Create separate tables for each group of
related data and identify each row with a
unique column or set of columns (the
primary key).
Example Let’s consider the following relation:
Students (Surname, LastName, Knowledge)

The attribute Knowledge can contain multiple values


and therefore the relation is not in the first normal
form.

To get to the first normal form (1NF) we must create a


separate tuple for each value of the multivalued
attribute

X.2.2 Second normal form (2NF)

Page 16
Topic: Database Fundamentals By NGANFOR JESPA

A table is 2NF when it is in 1NF and when all of its non-key attributes are fully dependent
on its primary key (no partial dependency). That is, if X → A holds, then there should not be
any proper subset Y of X, for that Y → A also holds. To do it, we should:
a) Remove subsets of data that apply to multiple rows of a table and place them in
separate tables.
b) Create relationships between these new tables and their predecessors through the use
of foreign keys.
Example: Let’s consider the following relation:
We see here in Student_Project
relation that the primary key
attributes are Stu_ID and
Proj_ID. According to the rule,
Non-key attributes, i.e. Stu_Name and Proj_Name must be dependent upon both and not on
any of the prime key attribute individually. But we find that Stu_Name can be identified by
Stu_ID and Proj_Name can be identified by Proj_ID independently. This is called partial
dependency, w h i c h i s n o t a l l o w e d i n S e c o n d
Normal Form.

So to put the table into 2NF, We broke the


relation in two as depicted in the above picture. So
there exists no partial dependency.

X.2.3 Third normal form (3NF)


A table is in third normal form (3NF) if it is in 2NF and there is no transitive dependency,
that is, an attribute depends on one or more other non-key attributes.
Third normal form (3NF) goes one large step further:
 Meet all the requirements of the second normal form.
 Remove columns that are not dependent upon the primary key.
Example: Let’s consider the following student_detail relation
We find that in above
depicted Student_detail
relation, Stu_ID is key
and the only primary key
Attribute. We find that City can be identified by
Stu_ID as well as Zip itself. Additionally, Stu_ID →
Zip → City, so there exist transitive dependency.

We broke the relation as above depicted two relations


to bring it into 3NF

Page 17
Topic: Database Fundamentals By NGANFOR JESPA

X.2.4 Boyce-Codd Normal Form

BCNF is an extension of Third Normal Form in strict way. it referred to as the "third
and half (3.5) normal form", since it adds one more requirement:
 Meet all the requirements of the third normal form.
 Every determinant must be a candidate key.

X.2.5 Fourth normal form (4NF)

Finally, fourth normal form (4NF) has one additional requirement:


 Meet all the requirements of the third normal form.
 A relation is in 4NF if it has no multi-valued dependencies.
Multivalued dependencies occur when the presence of one or more rows in a table implies
the presence of one or more other rows in that same table. In practice we rarely need to apply
the 4NF to a database.

XI. INTRODUCTION TO QUERIES

A database query is used to extract data from the database in a readable format according to
the user's request. For instance, if you have an employee table, you might issue a SQL
statement that returns the employee who is paid the most. This request to the database for
usable employee information is a typical query that can be performed in a relational database.
The most known and most used query language is SQL (Structured Query Language).

XI.1 Introduction to SQL


SQL (Structured Query Language) is a database query language developed by IBM in the
1970s as a way of getting information into and out of relational DBMSs. A fundamental
difference between SQL and standard programming languages is that SQL is declarative, that
is, the user has to specify what kind of data are required from the database and the RDBMS is
responsible for figuring out the way to retrieve it.

XI.2 Basic operations in a RDB


In a relational database, three basic operations are used to develop useful sets of data:
selection, projection and join.
 The selection ( ) operation retrieves certain records from a relation based on the user-
specified criteria.
 The projection ( ) operation extracts fields from a relation, permitting the user to create
new relations that contain only the required information.
 The join operation combines the data from the two relations based on a common column,
providing the user with more information than is available in individual relations.
Together, these three operations are part of relational algebra.

Page 18
Topic: Database Fundamentals By NGANFOR JESPA

XI.3 SQL Data Types


When the table is defined every field in it is assigned a data type. The type of a data value
both defines and constrains the kinds of operations, which may be performed on it. Some of
the most commonly used SQL data types are as follows:

• CHAR(size): It defines a fixed size-length character string (can contain letters,


numbers and special characters), where size can be a maximum of 255.
• VARCHAR (size): It defines a variable length character string (can contain
letters, numbers and special characters) of up to size characters.
• NUMBER (size): It defines an integer-type data with maximum number of digits
up to size specified in parenthesis.
• DATE: It is used to store date. By default, the format is YYYY-MM-DD.
• NUMBER (size, decimal): It holds numbers with fractions. The maximum
numbers of digits are specified in size. The maximum number of digits to the right of
the decimal is specified in decimal.

XI.4 SQL Basic Commands


SQL commands can be divided into two main sublanguages: DDL and DML

• Data Definition Language: DDL is used to create and delete database and its objects.
These commands are primarily used by the DBA during the building and removal
phases of a database project. The most important DDL statements in SQL are as
follows:
- CREATE TABLE: To create a new table.
- ALTER TABLE: To modify the structure of a table.
- DROP TABLE: To delete a table.
• Data Manipulation Language: DML is used to retrieve, insert, modify and delete
database information. These commands will be used by all database users during the
routine operation of the database. The most important DML statements in SQL are the
following:
- INSERT: To insert data into a table.
- UPDATE: To update data in a table.
- DELETE: To delete data from a table.
- SELECT: To retrieve data from a table.

NOTE: All SQL queries must be terminated by a semicolon (;) even if the statement extends
over many lines.

XI.4.1 CREATE TABLE Command

The CREATE TABLE command is used to define the structure of the table.

Syntax: Example:
CREATE TABLE <tablename> ( CREATE TABLE EMPLOYEE(
<field1> <data type>, Code NUMBER(5),

Page 19
Topic: Database Fundamentals By NGANFOR JESPA

< field2> <data type>, Deptt CHAR(10), Name


... ... ... ... ... CHAR(20), Address
< fieldN> <data type>); CHAR(100), Telephone
CHAR(8) Salary
NUMBER(8,2));

• The table and column names must start with a letter followed by letters, numbers or
underscores.
• Avoid using SQL keywords as names for tables or columns (such as SELECT,
CREATE, and INSERT).
• For each column, a name and a data type must be specified and the column name
must be unique within the table definition.
• Each column definition should be separated with a comma.

XI.4.2 ALTER TABLE Command

The ALTER TABLE command allows a user to change the structure of an existing table.

• New columns can be added with the ADD clause.


• Existing columns can be modified with the MODIFY clause.
• Columns can be removed from a table by using the DROP clause.

Syntax:
ALTER TABLE <tablename>
<ADD | MODIFY | DROP column(s)>;

Examples: Explanation
1 ALTER TABLE EMPLOYEE command will add a new column, named Email,
ADD Email CHAR(25); having a maximum width of 25
characters in the EMPLOYEE table.
2 ALTER TABLE EMPLOYEE command will change the maximum width of the
MODIFY Name CHAR(25); Name column to 25 characters in the EMPLOYEE
table.
3 ALTER TABLE EMPLOYEE command will delete the Deptt column from the
DROP Deptt; EMPLOYEE table.

XI.4.3 DROP TABLE Command

The DROP TABLE command removes the table definition (with all records).

• Columns can be removed from a table by using the DROP clause.

Syntax: Examples:
DROP TABLE <tablename>; DROP TABLE EMPLOYEE;
The above SQL command will delete the EMPLOYEE table.

XI.4.4 INSERT Command

The INSERT command is used to insert or add rows (records) into the specified table.

Page 20
Topic: Database Fundamentals By NGANFOR JESPA

Syntax:
INSERT INTO <tablename> (column1, column2, ..., columnN)
VALUES (value1, value2, ..., valueN);
Examples Explanation
1 INSERT INTO EMPLOYEE ( example 1 will add a new record at the
Code, Deptt, Name, Address, Salary) bottom of the EMPLOYEE table
VALUES (101, 'RD01', 'Prince', 'Park Way', consisting of the values in parenthesis.
15000);
2 INSERT INTO EMPLOYEE
VALUES (102, 'RD01', 'Pankaj', 'Pitampura',
26062700, 8000);
Note that for each of the listed columns, a matching value must be specified. In case no
column list is specified, then a value must be given for each column and in the same order as
specified in the CREATE TABLE command.
XI.4.5 UPDATE Command

The UPDATE command is used for modifying attribute values of records in a table.

Syntax:
UPDATE <tablename>
SET column1 = value1
[, column2 = value2]
... ... ... ... ...
[, columnN = valueN]
[WHERE <condition>];
Note that components specified inside the square brackets [] are optional.

Examples Explanation
1 UPDATE EMPLOYEE Example 1 command will update (in our case,
SET Salary = Salary + 1000;increments) the Salary field with 1000 for all the records.
2 UPDATE EMPLOYEE Example 2 command will increment the Salary column
SET Salary = Salary + 1000 with 1000 f o r o n l y t h o s e r o w s t h a t comply
WHERE Deptt = 'RD01'; w i t h condition specified in WHERE clause (Deptt =
'RD01').
XI.4.6 DELETE Command

The DELETE command is used to delete all or selected records from the specified table.

Syntax: Example:
DELETE FROM <tablename> DELETE FROM EMPLOYEE
[WHERE <condition>]; WHERE Salary > 8000;

NOTE: If WHERE condition is not used in the DELETE command, then all the records from
the specified table will be deleted.

XI.4.7 SELECT command

Page 21
Topic: Database Fundamentals By NGANFOR JESPA

The SELECT statement is used to query the database and retrieve selected data.

Syntax:
SELECT <column1, column2, column3,...., columnN>
FROM <tablename>
[WHERE <condition>]
[GROUP BY <column1, column2, column3,...., columnN>]
[HAVING <condition>]
[ORDER BY <column1, column2, column3,...., columnN [ASC|DESC]>];

To select all the columns of a table, use * instead of column list with SELECT.

Examples Explanation
1 SELECT Code, Name, Salary The SELECT statement selects the values of the three
FROM EMPLOYEE; specified c o l u m n s f r o m t h e E M P L O Y E E t a b l e .
This operation is called projection.
2 SELECT * The SELECT statement selects all those columns from
FROM EMPLOYEE EMPLOYEE table in which the Salary column contains a
WHERE Salary > 7500; value greater than 7500. This operation is called selection.
SELECT * The SELECT statement displays the result in a descending
FROM EMPLOYEE order by the attribute Name.
WHERE Salary > 7500
ORDER BY Name DESC;

The ORDER BY clause specifies a sorting order in which the result tuples of a query are to
be displayed; DESC specifies a descending order. By default, ORDER BY arranges the
result set in ascending order (whether one uses ASC or not).

XI.5 SQL Joins


The SQL Joins clause is used to combine records from two or more tables in a database. A
JOIN is a means for combining fields from two tables by using values common to each.

Consider the following two tables; (a) CUSTOMERS table is as follows:

(a) CUSTOMERS (b) ORDERS


ID Name Age Address Salary Oid Date Cust_id Amount
1 Ramesh 32 Ahmedabad 2000 102 2009-10-08 3 3000
2 Khilan 25 Dehli 1500 100 2009-10-08 3 1500
3 kaushik 23 Chaitali 2000 101 2009-11-20 2 1560
4 Chaitali 25 Mumbai 6500 103 2008-05-20 4 2060
5 Hardik 27 Bhopal 8500
6 Komal 22 MP 4500
7 Muffy 24 Indore 10000

Now, let us join these two tables in our SELECT statement as follows:

Query This would produce the following result:


SELECT ID, Name, Age, Amount ID Name Age Amount
3 kaushik 23 3000

Page 22
Topic: Database Fundamentals By NGANFOR JESPA
FROM CUSTOMERS, ORDERS 3 kaushik 23 1500
WHERE CUSTOMERS.ID = 2 Khilan 25 1560
ORDERS.CUST_ID; 4 Chaitali 25 2060

Here, it is noticeable that the join is performed in the WHERE clause. Several operators can be
used to join tables, such as =, <, >, <>, <=, >=,! =, BETWEEN, LIKE, and NOT; they can all be
used to join tables. However, the most common operator is the equal symbol.

SQL Join Types:

There are different types of joins available in SQL:

 INNER JOIN: returns rows when there is a match in both tables.


 LEFT JOIN: returns all rows from the left table, even if there are no matches in the right
table.
 RIGHT JOIN: returns all rows from the right table, even if there are no matches in the left
table.
 FULL JOIN: returns rows when there is a match in one of the tables.
 SELF JOIN: is used to join a table to itself as if the table were two tables, temporarily
renaming at least one table in the SQL statement.
 CARTESIAN JOIN: returns the Cartesian product of the sets of records from the two
or more joined tables.

QUERRY APPLICATION

1) The table E (for EMPLOYEE)


nr name salary
1 John 100
5 Sarah 300
7 Tom 100

Query Solution Query Solution


select salary Salary select nr, salary nr Salary
from E 100 from E 1 100
300 5 300
7 100
select * nr name salary select * nr name salary
from E 1 John 100 from E 7 Tom 100
where salary < 200 7 Tom 100 where salary < 200
and nr >= 7
select name, salary Name Salary
from E John 100
where salary < 200 Tom 100

Page 23
Topic: Database Fundamentals By NGANFOR JESPA

2) Cartesian product:

The table E (for EMPLOYEE) The table D (for DEPARTMENT)

h enr ename dept dnr dname


1 Bill A A Marketing
2 Sarah C B Sales
C Legal
3 John A

Query Solution Query Solution


select * enr ename dept dnr dname select * enr ename dept dnr dname
from E, D 1 Bill A A Marketing from E, D 1 Bill A A Marketing
1 Bill A B Sales where dept = 2 Sarah C C Sales
1 Bill A C Legal dnr 3 John A A Marketing
2 Sarah B A Marketing
2 Sarah B B Sales
2 Sarah B C Legal
3 John C A Marketing
3 John C B Sales
3 John C C Legal

3) Natural join

The table E (for EMPLOYEE) The table D (for DEPARTMENT)


nr name dept nr name
1 Bill A A Marketing
2 Sarah C B Sales
3 John A C Legal

Query Solution
select * enr ename dept dnr dname
from E as E(enr, ename, dept), D as D(dnr, dname) 1 Bill A A Marketing
where dept = dnr
2 Sarah C C Legal
3 John A A Marketing
Select * nr name dept nr name
from E, D 1 Bill A A Marketing
where dept = D.nr
2 Sarah C C Legal
3 John A A Marketing

Page 24

You might also like