
Chapter 1: Introduction to Database Systems

Fundamentals of Database Systems

Software Engineering Department


Outline
• Data

• Database

• Database Management System

• Database system

• Database-System Applications

• Purpose of Database Systems

• Database system and file based approach

• Characteristics of the database approach

• Database users and Administrators


What is Data?
• The term data refers to known raw facts about things like people, places,
events and concepts.
• The word raw indicates that the facts have not yet been processed to
reveal their meaning.
• Data is any fact that can be recorded or stored on a computer.
• Data takes various forms, such as video, audio, images, graphics, and text
documents.
• Data is the foundation of information, which is the bedrock of knowledge—
that is, the body of information and facts about a specific subject.

Information and Knowledge
• Information is the processed data presented in a form suitable for
human interpretation.
• Information is the result of processing raw data to reveal its meaning.
• Knowledge:

– The body of information and facts about a specific subject.

– Knowledge implies familiarity, awareness, and understanding of
information as it applies to an environment.
– A key characteristic is that new knowledge can be derived from
old knowledge.

Database
• A database is an organized collection of interrelated data, generally
stored and accessed electronically from a computer system.
• It contains information relevant to an enterprise.
• Management of data involves both defining structures for storage of
information and providing mechanisms for the manipulation of information.
• A database is an important asset for many organizations.

• Databases touch all aspects of our lives.

Database (Cont.)
• The database is an integrated collection of facts about an organization.
• An organization can be a university or a department in a university, an insurance company,
a manufacturing company, a bank, an airline, a telecommunications company, a governmental or
non-governmental organization, a research institution, etc.

• The database is used as a central data source for other applications

Database (Cont.)
• A database can be characterized by its basic implicit properties:
– It represents some aspect of the real world, called the mini-world
– A random assortment of data is not a database
– It is a logically coherent collection of data

– It has intended users and applications

Database Management System
• A database management system (DBMS) is software used to construct, manipulate, and retrieve the data
in a database.

• A DBMS is a collection of usually complex pieces of software that allows a user to
define, create, manipulate, protect, and manage access to a database.

• The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient
and efficient.

Database Systems
• A database system is a collection of interrelated data and a set of
programs that allow users to access and modify these data.
• Database systems are used to manage collections of data that are:
– Highly valuable,

– Relatively large, and

– Accessed by multiple users and applications, often at the same time.

Database Systems (Cont’d…)
• A modern database system is a complex software system whose task
is to manage a large, complex collection of data.

• The database system must ensure the safety of the information
stored, despite system crashes or attempts at unauthorized access.

• Database systems are ubiquitous today, and most people interact,
either directly or indirectly, with databases many times every day.

Database Applications Examples
Some applications of database systems:
• Enterprise Information
– Sales: customers, products, purchases
– Accounting: payments, receipts, assets
– Human Resources: Information about employees, salaries, payroll
taxes.
• Manufacturing: management of production, inventory, orders, supply
chain.
• Banking and finance
– customer information, accounts, loans, and banking transactions.
– Credit card transactions
– Finance: sales and purchases of financial instruments (e.g., stocks
and bonds); storing real-time market data
• Universities: registration, grades

Database Applications Examples (Cont.)

• Airlines: reservations, schedules

• Telecommunication: records of calls, texts, and data usage, generating
monthly bills, maintaining balances on prepaid calling cards
• Web-based services

– Online retailers: order tracking, customized recommendations

– Online advertisements

• Document databases

• Navigation systems: maintaining the locations of various places of
interest along with the exact routes of roads, train systems, buses,
etc.
Purpose of Database Systems
In the early days, database applications were built directly on top
of file systems, which led to:
• Data redundancy and inconsistency
– Data is stored in multiple files and file formats, resulting in duplication of
information in different files
• Difficulty in accessing data
– Need to write a new program to carry out each new task
• Data isolation
– Multiple files and formats
• Integrity problems
– Integrity constraints (e.g., account balance > 0) become “buried”
in program code rather than being stated explicitly
– Hard to add new constraints or change existing ones

Purpose of Database Systems (Cont.)

• Atomicity of updates
– Failures may leave database in an inconsistent state with partial updates carried out
– Example: Transfer of funds from one account to another should either complete or
not happen at all
• Concurrent access by multiple users
– Concurrent access needed for performance
– Uncontrolled concurrent accesses can lead to inconsistencies
• Ex: Two people reading a balance (say 100) and updating it by withdrawing
money (say 50 each) at the same time
• Security problems
– Hard to provide user access to some, but not all, data

Database systems offer solutions to all the above problems
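
The atomicity and integrity points above can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the table name account, the helper functions make_bank, transfer, and balances, the sample balances, and the choice of balance >= 0 as the constraint are all assumptions made for the example, not part of the course material.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    # The integrity constraint is stated explicitly in the schema
    # instead of being buried in application code.
    conn.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 20)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the overdraft; the credit to dst
        # that was already applied has been rolled back as well.
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account"))
```

A transfer that would overdraw the source account leaves both balances unchanged, which is the "complete or not happen at all" behaviour described above.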

Example University Database
• Mini-world for a University Database example: Part of a
UNIVERSITY environment.
• Some mini-world entities:

– STUDENTs

– PRE-REQUISITE COURSEs

– COURSEs

– DEPARTMENTs

– INSTRUCTORs

Note: The above could be expressed in the ENTITY-RELATIONSHIP data
model.

University Database Example

• Data consists of information about:


– Students
– Instructors
– Classes
• Application program examples:
– Add new students, instructors, and courses
– Register students for courses, and generate class rosters
– Assign grades to students, compute grade point averages (GPA)
and generate transcripts

Example University Database (Continued…)
• Some mini-world relationships:
– STUDENTs take COURSEs
– COURSEs have PRE-REQUISITE COURSEs
– INSTRUCTORs teach COURSEs
– COURSEs are offered by DEPARTMENTs
– STUDENTs major in DEPARTMENTs
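
One possible way to record these entities and relationships in a relational DBMS is sketched below with Python's built-in sqlite3 module. All table and column names, and the helper functions build_university, table_names, and insert_student, are hypothetical choices for illustration; for simplicity the sketch also assumes each course has one instructor.

```python
import sqlite3

# Hypothetical relational schema for the UNIVERSITY mini-world.
SCHEMA = """
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    major_dept INTEGER REFERENCES department(dept_id)  -- STUDENTs major in DEPARTMENTs
);
CREATE TABLE course (
    course_id     TEXT PRIMARY KEY,
    title         TEXT NOT NULL,
    dept_id       INTEGER REFERENCES department(dept_id),       -- offered by
    instructor_id INTEGER REFERENCES instructor(instructor_id)  -- taught by
);
CREATE TABLE prerequisite (                                     -- COURSEs have PRE-REQUISITE COURSEs
    course_id TEXT REFERENCES course(course_id),
    prereq_id TEXT REFERENCES course(course_id),
    PRIMARY KEY (course_id, prereq_id)
);
CREATE TABLE enrollment (                                       -- STUDENTs take COURSEs
    student_id INTEGER REFERENCES student(student_id),
    course_id  TEXT REFERENCES course(course_id),
    grade      REAL,
    PRIMARY KEY (student_id, course_id)
);
"""


def build_university():
    """Create an empty in-memory university database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only when asked
    conn.executescript(SCHEMA)
    return conn


def table_names(conn):
    """List the tables recorded in SQLite's catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]


def insert_student(conn, student_id, name, major_dept):
    """Try to add a student; return False if a relationship would be violated."""
    try:
        conn.execute(
            "INSERT INTO student VALUES (?, ?, ?)", (student_id, name, major_dept)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this design a student cannot major in a department that does not exist, because the DBMS checks the reference on every insert.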

Example Relational Database Snapshot

Evolution of Database Systems
• Two approaches to converting data into information:
– File-based
• Developed starting in the 1960s
• Stores, manipulates, and retrieves data from large flat files
– Database (relational systems)
• The relational model was proposed by E. F. Codd of IBM in 1970; commercial relational systems followed around the early 1980s
• Widely used today


File-Based Approach
• A file is a collection of related information
• A system of files and collection of application programs manipulating them is
a file-based system

University File-Based System

Limitations of File-Based Approach

• Answering ad hoc queries takes considerable effort:

– What is the average grade for Mr. Abrham’s students?

– List the activities for all students enrolled in CoSc2041.

– Which personnel are students as well as staff?

• Other limitations:
– Duplication of data

– Data dependency

– Slow development, high maintenance and fixed queries
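
For contrast, once the data is held in a DBMS, ad hoc questions like the first two above become short queries rather than new programs. The sketch below uses Python's built-in sqlite3 module; the tables course and enrollment, their columns, the sample rows, and the helper functions average_grade and students_in are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_id TEXT PRIMARY KEY, instructor TEXT);
CREATE TABLE enrollment (student TEXT, course_id TEXT, grade REAL);
INSERT INTO course VALUES ('CoSc2041', 'Abrham'), ('CoSc1012', 'Sara');
INSERT INTO enrollment VALUES
    ('Hana', 'CoSc2041', 80), ('Dawit', 'CoSc2041', 90), ('Hana', 'CoSc1012', 70);
""")


def average_grade(conn, instructor):
    """Average grade over all enrollments in courses taught by instructor."""
    row = conn.execute(
        "SELECT AVG(e.grade) FROM enrollment e "
        "JOIN course c ON c.course_id = e.course_id "
        "WHERE c.instructor = ?",
        (instructor,),
    ).fetchone()
    return row[0]


def students_in(conn, course_id):
    """Names of the students enrolled in one course, in alphabetical order."""
    rows = conn.execute(
        "SELECT student FROM enrollment WHERE course_id = ? ORDER BY student",
        (course_id,),
    )
    return [r[0] for r in rows]
```

In a file-based system each of these questions would typically need its own program that opens, parses, and matches records across several files.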

Database Users and Administrators

• A primary goal of a database system is to retrieve information from
and store new information in the database.
• People who work with a database can be categorized as:

– Database users

• Naïve users

• Application programmers

• Sophisticated users

– Database administrators

Database users
• Naive users are unsophisticated users who interact with the system
by using predefined user interfaces, such as web or mobile
applications.

• The typical user interface for naive users is a forms interface, where
the user can fill in appropriate fields of the form.

• Naive users may also read reports generated from the database.

Database users
• Application programmers are computer professionals who write application programs.
Application programmers can choose from many tools to develop user interfaces.

• Sophisticated users interact with the system without writing programs. Instead, they
form their requests either using a database query language or by using tools such as
data analysis software.
Analysts who submit queries to explore data in the database fall in this category.

Database Administrator
A person who has central control over the system is called a database
administrator (DBA). Responsibilities of a DBA include:

 Schema definition

 Storage structure and access-method definition

 Schema and physical-organization modification

 Granting of authorization for data access

 Routine maintenance

 Periodically backing up the database

 Ensuring that enough free disk space is available for normal operations, and
upgrading disk space as required
 Monitoring jobs running on the database
Responsibilities of DBA
Schema definition.
The DBA creates the original database schema by executing a set of data
definition statements in the DDL.

Storage structure and access-method definition.


The DBA may specify some parameters pertaining to the physical organization
of the data and the indices to be created.

Schema and physical-organization modification.


The DBA carries out changes to the schema and physical organization to reflect
the changing needs of the organization, or to alter the physical organization to
improve performance.

Responsibilities of DBA (Cont’d.)

• Granting of authorization for data access.

By granting different types of authorization, the database administrator can
regulate which parts of the database various users can access.

The authorization information is kept in a special system structure that the
database system consults whenever a user tries to access the data in the
system.

Responsibilities of DBA (Cont’d.)
• Routine maintenance.

Examples of the database administrator’s routine maintenance activities are:


Periodically backing up the database onto remote servers, to prevent loss of data
in case of disasters such as flooding.
Ensuring that enough free disk space is available for normal operations, and
upgrading disk space as required.

Monitoring jobs running on the database and ensuring that performance is not
degraded by very expensive tasks submitted by some users.

End of Chapter One!

Chapter 2: Database System Concepts and Architecture
Outline

1. Data Models, Schemas, and Instances
2. Three-Schema Architecture and Data Independence
3. Database Languages and Interfaces
4. Software Modules of DBMS
5. Centralized and Client/Server Architecture for DBMS
6. Classification of Database Management Systems
2.1 Data Models, Schemas, and Instances

 Data Model: A set of concepts to describe the structure
of a database, and certain constraints that the database should obey.
Data models provide data abstraction.

 Structure: data types, relationships, constraints

 Data Model Operations: Operations for specifying database retrievals
and updates by referring to the concepts of the data model.

 Generic operations: insert, delete, modify, retrieve

 User-defined operations
2.1.1 Categories of Data Models
 Conceptual (high-level, semantic) data models: Provide concepts
that are close to the way many users perceive data. (Also called
entity-based or object-based data models.)
 Entities
 Attributes
 Relationships
 Physical (low-level, internal) data models: Provide concepts that
describe details of how data is stored in the computer.
 Record formats
 Record ordering
 Access paths
2.1.1 Categories of Data Models (Continued…)

 Implementation (record-oriented) data models: Provide concepts that fall
between the above two, balancing user views with some computer storage
details.
 Relational
 Network
 Hierarchical
2.1.2 Schemas, Instances and Database State
 Database Schema (meta-data): The description of a database. It is
essential for understanding the structure of the database and its
constraints
 Schema Diagram: A diagrammatic display of (some aspects of) a
database schema
 Database Instance: The actual data stored in a database at a
particular moment in time. Also called the database state (or
occurrence, or snapshot)

 The database schema changes very infrequently. The database state
changes every time the database is updated. The schema is also called
the intension, whereas the state is called the extension
Schema diagram for the UNIVERSITY database

 Schema Constructs
The UNIVERSITY database
Differentiating Schema and Instances

 The database schema changes less frequently than the state. It changes
during schema evolution
 When the database is first defined, the database instance is empty
 When data is first loaded into the database, the database enters its
initial state
 Every insert, update, or delete changes the database state
 Valid state is a state that satisfies the structure and constraints of
the database
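
The difference between schema and state can be observed directly. The following is a minimal sketch using Python's built-in sqlite3 module; the student table, its sample rows, and the helper functions schema, state, and try_insert are assumptions made for illustration. SQLite keeps its catalog in a table named sqlite_master, which the sketch reads to show the stored schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def schema(conn):
    """The schema (intension): table definitions stored in the catalog."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [r[0] for r in rows]


def state(conn):
    """The current database state (extension) of the student table."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()


def try_insert(conn, row):
    """Attempt an insert; the DBMS refuses rows that would give an invalid state."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


schema_before = schema(conn)
empty_state = state(conn)                      # just defined: no data yet
conn.execute("INSERT INTO student VALUES (1, 'Hana')")
initial_state = state(conn)                    # after the first load
conn.execute("UPDATE student SET name = 'Hanna' WHERE id = 1")
later_state = state(conn)                      # every update gives a new state
schema_after = schema(conn)                    # the schema did not change
```

The state passes through three different values while the schema stays the same, and an insert that violates the NOT NULL constraint is rejected so the database never leaves a valid state.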
2.2 Three-Schema Architecture and Data Independence
 2.2.1 Three-Schema Architecture

Database architecture: a set of specifications, rules, and processes that determine how
the database is designed and constructed. It also determines how data is stored in the
database and how data is accessed by the components of the DBMS

 The three-schema architecture is a generally accepted architecture for database systems,
proposed by ANSI/SPARC
 The 3-schema architecture was proposed to achieve and visualize 3
important characteristics:
 Insulation of programs and data (program-data and program-operation independence)
 Support of multiple views of the data
 Use of catalog (database description)
2.2.1 Three-Schema Architecture(Continued…)
 The database system is described by three abstract levels which correspond
to three different views of the data stored in the database

 In this architecture schemas can be defined at 3 different levels:

1. External or View level - individual user view
 Contains a number of external schemas or user views
 Each external schema describes the part of the database that a
particular user group is interested in and hides the rest
 A high-level data model or an implementation data model can be used at
this level.
Fig. Three-Schema Architecture
2.2.1 Three-Schema Architecture(Continued…)

2. Conceptual or Logical level - community user view

 Has a conceptual schema that describes the structure of the
whole database to a community of users
 The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, relationships,
data types, constraints, and user operations
 A high-level data model or an implementation data model can be used
at this level
2.2.1 Three-Schema Architecture(Continued…)
3. Internal or Physical level - physical or storage view

 Has an internal schema that describes the details of the physical
storage structures of the database
 The internal schema uses a physical data model and describes the
complete details of data storage and access paths for the
database
 Efficiency considerations are the most important at the internal
level, and the data structures are chosen to provide efficient
database access

 Do all DBMSs implement the 3-schema architecture?

 Does the 3-schema architecture deal with the actual data?


Mappings

 Mapping: the process of transforming requests and results between levels

– i.e. the process of transforming a request specified on an external
schema into a request against the conceptual schema, and then into a request
against the internal schema for processing over the stored database

 These mappings can be time-consuming, so some DBMSs, especially those that
are meant to support small databases, do not support external views


2.2.2 Data Independence
 What is data independence?
 The capacity to change the schema at one level of a database system
without having to change the schema at the next higher level. Only the
mappings between the levels are changed
 The 3-schema architecture makes it easier to achieve true data
independence
 There are two kinds of data independence:

1. Logical data independence

2. Physical data independence


2.2.2 Data Independence (Continued…)

1. Logical data independence

 Is the capacity to change the conceptual schema without having to
change the external schemas or application programs
 When do we need to change the conceptual schema?
 To expand the database (e.g. by adding a record type or data item)
 To reduce the database (e.g. by removing a record type or data item)
2.2.2 Data Independence (Continued…)

2. Physical data independence

 Is the capacity to change the internal schema without having to
change the conceptual schema or external schemas

 When do we need to change the internal schema?

 To improve the performance of retrieval or update of data. The
conceptual schema remains the same as long as we do not add an
extra record type, data item, or constraint to the database.
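
Both kinds of data independence can be tried out on a small scale. The sketch below uses Python's built-in sqlite3 module; the student table, the view student_names (standing in for an external schema), the index name, and the function run_app (standing in for an application program) are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
INSERT INTO student VALUES (1, 'Hana', 'SE'), (2, 'Dawit', 'CS');
-- External schema: a view exposing only what one user group needs.
CREATE VIEW student_names AS SELECT id, name FROM student;
""")

# The "application program": it only knows the external schema.
APP_QUERY = "SELECT name FROM student_names ORDER BY id"


def run_app(conn):
    """Run the unchanged application query and return its result."""
    return [r[0] for r in conn.execute(APP_QUERY)]


before = run_app(conn)

# Physical data independence: change the internal level by adding an
# access path. Neither the table definition nor the query changes.
conn.execute("CREATE INDEX idx_student_name ON student(name)")
after_index = run_app(conn)

# Logical data independence: expand the conceptual schema with a new
# data item. The view, and the program using it, are unaffected.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_new_column = run_app(conn)
```

The application query returns the same result after each change, which is the insulation that data independence is meant to provide.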


2.3 Database Languages and Interfaces
 2.3.1 DBMS Languages
a. In DBMSs where no strict separation of levels is maintained:
 One language, called the data definition language (DDL), is used by
the DBA and by database designers to define both the conceptual
and internal schemas
b. In DBMSs where a clear separation is maintained between the conceptual
and internal levels :
 The DDL is used to specify the conceptual schema only
 Another language, the storage definition language (SDL), is used to
specify the internal schema
 The mappings between the two schemas may be specified in either
one of these languages.
c. For a true three-schema architecture:
 We would need a third language, the view definition language
(VDL), to specify user views and their mappings to the conceptual
schema, but in most DBMSs the DDL is used to define both
conceptual and external schemas.
2.3 Database Languages and Interfaces

 2.3.1 DBMS Languages


 Any data manipulation language (DML), the language used to retrieve
and update data, can be embedded in a general-purpose programming
language
 That language is called the host language and the DML is called
the data sublanguage
 But if a high-level DML is used in a stand-alone interactive manner,
it is called a query language
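
The host language and data sublanguage roles can be seen in the short sketch below, where Python plays the host language and SQL the data sublanguage. Note the assumption: Python's built-in sqlite3 module passes SQL text through a call-level interface rather than through a precompiler, so this only approximates classic embedded DML. The student table, the sample rows, and the function top_students are hypothetical.

```python
import sqlite3


def top_students(conn, min_gpa):
    """Return formatted lines for students at or above min_gpa."""
    # Data sublanguage (SQL): says WHAT to retrieve. The host language
    # supplies the parameter value through the "?" placeholder.
    rows = conn.execute(
        "SELECT name, gpa FROM student WHERE gpa >= ? ORDER BY gpa DESC",
        (min_gpa,),
    ).fetchall()
    # Host language (Python): ordinary processing of the result.
    return [f"{name}: {gpa:.2f}" for name, gpa in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Hana", 3.8), ("Dawit", 3.2), ("Sara", 2.9)],
)
```

Typing the same SELECT statement directly into an interactive SQL shell would be the query language use described in the last bullet.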
2.3 Database Languages and Interfaces
 2.3.2 DBMS Interfaces
 Menu-Based Interfaces for Browsing
– Provide the user with list of options called menus
– The menus lead the user through the formulation of a request
– No need to memorise specific syntax of a query language
– Query is composed step-by-step by picking options
 Forms-Based Interfaces
– Such interfaces display a form to each user
– User then fills out form entries
– Usually for naïve users as interfaces for canned transactions
– Some DBMSs have form specification languages and others have utilities
2.3 Database Languages and Interfaces
 2.3.2 DBMS Interfaces
 Graphical User Interfaces
– GUI typically displays a schema to the user in a diagrammatic form
– Query is specified by manipulating the diagram
– In many cases they use both menus and forms
– Most GUIs use pointing devices to pick certain parts of a displayed
schema diagram
 Natural Language Interfaces
– Accept requests written in English or some other language and try to
understand them
– Have their own schema similar to that of database conceptual schema
– The natural language interface refers to the words in its schema, as well
as to a set of standard words, to interpret the request
2.3 Database Languages and Interfaces
 2.3.2 DBMS Interfaces
 Interfaces for Parametric Users
– Parametric users have a small set of operations that they must perform repeatedly
– A small set of abbreviated commands is used to minimize the number of key
strokes required
– Function keys can be programmed to initiate various commands
 Interfaces for the DBA
– A special interface is needed for the DBA to execute commands that
only the DBA is allowed to run.
– Such commands include creating users, setting system
parameters, granting account authorization, changing a schema, etc.
2.4 Software Modules of the DBMS
Fig. Component modules of a DBMS and their interactions
Users: DBA staff, casual users, application programmers, parametric users
Inputs: DDL statements, privileged commands, interactive queries, application programs, compiled (canned) transactions
Modules: DDL compiler, query compiler, precompiler, host language compiler, DML compiler, run-time database processor, stored data manager, concurrency control and backup/recovery subsystems
Storage: system catalog/data dictionary, stored database
2.4 Software Modules of the DBMS
 Access to the disk is controlled primarily by the operating system
(OS), which schedules disk input/output
 Stored data manager module of the DBMS controls access to DBMS
information that is stored on disk, whether it is part of the database
or the catalogue. Note the dotted lines and circles
 Stored data manager uses basic services of the OS to perform low-
level data transfer between hard disk and main memory
 Another task of the stored data manager: handling buffers in main memory
2.4 Software Modules of the DBMS
 The DDL compiler processes schema definitions, specified in the DDL, and
stores the schema definitions (meta-data) in the DBMS catalogue
 The run-time database processor handles database accesses at run time
under the supervision of stored data manager
 It receives retrieval or update operations and carries them out on the
database
 The query compiler handles high-level queries that are entered interactively
2.4 Software Modules of the DBMS

 The query compiler parses, analyzes, and compiles or interprets a query by creating database
access code, and then generates calls to the run-time processor for executing
the code
 The pre-compiler extracts DML commands from an application program written
in a host programming language
 These commands are sent to the DML compiler for compilation into object code
for database access
 The rest of the program is sent to the host language compiler
 Object codes generated by DML compiler + Object codes from Host language
compiler = Canned transactions
 The executable codes of the canned transactions include calls to the Run-time
database processor
2.5 Centralized & Client/Server architecture for DBMS
 2.5.1 Centralized and Distributed Databases
 DBMSs can be categorized as centralized or distributed based on the
number of sites over which the database is located

 A DBMS is centralized if the data is stored at a single computer site

 A centralized DBMS can support multiple users, but the DBMS and the
database themselves reside totally at a single computer site

 The following figure shows a single-tier Centralized database


2.5 Centralized & Client/Server architecture for DBMS
Fig. Centralized database: dumb terminals 1 to n connected to a single site that holds the DBMS and the data
2.5 Centralized & Client/Server architecture for DBMS

 A distributed DBMS (DDBMS) can have the actual database and DBMS
distributed over many sites and connected by a computer network

 Homogeneous DDBMSs use the same DBMS at multiple sites

 Heterogeneous DDBMSs use different DBMSs at multiple sites. This leads to
a Federated DBMS (or Multidatabase system), where the participating
DBMSs are loosely coupled and have a degree of local autonomy

 Many DDBMSs use client-server architecture.


Homogeneous DDBMSs
Fig. Homogeneous DDBMS: four sites (Mekelle, Adigrat, Axum, Adwa), each with clients on a LAN and a MySQL server
Heterogeneous DDBMSs
Fig. Heterogeneous DDBMS: four sites (Mekelle, Adigrat, Axum, Adwa), each with clients on a LAN, running different DBMSs (DB2, MySQL, MS Access, Oracle)
2.5 Centralized & Client/Server architecture for DBMS
 2.5.2 Client/Server architecture for DBMS
 A client is defined as a requester of services and a server is defined as
the provider of services. Usually the server is a more powerful machine to
provide the services

 A service can be any resource such as data, display device, CPU time,
memory, etc.

 A single machine can be both a client and a server depending on the
software configuration

 The client and the server have their own characteristics:


2.5 Centralized & Client/Server architecture for DBMS

Characteristics of a client
 Initiates requests
 Waits for replies
 Receives replies
 Usually connects to a small number of servers at one time
 Typically interacts directly with end-users using a graphical user interface
Characteristics of a server
 Never initiates requests or activities
 Waits for and replies to requests from connected clients
 A server can remotely install/uninstall applications and transfer data to the
intended clients
2.5 Centralized & Client/Server
architecture for DBMS
 Two-tier Client/Server Architecture
 It contains a client and a server
 The DBMS & database are stored on the server, and the interface used to
access the database is installed on the client
 An interface called Open Database Connectivity (ODBC) provides an
application programming interface (API) that enables clients to call the DBMS

Fig. Two-tier client/server over a communication network: clients #1 to #3 (user interface) send data requests to the server, which holds the DBMS and the database and returns data responses
2.5 Centralized & Client/Server
architecture for DBMS
 Usage Considerations
 Used extensively in non-time-critical information processing where
management and operations of the system are not complex

 Used frequently in decision support systems where the transaction
load is light

 They work well in relatively homogeneous environments with
processing rules (business rules) that do not change very often and
when workgroup size is expected to be fewer than 100 users, such
as in small businesses.
2.5 Centralized & Client/Server
architecture for DBMS
 Advantages:
 Because processing is shared between the client and the server, a larger
number of users can interact with such a system
 Disadvantages:
 Performance declines when the number of users exceeds 100
 Clients may require more resources
 With much similar processing on many clients, extending
existing applications and implementing new ones becomes
more complex
 User interface and business processing tend to get mixed
together
2.5 Centralized & Client/Server
architecture for DBMS
 The 3-tier architecture
 Purpose: to overcome the limitations of the two-tier approach
 There are three tiers:
 Client tier – the interface between the user and the system
 Middle tier (mainly Application Server or Web Server) – contains
most of the logic and communicates between the other tiers
 Database Server – manages the database
Fig. Three-tier architecture: Client (GUI, Web Interface); Application Server or Web Server (Application Programs, Web Pages); Database Server (Database Management System)
2.5 Centralized & Client/Server
architecture for DBMS
 The intermediate layer called the Applications Server or Web Server has the
following basic functions:

 Stores the web connectivity software and the rules and business logic
(constraints) part of the application used to access the right amount of
data from the database server

 Acts like a conduit for sending partially processed data between the
database server and the client

 Encryption and decryption of data for transmission between the client &
server
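
The division of work between the three tiers can be sketched in a single Python file. This is only a toy, in-process illustration: in a real deployment the tiers usually run on separate machines and talk over a network. The grade table, the ApplicationServer class, the PASS_MARK rule, and the render function are all invented for the example.

```python
import sqlite3


# Database tier: stores the data and answers queries.
def make_database():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE grade (student TEXT, course TEXT, score REAL)")
    conn.executemany(
        "INSERT INTO grade VALUES (?, ?, ?)",
        [("Hana", "CoSc2041", 88.0), ("Dawit", "CoSc2041", 45.0)],
    )
    return conn


# Middle tier: holds the business rules and passes partially processed
# data between the database tier and the client tier.
class ApplicationServer:
    PASS_MARK = 50.0  # hypothetical business rule, kept in one place

    def __init__(self, conn):
        self._conn = conn

    def course_results(self, course):
        rows = self._conn.execute(
            "SELECT student, score FROM grade WHERE course = ? ORDER BY student",
            (course,),
        )
        return [
            {"student": student, "passed": score >= self.PASS_MARK}
            for student, score in rows
        ]


# Client tier: presentation only; it never sees SQL or the database.
def render(results):
    return [
        f"{r['student']}: {'pass' if r['passed'] else 'fail'}" for r in results
    ]
```

If the pass mark changes, only ApplicationServer is edited; the client code and the database stay as they are.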
2.5 Centralized & Client/Server
architecture for DBMS
 The 3-tier architecture
 IIS = Internet Information Services (formerly Internet Information Server)
 IIS is a web server developed by
Microsoft
 It has long been one of the most widely used web
servers, alongside Apache HTTP
Server
2.5 Centralized & Client/Server
architecture for DBMS
 The thin-client 3-tier model has these tiers:
 The database management system (DBMS)
 The main application software
 A web browser
 Example: what IT students use in their web design course:
 Database tier: MySQL
 Middle tier: PHP/HTML
 Client tier: Your favorite web browser
 The thick-client 3-tier model has these tiers:
 The database management system (DBMS)
 The main application software
 Some sort of interface software which must be installed on each
client machine
2.5 Centralized & Client/Server
architecture for DBMS
 Advantages of the 3-tier architecture
 Removes a huge processing burden from client machines

 Can be used to consolidate enterprise-wide business rules, as application
servers process business rules in a single place for use by multiple
applications. When rules change, only a change to the application server is
required

 Any knowledge of the database server may be hidden from the client.
Database queries may be presented to the client in alternative forms

 The middle-tier server improves performance, flexibility, maintainability,
reusability, and scalability by centralizing process logic
2.5 Centralized & Client/Server
architecture for DBMS
 An n-tier architecture is one which has n tiers, usually including a
database tier, a client tier, and n-2 other tiers in between

 In other words an n-tier model will have :

 The database management system (DBMS)


 (n-2) application layers
 A GUI (thin or thick)
2.6 Classification of Database Management Systems

 Based on the data models used:
 Traditional: Relational, Hierarchical, Network
 Emerging: Object-relational, Object-Oriented
 Based on number of users
 Single-user
 Multi-user
 Based on number of sites
 Centralized
 Distributed
 Based on purpose
 Special-purpose
 General-purpose
 Based on cost
 <$10K
 $10K – $100K
 >$100K
Thank you
Any questions?
Chapter 3: Structured Query Language (SQL)

Fundamentals of Database Systems

Software Engineering Department


Outline
• Introduction to SQL

• SQL Statements

• Data Definition Language SQL Statements

• Data Manipulation Language SQL Statements

• SQL Constraints

• SQL Data Types

• Grant and Revoke SQL Statements

• SQL functions

• What is SQL injection, and how to secure the database

Introduction to SQL
• SQL stands for Structured Query Language

• It is the language used to communicate with the database

• SQL is used by all major relational database systems, such as MySQL,
PostgreSQL, SQL Server, IBM DB2, and Oracle

• SQL was initially developed at IBM in the 1970s by Donald Chamberlin
and Raymond Boyce. It was originally called “Structured English
Query Language” (SEQUEL), pronounced “sequel”

Introduction to SQL (Contd.)
• SQL is an ANSI (American National Standards Institute) standard
• SQL is composed of commands that enable users to create database
and table structures, perform various types of data manipulation and
data administration, and query the database to extract useful
information.
• SQL is a standard way to query, add, delete, and modify the data
in a database.

Introduction to SQL (Contd.)
• SQL is used by academics, data scientists, machine
learning engineers, software engineers, etc.
• SQL is not a general-purpose programming language, unlike
C, C++, Java, and Python.
• SQL is called a domain-specific language because it
is only useful in the domain of databases.

Introduction to SQL (Contd.)
• Its primary task is to query, add, delete, and modify data efficiently.
• SQL is a declarative programming language: you do NOT define step-by-step procedures to get a result. Instead, you state what you want, not how to get it.
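The declarative idea can be sketched by answering the same question twice: once with an explicit loop and once with a single SQL statement. This is a minimal illustration that assumes Python's built-in sqlite3 module and an in-memory database; the students table and its rows are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data: (studentid, fname, age).
rows = [(1, "Abel", 19), (2, "Sara", 23), (3, "Lidya", 21)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (studentid INTEGER, fname TEXT, age INTEGER)")
conn.executemany("INSERT INTO students VALUES (?, ?, ?)", rows)

# Procedural style: we spell out HOW to find the names, step by step.
procedural = []
for studentid, fname, age in rows:
    if age > 20:
        procedural.append(fname)
procedural.sort()

# Declarative style: we state WHAT we want; the database decides how to get it.
declarative = [
    row[0]
    for row in conn.execute("SELECT fname FROM students WHERE age > 20 ORDER BY fname")
]

print(procedural)
print(declarative)
```

Both lists contain the same names; only the SQL version leaves the search strategy to the database system.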

SQL Statements
• SQL statements play a major role in interacting with the database.
• An SQL statement is the smallest standalone element that expresses an action to be carried out.
• Syntax is how keywords, identifiers, and constants are combined to form a valid statement.
• SQL statements are made up of keywords, identifiers, constants, and clauses.
• SQL keywords are not case sensitive: "select" and "SELECT" are the same.
• By convention, keywords are written in uppercase letters to distinguish them from the rest of the statement.
• A semicolon (;) marks the end of every SQL statement.

SQL Statements (Contd.)
• Keywords: words reserved by the SQL standard and used to construct a statement. Some keywords are mandatory, while others are optional.
• Identifiers: names we give to databases, tables, or columns.
• Constants: literals representing fixed values.
• Clauses: portions of an SQL statement.

SQL Statements (Contd.)
• Example: SELECT fname FROM students WHERE studentid = 5;
• Keywords: SELECT, FROM, and WHERE (the WHERE clause is optional)
• Operator: equal sign (=)
• Identifiers: fname, students, and studentid
– fname and studentid are column names
– students is a table name
• Constant: the numeric constant 5
• Clauses: the SELECT, FROM, and WHERE clauses

SQL Statements (Contd.)
• SQL statements are used to manage the database from a web page or application, where users interact with the database through form fields.
• SQL statements are divided into two types:
– 1. Data Definition Language (DDL)
– 2. Data Manipulation Language (DML)
• Both DDL and DML are made up of SQL statements.
• The best way to distinguish DDL from DML is by the type of SQL statement used and its effect on the database.
• DDL changes the database structure, while DML changes only the data.

Data Definition Language (DDL)
• Data Definition Language is used to specify the database schema.
• It is used to manage database objects such as tables, columns, indexes, and views.
• DDL changes the database structure.
• Database objects such as tables and columns are created, modified, or removed using DDL statements.
• The most important DDL statements are:
– CREATE: creates a database or a table with its columns
– ALTER: modifies the table and column structure
– DROP: removes tables and columns from the database

Data Definition Language (DDL) (Contd.)
• The SQL data-definition language (DDL) allows the specification of information about relations, including:
– The schema for each relation.
– The type of values associated with each attribute.
– The integrity constraints.
– The set of indices to be maintained for each relation.
– Security and authorization information for each relation.
– The physical storage structure of each relation on disk.

How to create and use databases
• To display the existing databases:
– SHOW DATABASES; returns all the existing databases in the database management system.
• To create a new database:
– CREATE DATABASE database-name;
• To select a database to work with:
– USE database-name;
• To delete a database:
– DROP DATABASE database-name;
What is Database Table?
• A database table consists of systematically structured vertical columns (fields) and horizontal rows (records).
• Each column is a property of an item, while each row is an item.
• A cell is the smallest unit, located where a column and a row intersect.
• Data elements (values) are stored in cells.

What is Database Table? (Contd.)
• A database table has a specified number of columns, but it can have any number of rows.
• Table names must be unique within a single database, but the same table name can be used in different databases.

SQL Statements for Database Tables
• To select the database:
– USE existing-database-name;
• To display the existing tables inside the selected database:
– SHOW TABLES; returns all the existing tables inside the selected database.
• To get the structure of a database table:
– DESCRIBE table-name;
• To delete a table:
– DROP TABLE existing-table-name;

SQL Statement to create new table
• The SQL statement to create a new database table is:
    CREATE TABLE table-name
        (A1 D1, A2 D2, ..., An Dn,
        (integrity-constraint1),
        ...,
        (integrity-constraint n));
– table-name is the name of the relation
– each Ai is an attribute name in the schema of relation table-name
– Di is the data type of values in the domain of attribute Ai
• Here is an example SQL statement to create a new table:
    CREATE TABLE instructor (
        ID CHAR(5) NOT NULL PRIMARY KEY,
        name VARCHAR(20),
        dept_name VARCHAR(20),
        salary NUMERIC(8,2));
• This SQL statement creates a new table called instructor with the given columns and their data types.
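To try the instructor example without installing a database server, the statement can be run against an in-memory SQLite database. This is only a sketch under that assumption: SQLite accepts the same CREATE TABLE text, but it has no DESCRIBE statement, so PRAGMA table_info is used here as a rough counterpart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE instructor (
        ID        CHAR(5) NOT NULL PRIMARY KEY,
        name      VARCHAR(20),
        dept_name VARCHAR(20),
        salary    NUMERIC(8,2)
    )
    """
)

# Each row of PRAGMA table_info is (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(instructor)")]
for column_name, data_type in columns:
    print(column_name, data_type)
```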
Integrity Constraints in Create Table

• Types of integrity constraints:
– PRIMARY KEY (A1, ..., An)
– FOREIGN KEY (Am, ..., An) REFERENCES r
– NOT NULL
• SQL prevents any update to the database that violates an integrity constraint.
• Example:
    CREATE TABLE instructor (
        ID CHAR(5),
        name VARCHAR(20) NOT NULL,
        dept_name VARCHAR(20),
        salary NUMERIC(8,2),
        PRIMARY KEY (ID),
        FOREIGN KEY (dept_name) REFERENCES department);
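A minimal sketch of how the database rejects updates that violate these constraints, assuming Python's sqlite3 module and an in-memory database. The one-column department table and all inserted rows are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute("CREATE TABLE department (dept_name VARCHAR(20) PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE instructor (
        ID        CHAR(5),
        name      VARCHAR(20) NOT NULL,
        dept_name VARCHAR(20),
        salary    NUMERIC(8,2),
        PRIMARY KEY (ID),
        FOREIGN KEY (dept_name) REFERENCES department
    )
    """
)
conn.execute("INSERT INTO department VALUES ('Biology')")
conn.execute("INSERT INTO instructor VALUES ('10211', 'Smith', 'Biology', 66000)")

bad_rows = [
    ("10211", "Jones", "Biology", 50000),  # duplicate primary key
    ("10212", None, "Biology", 50000),     # NULL in a NOT NULL column
    ("10213", "Lee", "Astrology", 50000),  # department does not exist
]
rejected = []
for row in bad_rows:
    try:
        conn.execute("INSERT INTO instructor VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError as err:
        rejected.append(str(err))
        print("rejected:", err)

count = conn.execute("SELECT COUNT(*) FROM instructor").fetchall()[0][0]
print("rows stored:", count)
```

Only the first, valid row is stored; each of the three bad rows raises an integrity error.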
And a Few More Relation Definitions

• CREATE TABLE student (
      ID VARCHAR(5),
      name VARCHAR(20) NOT NULL,
      dept_name VARCHAR(20),
      tot_cred NUMERIC(3,0),
      PRIMARY KEY (ID),
      FOREIGN KEY (dept_name) REFERENCES department);
• CREATE TABLE takes (
      ID VARCHAR(5),
      course_id VARCHAR(8),
      sec_id VARCHAR(8),
      semester VARCHAR(6),
      year NUMERIC(4,0),
      grade VARCHAR(2),
      PRIMARY KEY (ID, course_id, sec_id, semester, year),
      FOREIGN KEY (ID) REFERENCES student,
      FOREIGN KEY (course_id, sec_id, semester, year) REFERENCES section);
And more still

• CREATE TABLE course (
      course_id VARCHAR(8),
      title VARCHAR(50),
      dept_name VARCHAR(20),
      credits NUMERIC(2,0),
      PRIMARY KEY (course_id),
      FOREIGN KEY (dept_name) REFERENCES department);
DROP
• The DROP DATABASE statement is used to delete (remove) a database, including all the tables and data rows in it.
• Be careful before deleting a database from the database management system, because the system does not ask for confirmation.
• After DROP DATABASE is executed, the system removes the database with its tables and data rows, and there is no easy way to recover the lost database.

DROP (Contd.)
• DROP is used to remove a database or a table.
• Syntax:
– DROP DATABASE existing-database-name;
– DROP TABLE existing-table-name;
• Examples:
– DROP DATABASE school; where school is an existing database name
– DROP TABLE user; where user is an existing table name

ALTER
• The ALTER TABLE SQL statement is used to:
– Rename a table
– Add a column, change the data type of a column, rename an existing column, or drop a column in an existing database table
• To rename a table:
– ALTER TABLE old-table-name RENAME TO new-table-name;
• To add a new column to an existing table:
– ALTER TABLE table-name ADD new-column-name data-type;
• To modify the data type of a column in an existing table:
– ALTER TABLE table-name MODIFY COLUMN column-name new-data-type;

ALTER (Contd.)
• To drop a column from an existing table:
– ALTER TABLE table-name DROP COLUMN column-name;
• To rename an existing column:
– ALTER TABLE table-name CHANGE old-column-name new-column-name data-type;
• Examples, assuming the table name is user (the rename comes last so that the earlier statements still refer to an existing table):
– ALTER TABLE user ADD firstname VARCHAR(30);
– ALTER TABLE user MODIFY firstname VARCHAR(50);
– ALTER TABLE user CHANGE password pasd CHAR(20);
– ALTER TABLE user DROP COLUMN firstname;
– ALTER TABLE user RENAME TO user1;
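A short sketch of ALTER TABLE in action, assuming Python's sqlite3 module and an in-memory database. Only ADD COLUMN and RENAME TO are shown, because the MODIFY and CHANGE forms above are MySQL syntax that SQLite does not accept. The user table layout is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, password CHAR(20))")

# Add a column first, then rename the table.
conn.execute("ALTER TABLE user ADD COLUMN firstname VARCHAR(30)")
conn.execute("ALTER TABLE user RENAME TO user1")

tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = [r[1] for r in conn.execute("PRAGMA table_info(user1)")]
print(tables)
print(columns)
```

The table keeps its existing rows and columns; only its structure and name change, which is what distinguishes DDL from DML.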

Data Manipulation Language (DML)
• Data Manipulation Language is used to express database queries and updates.
• DML is used to manage the data that resides in our tables and columns.
• DML changes only the data, not the database structure.
• The data inside a table is inserted, updated, or deleted using DML SQL statements.

Data Manipulation Language (DML) (Contd.)
• The most important Data Manipulation Language statements are:
– INSERT INTO: adds data
– UPDATE: modifies data
– DELETE: removes data from the database
• A DML is a language for accessing and updating the data organized by the appropriate data model.
– DML is also known as a query language.

Data Manipulation Language (DML) (Contd.)
• There are basically two types of data-manipulation language:
– Procedural DML: requires a user to specify what data are needed and how to get those data.
– Declarative DML: requires a user to specify what data are needed without specifying how to get those data.
• Declarative DMLs are usually easier to learn and use than procedural DMLs.
• Declarative DMLs are also referred to as non-procedural DMLs.
• The portion of a DML that involves information retrieval is called a query language.

INSERT INTO Statement
• The INSERT INTO statement is used to add new data rows (records) to a database table.
• To add data rows to a table, first select the existing database.
• There are two ways to use the INSERT statement.
• Method 1:
    INSERT INTO table-name VALUES (value1, value2, value3, …);
• Because column names are not specified, a value must be given for every column, in the same order as the columns of the database table.

INSERT INTO Statement (Contd.)
• Method 2:
    INSERT INTO table-name (column1, column2, …, column n)
    VALUES (value1, value2, …, value n);
• The column names are given in the first pair of parentheses and the values in the second.
• In the first method we must provide values for all columns, while in the second method we can add values to selected columns only.

INSERT INTO Statement (Contd.)
• String values must be enclosed in quotes: standard SQL uses single quotes, and MySQL also accepts double quotes by default. Quotes around numeric values are not required. Here are a few examples:
• INSERT INTO teachers VALUES ('John Doe', 1234);
• INSERT INTO students (firstname, lastname, class, age) VALUES ('John', 'Doe', 'First', 26);
• INSERT INTO instructor VALUES ('10211', 'Smith', 'Biology', 66000);
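The two INSERT methods can be compared side by side with a small sketch, assuming Python's sqlite3 module and an in-memory database. The students table mirrors the example above, and the second row ('Jane', 'Roe') is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (firstname VARCHAR(20), lastname VARCHAR(20), "
    "class VARCHAR(10), age INTEGER)"
)

# Method 1: no column list, so every column needs a value, in table order.
conn.execute("INSERT INTO students VALUES ('John', 'Doe', 'First', 26)")

# Method 2: name the columns; the unlisted columns (class, age) are left NULL.
conn.execute("INSERT INTO students (firstname, lastname) VALUES ('Jane', 'Roe')")

rows = conn.execute("SELECT * FROM students ORDER BY firstname").fetchall()
for row in rows:
    print(row)
```

In the output, Python's None stands for the SQL NULL stored in the columns that Method 2 did not fill.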

INSERT INTO Statement (Contd.)
• In the first method, we must provide values for all columns in sequence, while in the second method we can add values to selected columns only.
• In most cases, the second method is more convenient.

UPDATE Statement
• The UPDATE statement is used to update data rows in a database table.
• It can update one or multiple column values in a single SQL statement.
• The WHERE clause specifies which data rows are updated. An UPDATE statement without a WHERE clause updates all the data rows in the table.

UPDATE Statement (Contd.)
• Rows in the target table that do not match the WHERE condition remain unaffected.
• Syntax of the UPDATE statement:
– UPDATE table_name SET column_name = new_value WHERE condition;
• Examples:
– UPDATE tablename SET column1=newvalue; (no WHERE clause, so every row is updated)
– UPDATE tablename SET column1=newvalue1, column2=newvalue2, column3=newvalue3 WHERE columnid=3;
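The effect of the WHERE clause on UPDATE can be checked by counting the affected rows. This is a sketch assuming Python's sqlite3 module; the instructor rows and salary figures are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID CHAR(5) PRIMARY KEY, name VARCHAR(20), salary NUMERIC(8,2))"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [("10101", "Srinivasan", 65000), ("12121", "Wu", 90000), ("15151", "Mozart", 40000)],
)

# With WHERE: only the matching row changes.
cur = conn.execute("UPDATE instructor SET salary = 70000 WHERE ID = '10101'")
updated_with_where = cur.rowcount

# Without WHERE: every row in the table changes.
cur = conn.execute("UPDATE instructor SET salary = salary + 1000")
updated_without_where = cur.rowcount

salaries = dict(conn.execute("SELECT ID, salary FROM instructor"))
print(updated_with_where, updated_without_where, salaries)
```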

Basic SQL Query Structure
• A typical SQL query has the form:
    select A1, A2, ..., An
    from r1, r2, ..., rm
    where P
– Ai represents an attribute
– ri represents a relation
– P is a predicate (condition)
• The result of an SQL query is a relation.
The select Clause
• The select clause lists the attributes desired in the result of a
query
– Main purpose is to retrieve the data from the database
and return it in a tabular structure
• Example: find the names of all instructors:
select name from instructor;
• It defines the columns that will be returned in the final tabular
result set.
• It is executed after the FROM clause and any optional WHERE,
GROUP BY and HAVING clauses if present.

The select Clause (Cont.)
• The general SELECT statement syntax is:
    SELECT expression(s) involving keywords, identifiers and constants
    FROM table name
    [WHERE clause]
    [GROUP BY clause]
    [HAVING clause]
    [ORDER BY clause]
• SELECT columnname1, columnname2 FROM table-name;
– In this statement, the SELECT clause returns only the columns named in the SQL statement.

The select Clause (Cont.)
• An asterisk in the select clause denotes "all attributes":
    select *
    from instructor
• An attribute can be a literal, used together with a from clause:
    select 'A'
    from instructor
– The result is a table with one column and N rows (N being the number of tuples in the instructor table), each row with the value 'A'.
The select Clause (Cont.)

• The select clause can contain arithmetic expressions involving the operations +, -, *, and /, operating on constants or attributes of tuples.
– The query:
    select ID, name, salary/12
    from instructor
  would return a relation that is the same as the instructor relation, except that the value of the attribute salary is divided by 12.
– Can rename "salary/12" using the as clause:
    select ID, name, salary/12 as monthly_salary
    from instructor
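The renamed column can be inspected from a program as well. The sketch below assumes Python's sqlite3 module and a single hypothetical instructor row. It divides by 12.0 rather than 12 because SQLite performs integer division when both operands are integers, which may differ from other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID CHAR(5), name VARCHAR(20), "
    "dept_name VARCHAR(20), salary NUMERIC(8,2))"
)
conn.execute("INSERT INTO instructor VALUES ('10211', 'Smith', 'Biology', 66000)")

cur = conn.execute("SELECT ID, name, salary / 12.0 AS monthly_salary FROM instructor")
column_names = [d[0] for d in cur.description]  # names of the result columns
rows = cur.fetchall()

print(column_names)
print(rows)
```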
SELECT DISTINCT Statement
• The SELECT DISTINCT statement is used to return only the distinct (different) values from a table column.
• A column in a database table may contain duplicate values. When only the distinct values should be displayed, the SELECT DISTINCT statement is very useful.

The select Clause (Cont.)
• SQL allows duplicates in relations as well as in query results.
• To force the elimination of duplicates, insert the keyword distinct after select.
• Find the department names of all instructors, and remove duplicates:
    select distinct dept_name
    from instructor
• The keyword all specifies that duplicates should not be removed:
    select all dept_name
    from instructor
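The difference between all and distinct shows up as soon as a column holds repeated values. A minimal sketch, assuming Python's sqlite3 module and five hypothetical instructor rows spread over three departments:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name VARCHAR(20), dept_name VARCHAR(20))")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?)",
    [
        ("Einstein", "Physics"),
        ("Gold", "Physics"),
        ("Katz", "Comp. Sci."),
        ("Brandt", "Comp. Sci."),
        ("Crick", "Biology"),
    ],
)

# ALL keeps one value per row, duplicates included.
all_depts = [r[0] for r in conn.execute("SELECT ALL dept_name FROM instructor")]

# DISTINCT keeps each value once.
distinct_depts = [
    r[0]
    for r in conn.execute("SELECT DISTINCT dept_name FROM instructor ORDER BY dept_name")
]

print(len(all_depts), distinct_depts)
```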
The where Clause
• The where clause specifies conditions that the result must satisfy.
– Corresponds to the selection predicate of the relational algebra.
• To find all instructors in the Comp. Sci. dept:
    SELECT name
    FROM instructor
    WHERE dept_name = 'Comp. Sci.'
• SQL allows the use of the logical connectives AND, OR, and NOT.
• The operands of the logical connectives can be expressions involving the comparison operators <, <=, >, >=, =, and <>.
• Comparisons can be applied to the results of arithmetic expressions.
• To find all instructors in the Comp. Sci. dept with salary > 7000:
    SELECT name
    FROM instructor
    WHERE dept_name = 'Comp. Sci.' AND salary > 7000
FROM Clause
• The from clause lists the relations involved in the query
– Corresponds to the Cartesian product operation of the relational
algebra.
• The FROM clause produces a tabular structure, also called the "result set", "intermediate set", or intermediate table of the FROM clause.
• The FROM clause is the first clause that the database system looks at when it parses the SQL statement.
– Example: SELECT firstname FROM students;
• The FROM clause can return result sets from one table or from more than one table, using joins, views, and subqueries.
Examples
• Find the names of all instructors who have taught some course and the course_id:
    select name, course_id
    from instructor, teaches
    where instructor.ID = teaches.ID
• Find the names of all instructors in the Art department who have taught some course and the course_id:
    select name, course_id
    from instructor, teaches
    where instructor.ID = teaches.ID
      and instructor.dept_name = 'Art'
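The two queries above can be traced on a tiny data set to see how the from clause first forms a Cartesian product and the where clause then filters it. This sketch assumes Python's sqlite3 module; the instructors and courses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID CHAR(5), name VARCHAR(20), dept_name VARCHAR(20))")
conn.execute("CREATE TABLE teaches (ID CHAR(5), course_id VARCHAR(8))")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [("10101", "Srinivasan", "Comp. Sci."), ("30303", "Rivera", "Art"), ("40404", "Kahlo", "Art")],
)
conn.executemany(
    "INSERT INTO teaches VALUES (?, ?)",
    [("10101", "CS-101"), ("30303", "ART-110")],
)

# Without a where clause, every instructor row is paired with every teaches row.
pairs = conn.execute("SELECT COUNT(*) FROM instructor, teaches").fetchall()[0][0]

taught = conn.execute(
    "SELECT name, course_id FROM instructor, teaches "
    "WHERE instructor.ID = teaches.ID ORDER BY name"
).fetchall()

art = conn.execute(
    "SELECT name, course_id FROM instructor, teaches "
    "WHERE instructor.ID = teaches.ID AND instructor.dept_name = 'Art'"
).fetchall()

print(pairs, taught, art)
```

Kahlo appears in neither result because no teaches row carries that ID.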
WHERE Clause
• The WHERE clause is an optional clause used in SQL statements.
• It acts as a filter on the rows of the result set produced by the FROM clause.
• The WHERE clause depends on a condition that evaluates to true, false, or unknown.
• A condition is made up of keywords, identifiers, and constants that compare values with the values in the data rows. Rows for which the condition is true are kept; rows for which it is false or unknown are filtered out.
• The condition can be simple or complex.
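The "unknown" outcome arises when a compared value is NULL. The sketch below, which assumes Python's sqlite3 module and hypothetical instructor rows, shows that a row with a NULL salary is returned neither by the condition nor by its negation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name VARCHAR(20), dept_name VARCHAR(20), salary NUMERIC(8,2))")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [
        ("Katz", "Comp. Sci.", 75000),
        ("Brandt", "Comp. Sci.", 5000),
        ("Crick", "Biology", 72000),
        ("Singh", "Comp. Sci.", None),  # salary is NULL
    ],
)

matched = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM instructor WHERE dept_name = 'Comp. Sci.' AND salary > 7000"
    )
]
not_matched = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM instructor WHERE dept_name = 'Comp. Sci.' AND NOT (salary > 7000)"
    )
]

# Singh is in neither list: comparing NULL with 7000 yields unknown, not true or false.
print(matched, not_matched)
```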

DELETE Statement
• DROP TABLE removes the definition of the relation as well as the data in it, whereas DELETE only deletes the tuples and keeps the table definition.
• The DELETE statement is used to delete data rows from a database table.
• It can delete one or multiple data rows from a database table.

DELETE Statement (Contd.)
• The WHERE clause specifies which data rows are deleted. A DELETE statement without a WHERE clause deletes all the data rows in the table.
• Examples:
– DELETE FROM table-name; removes all tuples from the table
– DELETE FROM table-name WHERE column1=value; removes only the matching tuples
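The contrast between DELETE and DROP TABLE can be observed directly. This is a sketch assuming Python's sqlite3 module; the students rows are hypothetical, and the table list is read from SQLite's own sqlite_master catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (studentid INTEGER PRIMARY KEY, fname VARCHAR(20))")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)", [(1, "Abel"), (2, "Sara"), (3, "Lidya")]
)

# DELETE with WHERE removes only the matching row.
deleted_one = conn.execute("DELETE FROM students WHERE studentid = 2").rowcount

# DELETE without WHERE removes every remaining row, but the table itself stays.
conn.execute("DELETE FROM students")
remaining = conn.execute("SELECT COUNT(*) FROM students").fetchall()[0][0]

# DROP TABLE removes the table definition as well.
conn.execute("DROP TABLE students")
tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(deleted_one, remaining, tables)
```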

SQL Data Types

• Every column in a database table is defined with a data type, depending on the values it is going to store.
• An SQL data type defines what kind of value a column can store.
• There are three main categories of SQL data types:
– Numeric
– Character
– Temporal
SQL Data Types

• Numeric data types store only numeric values: integers, floating-point numbers, and fixed-point numbers.
– The INT data type stores integers ranging from -2,147,483,648 to 2,147,483,647. An optional UNSIGNED modifier in the declaration changes the range to 0 to 4,294,967,295.
– A FLOAT represents decimal (floating-point) numbers, used when a fractional value rather than a whole number is required.
– E.g. rainfall FLOAT(4, 2); in MySQL this allows four digits in total, two of them after the decimal point.
• Character data types can store letters, symbols, and digits. Example: full name, description, address, etc.
• Temporal data types can store a date, a time, or a date and time together.
SQL Data Types

• CHAR(n): fixed-length character string, with user-specified length n.
– The CHAR (or CHARACTER) data type stores fixed-width character columns.
– The column width must be specified for both CHAR and VARCHAR data types.
– Example: CHAR(20), CHAR(50)
– If the inserted value is shorter than the defined column width, it is positioned to the left and padded with spaces on the right until its length equals the defined column width.
– Example: in firstname CHAR(20), the string 'John' is padded with 16 spaces.
• VARCHAR(n): variable-length character string, with user-specified maximum length n.
– The VARCHAR (or CHARACTER VARYING) data type stores variable-width character columns.
– In VARCHAR, the defined width is the maximum width of a value allowed in the column.
– The stored length is the actual length of the inserted value; no padding is added.
– Example: a VARCHAR(250) column stores only 4 characters for the value 'John'.
Temporal Data Types
• The temporal data types consist of date, time, and timestamp (which has both date and time).
• A DATE value stores a calendar date in the standard Gregorian calendar.
• The most common date format is YYYY-MM-DD, where YYYY is the four-digit year, MM the two-digit month, and DD the two-digit day.
• The TIMESTAMP data type stores both date and time components.
• Example: Date_Of_Birth (DATE): 1994-02-13 and last_login (TIMESTAMP): YYYY-MM-DD HH:MM:SS

SQL functions
• These functions operate on the multiset of
values of a column of a relation, and return a
value
AVG: average value
MIN: minimum value
MAX: maximum value
SUM: sum of values
COUNT: number of values

SQL functions Cont’d…
• AVG : This SQL function returns the average value of a column
that contains numeric values
• SUM: This SQL function returns the sum of a column that
contains numeric values.
• MIN: This SQL function returns the smallest, or minimum value
found in a column that contains numeric values
• MAX: This SQL function returns the largest, or maximum value
found in a column that contains numeric values.
• COUNT: This SQL function returns the number of rows in a table, or the number of rows that match a search criterion.
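All five functions can be applied in one query. A minimal sketch, assuming Python's sqlite3 module and three hypothetical salary values chosen so the results are easy to check by hand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name VARCHAR(20), salary NUMERIC(8,2))")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?)",
    [("Mozart", 40000), ("Crick", 60000), ("Gold", 80000)],
)

# Each aggregate collapses the whole salary column into a single value.
avg_salary, min_salary, max_salary, sum_salary, row_count = conn.execute(
    "SELECT AVG(salary), MIN(salary), MAX(salary), SUM(salary), COUNT(*) FROM instructor"
).fetchall()[0]

print(avg_salary, min_salary, max_salary, sum_salary, row_count)
```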

Aggregate Functions – Group By

• Find the average salary of instructors in each department:
    select dept_name, avg (salary) as avg_salary
    from instructor
    group by dept_name;
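Running the query on a few rows makes the grouping visible: one output row per department. The sketch assumes Python's sqlite3 module; the instructors and salaries are hypothetical, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name VARCHAR(20), dept_name VARCHAR(20), salary NUMERIC(8,2))")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [
        ("Einstein", "Physics", 95000),
        ("Gold", "Physics", 87000),
        ("Katz", "Comp. Sci.", 75000),
        ("Crick", "Biology", 72000),
    ],
)

averages = conn.execute(
    "SELECT dept_name, AVG(salary) AS avg_salary "
    "FROM instructor GROUP BY dept_name ORDER BY dept_name"
).fetchall()

for dept_name, avg_salary in averages:
    print(dept_name, avg_salary)
```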
Aggregation (Cont.)

• Attributes in the select clause outside of aggregate functions must appear in the group by list:
    /* erroneous query: ID is neither aggregated nor in the group by list */
    select dept_name, ID, avg (salary)
    from instructor
    group by dept_name;
SQL Constraints
• What is a constraint?
– A constraint is a property assigned to a column or a set of columns in a table that prevents certain types of inconsistent data values from being placed in the column(s).
– Constraints are used to enforce data integrity.
– There are various categories of integrity constraints:
• Entity integrity: ensures that there are no duplicate rows in a table.
• Domain integrity: enforces valid entries for a given column by restricting the type, the format, or the range of possible values.

SQL Constraints (Contd.)
• Referential integrity: ensures that rows that are referenced by other records cannot be deleted.
• User-defined integrity: enforces specific business rules that do not fall into the entity, domain, or referential integrity categories.
• SQL constraints define the specific rules that the column data in a database table must follow.
• If a constraint rule is violated while inserting, updating, or deleting data rows, the system displays an error message and the action is terminated.
• SQL constraints are defined when a new database table is created. We can also alter the table and add new constraints. Standard SQL supports six constraints: NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, CHECK, and DEFAULT.

PRIMARY KEY Constraint
• The primary key constraint prevents duplicate values from being stored in a given column.
• The primary key column cannot contain NULL values.
• The primary key can be defined when a new database table is created, or added later using the ALTER statement.

HAVING Clause
• It is designed for use with the GROUP BY
clause to restrict the groups that appear in the
final result table.
• WHERE clause filters individual rows going
into the final result table
• HAVING clause filters groups going into the
final result table.
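The division of labour between WHERE and HAVING can be sketched on a small example, assuming Python's sqlite3 module. The rows and the two thresholds (70000 and 74000) are hypothetical, picked so that each clause removes something different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name VARCHAR(20), dept_name VARCHAR(20), salary NUMERIC(8,2))")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [
        ("Einstein", "Physics", 95000),
        ("Gold", "Physics", 87000),
        ("Katz", "Comp. Sci.", 75000),
        ("Brandt", "Comp. Sci.", 60000),
        ("Crick", "Biology", 72000),
    ],
)

result = conn.execute(
    "SELECT dept_name, AVG(salary) AS avg_salary "
    "FROM instructor "
    "WHERE salary > 70000 "          # filters individual rows (drops Brandt)
    "GROUP BY dept_name "
    "HAVING AVG(salary) > 74000 "    # filters whole groups (drops Biology)
    "ORDER BY dept_name"
).fetchall()

print(result)
```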

What is SQL Injection?
• SQL injection refers to the act of someone inserting an SQL statement to be run on your database without your knowledge.
• SQL injection is a method by which a malicious user injects SQL commands through form fields on a web page or application in order to display other information or destroy the database.
• Injection usually occurs when you ask a user for input, such as a name or email address, and instead they give you an SQL statement that you unknowingly run on your database.

What is SQL Injection? (Contd.)
• SQL injection is a code injection technique that might destroy your database.
• SQL injection is one of the most common web hacking techniques.
• By using SQL injection, a hacker may get access to other users' passwords and other information.
• SQL injection is the placement of malicious code in SQL statements via web page input.
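One widely used defence is the parameterized query, in which user input is passed to the database as data and never becomes part of the SQL text. The sketch below assumes Python's sqlite3 module; the users table, its contents, and the malicious input string are hypothetical, and a real application would also avoid storing passwords in plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [("alice", "s3cret"), ("bob", "hunter2")]
)

# What an attacker might type into a "username" form field.
malicious_input = "nobody' OR '1'='1"

# Vulnerable: the input is pasted into the SQL text, so the database parses it as SQL.
unsafe_sql = (
    "SELECT username, password FROM users WHERE username = '" + malicious_input + "'"
)
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: the ? placeholder sends the input as a plain value to compare against.
safe = conn.execute(
    "SELECT username, password FROM users WHERE username = ?", (malicious_input,)
).fetchall()

print(len(leaked), len(safe))
```

The concatenated query returns every user because the injected OR '1'='1' is always true, while the parameterized query returns nothing because no user has that literal name.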

Fundamentals of Database
Systems

End of Chapter Three !!!

Have a nice day!!

Never Stop Learning!
