
CS203

Database System
Week 1

Dr. Syed Asif Raza, Assistant Professor


Department of Computer Science
FAST-NUCES, Karachi Campus
Asif.raza@nu.edu.pk

Welcome!
About me
 PIMSAT, Karachi
 BS Information Technology ‘06

 SZABIST, Karachi
 MS Computer Science ‘12

 University of Science and Technology (UST), South Korea


 PhD Computer Science ’18

 Research Interests
 Software-Defined Networking (SDN), Network Function Virtualization (NFV),
Cloud Computing, Virtualization
 Network Security, Blockchain Technology, HPC/HTC

 Industry Experiences
 Korea Institute of Science and Technology Information (KISTI), S. Korea
 Fermi National Accelerator Lab (FNAL), USA
 NESCOM HQ, Islamabad,
 National Telecommunication Corporation (NTC) HQ, Islamabad
 NADRA RHQ, Pakistan
Overview
 This course provides students with the essential concepts, principles, and
techniques of modern database systems from a user perspective.

 Focuses on the functionalities that are offered by database systems and
not on the methods to implement them.

 The course teaches students the ability to develop a solution for a real-
world data management problem that requires the application of the
theories and practices developed in class.

 From a theoretical point of view, this course covers the essential
principles for the design, analysis, and use of computerized database
systems.

 The design and techniques of conceptual modeling, database modeling,
database system architecture, and user/program interfaces are presented
in a unified way.
Objectives and goals

 To understand the underlying concepts of databases and
database management systems (DBMS)
 To introduce the concept of relational data models
 Analysis and design of database application or
information system
 Experience with SQL
 Implementation of database
Reference Books
 Textbook (or Laboratory Manual for Laboratory Courses):
 Ramez Elmasri & Shamkant B. Navathe, Database Systems,
Models, Languages, Design and Application Programming, 7th Edition,
2016.

 Reference Books
 Thomas Connolly, Carolyn Begg, Database Systems: A practical
approach to design, implementation and Management, 6th Edition,
2015.
 C.J. Date, An Introduction to Database Systems, 8th Edition, 2004
Grading/Assignments/Project
 Grade breakdown
 Term exams (1 & 2) 30%
 Assignments/Quiz 10%
 Project 10%
 Final exam 45%
 Class participation 5%
 All assignments must be submitted in LaTeX format
 Final Project & Term paper
 2~3 members per group
 Plagiarism will be marked as zero.
 Passing marks = 50
 Class attendance should be >= 80%.
 No student will be admitted more than 10 minutes after the
start of class.
Contact and Course Logistics
 Instructor: Dr. Syed Asif Raza
 Email: asif.raza@nu.edu.pk
 Contact Hours:
 2:00-3:30pm
 Monday & Tue.
 Office hours
 Office: Computer Science building, Basement 2, room #9

 Course Website
 https://slate.nu.edu.pk
 Check often for announcements
 Assignments/Projects
 Discussion/Help
Database Applications: Examples

 Supermarkets?
 Credit cards?
 Travel agents?
 Library?
 Insurance?
 University?
 Etc.
Manual filing systems
 Work well
 while the number of items to be stored is small
 even for a large number of items, when only storage and
retrieval are needed
File-based Systems

 Early attempt to computerize manual filing system

 Collection of application programs that perform


services for the end users (e.g. reports)

 Each program defines and manages its own data


File-based Systems
 Consider the DreamHome example of a file-based system:
 Sales department: responsible for selling and renting
properties
 Contract Department: responsible for handling lease
agreements
Sales department
 PropertyForRent
 propertyNo, street, city, type, rooms, ownerNo

 Client
 clientNo, fName, lName, telNo, prefType, maxRent

 PrivateOwner
 ownerNo, fName, lName, address, telNo
Lease Department
 Lease
 leaseNo, propertyNo, clientNo, rent, paymentMethod, deposit,
paid, rentStart, rentFinish, duration
 PropertyForRent
 propertyNo, street, city, postcode, type, rooms, rent
 Client
 clientNo, fName, lName, telNo, prefType, telNo
Limitations of file-based systems
 Separation and isolation of data
 Each program maintains its own data
 Users of one program may be unaware of potentially useful data held by
other programs
 For example: listing all houses that match the requirements of the
clients means combining data held in both departments' files
 Duplication of data
 Decentralized approach taken by each department

 Same data in different programs

 Waste of space

 Data dependence
 File structure is defined in program code

 Incompatible file formats


 Programs written in different languages

 New requirements need new programs
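To make the separation-and-isolation limitation concrete: once property and client data sit in one database, the "houses that match client requirements" report becomes a single query rather than a new program that reconciles two departments' files. The following is a minimal sketch using Python's built-in `sqlite3` module. The table layout is trimmed from the DreamHome slides, and the sample rows and the helper name `matching_properties` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PropertyForRent (
    propertyNo TEXT PRIMARY KEY,
    city TEXT, type TEXT, rooms INTEGER, rent REAL
);
CREATE TABLE Client (
    clientNo TEXT PRIMARY KEY,
    fName TEXT, lName TEXT, prefType TEXT, maxRent REAL
);
""")

# Hypothetical sample data, not taken from the slides.
conn.executemany("INSERT INTO PropertyForRent VALUES (?, ?, ?, ?, ?)", [
    ("P1", "Karachi", "House", 5, 600.0),
    ("P2", "Karachi", "Flat", 3, 350.0),
    ("P3", "Lahore", "House", 4, 450.0),
])
conn.executemany("INSERT INTO Client VALUES (?, ?, ?, ?, ?)", [
    ("C1", "Ali", "Khan", "House", 500.0),
    ("C2", "Sara", "Ahmed", "Flat", 400.0),
])

def matching_properties(conn):
    """Return (clientNo, propertyNo) pairs where property type and rent fit the client."""
    return conn.execute("""
        SELECT c.clientNo, p.propertyNo
        FROM Client c
        JOIN PropertyForRent p
          ON p.type = c.prefType AND p.rent <= c.maxRent
        ORDER BY c.clientNo, p.propertyNo
    """).fetchall()

matches = matching_properties(conn)
print(matches)
```

In a file-based system the same report would need a program that understands both departments' file formats; here both tables are described once, in one place.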


Database approach!

 Definition of data was embedded in application programs
 No separate or independent storage of data
 No control over access to and manipulation of data beyond
that imposed by the application

Result!
Database Management System (DBMS)
What Is a Database System?
Basic Definitions
 Database:
 A collection of related data.

 Data:
 Known facts that can be recorded and have an implicit meaning.

 Mini-world:
 Some part of the real world about which data is stored in a
database. For example, student grades and transcripts at a
university.
 Database Management System (DBMS):
 A software package/system to facilitate the creation and
maintenance of a computerized database.
 Database System:
 The DBMS software together with the data itself. Sometimes, the
applications are also included.
Database: What
 Database
 is a collection of related data and its metadata organized in a structured
format
 for optimized information management

 Database Management System (DBMS)
 is software that enables easy creation, access, and modification of
databases
 for efficient and effective database management

 Database System
 is an integrated system of hardware, software, people, procedures, and data
 that define and regulate the collection, storage, management, and use of
data within a database environment
Database: Why
 Purpose of Database
 Optimizes data management
 Transforms data into information
 Importance of Database Design
 Defines the database’s expected use
 different approach needed for different types of databases
 Avoid data redundancy & ensure data integrity
 data is accurate and verifiable
 Poorly designed database generates errors
 leads to bad decisions
 can lead to failure of organization

 Functions of DBMS/Database System


 Stores data and related data entry forms, report definitions, etc.
 Hides the complexities of the relational database model from the user
 facilitates the construction/definition of data elements and their relationships
 enables data transformation and presentation
 Enforces data integrity
 Implements data security management
 access, privacy, backup & restoration
Database: How
 Planning & Analysis
 Assess
 Goal of the organization
 Database environment
 existing hardware, software, raw data, data processing procedures
 Identify
 Database needs
 what database can do to further the goal of the organization
 User needs and characteristics
 who the users are, what they want to do, how they envision doing it
 Database system requirements
 what the database system should do to satisfy the database and user needs
 Design
 From conceptual design to a detailed system specification

 Implementation
 Create the database

 Maintenance
 Troubleshoot, update, streamline the database
Business Rules
 What
 Brief, precise, and unambiguous descriptions of operations in an
organization
 based on policies, procedures, or principles within a specific organization
 help to create and enforce actions within that organization’s environment
 apply to any organization that stores and uses data to generate information
 Why
 Enhance understanding & facilitate communication
 Standardize company’s view of data
 Constitute a communications tool between users and designers
 Allow the designer to understand business processes as well as the nature, role, and
scope of data
 Promote creation of an accurate data model

 How (sources)
 Interviews
 Company managers
 Policy makers
 Department managers
 End users
 Written documentation
 Procedures, Standards, Operations manuals
 Observation
 Business operations
Database: User-centered
 Perspective
 The user is always right. If there is a problem with the use of the system,
the system is the problem, not the user.

 Compliance
 The user has the right to a system that performs exactly as promised.

 Instruction
 The user has the right to easy-to-use instructions (user guides, online or
contextual help, error messages) for understanding and utilizing a system to
achieve desired goals and recover efficiently and gracefully from problem
situations.

 Usability
 The user should be the master of software and hardware technology, not
vice-versa. Products should be natural and intuitive to use.
Database: Data Models
 Importance
 Abstraction of complex real-world data structures in relatively simple
(graphical) representations
 Facilitate interaction among the designer, the applications
programmer, and the end user

 Basic Building Blocks


 Entity
 thing about which data are to be collected and stored
 Attribute
 a characteristic of an entity
 Relationship
 describes an association among entities
 Constraint
 restrictions placed on the data
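The four building blocks map fairly directly onto relational DDL. The sketch below is one possible mapping, assuming SQLite through Python's `sqlite3` module: entities become tables, attributes become columns, a relationship becomes a foreign key, and constraints become `CHECK`, `NOT NULL`, and `UNIQUE` clauses. The `Department` and `Student` tables, their columns, and the helper `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Entity: Department; attributes: deptNo, name
CREATE TABLE Department (
    deptNo INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE
);
-- Entity: Student; the majorDept column records the relationship
-- "a student majors in a department"
CREATE TABLE Student (
    studentNo INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    gpa       REAL CHECK (gpa BETWEEN 0.0 AND 4.0),   -- constraint on the data
    majorDept INTEGER REFERENCES Department(deptNo)
);
""")

conn.execute("INSERT INTO Department VALUES (1, 'Computer Science')")
conn.execute("INSERT INTO Student VALUES (100, 'Ayesha', 3.5, 1)")

def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# gpa outside the allowed range: rejected by the CHECK constraint.
ok_gpa = try_insert("INSERT INTO Student VALUES (101, 'Bilal', 4.7, 1)")
# Department 99 does not exist: rejected by the foreign key.
ok_dept = try_insert("INSERT INTO Student VALUES (102, 'Hina', 3.0, 99)")
print(ok_gpa, ok_dept)
```

Both rejected rows never reach the table, so the application does not have to repeat these checks itself.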
Evolution of Data Models

 Timeline
[Figure: timeline spanning the 1960s, 1970s, 1980s, 1990s, and 2000+, showing the file-based, hierarchical, network, relational, entity-relationship, object-oriented, and web-based models]
Simplified database system environment
Database Management System
- manages interaction between end users and database

Database Systems: Design, Implementation, & Management: Rob & Coronel


Database System Environment

 Hardware
 Software
- OS
- DBMS
- Applications
 People
 Procedures
 Data



Typical DBMS Functionality
 Define a particular database in terms of its data types,
structures, and constraints
 Construct or Load the initial database contents on a
secondary storage medium
 Manipulating the database:
 Retrieval: Querying, generating reports
 Modification: Insertions, deletions and updates to its content
 Accessing the database through Web applications
 Processing and Sharing by a set of concurrent users and
application programs – yet, keeping all data valid and
consistent
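The define, construct, and manipulate steps listed above can be sketched in a few statements. This is an illustrative example only, assuming SQLite through Python's `sqlite3` module. The `Course` table and every row except the course code CS203 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: data types, structure, and constraints.
conn.execute("""
    CREATE TABLE Course (
        courseNo TEXT PRIMARY KEY,
        title    TEXT NOT NULL,
        credits  INTEGER CHECK (credits > 0)
    )
""")

# Construct / load: the initial database contents.
conn.executemany("INSERT INTO Course VALUES (?, ?, ?)", [
    ("CS203", "Database Systems", 3),
    ("CS101", "Programming Fundamentals", 4),  # hypothetical course
    ("CS999", "Placeholder", 1),               # hypothetical course
])

# Manipulate (modification): update one row, delete another.
conn.execute("UPDATE Course SET credits = 3 WHERE courseNo = 'CS101'")
conn.execute("DELETE FROM Course WHERE courseNo = 'CS999'")
conn.commit()

# Manipulate (retrieval): query the current contents.
rows = conn.execute(
    "SELECT courseNo, title, credits FROM Course ORDER BY courseNo"
).fetchall()
print(rows)
```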
Typical DBMS Functionality
 Other features:
 Protection or Security measures to prevent unauthorized
access
 “Active” processing to take internal actions on data
 Presentation and Visualization of data
 Maintaining the database and associated programs over the
lifetime of the database application
 Called database, software, and system maintenance
Example of a Database
(with a Conceptual Data Model)
 Mini-world for the example:
 Part of a UNIVERSITY environment.
 Some mini-world entities:
 STUDENTs
 COURSEs
 SECTIONs (of COURSEs)
 (academic) DEPARTMENTs
 INSTRUCTORs
Example of a Database
(with a Conceptual Data Model)
 Some mini-world relationships:
 SECTIONs are of specific COURSEs

 STUDENTs take SECTIONs

 COURSEs have prerequisite COURSEs

 INSTRUCTORs teach SECTIONs

 COURSEs are offered by DEPARTMENTs

 STUDENTs major in DEPARTMENTs

 Note: The above entities and relationships are typically
expressed in a conceptual data model, such as the
ENTITY-RELATIONSHIP data model (see later)
Example of a simple database
Main Characteristics of the Database
Approach
 Self-describing nature of a database system:
 A DBMS catalog stores the description of a particular
database (e.g. data structures, types, and constraints)
 The description is called meta-data.
 This allows the DBMS software to work with different database
applications.
 Insulation between programs and data:
 Called program-data independence.
 Allows changing data structures and storage organization
without having to change the DBMS access programs.
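As a small illustration of the self-describing property, most relational systems expose their catalog through ordinary queries. The sketch below assumes SQLite, whose catalog table is `sqlite_master` and whose `PRAGMA table_info` lists a table's columns; other DBMSs use different catalog names. The `Student` table and the helper `describe` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (studentNo INTEGER PRIMARY KEY, name TEXT NOT NULL, major TEXT)"
)

# The catalog is itself queryable: it stores the description (meta-data)
# of every table in the database.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

def describe(conn, table):
    """Return (column name, declared type) pairs read from the catalog.

    The table name is interpolated directly, so pass trusted names only.
    """
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]

columns = describe(conn, "Student")
print(catalog)
print(columns)
```

Because `describe` learns the structure from the catalog at run time, the same function works for any table, which is the sense in which the DBMS software can serve different database applications.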
Example of a simplified database catalog
Main Characteristics of the Database
Approach (continued)
 Data Abstraction:
 A data model is used to hide storage details and present the
users with a conceptual view of the database.
 Programs refer to the data model constructs rather than data
storage details
 Support of multiple views of the data:
 Each user may see a different view of the database, which
describes only the data of interest to that user.
Main Characteristics of the Database
Approach (continued)
 Sharing of data and multi-user transaction
processing:
 Allowing a set of concurrent users to retrieve from and to
update the database.
 Concurrency control within the DBMS guarantees that each
transaction is correctly executed or aborted
 Recovery subsystem ensures each completed transaction has
its effect permanently recorded in the database
 OLTP (Online Transaction Processing) is a major part of
database applications. This allows hundreds of concurrent
transactions to execute per second.
Describing Data: Data Models
 A data model is a collection of concepts for
describing data.

 A schema is a description of a particular collection of


data, using a given data model.

 The relational model of data is the most widely used


model today.
 Main concept: relation, basically a table with rows and
columns.
 Every relation has a schema, which describes the
columns, or fields.
Levels of Abstraction
 Views describe how users see the data.
 Conceptual schema defines logical structure.
 Physical schema describes the files and indexes used.
 (sometimes called the ANSI/SPARC model)
[Figure: Users at the top, then View 1, View 2, View 3, then Conceptual Schema, then Physical Schema, then DB at the bottom]
Example: University Database
 Conceptual schema:
 Students(sid: string, name: string, login: string, age: integer, gpa: real)
 Courses(cid: string, cname: string, credits: integer)
 Enrolled(sid: string, cid: string, grade: string)
 External Schema (View):
 Course_info(cid: string, enrollment: integer)
 Physical schema:
 Relations stored as unordered files.
 Index on first column of Students.
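The three levels of this example can be written down as DDL. The sketch below assumes SQLite through Python's `sqlite3` module and maps the slide's `string`, `integer`, and `real` types to `TEXT`, `INTEGER`, and `REAL`. The definition of the `Course_info` view (a per-course enrollment count) is an assumption, since the slide gives only its columns, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema: the logical structure.
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- External schema: a view tailored to one group of users.
CREATE VIEW Course_info AS
    SELECT c.cid AS cid, COUNT(e.sid) AS enrollment
    FROM Courses c LEFT JOIN Enrolled e ON e.cid = c.cid
    GROUP BY c.cid;

-- Physical schema: an index on the first column of Students.
CREATE INDEX idx_students_sid ON Students(sid);

-- Hypothetical sample data.
INSERT INTO Students VALUES ('s1', 'Ayesha', 'ayesha', 20, 3.6);
INSERT INTO Students VALUES ('s2', 'Bilal', 'bilal', 21, 2.9);
INSERT INTO Courses  VALUES ('CS101', 'Programming Fundamentals', 4);
INSERT INTO Courses  VALUES ('CS203', 'Database Systems', 3);
INSERT INTO Enrolled VALUES ('s1', 'CS203', 'A');
INSERT INTO Enrolled VALUES ('s2', 'CS203', 'B');
INSERT INTO Enrolled VALUES ('s1', 'CS101', 'A');
""")

# A user of the view sees only course ids and enrollment counts.
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)
```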
Data Independence
 Applications insulated from how data is structured and stored.
 Logical data independence: protection from changes in logical structure of data.
 Physical data independence: protection from changes in physical structure of data.
 Q: Why are these particularly important for DBMS?
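Both kinds of independence can be observed in a small experiment. In the sketch below (SQLite through Python's `sqlite3` module, with hypothetical tables and a hypothetical `Honor_roll` view), the application only ever runs `APP_QUERY`. Adding an index is a physical change, and splitting the base table while redefining the view is a logical change. In both cases the application query and its result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL);
INSERT INTO Students VALUES ('s1', 'Ayesha', 3.6);
INSERT INTO Students VALUES ('s2', 'Bilal', 2.9);
CREATE VIEW Honor_roll AS
    SELECT sid AS sid, name AS name FROM Students WHERE gpa >= 3.5;
""")

# The only statement the application knows about.
APP_QUERY = "SELECT sid, name FROM Honor_roll ORDER BY sid"
before = conn.execute(APP_QUERY).fetchall()

# Physical change: add an index. Storage and access paths change,
# the application query does not.
conn.execute("CREATE INDEX idx_students_gpa ON Students(gpa)")
after_physical = conn.execute(APP_QUERY).fetchall()

# Logical change: split Students into two tables, then redefine the
# view so that it still presents the same columns.
conn.executescript("""
DROP VIEW Honor_roll;
CREATE TABLE Student_names (sid TEXT, name TEXT);
CREATE TABLE Student_gpas  (sid TEXT, gpa REAL);
INSERT INTO Student_names SELECT sid, name FROM Students;
INSERT INTO Student_gpas  SELECT sid, gpa  FROM Students;
DROP TABLE Students;
CREATE VIEW Honor_roll AS
    SELECT n.sid AS sid, n.name AS name
    FROM Student_names n JOIN Student_gpas g ON g.sid = n.sid
    WHERE g.gpa >= 3.5;
""")
after_logical = conn.execute(APP_QUERY).fetchall()

print(before, after_physical, after_logical)
```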
Queries, Query Plans, and Operators
 Three example queries over Emp (Employees), Proj (Projects), and Asgn (Assignments):

SELECT eid, ename, title
FROM Emp E
WHERE E.sal > $50K

SELECT E.loc, AVG(E.sal)
FROM Emp E
GROUP BY E.loc
HAVING Count(*) > 5

SELECT COUNT(DISTINCT E.eid)
FROM Emp E, Proj P, Asgn A
WHERE E.eid = A.eid
AND P.pid = A.pid
AND E.loc <> P.loc

[Figure: a query plan for each query, built from Select, Group(agg), Having, Join, and Count distinct operators over Emp, Proj, and Asgn]

 System handles query plan generation & optimization;
ensures correct execution.

• Issues: view reconciliation, operator ordering, physical operator
choice, memory management, access path (index) use, …
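The three queries from this slide can be tried against a toy database, and the system can be asked which plan it chose. This is a sketch under several assumptions: SQLite through Python's `sqlite3` module, invented column lists and rows for `Emp`, `Proj`, and `Asgn`, a numeric salary threshold in place of `$50K`, and a `HAVING` threshold lowered from 5 to 1 so that the tiny sample produces output. `EXPLAIN QUERY PLAN` is SQLite-specific, and its output format differs between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Emp  (eid INTEGER PRIMARY KEY, ename TEXT, title TEXT, sal REAL, loc TEXT);
CREATE TABLE Proj (pid INTEGER PRIMARY KEY, pname TEXT, loc TEXT);
CREATE TABLE Asgn (eid INTEGER, pid INTEGER);

-- Hypothetical sample data.
INSERT INTO Emp VALUES (1, 'Ali',  'Engineer', 60000, 'Karachi');
INSERT INTO Emp VALUES (2, 'Sara', 'Analyst',  45000, 'Lahore');
INSERT INTO Emp VALUES (3, 'Omar', 'Manager',  80000, 'Karachi');
INSERT INTO Proj VALUES (10, 'Billing', 'Karachi');
INSERT INTO Proj VALUES (20, 'Portal',  'Lahore');
INSERT INTO Asgn VALUES (1, 10);
INSERT INTO Asgn VALUES (1, 20);
INSERT INTO Asgn VALUES (2, 20);
INSERT INTO Asgn VALUES (3, 20);
""")

# Query 1: selection.
high_paid = conn.execute(
    "SELECT eid, ename, title FROM Emp E WHERE E.sal > 50000 ORDER BY eid"
).fetchall()

# Query 2: grouping with an aggregate and a HAVING filter
# (threshold lowered from the slide's 5 to fit the sample data).
avg_by_loc = conn.execute(
    "SELECT E.loc, AVG(E.sal) FROM Emp E "
    "GROUP BY E.loc HAVING COUNT(*) > 1 ORDER BY E.loc"
).fetchall()

# Query 3: three-way join, counting employees assigned to a project
# located somewhere other than their own location.
JOIN_QUERY = """
    SELECT COUNT(DISTINCT E.eid)
    FROM Emp E, Proj P, Asgn A
    WHERE E.eid = A.eid AND P.pid = A.pid AND E.loc <> P.loc
"""
remote_workers = conn.execute(JOIN_QUERY).fetchone()[0]

# Ask the optimizer which plan it picked for the join query.
plan = conn.execute("EXPLAIN QUERY PLAN " + JOIN_QUERY).fetchall()

print(high_paid)
print(avg_by_loc)
print(remote_workers)
for step in plan:
    print(step)
```

The queries say only what result is wanted; the join order and the use of the primary-key indexes are decided by the system, which is the point of the slide.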
Concurrency Control
 Concurrent execution of user programs: key to good DBMS
performance.
 Disk accesses frequent, pretty slow

 Keep the CPU working on several programs concurrently.

 Interleaving actions of different programs: trouble!


 e.g., account-transfer & print statement at same time

 DBMS ensures such problems don’t arise.


 Users/programmers can pretend they are using a single-
user system. (called “Isolation”)
 Thank goodness! Don’t have to program “very, very carefully”.
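The account-transfer and print-statement example can be replayed deterministically to see what goes wrong without isolation. The sketch below is a plain Python simulation, not a DBMS: the function `run`, the step names, and the balances are all invented for illustration. It executes the steps of two programs in a given order and returns the total that the statement program would print.

```python
def run(schedule):
    """Execute (program, step) pairs in order against two fresh accounts.

    Returns the total printed by the statement program and the final balances.
    """
    accounts = {"A": 500, "B": 500}
    seen = {}             # values the statement program has read so far
    printed_total = None
    for program, step in schedule:
        if (program, step) == ("transfer", "debit"):
            accounts["A"] -= 100
        elif (program, step) == ("transfer", "credit"):
            accounts["B"] += 100
        elif (program, step) == ("statement", "read_A"):
            seen["A"] = accounts["A"]
        elif (program, step) == ("statement", "read_B"):
            seen["B"] = accounts["B"]
            printed_total = seen["A"] + seen["B"]
    return printed_total, accounts

# Serial execution: the transfer finishes before the statement starts.
serial = [
    ("transfer", "debit"), ("transfer", "credit"),
    ("statement", "read_A"), ("statement", "read_B"),
]

# Interleaved execution: the statement runs in the middle of the transfer.
interleaved = [
    ("transfer", "debit"),
    ("statement", "read_A"), ("statement", "read_B"),
    ("transfer", "credit"),
]

serial_total, serial_final = run(serial)
interleaved_total, interleaved_final = run(interleaved)
print(serial_total, interleaved_total)
```

Both schedules end with the same balances, but the interleaved statement reports 100 less than the money that actually exists. A DBMS that provides isolation makes the statement behave as if it ran entirely before or entirely after the transfer.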
Transactions: ACID Properties
 Key concept is a transaction: a sequence of database actions
(reads/writes).

 DBMS ensures atomicity (all-or-nothing property) even if system
crashes in the middle of a transaction (Xact).
 Each transaction, executed completely, must take the DB between
consistent states or must not run at all.
 DBMS ensures that concurrent transactions appear to run in isolation.
 DBMS ensures durability of committed Xacts even if system crashes.

 Note: can specify simple integrity constraints on the data. The DBMS
enforces these.
 Beyond this, the DBMS does not understand the semantics of the
data.
 Ensuring that a single transaction (run alone) preserves
consistency is largely the user’s responsibility!
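Atomicity and a simple integrity constraint can be seen working together in a short experiment. The sketch assumes SQLite through Python's `sqlite3` module, where using the connection as a context manager commits on success and rolls back if an exception escapes. The `Account` table, the balances, and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (
    name    TEXT PRIMARY KEY,
    balance INTEGER CHECK (balance >= 0)   -- simple integrity constraint
);
INSERT INTO Account VALUES ('A', 100);
INSERT INTO Account VALUES ('B', 50);
""")

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if it committed."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM Account"))

ok_small = transfer(conn, "A", "B", 30)    # fits within A's balance
ok_large = transfer(conn, "A", "B", 500)   # debit would make A negative
final = balances(conn)
print(ok_small, ok_large, final)
```

In the failed transfer the credit to B had already been applied when the debit violated the constraint, yet the rollback removes it again: either both updates happen or neither does. Note that the DBMS enforces only the declared `CHECK`; deciding that a transfer must credit and debit the same amount remains the programmer's job.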
Ensuring Transaction Properties
 DBMS ensures atomicity (all-or-nothing property) even if
system crashes in the middle of a Xact.
 DBMS ensures durability of committed Xacts even if system
crashes.
 Idea: Keep a log (history) of all actions carried out by the
DBMS while executing a set of Xacts:
 Before a change is made to the database, the corresponding
log entry is forced to a safe location.
 After a crash, the effects of partially executed transactions are
undone using the log. Effects of committed transactions are
redone using the log.
 trickier than it sounds!
The Log

 The following actions are recorded in the log:


 Ti writes an object: the old value and the new value.

 Log record must go to disk before the changed page!


 Ti commits/aborts: a log record indicating this action.
 Log is often duplexed and archived on “stable” storage.
 All log related activities (and in fact, all concurrency control
related activities such as lock/unlock, dealing with deadlocks
etc.) are handled transparently by the DBMS.
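The undo/redo idea can be sketched in a few lines of plain Python. This is a deliberately simplified model and not how any particular DBMS implements recovery: the log is a list of tuples, the "disk" is a dict, and the record layout, the function `recover`, and all values are invented for illustration. Write records carry the old and new value, as described above.

```python
def recover(disk, log):
    """Rebuild a consistent state from the on-disk values and the log.

    Log records are ("write", txn, obj, old, new) or ("commit", txn).
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = dict(disk)

    # Redo pass: replay the writes of committed transactions in log order,
    # in case their changed pages had not reached the disk before the crash.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, obj, _old, new = rec
            db[obj] = new

    # Undo pass: walk the log backwards and restore the old values written
    # by transactions that never committed.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, obj, old, _new = rec
            db[obj] = old

    return db

# Hypothetical crash scenario. T1 committed, but its write to B had not
# reached the disk yet. T2 did not commit, but its write to C had.
log = [
    ("write", "T1", "A", 500, 400),
    ("write", "T1", "B", 500, 600),
    ("commit", "T1"),
    ("write", "T2", "C", 7, 10),
]
disk_at_crash = {"A": 400, "B": 500, "C": 10}

recovered = recover(disk_at_crash, log)
print(recovered)
```

The sketch only works because every write was logged before the change itself could reach the disk; without that ordering the undo pass would have no old value for C.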
Structure of a DBMS
 A typical DBMS has a layered architecture.
 The figure does not show the concurrency control and recovery components.
 Each database system has its own variations.
[Figure: layers from top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Annotation: these layers must consider concurrency control and recovery.]
Advantages of a DBMS

 Data independence
 Efficient data access
 Data integrity & security
 Data administration
 Concurrent access, crash recovery
 Reduced application development time
 So why not use them always?
 Expensive/complicated to set up & maintain
 This cost & complexity must be offset by need
 General-purpose, not suited for special-purpose tasks (e.g. text
search!)
Databases make these folks happy ...
 DBMS vendors, programmers
 Oracle, IBM, MS, Sybase, …

 End users in many fields


 Business, education, science, …

 DB application programmers
 Build enterprise applications on top of DBMSs

 Build web services that run off DBMSs

 Database administrators (DBAs)


 Design logical/physical schemas

 Handle security and authorization

 Data availability, crash recovery

 Database tuning as needs evolve

…must understand how a DBMS works


Summary (part 1)

 DBMS used to maintain, query large datasets.


 can manipulate data and exploit semantics
 Other benefits include:
 recovery from system crashes,
 concurrent access,
 quick application development,
 data integrity and security.
 Levels of abstraction provide data independence
 Key when d(app)/dt << d(platform)/dt, i.e., when applications change much more slowly than the platform beneath them
Summary, cont.
 DBAs and DB developers are the
bedrock of the information
economy

• DBMS R&D represents a broad,
fundamental branch of the science
of computation
