You are on page 1of 70

Databases

Raymond Freth A. Lagria


Outline

 Introduction to Databases
 Entity-Relationship Modeling
 Relational Modeling
 Normalization
 SQL
Introduction to
Databases
Raymond Freth A. Lagria
Session Outline

 Data and Information


 File-based Approach
 What are Databases?
 Database Management Systems (DBMS)
 The Database Development Process
 Importance of Database Design
Session Outline

 Data and Information


 File-based Approach
 What are Databases?
 Database Management Systems (DBMS)
 The Database Development Process
 Importance of Database Design
 Case Study
What is Data?

 Data – raw facts about people, places events, processes,


etc.
 Information – a collection of facts that is well organized to
have a value/meaning to a particular organization
 Tangible or intangible entity which serves to reduce uncertainty
about some future state or event
What is Data?

https://byjus.com/biology/difference-between-data-and-information/
Data vs Information

Data Information
Unorganized and Unrefined Processed, organized and presented in
a meaningful context
Individual unit that contains raw Group of data that is considered
materials without meaning collectively to carry a logical meaning
Does not depend on information Depends on data
Raw data alone is insufficient for Sufficient for decision making
decision making
Example: an employee’s age, a client’s Example: daily average sales, month’s
current balance, an athlete’s height average daily balance, highest observed
performance department
Information – A Key Resource

 Fuels business
 Critical factor in determining the success or failure of a
business
 Needs to be managed correctly

Easy to
Accurate Complete Update to Date
Understand
Transformation Processes for Data into
Information
 Selecting and Filtering
 Organizing
 Manipulating
 Presenting and Visualizing
 Storing
Structured Data vs. Unstructured Data

 Structured Data
 Data residing in a fixed field within a file or record
 Usually stored in relational databases (RDBMS)
 Unstructured Data
 Data is NOT structured in predefined way
 No data model
 Stored in its native format
Structured Data vs. Unstructured Data
Examples
Structured Unstructured
Dates Document Files
Phone Numbers Text Files
SSS Number Reports
Employee Number Email Messages
Employee Name Audio Files
Customer Last Name Video Files
Supplier Address Images
Structured Data vs. Unstructured Data
Examples
Session Outline

 Data and Information


 File-based Approach
 What are Databases?
 Database Management Systems (DBMS)
 The Database Development Process
 Importance of Database Design
 Case Study
File-Based Approach

 A file is a collection of related information


 A system of files and collection of application programs
manipulating them is a file-based system
 In a file system, data is directly stored in a set of files and
access is allowed to single files or tables at a time.
 It contains flat files that have no relation to other files.
File-Based Approach Analogy

These same
folder files are
not linked.
Limitations of the File-Based Approach

 Efforts for query answering:


 What is the average salary for all employees?
 List down all the employees involved in the project “Xpansion”.
 How many employees are at the manager level?
 How are you going to answer this?
Limitations of the File-Based Approach

 Other limitations:
 Duplication of data
 Data anomalies like update, insert and delete actions cannot do
the same on redundant data
 There should be a way to control and manipulate data in
isolation of the application program
University’s File-Based System

 No relation
 Not talking to each other
Sample File-Based System
Session Outline

 Data and Information


 File-based Approach
 What are Databases?
 Database Management Systems (DBMS)
 The Database Development Process
 Importance of Database Design
 Case Study
The Database System
The Database System

 A database system can be defined as the combination of a


database, a DBMS, and application programs
 Because of the advantages offered by a DBMS, businesses
and organizations prefer the database approach to the file-
based approach
 A database management system (DBMS) is a piece of
software that allows a user to define, create, and manage
access to a database
Components of a Database System

 Four Components
 Users

 Database Application
 DBMS

 Database
Database

 A database is a collection of information that is organized


so that it can be easily accessed, managed and updated.
 Data is organized into rows, columns and tables, and it is
indexed to make it easier to find relevant information.
 Shared, integrated computer structure that stores:
 End user data (raw data)
 Metadata (data about data)
 Saved Date, Saved Time, Row Size, Row Updated
Database
End-user Data (Raw Data)

 Building blocks of information


 Unprocessed information
Metadata

 Process of information building demands an additional


description of the data to be processed
 “Data about data”

Name Data Type Size Description

ID Number 4 Unique Student ID

Name Text 50 Student name

Address Text 200 Mailing Address

City Text 50 Residential City

State Text 2 Residential State

ZIP Number 5 ZIP Code


Database Application

 Computer programs that allow users to manipulate the


data in a DBMS through a user-friendly interface
Database Application
Database Application
Users

 Database administrator, system developer, and end user:


 A database administrator (DBA) is a person responsible for all the
data resources of an organization
 Uses tools that come with a DBMS to improve the productivity and
performance of database planning and design
 System developers are a group of people responsible for the
creation of new application programs that cater to the user
requirements
Users

 Database administrator, system developer, and end user:


 Database Designers are the ones who design the structure of the
database
 Also controls what data must be stored and how they are related
 End-users in an organization can add, update, and delete data in
a database through application programs or directly through a
DBMS
 Use the application program to accomplish their day-to-day tasks
Session Outline

 Data and Information


 File-based Approach
 What are Databases?
 Database Management Systems (DBMS)
 The Database Development Process
 Importance of Database Design
 Case Study
Database Management System (DBMS)

 Database Management refers to actions that businesses take to


work with and process their data → create, validate, retrieve,
update, maintain, link, align, archive, and delete to ensure
control
 A database management system (DBMS) is a piece of software
that allows a user to define, create, and manage access to a
database
 Decouples application programs from data
 The database stores all its data in one location, thereby
limiting data duplication
 MS Access, Oracle, IBM’s DB2, and SQL Server
The DBMS is the link
Database Management System (DBMS) between the apps and
the users to the
database.

 Collection of programs that


manages database structure
and controls access to data
 Possible to share data
among multiple applications
or users
 Makes data management
more efficient and effective
Functions of a DBMS

 A DBMS is primarily responsible for providing a logical


view of underlying data
 Allows its user to store, retrieve, and update data
 Provides a clear and logical view of the process that manipulates
the data
 Data Dictionary Management
 Defines data elements and their relationships
 Data Storage Management
 Stores data and related data entry forms, report definitions, etc.
Functions of a DBMS

 Other functionalities
 Data independence
 Maintain segregation between the program and the data
 Concurrency control – how current your data is
 Recovery services
 Security management
 Enforces user security and data privacy within a database
 Utility services
 Perform initialization and maintenance operations on a
database
Database Management Systems

 10 Most Used Databases by Developers in 2020


 MySQL
 PostgreSQL
 Microsoft SQL Server
 SQLite
 MongoDB
 Redis
 MariaDB
 Oracle Database
 Firebase
 Elasticsearch

https://analyticsindiamag.com/10-most-used-databases-by-developers-in-2020/
Types of Databases According to Design

 Relational Databases
 Defines database relationships in the form of tables
 Most popular DBMS type
 Object-oriented Databases
 Supports storage of all data types
 Considers all data are in the form of object
Types of Databases According to Design
JSON in MongoDB

 Document/JSON Databases
 Data is stored as document collections
 E.g., XML, JSON, BSON
 NoSQL Databases
 Stands for “Not Only SQL”
 Used for large sets of distributed data
 Very efficient in handling large-size unstructured data
 Schema-free
Types of Databases According to Design
JSON in MongoDB

 Document/JSON Databases
 Data is stored as document collections
 E.g., XML, JSON, BSON
 Data Warehouses

 NoSQL Databases
Types of Databases

 Single user
 Supports only one user at a time
 Desktop
 Single-user database running on a personal computer
Types of Databases: by User Group

 Multi-user
 Supports multiple user at the same time
 Multi-user database that supports a large group of users
or an entire organization
 Workgroup
 Multi-user database that supports a small group of
users or a single department
Types of Databases; by Location

 Centralized
 Supports data at a single site
 Distributed
 Supports data distributed across several sites
Types of Databases; by Use

 Transactional
 Supports a company’s day-to-day operations
 Data Warehouse
 Stores data used to generate information required to make
tactical or strategic decisions
 Often use to store historical data
 Structure is quite different
Advantages of a Database Approach

 Advantages:
 Segregation of the application program and the data
 Minimal data duplication
 Ability to retrieve data easily
 Reduced development time and maintenance needs
Disadvantages of a Database Approach

 Disadvantages:
 Complexity

 Size

 Cost
Session Outline

 Data and Information


 File-based Approach
 What are Databases?
 Database Management Systems (DBMS)
 The Database Development Process
 Importance of Database Design
 Case Study
Database Development Process (inspired
by the SDLC process) Requirements
Gathering

1) Establishing the Requirements


Conceptual Data
2) Data Analysis Model

3) Database Design
Logical Design
4) Implementation
5) Testing
Initial Schema
6) Maintenance and Database
Maintenance
Released Schema
and Database
Desirable Properties of a Database

Property Description
Completeness Ensures that users can access the data they want
Integrity Ensures that data is both consistent (no contradictory
data) and correct (no invalid data)
Flexibility Ensures that a database can evolved without requiring
excessive effort to satisfy changing user requirements
Efficiency Ensures that users do not have unduly long response
times when accessing data
Usability Ensures that data can be accessed and manipulated in
ways that match user requirements
Requirements Gathering

 Requirements related specifically to the data


 Consultation and agreement among all users
 Output: data requirements document
 Each course which is available for study is given a course code, a title
and a value for credit points – either a 30-point course or a 60-point
course. A course may have a quota – the maximum number of students
that can be enrolled on the course in any one year; each quota is
reviewed regularly, and the last date of review is recorded with the
quota. A course need not (yet) have any students enrolled on it. Students
may not enroll for more than 180 points’ worth of courses at any one
time.
Analysis to Design

 Two most common techniques:


 Entity-relationship (ER) Modeling
 Graphical database design approach that models entities, attributes,
and defines relationships among these entities
 Normalization
 Tables are organized in such a way that it decreases data redundancy
and dependency
 Larger tables are divided into smaller tables and are linked using
relationships
Analysis

 Statement of data requirements


 Produce a conceptual data model
 Goal: obtain a detailed description of the data that suits user
requirements → includes properties of the data
 The conceptual data model provides a shared, formal
representation between the end-users and the
developers/designers of the database
 Concerned with meaning and structure of the data
 What data a database should contain
 Constraints to satisfy This Training: Entity-Relationship Model
Analysis – Note!

 Analyze the DATA REQUIREMENTS, NOT THE


IMPLEMENTATION!
 Do not think yet of how these requirements are to be met
 Or the DBMS to be used yet
 May fail user requirements
Schema Schema: visual representation of a database,
set of rules that govern a database, or entire
set of objects belonging to a particular user

 Schema → overall description of the database


 Three types:
 External (Enterprise)
 Database in terms of data viewed by different users
 Conceptual (Entity-Relationship)
 Database in terms of entities, attributes, and relationships
 Internal (Logical and Physical)
 Database in terms of stored records, data fields, and indexes
Three-tier Architecture
Database Design

 Starts from the conceptual data model


 Output: specification of a logical schema
 Determines the specific type of database
system (e.g., relational)
 Consider design criteria – flexibility, control of
duplication, constraints

This Training: Relational Data Model and


Relational Database Design
Database Design – the Relational Way

Convert conceptual Define constituents


Represent entities
to relational of your schema for Implementation
as tables
representation each table

Columns,
primary
keys, FK,
constraints
Implementation

 Construction of a database according to the logical


schema
 Storage, security, external schema
 Heavily influenced by the choice of DBMS, database tools
and operating environment
 Additional tasks
 Creating a database schema
 Implementing constraints
 Populating the database
Implementation

 Logical Database Design


 Relational data modelling is used to transform E-R diagram
into a relational schema
 Describe each piece of information you need to track and
the relationships among, or the business rules that govern,
those pieces of information
 Physical Database Design and Creation
 Actual DB tables and relationships
 Selection of software (DBMS) and hardware
Populating the Database

 Can be one two ways:


 From existing data
 Use of user applications
 Existing records to be populated
 Purchased from external agency
 Import and export facilities of a DBMS
 Transformation maybe necessary
 Bulk load – transfer of large quantities of existing data
into the database
Testing

 Goal: uncover errors in the design and implementation of


the database → structure, constraints, user, etc.
 Validate: has the right database been developed to meet
the user requirements?
 Verify: has the database design been implemented
correctly?
Maintenance

 Maintenance of the database is a key issue


 Maintenance can have 3 forms:
 Operational maintenance – database performance
 Porting – DBMS, user processes, computer system changes
 Requirements change – original requirement specs change
Session Outline

 Data and Information


 File-based Approach
 What are Databases?
 Database Management Systems (DBMS)
 The Database Development Process
 Importance of Database Design
 Case Study
Importance of Database Design

 Information consistency
 Eliminating redundant data
 Efficiently executing queries
 Improved performance of a database
 Allows to easily access and retrieve data
Importance of Database Design

 Relational Data Model → stability and non-redundancy


 Data Warehouses → faster query and reports generation
 NoSQL → schema free
Session Outline

 Data and Information


 File-based Approach
 What are Databases?
 Database Management Systems (DBMS)
 The Database Development Process
 Importance of Database Design
 Case Study
References

 IE 171 Lecture Notes, DIE/OR, UP Diliman


 IE 172 Lecture Notes, DIE/OR, UP Diliman
 https://analyticsindiamag.com/10-most-used-databases-by-
developers-in-2020/
 https://www.guru99.com/introduction-to-database-sql.html
 https://www.researchgate.net/publication/236860222_Develop
ing_Dynamic_Packaging_Applications_Using_Semantic_Web-
Based_Integration/figures?lo=1&utm_source=google&utm_medi
um=organic
 https://www.open.edu/openlearn/ocw/mod/oucontent/view.p
hp?printable=1&id=2786
Session Outline

 Data Models
 The Entity Relationship Model
 Entites
 Attributes
 Relationships
 Cardinalities

You might also like