
COSC 0150: DATABASE SYSTEMS 1

Introduction
A database system is a software application designed to store, manage, and retrieve large
amounts of data efficiently. It provides an organized and structured approach to store and access
data, enabling users to perform various operations such as data insertion, modification, and
retrieval.

Databases are widely used in various domains, including businesses, organizations, research
institutions, and more. They play a crucial role in managing and organizing data for efficient data
processing, analysis, and decision-making.

Data: Data refers to the raw facts and figures that are stored in a database. It can be in various
forms, such as text, numbers, images, or multimedia.

Database Management System (DBMS): A DBMS is a software system that enables users to
define, create, manipulate, and manage databases. It provides an interface between the users and
the database, allowing them to interact with the data stored in the system.

Information: Information refers to processed or organized data that has meaningful context,
relevance, and value to a recipient or user. It is the result of analyzing, interpreting, and deriving
meaning from raw data.

Before the use of computers, a manual file system was used to maintain records and files. All
data was typically stored in physical formats such as paper documents, notebooks, index cards,
or manual registers.

Data entry and updates were performed manually by individuals who physically recorded or
modified the information on the chosen medium. This process typically involved writing or typing
data, which could be time-consuming and prone to human error.

Retrieving specific data from a manual file system was challenging. Searches often required
manual effort to locate and scan through physical records, which was time-consuming, especially
when dealing with large volumes of data.

To overcome these disadvantages and to speed up processing, the traditional file processing
system was introduced. Traditional file-based systems refer to an older approach to data storage
and management that predates modern database systems. In these systems, data is stored in
separate files, or flat files, typically organized based on the requirements of specific applications
or processes.

To illustrate the traditional file processing approach, let us take the example of a college where a
student's examination record is stored in one file and the library record is stored in a different
file. This creates many duplicate values, such as Roll Number, Name, and Father's Name, as the
hypothetical illustration below shows.

Some of the limitations of traditional file-based systems are:

Data redundancy: Since data is stored in separate files, it is common for duplicate or redundant
data to exist across multiple files. This redundancy increases storage requirements and makes it
challenging to maintain data consistency and integrity.

Lack of Data independence: Each application or process in a file-based system tends to have its
own set of files and data formats. This lack of data independence means that changes in one
application may require modifications to the underlying files and programs of other applications,
making the system inflexible and difficult to maintain.

Data Integrity and Security: Ensuring data integrity and security in a file-based system is more
challenging compared to modern database systems. File-based systems rely heavily on manual
checks and controls, making them more susceptible to data inconsistencies, unauthorized access,
and data loss.

Data Retrieval: Retrieving specific data in a file-based system can be complex and time-
consuming. Applications typically need to navigate through multiple files and search for relevant
records based on predetermined file structures, leading to slower and less efficient data access.

Despite these limitations, traditional file-based systems were widely used in the early days of
computing when data volumes and application complexity were relatively low. They paved the
way for the development of modern database systems, which address many of the shortcomings
of file-based systems by providing structured, centralized, and efficient data management
capabilities.

Database systems

The database approach to storing data is a systematic and structured method of managing and
organizing data. It involves the use of specialized software called a Database Management
System (DBMS) to create, store, retrieve, and manipulate data efficiently.

A database system consists of various components that work together to manage and organize
data efficiently. The main components of a database system include:

 Hardware: The hardware component of a database system refers to the physical
infrastructure that supports the storage, processing, and retrieval of data. This includes
servers, storage devices, memory, processors, and networking equipment.
 Software: The software component includes the database management system (DBMS)
software, which provides the necessary tools and functionality to manage and manipulate
data. The DBMS software allows users to create, store, retrieve, and modify data in the
database.
 Data: The data component represents the actual information stored in the database. It
includes structured data, such as tables, records, and fields, as well as unstructured data, such
as documents, multimedia files, and other forms of information.
 Procedures: Procedures refer to the set of rules and guidelines that govern the operation and
management of the database system. This includes procedures for data entry, data
manipulation, data backup and recovery, security measures, and system administration.
 Database Access Language: The database access language is used to read data from and
write data to the database. Users use it to enter new data, edit existing data, and retrieve
required data from the databases. Users write a set of appropriate commands in a database
access language and submit them to the DBMS. Administrators may also use the database
access language to create and maintain databases. The most popular database access
language is SQL (Structured Query Language); a short example follows this list.
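
As a minimal sketch in standard SQL, the statements below show how a user might enter new
data, edit existing data, and retrieve required data. The Student table and its columns are
assumed here purely for illustration.

    -- Enter new data
    INSERT INTO Student (RollNo, Name, FatherName)
    VALUES (1001, 'A. Mensah', 'K. Mensah');

    -- Edit existing data
    UPDATE Student
    SET Name = 'A. K. Mensah'
    WHERE RollNo = 1001;

    -- Retrieve required data
    SELECT RollNo, Name
    FROM Student
    WHERE FatherName = 'K. Mensah';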

Database management system

A Database Management System (DBMS) is specialized software that enables the creation,
organization, storage, retrieval, and manipulation of data in a structured and efficient manner. It
provides an interface between users or applications and the underlying database, allowing users
to interact with the data without having to understand the complexities of data storage and
management.

The primary functions of a DBMS include:

 Data Definition: A DBMS allows users to define the structure of the database, including
creating tables, specifying data types, setting up relationships between tables, and defining
constraints to ensure data integrity (see the SQL sketch after this list).
 Data Manipulation: DBMS provides mechanisms to manipulate data, allowing users to
insert, update, delete, and retrieve data from the database. This is typically done through
query languages such as SQL (Structured Query Language).
 Data Storage and Organization: DBMS manages the storage and organization of data in the
database. It determines how data is physically stored on disks or other storage media,
manages file structures, and handles data indexing for efficient data retrieval.
 Data Security: DBMS includes security features to protect data from unauthorized access,
ensuring that only authorized users can access and modify the data. It manages user
authentication, access control, and data encryption to ensure data confidentiality and
integrity.
 Data Integrity and Constraints: DBMS maintains data integrity by enforcing constraints and
rules on the data. It ensures that data meets certain criteria, such as uniqueness, referential
integrity, and data type constraints, to maintain data consistency and accuracy.
 Data Concurrency and Transaction Management: DBMS handles concurrent access to the
database by multiple users or applications, ensuring that data remains consistent and conflicts
are resolved. It manages transactions, which are sequences of database operations, to ensure
atomicity, consistency, isolation, and durability (ACID properties).
 Data Backup and Recovery: DBMS provides mechanisms for data backup and recovery,
allowing users to create regular backups of the database and restore data in case of hardware
failures, system crashes, or other types of data loss.
 Query Optimization and Performance: DBMS includes query optimization techniques to
enhance the performance of database queries. It analyzes query execution plans, optimizes
query execution paths, and uses indexing and caching mechanisms to improve query
performance.
 Data Administration and Management: DBMS provides tools and utilities for database
administration and management tasks. This includes tasks such as database creation,
configuration, monitoring, performance tuning, and user management.
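
The sketch below, written in generic SQL against a hypothetical Account table, illustrates three
of the functions above: data definition with constraints, data manipulation, and a transaction
whose statements either all succeed or can be rolled back as a unit (the exact transaction syntax
varies slightly between DBMS products).

    -- Data definition: structure, data types, and constraints
    CREATE TABLE Account (
        AccountNo INT           PRIMARY KEY,           -- uniqueness constraint
        OwnerName VARCHAR(60)   NOT NULL,
        Balance   DECIMAL(12,2) CHECK (Balance >= 0)   -- integrity rule
    );

    -- Data manipulation: inserting rows
    INSERT INTO Account VALUES (1, 'Ama', 500.00);
    INSERT INTO Account VALUES (2, 'Kofi', 200.00);

    -- Transaction: transfer 100 from account 1 to account 2 (ACID)
    START TRANSACTION;   -- BEGIN TRANSACTION in some products
    UPDATE Account SET Balance = Balance - 100 WHERE AccountNo = 1;
    UPDATE Account SET Balance = Balance + 100 WHERE AccountNo = 2;
    COMMIT;              -- or ROLLBACK; to undo both updates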

Popular examples of DBMS include Oracle Database, MySQL, Microsoft SQL Server,
PostgreSQL, and MongoDB.

While Database Management Systems (DBMS) offer numerous advantages, they also have some
disadvantages that should be considered. Here are a few common disadvantages of DBMS:

 Complexity: DBMS can be complex to design, implement, and manage. The setup and
configuration of a DBMS often require specialized knowledge and expertise. Database
administrators (DBAs) need to possess a deep understanding of the underlying data model,
query languages, optimization techniques, and administration tasks. This complexity can
increase the overall cost of implementing and maintaining a DBMS.
 Cost: Implementing and maintaining a DBMS can be costly. The expenses include acquiring
the software, hardware infrastructure, licensing fees, ongoing maintenance, and training for
personnel. Small businesses or organizations with limited resources may find the initial
investment and ongoing costs prohibitive.
 Performance Overhead: DBMS introduces performance overhead due to the additional
layers of abstraction and processing required to manage and access data. Operations such as
data retrieval, indexing, query optimization, and concurrency control can consume system
resources, resulting in slower response times compared to direct file-based access for small-
scale applications.

Database users

End Users: End users are individuals or entities who directly interact with the database system to
access, retrieve, and manipulate data according to their specific needs. They typically use
software applications or interfaces that provide them with a user-friendly way to work with the
database.

Database Administrators (DBAs): Database administrators are responsible for managing and
maintaining the database system. They perform tasks such as database installation, configuration,
security management, performance optimization, backup and recovery, and user management.
DBAs have advanced knowledge of the DBMS and are involved in ensuring the overall integrity,
security, and performance of the database system.

Application Developers: Application developers are responsible for designing, developing, and
maintaining software applications that interact with the database system. They use programming
languages, frameworks, and tools to build applications that incorporate database functionality.
Application developers create database schemas, write database queries, implement data
validation rules, and handle database connectivity and integration within their applications.

Database Developers: Database developers, also known as database designers or database
programmers, are responsible for the design, programming, construction, and implementation of
new databases, as well as modifying existing databases for platform updates and changes in user
needs.

Database System Architectures

Single-Tier Database Architecture: In a single-tier database architecture, also known as a
standalone or file-based architecture, the entire database system runs on a single machine. The
application logic, data storage, and user interface are tightly integrated into a single system. In
this architecture, there is no separation between the database server and client applications. The
user interacts directly with the database system, which manages data storage and processing.

Two-Tier Database Architecture: In a two-tier database architecture, the system is divided into
two main components: the client and the server. The client is responsible for the presentation
layer, which includes the user interface and application logic. The server, often referred to as the
database server or database management system (DBMS), handles data storage, query
processing, and management. The client communicates directly with the database server to
retrieve or update data.

Three-Tier Database Architecture: In a three-tier database architecture, the system is divided
into three distinct layers: the presentation layer, the application layer, and the data layer.

 Presentation Layer (Client): The presentation layer, also known as the client tier, is
responsible for the user interface and interaction. It includes the graphical user interface
(GUI) or web interface that users interact with to access the application. The presentation
layer communicates with the application layer to retrieve or send data.
 Application Layer (Middle Tier): The application layer, also called the middle tier, contains
the application logic and business rules. It handles the processing of user requests, executes
business logic, and interacts with the data layer. This layer may also include additional
components such as web servers, application servers, or middleware that facilitate
communication between the presentation and data layers.
 Data Layer (Server): The data layer, also known as the server tier, is responsible for data
storage, management, and processing. It includes the database server or DBMS, which
handles tasks such as data storage, query processing, transaction management, and data
integrity enforcement. The application layer communicates with the data layer to retrieve or
update data as needed.

The three-tier architecture provides a separation of concerns, allowing for scalability, flexibility,
and improved maintainability. It enables easier maintenance and updates to the application logic
without impacting the user interface or data storage. Additionally, the division of layers allows
for distributed deployment, where each layer can be hosted on different machines or servers,
improving performance and scalability.

Overall, the choice of architecture depends on factors such as the complexity of the application,
scalability requirements, and separation of concerns needed in the system.

Database schema and instance

When the database is designed to meet the information needs of an organization, the schema of
the database and the actual data to be stored in it become the most important concerns of the
organization.

A database schema is a structure that defines the logical organization and design of a database. It
represents the blueprint or skeleton of the database and outlines the tables, relationships,
constraints, and other elements that make up the database.

The data values that populate the database at any point in time are known as an instance of the
database. It is also referred to as the state of the database or a snapshot. It is important to note
that the data in the database changes frequently, while the schema remains the same over long
periods of time (although not necessarily forever).
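
A minimal sketch of the distinction, using a hypothetical Student table: the CREATE TABLE
statement defines the schema, while the rows present at a given moment form an instance (state)
of the database.

    -- Schema: the structure, which changes rarely
    CREATE TABLE Student (
        RollNo INT PRIMARY KEY,
        Name   VARCHAR(60)
    );

    -- Instance: the data values at one point in time, which change frequently
    -- RollNo | Name
    -- -------+-----------
    --   1001 | A. Mensah
    --   1002 | Y. Osei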

For the system to be usable, it must retrieve data efficiently. The need for efficiency has led
designers to use complex data structures to represent data in the database. Since many database-
system users are not computer trained, developers hide the complexity from users through
several levels of abstraction, to simplify users’ interactions with the system. This leads to the
ANSI-SPARC three-schema architecture.

ANSI-SPARC three-schema architecture


The three-level ANSI-SPARC database architecture, also known as the ANSI-SPARC model or
the ANSI-SPARC three-schema architecture, is a conceptual model that defines three levels of
abstraction for database design. It was developed by the American National Standards Institute
(ANSI) Standards Planning and Requirements Committee (SPARC). The three levels are as
follows:

External Level (View Level): The external level represents the user's view of the database. It
defines multiple user views or external schemas that specify how different user groups or
applications perceive and access the data. Each external schema provides a customized and
simplified view of the database tailored to the specific requirements of each user group.

Conceptual Level (Logical Level): The conceptual level provides a conceptual or logical
representation of the entire database. It describes the overall structure and organization of the
data, independent of any specific application or user view. The conceptual schema, also known
as the global schema, integrates and reconciles the various external schemas to present a unified
and consistent view of the data.

Internal Level (Physical Level): The internal level deals with the physical representation of the
database on storage media. It describes how data is stored, indexed, and accessed at the physical
level. It includes details such as data storage formats, file structures, indexing techniques, and
access paths. The internal schema maps the logical structure defined in the conceptual schema to
the physical storage implementation.
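
As a rough sketch in generic SQL (the table, view, and index names are invented for illustration),
a base table corresponds to the conceptual level, a view tailored to one user group corresponds to
an external schema, and an index is an internal-level detail that affects only storage and access
paths.

    -- Conceptual level: the logical structure of the data
    CREATE TABLE Student (
        RollNo INT PRIMARY KEY,
        Name   VARCHAR(60),
        GPA    DECIMAL(3,2)
    );

    -- External level: a customized, simplified view for the examinations office
    CREATE VIEW ExamOfficeView AS
        SELECT RollNo, GPA FROM Student;

    -- Internal level: a physical access-path detail, invisible to users of the view
    CREATE INDEX idx_student_name ON Student (Name);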

NB. The three-tier database architecture focuses on the distribution of functionality within a
database application, whereas the three-level ANSI-SPARC database architecture focuses on the
conceptual organization and abstraction levels of a database design. The three-tier architecture
defines the layers of a database application, while the ANSI-SPARC model defines the levels of
abstraction in a database system's design.

Advantages of the Three-Level Architecture

1. Data Independence

The three-level schema architecture provides a clear separation between the logical and physical
aspects of the database. This separation allows for data independence, meaning changes in the
physical storage or internal schema do not affect the external or conceptual schemas. It enables
modifications to be made at one level without impacting other levels, providing flexibility and
adaptability to the database system.

There are two types of data independence:

 Logical data independence refers to the characteristic of being able to modify the
conceptual schema without having to change external schemas or application programs.
We may change the conceptual schema to expand the database (by adding a record type
or data item), to change constraints, or to reduce the database (by removing a record type
or data item).

 Physical data independence refers to the characteristic of being able to modify the
internal schema without having to change the conceptual schema. Hence, the external
schemas need not be changed either. Both kinds of data independence are illustrated in
the sketch after this list.
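
A brief sketch of both kinds of independence in generic SQL, continuing the hypothetical Student
table from the previous sketch: the ALTER TABLE statement changes the conceptual schema
without breaking existing external views or programs that do not use the new column (logical
data independence), while the CREATE INDEX statement changes only the internal schema,
leaving the conceptual and external schemas untouched (physical data independence).

    -- Logical data independence: expand the conceptual schema;
    -- views and programs that ignore the new Email column keep working
    ALTER TABLE Student ADD COLUMN Email VARCHAR(100);

    -- Physical data independence: change the internal (storage) level only;
    -- no table or view definitions need to change
    CREATE INDEX idx_student_email ON Student (Email);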

2. Improved Modularity and Maintainability

The three-level architecture promotes modularity in database design. Each level (external,
conceptual, and internal) can be developed and modified independently, which facilitates easier
maintenance and evolution of the database system. Changes in the external schema, such as
adding or modifying user views, can be implemented without affecting the underlying
conceptual or internal schemas.

3. Enhanced Data Security and Privacy


The external schema allows for controlled and customized data access by different user groups or
applications. It enables the implementation of security measures, such as access controls and user
permissions, to restrict data access based on user roles or privileges. This architecture also
supports data privacy by ensuring that sensitive information is not exposed to unauthorized users
or applications.

4. Flexibility in Database Design

The three-level schema architecture allows for multiple external schemas or views of the same
underlying data. This flexibility enables different user groups or applications to have customized
and simplified views of the data that are tailored to their specific needs. It supports data
abstraction, providing a high-level view of the data that simplifies complex database structures
and enhances usability.

Database development life cycle (DDLC)

The database life cycle refers to the process of developing, implementing, maintaining, and
eventually retiring a database system. It typically consists of the following stages:

Database Planning: This initial stage involves defining the purpose and scope of the database
system. It includes identifying the requirements of the system, such as data sources, data volume,
performance goals, security requirements, and user needs. During this stage, project planning,
budgeting, and resource allocation are also considered.

Requirements Gathering and Analysis: In this stage, the requirements for the database system
are gathered from stakeholders, including end users, administrators, and management. The
existing data sources and their structures are analyzed, and any necessary data transformations or
conversions are identified.

Conceptual Database Design: Based on the gathered requirements, a conceptual database design
is created. This design focuses on the high-level structure of the database, including the entities,
relationships, and attributes. Techniques such as entity-relationship diagrams are commonly used
during this stage.

Logical Database Design: The logical database design involves translating the conceptual design
into a more detailed representation. It includes defining the tables, columns, data types, primary
and foreign key relationships, and any constraints. Normalization techniques are applied to
ensure data integrity and eliminate data redundancy.
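
A minimal sketch, in generic SQL, of what the logical design stage might produce for the earlier
college example: student details are stored once, and the examination and library tables refer to
them through foreign keys, eliminating the redundant Name and Father's Name values.

    CREATE TABLE Student (
        RollNo     INT         PRIMARY KEY,
        Name       VARCHAR(60) NOT NULL,
        FatherName VARCHAR(60)
    );

    CREATE TABLE ExamRecord (
        RollNo INT REFERENCES Student (RollNo),   -- foreign key
        Course VARCHAR(20),
        Score  INT,
        PRIMARY KEY (RollNo, Course)
    );

    CREATE TABLE LibraryLoan (
        RollNo  INT REFERENCES Student (RollNo),  -- foreign key
        BookID  VARCHAR(10),
        DueDate DATE,
        PRIMARY KEY (RollNo, BookID)
    );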

Physical Database Design: In this stage, the logical design is transformed into a physical
database schema. Factors such as storage structures, indexing, partitioning, and access methods
are considered to optimize performance and storage efficiency. The physical design also takes
into account the hardware and software platform on which the database will be implemented.

Database Implementation: Once the physical design is completed, the database system is
implemented. This involves creating the database schema, setting up the storage structures,
creating tables, defining indexes, and configuring security and access controls. Data from
existing sources may be loaded into the database during this stage.
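
As an illustrative sketch (the index, table, and user names are hypothetical, and privilege syntax
differs slightly between DBMS products), implementation typically involves statements such as
the following to define indexes and configure access controls.

    -- Define an index to speed up common lookups
    CREATE INDEX idx_exam_course ON ExamRecord (Course);

    -- Configure security and access controls for specific database users
    GRANT SELECT, INSERT, UPDATE ON ExamRecord TO exam_clerk;
    GRANT SELECT ON LibraryLoan TO librarian;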

Database Testing: After implementation, the database system undergoes various tests to ensure
its functionality, performance, and data integrity. Testing includes activities such as data
validation, transaction processing, concurrency control, backup and recovery, and stress testing.
Any issues or bugs discovered are addressed and resolved.

Database Deployment: Once the database system passes the testing phase, it is deployed into the
production environment. This involves migrating data from the old system (if applicable), setting
up backup and recovery procedures, configuring access controls, and establishing connectivity
with application systems.

Database Operations and Maintenance: Once the database is live, it requires ongoing
operations and maintenance. This includes monitoring performance, optimizing queries, tuning
the database configuration, applying patches and updates, and ensuring data backups and disaster
recovery procedures are in place. Regular maintenance tasks, such as index reorganization and
data purging, are performed to maintain system efficiency.
