UNIT I INTRODUCTION
Syllabus:
Issues in File Processing System, Need for DBMS, Basic terminologies of Database, Database
system Architecture, Various Data models, ER diagram basics and extensions, Case study:
Construction of Database design using Entity Relationship diagram for an application such as
University Database, Banking System, Information System
DBMS stands for Database Management System. The term can be broken down as DBMS = Database +
Management System: a database is a collection of data, and the management system is a set of
programs to store and retrieve that data.
Definition:
A DBMS is a collection of inter-related data and a set of programs to store and access that data
in an easy and effective manner.
Storage:
❖ According to the principles of database systems, data is stored in such a way that it occupies
much less space, because redundant (duplicate) data is removed before storage.
❖ Let’s take a simple example to understand this:
In a banking system, suppose a customer has two accounts: a savings account and a salary
account.
❖ Suppose the bank stores savings account data in one place (these places are called tables; we
will learn about them later) and salary account data in another. If customer information such
as name and address is stored in both places, storage is wasted (redundancy, or duplication
of data). To organize the data better, the customer information should be stored in one place
and both accounts should be linked to it. This is what a DBMS achieves.
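The banking example above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed design: the table names (customer, account), the column names and the sample values are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: table names, column names and values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT NOT NULL
    );
    CREATE TABLE account (
        account_no   INTEGER PRIMARY KEY,
        account_type TEXT NOT NULL,          -- 'saving' or 'salary'
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)
    );
""")

# The customer's details are stored exactly once.
conn.execute("INSERT INTO customer VALUES (1, 'Asha', '12 Lake Road')")
# Both accounts link to that single customer row.
conn.execute("INSERT INTO account VALUES (101, 'saving', 1)")
conn.execute("INSERT INTO account VALUES (102, 'salary', 1)")

# A change of address is made in one place and is seen through both accounts.
conn.execute("UPDATE customer SET address = '7 Hill Street' WHERE customer_id = 1")
rows = conn.execute("""
    SELECT a.account_no, a.account_type, c.name, c.address
    FROM account AS a JOIN customer AS c ON c.customer_id = a.customer_id
    ORDER BY a.account_no
""").fetchall()
print(rows)
```

Because the address lives only in the customer table, there is no second copy that could be left out of date.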
Fast Retrieval of data:
❖ Along with storing the data in an optimized and systematic manner, it is also important that
we retrieve the data quickly when needed.
❖ Database systems ensure that the data is retrieved as quickly as possible.
Impact: High maintenance costs, inflexibility in adapting to changing data structures, and
increased risk of errors during updates.
e) Lack of Data Integrity:
Problem: In the absence of constraints and rules enforced by a database management system,
maintaining data integrity becomes a manual task. This increases the risk of entering incorrect or
inconsistent data.
Impact: Lower data quality, increased chances of errors, and difficulties in ensuring the accuracy
of information.
f) Limited Security:
Problem: File systems often have limited security measures. Access controls are typically basic,
and there's a higher risk of unauthorized access.
Impact: Data breaches, unauthorized modifications, and compromised system security.
g) Concurrency Control:
Problem: Ensuring concurrent access to data by multiple users without conflicts is challenging.
File systems may lack mechanisms to handle concurrent updates properly.
Impact: Data corruption, lost updates, and challenges in maintaining data consistency in a multi-
user environment.
h) Scalability Issues:
Problem: As the volume of data grows, file processing systems may struggle to handle large
datasets efficiently. Performance issues can arise.
Impact: Reduced system performance, longer processing times, and challenges in scaling the
system.
i) Limited Query Capabilities:
Problem: File processing systems often lack a query language and sophisticated querying
capabilities. Retrieving specific information can be cumbersome.
Impact: Inefficient data retrieval, increased complexity in generating reports, and challenges in
extracting meaningful insights.
Disadvantages of DBMS:
1. DBMS implementation cost is high compared to the file system
2. Complexity: Database systems are complex to understand
3. Performance: Database systems are generic, making them suitable for various applications.
However, this feature affects their performance for some applications
2. DBMS (Database Management System) - Software that enables users to interact with the
database. It provides tools for creating, managing, and querying databases.
3. Table - A collection of data organized in rows and columns. Tables are the basic structure
in a relational database.
4. Row or Record - A single entry in a database table that contains data related to a specific
entity or object.
5. Column or Field - A vertical section in a database table that represents a specific attribute
or property. Columns hold the data for a particular aspect of the entity.
6. Primary Key - A unique identifier for each record in a table. It ensures that each row can
be uniquely identified and retrieved.
7. Foreign Key - A column in a table that refers to the primary key in another table. It
establishes a link between two tables, enforcing referential integrity.
8. Index - A data structure that improves the speed of data retrieval operations on a database
table.
9. Query - A request for data retrieval or manipulation from a database. Queries are typically
written in a query language like SQL (Structured Query Language).
10. Normalization - The process of organizing the data in a database to eliminate redundancy
and improve data integrity.
11. Relational Database - A type of database that uses a tabular structure to organize data, and
relationships between tables are defined.
12. SQL (Structured Query Language) - A programming language used for managing and
manipulating relational databases. It includes commands for querying, updating, and
managing databases.
13. Transaction - A sequence of one or more database operations treated as a single unit.
Transactions ensure the consistency and integrity of a database.
14. ACID (Atomicity, Consistency, Isolation, Durability) - Properties that guarantee the
reliability of database transactions. Atomicity ensures that transactions are treated as a single
unit, Consistency ensures that a database remains in a valid state, Isolation ensures that
transactions are independent of each other, and Durability ensures that committed
transactions are permanent.
15. Schema - The structure or blueprint that defines the organization of data in a database,
including tables, fields, relationships, and constraints.
16. Data Dictionary - A centralized repository that stores metadata about the database,
including information about tables, columns, data types, and relationships.
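The Transaction and ACID entries above can be illustrated with a small sketch using Python's sqlite3 module. The account table, the transfer function and the amounts are assumptions made for the example. The transfer credits one account and then debits the other; if the debit violates a constraint, the whole transaction is rolled back, which is the atomicity property.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK constraint forbids negative balances.
conn.execute("CREATE TABLE account (account_no INTEGER PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE account_no = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE account_no = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30)        # succeeds
failed = transfer(conn, 1, 2, 500)   # debit would make account 1 negative
balances = dict(conn.execute("SELECT account_no, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 has already been applied when the debit is rejected, so the rollback is what keeps the two balances consistent.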
1) One-to-many relationship
2) Parent-Child Relationship
3) Deletion Problem
4) Pointers
1) Complexity
2) If Parent node is deleted then child node will be deleted
2) NETWORK MODEL:
❖ The network model is an extension of the hierarchical model.
❖ This model was recommended as the best model before the relational model.
❖ It is similar to the hierarchical model; the only difference is that a child in the network
model can have more than one parent.
❖ For example, in the following diagram a student entity has more than one parent.
2) More paths
1. Entities
2. Attributes
3. Relationships
Features of ER Model
1) Graphical representation
2) Visualization
Advantages of ER Model
1) Very Simple
2) Better communication
Disadvantage of ER Model
1) No industry standard
2) Hidden information
4) RELATIONAL MODEL:
❖ Widely used model
❖ Data is represented in rows and columns (a two-dimensional table)
❖ Example: EMP (Employee) Table
4) Isolation
Disadvantages of Relational Model
1) Hardware overheads
data records get stored in data structures; however, internal details such as the
implementation of the data structures are hidden at this level (available at the physical level).
❖ The design of a database at the view level is called the view schema. This generally describes
end-user interaction with database systems.
INSTANCE:
❖ The data stored in a database at a particular moment in time is called an instance of the
database.
❖ The database schema defines the variable declarations in the tables that belong to a particular
database; the values of these variables at a moment in time are called the instance of that
database.
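The difference between schema and instance can be seen in a short sketch using Python's sqlite3 module. The student table and its rows are hypothetical. The schema is read with SQLite's PRAGMA table_info, and the instance is simply the rows present at the moment of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def schema(conn):
    # Column names and declared types: the schema of the table.
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

def instance(conn):
    # The rows stored right now: the current instance.
    return conn.execute(
        "SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()

schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO student VALUES (1, 'Ravi')")
conn.execute("INSERT INTO student VALUES (2, 'Meena')")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)      # the schema did not change
print(instance_before, instance_after)    # the instance did
```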
3) View level: Highest level of data abstraction. This level describes the user interaction with
database system.
Example:
❖ Let’s say we are storing customer information in a customer table. At the physical level these
records can be described as blocks of storage (bytes, gigabytes, terabytes etc.) in memory.
These details are often hidden from the programmers.
❖ At the logical level these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented. Programmers
generally work at this level because they are aware of such details of database
systems.
❖ At the view level, users just interact with the system through a GUI and enter details on
the screen; they are not aware of how the data is stored or what data is stored, as such details
are hidden from them.
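One way to picture the logical and view levels is a table definition plus an SQL view over it. The sketch below uses Python's sqlite3 module; the customer table, the customer_public view and the sample row are assumptions made for illustration, and a real application would usually put a GUI on top of such a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: fields and their data types.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, "
             "name TEXT, city TEXT, balance INTEGER)")
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'Chennai', 2500)")

# View level: a restricted window onto the table for one group of users.
conn.execute("CREATE VIEW customer_public AS SELECT name, city FROM customer")

cursor = conn.execute("SELECT * FROM customer_public")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

Users of customer_public never see customer_id or balance, and nothing in the query reveals how SQLite lays the rows out on disk (the physical level).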
DATA INDEPENDENCE:
❖ Data independence is the ability to change the schema at one level of a database system
without having to change the schema at the next higher level. There are two types of data
independence:
1. Logical data independence - is the capacity to change the conceptual schema without having
to change external schemas or application programs.
2. Physical data independence - is the capacity to change the internal schema without having to
change the conceptual schema.
Applications depend on the logical schema. In general, the interfaces between the various
levels and components should be well defined so that changes in some parts do not seriously
influence others.
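Physical data independence can be sketched as follows with Python's sqlite3 module. The emp table, the index name idx_emp_dept and the rows are hypothetical. An index is added (a change to the storage and access structures) and the same query, unchanged, returns the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [(1, 'Ravi', 'Sales'), (2, 'Meena', 'HR'), (3, 'John', 'Sales')])

# The application's query is written against the logical schema only.
QUERY = "SELECT name FROM emp WHERE dept = 'Sales' ORDER BY emp_id"

before = conn.execute(QUERY).fetchall()
# Change at the physical level: add an access structure.
conn.execute("CREATE INDEX idx_emp_dept ON emp(dept)")
after = conn.execute(QUERY).fetchall()

print(before == after)
```

The index may change how the rows are located, but the query text and its result stay the same.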
DBMS LANGUAGES:
Database languages are used to read, update and store data in a database. There are several
such languages that can be used for this purpose; one of them is SQL (Structured Query
Language).
1. Data Definition Language (DDL): DDL is used for specifying the database schema. Let’s
take SQL as an example to categorize the statements that come under DDL.
2. Data Manipulation Language (DML): DML is used for accessing and manipulating data in a
database.
• To read records from table(s) – SELECT
• To insert record(s) into the table(s) – INSERT
• To update the data in table(s) – UPDATE
• To delete record(s) from the table(s) – DELETE
3. Data Control Language (DCL): DCL is used for granting and revoking user access on a
database –
• To grant access to user – GRANT
• To revoke access from user – REVOKE
In practice, the data definition, data manipulation and data control languages are
not separate languages; rather, they are parts of a single database language such as SQL.
4. Transaction Control Language (TCL): Transaction Control Language statements are used to
control the transactions in the database.
• SAVEPOINT
• ROLLBACK
• COMMIT
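The statement categories above can be tried out with Python's sqlite3 module, as in the following sketch. The emp table and the savepoint name before_raise are assumptions for the example. DCL is not shown because SQLite does not implement GRANT and REVOKE; those statements belong to multi-user database servers.

```python
import sqlite3

# isolation_level=None: the sqlite3 module issues no implicit BEGIN/COMMIT,
# so the transaction control statements below are all explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: define the schema.
conn.execute("CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")

conn.execute("BEGIN")
# DML: insert and update records.
conn.execute("INSERT INTO emp VALUES (1, 'Ravi', 30000)")
# TCL: mark a point to which the transaction can be partially rolled back.
conn.execute("SAVEPOINT before_raise")
conn.execute("UPDATE emp SET salary = salary * 2 WHERE emp_id = 1")
conn.execute("ROLLBACK TO SAVEPOINT before_raise")   # undo only the update
conn.execute("COMMIT")                               # make the insert permanent

# DML: read the records back.
rows = conn.execute("SELECT emp_id, name, salary FROM emp").fetchall()
print(rows)
```

The insert survives because it happened before the savepoint, while the update made after the savepoint is undone.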
❖ It interprets the requests (queries) from user(s) via an application program/interface into
instructions.
❖ It also executes the user request which is received from the DML compiler.
a) DML Compiler - It processes the DML statements into low-level instructions.
b) DDL Interpreter - It processes the DDL statements into a set of tables containing
metadata.
Storage Manager:
❖ It is an interface between the information stored in the database and the requests (queries)
❖ Its main responsibility is managing data manipulation such as addition, deletion,
modification, etc.
a) Authorization Manager - It ensures role-based access control, i.e., checks whether the
particular person is privileged to perform the requested operation or not.
b) Integrity Manager - It checks the integrity constraints when the database is modified.
d) File Manager - It manages the file space and the data structure used to represent
information in the database.
e) Buffer Manager - It is responsible for cache memory and the transfer of data between the
secondary storage and main memory.
Disk Storage:
b) Data Dictionary - It contains the information about the structure of any database object. It
is the repository of information that governs the metadata.
Database Administrator:
❖ Coordinates all the activities of the database system.
❖ The database administrator has a good understanding of the enterprise’s information
resources and needs.
❖ Database administrator's duties (Roles) include:
▪ Schema definition
▪ Storage structure and access method definition
▪ Schema and physical organization modification
▪ Granting user authority to access the database
▪ Specifying integrity constraints
▪ Acting as liaison with users
▪ Monitoring performance and responding to changes in requirements
E-R MODEL:
An Entity–Relationship Model (ER model) is a systematic way of describing and defining a
business process. An ER model is typically implemented as a database. The main components of
E-R model are: entity set and relationship set.
Here are the geometric shapes and their meaning in an E-R Diagram –
An Entity is represented by a set of attributes, which are descriptive properties possessed by all
members of an entity set.
Attribute types:
❑ Simple and composite attributes.
❑ Single-valued attributes.
❑ Multi-valued attributes.
An attribute that can hold multiple values is known as multivalued attribute. We represent
it with double ellipses in an E-R Diagram.
E.g. A person can have more than one phone number, so the phone number attribute is
multivalued.
❑ Derived attributes:
A derived attribute is one whose value is dynamic and derived from another attribute. It is
represented by dashed ellipses in an E-R Diagram.
E.g. A person's age is a derived attribute, as it changes over time and can be derived from
another attribute (date of birth).
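A derived attribute is usually computed when needed rather than stored. The sketch below shows the age example in Python; the function name age_on and the sample dates are assumptions made for illustration.

```python
from datetime import date

def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in completed years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

dob = date(2000, 6, 15)
print(age_on(dob, date(2024, 6, 14)))   # the day before the birthday
print(age_on(dob, date(2024, 6, 15)))   # on the birthday
```

Only the date of birth needs to be stored; the age is always recomputed from it, so it can never go stale.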
Cardinality Constraints:
We express cardinality constraints by drawing either a directed line (→), signifying “one”,
or an undirected line (—), signifying “many”, between the relationship set and the entity set.
One-to-one relationship
• A customer is associated with at most one loan via the relationship borrower
• A loan is associated with at most one customer via borrower
One-To-Many Relationship:
In the one-to-many relationship a loan is associated with at most one customer via
borrower; a customer is associated with several loans via borrower.
Many-To-One Relationships
In the many-to-one relationship a loan is associated with several customers via borrower;
a customer is associated with at most one loan via borrower.
Many-To-Many Relationship
▪ A customer is associated with several loans via borrower
▪ A loan is associated with several customers via borrower
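When an E-R design is turned into tables, the cardinality of borrower decides which key constraints are declared. The following sketch, using Python's sqlite3 module, is one possible mapping and not the only one; the table names, the second table borrower_one_to_many and the sample data are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE loan (loan_no INTEGER PRIMARY KEY, amount INTEGER);
    -- Many-to-many: one row per (customer, loan) pair.
    CREATE TABLE borrower (
        customer_id INTEGER REFERENCES customer(customer_id),
        loan_no     INTEGER REFERENCES loan(loan_no),
        PRIMARY KEY (customer_id, loan_no)
    );
    -- One-to-many from customer to loan: each loan may appear only once.
    CREATE TABLE borrower_one_to_many (
        customer_id INTEGER REFERENCES customer(customer_id),
        loan_no     INTEGER PRIMARY KEY REFERENCES loan(loan_no)
    );
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, 'Asha'), (2, 'Ravi')])
conn.executemany("INSERT INTO loan VALUES (?, ?)", [(10, 5000), (11, 9000)])

# Many-to-many: customer 1 has two loans, and loan 10 has two customers.
conn.executemany("INSERT INTO borrower VALUES (?, ?)", [(1, 10), (1, 11), (2, 10)])
many_to_many_pairs = conn.execute("SELECT COUNT(*) FROM borrower").fetchone()[0]

# One-to-many: a customer may hold several loans ...
conn.executemany("INSERT INTO borrower_one_to_many VALUES (?, ?)", [(1, 10), (1, 11)])
# ... but a loan cannot be given a second customer.
try:
    conn.execute("INSERT INTO borrower_one_to_many VALUES (2, 10)")
    second_customer_allowed = True
except sqlite3.IntegrityError:
    second_customer_allowed = False

print(many_to_many_pairs, second_customer_allowed)
```

The only difference between the two tables is the choice of primary key: the pair of columns permits repeats on either side, while a key on loan_no alone limits each loan to one customer.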
Specialization
❖ Top-down design process: We designate subgroupings within an entity set that are
distinctive from other entities in the set.
❖ These sub groupings become lower-level entity sets that have attributes or participate in
relationships that do not apply to the higher-level entity set.
❖ Depicted by a triangle component labeled ISA (E.g. customer “is a” person).
❖ Attribute inheritance – a lower-level entity set inherits all the attributes and relationship
participation of the higher-level entity set to which it is linked
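Attribute inheritance in an ISA hierarchy behaves much like class inheritance in a programming language. The sketch below uses Python dataclasses as an analogy; the attribute names (name, address, credit_rating) are assumptions for the example, and this is an illustration of the idea rather than a way of storing the entities in a database.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    # Higher-level entity set.
    name: str
    address: str

@dataclass
class Customer(Person):
    # Lower-level entity set: customer "is a" person.
    # It adds an attribute that does not apply to every person.
    credit_rating: int

c = Customer(name="Asha", address="12 Lake Road", credit_rating=7)
customer_attributes = [f.name for f in fields(Customer)]
print(customer_attributes)
print(isinstance(c, Person))
```

Customer lists name and address without declaring them, which mirrors how a lower-level entity set inherits the attributes of the higher-level entity set it is linked to.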
Generalization
❖ A bottom-up design process – combine a number of entity sets that share the same
features into a higher-level entity set.
❖ Specialization and generalization are simple inversions of each other; they are represented
in an E-R diagram in the same way.
❖ The terms specialization and generalization are used interchangeably.