
Dbms Unit 1


DATA BASE APPROACH

Q) What is a database? What are the features of a database?
Advantages and disadvantages?
A) Introduction:
A database is defined as an organised collection of electronic
information that can be processed to generate useful information.
The data can be created, retrieved, updated, managed, controlled and
organised through various data-processing operations.
A database management system (DBMS) is software that provides optimised
solutions for storing and retrieving data from databases. It is a
systematic approach to storing and managing databases, and it provides an
environment in which users can store and access data in a secure way.
Information:
Knowledge vs data
Data and knowledge are often used interchangeably, but they represent different stages in a
process of understanding. The key difference is that data are raw, unprocessed facts, while
knowledge is the understanding and insight derived from processing and applying that data.

Data are raw facts. Information is data with context. Knowledge is information with
application.

Information is data that has been processed, organized, and given meaning. When you put
data into a relevant framework, it becomes information. This is where you can start to answer
basic questions about the data.
Data base = Data + Base

For example:

1. An online telephone directory uses a database to store data relating to people, phone
numbers, other contact details, etc.

2. An electricity service department uses a database for managing bills, handling faults
and related issues, etc.

3. A social network such as Facebook needs to store, manipulate and present data related to
followers, their friends, messages, member activities, advertisements and a lot more.
[Figure: a central DATABASE shared by departments: Finance, Economics, Accounting, R&D,
Sales, Marketing, Planning, Production]

A database has the following properties:


1. It is a representation of some aspect of the real world, that is, a collection of data
elements representing real-world information.
2. A database is logical, coherent and internally consistent.
3. A database is designed, built and populated with data for a specific purpose.

Features of data base approach:


Self-Describing Nature
A database is self-describing because it stores not only the data itself but also metadata,
which is data about the data. This metadata, stored in a system catalog or data dictionary,
defines the structure of the database, including table schemas, data types, and relationships.
This feature allows the DBMS and applications to interpret and use the data without needing
external documentation.
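
The self-describing idea can be tried out with a small sketch. It assumes SQLite (through Python's built-in sqlite3 module) as the DBMS; the student table and its columns are made up for illustration. Other systems expose their catalog differently.

```python
import sqlite3

# In-memory database; the "student" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's catalog: it stores the definition of every table.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk). We keep name and type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalog)   # table name plus the CREATE TABLE text stored as metadata
print(columns)   # column names and declared types read back from the catalog
```

The program never hard-codes the structure it prints; it reads the structure back from the database itself.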

Data Independence
This is a critical feature that separates the data's structure from the programs that access it.

• Physical data independence means you can change how data is physically stored
(e.g., changing file format or storage device) without affecting the application programs.
• Logical data independence means you can change the logical structure of the
database (e.g., adding a new column to a table) without forcing existing applications to
be rewritten.
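
A minimal sketch of logical data independence, assuming SQLite and a hypothetical student table: the "application" query names the columns it needs, so adding a new column does not change its result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

def student_names(connection):
    # Stand-in for an existing application: it lists its columns explicitly.
    return connection.execute("SELECT student_id, name FROM student").fetchall()

before = student_names(conn)

# Logical change to the schema: a new column is added to the table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

after = student_names(conn)
print(before == after)  # the existing query is unaffected by the new column
```

An application that used SELECT * and relied on column positions would be more fragile, which is one reason explicit column lists are usually recommended.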

Reduced Data Redundancy


Unlike file-based systems where the same data might be duplicated in multiple files, the
database approach aims to store each piece of data in a single, centralized location. This
reduces redundancy, which in turn helps ensure data consistency and accuracy.

Data Integrity and Consistency


A DBMS enforces rules, or integrity constraints, to ensure the data is accurate and
consistent. These rules can include ensuring a field is not empty, a value is within a specific
range, or that a record is linked to a valid record in another table (e.g., using primary and
foreign keys).

Support for Multiple Views


A single physical database can provide multiple views of the data. Each view is a custom,
virtual table tailored to the specific needs of a particular user or group of users. This feature
simplifies access, enhances security by hiding sensitive data, and allows different users to
interact with the database in ways that are relevant to their tasks.
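
As an illustration, the sketch below builds a view that hides a salary column. It assumes SQLite; the employee table, its data and the view name employee_directory are hypothetical. SQLite has no user accounts, so actually restricting who may query the base table depends on the privilege system of the DBMS in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ravi", "Sales", 52000.0), (2, "Meena", "R&D", 61000.0)],
)

# A view is a stored query; this one exposes every column except salary.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [d[0] for d in cursor.description]  # column names seen through the view
directory = cursor.fetchall()

print(view_columns)
print(directory)
```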

Data Sharing and Multi-User Access


A core feature of a DBMS is its ability to allow multiple users to access and modify data
concurrently without compromising data integrity. It uses sophisticated concurrency control
mechanisms to manage simultaneous transactions and prevent issues like "lost updates" or
"dirty reads."

Security and Backup/Recovery


The database approach includes built-in security features, such as user authentication and
access controls, to restrict who can see or modify specific data. Additionally, a DBMS provides
robust backup and recovery facilities to protect against data loss due to system failures,
hardware crashes, or other disasters.
Pros and cons of database:

Advantages:

• Data Integrity: Enforces rules to maintain data accuracy.
• Data Security: Offers granular access control and user authentication.
• Reduced Redundancy: Eliminates data duplication, saving space and improving
consistency.
• Data Sharing: Allows multiple users to access and modify data concurrently.
• Backup/Recovery: Provides automated tools to protect against data loss.
• Data Independence: The ability to change storage methods without affecting
applications.

Disadvantages:

• High Cost: Requires significant investment in software, hardware, and staff.
• Complexity: Requires specialized skills for design and management.
• Performance Overhead: Can be slower than custom file systems due to its robust features.
• Single Point of Failure: A system crash can impact all applications.
• Vendor Dependence: Can lead to being locked into a single database provider.
• Setup Time: Initial database design and implementation can be a lengthy process.

Q) ER Analysis
A) Introduction:
ER analysis uses the entity-relationship (ER) model, which describes the structure
of a database with the help of a diagram called an Entity-Relationship diagram.

ER diagrams are a visual tool that helps represent the ER model. The model was proposed by
Peter Chen in 1976 to create a uniform convention that can be used for relational databases
and networks, and it is intended to be used as a conceptual modelling approach.

Rectangles: represent entity types.

Ellipses: represent attributes.

Diamonds: represent relationship types.

Lines: link attributes to entity types and entity types to relationship types.

Primary key: key attributes are underlined.

Double ellipses: multi-valued attributes.

Double rectangles: weak entity sets.

Dashed ellipses: derived attributes.

Double lines: total participation of an entity in a relationship set.

[Figure: legend of ER diagram symbols: Relationship, Weak Relationship, Entity or Strong
Entity, Attribute, Multivalued Attribute, Weak Entity]

An ER diagram has three main components: (diagram should also be drawn)

A. Entity
B. Attribute
C. Relationship
Attributes and domains:
Attributes are very important because they help to describe the entities in a database.
Understanding Attributes and Domains
In the context of database management, an attribute is a specific piece of information that
describes an entity. Think of it as a column in a table, representing a characteristic of the items
in that table. For example, in a table of students, attributes could include 'Student ID', 'First
Name', 'Last Name', and 'Date of Birth'.

A domain defines the set of permissible values for an attribute. It establishes the rules and
constraints for the data that can be stored in that attribute, ensuring data integrity and
consistency. The domain specifies the data type (like integer, string, or date), length, and any
other restrictions. For instance, the domain for the 'Date of Birth' attribute might be restricted
to valid dates in the past, while the domain for 'Student ID' might require a unique, 5-digit
number.

❖ Simple attributes
❖ Composite attributes
❖ Single valued attribute
❖ Multi-valued attributes
❖ Derived attributes
❖ Key attributes
❖ Complex attribute
❖ Stored attribute

Integrity constraints:
Integrity constraints are a set of rules used in a database to ensure that data is accurate,
consistent, and reliable. They prevent accidental damage to the database by restricting the
type of data that can be inserted, updated, or deleted. Think of them as the quality control
checks for your data, making sure it follows the logical structure and business rules you've
defined.

There are four primary types of integrity constraints:

1. Domain Constraints
Domain constraints specify the valid set of values for a particular attribute (a column in a
table). They enforce rules about the data type, size, and format of the data. For example, a
domain constraint might ensure that a 'price' column only accepts positive numerical values,
or that a 'date' column only contains valid date formats.
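
A domain constraint of the 'price' kind can be sketched with a CHECK clause. The example assumes SQLite; the product table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " product_id INTEGER PRIMARY KEY,"
    " price REAL NOT NULL CHECK (price > 0))"  # domain: positive numbers only
)

conn.execute("INSERT INTO product VALUES (1, 9.99)")  # inside the domain

try:
    conn.execute("INSERT INTO product VALUES (2, -5.0)")  # outside the domain
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(rejected, row_count)
```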

2. Entity Integrity Constraints


This constraint guarantees that the primary key of a table is unique and not null. The primary
key is used to identify each record in a table, so if it were null or duplicated, you wouldn't be
able to distinguish between different records. This rule is crucial for maintaining the
uniqueness of each entity in the database.
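
Entity integrity can be observed with a small experiment. This sketch assumes SQLite and a hypothetical student table. NOT NULL is written out explicitly because SQLite, unlike most systems, does not imply it for every kind of primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id TEXT NOT NULL PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES ('S001', 'John Smith')")

def try_insert(student_id, name):
    # Report whether the DBMS accepts or rejects the new row.
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (student_id, name))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

duplicate_key = try_insert("S001", "Jane Doe")  # same key as an existing row
null_key = try_insert(None, "Jane Doe")         # missing key
new_key = try_insert("S002", "Jane Doe")        # unique, non-null key

print(duplicate_key, null_key, new_key)
```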

3. Referential Integrity Constraints


Referential integrity is a rule that applies to relationships between tables. It ensures that a
foreign key in one table (the child table) must either be null or match a valid value in the
primary key of another table (the parent table). This prevents "orphaned" records, where a
record in one table refers to a non-existent record in another. For example, if you have an
'Orders' table that references a 'Customers' table, a referential integrity constraint would
prevent you from adding an order for a customer who doesn't exist.
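
The Orders/Customers example can be sketched as follows, assuming SQLite (where foreign key checks must be switched on per connection). The column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customers (customer_id))"
)

conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO orders VALUES (100, 1)")     # matches an existing customer
conn.execute("INSERT INTO orders VALUES (101, NULL)")  # a null foreign key is allowed

try:
    conn.execute("INSERT INTO orders VALUES (102, 999)")  # customer 999 does not exist
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(orphan_rejected, order_count)
```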

4. Key Constraints
Key constraints enforce the uniqueness of data within a table. This is a broader category that
includes the primary key and other unique keys. For example, a 'Students' table might have
'StudentID' as its primary key, but a 'Social Security Number' or 'Email Address' could also be
defined as unique keys to ensure that no two students have the same value for those
attributes.

Keys:

Super Key
A super key is any set of attributes (columns) that can uniquely identify a row in a table. It's the
most general type of key. A super key can contain extra, unnecessary attributes as long as the
unique identification property is maintained. For instance, if StudentID is a primary key, then
combinations like (StudentID, StudentName) or (StudentID, DateOfBirth) are also
super keys, even though StudentName and DateOfBirth aren't needed for unique
identification.
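
The definition can be tested on sample data with a few lines of Python. The rows below are invented, and the function name is_super_key is hypothetical. Note that a check on sample rows can only show that a set of attributes is not a super key; whether it really is one is a property of the schema and its rules, not of one data set.

```python
# Invented sample rows for a Students table.
rows = [
    {"StudentID": 101, "StudentName": "John Smith", "DateOfBirth": "2004-05-01"},
    {"StudentID": 102, "StudentName": "Jane Doe", "DateOfBirth": "2004-05-01"},
    {"StudentID": 103, "StudentName": "John Smith", "DateOfBirth": "2003-11-17"},
]

def is_super_key(rows, attributes):
    """Return True if no two rows share the same values for the given attributes."""
    seen = {tuple(row[a] for a in attributes) for row in rows}
    return len(seen) == len(rows)

print(is_super_key(rows, ["StudentID"]))                 # the key itself
print(is_super_key(rows, ["StudentID", "StudentName"]))  # key plus an extra attribute
print(is_super_key(rows, ["StudentName"]))               # two students share a name
```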

Surrogate Key
A surrogate key is a special kind of primary key. It's an artificial, system-generated attribute
with no inherent business meaning. These keys are typically auto-incrementing integers,
GUIDs (Globally Unique Identifiers), or randomly generated numbers. The main
advantage of a surrogate key is that it's simple, reliable, and never changes, even if the data it
represents does. For example, instead of using a complex combination of attributes like
(FirstName, LastName, DateOfBirth) as a primary key, you could simply use a surrogate
key named UserID that automatically assigns a new number to each new user. This approach
is often preferred over using "natural" keys (like social security numbers or email addresses)
that might change or have business meaning.
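
A sketch of the UserID idea, assuming SQLite, where an INTEGER PRIMARY KEY column is filled in automatically when it is left out of the INSERT. The users table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " user_id INTEGER PRIMARY KEY,"  # surrogate key, assigned by the DBMS
    " first_name TEXT, last_name TEXT, email TEXT)"
)

insert = "INSERT INTO users (first_name, last_name, email) VALUES (?, ?, ?)"
first = conn.execute(insert, ("John", "Smith", "john@example.com")).lastrowid
second = conn.execute(insert, ("Jane", "Doe", "jane@example.com")).lastrowid

# The business data changes, but the surrogate key stays the same.
conn.execute("UPDATE users SET email = 'j.smith@example.com' WHERE user_id = ?", (first,))

ids = [row[0] for row in conn.execute("SELECT user_id FROM users ORDER BY user_id")]
print(first, second, ids)
```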

Aspect | Primary Key | Foreign Key
Main Job | Uniquely identifies each row in its own table. | Connects a row in one table to a row in another.
Uniqueness | Must be unique. | Can have repeated values.
Empty Values | Can't be empty (null). | Can be empty (null).
Example | CustomerID in the Customers table. | CustomerID in the Orders table.
Integrity Rule | Enforces entity integrity (uniqueness and non-null values). | Enforces referential integrity (linking to a valid primary key).
Data Values | Must be unique within the table and cannot be null. | Can have duplicate values and can be null if it's not a required relationship.
Index Usage | Usually gets an index automatically (clustered or non-clustered, depending on the DBMS), which speeds up data retrieval. | Often given a non-clustered index to improve join performance; some systems create it automatically, others leave it to the designer.
Q) Data normalisation 1NF,2NF,3NF,4NF,5NF:
A) Data normalization is a systematic process of organizing a
database to reduce data redundancy and improve data
integrity. The goal is to design a logical, flexible database that
can handle changes without errors or inconsistencies. The
process involves decomposing tables into smaller, more
organized tables and defining relationships between them.
The process of normalization is typically carried out in a series of steps called normal forms.
The most common normal forms are 1NF, 2NF, and 3NF.

First Normal Form (1NF)


A table is in 1NF if it meets two primary conditions:

1. Atomicity: Each column contains a single, indivisible value. There are no repeating
groups of data within a single cell. For example, a single cell in a 'phone number'
column should not contain multiple phone numbers.
2. Unique Rows: Each row is unique, typically ensured by a primary key.

Second Normal Form (2NF)


A table is in 2NF if it's already in 1NF and all non-key attributes are fully dependent on the
primary key. This applies only to tables with a composite primary key (a key made up of two or
more columns). If a non-key attribute depends on only part of the composite key, it must be
moved to a new table.

Third Normal Form (3NF)


A table is in 3NF if it's already in 2NF and all non-key attributes are non-transitively dependent
on the primary key. This means that a non-key attribute should not depend on another non-key
attribute. For example, if a Product table has a ManufacturerID and a
ManufacturerAddress, and the ManufacturerAddress only depends on the
ManufacturerID, this is a transitive dependency. To be in 3NF, the ManufacturerAddress
should be moved to a separate Manufacturer table.

Why is Normalization Important?

• Reduces Data Redundancy: Storing data in multiple places wastes space and can
lead to inconsistencies. For example, if a customer's address is stored in two different
tables and only one is updated, the data becomes inconsistent. Normalization prevents
this by storing data in a single, dedicated location.
• Improves Data Integrity: By removing dependencies and enforcing rules,
normalization ensures that data is accurate and reliable.
• Enhances Database Flexibility: A normalized database is easier to modify and extend
without disrupting the existing structure. It allows you to add new data or entities
without having to change multiple tables.

Data normalization is a structured approach to organizing a database to reduce data
redundancy and improve data integrity. It's a series of rules, or normal forms, that tables must
meet. Here is a breakdown of the most common normal forms with tables and descriptions.

First Normal Form (1NF)


A table is in 1NF if it meets two basic conditions:

• Atomicity: Each column contains a single, atomic value. There are no repeating groups
of data within a single cell.
• Unique Rows: Each row is unique, typically ensured by a primary key.

Example:

Student Grades (UNF)

Student ID | Student Name | Course & Grade
101 | John Smith | History: A, Math: B
102 | Jane Doe | Science: A, English: B

To convert to 1NF:

Student Grades (1NF)

Student ID | Student Name | Course | Grade
101 | John Smith | History | A
101 | John Smith | Math | B
102 | Jane Doe | Science | A
102 | Jane Doe | English | B
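
The UNF to 1NF step above can be sketched in Python. The function name to_1nf is hypothetical, and the sketch assumes the "Course: Grade, Course: Grade" text format used in the example table.

```python
# Rows of the unnormalized table: one cell holds several course/grade pairs.
unf_rows = [
    (101, "John Smith", "History: A, Math: B"),
    (102, "Jane Doe", "Science: A, English: B"),
]

def to_1nf(rows):
    """Split the multi-valued cell so that every cell holds one atomic value."""
    result = []
    for student_id, name, courses in rows:
        for item in courses.split(","):
            course, grade = (part.strip() for part in item.split(":"))
            result.append((student_id, name, course, grade))
    return result

first_nf = to_1nf(unf_rows)
for row in first_nf:
    print(row)
```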

Second Normal Form (2NF)


A table is in 2NF if it's already in 1NF and all non-key attributes are fully dependent on the
primary key. This primarily applies to tables with a composite primary key (a key made up of
two or more columns).

Using the 1NF table above, the composite primary key is (Student ID, Course). The
Student Name depends only on Student ID, not the full key. To achieve 2NF, we split the
table.

Example (2NF):

Students

Student ID | Student Name
101 | John Smith
102 | Jane Doe

Enrollments

Student ID | Course | Grade
101 | History | A
101 | Math | B
102 | Science | A
102 | English | B

Third Normal Form (3NF)


A table is in 3NF if it's in 2NF and there are no transitive dependencies. A transitive
dependency is when a non-key attribute depends on another non-key attribute.

Example (Not 3NF):

Books

Book ID | Book Title | Author ID | Author Name
1001 | The Hitchhiker's Guide | 501 | Douglas Adams
1002 | The Stand | 502 | Stephen King
1003 | The Restaurant | 501 | Douglas Adams

Here, Author Name depends on Author ID, not Book ID. Author ID is a non-key attribute.
To get to 3NF, we create a separate table for authors.
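
A functional dependency such as Author ID -> Author Name can be checked mechanically against sample rows. The helper name holds is hypothetical. As a caveat, sample data can only disprove a dependency; that it always holds is a design decision.

```python
# The Books table from the example (not in 3NF).
books = [
    {"BookID": 1001, "Title": "The Hitchhiker's Guide", "AuthorID": 501, "AuthorName": "Douglas Adams"},
    {"BookID": 1002, "Title": "The Stand", "AuthorID": 502, "AuthorName": "Stephen King"},
    {"BookID": 1003, "Title": "The Restaurant", "AuthorID": 501, "AuthorName": "Douglas Adams"},
]

def holds(rows, determinant, dependent):
    """Check whether determinant -> dependent holds in the given rows."""
    mapping = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The same determinant value must always map to the same dependent value.
        if mapping.setdefault(key, value) != value:
            return False
    return True

# BookID -> AuthorID and AuthorID -> AuthorName together form the transitive chain.
print(holds(books, ["BookID"], ["AuthorID"]))
print(holds(books, ["AuthorID"], ["AuthorName"]))
print(holds(books, ["AuthorID"], ["BookID"]))  # author 501 wrote two books
```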

Example (3NF):

Books

Book ID | Book Title | Author ID
1001 | The Hitchhiker's Guide | 501
1002 | The Stand | 502
1003 | The Restaurant | 501

Authors

Author ID | Author Name
501 | Douglas Adams
502 | Stephen King

Boyce-Codd Normal Form (BCNF)


BCNF is a stricter version of 3NF. A table is in BCNF if for every functional dependency X -> Y,
X is a superkey. This addresses cases where 3NF fails, often with tables having multiple
overlapping candidate keys.

Fourth Normal Form (4NF)


A table is in 4NF if it's in BCNF and has no non-trivial multi-valued dependencies other than
those on a key. A multi-valued dependency occurs when one attribute determines a set of values
of another attribute, independently of the remaining attributes in the table.

Example (Not 4NF):

Student Interests

Student ID | Activity | Hobby
101 | Soccer | Reading
101 | Basketball | Reading

Here, a student can have multiple activities and multiple hobbies, but the two are not related.
To achieve 4NF, you separate these independent attributes.

Example (4NF):

Student Activities

Student ID | Activity
101 | Soccer
101 | Basketball

Student Hobbies

Student ID | Hobby
101 | Reading
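
The split into two tables loses no information if joining them on Student ID gives back the original rows. A small sketch of that check, using the example data above (the variable names are illustrative):

```python
# Original table: (student_id, activity, hobby).
student_interests = [
    (101, "Soccer", "Reading"),
    (101, "Basketball", "Reading"),
]

# Project onto the two independent facts; sets remove repeated rows.
student_activities = sorted({(s, a) for s, a, _ in student_interests})
student_hobbies = sorted({(s, h) for s, _, h in student_interests})

# Natural join of the two projections on the student id.
rejoined = sorted(
    (s1, a, h)
    for s1, a in student_activities
    for s2, h in student_hobbies
    if s1 == s2
)

lossless = rejoined == sorted(student_interests)
print(student_activities, student_hobbies, lossless)
```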

Fifth Normal Form (5NF)


A table is in 5NF if it's in 4NF and every non-trivial join dependency is implied by its
candidate keys. A join dependency exists when a table can be decomposed into smaller tables
that can be joined back together to create the original table without any loss of
information. 5NF is rarely encountered in practical database design because these
dependencies are uncommon.
5 marks

Components of DBMS:

• HARDWARE

• SOFTWARE

• DATA

• PROCEDURES

• DATABASE ACCESS LANGUAGE

[Figure: components of a DBMS: Software, Hardware, Data, Data Access Language, Procedures,
User]

Hardware refers to the physical, tangible parts of a computer system that you can touch. This
includes the electronic components and devices that make up the computer. Examples are
the keyboard, monitor, mouse, hard drive, and the central processing unit (CPU).
Software is the set of instructions and programs that tell the hardware what to do. It's the
intangible part of a computer system. There are two main types:

• System software manages and controls computer hardware, like the operating system
(e.g., Windows, macOS).
• Application software performs specific tasks for the user, such as a word processor or
a web browser.

Data is the raw, unorganized facts and figures that a computer processes. It's the information
that the software and hardware manipulate. This can be anything from text, numbers, images,
or sound files. For example, the name of a customer, their address, and their phone number
are all pieces of data.

Procedures are the rules and guidelines that govern how a computer system and its data are
used. They are the instructions for the users and operators. This can include how to log in, how
to back up a file, or how to handle a system error.

A data access language is a specialized language used to interact with a database. It allows
users and applications to retrieve, insert, update, and delete data. The most common example
is SQL (Structured Query Language), which is the standard language for managing relational
databases.
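
The four basic operations can be sketched with SQL statements run through Python's built-in sqlite3 module. The customer table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Define the structure first (data definition).
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")

# Insert
conn.execute("INSERT INTO customer VALUES (1, 'Asha', '555-0100')")
conn.execute("INSERT INTO customer VALUES (2, 'Ravi', '555-0101')")

# Update
conn.execute("UPDATE customer SET phone = '555-0199' WHERE customer_id = 1")

# Delete
conn.execute("DELETE FROM customer WHERE customer_id = 2")

# Retrieve
remaining = conn.execute("SELECT customer_id, name, phone FROM customer").fetchall()
print(remaining)
```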

Q) data models and types:


A data model is a blueprint or a visual representation of how data is structured, organized, and
connected within a database or information system. It defines the data elements, their
attributes, and the relationships between them. Data models are essential for designing,
building, and managing databases effectively, ensuring data integrity and consistency.

Data models are frameworks that define how data is structured and stored in a database. They
can be categorized into three main types based on their level of abstraction: object-based,
record-based, and physical models.

Object-Based Data Models


Object-based data models apply the principles of object-oriented programming to database
design. Data is stored in the form of objects which are real-world entities that contain both
data (attributes) and behavior (methods). This model is more flexible and can handle complex
data types like multimedia content more naturally than traditional models. The key
characteristics are:
• Encapsulation: Data and the methods that operate on that data are bundled together
in a single object.
• Inheritance: Objects can inherit attributes and methods from a parent class,
promoting code reuse.
• Polymorphism: The same method can be used by different objects to produce different
results.
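
The three characteristics can be illustrated with plain Python classes. This is only a sketch of the object-oriented ideas, not of any particular object database product; the Media, Image and Audio classes are made up.

```python
class Media:
    def __init__(self, title):
        self._title = title  # encapsulation: data lives inside the object

    def describe(self):
        return f"Media: {self._title}"

class Image(Media):  # inheritance: Image reuses what Media defines
    def __init__(self, title, width, height):
        super().__init__(title)
        self._width = width
        self._height = height

    def describe(self):  # polymorphism: same method name, different result
        return f"Image: {self._title} ({self._width}x{self._height})"

class Audio(Media):
    def __init__(self, title, seconds):
        super().__init__(title)
        self._seconds = seconds

    def describe(self):
        return f"Audio: {self._title} ({self._seconds}s)"

objects = [Image("Logo", 640, 480), Audio("Jingle", 30)]
descriptions = [obj.describe() for obj in objects]
print(descriptions)
```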

Record-Based Data Models


Record-based data models are used to structure data into a fixed number of fields or
attributes within a record. These models are popular for their simplicity and are the foundation
of most traditional database systems. The most common types of record-based models are:

• Relational Model: Data is organized into two-dimensional tables (called relations) with
rows and columns. Relationships are established through primary and foreign keys, and
data is manipulated using SQL. It's the most widely used data model today.
• Hierarchical Model: Data is organized in a tree-like structure with a parent-child
relationship. Each child record has only one parent, but a parent can have multiple
children. This model is rigid and was common in early database systems.
• Network Model: An extension of the hierarchical model, it allows a child record to have
multiple parents, forming a more complex, graph-like structure that can represent
many-to-many relationships.

Physical Data Models


Physical data models are the lowest-level and most detailed data models. They describe how
data is actually stored on the physical storage devices and are specific to a particular
database management system (DBMS). This model is used by database developers and
administrators to implement the database design. Key characteristics include:

• Platform-specific: The model is tied to a specific database technology (e.g., Oracle,
MySQL, SQL Server).
• Technical Details: It defines details like data types (e.g., VARCHAR, INTEGER), table
names, column names, indexes, and other low-level storage details.
• Implementation: This model provides the blueprint for generating the actual database
schema using a Data Definition Language (DDL) script.
q)

A) The Hierarchical Database Model represents one of the
earliest and most intuitive approaches to structuring data,
predating the widespread adoption of more flexible
paradigms. It laid critical groundwork for organized data
storage.
The foundational principle of this model is the parent-child relationship. In this structure, each
parent node can possess multiple child nodes, but, critically, each child node is linked
exclusively to one parent node. This strict one-to-many (1:M) relationship forms predictable,
top-down data access paths, originating from a single topmost node, known as the root node
(which has no parent), and extending downwards to various leaf nodes. Data units within this
model are often referred to as "segments," with parent segments bundling related fields (e.g.,
a "Department" segment might bundle "EmployeeName" and "EmployeeID" fields for its child
"Employee" segments). Interestingly, in contrast to tree structures commonly found in
computer software algorithms, in this model, the children typically contain pointers or
references that "point to" their respective parents.
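
A minimal sketch of the parent-child structure in Python. The Segment class and the sample data are hypothetical; real hierarchical systems store segments and pointers in their own physical formats.

```python
class Segment:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # each segment points to at most one parent
        self.children = []    # a parent may have many children (1:M)
        if parent is not None:
            parent.children.append(self)

    def path_from_root(self):
        # Follow the parent pointers upwards, then reverse to get a top-down path.
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))

root = Segment("Company")  # the root node has no parent
dept = Segment("Department: Sales", parent=root)
emp = Segment("Employee: Ravi", parent=dept)

print(emp.path_from_root())
```

Because every child has exactly one parent, there is exactly one path from the root to any segment, which is what makes access paths predictable.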

Network

The Network Database Model emerged as an evolution of the hierarchical model, specifically
designed to address its primary limitation: the inability to represent complex, many-to-many
relationships directly.

Relational

The Relational Database Model stands as the most dominant and influential database
paradigm of the past several decades, fundamentally reshaping how data is managed and
accessed. Its principles form the backbone of countless information systems worldwide.

Aspect | Hierarchical Model | Network Model | Relational Model
Data Structure | Tree-like (parent-child) | Graph-like (nodes-edges) | Tabular (rows and columns)
Foundation | Physical pointers, implicit hierarchy | Physical pointers, explicit sets | Mathematical set theory, logical relations

Relationship Type | Hierarchical Model (Mechanism & Example) | Network Model (Mechanism & Example) | Relational Model (Mechanism & Example)
One-to-One (1:1) | Possible (via 1:M constraint) | Supported | Supported (via Primary/Foreign Keys)
One-to-Many (1:M) | Primary support (Parent-Child) e.g., Department → Employees | Supported (Owner-Member sets) e.g., Course → Students | Supported (Foreign Key) e.g., Department → Employees
Many-to-Many (M:M) | Difficult, requires data duplication or complex design e.g., Student ↔ Course | Directly supported (Multiple Owners) e.g., Student ↔ Course | Supported (Junction Table) e.g., Student ↔ Course through Enrollment table
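
The relational approach to many-to-many relationships (a junction table) can be sketched as follows, assuming SQLite. The table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: one row per (student, course) pair.
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student (student_id),
        course_id INTEGER REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'John Smith'), (2, 'Jane Doe');
    INSERT INTO course VALUES (10, 'History'), (20, 'Math');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One course has many students, and one student has many courses.
history_students = [row[0] for row in conn.execute("""
    SELECT s.name FROM student s
    JOIN enrollment e ON e.student_id = s.student_id
    JOIN course c ON c.course_id = e.course_id
    WHERE c.title = 'History'
    ORDER BY s.name
""")]

john_courses = conn.execute(
    "SELECT COUNT(*) FROM enrollment WHERE student_id = 1"
).fetchone()[0]

print(history_students, john_courses)
```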

Hierarchical Model: This model is best suited for simple, stable data structures that
inherently exhibit clear, one-to-many parent-child relationships, such as file systems,
organizational charts, or product inventory categories. It excels in scenarios requiring
predictable, high-performance top-down data retrieval, particularly in specialized, high-
availability applications like certain banking, healthcare, and telecommunications systems.
Network Model: Representing an evolutionary step, the network model was designed to
overcome the hierarchical model's limitations by allowing for more complex many-to-many
relationships with explicit links between records. It offered a more natural representation of
interconnected data than its predecessor. However, its operational complexity, reliance on
explicit pointer management, and procedural querying limited its long-term viability in the face
of more abstract and flexible alternatives.

Relational Model: The relational model stands as the most versatile and widely adopted
paradigm for structured data management. It offers robust data integrity through ACID
properties and normalization, provides high data independence, and enables powerful, easy-
to-use declarative querying via SQL. It is the ideal choice for the vast majority of transactional
systems, including e-commerce, Enterprise Resource Planning (ERP), Customer Relationship
Management (CRM), and financial applications, where consistency, flexibility, and a mature
ecosystem are paramount.
