Study Material
(2017-18)
https://mguugcs.blogspot.in
B.SC-Computer Science-III Year DBMS Study Material
II. Improved data security: Access by many users puts data security at risk. The DBMS therefore provides a framework for better enforcement of data privacy and security policies.
III. Better data integration: The DBMS provides an integrated view of the organization's operations.
IV. Minimized data inconsistency: Data inconsistency exists when different versions of the same data appear in different places. The DBMS greatly reduces the probability of data inconsistency.
V. Improved data access: The DBMS makes it possible to produce quick answers to queries. A query is a request issued to the DBMS, for example, to read or update the data.
VI. Improved decision making: Better-managed data provides better-quality information and knowledge, which supports better decisions.
VII. Increased end-user productivity: The availability of data and information empowers end users to be more productive.
****

Database Systems: Disadvantages
Database systems have the following disadvantages:
Increased costs: Training, licensing, and regulatory compliance costs are high.
Management complexity: Managing changes in the database system is a complex process.
Maintaining currency: The database system must be kept current.
Frequent upgrade/replacement cycle: Changes in the DBMS may require hardware upgrades too.
***

Types of Databases
Databases can be classified according to:
The number of users
The database location, and
The type of usage.

According to the number of users:
According to the number of users, databases are classified as single-user or multiuser.
I) A single-user database supports only one user at a time. If user A is using a single-user database, then users B and C must wait until user A completes the work. A single-user database that runs on a personal computer is called a desktop database.
II) A multiuser database supports multiple users at the same time. When the database is used by a specific department within an organization, it is called a workgroup database. When the database is used by the entire organization, it is known as an enterprise database.

According to the location:
According to the location, databases are classified as centralized or distributed.
I) Centralized database: A database that supports data located at a single site is called a centralized database.
II) Distributed database: A database that supports data distributed across several sites is called a distributed database.

According to the type of usage:
According to the type of usage, databases are classified as operational databases or data warehouses.
I) Operational database: A database that is designed to support a company's day-to-day operations is known as an operational database (also called a transactional or production database).
II) Data warehouse: A database that focuses primarily on storing large amounts of data used to generate the information required for decision support.
***
Why is database design important?
Database design refers to the activities that design the database structure. A database's structure must be designed carefully; otherwise even a good DBMS will perform poorly with a badly designed database.
A database designer should keep in mind that a transactional database, a data warehouse, a centralized database, and a distributed database each require a different design approach.
A well-designed database facilitates better data management. A poorly designed database may lead to bad decision making, and bad decision making can lead to the failure of an organization. This is why database design is very important.
***

Files and File Systems
Basic File Terminology:
Field: A character or group of characters that has a specific meaning. A field is used to define and store data.
Record: A collection of related fields is known as a record. It describes a person, place, or thing.
File: A collection of related records is known as a file.
For example, a file might contain the records for the students currently enrolled at Mahatma Gandhi University.
***

Problems with File System Data Management
The following are the problems associated with file systems:
I) Lengthy development times:
In a file system approach even the simplest data-retrieval task requires extensive programming. Programmers need to specify what must be done and how to do it.
II) Difficulty of getting quick answers:
In a file system, programs must be written to produce even the simplest report. It does not support ad hoc queries.
III) Complex system administration:
As the number of files in the system grows, system administration becomes very difficult. Even a simple file system requires several file management programs.
IV) Lack of security and limited data sharing:
A file system lacks security features and offers limited data sharing. Sharing data among multiple users introduces a lot of security risks.
V) Extensive programming:
In a file system environment it is difficult to make changes. Even simple file system management requires several programs.
VI) Structural and data dependence:
In a file system, access to a file is dependent on its structure. For example, to add a new field to a file, all of the file system programs that use that file must be modified.
VII) Data redundancy:
With a file system, the same data might be stored in different locations, which leads to data redundancy. Uncontrolled data redundancy leads to poor data security and data inconsistency.
VIII) Lack of design and data-modeling skills:
Data-modeling skills are a very important part of the design process. File systems do not have design and data-modeling features.
****

Database Systems

Database System Components
A group of components that define and control the collection, storage, management, and use of data within a database environment is known as a database system.

A database system has five major parts:
1) Hardware: Hardware refers to all of the system's physical devices. It includes computers, storage devices, printers, network devices, and other devices.
2) Software: Database systems require three types of software:
I. Operating system software
II. DBMS software, and
III. Application programs and utilities.
Operating system software: It manages all hardware components and runs all other software.
Examples: Microsoft Windows, Linux, Mac OS, UNIX, and MVS.
DBMS software: It manages the database within the database system.
Examples: Oracle Corporation's Oracle, Microsoft's SQL Server, Sun's MySQL, and IBM's DB2.
Application programs and utility software: These are used to access and manipulate data in the DBMS. Utilities are the software tools used to manage the database system's computer components.

3) People: People include all users of the database system. There are five types of users in a database system:
I. System Administrators
II. Database Administrators
III. Database Designers
IV. System Analysts and Programmers, and
V. End Users.
4) Procedures: Procedures are the instructions and rules that govern the design and use of the database system. Procedures play an important role in a company because they enforce the standards within the organization.
5) Data: Data are the collection of facts stored in the database. Data are the raw material from which information is generated.
***

DBMS Functions
A DBMS performs several important functions. They include:
1. Data dictionary management
2. Data storage management
3. Data transformation and presentation
4. Security management
5. Multi-user access control
6. Backup and recovery management
7. Data integrity management
8. Database access languages and application programming interfaces
Data dictionary management: The DBMS stores metadata in a data dictionary. The DBMS uses the data dictionary to manage data component structures and relationships.
Data integrity management: The DBMS provides integrity rules to minimize data redundancy and maximize data consistency.

Database access languages and application programming interfaces: The DBMS provides data access through Structured Query Language (SQL). The DBMS also provides application programming interfaces to languages such as COBOL, C, and Java.
***

Data Models

The importance of data models
A data model is a graphical representation of real-world data structures. Data models represent data structures and their characteristics, relations, and constraints.
Importance of Data Models:
Data models are a communication tool that can facilitate interaction among the designer, the applications programmer, and the end user. A well-developed data model provides a better understanding of the organization. A database should be designed so that it can support all categories of users in an organization. So, a proper data model is necessary to create a good database.
The basic data-modeling components are entities, attributes, relationships, and constraints.

Explain the basic building blocks of data models.
The basic building blocks of all data models are:
Entities
Attributes
Relationships, and
Constraints

An entity is anything about which data are to be collected and stored.
For example: a person, a place, or a thing.

An attribute is a characteristic of an entity. For example, a customer may be described by attributes such as customer phone, customer address, and customer credit limit.

A relationship is an association among entities.
For example: an agent can serve many customers, and each customer may be served by one agent.

Data models use three types of relationships: one-to-many [1:M], many-to-many [M:N], and one-to-one [1:1].
One-to-many (1:M) relationship. A painter paints many different paintings, but each one of them is painted by only one painter. Thus, the painter (the "one") is related to the paintings (the "many"). Therefore, the relationship "PAINTER paints PAINTING" is 1:M.
Many-to-many (M:N) relationship. An employee may learn many job skills, and each job skill may be learned by many employees. Therefore, the relationship "EMPLOYEE learns SKILL" is M:N.
One-to-one (1:1 or 1..1) relationship. Each store of a retail company is managed by a single employee. In turn, each store manager, who is an employee, manages only a single store. Therefore, the relationship "EMPLOYEE manages STORE" is 1:1.

A constraint is a restriction placed on the data. Constraints are normally expressed in the form of rules. For example:
An employee's salary must be between 6,000 and 350,000.
A student's GPA must be between 0.00 and 4.00.
Each class must have one and only one teacher.
***

Business Rules
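The three constraint rules quoted in the Data Models section (salary range, GPA range, one teacher per class) can be tried out in a few lines. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than Oracle; the table names, column names, and the try_insert helper are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical tables; each rule from the notes becomes a declarative constraint.
conn.executescript("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    salary NUMERIC NOT NULL CHECK (salary BETWEEN 6000 AND 350000)
);
CREATE TABLE student (
    stu_id INTEGER PRIMARY KEY,
    gpa    REAL NOT NULL CHECK (gpa BETWEEN 0.00 AND 4.00)
);
CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY
);
-- "One and only one teacher": a single mandatory teacher column per class.
CREATE TABLE class (
    class_id   INTEGER PRIMARY KEY,
    teacher_id INTEGER NOT NULL REFERENCES teacher(teacher_id)
);
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this setup, a salary of 500 or a GPA of 4.5 is rejected by the CHECK constraints, and a class row without a valid teacher is rejected by the NOT NULL and foreign key constraints.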
For example: SKILLS is a multivalued attribute, because a person may have many skills.

Derived Attribute:
A derived attribute is an attribute whose value is calculated (derived) from other attributes.
For example:
An employee's age, EMP_AGE, may be calculated from the current date and the employee's date of birth, EMP_DOB.

The basic types of connectivity for relationships are:
• One to Many (1:M)
• Many to Many (M:N)
• One to One (1:1).
One-to-many (1:M) relationship. A painter paints many different paintings, but each one of them is painted by only one painter. Thus, the painter (the "one") is related to the paintings (the "many"). Therefore, the relationship "PAINTER paints PAINTING" is 1:M.
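In a relational database, a 1:M relationship such as "PAINTER paints PAINTING" is usually implemented by placing a foreign key on the "many" side. The sketch below assumes Python's built-in sqlite3 module; the column names and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# The "many" side (painting) carries a foreign key to the "one" side (painter).
conn.executescript("""
CREATE TABLE painter (
    painter_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE painting (
    painting_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
);
INSERT INTO painter  VALUES (1, 'Painter A'), (2, 'Painter B');
INSERT INTO painting VALUES (10, 'Work 1', 1), (11, 'Work 2', 1), (12, 'Work 3', 2);
""")


def paintings_per_painter():
    """Count the paintings that belong to each painter."""
    return conn.execute("""
        SELECT p.name, COUNT(*)
        FROM painter p JOIN painting g ON g.painter_id = p.painter_id
        GROUP BY p.painter_id, p.name
        ORDER BY p.painter_id
    """).fetchall()


print(paintings_per_painter())
```

Each painting row names exactly one painter, while one painter may appear in many painting rows, which is what 1:M means.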
Relationship Degree
Unary Relationship
A unary relationship exists when an association is maintained within a single entity.
For example: in "EMPLOYEE manages EMPLOYEE", the manager is also an instance of the EMPLOYEE entity.
Ternary Relationship
A ternary relationship exists when three entities are associated. A ternary relationship implies an association among three different entities.

Here, the relationship between COURSE and CLASS is a weak relationship, because the CLASS primary key (PK) does not inherit a PK component from the COURSE entity. A weak relationship is also known as a non-identifying relationship.
***
Subtype Discriminator:
A subtype discriminator is the attribute in the supertype entity that determines to which subtype the supertype occurrence is related.

Disjoint and Overlapping Constraints
An entity supertype can have disjoint or overlapping entity subtypes.
Disjoint subtypes:
Disjoint subtypes are subtypes that contain a unique subset of the supertype entity set. Each entity instance of the supertype can appear in only one of the subtypes. For example, an employee who is a pilot can appear only in the PILOT subtype, not in any of the other subtypes. These are also known as non-overlapping subtypes. Disjoint subtypes can be indicated with the letter 'd'.
Overlapping subtypes:
Overlapping subtypes are subtypes that contain non-unique subsets of the supertype entity set. Each entity instance of the supertype may appear in more than one subtype. For example, in a university environment, a person may be an employee or a student or both. Overlapping subtypes can be indicated with the letter 'o'.
****

Entity Clustering
In practice, an ER diagram may contain hundreds of entity types and their relationships. This makes the ERD crowded, unreadable, and inefficient to work with. Entity clusters can solve this problem.
An entity cluster is a "virtual" entity type that is used to represent multiple entities and relationships in the ERD. It is a temporary entity whose only purpose is to simplify the ERD.
***

ENTITY INTEGRITY
SELECTING PRIMARY KEYS
The primary key is the most important characteristic of an entity. It uniquely identifies each entity instance. Therefore, it is essential to select the primary key properly.

Primary Key Guidelines:
A primary key is an attribute or combination of attributes. Its main function is to uniquely identify an entity instance or row within a table.
• As a determinant, it should determine all its dependents. It should guarantee entity integrity.
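Entity integrity means that every row has a primary key value that is unique and not null. A minimal sketch of how a DBMS enforces this follows, assuming Python's built-in sqlite3 module; the course table, its columns, and the add_course helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: crs_code is the primary key, so it must be unique and not null.
conn.execute("""
CREATE TABLE course (
    crs_code TEXT NOT NULL PRIMARY KEY,
    title    TEXT NOT NULL
)
""")
conn.execute("INSERT INTO course VALUES ('CS301', 'DBMS')")


def add_course(crs_code, title):
    """Return True if the row is stored, False if entity integrity would be violated."""
    try:
        conn.execute("INSERT INTO course VALUES (?, ?)", (crs_code, title))
        return True
    except sqlite3.IntegrityError:
        return False


print(add_course("CS302", "Networks"))   # new key value
print(add_course("CS301", "Duplicate"))  # key value already used
print(add_course(None, "No key"))        # missing key value
```

The first call is accepted; the second and third are rejected because they would break uniqueness and the not-null rule respectively.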
First Normal Form (1NF):
A table is in first normal form if it has no multivalued attributes or repeating groups. Converting a table into first normal form is a 3-step procedure:
1. Eliminate the repeating groups by eliminating nulls, so that each repeating-group attribute contains an appropriate data value.
2. Identify the primary key.
3. Identify all dependencies, such as partial and transitive dependencies.
After these 3 steps the DATAORG table is in 1NF.
[Table: DATAORG in 1NF]

Second Normal Form (2NF):
A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the whole primary key, not on just a part of it.
Note: If the primary key consists of a single attribute, a relation in 1NF is automatically in 2NF.

Activities
SID   Activity   Fee
100   Skiing     200
100   Golf        65
150   Swimming    50
175   Squash      50
175   Swimming    50
200   Swimming    50
200   Golf        65

Fee is not fully functionally dependent on the primary key (SID, Activity); it depends only on the Activity part of the primary key. Every time an Activity value repeats, the corresponding Fee value repeats.
To create 2NF relations from the Activities relation, divide it into two relations: Student_Activity (SID, Activity) and Activity_Fee (Activity, Fee).

Student_Activity
SID   Activity
100   Skiing
100   Golf
150   Swimming
175   Squash
175   Swimming
200   Swimming
200   Golf

Activity_Fee
Activity   Fee
Skiing     200
Swimming    50
Golf        65
Squash      50

A table is in 2NF if:
Each cell holds a single value
There are no repeating attributes
There are no multivalued attributes
Each non-key attribute is fully dependent on the primary key.

Third Normal Form (3NF)
A relation is in third normal form (3NF) if it is in second normal form and no transitive dependencies exist.

A transitive dependency in a relation is a functional dependency between the primary key and one or more non-key attributes that are dependent on the primary key via another non-key attribute.

For example, there are two transitive dependencies in the following CUSTOMER ORDER relation:
... set of B and C values for each A value, but those B and C values are independent of each other.

To remove the multivalued dependency from a relation, we divide the relation into two new relations. Each of these tables contains two attributes that have a multivalued relationship in the original relation.

Fifth Normal Form (5NF)
Fifth normal form is a higher-level normal form that deals with a property called "lossless joins."
5NF is rarely of practical significance, because the join dependencies it addresses occur very rarely and are difficult to detect.
The solution is to eliminate the problems caused by the multivalued dependency. We do this by creating new tables for the components of the multivalued dependency.

Domain-key normal form (DKNF)

Denormalization
Denormalization is the process of attempting to optimize the performance of a database by adding redundant data or by grouping data. In some cases, denormalization helps cover up the inefficiencies inherent in relational database software.
The following are some common denormalization examples:
1. Redundant Data: This occurs, for example, when we store both the ZIP and CITY attributes in the CUSTOMER table even though ZIP determines CITY.
This can be controlled by a program that validates the city against the zip code.
2. Derived Data: This occurs, for example, when we store both STU_HRS and STU_CLASS even though STU_HRS determines STU_CLASS.
This can be controlled by a program that validates the classification against the student hours.
3. Preaggregated Data: This occurs, for example, when we store the student grade point average (STU_GPA) as an aggregate value in the STUDENT table even though it can be calculated from the ENROLL and COURSE tables.
This can be controlled by updating STU_GPA via administrative routines.
4. Information Requirements: This occurs when we use a temporary denormalized table to hold report data.
The temporary table is deleted once the report is done.
***
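For the "derived data" case, the notes suggest a program that validates the stored classification against the student hours. A minimal Python sketch of such a check follows. The credit-hour thresholds and class labels are assumptions made for illustration only (the notes do not state them), and the function names are hypothetical.

```python
# Hypothetical thresholds: minimum credit hours for each classification.
# The source does not give these numbers; replace them with the real rule.
CLASS_THRESHOLDS = [
    (90, "Senior"),
    (60, "Junior"),
    (30, "Sophomore"),
    (0, "Freshman"),
]


def derive_class(stu_hrs):
    """Derive STU_CLASS from STU_HRS using the assumed thresholds."""
    for minimum, label in CLASS_THRESHOLDS:
        if stu_hrs >= minimum:
            return label
    raise ValueError("credit hours cannot be negative")


def find_inconsistent(rows):
    """rows: iterable of (stu_id, stu_hrs, stu_class).

    Return the ids whose stored STU_CLASS disagrees with the value
    derived from STU_HRS, i.e. rows where the redundant copy has drifted.
    """
    return [stu_id for stu_id, hrs, cls in rows if derive_class(hrs) != cls]


print(find_inconsistent([(1, 95, "Senior"), (2, 10, "Junior")]))
```

Running such a check periodically (or inside the update routine) is the price paid for keeping the redundant STU_CLASS column.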
Drop Command: The Drop command is used to permanently delete a table from the database. Dropping a table deletes both the table structure and the data in the table. Once a table is dropped, we cannot retrieve it, so we have to be cautious before using a drop command.
Syntax:
    Drop table tablename;
Example:
    Drop table Student;

Data Manipulation Language (DML)
Data manipulation commands are used to add data to a table, modify data in the table, delete data in the table, and display data in the table. The different data manipulation commands are: INSERT, SELECT, UPDATE, DELETE, COMMIT, and ROLLBACK.

Insert Command: This command is used to add data values into a table.
Syntax:
    Insert into tablename values(col1val, col2val, ... colNval);
Example:
    Insert into student values(101,'Ravi',755,'21-DEC-1985','Hyderabad');

Update Command: This command is used to modify the data values in a table.
1) Syntax:
    Update tablename SET columnname = value;
Example:
    Update student SET marks = 75;
Description: The above query sets the marks of all the students to 75.

2) Syntax:
    Update tablename SET columnname = value WHERE condition;
Example:
    Update student SET marks = 40 where marks = 30;
Description: The above query sets the marks to 40 only for those students whose marks are equal to 30, as specified in the where clause.

Delete Command: This command is used to delete the records/rows in a table.
1) Syntax:
    Delete from tablename;
Example:
    Delete from student;
Description: The above query will delete all the rows in the student table.
2) Syntax:
    Delete from tablename where condition;
Example:
    Delete from student where marks < 35;
Description: The above query will delete only the rows of students whose marks are less than 35.

Select Command: This command is used to retrieve and display the data in the database. The select command can display all rows and columns, or particular rows and particular columns.
Example:
    Select * from student;
Description: Retrieves and displays all the rows and columns of the student table.

Example:
    Select * from student where marks > 75;
Description: Displays all the columns, but only the rows of students whose marks are greater than 75.

Example:
    Select rno, dob from student;
Description: Displays only the columns rno and dob, for all rows.

Example:
    Select rno, dob from student where marks > 75;
Description: Displays only the columns rno and dob, and only the rows of students whose marks are greater than 75.
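The DML commands of this section can be tried end to end without an Oracle installation. The sketch below assumes Python's built-in sqlite3 module; the student table layout and the sample rows are made up for illustration, and dates are stored as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    rno   INTEGER PRIMARY KEY,
    name  TEXT,
    marks INTEGER,
    dob   TEXT,
    city  TEXT
)
""")

# INSERT: add three sample rows (values are illustrative only).
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Ravi", 80, "1985-12-21", "Hyderabad"),
        (102, "Sita", 30, "1986-03-02", "Nalgonda"),
        (103, "Kiran", 20, "1985-07-15", "Warangal"),
    ],
)

# UPDATE with WHERE: only the student with 30 marks is changed to 40.
conn.execute("UPDATE student SET marks = 40 WHERE marks = 30")

# DELETE with WHERE: only students below 35 marks are removed.
conn.execute("DELETE FROM student WHERE marks < 35")

# SELECT: particular columns for all remaining rows, then a filtered query.
remaining = conn.execute("SELECT rno, marks FROM student ORDER BY rno").fetchall()
toppers = conn.execute("SELECT rno, dob FROM student WHERE marks > 75").fetchall()

print(remaining)
print(toppers)
```

Student 102 survives the DELETE because the UPDATE ran first and raised the marks to 40; running the two statements in the opposite order would remove that row.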
Note: The GROUP BY clause is generally used when attribute columns are combined with aggregate functions in the SELECT statement. It is normally used in conjunction with one of the SQL aggregate functions, such as COUNT, MIN, MAX, AVG, and SUM.
Example:
    SELECT ENAME, SUM(SAL), DEPTNO FROM EMP GROUP BY DEPTNO;

The above code will generate a "not a GROUP BY expression" error, because ENAME is neither listed in the GROUP BY clause nor used inside an aggregate function. Every column in the SELECT list must either appear in the GROUP BY clause or be the argument of an aggregate function. The following form works properly:
    SELECT DEPTNO, SUM(SAL) FROM EMP GROUP BY DEPTNO;

The GROUP BY Feature's HAVING Clause
A particularly useful extension of the GROUP BY feature is the HAVING clause. The HAVING clause operates very much like the WHERE clause in the SELECT statement. However, the WHERE clause applies to columns and expressions for individual rows, while the HAVING clause is applied to the output of a GROUP BY operation.

Example:
    SELECT DEPTNO, SUM(SAL) FROM EMP
    GROUP BY DEPTNO
    HAVING SUM(SAL) > 2500
    ORDER BY SUM(SAL) DESC;

Virtual Tables
A view is a virtual table based on a SELECT query. The query can contain columns, computed columns, aliases, and aggregate functions from one or more tables. The tables on which the view is based are called base tables. A relational view has several special characteristics:
We can use the name of a view anywhere a table name is expected in a SQL statement.
Views are dynamically updated. That is, the view is re-created on demand each time it is invoked.
Views provide a level of security in the database because the view can restrict users to only specified columns and specified rows in a table.
Views may also be used as the basis for reports.

We can create a view by using the CREATE VIEW command:
Syntax:
    CREATE VIEW viewname AS SELECT query
Example:
    CREATE VIEW EMP_10 AS
    SELECT JOB, AVG(SAL) AS AVG_SAL FROM EMP
    GROUP BY JOB;
****

ADVANCED SQL
Relational Set Operators:
The set operators in SQL are set-oriented; that is, they operate over entire sets of rows and columns of the tables at once. SQL provides four different set operators:
1) UNION
2) UNION ALL
3) INTERSECT
4) MINUS

1) UNION: The UNION statement combines rows from two or more queries without including duplicate rows.
In other words, the UNION statement combines the output of two SELECT queries. (Remember that the SELECT statements must be union-compatible. That is, they must return the same number of attributes and similar data types.)
Syntax:
    query UNION query
Example:
    SELECT * FROM EMP1 UNION SELECT * FROM EMP2;

2) UNION ALL: The UNION ALL statement combines rows from two or more queries, including the duplicate rows.
Syntax:
    query UNION ALL query
Example:
    SELECT * FROM EMP1 UNION ALL SELECT * FROM EMP2;
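The effect of HAVING and the difference between UNION and UNION ALL are easiest to see on a few rows. The sketch below assumes Python's built-in sqlite3 module instead of Oracle; the table contents are made up. Note that SQLite is more permissive than Oracle about non-grouped columns, so the grouped query here selects only the grouping column and the aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (ename TEXT, sal INTEGER, deptno INTEGER);
INSERT INTO emp VALUES ('A', 1000, 10), ('B', 2000, 10),
                       ('C', 1500, 20), ('D', 4000, 30);

CREATE TABLE emp1 (ename TEXT);
CREATE TABLE emp2 (ename TEXT);
INSERT INTO emp1 VALUES ('A'), ('B');
INSERT INTO emp2 VALUES ('B'), ('C');
""")

# GROUP BY builds one row per department; HAVING then filters those groups.
dept_totals = conn.execute("""
    SELECT deptno, SUM(sal) FROM emp
    GROUP BY deptno
    HAVING SUM(sal) > 2500
    ORDER BY SUM(sal) DESC
""").fetchall()

# UNION removes duplicate rows; UNION ALL keeps them.
union_rows = conn.execute(
    "SELECT ename FROM emp1 UNION SELECT ename FROM emp2 ORDER BY ename"
).fetchall()
union_all_rows = conn.execute(
    "SELECT ename FROM emp1 UNION ALL SELECT ename FROM emp2 ORDER BY ename"
).fetchall()

print(dept_totals)
print(union_rows)
print(union_all_rows)
```

Department 20 (total 1500) is dropped by HAVING, and employee 'B', who appears in both emp1 and emp2, is listed once by UNION and twice by UNION ALL.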
        dbms_output.put_line('Hello');
    end if;
end;
/

Example
Display the salary of a specified employee, increased by 500 if the salary is more than 3000.
Declare
    sal number(9,2);
    num emp.empno%type;
Begin
    num := &num;
    Select salary into sal from emp where empno = num;
    If sal > 3000 then
        sal := sal + 500;
    end if;
    dbms_output.put_line('Salary : ' || sal);
End;
/

Example
Accept a number from the user and find out whether it is odd or even.
Declare
    num number;
Begin
    num := &num;
    if mod(num, 2) = 0 then
        dbms_output.put_line(num || ' is even');
    else
        dbms_output.put_line(num || ' is odd');
    end if;
End;
/
****
CURSORS
Oracle creates a special memory area, known as the context area, which is used for processing the records (rows) returned by an SQL statement.
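The idea of working through the rows returned by an SQL statement one at a time can be illustrated outside Oracle as well. The sketch below is only an analogy, assuming Python's built-in sqlite3 module: a Python DB-API cursor is not an Oracle PL/SQL cursor, but it plays a similar role. The emp table, its sample rows, and the raised_salaries function are hypothetical; the raise rule mirrors the salary example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (empno INTEGER PRIMARY KEY, salary REAL);
INSERT INTO emp VALUES (1, 2500), (2, 3200), (3, 4100);
""")


def raised_salaries(threshold=3000, raise_by=500):
    """Walk the result set row by row and apply the raise rule to each row."""
    cur = conn.cursor()                      # open a cursor on the connection
    cur.execute("SELECT empno, salary FROM emp ORDER BY empno")
    result = []
    for empno, salary in cur:                # fetch one row at a time
        if salary > threshold:
            salary += raise_by
        result.append((empno, salary))
    cur.close()                              # release the cursor
    return result


print(raised_salaries())
```

The open, fetch, and close steps in the function correspond loosely to the steps a PL/SQL program performs on an explicit cursor.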
Database Design
Information System:
An information system is composed of people, hardware, software, databases, application programs, and procedures. It provides for data collection, storage, and retrieval. It also helps to transform the data into useful information.
Databases are a part of an information system, so studying the information system is important for a successful database design. A carefully designed development process is essential for a good information system.
Detailed Systems Design:
Implementation:
During the implementation phase, the hardware, DBMS software, and application programs are installed, and the database design is implemented. This phase also covers coding, testing, and debugging. The system becomes fully operational at the end of this phase.
Maintenance
While operating the system, the end users may begin to request changes to it. Those changes generate 3 types of system maintenance activities:
Corrective maintenance: It is done in response to system errors.
Adaptive maintenance: It is done in response to changes in the business environment.
Perfective maintenance: It is done to enhance the system.
*****

Data Base Life Cycle (DBLC)
The process of designing, implementing, and evaluating the database in an information system is known as the Database Life Cycle (DBLC). The DBLC is a continuous (iterative) process. The DBLC contains six phases:
1. Database initial study
2. Database design
3. Implementation and loading
4. Testing and evaluation
5. Operation
6. Maintenance and evolution.
[Figure: phases of the DBLC]

1. The Database Initial Study:
The purpose of the database initial study is to:
Analyze the company situation.
Define problems and constraints.
Define objectives.
Define scope and boundaries.

2. Database design:
This phase focuses on the design of the database. During this phase, the designer:
Creates the conceptual design
Selects the DBMS software
Creates the logical design
Creates the physical design.

3. Implementation and Loading: During this phase, the following steps are carried out:
Install the DBMS: The DBMS may be installed on a server.
Create the database(s): The tables and other objects are created.
Load or convert the data: The data are loaded into the database tables.

4. Testing and Evaluation: During this step, the DBA tests and fine-tunes the database.

5. Operation:
In this phase the database is operational. The database, its administrators, its users, and its application programs together form a complete information system.

6. Maintenance and Evolution:
During this phase, the database administrator performs routine maintenance activities.
****
Concurrency Control with Locking Methods

A lock guarantees exclusive use of a data item to the current transaction.

Lock Granularity
Lock granularity indicates the level at which a lock is applied. The levels are:
Database Level Locks
Table Level Locks
Page Level Locks
Row Level Locks
Field (attribute) Level Locks

Database Level
In a database-level lock, the entire database is locked. It prevents other transactions from using any table in the database.
[Figure: database-level lock]

Table Level
In a table-level lock, the entire table is locked. It prevents other transactions from using any row in the table.
[Figure: table-level lock]

Page Level
In a page-level lock, an entire disk page is locked. It prevents other transactions from using any data stored on the locked page.
[Figure: page-level lock]

Row Level
A row-level lock allows transactions to access different rows of the same table at the same time.
[Figure: row-level lock]

Field Level
A field-level lock allows transactions to access different fields (attributes) within the same row. Field-level locking requires high overhead, so it is rarely implemented.
***

Lock Types
The DBMS may use different lock types, such as:
Binary locks
Shared/exclusive locks.

Binary Locks
A binary lock has only two states: locked (1) or unlocked (0). Every transaction requires a lock operation and an unlock operation for each data item it accesses. If a transaction locks a database object, no other transaction is allowed to use that object until it is unlocked.
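The behaviour of a binary lock can be sketched with a small lock table. This is a simplified, single-threaded illustration in Python, not how any particular DBMS implements locking; the class and method names are hypothetical.

```python
class BinaryLockTable:
    """Toy lock manager: each data item is either unlocked or locked by one transaction."""

    def __init__(self):
        self._owner = {}  # data item -> transaction currently holding the lock

    def lock(self, txn, item):
        """Return True if txn holds the lock afterwards, False if another transaction has it."""
        holder = self._owner.get(item)
        if holder is None:          # state 0 (unlocked): grant the lock
            self._owner[item] = txn
            return True
        return holder == txn        # state 1 (locked): only the holder may proceed

    def unlock(self, txn, item):
        """Release the lock; only the transaction that holds it may do so."""
        if self._owner.get(item) != txn:
            raise RuntimeError(f"{txn} does not hold the lock on {item}")
        del self._owner[item]


locks = BinaryLockTable()
print(locks.lock("T1", "X"))   # T1 locks X
print(locks.lock("T2", "X"))   # T2 must wait: X is locked
locks.unlock("T1", "X")
print(locks.lock("T2", "X"))   # now T2 can lock X
```

A real DBMS would put T2 into a wait queue instead of returning False, and it would lock at one of the granularity levels described above (row, page, table, and so on).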
Shared Locks:
A shared lock grants read access on the
object. A shared lock is issued when a
transaction wants to read data from the
database.
***
Deadlock
A database deadlock is caused when two or more transactions wait for each other to unlock data.
***
Two-Phase Locking Protocol

The two-phase locking protocol defines how transactions acquire and release locks. The protocol has two phases:
1. A growing phase
2. A shrinking phase

Growing phase:
In this phase a transaction acquires all required locks without unlocking any data. Once all locks have been acquired, the transaction is at its locked point.

Shrinking phase:
In this phase a transaction releases all locks and cannot obtain any new lock.

The two-phase locking protocol has the following rules:
Two transactions cannot have conflicting locks.
No unlock operation can precede a lock operation in the same transaction.
No data are affected until all locks are obtained.

Two-phase locking increases the transaction processing cost and might cause deadlocks.
***
Deadlocks
A deadlock occurs when two transactions wait indefinitely for each other to unlock data. For example, a deadlock occurs when two transactions, T1 and T2, exist in the following mode:
T1 = access data items X and Y
T2 = access data items Y and X
If T1 has locked X and T2 has locked Y, then T1 waits for Y and T2 waits for X, and neither transaction can continue.

The three basic techniques to control deadlocks are:

Deadlock prevention: A transaction with a new lock request is aborted when there is the possibility that a deadlock can occur.

Deadlock detection: The DBMS periodically tests the database for deadlocks. If a deadlock is found, then one of the transactions is aborted and the other transaction continues.

Deadlock avoidance: The transaction must obtain all of the locks it needs before it can be executed.
***
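The T1 and T2 deadlock example and the idea of deadlock detection can be sketched as follows. The notes do not say how the DBMS tests for deadlocks; this sketch assumes a wait-for graph (each transaction mapped to the transactions it is waiting for) and looks for a cycle. The function name find_deadlock is hypothetical.

```python
def find_deadlock(waits_for):
    """Return the transactions that form a waiting cycle, or None.

    waits_for maps each transaction to the set of transactions
    whose locks it is waiting for.
    """
    visiting = []   # transactions on the current search path
    done = set()    # transactions already checked, no cycle through them

    def visit(txn):
        if txn in visiting:
            # txn is reached again on the same path: a cycle exists
            return visiting[visiting.index(txn):]
        if txn in done:
            return None
        visiting.append(txn)
        for other in sorted(waits_for.get(txn, ())):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in sorted(waits_for):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


# T1 holds X and waits for Y (held by T2); T2 holds Y and waits for X
deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T1"}})
# T1 waits for T2, but T2 waits for nobody, so there is no deadlock
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
print(deadlocked, no_deadlock)
```

Once a cycle is found, the DBMS would abort one transaction from the returned list so that the others can continue.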
Database buffers are temporary storage areas in primary memory. They speed up disk operations, which saves a lot of time.

Database checkpoints are operations in which the DBMS writes all of its updated buffers to disk.

The recovery procedure uses the following steps:
1. Identify the last checkpoint in the transaction log.
2. For a transaction that started and was committed before the last checkpoint, nothing needs to be done.
3. For a transaction that was committed after the last checkpoint, the DBMS redoes the transaction.
4. For any transaction that had a ROLLBACK operation after the last checkpoint, nothing needs to be done, because the database was never updated.
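The checkpoint-based recovery rules in this part (nothing to do before the last checkpoint, redo after a later COMMIT, nothing to do after a ROLLBACK) can be sketched as below. The log format, a list of (operation, transaction) pairs, and the function name recovery_actions are assumptions made for illustration. The sketch also assumes that updates reach the database only after a COMMIT, which is why rolled-back transactions need no work.

```python
def recovery_actions(log):
    """Decide what to do for each finished transaction in the log."""
    # Find the position of the last checkpoint in the log
    last_checkpoint = max(
        (i for i, (op, _) in enumerate(log) if op == "CHECKPOINT"),
        default=-1,
    )
    actions = {}
    for position, (op, txn) in enumerate(log):
        if op == "COMMIT":
            # Committed after the checkpoint: redo; before it: nothing
            actions[txn] = "redo" if position > last_checkpoint else "nothing"
        elif op == "ROLLBACK":
            # Rolled back: the database was never updated
            actions[txn] = "nothing"
    return actions


log = [
    ("START", "T1"), ("COMMIT", "T1"),
    ("CHECKPOINT", None),
    ("START", "T2"), ("COMMIT", "T2"),
    ("START", "T3"), ("ROLLBACK", "T3"),
]
actions = recovery_actions(log)
print(actions)
```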
***
DDBMS Advantages and Disadvantages:
Advantages:
Data are located near the greatest demand site.
Disadvantages:
Complexity of management and control.
DDBMS Components
The DDBMS must include at least the following components:
Computer workstations that form the network system.
Network hardware and software: The network components allow all sites to interact and exchange data.
Communications media that carry the data from one workstation to another.
The Transaction Processor (TP): A software component that receives and processes the application’s data requests.
The Data Processor (DP): A software component that stores and retrieves data located at the site.
The following figure shows the DDBMS components:

Multiple-Site Processing, Single-Site Data (MPSD):
In a multiple-site processing, single-site data (MPSD) scenario, multiple processes run on different computers that share a single data repository. All data selection, search, and update functions take place at the workstation. The MPSD scenario requires a network file server. The following figure shows an MPSD scenario:
Suppose the EMPLOYEE data are distributed over three different locations: New Delhi employee data are stored in fragment E1, Mumbai employee data are stored in fragment E2, and Chennai employee data are stored in fragment E3, as shown in the figure:
Distributed Database Transparency Features
Distributed database design:
A distributed database design includes three new issues:
1. Data Fragmentation
2. Data Replication
3. Data Allocation

Data fragmentation: Dividing a database into two or more fragments is known as data fragmentation. It is of three types:
1. Horizontal fragmentation
2. Vertical fragmentation
3. Mixed fragmentation

In vertical fragmentation, each fragment has unique columns. The primary key column is common to all fragments. This is the equivalent of the PROJECT operation in relational algebra (a SELECT of specific columns in SQL).

The vertical fragmentation on CUSTOMER yields the following fragments:

CUST_SERVICES_V1
Cust_ID  Cust_Name  City       State
1001     ARUN       HYDERABAD  TELANGANA
1002     BHARATH    MUMBAI     MAHARASHTRA
1003     CHARITHA   WARANGAL   TELANGANA

CUST_COLLECTIONS_V2
Cust_ID  Loan_Amount  Due

CUST_SERVICES_M1
Cust_ID  Cust_Name  City  State

CUST_COLLECTIONS_M2
Cust_ID  Loan_Amount  Due
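The vertical fragmentation of CUSTOMER can be tried out with Python's built-in sqlite3 module, as a rough sketch. The table and column names come from the fragments above; the Loan_Amount and Due values are made-up sample figures, since the notes give no data for those columns. Both fragments sit in one in-memory database here, whereas in a distributed database they would be stored at different sites.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER (Cust_ID INTEGER PRIMARY KEY, Cust_Name TEXT, "
    "City TEXT, State TEXT, Loan_Amount REAL, Due REAL)"
)
# Loan_Amount and Due are hypothetical values used only for this sketch
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1001, "ARUN", "HYDERABAD", "TELANGANA", 5000.0, 1200.0),
        (1002, "BHARATH", "MUMBAI", "MAHARASHTRA", 8000.0, 0.0),
        (1003, "CHARITHA", "WARANGAL", "TELANGANA", 3000.0, 450.0),
    ],
)

# Each vertical fragment keeps the primary key plus its own columns
conn.execute(
    "CREATE TABLE CUST_SERVICES_V1 AS "
    "SELECT Cust_ID, Cust_Name, City, State FROM CUSTOMER"
)
conn.execute(
    "CREATE TABLE CUST_COLLECTIONS_V2 AS "
    "SELECT Cust_ID, Loan_Amount, Due FROM CUSTOMER"
)

# Joining the fragments on the common key rebuilds the original rows
rebuilt = conn.execute(
    "SELECT s.Cust_ID, s.Cust_Name, s.City, s.State, c.Loan_Amount, c.Due "
    "FROM CUST_SERVICES_V1 s JOIN CUST_COLLECTIONS_V2 c "
    "ON s.Cust_ID = c.Cust_ID ORDER BY s.Cust_ID"
).fetchall()
original = conn.execute("SELECT * FROM CUSTOMER ORDER BY Cust_ID").fetchall()
print(rebuilt == original)
```

The join works only because the primary key Cust_ID is repeated in every fragment.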
***
***
Unit-V

The need for data analysis
Most managers want to monitor daily transactions to see how the business is performing. This kind of data analysis provides information such as:
Are our sales promotions working?
What market percentage are we controlling?
Are we attracting new customers?
The business climate is dynamic. Firms must react quickly to changes in order to remain competitive, so there is a need for decision support systems.
Different managerial levels require different kinds of decision support. Changes in the business world led to new ways of managing data. This new decision support framework became known as Business Intelligence.
***
Business Intelligence

Business intelligence (BI) is a term used to describe the set of tools and processes that efficiently extract useful information to support decision making.

BI is a framework that allows a business to transform data into information, information into knowledge, and knowledge into wisdom.

BI has the potential to create “business wisdom”. This business wisdom is useful for making sound business decisions.
BI involves the following general steps:
1. Collecting and storing operational data
2. Aggregating the operational data into decision support data
3. Analyzing decision support data to generate information
4. Presenting the information to the end users
5. Making business decisions
6. Monitoring results

BI architecture
BI architecture is composed of data, people, processes, technology, and the management of such components.
The following figure shows all those components within the BI framework:

A BI environment should provide the following four components:
1. ETL tools
2. Data store
3. Data query and analysis tools
4. Data presentation and visualization tools

ETL Tools:
ETL refers to Extraction, Transformation and Loading tools. These are used for collecting, filtering, integrating, and aggregating operational data and loading it into a data store.
Data Store:
The data store is meant for decision support. It is generally represented by a data warehouse or a data mart.
Data Query and Analysis tools:
This component performs data retrieval, data analysis, and data mining tasks. This component is known as an OLAP (On-Line Analytical Processing) tool.
Data Presentation and Visualization tools:
This component is responsible for presenting the data to the end user. The query tool and the presentation tool are the front end to the BI environment.
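The job of an ETL tool (collect, filter, aggregate, then load into a data store) can be sketched in plain Python. The row layout, the "status" field and the function names are hypothetical; a real ETL tool works against actual operational databases and a data warehouse, not in-memory lists and dictionaries.

```python
from collections import defaultdict

# Hypothetical operational data: one row per sales transaction
operational_rows = [
    {"product": "P1", "units": 2, "price": 10.0, "status": "ok"},
    {"product": "P2", "units": 1, "price": 25.0, "status": "ok"},
    {"product": "P1", "units": 3, "price": 10.0, "status": "cancelled"},
    {"product": "P1", "units": 1, "price": 10.0, "status": "ok"},
]


def extract(rows):
    """Extraction: collect the operational rows."""
    return list(rows)


def transform(rows):
    """Transformation: filter out bad rows and aggregate sales by product."""
    totals = defaultdict(float)
    for row in rows:
        if row["status"] != "ok":
            continue  # filtering step
        totals[row["product"]] += row["units"] * row["price"]
    return dict(totals)


def load(totals, data_store):
    """Loading: put the aggregated data into the data store."""
    data_store.update(totals)
    return data_store


data_store = load(transform(extract(operational_rows)), {})
print(data_store)
```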
***
Operational Data vs. Decision Support Data
Decision support data differ from operational data in three main areas:
I. Time span
II. Granularity
III. Dimensionality

Time span:
Operational data cover a short time frame. Decision support data cover a longer time frame.
Granularity (level of aggregation):
Decision support data must be presented at different levels of aggregation, from highly summarized to near-atomic. This allows the user to drill down and roll up the data.
Dimensionality:
Operational data focus on individual transactions, but decision support data can have different dimensions.

Subject-oriented: Data warehouse data are organized and summarized by topic. For each topic, the data warehouse contains specific subjects: products, customers, departments, regions, and so on.

Time-variant: Data warehouse data are time-variant. The data warehouse contains a time ID that is used to generate summaries and aggregations by week, month, quarter, year, and so on.

Nonvolatile: The data in a data warehouse are never removed. Because data are never deleted and new data are continually added, the data warehouse is always growing.
A data mart can also be created from a larger data warehouse to support faster data access for a group.
Constructing a data mart reduces cost and time. It is possible to migrate gradually from data marts to data warehouses.
Information technology (IT) departments also benefit from this approach because their personnel have the opportunity to learn the issues and develop the skills required to create a data warehouse.
The only difference between a data mart and a data warehouse is the size and scope.
***
Twelve Rules that define a DWH
In 1994, W.H. Inmon and Chuck Kelley created 12 rules that define a data warehouse:

1. The data warehouse and operational environments are separated.
2. The data warehouse data are integrated.
3. The data warehouse contains historical data over a long time.
4. The data warehouse data are time variant.
5. The data warehouse data are subject oriented.
6. The data warehouse data are mainly read-only.
7. The data warehouse development life cycle differs from the SDLC.
8. The data warehouse contains data with several levels of detail.
9. The data warehouse environment is characterized by read-only transactions.
10. The data warehouse environment has a system that traces data sources, transformations and storage.
11. The data warehouse’s metadata identify and define all data elements.
12. The data warehouse contains a chargeback mechanism for resource usage.
***
On-Line Analytical Processing (OLAP)

On-Line Analytical Processing (OLAP) creates an advanced data analysis environment that supports decision making.
OLAP systems share four main characteristics:
Multidimensional data analysis techniques.
Advanced database support.
Easy-to-use end-user interfaces.
Client/server architecture.

Multidimensional data analysis techniques:
Modern OLAP tools have the capability for multidimensional analysis. It allows end users to aggregate data at different levels.

Advanced Database Support:
OLAP tools must have advanced data access features, such as:
Access to different kinds of DBMSs, files and external data sources.
Access to data warehouse data as well as the operational databases.
Advanced features such as drill-down and roll-up.

Easy-to-Use End-User Interface
OLAP tools are designed with an easy-to-use graphical user interface (GUI). This makes OLAP easily accepted and readily used.

Client/Server Architecture
The client/server environment enables an OLAP system to be divided into several components. Those components can then be placed on the same computer, or they can be distributed among several computers.
***
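The roll-up and drill-down features mentioned under Advanced Database Support amount to aggregating the same data at different levels. The sketch below shows the idea with two dimensions, month and product. The sales figures, the dimension order and the function name aggregate are all hypothetical, and a real OLAP tool would do this inside the database rather than in application code.

```python
from collections import defaultdict

# Hypothetical sales facts: (month, product code, sales amount)
sales = [
    ("JAN", "P1", 100.0),
    ("JAN", "P2", 50.0),
    ("FEB", "P1", 70.0),
    ("FEB", "P2", 30.0),
]


def aggregate(rows, depth):
    """Sum sales, keeping only the first `depth` dimensions.

    depth 2 = by month and product (most detailed),
    depth 1 = by month, depth 0 = grand total.
    """
    totals = defaultdict(float)
    for month, product, amount in rows:
        key = (month, product)[:depth]
        totals[key] += amount
    return dict(totals)


detail = aggregate(sales, 2)     # drill-down level: month and product
by_month = aggregate(sales, 1)   # rolled up to month
grand = aggregate(sales, 0)      # rolled up to a single total
print(by_month, grand)
```

Moving from depth 2 to depth 0 is a roll-up; moving in the opposite direction is a drill-down.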
GROUP BY CUBE (column1, column2 [, ...])
[HAVING condition]
[ORDER BY column1 [, column2, ...]]

Example:
SELECT MONTH, P_CODE, SUM(Units * Price) AS "Total Sales"
FROM SALES
GROUP BY CUBE(MONTH, P_CODE)
ORDER BY MONTH, P_CODE;
***
Data as a corporate asset
Data is a valuable asset that requires careful management. Data can be translated into information. An organization is subject to a data-information-decision cycle.

Top level management makes strategic decisions.
Middle level management makes tactical decisions.
Operational management makes daily operational decisions.

At the top management level, the database must be able to do the following:
1. Provide the information for strategic decision making, planning, and goals.
2. Provide access to external and internal data.
3. Provide a framework for defining and enforcing organizational policies.
4. Provide feedback to monitor whether the company is achieving its goals.

At the middle management level, the database must be able to do the following:
1) Deliver the data for tactical decisions and planning.
2) Provide a framework for enforcing and ensuring data security and privacy. Security means protecting the data against unauthorized users. Privacy deals with the rights of individuals.
2. The data dictionary stores the name of the table creator, the date of creation and the number of columns.
3. The data dictionary stores the information about indexes.
4. The data dictionary also stores the relationships among data elements.
The DBA can use the data dictionary to support data analysis and design. So the data dictionary can be used to support a wide range of data administration activities.

2. CASE Tools:
CASE stands for Computer Aided Systems Engineering. A CASE tool provides an automated framework for the systems development life cycle (SDLC). CASE tools play an important role in information systems development.
CASE uses structured methodologies and powerful graphical interfaces.
Front-end CASE tools provide support for the planning, analysis and design phases.
Back-end CASE tools provide support for the coding and implementation phases.
The CASE data dictionary stores data flow diagrams (DFDs), structure charts, descriptions of all external and internal entities, data stores, data items, report formats and screen formats.
A CASE tool provides 5 components:
Graphics.
Screen painters and report generators.
An integrated repository.
An analysis segment.
A program documentation generator.
The database and application designers use the CASE tools to store the description of the database schema, data elements, etc.
****

The DBA’s Technical Roles
The DBA’s technical role includes the following:
1. Evaluating, selecting and installing the DBMS and utilities.
2. Designing and implementing databases and applications.
3. Testing and evaluating databases and applications.
4. Operating the DBMS, utilities and applications.
5. Training and supporting users.
6. Maintaining the DBMS, utilities and applications.

1. Evaluating, selecting and installing the DBMS and Utilities:
One of the DBA’s first and most important technical responsibilities is selecting the database management system, utility software and supporting hardware to be used in the organization. The DBA must develop a plan for evaluating and selecting the DBMS based on the organization’s needs.

2. Designing and Implementing Databases and Applications:
The DBA function also provides data modeling and design services to end users. The primary activities of a DBA are to determine and enforce the standards and procedures to be used. This also involves the physical design, including storage space determination and creation, data loading and conversion.

3. Testing and Evaluating Databases and Applications:
The DBA must also provide testing and evaluation services for all the database and end user applications. Testing usually starts with the loading of the testbed database. The testbed database contains test data for the applications. Its purpose is to check the data definition and integrity rules of the database and application programs.

4. Operating the DBMS, Utilities and Applications:
DBMS operations can be divided into 4 main areas:
a. System support
b. Performance monitoring and tuning
c. Backup and recovery
d. Security auditing and monitoring
*****