Objective:
The primary goal of a DBMS is to provide a way to store and retrieve database information that is
both convenient and efficient.
Database systems are designed to manage large bodies of information. Management of data involves
both defining structures for storage of information and providing mechanisms for the manipulation of
information. The database system must ensure the safety of the information stored, despite system crashes
or attempts at unauthorized access.
Database-System Applications:
1. Enterprise Information
2. Banking and Finance
Banking: For customer information, accounts, loans, and banking transactions.
Credit card transactions: For purchases on credit cards and generation of monthly statements.
Finance: For storing information about holdings, sales, and purchases of financial instruments
such as stocks and bonds; also for storing real-time market data to enable online trading by
customers and automated trading by the firm.
3. Airlines
4. University
5. Web-Based services.
6. Telecommunication.
There are two modes in which databases are used: online transaction processing (OLTP) and online
analytical processing (OLAP).
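Purpose of Database Systems:
Database systems were developed to overcome the following disadvantages of keeping organizational
information in a conventional file-processing system: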
1. Data redundancy and inconsistency
2. Difficulty in accessing data
3. Data isolation
4. Integrity problems
5. Atomicity problems
6. Concurrent-access anomalies
7. Security problems
View of Data:
A database system is a collection of interrelated data and a set of programs that allow users to
access and modify these data. A major purpose of a database system is to provide users with an abstract
view of the data. That is, the system hides certain details of how the data are stored and maintained.
1 Data Models
Relational Model: uses a collection of tables to represent both data and the relationships among
those data. Each table has multiple columns, and each column has a unique name. Tables are
also known as relations.
Entity-Relationship Model: uses a collection of basic objects, called entities, and
relationships among these objects. An entity is a “thing” or “object” in the real world that is
distinguishable from other objects.
Semi-structured Data Model: permits the specification of data where individual data items
of the same type may have different sets of attributes.
Object-Based Data Model: Object-oriented programming (especially in Java, C++, or C#)
has become the dominant software-development methodology. The object-oriented data model can be
seen as extending the E-R model with notions of encapsulation, methods (functions), and object identity.
2 Relational Data Model
3 Data Abstraction: developers hide the complexity from users through several levels of data
abstraction, to simplify users' interactions with the system.
Physical level: The lowest level of abstraction describes how the data are actually stored. The
physical level describes complex low-level data structures in detail.
Logical level: The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. Although implementing the simple
logical-level structures may involve complex physical-level structures, the user of the logical
level does not need to be aware of this complexity. This is referred to as physical data independence.
Database administrators, who must decide what information to keep in the database, use the
logical level of abstraction.
View level: The highest level of abstraction describes only part of the entire database. Even
though the logical level uses simpler structures, complexity remains because of the variety of
information stored in a large database. Many users need only a part of the database, and the view
level of abstraction exists to simplify their interaction with the system. The system may provide
many views for the same database.
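The difference between the logical level and the view level can be sketched with Python's built-in sqlite3 module. This is only an illustration: the view name instructor_directory is a made-up assumption, and the instructor row reuses the sample values from the E-R discussion in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Logical level: the full instructor relation, including salary.
conn.execute(
    "CREATE TABLE instructor ("
    "ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary NUMERIC)"
)
conn.execute("INSERT INTO instructor VALUES ('12121', 'Wu', 'Finance', 90000)")

# View level: a hypothetical view that exposes only part of the database.
conn.execute(
    "CREATE VIEW instructor_directory AS "
    "SELECT ID, name, dept_name FROM instructor"
)

# A user of the view never sees the salary column.
directory_rows = conn.execute("SELECT * FROM instructor_directory").fetchall()
print(directory_rows)
```

How the rows are laid out on disk (the physical level) is left entirely to the database engine; neither the table definition nor the view mentions it.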
4. Instances and Schemas:
Databases change over time as information is inserted and deleted. The collection of information
stored in the database at a particular moment is called an instance of the database. The overall design of
the database is called the database schema.
Database systems have several schemas, partitioned according to the levels of abstraction. The
physical schema describes the database design at the physical level, while the logical schema describes
the database design at the logical level. A database may also have several schemas at the view level,
sometimes called subschemas, that describe different views of the database.
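A small sketch of the schema/instance distinction, using Python's sqlite3 module as a stand-in database. The helper names schema and instance and the credit value 32 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (ID TEXT PRIMARY KEY, name TEXT, tot_cred INTEGER)"
)

def schema(conn):
    """Return the stored table definitions (the logical schema)."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]

def instance(conn):
    """Return the rows stored right now (the current instance)."""
    return conn.execute("SELECT * FROM student ORDER BY ID").fetchall()

schema_before, instance_before = schema(conn), instance(conn)

# Inserting data changes the instance but not the schema.
conn.execute("INSERT INTO student VALUES ('12345', 'Shankar', 32)")

schema_after, instance_after = schema(conn), instance(conn)
print(schema_before == schema_after)    # the design is unchanged
print(instance_before, instance_after)  # the content has changed
```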
Database Languages:
A database system provides a data-definition language (DDL) to specify the database schema and a data-
manipulation language (DML) to express database queries and updates.
Domain Constraints. A domain of possible values must be associated with every attribute (for example,
integer types, character types, date/time types). Domain constraints are the most elementary form of
integrity constraint. They are tested easily by the system whenever a new data item is entered into the
database.
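A domain constraint can be sketched in Python with sqlite3. One assumption to note: SQLite treats declared column types loosely, so this sketch states the domain explicitly with CHECK clauses. The section table, its columns and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE section (
        course_id TEXT NOT NULL,
        semester  TEXT NOT NULL
                  CHECK (semester IN ('Fall', 'Winter', 'Spring', 'Summer')),
        year      INTEGER NOT NULL CHECK (year > 1900)
    )
""")

def try_insert(course_id, semester, year):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO section VALUES (?, ?, ?)", (course_id, semester, year)
        )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when a CHECK or NOT NULL constraint fails.
        return False

accepted = try_insert("CS-101", "Fall", 2017)    # value inside the domain
rejected = try_insert("CS-101", "Autumn", 2017)  # 'Autumn' is outside the domain
print(accepted, rejected)
```

The check runs at the moment the new data item is entered, which is the behaviour the paragraph above describes.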
Referential Integrity. There are cases where we wish to ensure that a value that appears in one relation
for a given set of attributes also appears in a certain set of attributes in another relation (referential
integrity). Example: the dept_name value in a course record must appear in the dept_name attribute of
some record of the department relation. Database modifications can cause violations of referential
integrity. When a referential-integrity constraint is violated, the normal procedure is to reject the action
that caused the violation.
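The course/department example can be sketched with Python's sqlite3 module. The course rows and the helper add_course are made up for illustration; note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY,
        title     TEXT,
        dept_name TEXT REFERENCES department (dept_name)
    )
""")
conn.execute("INSERT INTO department VALUES ('Finance')")

def add_course(course_id, title, dept_name):
    """Try to insert a course; report whether the DBMS accepted the action."""
    try:
        conn.execute(
            "INSERT INTO course VALUES (?, ?, ?)", (course_id, title, dept_name)
        )
        return "accepted"
    except sqlite3.IntegrityError:
        # The normal procedure: the violating action is rejected.
        return "rejected"

ok = add_course("FIN-201", "Investment Banking", "Finance")  # department exists
bad = add_course("HIS-351", "World History", "History")      # no such department
print(ok, bad)
```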
Authorization. We may want to differentiate among the users as far as the type of access they are
permitted on various data values in the database. These differentiations are expressed in terms of
authorization.
o The design of a DBMS depends on its architecture. The basic client/server architecture is used to
deal with a large number of PCs, web servers, database servers and other components that are
connected by networks.
o The client/server architecture consists of many PCs and a workstation, which are connected via
the network.
o DBMS architecture depends on how users are connected to the database to get their requests
done.
1-Tier Architecture
o In this architecture, the database is directly available to the user: the user works directly
on the DBMS itself.
o Any changes made here are applied directly to the database. It doesn't provide a handy tool
for end users.
o The 1-Tier architecture is used for developing local applications, where programmers can
communicate directly with the database for a quick response.
2-Tier Architecture
o The 2-Tier architecture is the same as the basic client/server architecture. In the two-tier
architecture, applications on the client end communicate directly with the database at the server
side. For this interaction, APIs such as ODBC and JDBC are used.
o The application resides at the client machine and invokes database system
functionality at the server machine through query language statements.
o The user interfaces and application programs run on the client side.
o The server side is responsible for providing functionality such as query processing and transaction
management.
o To communicate with the DBMS, the client-side application establishes a connection with the server
side.
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and the server. In this
architecture, the client can't communicate directly with the server.
o The application on the client end interacts with an application server, which in turn communicates
with the database system.
o The end user has no idea about the existence of the database beyond the application server. The
database also has no idea about any user beyond the application.
o The 3-Tier architecture is used for large web applications.
o The client machine acts merely as a front end and does not contain any direct database
calls; web browsers and mobile applications are the most commonly used application clients
today.
o The front end communicates with an application server. The application server, in turn,
communicates with a database system to access data.
o The business logic of the application, which says what actions to carry out under what conditions,
is embedded in the application server, instead of being distributed across multiple clients.
o Three-tier applications provide better security as well as better performance than two-tier
applications.
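The three-tier split described above can be sketched in a single Python file. This is a toy under stated assumptions: the class names DatabaseTier and ApplicationServer, the account table and the withdrawal rule are all hypothetical, and an in-memory sqlite3 database stands in for a real database server reached over a network.

```python
import sqlite3

class DatabaseTier:
    """Data tier: the only component that issues SQL."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE account (acct_no TEXT PRIMARY KEY, balance INTEGER)"
        )
        self._conn.execute("INSERT INTO account VALUES ('A-101', 500)")
        self._conn.commit()

    def get_balance(self, acct_no):
        row = self._conn.execute(
            "SELECT balance FROM account WHERE acct_no = ?", (acct_no,)
        ).fetchone()
        return None if row is None else row[0]

    def set_balance(self, acct_no, balance):
        self._conn.execute(
            "UPDATE account SET balance = ? WHERE acct_no = ?", (balance, acct_no)
        )
        self._conn.commit()

class ApplicationServer:
    """Middle tier: holds the business logic (what to do under what conditions)."""

    def __init__(self, db):
        self._db = db

    def withdraw(self, acct_no, amount):
        balance = self._db.get_balance(acct_no)
        if balance is None:
            return "unknown account"
        if amount <= 0 or amount > balance:
            return "rejected"  # business rule: no overdrafts
        self._db.set_balance(acct_no, balance - amount)
        return "ok"

    def balance(self, acct_no):
        return self._db.get_balance(acct_no)

def client(app_server):
    """Front end: talks only to the application server, never to the database."""
    return [app_server.withdraw("A-101", 200), app_server.withdraw("A-101", 1000)]

server = ApplicationServer(DatabaseTier())
results = client(server)
print(results, server.balance("A-101"))
```

The client function contains no SQL at all, which is the point of the middle tier: the overdraft rule lives in one place instead of being copied into every client.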
Features of a Distributed Database
Databases in the collection are logically interrelated with each other. Often they represent a
single logical database.
Data is physically stored across multiple sites. Data in each site can be managed by a DBMS
independent of the other sites.
The processors in the sites are connected via a network. They do not have any multiprocessor
configuration.
A distributed database is not a loosely connected file system.
A distributed database incorporates transaction processing, but it is not synonymous with a
transaction processing system.
Distributed Database Management System
A distributed database management system (DDBMS) is a centralized software system that manages a
distributed database as if it were all stored in a single location.
Features
It is used to create, retrieve, update and delete distributed databases.
It synchronizes the database periodically and provides access mechanisms by virtue of which
the distribution becomes transparent to the users.
It ensures that the data modified at any site is universally updated.
It is used in application areas where large volumes of data are processed and accessed by
numerous users simultaneously.
It is designed for heterogeneous database platforms.
It maintains confidentiality and data integrity of the databases.
Modular Development − In centralized database systems, expanding the system to new locations or
new units requires substantial effort and disrupts existing functioning. In distributed databases,
however, the work simply requires adding new computers and
local data to the new site and finally connecting them to the distributed system, with no interruption in
current functions.
More Reliable − In case of database failures, the total system of centralized databases comes to a halt.
However, in distributed systems, when a component fails, the system continues to function, possibly
at reduced performance. Hence DDBMS is more reliable.
Better Response − If data is distributed in an efficient manner, then user requests can be met from local
data itself, thus providing faster response. On the other hand, in centralized systems, all queries have to
pass through the central computer for processing, which increases the response time.
Lower Communication Cost − In distributed database systems, if data is located locally where it is
mostly used, then the communication costs for data manipulation can be minimized. This is not feasible
in centralized systems.
Distributed databases can be broadly classified into homogeneous and heterogeneous distributed
database environments, each with further sub-divisions, as shown in the following illustration.
Architectural Models
A data model gives us an idea of how the final system will look after its complete implementation. It
defines the data elements and the relationships between them. Data models are used to show
how data is stored, connected, accessed and updated in the database management system. Here, we use a
set of symbols and text to represent the information so that members of the organization can communicate
and understand it.
1 Hierarchical Model
2 Network Model
3 Entity-Relationship Model
4 Relational Model
5 Object-Oriented Data Model
6 Object-Relational Data Model
7 Flat Data Model
8 Semi-Structured Data Model
9 Associative Data Model
10 Context Data Model
Hierarchical Model
The hierarchical model was the first DBMS model. It organizes the data in a hierarchical tree
structure. The hierarchy starts from the root, which holds the root data, and then expands in the form of a
tree by adding child nodes to parent nodes.
In the above diagram, the entities are Teacher and Department. The attributes of the Teacher entity are
Teacher_Name, Teacher_id, Age, Salary, Mobile_Number. The attributes of the Department entity
are Dept_id, Dept_name. The two entities are connected using a relationship. Here, each teacher
works for a department.
Features of ER Model
Graphical Representation for Better Understanding: It is very easy and simple to understand
so it can be used by the developers to communicate with the stakeholders.
ER Diagram: ER diagram is used as a visual tool for representing the model.
Database Design: This model helps the database designers to build the database and is widely
used in database design.
Advantages of ER Model
Simple: Conceptually ER Model is very easy to build. If we know the relationship between the
attributes and the entities we can easily build the ER Diagram for the model.
Effective Communication Tool: This model is used widely by the database designers for
communicating their ideas.
Easy Conversion to any Model: This model maps well to the relational model and can be easily
converted to it by turning the ER model into tables. This model can also be
converted to other models such as the network model or the hierarchical model.
Disadvantages of ER Model
No industry standard for notation: There is no industry standard for developing an ER model,
so one developer might use notations that are not understood by other developers.
Hidden information: Some information might be lost or hidden in the ER model. Because it is a
high-level view, some details might be hidden.
Relational Model
The relational model is the most widely used model. In this model, the data is maintained in the form of
two-dimensional tables. All the information is stored in the form of rows and columns. The basic
structure of the relational model is the table, so tables are also called relations in the relational
model. Example: In this example, we have an Employee table.
In the above example, we have two objects, Employee and Department. All the data and
relationships of each object are contained as a single unit. The attributes of the employee, such as
Name and Job_title, and the methods that will be performed by that object are stored as a single object. The two
objects are connected through a common attribute, i.e. the Department_id, and the communication
between these two is done with the help of this common id.
Object-Relational Model
As the name suggests, it is a combination of the relational model and the object-oriented model.
This model was built to fill the gap between the object-oriented model and the relational model. It offers
advanced features; for example, we can build complex data types according to our requirements
from the existing data types. The problem with this model is that it can get complex and difficult to
handle, so a proper understanding of this model is required.
Flat Data Model
It is a simple model in which the database is represented as a table consisting of rows and columns. To
access any data, the computer has to read the entire table. This makes the model slow and inefficient.
Semi-Structured Model
The semi-structured model is an evolved form of the relational model. We cannot differentiate between
data and schema in this model. Example: web-based data sources, where we can't differentiate between
the schema and the data of the website. In this model, some entities may have missing attributes while
others may have extra attributes. This model gives flexibility in storing the data. It also gives
flexibility to the attributes. Example: a value stored in an attribute can be
either an atomic value or a collection of values.
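The flexibility described above can be sketched with plain Python dictionaries, where each item carries its own attribute names. The records and the helper functions below are hypothetical and only meant to illustrate the idea.

```python
# Two items of the same kind with different sets of attributes.
people = [
    {"name": "Wu", "dept_name": "Finance"},              # no phone attribute
    {"name": "Katz", "phone": ["555-0100", "555-0101"]},  # extra, multi-valued attribute
]

def attribute_sets(items):
    """List the attribute names each item carries (the schema travels with the data)."""
    return [sorted(item.keys()) for item in items]

def is_collection(value):
    """True when an attribute holds a collection rather than an atomic value."""
    return isinstance(value, (list, tuple, set))

print(attribute_sets(people))
print(is_collection(people[0]["name"]), is_collection(people[1]["phone"]))
```

Because the attribute names are stored alongside the values, there is no separate schema to consult, which is why data and schema cannot be told apart in this model.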
Associative Data Model
The associative data model is a model in which the data is divided into two parts. Everything that has
an independent existence is called an entity, and the relationships among these entities are
called associations. The two parts into which the data is divided are called items and links.
Item: Items contain the name and the identifier (some numeric value).
Links: Links contain the identifier, source, verb and target.
Example: Let us say we have a statement "The world cup is being hosted by London from 30 May
2020". In this data two links need to be stored:
1. The world cup is being hosted by London. The source here is 'the world cup', the verb is 'is being
hosted by' and the target is 'London'.
2. ...from 30 May 2020. The source here is the previous link, the verb is 'from' and the target is '30
May 2020'.
This is represented using the table as follows:
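One possible way to picture the items and links for this statement is the Python sketch below. The numeric identifiers are arbitrary assumptions, and the first verb is taken to be the full phrase 'is being hosted by' so that each link reads as a sentence.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    identifier: int
    name: str

@dataclass(frozen=True)
class Link:
    identifier: int
    source: int  # identifier of an item or of another link
    verb: int    # identifier of the item naming the verb
    target: int  # identifier of an item

items = {
    1: Item(1, "the world cup"),
    2: Item(2, "is being hosted by"),
    3: Item(3, "London"),
    4: Item(4, "from"),
    5: Item(5, "30 May 2020"),
}

links = {
    101: Link(101, source=1, verb=2, target=3),
    102: Link(102, source=101, verb=4, target=5),  # source is the previous link
}

def render(identifier):
    """Spell out an item or link as text, following link sources recursively."""
    if identifier in items:
        return items[identifier].name
    link = links[identifier]
    return " ".join(render(part) for part in (link.source, link.verb, link.target))

print(render(101))
print(render(102))
```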
Entity-Relationship Model
The entity-relationship (E-R) data model was developed to facilitate database design by allowing
specification of an enterprise schema that represents the overall logical structure of a database. The E-R
model is very useful in mapping the meanings and interactions of real world enterprises onto a conceptual
schema.
The E-R data model employs three basic concepts: entity sets, relationship sets, and attributes.
Entity Sets
An entity is a “thing” or “object” in the real world that is distinguishable from all other objects.
For example, each person in a university is an entity. An entity has a set of properties, and the values for
some set of properties must uniquely identify an entity. For instance, a person may have a person_id
property whose value uniquely identifies that person. Thus, the value 677-89-9011 for person_id would
uniquely identify one particular person in the university.
Similarly, courses can be thought of as entities, and course_id uniquely identifies a course entity
in the university. An entity set is a set of entities of the same type that share the same properties, or
attributes. An entity is represented by a set of attributes.
Attributes are descriptive properties possessed by each member of an entity set. The designation
of an attribute for an entity set expresses that the database stores similar information concerning each
entity in the entity set; however, each entity may have its own value for each attribute.
Possible attributes of the instructor entity set are ID, name, dept_name, and salary. Possible attributes
of the course entity set are course_id, title, dept_name, and credits. Each entity has a value for each of its
attributes. For instance, a particular instructor entity may have the value 12121 for ID, the value Wu for
name, the value Finance for dept_name, and the value 90000 for salary.
An entity set is represented in an E-R diagram by a rectangle, which is divided into two parts.
The first part, which in this text is shaded blue, contains the name of the entity set. The second part
contains the names of all the attributes of the entity set. The E-R diagram in Figure 6.1 shows two entity
sets instructor and student. The attributes associated with instructor are ID, name, and salary. The
attributes associated with student are ID, name, and tot_cred. Attributes that are part of the primary
key are underlined.
Relationship Sets:
A relationship is an association among several entities. For example, we can define a relationship advisor
that associates instructor Katz with student Shankar. This relationship specifies that Katz is an advisor to
student Shankar. A relationship set is a set of relationships of the same type.
Consider two entity sets instructor and student. We define the relationship set advisor to denote the
associations between students and the instructors who act as their advisors. A relationship instance in an
E-R schema represents an association between the named entities in the real-world enterprise that is being
modeled. As an illustration, the individual instructor entity Katz, who has instructor ID 45565, and the
student entity Shankar, who has student ID 12345, participate in a relationship instance of advisor. This
relationship instance represents that in the university, the instructor Katz is advising student Shankar.
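The entity sets and the advisor relationship set can be sketched in Python as sets of records. The class layout and the helper advisors_of are assumptions made for illustration; the IDs and names are the ones used in the paragraph above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instructor:
    ID: str
    name: str

@dataclass(frozen=True)
class Student:
    ID: str
    name: str

# Entity sets: sets of entities of the same type.
instructor = {Instructor("45565", "Katz")}
student = {Student("12345", "Shankar")}

# Relationship set advisor: a set of relationship instances, each pairing
# an instructor ID with a student ID.
advisor = {("45565", "12345")}

def advisors_of(student_id):
    """Names of the instructors who advise the given student."""
    by_id = {i.ID: i for i in instructor}
    return sorted(by_id[i_id].name for (i_id, s_id) in advisor if s_id == student_id)

print(advisors_of("12345"))
```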
A relationship set is represented in an E-R diagram by a diamond, which is linked via lines to a number of
different entity sets (rectangles). Example: the entity sets student and course, linked by the relationship set takes.
The function that an entity plays in a relationship is called that entity’s role. Since entity sets participating
in a relationship set are generally distinct, roles are implicit and are not usually specified. However, they
are useful when the meaning of a relationship needs clarification. Such is the case when the entity sets of
a relationship set are not distinct; that is, the same entity set participates in a relationship set more than
once, in different roles. In this type of relationship set, sometimes called a recursive relationship set,
explicit role names are necessary to specify how an entity participates in a relationship instance.
Figure 6.4 shows the role indicators course_id and prereq_id between the course entity set and the prereq
relationship set.
A relationship may also have attributes called descriptive attributes. As an example of descriptive
attributes for relationships, consider the relationship set takes which relates entity sets student and section.
We may wish to store a descriptive attribute grade with the relationship to record the grade that a student
received in a course offering. An attribute of a relationship set is represented in an E-R diagram by an
undivided rectangle. We link the rectangle with a dashed line to the diamond representing that
relationship set. A relationship set may have multiple descriptive attributes.
It is possible to have more than one relationship set involving the same entity sets. The relationship sets
advisor and takes provide examples of a binary relationship set—that is, one that involves two entity sets.
Most of the relationship sets in a database system are binary. Occasionally, however, relationship sets
involve more than two entity sets. The number of entity sets that participate in a relationship set is the
degree of the relationship set. A binary relationship set is of degree 2; a ternary relationship set is of
degree 3.
Complex Attributes
For each attribute, there is a set of permitted values, called the domain, or value set, of that attribute. The
domain of attribute course_id might be the set of all text strings of a certain length. Similarly, the domain
of attribute semester might be strings from the set {Fall, Winter, Spring, Summer}.
E-R diagrams also provide a way to indicate more complex constraints on the number of times each entity
participates in relationships in a relationship set. A line may have an associated minimum and maximum
cardinality, shown in the form l..h, where l is the minimum and h the maximum cardinality. A minimum
value of 1 indicates total participation of the entity set in the relationship set; that is, each entity in the
entity set occurs in at least one relationship in that relationship set. A maximum value of 1 indicates that
the entity participates in at most one relationship, while a maximum value ∗ indicates no limit.
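The l..h notation can be read as a check on how many times each entity appears in the relationship set. The Python sketch below is one way to express that check; the function check_cardinality, the student ID 98988 and the choice of 1..1 and 0..* limits are assumptions for illustration, with high=None standing for ∗.

```python
from collections import Counter

def check_cardinality(entities, relationships, position, low, high=None):
    """Return the entities whose participation count falls outside low..high.

    position selects which component of each relationship tuple refers to the
    entity set being checked; high=None means no upper limit (the * case).
    """
    counts = Counter(rel[position] for rel in relationships)
    violations = []
    for entity in entities:
        n = counts[entity]  # Counter gives 0 for entities that never participate
        if n < low or (high is not None and n > high):
            violations.append(entity)
    return violations

students = ["12345", "98988"]   # 98988 is a made-up student with no advisor
instructors = ["45565"]
advisor = [("45565", "12345")]  # (instructor ID, student ID)

# 1..1 on the student side: every student must have exactly one advisor.
student_violations = check_cardinality(students, advisor, position=1, low=1, high=1)

# 0..* on the instructor side: any number of advisees is allowed.
instructor_violations = check_cardinality(instructors, advisor, position=0, low=0)

print(student_violations, instructor_violations)
```

A minimum of 1 is what makes participation total: student 98988 is reported because it occurs in no advisor relationship at all.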