Introduction To Database
• Data: unorganized facts (such as letters, numbers, or symbols) that refer to, or
represent, conditions, ideas, or objects.
• Data item—smallest named unit of data that has meaning in the real world (examples:
last name, address, ssn, political party)
• Data aggregate (or group) - a collection of related data items that form a whole concept;
a simple group is a fixed collection, e.g. date (month, day, year); a repeating group is a
variable length collection, e.g. a set of aliases.
• Database - computerized collection of interrelated stored data that serves the needs of
multiple users within one or more organizations, i.e. interrelated collections of records of
potentially many types. Motivation for databases over files: integration for easy access
and update, non-redundancy, multi-access.
• Open Source Software (OSS): Computer software available with its source code under
an open source license to study, change and improve its design.
• Domain - the domain of a database attribute is the set of all allowable values that the
attribute may assume. The rule for determining the domain boundary may be as simple
as a data type with an enumerated list of values.
• Example: A field for gender may have the domain {male, female, unknown}, where
those three values are the only permitted entries in that column.
• Table/relation – A structure that presents the data as a combination of rows
(records) and columns (fields). Each cell contains the data item for one record in one
field. When a table is displayed, the first row is reserved for the field names.
• View - A database component that can be queried like a table but has no independent
existence of its own; a virtual table defined by a stored query.
• Database schema, or simply the schema, is the structure of the database, which
describes how data are organized and stored in the database. In a relational database, the
schema defines the tables, the fields in each table, and the relationships between fields
and tables.
• Data Redundancy - Having the same data stored in more than one place in a database.
• Null Value - A field that contains a data item has a specific value. A field that does not
contain a data item is said to have a null value. In a numeric field, a null value is not the
same as a value of zero; in a character field, a null value is not the same as a blank. Both
the numeric zero and the blank character are definite values. A null value indicates that
the field's value is undefined or not known.
• Database administrator (DBA) - person or group responsible for the effective use of
database technology in an organization or enterprise. Roles of the DBA include designing
logical/physical schemas, handling security and authorization, ensuring data availability
(crash recovery), and tuning the database as needs evolve.
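Several of the terms defined above (table, view, null value) can be seen together in a short sketch. It uses Python's built-in sqlite3 module with an in-memory database; the table name person, its columns, and the sample rows are assumptions made only for illustration.

```python
import sqlite3

# In-memory database; table, columns and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (last_name TEXT, age INTEGER, alias TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("Smith", 0, ""), ("Jones", None, None)],  # None is stored as NULL
)

# A view is a virtual table: it stores a query, not a second copy of the data.
conn.execute("CREATE VIEW person_names AS SELECT last_name FROM person")
view_rows = conn.execute(
    "SELECT last_name FROM person_names ORDER BY last_name"
).fetchall()

# NULL is not zero and not a blank: each query below matches a different row.
null_age_rows = conn.execute(
    "SELECT last_name FROM person WHERE age IS NULL"
).fetchall()
zero_age_rows = conn.execute(
    "SELECT last_name FROM person WHERE age = 0"
).fetchall()
blank_alias_rows = conn.execute(
    "SELECT last_name FROM person WHERE alias = ''"
).fetchall()

print(view_rows)
print(null_age_rows, zero_age_rows, blank_alias_rows)
```

In this sketch only the row for Jones has a null age, while the zero age and the blank alias both belong to Smith, which is the distinction the Null Value entry draws.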
One of the main reasons for using a DBMS is to have central control of both the data and the
programs accessing those data. A person who has such control over the system is called a
Database Administrator (DBA). The following are the functions of a database administrator:
• Schema Definition: The Database Administrator creates the database schema by
executing DDL (Data Definition Language) statements. The schema includes the logical
structure of each database table (relation), such as the data types of attributes, the
length of attributes, integrity constraints, etc.
• Database Security: Ensuring that only authorized users have access to the database and
fortifying it against any external, unauthorized access.
• Backup and Recovery: It is a DBA's role to ensure that the database has adequate
backup and recovery procedures in place to recover from any accidental or deliberate loss
of data.
• Producing Reports from Queries: DBAs are frequently called upon to generate reports
by writing queries, which are then run against the database.
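Two of the DBA functions listed above, schema definition through DDL and producing a report from a query, can be sketched as follows. This is a minimal illustration using Python's sqlite3 module; the account table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema definition: a DDL statement fixes column types and constraints.
# (SQLite accepts VARCHAR(40) but does not enforce the length; many other
# DBMSs do.)
conn.executescript(
    """
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        owner      VARCHAR(40) NOT NULL,
        balance    INTEGER NOT NULL DEFAULT 0
    );
    """
)
conn.executemany(
    "INSERT INTO account (account_no, owner, balance) VALUES (?, ?, ?)",
    [(1, "Alice", 120), (2, "Bob", 80), (3, "Alice", 50)],
)

# Report: total balance per owner, produced by a query run against the database.
report = conn.execute(
    "SELECT owner, SUM(balance) FROM account GROUP BY owner ORDER BY owner"
).fetchall()
for owner, total in report:
    print(f"{owner}: {total}")
```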
DATABASE MODELS
A data model is the way in which data is organised for storage in a database. There are a number
of different types of database models, also referred to as DBMS models. Each one represents a
somewhat different approach to organizing data in a systematic manner. They include:
• Flat files
• Hierarchical
• Network
• Relational
• Object oriented
Flat files
• The most basic way to organize data is as a flat file. You can think of this as a single
table with a large number of records and fields. Everything you need is stored in this
table, or flat file.
Hierarchical
• In a hierarchical database model each data item (except the root) is subordinate to
another one. This is called a parent-child relationship. The hierarchical data model
organizes data in a tree-like structure.
• One of the rules of a hierarchical database is that a parent can have multiple children, but
a child can only have one parent. For example, think of an online store that sells many
different products. The entire product catalog would be the parent, and the various types
of products, such as books, electronics, etc., would be the children.
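The one-parent rule above can be sketched in plain Python, with a dictionary standing in for the stored parent links. This is only an illustration of the structure, not of any particular hierarchical DBMS; the category names are assumed.

```python
# Each child records exactly one parent, so the structure is a tree.
parent_of = {
    "Books": "Catalog",
    "Electronics": "Catalog",
    "Novels": "Books",
    "Phones": "Electronics",
}


def path_to_root(node):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


def children_of(node):
    """A parent may have many children."""
    return sorted(child for child, parent in parent_of.items() if parent == node)


print(path_to_root("Novels"))
print(children_of("Catalog"))
```

Because every item has at most one parent, there is exactly one path from any item up to the catalog.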
Network
• In a network database model every data item can be related to many other ones, so the
database structure is like a graph. This is similar to the hierarchical model, except that
a child is allowed to have more than one parent.
In the example of the product catalog, a book could fall into more than one category. The
structure of a network database becomes more like a cobweb of connected elements.
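The difference from the hierarchical model can be sketched by letting each child keep a set of parents instead of a single one. The names below are hypothetical and the code illustrates the structure only.

```python
# Each child maps to a set of parents, so the structure is a graph, not a tree.
parents_of = {
    "Fiction": {"Catalog"},
    "Bestsellers": {"Catalog"},
    "Some Novel": {"Fiction", "Bestsellers"},  # one book, two categories
}


def ancestors(node):
    """Collect every item reachable by following parent links upward."""
    found = set()
    stack = [node]
    while stack:
        for parent in parents_of.get(stack.pop(), set()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(ancestors("Some Novel")))
```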
Relational
• In a relational database model all data are organized in the form of tables. This DBMS
model emerged in the 1970s and has become by far the most widely used type of DBMS.
Most of the DBMS software developed over the past few decades uses this model. In a
table, each row represents a record, that is, one instance of an entity. Each column
represents a field, also referred to as an attribute of the entity.
Object oriented
• Object databases consist of objects rather than relations, tables or other data structures.
• Attributes - Attributes are data which defines the characteristics of an object. This data
may be simple such as integers, strings, and real numbers or it may be a reference to a
complex object.
• Methods - Methods define the behavior of an object and are what were formerly called
procedures or functions.
• Objects = Attributes + Behaviour (Operations)
• There are other methods which are also commonly used in database design
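The idea that an object combines attributes with behaviour can be sketched with ordinary Python classes. This is not an object database, only an illustration of the concept; the Book and Author classes are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Author:
    name: str


@dataclass
class Book:
    # Attributes: simple values plus a reference to another (complex) object.
    title: str
    price: float
    author: Author

    # Method: behaviour that operates on the object's own attributes.
    def discounted_price(self, percent: float) -> float:
        return round(self.price * (1 - percent / 100), 2)


book = Book(title="Database Basics", price=40.0, author=Author(name="A. Writer"))
print(book.author.name, book.discounted_price(25))
```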
RELATIONAL DATABASE
A relational database is a collection of data items organized as a set of formally-described tables
from which data can be accessed or reassembled in many different ways without having to
reorganize the database tables.
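As a minimal sketch of data being reassembled in different ways without reorganizing the tables, the example below runs two different queries over the same two tables. It uses Python's sqlite3 module; the customer and purchase tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        item        TEXT,
        amount      INTEGER
    );
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO purchase VALUES
        (10, 1, 'book', 15), (11, 1, 'lamp', 30), (12, 2, 'book', 15);
    """
)

# Same tables, reassembled by customer...
spend_per_customer = conn.execute(
    """SELECT c.name, SUM(p.amount)
       FROM customer c JOIN purchase p ON p.customer_id = c.customer_id
       GROUP BY c.name ORDER BY c.name"""
).fetchall()

# ...and reassembled by item, with no change to how the data is stored.
buyers_per_item = conn.execute(
    """SELECT item, COUNT(DISTINCT customer_id)
       FROM purchase GROUP BY item ORDER BY item"""
).fetchall()

print(spend_per_customer)
print(buyers_per_item)
```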
DBMS
• A Database Management System (DBMS) is a software package designed to store and
manage databases.
Popular DBMSs
• Microsoft Access. Aimed at small businesses, and useful for desktop applications and
systems with a small number of users
• Microsoft SQL Server, Oracle, IBM DB2. Scalable and secure, and widely used by
large organisations
• Microsoft SQL Server Compact, Java DB, SQLite. Compact DBMSs, suitable for
mobile devices in particular
DBMS COMPONENTS
• Hardware. The physical computer system that allows physical access to data
• Software. The actual program that allows users to access, maintain, and update physical
data
• Procedure. A set of procedures (rules) that should be clearly defined and followed by the
users of the database
DATABASE APPLICATIONS
• Banking: all transactions
Requirement Analysis
In this phase a detailed analysis of the requirements is done. The objective of this phase is to
get a clear understanding of the requirements. It makes use of various information gathering
methods for this purpose. Some of them are:
• Interview
• Analyzing documents
• Survey
• Site visit
The results of the requirement analysis are modeled in the conceptual design. The ER Model is
used at the conceptual design stage of the database design, and the ER diagram is used to
represent this conceptual design. An ER diagram consists of Entities, Attributes and Relationships.
In the logical design, once the relationships and dependencies are identified, the data can be
arranged into logical structures and mapped into database management system tables.
Normalization is performed to bring the relations into appropriate normal forms.
The physical design deals with the physical implementation of the database in a database
management system. It includes the specification of data elements, data types, indexing, etc.
All this information is stored in the data dictionary.
The integrity constraints are domain integrity, entity integrity, referential integrity and
foreign key integrity.
• Domain Integrity. Means the definition of a valid set of values for an attribute. For each
attribute you define the data type, the length or size, whether a null value is allowed,
and whether the value must be unique.
You may also define the default value, the range (values in between) and/or specific
values for the attribute. Some DBMSs allow you to define the output format and/or input
mask for the attribute.
These definitions ensure that a specific attribute will have a right and proper value in the
database.
• Entity Integrity – Every table requires a primary key. Neither the primary key nor any
part of the primary key can contain NULL values, because a NULL value in the primary
key means we cannot identify some rows. For example, in the EMPLOYEE table, Phone
cannot be a key since some people may not have a phone.
• Foreign Key Integrity Constraint – There are two foreign key integrity constraints:
cascade update related fields and cascade delete related rows. These constraints affect the
referential integrity constraint.
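The constraints described above can be sketched in SQL run through Python's sqlite3 module. The department and employee tables, their columns, and the salary range are assumptions for illustration; note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, and other DBMSs differ in such details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,         -- entity integrity
        name    TEXT NOT NULL UNIQUE         -- domain: NULL not allowed, unique
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,         -- entity integrity
        gender  TEXT NOT NULL DEFAULT 'unknown'
                CHECK (gender IN ('male', 'female', 'unknown')),  -- specific values
        salary  INTEGER CHECK (salary BETWEEN 0 AND 1000000),     -- range
        dept_id INTEGER REFERENCES department(dept_id)
                ON UPDATE CASCADE ON DELETE CASCADE               -- foreign key rules
    );
    INSERT INTO department VALUES (1, 'Sales');
    INSERT INTO employee (emp_id, salary, dept_id) VALUES (100, 500, 1);
    """
)


def is_rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


bad_gender = is_rejected("INSERT INTO employee (emp_id, gender) VALUES (101, 'n/a')")
duplicate_key = is_rejected("INSERT INTO employee (emp_id) VALUES (100)")
orphan_row = is_rejected("INSERT INTO employee (emp_id, dept_id) VALUES (102, 99)")

# Cascade update: changing the parent key rewrites the matching child rows.
conn.execute("UPDATE department SET dept_id = 2 WHERE dept_id = 1")
dept_after_update = conn.execute(
    "SELECT dept_id FROM employee WHERE emp_id = 100"
).fetchone()[0]

# Cascade delete: removing the parent row removes its child rows.
conn.execute("DELETE FROM department WHERE dept_id = 2")
employees_left = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

print(bad_gender, duplicate_key, orphan_row, dept_after_update, employees_left)
```

The three rejected statements correspond to a domain violation, an entity integrity violation, and a referential integrity violation respectively.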
BUSINESS RULES
Another term used is semantics. Business rules are obtained from users when gathering
requirements. The requirements gathering process is very important and should be verified by
the user before the database design is built. If the business rules are incorrect, the design will be
incorrect and ultimately the application built will not function as expected by the users.
• It must be flexible enough so that it can be used and understood in practically any
environment where information is modelled
The ER model
• It is expressed in terms of entities in the business environment, the relationships (or
associations) among those entities and the attributes (properties) of both the entities and
their relationships
• Entity instance – is a single occurrence of an entity type. An entity type is described just
once (using metadata) in a database, while many instances of that entity type may be
represented by data stored in the database. e.g. – there is one EMPLOYEE entity type in
most organizations, but there may be hundreds of instances of this entity stored in the
database
Sample E-R Diagram
The entity type on which the weak entity type depends is called the identifying owner (or
owner for short).
The identifying relationship is the relationship between a weak entity type and its owner
(such as ‘Has’ in the following Fig.).
The weak entity identifier is its partial identifier (double underline) combined with that of
its owner. During a later design stage Dependent_Name will be combined with Employee_ID
(the identifier of the owner) to form a full identifier for DEPENDENT.
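One common way to carry this into tables is a composite primary key on the weak entity's table. The sketch below assumes that mapping and uses Python's sqlite3 module; the column names follow Employee_ID and the dependent name from the text, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        employee_name TEXT NOT NULL
    );
    -- Weak entity: full identifier = owner's identifier + partial identifier.
    CREATE TABLE dependent (
        employee_id    INTEGER NOT NULL REFERENCES employee(employee_id),
        dependent_name TEXT NOT NULL,
        PRIMARY KEY (employee_id, dependent_name)
    );
    INSERT INTO employee VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO dependent VALUES (1, 'Sam'), (2, 'Sam');
    """
)

# The same dependent name may occur under different owners...
same_name_rows = conn.execute(
    "SELECT COUNT(*) FROM dependent WHERE dependent_name = 'Sam'"
).fetchone()[0]

# ...but not twice under the same owner.
try:
    conn.execute("INSERT INTO dependent VALUES (1, 'Sam')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

print(same_name_rows, duplicate_allowed)
```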
Attributes
• An attribute is a property or characteristic of an entity type, for example the entity
EMPLOYEE may have attributes Employee_Name and Employee_Address.
• In ER diagrams, place each attribute's name in an ellipse with a line connecting it to its
associated entity
• The component attributes may appear above or below the composite attribute on an ER
diagram
• A simple (atomic) attribute is one that cannot be broken down into smaller components
• A composite attribute is one that can be broken down into component parts
• A multivalued attribute is one that may take on more than one value – it is represented by
an ellipse with double lines
Derived Attributes
• Some attribute values can be calculated or derived from others
• A derived attribute is signified by an ellipse with a dashed line (see previous Fig.)
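The attribute kinds described above (simple, composite, multivalued, derived) can be sketched with Python dataclasses. The Employee fields chosen here, including date_employed and skills, are assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    # Composite attribute: made up of component attributes.
    street: str
    city: str


@dataclass
class Employee:
    employee_id: int                            # simple (atomic) attribute
    address: Address                            # composite attribute
    date_employed: date
    skills: list = field(default_factory=list)  # multivalued attribute

    def years_employed(self, today: date) -> int:
        """Derived attribute: calculated from date_employed, not stored."""
        years = today.year - self.date_employed.year
        # Subtract one if this year's anniversary has not been reached yet.
        anniversary = (self.date_employed.month, self.date_employed.day)
        if (today.month, today.day) < anniversary:
            years -= 1
        return years


emp = Employee(1, Address("1 Main St", "Springfield"), date(2015, 6, 1), ["SQL", "Python"])
print(emp.years_employed(date(2020, 5, 31)))
```

Keeping the derived value as a calculation rather than a stored field avoids it going out of date when the underlying value changes.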
Identifier attribute
• Identifier attribute or Key is an attribute (or combination of attributes) that uniquely
identifies individual instances of an entity type, such as Student_ID
• To be a candidate identifier, each entity instance must have a single value for the
attribute, and the attribute must be associated with each entity
Primary Key (PK) - An attribute (or combination of attributes) whose value identifies a unique
row in a table. In the relational model of data, a primary key is a candidate key chosen as the
main method of uniquely identifying a tuple in a relation.
To qualify as a primary key for an entity, an attribute must have the following properties:
The values must not change or become null during the life of each entity instance
Types of Key
• Candidate key - A candidate key is a field or combination of fields that can act as a
primary key field for that table to uniquely identify each record in that table. It can be
defined as minimal Super Key or irreducible Super Key
• Alternate key - An alternate key is any candidate key which is not selected to be the
primary key.
• Super Key – An attribute or a combination of attributes that is used to identify the records
uniquely is known as a Super Key. A table can have many Super Keys.
• Secondary (Index) Key: An attribute or a set of attributes that has been used to construct
the data retrieval index.
• Concatenated (combined, compound or composite) key: A key that consists of two or
more attributes.
• Foreign key (FK) - a field or group of fields in a database record that points to a key
field or group of fields forming a key of another database record in some (usually
different) table. Usually a foreign key in one table refers to the primary key (PK) of
another table. In this way references can be made to link information together, which is
an essential part of database normalization.
• Natural key - A key which is composed of attributes (fields) which already exist in the
real world, e.g. First Name, Last Name, Social Security Number.
• The Unique key (UK) uniquely identifies each record in a database table.
• A Composite Identifier is used when there is no single (or atomic) attribute that can
serve as an identifier; it combines two or more attributes
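The relationship between super keys and candidate keys can be sketched by testing attribute sets against sample rows. This is only an illustration: in a real design a key is a rule about all possible data, not a property of one sample, and the student rows below are hypothetical.

```python
from itertools import combinations

# Hypothetical sample rows; uniqueness is judged on this data only.
rows = [
    {"student_id": 1, "email": "a@example.com", "name": "Ann"},
    {"student_id": 2, "email": "b@example.com", "name": "Ben"},
    {"student_id": 3, "email": "c@example.com", "name": "Ann"},
]


def is_super_key(attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(attrs):
    """A candidate key is a super key that contains no smaller super key."""
    if not is_super_key(attrs):
        return False
    return not any(
        is_super_key(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


print(is_super_key(("student_id", "name")))      # unique, but not minimal
print(is_candidate_key(("student_id", "name")))
print(is_candidate_key(("student_id",)), is_candidate_key(("email",)))
```

In this sample both student_id and email are candidate keys; whichever is chosen as the primary key, the other becomes an alternate key.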