CBDA2103 - Data Analysis and Modelling
SEMESTER 1171
CBA2103
MATRICULATION NO : 901107025629001
IDENTITY CARD NO. : 901107025629
TELEPHONE NO. : 011 1075 1814
E-MAIL : ikhwan_ishak@oum.edu.my
LEARNING CENTRE : KEDAH LEARNING CENTRE, ALOR SETAR
INDEX
CONTENT
INTRODUCTION
PHASES OF DDLC
Database Planning
Database Design
Implementation and Downloading
Testing and Evaluation
Operation
Maintenance and Evolution
COMPONENTS OF DBMS
Hardware
Software
Data
Procedures
Users
REFERENCES
INTRODUCTION
Data plays an important role in our daily lives, whether we realize it or not. On a personal level,
we deal with data when we record monthly income and expenditure, file receipts from various
transactions and much more. On a larger scale, for example in a corporate organization, data such
as employees' personal information, their salaries, the prices of the items the organization sells
and the availability of a particular product need to be taken care of. It is therefore important
for both individuals and organizations to manage data correctly so that it remains organized and
safe. A Database Management System (DBMS) plays a crucial role in helping people deal with
various forms of data.
The general purpose of a database is to store a large amount of data, organize it and allow it to
be retrieved effectively and accurately. It also gives the user control over redundant data,
because the data is stored in one place, which results in better organization. Beyond storage, a
database allows information to be shared among individuals and departments, as long as the data
belongs to the organization and not to a specific party. A DBMS also handles issues such as
integrity, recovery, support and security by controlling password usage and user restrictions.
This is very important for protecting the confidentiality of data, especially private or personal
data such as credit card numbers, bank account numbers and medical history. Lastly, a database
gives the organization a financial advantage, because it is more economical than conventional
ways of storing data. A database requires less manpower, storage space and maintenance cost while
maintaining productivity.
Developing a good database involves a process called the Database Development Life Cycle (DDLC).
The DDLC consists of six phases: Database Planning, Database Design, Implementation and
Downloading, Testing and Evaluation, Operation, and Maintenance and Evolution. The DDLC is
important because it allows the developer to identify the current situation of the organization
and its problems, the objective of the database, and the costs needed to complete the database,
including the manpower, hardware and software required. The DDLC also makes sure that the
database is designed according to the needs of the organization, through processes such as data
definition and data modelling. Finally, the DDLC helps determine whether the database works as
intended through a continuous process of testing and evaluation. Systematic testing ensures that
all parts of the system work correctly and helps identify problems that arise during the testing
period. Evaluation, meanwhile, covers performance and system security, so that the system is
protected from unauthorized users.
PHASES OF DDLC
1) DATABASE PLANNING
The first step in creating a database is to create a plan that serves both as a guide when
implementing the database and as a functional specification for the database after it has been
implemented. The size of the database and the user population determine its complexity and level
of detail, so the complexity of a database and of the planning process can vary significantly
with the requirements. For example, a database used by a single person can be simple, whereas a
database that a bank uses to handle all the banking transactions of thousands of clients will be
far more complex.
There are systematic steps that can be taken to make sure that all the information required to
build a good database is obtained. They are as follows:
iii) Define the objective
The objective of the new system needs to be clear so that it can meet the requirements of the
organization. Identify whether the new system can integrate and interact with existing systems,
both now and in the near future. This involves processes such as data migration from one system
to another and data sharing between users. The methods of data collection and the rules of the
system also need to be determined.
v) Feasibility research
Research on the potential of the new system should be carried out continuously to see whether it
will remain feasible in the future in terms of technology, economy and administration.
2) DATABASE DESIGN
This phase is regarded as the most important step in the DDLC because it determines the
performance of the database. The purposes of database design are to represent data and their
relationships, to provide a data model that supports all required transactions, and to determine
whether the system design meets the requirements of the users and the system. Database design is
divided into three more detailed phases: Conceptual Design, Logical Design and Physical Design.
i) Conceptual Design
Conceptual design produces a database model that represents real-world objects in a realistic
manner. Conceptual design includes three sections:
Data Modelling – The designer needs to understand the overall data characteristics of the
organization. This can be achieved by identifying the information requirements, information
sources, information content and information users.
Normalization – Determines how attributes depend on an entity and establishes
relationships between entities through common attributes. This process helps with
problems such as data duplication and insertion, update and deletion anomalies.
Data Model Validation – Adjustments need to be made to the models produced during the data
modelling and normalization stages, depending on how they function during the testing
process.
The conceptual model is independent of the DBMS model and the physical storage structure, so it
does not depend on any particular software or hardware. This makes it highly portable across
DBMS platforms, which is why it is so important in the DDLC.
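The effect of normalization can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the table names (orders_flat, customer, orders), the column names and the sample rows are hypothetical and are not taken from any real system. It is one possible illustration, not a complete normalization procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized table (hypothetical): customer details repeat on every order row.
cur.execute(
    "CREATE TABLE orders_flat ("
    "order_id INTEGER, customer_name TEXT, customer_city TEXT, item TEXT)"
)
cur.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?, ?)",
    [
        (1, "Aminah", "Alor Setar", "Printer"),
        (2, "Aminah", "Alor Setar", "Toner"),
        (3, "Kumar", "Sungai Petani", "Scanner"),
    ],
)

# Normalized design: each customer is stored once and orders refer to it by key.
cur.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE, city TEXT)"
)
cur.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer(customer_id), item TEXT)"
)

# Move the data across: distinct customers first, then orders linked by name.
cur.execute(
    "INSERT INTO customer (name, city) "
    "SELECT DISTINCT customer_name, customer_city FROM orders_flat"
)
cur.execute(
    "INSERT INTO orders (order_id, customer_id, item) "
    "SELECT f.order_id, c.customer_id, f.item "
    "FROM orders_flat AS f JOIN customer AS c ON c.name = f.customer_name"
)

# A change of city is now a single-row update, so no update anomaly can occur.
cur.execute("UPDATE customer SET city = 'Kulim' WHERE name = 'Aminah'")
conn.commit()

customer_count = cur.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
order_count = cur.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
aminah_cities = cur.execute(
    "SELECT DISTINCT c.city FROM orders AS o "
    "JOIN customer AS c ON c.customer_id = o.customer_id "
    "WHERE c.name = 'Aminah'"
).fetchall()
```

In the flat table the same city would have to be changed on every order row of that customer; after the split it is changed once and every order sees the new value.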
ii) Logical Design
Mapping the conceptual design into a logical design involves translating each entity, attribute
and relationship into a data representation that is compatible with the DBMS model. In a
relational DBMS, the entities and relationships from the conceptual design are converted into
tables with rows and columns. The logical data model can also define views, access authority and
limits through Data Definition Language (DDL) statements, and it can check data constraints to
ensure data integrity.
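As a minimal sketch of what DDL statements for a logical design might look like, the example below creates two tables, a constraint and a view using Python's built-in sqlite3 module. The schema (department, employee, employee_public) is hypothetical. SQLite has no user accounts or GRANT statement, so access authority is not shown; the view only illustrates how a restricted set of columns can be exposed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# DDL for a hypothetical schema: tables, constraints and a view.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  REAL NOT NULL CHECK (salary >= 0),
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
-- The view hides the salary column from users who only need basic details.
CREATE VIEW employee_public AS
    SELECT emp_id, name, dept_id FROM employee;
""")

conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Aminah', 3500.0, 1)")


def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Rejected by the CHECK constraint on salary.
negative_salary_ok = try_insert("INSERT INTO employee VALUES (2, 'Kumar', -10.0, 1)")
# Rejected by the foreign key: department 99 does not exist.
unknown_dept_ok = try_insert("INSERT INTO employee VALUES (3, 'Mei', 2800.0, 99)")

view_columns = [
    col[0] for col in conn.execute("SELECT * FROM employee_public").description
]
```

The constraints are declared once in the schema, so every application that writes to the tables is subject to the same checks.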
3) IMPLEMENTATION AND DOWNLOADING
Implementation is the process of realizing the database design and putting it to use. In most
cases a prototype is developed to test the basic functions of the system before the system goes
into full operation. This phase involves setting up the DBMS so that users can begin adapting to
the new system. Usually, the Database Administrator (DBA) creates the database storage groups.
Next, the schema, data dictionary and user views are created using DDL statements. Once the
database has been created, data is loaded into it; if the new database replaces an old one, the
current data needs to be converted into the new data format. Accessibility, security, recovery
and integrity are also considered during this phase.
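Converting current data into a new format and loading it can be sketched as follows. The legacy record layout (day/month/year dates, salaries written as text with an "RM" prefix), the target table and the sample rows are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime

# Hypothetical records exported from the old system.
legacy_rows = [
    {"name": "Aminah", "joined": "07/11/2015", "salary": "RM 3,500.00"},
    {"name": "Kumar", "joined": "21/03/2012", "salary": "RM 4,200.50"},
]


def convert(row):
    """Convert one legacy record into the format of the new table."""
    # DD/MM/YYYY text becomes an ISO 8601 date string.
    joined_iso = datetime.strptime(row["joined"], "%d/%m/%Y").date().isoformat()
    # "RM 3,500.00" becomes the number 3500.0.
    salary = float(row["salary"].replace("RM", "").replace(",", "").strip())
    return (row["name"], joined_iso, salary)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "joined TEXT NOT NULL, salary REAL NOT NULL)"
)

# Load everything in one transaction: either all rows are stored or none are.
with conn:
    conn.executemany(
        "INSERT INTO employee (name, joined, salary) VALUES (?, ?, ?)",
        [convert(row) for row in legacy_rows],
    )

loaded = conn.execute(
    "SELECT name, joined, salary FROM employee ORDER BY emp_id"
).fetchall()
```

Running the load inside a single transaction means a conversion error part-way through leaves the new table empty rather than half filled.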
4) TESTING AND EVALUATION
Database testing checks the values that a web or desktop application retrieves from the database.
The data shown in the user interface should match the records stored in the database. Testing
involves three main kinds of test: unit testing, integration testing and system testing.
Evaluation, meanwhile, focuses on performance and system security. The database needs to be
protected against unauthorized users by strengthening physical security, passwords, access
rights, data confidentiality and concurrent access control.
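A unit test of this kind might look like the sketch below, which uses Python's built-in unittest and sqlite3 modules. The product table and the product_label function are hypothetical stand-ins for a real application and its user interface.

```python
import sqlite3
import unittest


def make_db():
    """Create a small in-memory database with one known record."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (sku TEXT PRIMARY KEY, name TEXT, price REAL)")
    conn.execute("INSERT INTO product VALUES ('P-001', 'Printer', 450.0)")
    conn.commit()
    return conn


def product_label(conn, sku):
    """Hypothetical application function: builds the text shown in the UI."""
    row = conn.execute(
        "SELECT name, price FROM product WHERE sku = ?", (sku,)
    ).fetchone()
    if row is None:
        return "Not found"
    name, price = row
    return f"{name}: RM {price:.2f}"


class ProductLabelTest(unittest.TestCase):
    def setUp(self):
        self.conn = make_db()

    def tearDown(self):
        self.conn.close()

    def test_label_matches_stored_record(self):
        # The value displayed must agree with the record stored in the database.
        self.assertEqual(product_label(self.conn, "P-001"), "Printer: RM 450.00")

    def test_missing_record(self):
        self.assertEqual(product_label(self.conn, "P-999"), "Not found")


# Run the tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProductLabelTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Integration and system tests follow the same idea, but exercise several components or the whole system together instead of a single function.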
5) OPERATION
Once the database system has been tested and evaluated, it enters the operation phase. When the
system has been fully established, the developer needs to prepare a user manual, conduct training
sessions and provide technical support. There are several approaches to moving to a new system
in the operation phase:
i) Direct Transition Plan – The new system fully replaces the old system immediately.
ii) Parallel Transition Plan – The old and new systems are operated simultaneously for a period
of time. The old system is stopped once the management is satisfied with the new one.
iii) Pioneer Transition Plan – Initially, the new system is operated in only one department of
the organization while the old system is still used in the other departments. Once the users
have fully adapted to the new system and the management is happy with its performance, the new
system is implemented throughout the organization.
iv) Staggered Transition Plan – Implementation is done stage by stage, depending on
performance, until the new system is fully implemented.
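As a rough illustration of the Parallel Transition Plan, the sketch below feeds the same batches of transactions to an old and a new system and records where their results differ. The functions old_system_total and new_system_total are hypothetical stand-ins; real systems would be compared on their actual outputs.

```python
def old_system_total(transactions):
    """Stand-in for the old system: adds up amounts one by one."""
    total = 0
    for transaction in transactions:
        total += transaction["amount_cents"]
    return total


def new_system_total(transactions):
    """Stand-in for the new system: computes the same total another way."""
    return sum(transaction["amount_cents"] for transaction in transactions)


def parallel_run(batches, old_fn, new_fn):
    """Return the indexes of the batches where the two systems disagree."""
    mismatches = []
    for index, batch in enumerate(batches):
        if old_fn(batch) != new_fn(batch):
            mismatches.append(index)
    return mismatches


# Amounts are kept as whole cents so that the comparison is exact.
batches = [
    [{"amount_cents": 1250}, {"amount_cents": 399}],
    [{"amount_cents": 10000}],
]

mismatches = parallel_run(batches, old_system_total, new_system_total)
# The old system would only be switched off once no mismatches remain.
ready_to_retire_old_system = not mismatches
```

The other plans differ mainly in scope: a pioneer plan would run the new system for one department only, and a staggered plan would introduce one part of the system at a time.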
6) MAINTENANCE AND EVOLUTION
This phase involves continuously monitoring and maintaining the performance of the system and
database. If the system produces undesirable results, adjustments are made accordingly.
Monitoring must be done on a regular basis in order to identify errors that appear later and did
not occur during the earlier stages. This phase is also important when the rules and policies of
an organization change. When that happens, the database needs to be adjusted to meet the new
requirements. After a few years, most systems need to be upgraded to keep up with the latest
technology and to cope with the needs of the users and the organization.
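One common kind of adjustment is adding a column when a new policy requires extra information. The sketch below shows this with Python's built-in sqlite3 module; the employee table, the email column and the helper add_column_if_missing are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO employee (name) VALUES ('Aminah')")
conn.commit()


def column_names(conn, table):
    """List the columns of a table. PRAGMA table_info returns the name at index 1."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def add_column_if_missing(conn, table, column, definition):
    """Add a column unless it already exists, so the change can be re-run safely.

    The table and column names are inserted into the SQL text directly, so they
    must come from the maintainer, never from user input.
    """
    if column in column_names(conn, table):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {definition}")
    conn.commit()
    return True


# Hypothetical policy change: every employee record must now hold an e-mail address.
first_run = add_column_if_missing(conn, "employee", "email", "TEXT DEFAULT ''")
second_run = add_column_if_missing(conn, "employee", "email", "TEXT DEFAULT ''")

existing_email = conn.execute("SELECT email FROM employee").fetchone()[0]
```

Existing rows receive the default value, so applications that read the table keep working while the new column is filled in gradually.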
COMPONENTS OF DBMS
1) HARDWARE
Hardware consists of the physical electronic devices, such as computers, storage devices and
input/output devices, that form the interface between computers and real-world systems. In a
DBMS, the three main components whose costs need to be well balanced are the Central Processing
Unit (CPU), memory and disk.
i) CPUs
There are two basic criteria to consider before deciding on the CPU for the DBMS. The first is
the processor family. The most popular choice is usually between Intel and AMD x86 processors,
but other architectures such as Itanium and SPARC are also available. The second is whether more
cores or faster cores are needed. More cores usually benefit a database system with a large
number of users accessing it at the same time, while faster cores help with data loading and
exporting.
ii) Memory
How much priority to give to memory in a DBMS depends greatly on the size of the data in the
database. Most of the time, adding more Random Access Memory (RAM) results in a performance
boost. However, there are situations where adding more RAM might not help. Firstly, if the data
set is small enough to fit into the RAM already installed, adding more will not help much, and a
faster processor is a better choice. Secondly, when running applications that scan tables much
larger than the amount of RAM that can feasibly be purchased, a faster disk might be a better
choice.
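The reasoning above can be written down as a simple rule of thumb. The function below is only a sketch of that reasoning; the function name, its parameters and the example sizes are hypothetical, and real sizing decisions depend on the workload.

```python
def suggest_upgrade(data_size_gb, installed_ram_gb, max_affordable_ram_gb):
    """Suggest where to spend on hardware, following the two exceptions in the text."""
    if data_size_gb <= installed_ram_gb:
        # The data set already fits in memory, so more RAM adds little.
        return "faster CPU"
    if data_size_gb > max_affordable_ram_gb:
        # The tables being scanned can never fit in memory, so disk speed dominates.
        return "faster disk"
    # Otherwise more RAM lets a larger share of the data stay in memory.
    return "more RAM"


small_database = suggest_upgrade(data_size_gb=4, installed_ram_gb=16, max_affordable_ram_gb=64)
medium_database = suggest_upgrade(data_size_gb=40, installed_ram_gb=16, max_affordable_ram_gb=64)
huge_database = suggest_upgrade(data_size_gb=2000, installed_ram_gb=16, max_affordable_ram_gb=64)
```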
iii) Disks
Nowadays, the main choice for disks in a database server is between Serial ATA (SATA) and
Serial Attached SCSI (SAS). The two types of drive compare as follows:
SATA disks:
Drives typically have a slower RPM: 7200 is standard, some 10,000 designs exist such as
the Western Digital VelociRaptor
Higher drive capacity: 2 TB available
Cost per MB is lower
SAS disks:
The maximum available RPM is higher: 10,000 or 15,000
Not as much drive capacity: 73 GB-1 TB are popular sizes
Cost per MB is higher
2) SOFTWARE
This is the set of programs used to control and manage the overall database. In order to make the
database system function fully, three types of software are needed: operating system software,
DBMS software, and application programs and utilities.
3) DATA
Data is the most important component of the DBMS. In a DBMS, databases are defined and
constructed, and data is then stored in, updated in and retrieved from them. The database
contains both the actual data and the metadata. Metadata (also known as "data about data")
stores information such as how many tables there are and their names, how many columns each
table has and their names, the primary keys, the foreign keys and so forth. In short, the
metadata holds information about each table in the database and its constraints.
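To make the idea of metadata concrete, the sketch below asks a database to describe itself. It uses Python's built-in sqlite3 module, where the sqlite_master table and the table_info and foreign_key_list pragmas expose the metadata; other DBMS products expose similar information through their own catalog tables. The department and employee tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES department(dept_id)
);
""")

# Which tables exist? sqlite_master is SQLite's built-in catalog table.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
employee_columns = [
    (row[1], row[2], bool(row[5]))
    for row in conn.execute("PRAGMA table_info(employee)")
]

# PRAGMA foreign_key_list rows are (id, seq, table, from, to, on_update, on_delete, match).
employee_foreign_keys = [
    (row[3], row[2], row[4])
    for row in conn.execute("PRAGMA foreign_key_list(employee)")
]
```

None of these queries touch the stored employee or department records; they read only the description of the tables.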
4) PROCEDURES
Procedures are the instructions and rules that govern how the database is designed and how the
DBMS is used. The users who operate and manage the DBMS need documented procedures on how to
use or run the database management system. These may include:
5) USERS
The users are the people who manage the databases and perform different operations on them in
the database system. There are five categories of user in a DBMS:
v) Database Administrator (DBA)
The DBA is the person responsible for managing the overall database management system. There
are different kinds of DBA, depending on the responsibilities held.
Administrative DBA - Mainly concerned with installing and maintaining DBMS servers. The
main tasks are installation, backup, recovery, security, replication, memory
management, configuration and tuning.
Development DBA - Responsible for creating the queries and procedures needed to meet the
requirements.
Data Warehouse DBA - Maintains the data and procedures that come from various sources in
the data warehouse.
Application DBA - Acts as a bridge between the application programs and the database,
making sure that every application program is optimized to interact with the database.
This DBA ensures that all activities, from installing, upgrading, patching and
maintaining to backup, recovery and executing the records, work without any issues.
(3032 words)
REFERENCES
Chia, K. H., Seow, E. H., & Teo, K. C. (2004). Database. Singapore: Pearson Prentice Hall.