
Running head: DATABASE AND DATA WAREHOUSING DESIGN 1

Database and Data Warehousing Design

Edmond Labule

CIS 599

Strayer University

Dr. Mark Afolabi

February 16, 2017



This paper describes the creation of a database and the design of a data warehouse for an
internet-based organization. It first makes the case, from a management standpoint, for using a
relational database and a data warehouse, showing the importance and efficiency that executive
oversight stands to gain. Second, it presents a database schema that supports the company's
business and processes, with an explanation of how the schema supports that structure. Third, it
identifies and creates the database tables with an appropriate field-naming convention,
distinguishes the primary and foreign keys, and explains how to achieve referential integrity and
how to normalize the tables into third normal form. It also presents an entity relationship
diagram and a data flow diagram of the information in the database schema. Finally, it
illustrates how data flows, including both the inputs and the outputs for data warehouse use.

Relational Databases and Data Warehousing Use

According to Raman (2011), a data warehouse is a repository that stores bulk data and also
filters and organizes it. A data warehouse facilitates the reporting and analysis of information.
It has tools for extracting, transforming, and loading data into the repository, and for managing
and retrieving metadata. A data warehouse collects or combines data from many databases across
an organization. Four characteristics explain what a data warehouse consists of and how it
functions.

1. Subject Oriented: The data in a data warehouse originates from different sources such as
OLTP systems, where data is stored according to the type of application in order to keep track
of daily transactions. In an OLAP system, by contrast, data is stored according to subject, such
as company branches, products, and customers.

2. Time Variant: The data in the data warehouse (DWH) is stored over a period of one to ten
years. Analysis of this historical data helps the management team decide what is beneficial to
the company. For example, in a sales department, the DWH helps determine which products
customers like best or which product yields the most profit for the company.

3. Non-Updateable: The information in the DWH cannot be modified or deleted; it is for viewing
purposes only. The DWH is refreshed periodically with data picked up from the OLTP systems.

4. Integrated: For integration to occur, the DWH puts data from disparate sources into a common
format. All conflicts or inconsistencies must be resolved before the data can be integrated.
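
As an illustration only, the time-variant and non-updateable characteristics can be sketched in a few lines of Python. The `Warehouse` class, its method names, and the sample sales figures are hypothetical and are not taken from Raman (2011) or from any particular product.

```python
from datetime import date


class Warehouse:
    """Append-only store: refreshes add rows, nothing is updated or deleted."""

    def __init__(self):
        self._rows = []  # each row is (load_date, product, units_sold)

    def refresh(self, load_date, oltp_rows):
        """Copy the current OLTP rows in, stamped with the date of the load."""
        for product, units_sold in oltp_rows:
            self._rows.append((load_date, product, units_sold))

    def history(self, product):
        """Read-only view of one subject (a product) over time."""
        return [(d, units) for d, p, units in self._rows if p == product]


# Two periodic refreshes from a hypothetical OLTP sales system.
dwh = Warehouse()
dwh.refresh(date(2016, 12, 31), [("Keyboard", 120)])
dwh.refresh(date(2017, 1, 31), [("Keyboard", 95), ("Monitor", 40)])
keyboard_history = dwh.history("Keyboard")
```

Because each refresh only appends dated rows, earlier figures remain available for comparison across periods, which is what allows management to see how a product performs over time.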

The primary goal of a relational database (RDB) is to organize data into tables, or relations,
that are linked to each other. A table is made up of rows and columns, where each row is a
record and each column is an attribute. A table in an RDB resembles a spreadsheet. The
relationships between the tables are what allow an RDB to store large amounts of data and to
retrieve that data efficiently. A main objective of an RDB is to eliminate redundancy, meaning
that the same data should not be stored in more than one place. Duplicate data not only occupies
space but also leads to inconsistencies. An RDB also helps ensure data integrity and accuracy.
Many commercial organizations use a relational database management system (RDBMS) such as
Oracle, Microsoft SQL Server, or DB2. Structured Query Language (SQL) was created to work with
relational databases and is the foundation of most applications built on them.
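
The point about redundancy can be shown with a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in RDBMS. The `customer` and `purchase` tables and their contents are assumptions made for this illustration and are not part of the company schema described later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The customer's name is stored once, in one row.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    -- Each purchase points at the customer instead of repeating the name.
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        product     TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada');
    INSERT INTO purchase VALUES (10, 1, 'Keyboard'), (11, 1, 'Monitor');
""")


def purchases_with_names(db):
    """Join the two tables so every purchase row shows the customer name."""
    return db.execute("""
        SELECT c.name, p.product
        FROM purchase AS p
        JOIN customer AS c ON c.customer_id = p.customer_id
        ORDER BY p.purchase_id
    """).fetchall()


before = purchases_with_names(conn)
# One UPDATE in one place changes the name seen by every purchase.
conn.execute("UPDATE customer SET name = 'Ada Lovelace' WHERE customer_id = 1")
after = purchases_with_names(conn)
```

Since the name lives in a single row, the two purchase records cannot disagree about it, which is the inconsistency that duplicate data would otherwise invite.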



It is very difficult to predict when or by how much a data warehouse will grow, whether in the
short or the long term. A relational database is therefore advisable whenever the data warehouse
is likely to grow to hundreds of gigabytes or larger. A relational database is used to host an
enterprise data warehouse because it offers scalability, high-speed query processing,
integration with local and central metadata, and support for open system standards, among other
benefits (Hull, 2011).

Database Schema of Company’s Business and Processes

A database schema can be understood as the visual representation of a database, as the set of
rules that governs the database, or as the entire set of objects that belongs to a particular
user. It represents the logical configuration of all the necessary parts of a relational
database, expressed both as a diagram and as a set of formulas (integrity constraints) that
govern the database. It shows how the entities of a database relate to each other, including the
tables, procedures, views, fields, relationships, and indexes. A schema created by a database
administrator or designer helps programmers build software that will interact with the database.
The process of creating a database schema is called data modeling. Because a schema is only a
blueprint of a planned database, it does not contain any data. When multiple sources are
integrated into a single schema, four requirements should hold: every overlapping element
appears in the resulting schema; independent relationships and entities are not lumped together
in the same table; every element that appears in only one source but is associated with
overlapping elements is carried over to the resulting schema; and no elements are lost (Jones,
Plew, & Stephens, 2008). An advantage of a schema is that it places no limit on the number of
objects it may contain, apart from restrictions imposed by a specific database implementation.
The schema defines the structure of each table, and the relationships between the tables define
the type of association between the entities.

The database schema for staff assignments consists of the following:

1. Organization: ID, Name, Address, Other Details.

2. Organization Units: Organization ID, Parent Organization Unit ID, Organization Unit Name,
Other Details.

3. Staff Assignments: Organization Unit ID, Staff ID, Date From, Date To.

4. Staff: Staff ID, First Name, Last Name, Gender, Date of Birth, Address, Other Details.

5. Ref Roles: Role Code, Role Description.

Identifying and Creating Database Tables with Appropriate Field-Naming Convention

Organization       Organization_Units           Staff_Assignments     Staff          Ref_Roles

Organization_ID    Organization_Unit_ID         Organization_Unit_ID  Staff_ID       Role_Code
Organization_Name  Organization_ID              Staff_ID              First_Name     Role_Description
Address            Parent_Organization_Unit_ID  Date_From             Middle_Name
Other_Details      Organization_Name            Date_To               Last_Name
                   Other_Details                Report_to_Staff_ID    Gender_M/F
                                                Role_Code             Date_of_Birth
                                                                      Address
                                                                      Other_Details

The primary key is indicated with the color yellow, while the foreign key is indicated with the
color green.
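
One possible way to declare these five tables is sketched below with Python's built-in `sqlite3` module. The key assignments are assumptions inferred from the field names, since the color coding does not survive in plain text. The data types, the composite primary key on `staff_assignments`, and the renaming of `Gender_M/F` to `gender` with a check constraint are likewise illustrative choices rather than part of the original design.

```python
import sqlite3

SCHEMA = """
CREATE TABLE organization (
    organization_id   INTEGER PRIMARY KEY,
    organization_name TEXT NOT NULL,
    address           TEXT,
    other_details     TEXT
);
CREATE TABLE organization_units (
    organization_unit_id        INTEGER PRIMARY KEY,
    organization_id             INTEGER NOT NULL
        REFERENCES organization (organization_id),
    parent_organization_unit_id INTEGER
        REFERENCES organization_units (organization_unit_id),
    organization_name           TEXT,
    other_details               TEXT
);
CREATE TABLE ref_roles (
    role_code        TEXT PRIMARY KEY,
    role_description TEXT
);
CREATE TABLE staff (
    staff_id      INTEGER PRIMARY KEY,
    first_name    TEXT,
    middle_name   TEXT,
    last_name     TEXT,
    gender        TEXT CHECK (gender IN ('M', 'F')),
    date_of_birth TEXT,
    address       TEXT,
    other_details TEXT
);
CREATE TABLE staff_assignments (
    organization_unit_id INTEGER NOT NULL
        REFERENCES organization_units (organization_unit_id),
    staff_id             INTEGER NOT NULL REFERENCES staff (staff_id),
    date_from            TEXT NOT NULL,
    date_to              TEXT,
    report_to_staff_id   INTEGER REFERENCES staff (staff_id),
    role_code            TEXT REFERENCES ref_roles (role_code),
    -- Assumed key: one row per unit, staff member, and start date.
    PRIMARY KEY (organization_unit_id, staff_id, date_from)
);
"""


def create_schema():
    """Build the five tables in a throwaway in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
    db.executescript(SCHEMA)
    return db


conn = create_schema()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

The lowercase names follow the underscore convention of the table above; a production RDBMS such as Oracle or SQL Server would use its own data types in place of SQLite's `TEXT` and `INTEGER`.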

According to Chapple (2017), referential integrity is a database feature that keeps the
relationships between tables accurate by applying constraints that prevent users from entering
inaccurate data or pointing to data that does not exist. The primary and foreign keys maintain
the relationships between the tables. In this relational database, the primary keys rule out
duplicate records, and referential integrity will prevent the addition of a record with a
foreign key unless a matching primary key exists in the linked table, prevent the deletion of a
record that other records still reference, and so guarantee consistency between the tables. For
referential integrity to hold, every foreign key field must contain either a null value or a
value of a primary key in the parent table. In other words, a foreign key value must reference a
valid primary key in the parent table. Referential integrity is violated if a record whose key is
referenced by a foreign key is deleted.
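
These rules can be observed directly in a small sketch that uses Python's built-in `sqlite3` module. The two tables are cut-down, hypothetical versions of Organization and Organization_Units, and the `violates` helper is introduced here only for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE organization (
        organization_id   INTEGER PRIMARY KEY,
        organization_name TEXT NOT NULL
    );
    CREATE TABLE organization_units (
        organization_unit_id INTEGER PRIMARY KEY,
        organization_id      INTEGER REFERENCES organization (organization_id),
        organization_name    TEXT
    );
    INSERT INTO organization VALUES (1, 'Head Office');
    INSERT INTO organization_units VALUES (10, 1, 'Sales');
""")


def violates(sql, params=()):
    """Return True if the database rejects the statement on a constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# A foreign key that matches no parent row is rejected.
orphan_rejected = violates(
    "INSERT INTO organization_units VALUES (?, ?, ?)", (11, 99, "Ghost"))
# A null foreign key is allowed.
null_accepted = not violates(
    "INSERT INTO organization_units VALUES (?, ?, ?)", (12, None, "Unassigned"))
# A parent row that is still referenced cannot be deleted.
parent_delete_rejected = violates(
    "DELETE FROM organization WHERE organization_id = ?", (1,))
```

All three flags evaluate to true: the orphan insert and the parent delete are refused, while the null foreign key is accepted.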

Normalization is the process of removing redundant data and unwanted dependencies from a
database. For a table to be in third normal form (3NF), it must be in second normal form (2NF)
and every non-key field must depend directly on the primary key rather than on another non-key
field. For example, Organization ID, Organization Unit ID, and Staff ID are all bound to the
organization itself. A staff assignment depends on the organization only through its
organization unit, and this indirect dependency is referred to as a transitive dependency. It is
removed by moving the Staff ID and the Organization Unit ID into a separate table that is linked
back to the organization through the Organization ID, because those IDs are formed under the
organization to distribute assignments of related jobs. Removing the transitive dependencies
reduces data duplication, and therefore the amount of data in the database, and improves data
integrity.
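
The effect of removing a transitive dependency can be sketched in plain Python. The flat rows, the staff names, and the `to_third_normal_form` function are hypothetical and serve only to make the idea concrete: the unit name depends on the unit ID, which in turn depends on the staff ID.

```python
# Hypothetical un-normalized rows: unit_name is determined by unit_id,
# not directly by the key staff_id, so "Sales" is stored twice.
flat_rows = [
    {"staff_id": 1, "last_name": "Reyes", "unit_id": 10, "unit_name": "Sales"},
    {"staff_id": 2, "last_name": "Okoro", "unit_id": 10, "unit_name": "Sales"},
    {"staff_id": 3, "last_name": "Lind", "unit_id": 20, "unit_name": "Support"},
]


def to_third_normal_form(rows):
    """Split the rows so each fact depends only on the key of its own table."""
    units = {}  # unit_id -> unit_name, each name stored exactly once
    staff = []  # staff rows keep only the unit_id as a foreign key
    for row in rows:
        units[row["unit_id"]] = row["unit_name"]
        staff.append({
            "staff_id": row["staff_id"],
            "last_name": row["last_name"],
            "unit_id": row["unit_id"],
        })
    return staff, units


staff, units = to_third_normal_form(flat_rows)
```

After the split, renaming a unit means changing one entry in `units` instead of every staff row that mentions it.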

Entity Relationship Diagram of Database Schema

An entity-relationship diagram (ERD) is a graphical representation of an information system that
shows the relationships between objects, concepts, people, or events within that system. As a
data modeling technique, an ERD helps define business processes and serves as the foundation of
a relational database. Although an ERD is useful for organizing data that fits a relational
structure, it is not sufficient for semi-structured or unstructured data, and on its own it is
unlikely to be enough to integrate data into a pre-existing information system. An ERD consists
of entities, actions, connecting lines, cardinality, and attributes. These components help
locate and visualize the information the system produces (Rouse, 2014).

Data Flow Diagram of Database Schema

In software development, creating a data flow diagram (DFD) is one of the most crucial, if
tedious, steps. A DFD makes the program that follows easier to create. With DFDs in place, an
organization's programs stay organized, which makes it easier to plan how new programs will
accomplish other purposes. A DFD illustrates how a system processes data in terms of inputs and
outputs, and it focuses on the flow of information: where data comes from, where it goes, and
how it is stored (Darrington, 2015).

Flow of Data

According to Patil et al. (2011), three functions maintain a data warehouse: staging,
integration, and access. Staging stores the raw data, integration combines the data for the
users, and access gets the data out to the users. The data is made available for retrieval and
analysis by extracting, transforming, and loading it. Data warehousing plays a significant role
in engineering, knowledge, and decision-making processes.
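
The three functions can be sketched as a small Python pipeline. The function names mirror the terms used by Patil et al. (2011), but the implementation, the source lists, and the sample figures are hypothetical and are meant only to show the direction in which data moves.

```python
from collections import defaultdict


def stage(sources):
    """Staging: collect the raw rows from every source without changing them."""
    return [row for source in sources for row in source]


def integrate(staged):
    """Integration: transform the raw rows into one cleaned, typed format."""
    return [
        {"product": row["product"].strip().title(), "amount": float(row["amount"])}
        for row in staged
    ]


def access(integrated):
    """Access: answer a reporting question, here total sales per product."""
    totals = defaultdict(float)
    for row in integrated:
        totals[row["product"]] += row["amount"]
    return dict(totals)


# Inputs: two hypothetical OLTP extracts with inconsistent formatting.
oltp_web = [{"product": " keyboard", "amount": "20.50"}]
oltp_store = [
    {"product": "Keyboard", "amount": 10},
    {"product": "monitor ", "amount": "150"},
]

# Output: a report the users can read directly.
report = access(integrate(stage([oltp_web, oltp_store])))
```

Here the OLTP extracts are the inputs and the per-product totals are the output, matching the input and output ends of the data flow described above.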



Conclusion

Designing a database and a data warehouse is one of the most important steps an organization can
take toward making informed decisions. Both emphasize the organization, formatting, and
standardization of facts in such a way that the organization can derive information from them.
Not every organization needs to design a database or data warehouse, but a company should weigh
its organizational goals and objectives when considering whether to design one.



References

Chapple, M. (2017, January 4). About Tech: Referential Integrity Ensures Database Consistency.
Retrieved from http://databases.about.com/cs/administration/g/refintegrity.htm

Darrington, J. (2015, March 31). Techwalla: Importance of Data Flow Diagrams. Retrieved from
https://www.techwalla.com/articles/importance-of-data-flow-diagrams

Hull, S. (2011, July 5). Scalable Startups: Relational Database – What is it and why is it
Important? Retrieved from
http://www.iheavy.com/2011/07/05/relational-database-what-is-it-and-why-is-it-important/

Jones, A., Plew, R., & Stephens, R. (2008, July 1). InformIT: What Is a Schema?-Managing
Database Objects in SQL. Retrieved from
http://www.informit.com/articles/article.aspx?p=1216889&seqNum=2

Patil, P., Srikantha, R., & Patil, S. (2011). Optimization of Data Warehousing System:
Simplification in Reporting and Analysis. IJCA Proceedings on International Conference and
Workshop on Emerging Trends in Technology (ICWET) (9):33-37.

Raman, V. (2011, July 27). Features of Data Warehouse: What is Data Warehouse and Features of
Data Warehouse. Retrieved from
http://vibhorraman.blogspot.com/2011/07/what-is-datawarehouse-and-features-of.html

Rouse, M. (2014, October 14). Definition from WhatIS.com: What is Entity Relationship Diagram?
Retrieved from http://searchcrm.techtarget.com/definition/entity-relationship-diagram

Appendix

Figure 1.1: Entity Relationship Diagram of Data Schema



Figure 1.2: Data Flow Diagram of Data Schema
