IS Module 3
DATABASES AND DATA WAREHOUSES
Opening Case
Searching for Revenue - Google
DATABASE FUNDAMENTALS
LEARNING OUTCOMES
1. List, describe, and provide an example of
each of the five characteristics of high quality
information
4. Define the fundamental concepts of the
relational database model
UNDERSTANDING INFORMATION
• Information is everywhere in an organization
Information Quality
• Business decisions are only as good as the
quality of the information used to make the
decisions
• Low quality information example
DATABASE FUNDAMENTALS
• Information is everywhere in an
organization
E. F. Codd's Rules
• Codd's rules were proposed by E. F. Codd. A system must satisfy them to be
considered a relational database management system.
• Foundation Rule: For any system that is advertised as, or claimed to be, a
relational data base management system, that system must be able to manage data
bases entirely through its relational capabilities.
• Information Rule: All data stored in a relational database must be represented
as a value in some cell of a table.
• Guaranteed Access Rule: Every data element must be accessible through a
combination of table name, primary key value, and attribute (column) name.
• Systematic Treatment of NULL Values: A NULL value in the database must only
represent missing, unknown, or not applicable data.
• Active Online Catalog: The structure of the database must be stored in an online
catalog that authorized users can query.
• Comprehensive Data Sub-language Rule: The database should be accessible through
a language that supports data definition, data manipulation, and transaction
management operations.
• View Updating Rule: All views that are theoretically updatable should also be
updatable by the system.
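Three of the rules above (guaranteed access, systematic NULL treatment, and the online catalog) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module. The customer table, its columns, and the sample rows are hypothetical and exist only for this example.

```python
import sqlite3

# Hypothetical in-memory database; table, columns and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " customer_name TEXT NOT NULL,"
    " phone TEXT)"  # phone may be NULL when it is unknown
)
conn.execute("INSERT INTO customer VALUES (1, 'Example Shop A', '555-0100')")
conn.execute("INSERT INTO customer VALUES (2, 'Example Shop B', NULL)")

# Guaranteed access: table name + primary key value + column name reach one cell.
name = conn.execute(
    "SELECT customer_name FROM customer WHERE customer_id = ?", (1,)
).fetchone()[0]

# Systematic NULL treatment: missing values are tested with IS NULL, not '='.
missing_phone = conn.execute(
    "SELECT customer_id FROM customer WHERE phone IS NULL"
).fetchall()

# Active online catalog: the schema itself is queried with ordinary SQL.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(name, missing_phone, tables)
```

SQLite is used here only because it ships with Python. It does not satisfy every one of Codd's rules.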
E. F. Codd's Rules (Contd.)
• High-Level Insert, Update, and Delete Rule: The relational model should
support insert, update, and delete operations on whole sets of rows, not
only on single rows. Set operations such as union, intersection, and minus
should also be supported.
• Physical Data Independence: A change in the physical location or storage of
a table should not force changes at the application level.
• Logical Data Independence: A change in the logical or conceptual schema of
a table should not force changes at the application level. For example,
merging two tables into one should not affect the applications that access
them. This is difficult to achieve in practice.
• Integrity Independence: Integrity constraints modified at the database level
should not force changes at the application level.
• Distribution Independence: Distribution of data over various locations
should not be visible to end-users.
• Non-Subversion Rule: Low-level access to data should not be able to
bypass integrity rules to change data.
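The set-level operations and database-level integrity constraints described above can be sketched with Python's built-in sqlite3 module. The tables store_a and store_b and their contents are hypothetical. SQLite spells the minus operation as EXCEPT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables; the CHECK constraint is an integrity rule kept in the database.
conn.executescript("""
    CREATE TABLE store_a (item TEXT PRIMARY KEY, price REAL CHECK (price >= 0));
    CREATE TABLE store_b (item TEXT PRIMARY KEY, price REAL CHECK (price >= 0));
    INSERT INTO store_a VALUES ('cola', 1.0), ('water', 0.5), ('juice', 2.0);
    INSERT INTO store_b VALUES ('cola', 1.0), ('tea', 1.5);
""")

def items(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# Set operations between two relations.
union = items("SELECT item FROM store_a UNION SELECT item FROM store_b")
both = items("SELECT item FROM store_a INTERSECT SELECT item FROM store_b")
only_a = items("SELECT item FROM store_a EXCEPT SELECT item FROM store_b")

# Set-level update: one statement changes every qualifying row.
changed = conn.execute(
    "UPDATE store_a SET price = price * 2 WHERE price < 1.5"
).rowcount

# The constraint is enforced by the database, not by application code.
try:
    conn.execute("INSERT INTO store_a VALUES ('bad', -1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(union, both, only_a, changed, rejected)
```

Because the CHECK constraint is declared in the schema, it can be changed there without touching the application code that issues the inserts. This is the idea behind integrity independence.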
DATABASE ADVANTAGES
Increased Flexibility
• A well-designed database should:
– Handle changes quickly and easily
– Provide users with different views
– Have only one physical view
• Physical view – deals with the physical storage of
information on a storage device
– Have multiple logical views
• Logical view – focuses on how users logically
access information
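As a minimal sketch of one physical view serving several logical views, the example below stores a single table and defines two SQL views over it with Python's built-in sqlite3 module. The employee table, the view names, and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One stored table (made-up data).
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL
    );
    INSERT INTO employee VALUES
        (1, 'Ana', 'Sales', 50000), (2, 'Ben', 'IT', 60000), (3, 'Cy', 'Sales', 55000);

    -- Logical view 1: a staff directory that hides salaries.
    CREATE VIEW staff_directory AS SELECT name, dept FROM employee;

    -- Logical view 2: payroll totals per department.
    CREATE VIEW payroll_by_dept AS
        SELECT dept, SUM(salary) AS total FROM employee GROUP BY dept;
""")

directory = conn.execute(
    "SELECT name, dept FROM staff_directory ORDER BY name"
).fetchall()
payroll = conn.execute(
    "SELECT dept, total FROM payroll_by_dept ORDER BY dept"
).fetchall()

print(directory)
print(payroll)
```

Both views read the same stored rows, so a change to the employee table is visible through each view without copying data.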
Reduced Redundancy
Increased Security
• Information is an organizational asset and must
be protected
RELATIONAL DATABASE FUNDAMENTALS
• Entity – a person, place, thing, transaction, or
event about which information is stored
– The rows in each table contain the entities
– In Figure 6.5 CUSTOMER includes Dave’s Sub Shop
and Pizza Palace entities
• Attributes (fields, columns) – characteristics
or properties of an entity class
– The columns in each table contain the attributes
– In Figure 6.5 attributes for CUSTOMER include:
• Customer ID
• Customer Name
• Contact Name
• Phone
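The relationship between entities (rows) and attributes (columns) can be sketched with Python's built-in sqlite3 module. The column names follow the CUSTOMER attributes listed above, and the two customer names come from the slide. The IDs, contact names, and phone numbers are placeholders, not values from Figure 6.5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be read by column name

conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER, customer_name TEXT, contact_name TEXT, phone TEXT)"
)
# Contact names and phone numbers below are placeholders for illustration.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "Dave's Sub Shop", "placeholder contact", "000-0000"),
        (2, "Pizza Palace", "placeholder contact", "000-0000"),
    ],
)

rows = conn.execute("SELECT * FROM customer ORDER BY customer_id").fetchall()

attributes = rows[0].keys()                         # the columns
entities = [row["customer_name"] for row in rows]   # one entry per row

print(attributes)
print(entities)
```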
• Primary keys and foreign keys identify the
various entity classes (tables) in the
database
– Primary key – a field (or group of fields) that
uniquely identifies a given entity in a table
– Foreign key – a primary key of one table that
appears as an attribute in another table and acts
to provide a logical relationship between the
two tables
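A minimal sketch of primary and foreign keys, using Python's built-in sqlite3 module. The orders table, its columns, and all values are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys is switched on, which is a SQLite-specific detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,      -- primary key: unique per customer
        customer_name TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES customer(customer_id),  -- foreign key
        total       REAL
    );
    INSERT INTO customer VALUES (1, 'Pizza Palace');
    INSERT INTO orders VALUES (100, 1, 25.0);
""")

# The foreign key supplies the logical relationship used by the join.
joined = conn.execute(
    "SELECT c.customer_name, o.order_id"
    " FROM orders o JOIN customer c ON c.customer_id = o.customer_id"
).fetchall()

def is_rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

orphan_rejected = is_rejected("INSERT INTO orders VALUES (101, 99, 5.0)")   # no customer 99
duplicate_rejected = is_rejected("INSERT INTO customer VALUES (1, 'Dup')")  # key 1 is taken

print(joined, orphan_rejected, duplicate_rejected)
```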
Potential relational database for Coca-Cola
DATA WAREHOUSE FUNDAMENTALS
LEARNING OUTCOMES
Multidimensional Analysis
• Cube – common term for the representation
of multidimensional information
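One way to picture a cube is as a set of cells addressed by one coordinate per dimension. The sketch below is plain Python with made-up sales figures. The dimension names (product, region, month) and the helpers slice_cube and roll_up are assumptions for illustration, not part of any particular data warehouse product.

```python
from collections import defaultdict

# Hypothetical cube: (product, region, month) -> units sold.
DIMENSIONS = ("product", "region", "month")
cube = {
    ("cola", "east", "jan"): 10,
    ("cola", "west", "jan"): 7,
    ("cola", "east", "feb"): 12,
    ("juice", "east", "jan"): 4,
    ("juice", "west", "feb"): 6,
}

def slice_cube(cube, **fixed):
    """Keep only the cells whose coordinates match the fixed dimension values."""
    wanted = {DIMENSIONS.index(dim): value for dim, value in fixed.items()}
    return {
        key: units
        for key, units in cube.items()
        if all(key[i] == value for i, value in wanted.items())
    }

def roll_up(cube, dimension):
    """Total the measure along one dimension, summing over all the others."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[i]] += units
    return dict(totals)

by_product = roll_up(cube, "product")
by_region = roll_up(cube, "region")
east_jan = slice_cube(cube, region="east", month="jan")

print(by_product, by_region, east_jan)
```

Slicing fixes one or more dimensions, and rolling up aggregates the measure along a chosen dimension. Both are views of the same underlying cells.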
• Data mining – the process of analyzing data to
extract information not offered by the raw data
alone
BUSINESS INTELLIGENCE
DATA MINING
• Data-mining software includes many forms of AI such
as neural networks and expert systems
Cluster Analysis
• Cluster analysis – a technique used to divide
an information set into mutually exclusive
groups such that the members of each group
are as close together as possible to one
another and the different groups are as far
apart as possible
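The slide does not name a specific algorithm. As one common choice, the sketch below uses plain k-means on made-up two-dimensional points. The function name kmeans, the starting centroids, and the data are assumptions for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute."""
    for _ in range(iterations):
        # Each point joins exactly one cluster, so the groups are mutually exclusive.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return clusters, centroids

# Made-up data: two visibly separate groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
clusters, centroids = kmeans(points, centroids=[(0, 0), (10, 10)])

print(clusters)
print(centroids)
```

Fixed starting centroids keep the run repeatable. Practical implementations usually pick starting centroids at random and compare several runs.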
Association Detection
• Association detection – reveals the
degree to which variables are related
and the nature and frequency of these
relationships in the information
– Market basket analysis – analyzes such
items as Web sites and checkout scanner
information to detect customers’ buying
behavior and predict future behavior by
identifying affinities among customers’
choices of products and services
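Affinities among products are often summarized with two measures, support and confidence. The sketch below computes both in plain Python over a handful of made-up baskets. The basket contents and the helpers support and confidence are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Made-up checkout baskets; each set is one customer transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# How many baskets contain each item, and each unordered pair of items.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)

def support(item_a, item_b):
    """Fraction of all baskets that contain both items."""
    return pair_counts[tuple(sorted((item_a, item_b)))] / len(baskets)

def confidence(antecedent, consequent):
    """Fraction of baskets with `antecedent` that also contain `consequent`."""
    return pair_counts[tuple(sorted((antecedent, consequent)))] / item_counts[antecedent]

print(support("bread", "milk"))
print(confidence("butter", "bread"))
print(confidence("bread", "butter"))
```

Confidence is directional: in this toy data every basket with butter also has bread, but only some baskets with bread have butter.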
Statistical Analysis