
Ralph M. Stair | George W. Reynolds

Chapter 5

Database Systems and Applications

© 2016 Cengage Learning®. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Principles and Learning Objectives: Data
Management and Modeling

• Data management and modeling are key
aspects of organizing data and information
– Define general data management concepts
and terms, highlighting the advantages of the
database approach to data management
– Describe logical and physical database
design considerations, the function of data
centers, and the relational database model

Principles and Learning Objectives:
Database Support in Decision Making

• A well-designed and well-managed
database is an extremely valuable tool in
supporting decision making
– Identify the common functions performed by
all database management systems, and
identify popular database management
systems

Principles and Learning Objectives:
Evolving Database Applications

• The number and types of database
applications will continue to evolve and
yield real business benefits
– Identify and briefly discuss business
intelligence, data mining, and other database
applications

Why Learn About Database Systems and
Applications?

• A huge amount of data is captured for
processing by computers every day
• Learning about database systems and
applications can help you make the most
effective use of information
• Databases and applications to extract and
analyze valuable information can help you
succeed in your career

Introduction

• Database: an organized collection of data
• A database management system (DBMS)
is a group of programs that:
– Manipulate the database
– Provide an interface between the database
and its users and other application programs

Data Management

• Without data and the ability to process it:
– An organization could not successfully
complete most business activities
• Data consists of raw facts
• Data must be organized in a meaningful
way to transform it into useful information

The Hierarchy of Data

• A bit (binary digit) represents a circuit that
is either on or off
• A byte is made up of eight bits
– Each byte represents a character
• Field: a name, number, or combination of
characters that describes an aspect of a
business object or activity

The Hierarchy of Data (cont’d.)

• Record: a collection of related data fields
• File: a collection of related records
• Database: a collection of integrated and
related files
• Hierarchy of data: bits, characters, fields,
records, files, and databases
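
The hierarchy above can be sketched in a few lines of Python. This is only an illustration: the Employee record, its field names, and the sample values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass

# Bit -> byte -> character: eight bits form one byte,
# and one byte can represent one character.
bits = "01000001"             # eight bits
byte_value = int(bits, 2)     # the byte as a number
character = chr(byte_value)   # the character that byte represents

# Field -> record: a record is a collection of related fields.
@dataclass
class Employee:               # hypothetical record layout
    employee_id: int          # field
    last_name: str            # field
    hire_date: str            # field

# Record -> file: a file is a collection of related records.
employee_file = [
    Employee(1, "Fiske", "2011-01-05"),
    Employee(2, "Buckley", "2013-02-17"),
]

# File -> database: a database is a collection of integrated, related files.
database = {"employees": employee_file, "departments": []}
```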

An Example of Hierarchy of Data

Data Entities, Attributes, and Keys

• Entity: a person, place, or thing for which
data is collected, stored, and maintained
• Attribute: a characteristic of an entity
• Data item: the specific value of an attribute
• Primary key: a field or set of fields that
uniquely identifies the record
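
As a minimal sketch of how a primary key uniquely identifies a record, the following uses Python's built-in sqlite3 module. The Employee table, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Entity: Employee. Attributes: EmployeeNum, LastName, DeptNum.
# EmployeeNum is declared as the primary key.
conn.execute(
    "CREATE TABLE Employee ("
    "EmployeeNum INTEGER PRIMARY KEY, LastName TEXT, DeptNum INTEGER)"
)

# Data items: the specific values stored for each attribute.
conn.execute("INSERT INTO Employee VALUES (1, 'Fiske', 257)")

# A second record with the same key value is rejected,
# because the key must identify exactly one record.
try:
    conn.execute("INSERT INTO Employee VALUES (1, 'Buckley', 650)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```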

Keys and Attributes

The Database Approach

• Traditional approach to data management
– Each distinct operational system used data
files dedicated to that system
• Database approach to data management
– Information systems share a pool of related
data

Database Approach to Data Management

Data Centers, Data Modeling and
Database Characteristics

• Considerations when building a database
– Content: what data should be collected and at
what cost?
– Access: what data should be provided to
which users and when?
– Logical structure: how should data be
arranged so that it makes sense?
– Physical organization: where should data be
physically located?

Data Modeling

• Data model: a diagram of data entities and
their relationships
• Enterprise data modeling: data modeling
done at the level of the entire enterprise
• Entity-relationship (ER) diagrams: data
models that use basic graphical symbols
to show the organization of and
relationships between data

Entity-Relationship (ER) Diagram for a
Customer Order Database

Relational Database Model

• Relational model: a simple but highly
useful way to organize data into collections
of two-dimensional tables called relations
• Relational model databases include:
– Oracle, IBM DB2, Microsoft SQL Server,
Microsoft Access, MySQL, and Sybase
• Domain: range of allowable values for a
data attribute
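
One way a relational DBMS can enforce a domain is with a CHECK constraint. The sketch below uses SQLite through Python's sqlite3 module; the PayRate attribute and its allowable range of 0 to 100 are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK clause restricts PayRate to its domain (assumed here: 0 to 100).
conn.execute(
    "CREATE TABLE Employee ("
    "EmployeeNum INTEGER PRIMARY KEY, "
    "PayRate REAL CHECK (PayRate >= 0 AND PayRate <= 100))"
)

# A value inside the domain is accepted.
conn.execute("INSERT INTO Employee VALUES (1, 25.0)")

# A value outside the domain is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Employee VALUES (2, 250.0)")
    out_of_domain_rejected = False
except sqlite3.IntegrityError:
    out_of_domain_rejected = True
```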

Relational Database Model (cont’d.)

Manipulating Data

• Selecting: eliminating rows according to
certain criteria
• Projecting: eliminating columns in a table
• Joining: combining two or more tables
• Linking: combining two or more tables
through common data attributes to form a
new table with only the unique data
attributes
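
The operations above map onto SQL queries. The following sketch runs them against SQLite from Python; the Project and Department tables and their rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DeptNum INTEGER PRIMARY KEY, DeptName TEXT);
CREATE TABLE Project (ProjectNum INTEGER PRIMARY KEY, Description TEXT, DeptNum INTEGER);
INSERT INTO Department VALUES (257, 'Accounting'), (598, 'Marketing');
INSERT INTO Project VALUES (155, 'Payroll', 257), (226, 'Sales Manual', 598);
""")

# Selecting: keep only the rows that meet a criterion.
selected = conn.execute(
    "SELECT * FROM Project WHERE DeptNum = 257"
).fetchall()

# Projecting: keep only some of the columns.
projected = conn.execute(
    "SELECT Description FROM Project ORDER BY ProjectNum"
).fetchall()

# Joining/linking: combine tables through the common attribute DeptNum.
joined = conn.execute(
    "SELECT Project.Description, Department.DeptName "
    "FROM Project JOIN Department ON Project.DeptNum = Department.DeptNum "
    "ORDER BY Project.ProjectNum"
).fetchall()
```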

Simplified ER Diagram

Linking Data Tables to Answer an Inquiry

Data Cleansing

• Also called data cleaning or data
scrubbing
• The process of detecting and then
correcting or deleting incomplete,
incorrect, inaccurate, or irrelevant records
that reside in a database
• The cost of performing data cleansing can
be quite high
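
A very small cleansing pass might look like the sketch below. The record layout and the three validation rules (non-empty name, an "@" in the email, an age between 0 and 120) are assumptions for illustration; real cleansing rules depend on the data.

```python
# Hypothetical customer records, some of them flawed.
raw_records = [
    {"name": "Ann Lee", "email": "ann@example.com", "age": 34},
    {"name": "", "email": "bob@example.com", "age": 41},         # incomplete
    {"name": "Cy Ortiz", "email": "cy-at-example", "age": 29},   # incorrect
    {"name": "Di Park", "email": "di@example.com", "age": -5},   # inaccurate
]

def is_clean(record):
    """Return True when the record passes every (assumed) validation rule."""
    has_name = bool(record["name"].strip())
    has_valid_email = "@" in record["email"]
    has_plausible_age = 0 <= record["age"] <= 120
    return has_name and has_valid_email and has_plausible_age

# Detect flawed records, then keep the clean ones and set the rest aside
# for correction or deletion.
clean = [r for r in raw_records if is_clean(r)]
rejected = [r for r in raw_records if not is_clean(r)]
```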

Tradeoff of Cost Versus Accuracy

Database Management Systems

• Creating and implementing the right
database system ensures that the
database will support both business
activities and goals
• Capabilities and types of database
systems vary considerably

Overview of Database Types

• Single-user DBMS
– Installed on a personal computer and meant
for a single user
– Examples: Microsoft Access and InfoPath,
Lotus Approach, and Personal Oracle
• Multiple-user DBMS
– Allows dozens or hundreds of people to
access the same system at the same time
– Vendors: Oracle, Microsoft, Sybase, and IBM
Overview of Database Types (cont’d.)

• Flat file
– Simplest database program
– The records have no relationship to one
another
– Store and manipulate a single table or file
– Examples: OneNote and Evernote

SQL Databases

• SQL: a special-purpose programming
language for accessing and manipulating
data stored in a relational database
• SQL databases conform to ACID
properties:
– Atomicity, consistency, isolation, and durability
• 1986: SQL was adopted by ANSI as the
standard query language for relational
databases
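
Atomicity, the first ACID property, can be seen in a short sketch with Python's sqlite3 module: a transfer either applies both updates or neither. The Account table, the balances, and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    "Name TEXT PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))"
)
conn.execute("INSERT INTO Account VALUES ('A', 100), ('B', 0)")
conn.commit()

def transfer(amount):
    """Move amount from A to B as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE Account SET Balance = Balance + ? WHERE Name = 'B'",
                (amount,),
            )
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE Name = 'A'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = transfer(60)    # A has enough funds, so both updates are committed
second_ok = transfer(60)   # A would go negative, so the credit to B is rolled back
balances = dict(conn.execute("SELECT Name, Balance FROM Account"))
```

In the failed transfer the credit to B runs first and is then undone, which is the point of the example: a partial update never becomes visible.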
Table 5.1 Examples of SQL Commands

SQL Command:
SELECT ClientName, Debt FROM Client WHERE Debt > 1000
Description:
This query displays all clients (ClientName) and the amount
they owe the company (Debt) from a database table called
Client for clients who owe the company more than $1,000
(WHERE Debt > 1000).

SQL Command:
SELECT ClientName, ClientNum, OrderNum FROM Client, Order
WHERE Client.ClientNum=Order.ClientNum
Description:
This command is an example of a join command that combines
data from two tables: the Client table and the Order table
(FROM Client, Order). The command creates a new table with
the client name, client number, and order number (SELECT
ClientName, ClientNum, OrderNum). Both tables include the
client number, which allows them to be joined. This ability is
indicated in the WHERE clause, which states that the client
number in the Client table is the same as (equal to) the client
number in the Order table (WHERE
Client.ClientNum=Order.ClientNum).

SQL Command:
GRANT INSERT ON Client to Guthrie
Description:
This command is an example of a security command. It allows
Bob Guthrie to insert new values or rows into the Client table.
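
The first two commands in Table 5.1 can be tried against SQLite from Python, as sketched below. The sample rows are invented. Two adjustments are needed for SQLite specifically: Order is a reserved word, so the table name is quoted, and ClientNum is written as Client.ClientNum in the SELECT list because the column exists in both tables. The GRANT command is not run here because SQLite does not implement GRANT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Client (ClientNum INTEGER PRIMARY KEY, ClientName TEXT, Debt REAL);
CREATE TABLE "Order" (OrderNum INTEGER PRIMARY KEY, ClientNum INTEGER);
INSERT INTO Client VALUES (1, 'Acme', 2500), (2, 'Bolt', 300);
INSERT INTO "Order" VALUES (10, 1), (11, 2);
""")

# First command: clients who owe more than $1,000.
big_debtors = conn.execute(
    "SELECT ClientName, Debt FROM Client WHERE Debt > 1000"
).fetchall()

# Second command: join Client and Order on the shared client number.
client_orders = conn.execute(
    'SELECT ClientName, Client.ClientNum, OrderNum '
    'FROM Client, "Order" '
    'WHERE Client.ClientNum = "Order".ClientNum '
    'ORDER BY OrderNum'
).fetchall()
```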

NoSQL Databases

• A database designed to store and retrieve
data in a manner that does not rigidly
enforce the ACID conditions associated
with the relational database model
– Provides faster performance and greater
scalability
• Examples
– Cassandra used by Facebook
– DynamoDB used by Amazon

Visual, Audio, and Other Database
Systems

• Visual databases store images of charge
slips, X-rays, and vital records
– Images can be stored in some object-
relational databases or special-purpose
database systems
• Spatial databases provide location-based
services
– Maps are embedded into a Web site’s Web
applications and operational systems
Database Activities

• Providing a user view of the database
• Adding and modifying data
• Storing and retrieving data
• Manipulating the data and generating
reports

Providing a User View

• Schema: a description of the entire
database
• A schema can be part of the database or a
separate schema file
• The DBMS can reference a schema to find
where to access the requested data in
relation to another piece of data

Creating and Modifying the Database

• Data definition language (DDL)
– A collection of instructions and commands
used to define and describe data and
relationships in a specific database
– Allows the database’s creator to describe data
and relationships that are to be contained in
the schema
• Data dictionary: a detailed description of
all the data used in the database
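
As a rough illustration, the sketch below issues one DDL statement (CREATE TABLE) in SQLite and then reads the column descriptions back with PRAGMA table_info to build a simple data-dictionary-like summary. The Part table and its columns are hypothetical, and a real data dictionary usually records much more (owner, description, allowed values, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data before any rows exist.
conn.execute(
    "CREATE TABLE Part ("
    "PartNo INTEGER PRIMARY KEY, "
    "PartName TEXT NOT NULL, "
    "UnitCost REAL)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(Part)").fetchall()

# Summarize each column as a small dictionary entry.
dictionary = {
    name: {"type": col_type, "required": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in columns
}
```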
Data Definition Language

Data Dictionary Entry

Storing and Retrieving Data

• When an application program needs data,
it requests the data through the DBMS
• Concurrency control deals with the
situation in which two or more users or
applications need to access the same
record at the same time
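
Locking is one common technique for concurrency control. The sketch below is not how any particular DBMS implements it; it only shows the idea with Python threads, where a lock lets one "user" at a time update a shared record. The inventory record and the add_stock function are hypothetical.

```python
import threading

# A shared record that several users want to update at the same time.
inventory = {"widget": 0}
record_lock = threading.Lock()

def add_stock(times):
    """Increment the shared record, holding the lock for each update."""
    for _ in range(times):
        with record_lock:  # only one thread may update the record at a time
            inventory["widget"] += 1

# Four concurrent users, each making 10,000 updates.
threads = [threading.Thread(target=add_stock, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every update happens while the lock is held, no update is lost and the final count equals the total number of updates requested.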

Logical and Physical Access Paths

Manipulating Data and Generating
Reports

• Query by Example (QBE) is a visual
approach to developing database queries
or requests
• Data manipulation language (DML): a
specific language, provided with a DBMS
– Allows users to access and modify the data,
to make queries, and to generate reports

Manipulating Data and Generating
Reports (cont’d.)

• Once a database has been set up and
loaded with data, it can produce reports,
documents, and other outputs
• A DBMS can produce a wide variety of
documents, reports, and other output that
can help organizations achieve their goals

Database Administration

• Database administrator (DBA): a skilled and trained IS professional
– Works with users to define their data needs
– Applies database programming languages to
craft a set of databases to meet those needs
– Tests and evaluates databases
– Implements changes to improve their
databases’ performance
– Assures that data is secure from unauthorized
access
Database Administration (cont’d.)

• Data administrator: a nontechnical position
responsible for defining and implementing
consistent principles for a variety of data
issues
• The data administrator can be a high-level
position reporting to top-level managers

Table 5.2 Popular Database Management Systems

Open-Source        Relational DBMS for    Relational DBMS for     NoSQL DBMS
Relational DBMS    Individuals and        Workgroups and
                   Workgroups             Enterprise

MySQL              Microsoft Access       Oracle                  MongoDB
PostgreSQL         IBM Lotus Approach     IBM DB2                 Cassandra
MariaDB            Google Base            Sybase                  Redis
SQLite             OpenOffice Base        Teradata                CouchDB
                                          Microsoft SQL Server
                                          Progress OpenEdge

Popular Database Management Systems

• Database as a Service (DaaS)
– The database is stored on a service provider’s
servers
– The database is accessed by the client over a
network, typically the Internet
– Database administration is handled by the
service provider
• Example of DaaS: Amazon Relational
Database Service (Amazon RDS)
Using Databases with Other Software

• DBMSs can act as front-end or back-end
applications
– Front-end applications interact directly with
people
– Back-end applications interact with other
programs or applications
• Spin-off database applications include:
– Big data, data warehouses and data marts,
and business intelligence
Big Data

• Extremely large and complex data
collections
– Traditional data management software,
hardware, and analysis processes are
incapable of dealing with them
• Three characteristics of big data
– Volume
– Velocity
– Variety
Table 5.3 Big Data Generators

Source                         Magnitude of Data Generated

Large Hadron particle          40 terabytes of data per second
accelerator at CERN
Commercial aircraft engines    More than 1 petabyte per day of sensor data
Cell phones                    More than 5 billion people worldwide are making
                               cell phone calls, exchanging text messages, and
                               accessing Web sites
YouTube                        48 hours of video uploaded per minute
Facebook                       100 terabytes uploaded per day
Twitter                        500 million tweets per day
RFID tags                      1,000 times the volume of data generated by bar
                               codes

Challenges of Big Data

• How to choose what subset of the data to
store
• Where and how to store the data
• How to find the nuggets of data that are
relevant to the decision making at hand
• How to derive value from the relevant data

In-Memory Databases

• A database management system that
stores the entire database in random
access memory (RAM)
• Enables the analysis of big data and other
challenging data-processing applications
• Performs best on multiple multicore CPUs

Data Warehouses and Data Marts

• Data warehouse: a large database that
collects business information from many
sources in the enterprise in support of
management decision making
• ETL process
– Extract
– Transform
– Load
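
The three ETL steps can be sketched as three small Python functions. Everything specific here is an assumption for illustration: the CSV layout, the cleaning rules (trim and lowercase the region, skip rows with no amount), and the Sales table standing in for a warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data, as it might arrive from an operational system.
source_csv = "order_id,region,amount\n1, east ,100.50\n2,WEST,20\n3,east,\n"

def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: standardize formats and drop rows that fail validation."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip rows with a missing amount
        cleaned.append(
            (int(row["order_id"]), row["region"].strip().lower(), float(row["amount"]))
        )
    return cleaned

def load(rows, conn):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Sales "
        "(OrderId INTEGER PRIMARY KEY, Region TEXT, Amount REAL)"
    )
    conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(transform(extract(source_csv)), warehouse)
loaded = warehouse.execute("SELECT * FROM Sales ORDER BY OrderId").fetchall()
```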

Elements of a Data Warehouse

Data Warehouses and Data Marts
(cont’d.)

• Data mart: a subset of a data warehouse
that is used by small- and medium-sized
businesses and departments within large
companies to support decision making
• A specific area in the data mart might
contain more detailed data than the data
warehouse

Business Intelligence

• A broad range of technologies and
applications that enable an organization to
transform mostly structured data obtained
from information systems to perform
analysis, generate information, and improve
the decision making of the organization

Business Intelligence (cont’d.)

• Technologies include:
– Data mining
– Online analytical processing
– Predictive analytics
– Data visualization
– Competitive intelligence

Data Mining

• An information-analysis tool that involves
the automated discovery of patterns and
relationships in a data warehouse
• Provides bottom-up, discovery-driven
analysis

Online Analytical Processing (OLAP)

• A form of analysis that allows users to
explore data from a number of
perspectives, enabling a style of analysis
known as “slicing and dicing”
• Provides top-down, query-driven data
analysis
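
"Slicing and dicing" can be illustrated by totaling the same facts along different dimensions. The sketch below uses plain Python rather than an OLAP product; the sales rows and the total_by helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions: region, product, quarter.
sales = [
    {"region": "East", "product": "Laptop", "quarter": "Q1", "amount": 100},
    {"region": "East", "product": "Phone", "quarter": "Q1", "amount": 50},
    {"region": "West", "product": "Laptop", "quarter": "Q1", "amount": 70},
    {"region": "West", "product": "Laptop", "quarter": "Q2", "amount": 30},
]

def total_by(rows, dimension):
    """Sum the amount for each distinct value of the chosen dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row["amount"]
    return dict(totals)

# The same data viewed from two perspectives.
by_region = total_by(sales, "region")
by_product = total_by(sales, "product")

# A slice: fix one dimension (quarter = Q1), then analyze what remains.
q1_slice = [row for row in sales if row["quarter"] == "Q1"]
q1_by_region = total_by(q1_slice, "region")
```

The analysis is query-driven: the user decides which dimension to total and which slice to take.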

Table 5.6 Comparison of OLAP and Data Mining

Characteristic       OLAP                          Data Mining

Purpose              Supports data analysis and    Supports data analysis and
                     decision making               decision making

Type of analysis     Top-down, query-driven data   Bottom-up, discovery-driven
supported            analysis                      data analysis

Skills required      Must be very                  Must trust in data-mining
of user              knowledgeable of the data     tools to uncover valid and
                     and its business context      worthwhile hypotheses

Predictive Analysis

• Also called predictive analytics
• A form of data mining that combines
historical data with assumptions about
future conditions to predict outcomes of
events, e.g., future product sales or the
probability that a customer will default on a
loan
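
A very simple instance of this idea is fitting a straight-line trend to past sales and extending it one period ahead. The sketch below uses ordinary least squares in plain Python; the sales history is invented, and the assumption about future conditions is that the linear trend continues. Real predictive models are usually far richer.

```python
# Hypothetical history: (period, units sold).
history = [(1, 100.0), (2, 120.0), (3, 140.0), (4, 160.0)]

n = len(history)
mean_x = sum(x for x, _ in history) / n
mean_y = sum(y for _, y in history) / n

# Ordinary least-squares slope and intercept for y = intercept + slope * x.
slope = sum((x - mean_x) * (y - mean_y) for x, y in history) / sum(
    (x - mean_x) ** 2 for x, _ in history
)
intercept = mean_y - slope * mean_x

def forecast(period):
    """Predict sales for a future period, assuming the trend continues."""
    return intercept + slope * period

next_period_sales = forecast(5)
```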

Data Visualization

• In analyzing data, charts and graphs make
it easier to:
– See trends and patterns
– Identify opportunities for further analysis
• Software examples:
– Excel and SAS Visual Analytics

Data Visualization: Social Graph Analysis

• A data visualization technique in which
data is represented as networks
– Vertices are the individual data points (social
network users)
– Edges are the connections among the
vertices
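
The vertices and edges described above can be held in a small adjacency structure. In this sketch the user names and connections are invented, and counting each user's connections is just one simple measure that could be drawn from such a graph.

```python
# Edges: connections between users (hypothetical data).
edges = [("ana", "ben"), ("ana", "cal"), ("ben", "cal"), ("cal", "dee")]

# Build an undirected graph: each vertex maps to the set of its neighbors.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

# Degree: how many connections each vertex (user) has.
degree = {user: len(friends) for user, friends in graph.items()}

# The user with the most connections.
most_connected = max(degree, key=degree.get)
```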

Data Visualization: Key Performance
Indicators (KPIs) and Dashboards

• KPIs: quantifiable measurements that
assess progress toward organizational
goals and reflect the critical success
factors of an organization
• Dashboard: a data visualization tool that
displays the current status of the key
performance indicators (KPIs) for an
organization

Competitive Intelligence

• Competitive intelligence encompasses
information about competitors and the
ways that knowledge affects strategy,
tactics, and operations
• Counterintelligence describes the steps an
organization takes to protect information
sought by “hostile” intelligence gatherers

Summary – Principle 1

• Data is one of the most valuable resources
that a firm possesses
• An entity is an object for which data is
collected, stored, and maintained
• Database considerations: content, access,
logical structure, and physical organization
• The relational model places data in two-
dimensional tables

Summary – Principle 2

• A DBMS is a group of programs used as
an interface between a database and its
users and other application programs
• DBMS basic functions include:
– Providing user views
– Creating and modifying the database
– Storing and retrieving data
– Manipulating data and generating reports

Summary – Principle 2 (cont’d.)

• After a DBMS has been installed, the
database can be accessed, modified, and
queried via a data manipulation language
• A database administrator (DBA) plans,
designs, creates, operates, secures,
monitors, and maintains databases
• Database as a Service (DaaS) is a new
form of database service

Summary – Principle 3

• “Big data” is the term used to describe
enormous and complex data collections
• Data warehouses are relational DBMSs
specifically designed to support
management decision making

Summary – Principle 3 (cont’d.)

• Data mining allows the automated
discovery of patterns and relationships in a
data warehouse
• Counterintelligence describes the steps an
organization takes to protect information
sought by “hostile” intelligence gatherers
