
Chapter 14

The Second Component: The Database

The Importance of the Database
• The database is where an organization stores
content for instantaneous retrieval when needed
– Data, documents, pictures, or anything else that can
be represented in a computer is stored in a database
managed by a database management system (DBMS)
• A business depends on information/knowledge to
operate and the database is one of its most
valuable resources
• Databases make technology powerful: they can be
searched at incredible speeds
• How?
– Retrieve?
– Correlate?
– Go beyond data mining to semantics?

Storage of Data
• The first computers stored information on paper
tape, punched cards, and magnetic tape
• The development of the magnetic disk changed
processing dramatically
• Storage options range from RAM to flash memory to
secondary storage
• Programs bring data from the disk (or secondary
memory) into primary memory for processing
– Disk access, however, is up to a million times slower
than primary memory access

Database Management Software
(DBMS)
• DBMSs automate tasks associated with using
direct-access files
• DBMS administrators
– describe the database and its required indexes
– define records
• Individual programs ask for specific pieces of data
– only programs that access a piece of data need to be
changed when that data changes

DBMS Requirements
• A DBMS must provide the following:
– A method for defining the contents of the database
– A way to describe relationships among data elements
and records
– A mechanism to set up the database in the first place
– Ways to manipulate the data including
• Updating (adding, modifying, and/or deleting
information)
• Retrieval
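
The four requirements above can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite stands in for a DBMS in general, and the table names, column names, and sample rows are assumptions, not part of the slides.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# 1. Define the contents of the database.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL)"
)
# 2. Describe a relationship among records: each enrollment refers to a student.
conn.execute(
    "CREATE TABLE enrollment ("
    " student_id INTEGER NOT NULL REFERENCES student(student_id),"
    " class_code TEXT NOT NULL)"
)

# 3. Set up the database with its initial data.
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Lopez"), (2, "Chen")])
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)", [(1, "MIS101"), (2, "MIS101")]
)

# 4. Manipulate the data: modify a record, delete a record, then retrieve.
conn.execute(
    "UPDATE student SET last_name = ? WHERE student_id = ?", ("Lopez-Ruiz", 1)
)
conn.execute("DELETE FROM enrollment WHERE student_id = ?", (2,))
rows = conn.execute(
    "SELECT student_id, last_name FROM student ORDER BY student_id"
).fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
```

Here rows holds both students with the modified last name, and remaining counts the enrollment records left after the delete.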

Benefits of the Relational Database
Model
• Data are organized in two-dimensional tables,
which are easy to develop and understand
• The structure can be described mathematically
– each table represents a relation
• Columns can be extracted from tables, and tables
can even be joined
• Relational databases are easy to use
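
One way to see the mathematical structure is to treat a relation as a set of rows and write column extraction (projection) and join as small set operations. The sketch below is a teaching aid in plain Python, not how any real DBMS is built; the helper names relation, project, and join and the sample data are assumptions.

```python
def relation(rows):
    """Build a relation: a set of rows, each a sorted tuple of (attribute, value)."""
    return {tuple(sorted(r.items())) for r in rows}


def project(rel, attrs):
    """Keep only the named columns; duplicate rows collapse because the result is a set."""
    return {tuple((a, v) for a, v in row if a in attrs) for row in rel}


def join(left, right, key):
    """Combine rows from two relations that agree on the key attribute."""
    out = set()
    for l_row in left:
        for r_row in right:
            l_dict, r_dict = dict(l_row), dict(r_row)
            if l_dict[key] == r_dict[key]:
                out.add(tuple(sorted({**l_dict, **r_dict}.items())))
    return out


students = relation([
    {"student_id": 1, "last_name": "Lopez"},
    {"student_id": 2, "last_name": "Chen"},
])
enrollments = relation([
    {"student_id": 1, "class_code": "MIS101"},
    {"student_id": 1, "class_code": "ACC200"},
])

names = project(students, {"last_name"})          # one column extracted
joined = join(students, enrollments, "student_id")  # rows matched on the key
```

Because student 2 has no enrollment, only student 1 appears in joined, once per class.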

An Example
• Records consist of related data fields
– Student number, student last name, student first name,
address line 1, address line 2, city, state, zip code,
phone number
• Each field consists of one data element and the
size of the field is the same for each record
• Index or key fields make it easier to search
records
– Student ID number
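
A minimal sketch of fixed-size records with an index on the key field, using Python's struct module. The field widths, the subset of fields, and the sample students are assumptions chosen for illustration; the slide does not specify them.

```python
import struct

# Assumed fixed-length layout: 4-byte student number, 20-byte last name,
# 15-byte first name, 2-byte state. Every record is the same size.
RECORD = struct.Struct("<I20s15s2s")


def pack(number, last, first, state):
    """Encode one student as a fixed-size record (short strings are zero-padded)."""
    return RECORD.pack(number, last.encode(), first.encode(), state.encode())


def _text(raw):
    """Strip the zero padding from a fixed-width text field."""
    return raw.rstrip(b"\x00").decode()


def unpack(raw):
    number, last, first, state = RECORD.unpack(raw)
    return number, _text(last), _text(first), _text(state)


students = [
    (1003, "Lopez", "Ana", "NY"),
    (1001, "Chen", "Wei", "CA"),
    (1002, "Okafor", "Ada", "TX"),
]
data = b"".join(pack(*s) for s in students)  # stands in for a direct-access file

# Index on the key field: student number -> byte offset of that record.
index = {s[0]: i * RECORD.size for i, s in enumerate(students)}


def lookup(number):
    """Jump straight to the record instead of scanning every record."""
    offset = index[number]
    return unpack(data[offset:offset + RECORD.size])
```

Because every record has the same size, the index only needs to store an offset; lookup(1001) reads one record without touching the others.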

Microsoft Access RDBMS
• Can create relations and add data to them
– E.g., Student and Class
• Query for information based on criteria
• Join relations on some key
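
The same three steps can be tried outside Access. The sketch below uses Python's sqlite3 module as a stand-in RDBMS; it is not Access itself, and the columns and rows of the Student and Class relations are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create the two relations and add data to them (sample data is hypothetical).
conn.executescript("""
    CREATE TABLE class (class_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY, last_name TEXT, class_id TEXT
    );
    INSERT INTO class VALUES
        ('MIS101', 'Information Systems'), ('ACC200', 'Accounting');
    INSERT INTO student VALUES
        (1, 'Lopez', 'MIS101'), (2, 'Chen', 'ACC200'), (3, 'Okafor', 'MIS101');
""")

# Query based on a criterion: students in one class.
in_mis = conn.execute(
    "SELECT last_name FROM student WHERE class_id = ? ORDER BY last_name",
    ("MIS101",),
).fetchall()

# Join the two relations on their shared key, class_id.
roster = conn.execute(
    "SELECT s.last_name, c.title FROM student AS s "
    "JOIN class AS c ON s.class_id = c.class_id ORDER BY s.student_id"
).fetchall()
```

The join replaces each student's class_id with the matching class title, which neither table holds on its own.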

Object-Oriented Databases
• Traditionally, relational databases supported a
limited number of data types
– Alphabetic, numeric, date, and time
• Modern organizations use a variety of data
– Graphics objects, audio clips, videos, subscripted
arrays, and complex data for data mining
• RDBMS vendors have extended their packages to
handle such data objects

Structured Query Language (SQL)
• A retrieval language for users
• Basic structure of a SQL expression
– The select clause lists the attributes desired in answer
to a query
– The from clause is a list of relations or tables that the
query language processor should consult in filling the
request
– The where clause states the conditions that rows must
satisfy to appear in the answer
• SQL is used as an intermediary and a standard in
accessing several different database systems
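
A small query shows the three clauses together: select picks the columns, from names the table, and where filters the rows. The sketch runs the SQL through Python's sqlite3 module; the student table and its rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER, last_name TEXT, city TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "Lopez", "Albany", "NY"),
        (2, "Chen", "Fresno", "CA"),
        (3, "Okafor", "Buffalo", "NY"),
    ],
)

query = """
    SELECT last_name, city      -- select: the attributes wanted in the answer
    FROM student                -- from: the table(s) to consult
    WHERE state = 'NY'          -- where: the condition each row must meet
    ORDER BY last_name
"""
answer = conn.execute(query).fetchall()
```

Only the two New York students come back, and only the two requested columns of each.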

Oracle: An Enterprise DBMS
• Oracle DBMS architectures are server-centric
• Extended relational data model that supports
many different data types and uses SQL for
queries
• Typically supports thousands of users, processes
terabytes of data, and integrates with Oracle
application packages in financial management,
supply chain management, manufacturing, etc.

Distributed Databases
• Different parts of the database are located on
different computers in a network
• Distributed databases raise several issues
– Will data be replicated across computers, or will there
be only one copy?
– If data are replicated, how frequently must different
versions be updated to reflect changes?
– How will updates to the database be coordinated so that
integrity is maintained?
– Who “owns” distributed data, and who has access to it?
• Distributed databases offer users easier access to
data at the cost of higher overall system complexity

The Data Warehouse
• Businesses collect a tremendous amount of
transactions data from routine operations
• These data can be analyzed to understand the
business better
– Requires multidimensional analysis called Online
Analytical Processing (OLAP)
– Helps create a learning organization that is better able to
understand its markets, customers and itself
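
Multidimensional analysis can be pictured as totaling the same transaction facts along different combinations of dimensions. The sketch below is a toy roll-up in plain Python, not an OLAP product; the dimensions (region, product, quarter), the roll_up helper, and the sales figures are all assumptions.

```python
from collections import defaultdict

# Hypothetical transaction facts: (region, product, quarter, sales amount).
facts = [
    ("East", "Laptop", "Q1", 1200),
    ("East", "Phone", "Q1", 800),
    ("West", "Laptop", "Q1", 1500),
    ("East", "Laptop", "Q2", 1000),
    ("West", "Phone", "Q2", 700),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(rows, keep):
    """Total sales along the dimensions named in `keep`, summing over the rest."""
    positions = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[p] for p in positions)] += row[-1]
    return dict(totals)


by_region = roll_up(facts, ["region"])
by_region_quarter = roll_up(facts, ["region", "quarter"])
```

The same facts answer different questions depending on which dimensions are kept: by_region gives one total per region, while by_region_quarter splits each region's total by quarter.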

Data Mining
• Discovers interesting structure in large amounts
of data
• This structure consists of
– Patterns
– Statistical or predictive models of the data
– Relationships between the data
• Applied extensively to customer data
– Allows firms to determine for instance which products
sell together
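
As a rough illustration of finding products that sell together, the sketch below counts how often each pair of items appears in the same basket. It is simple co-occurrence counting on made-up baskets, not a full data mining algorithm, and every item name is an assumption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical customer baskets, one set of products per transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

# Count every pair of products bought in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support: the share of all baskets that contain both products.
support = {pair: n / len(baskets) for pair, n in pair_counts.items()}
top_pair, top_count = pair_counts.most_common(1)[0]
```

In this sample, bread and butter appear together in three of the four baskets, so that pair has the highest support.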

Reasons for Data Mining
• Increasing business unit and overall profitability
• Understanding customer desires and needs
• Identifying profitable customers and acquiring new ones
• Retaining customers and increasing loyalty
• Increasing ROI and reducing costs on promotions
• Cross-selling and up-selling
• Detecting fraud, waste, and abuse
• Determining credit risks
• Increasing web site profitability
• Increasing store traffic and optimizing store layouts
• Monitoring business performance

Approaches to Data Mining
• Visualization: a graph can reveal a lot
• Statistical techniques
• Search and optimization
• Artificial intelligence (e.g., neural networks)

Databases and the Organization
• The typical organization has many databases
– Some are organized while others are loose collections
of information
• A manager is responsible for the creation,
maintenance, and protection of data
• Databases are the firm’s memory and allow it to
remain in business
• Databases provide the firm with incredible
opportunities

Summary
• Organizations keep tremendous amounts of
machine-readable data
• Data in files are stored in records, which
consist of fields, which in turn contain groups of
characters
• The DBMS automates the task of setting up a
database
• The relational model is the dominant
DBMS model today
• Data warehouses and data mining can
contribute significantly to firm success
