
Database

Week 1 - Database concepts and systems


Dr. Hasan TINMAZ
Woosong University
Endicott College of International Studies
Why Databases?
 In today’s world, data is ubiquitous (abundant, global, everywhere)
and pervasive (inescapable, prevalent, persistent).
 From birth to death, we generate and consume data.
 The trail of data starts with the birth certificate and continues all
the way to a death certificate (and beyond!).
 In between, each individual produces and consumes enormous
amounts of data.
 As you see, databases are the best way to store and manage data.
 Databases make data persistent and shareable in a secure way.

Why Databases?
➢ Data is not only ubiquitous and pervasive; it
is also essential for organizations to survive
and prosper.
➢ Imagine trying to operate a business without
knowing who your customers are, what
products you are selling, who is working for
you, who owes you money, and to whom you
owe money.
➢ All businesses must keep this type of data and
much more.
➢ At the heart of all these systems are the
collection, storage, aggregation,
manipulation, dissemination, and
management of data.
➢ Databases are specialized structures that
allow computer-based systems to store,
manage, and retrieve data very quickly.
➢ Virtually all modern business systems rely on
databases.
Data is the new oil…

Data vs. Information
Data
 Raw facts
 Raw data: not yet processed to reveal its meaning
 Building blocks of information
 Raw data must be properly formatted for storage, processing, and presentation (e.g., DD/MM/Y)
 Data management
 Generation, storage, and retrieval of data
Information
 Produced by processing data
 Reveals the meaning of data
 To reveal meaning, information requires context
 Enables knowledge creation
 Should be accurate, relevant, and timely to enable good decision making

Data / Information / Knowledge / Wisdom Pyramid
 Data is conceived of as symbols or signs,
representing stimuli or signals.
 Information is defined as data that are endowed
with meaning and purpose.
 Knowledge is a fluid mix of framed experience,
values, contextual information, expert insight and
grounded intuition that provides an environment
and framework for evaluating and incorporating
new experiences and information.
 It originates and is applied in the minds of
knowers.
 In organizations it often becomes embedded
not only in documents and repositories but
also in organizational routines, processes,
practices and norms.
 Wisdom is the ability to increase effectiveness.
 Wisdom adds value, which requires the mental
function that we call judgment.
 The ethical and aesthetic values that this
implies are inherent to the actor and are
unique and personal.
 Example 1 - Imagine the string “WifiPassword”. The string alone is data. Understanding
that it is a string is information. Knowing it is your wifi password is knowledge. And
using it to access your wireless network is wisdom.

 Example 2 - Fitness tracking devices collect your health and activity data, but your end
goal is to use that to make decisions about how to train or how to manage your health.
 Data: The smartwatch collects raw data such as the number of steps taken, heart rate, and
sleep duration.
 Information: The smartwatch app organizes and structures the data, displaying it in a
comprehensible format, such as daily step count, average heart rate, and hours of sleep
per night.
 Knowledge: Analyzing and interpreting the information may reveal patterns, such as
increased step count leading to improved sleep quality or a correlation between heart rate
and workout intensity.
 Wisdom: Understanding these patterns lets you make informed decisions about adjusting
your exercise routine, sleep habits, and other lifestyle factors to improve your health and
fitness.

Database
 … is a shared, integrated computer structure that stores a
collection of:
 End-user data - Raw facts of interest to end user
 Metadata - Data about data, through which the end-user data are integrated
and managed
 Describe data characteristics and relationships
 For example, the metadata component stores information such as the name
of each data element, the type of values (numeric, dates, or text) stored on
each data element, and whether the data element can be left empty.
 The metadata provides information that complements and expands the value
and use of the data.
 In short, metadata presents a more complete picture of the data in the
database.
 Given the characteristics of metadata, you might hear a database described as
a “collection of self-describing data.”
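
The idea of self-describing data can be seen in a small sketch. It assumes SQLite (through Python's built-in `sqlite3` module) as the DBMS; the `customer` table and its columns are hypothetical and exist only for illustration. The database is asked to describe its own structure: the name of each data element, its type, and whether it may be left empty.

```python
import sqlite3

# Hypothetical table, created only to illustrate metadata.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        birth_date  TEXT,
        balance     REAL NOT NULL
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
metadata = [
    {"name": row[1], "type": row[2], "required": bool(row[3])}
    for row in conn.execute("PRAGMA table_info(customer)")
]

# Each entry is "data about data": element name, value type, may-be-empty flag.
for column in metadata:
    print(column)
```

Other DBMS products expose the same kind of metadata through their own catalog views, so the exact command shown here is specific to SQLite.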

Database management system (DBMS)
 is a collection of programs that manages the database structure and
controls access to the data stored in the database.
 Roles of the DBMS:
 Intermediary between the user and the database,
 Enables data to be shared,
 Presents the end user with an integrated view of the data,
 Receives and translates application requests into operations
required to fulfill the requests,
 Hides database’s internal complexity from the application
programs and users.

The DBMS Manages the Interaction between the
End User and the Database

Advantages of the DBMS
• Better data integration (how actions in one segment of the company
affect other segments) and less data inconsistency
– Data inconsistency: Different versions of the same data appear in different
places
 For example, when a company’s sales department stores a sales
representative’s name as Bill Brown and the company’s personnel department
stores that same person’s name as William B. Brown, or
 When the company’s regional sales office shows the price of a product as
$45.95, and its national sales office shows the same product’s price as
$43.95.
• Increased end-user productivity
– The availability of data, combined with the tools that transform data into
usable information, empowers end users to make quick, informed decisions
that can make the difference between success and failure in the global
economy.
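
The name mismatch described above (Bill Brown versus William B. Brown) can be sketched as a simple check. The record layout, the `rep_id` key, and the helper `is_inconsistent` are assumptions made for illustration; they are not part of any particular DBMS.

```python
# Hypothetical records kept separately by two departments.
sales_dept = {"rep_id": 101, "rep_name": "Bill Brown"}
personnel_dept = {"rep_id": 101, "rep_name": "William B. Brown"}


def is_inconsistent(a: dict, b: dict, key: str, field: str) -> bool:
    """Return True when two records describe the same entity (same key)
    but disagree on the given field."""
    return a[key] == b[key] and a[field] != b[field]


# The same person appears under two different names.
print(is_inconsistent(sales_dept, personnel_dept, "rep_id", "rep_name"))
```

In an integrated database the name would be stored once and shared by both departments, so there would be nothing to compare.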

• Improved:
 Data sharing
 Data security
 Data access
 The DBMS makes it possible to produce quick answers to ad hoc
queries.
 From a database perspective, a query is a specific request (for example,
written in SQL) issued to the DBMS for data manipulation, such as
reading or updating the data.
 Decision making
 Data quality: Promoting accuracy, validity, and timeliness of data
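
A minimal sketch of two ad hoc queries, one that reads data and one that updates it. It assumes SQLite via Python's `sqlite3` module; the `product` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (code TEXT PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("P1", "Bolt", 0.25), ("P2", "Nut", 0.10), ("P3", "Washer", 0.05)],
)

# A read query: which products cost less than 0.20?
cheap = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM product WHERE price < ? ORDER BY name", (0.20,)
    )
]

# An update query: change the price of one product, then read it back.
conn.execute("UPDATE product SET price = ? WHERE code = ?", (0.30, "P1"))
new_price = conn.execute(
    "SELECT price FROM product WHERE code = 'P1'"
).fetchone()[0]

print(cheap, new_price)
```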

Types of Databases
 Over the years, as technology and innovative uses of databases have evolved,
different methods have been used to classify databases.
 For example, databases can be classified by the number of users supported, where
the data is located, the type of data stored, the intended data usage, and the degree
to which the data is structured.
By the number of users:
 Single-user database: Supports one user at a time
 Desktop database: Runs on a PC
 Multiuser database: Supports multiple users at the same time
 Workgroup database: Supports a small number of users (fewer than 50) or a
specific department
 Enterprise database: Supports many users (more than 50) across many
departments

By location:
 Centralized database: Data is located at a single site
 Distributed (decentralized) database: Data is distributed
across different sites
 Cloud database: Created and maintained using cloud data services
(such as Microsoft Azure or Amazon AWS) that provide defined performance
measures for the database.

By the type of data stored:
 General-purpose databases: Contain a wide variety of data used
in multiple disciplines
 For example, a census database that contains general demographic data, and the
LexisNexis and ProQuest databases that contain newspaper, magazine, and
journal articles on a variety of topics.
 Discipline-specific databases: Contain data focused on specific
subject areas
 Examples of discipline-specific databases are financial data stored in databases
such as CompuStat or CRSP (Center for Research in Security Prices),
geographic information system (GIS) databases that store geospatial and other
related data, and medical databases that store confidential medical history data.

The most popular categorization:
 Operational database: Designed to support a company’s day-to-day
operations.
 Also known as an online transaction processing (OLTP)
database, transactional database, or production database.
 Analytical database: Stores historical data and business metrics
used exclusively for tactical or strategic decision making.
 Allows the end user to perform advanced analysis of business
data using sophisticated tools for pricing decisions, sales
forecasts, market strategies, and so on.
 «data massaging» = data manipulation for information
production.

 Analytical databases comprise two main components: a data
warehouse and an online analytical processing front end.
 Data warehouse: Stores data in a format optimized for
decision support.
 … contains historical data obtained from the operational databases as well
as data from other external sources.
 Online analytical processing (OLAP)
 … is a set of tools that work together to provide an advanced data analysis
environment for retrieving, processing, and modeling data from the data
warehouse.

This graphic illustrates the concept of OLAP.

The business question is: In the Eastern region, what are the actual and
projected sales of our products (nuts, bolts, washers, and screws)?

More importantly, when compared to a spreadsheet model of the same data,
a graphical OLAP view is much faster and makes the relationships easier
to understand and visualize.
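
The business question above can also be expressed as an aggregate query. This is only a sketch of the idea, not an OLAP tool: it assumes SQLite via Python's `sqlite3` module, and the `sales` table, its columns, and every figure in it are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, product TEXT, actual REAL, projected REAL)"
)
# Made-up figures, for illustration only.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Nuts", 100.0, 120.0),
        ("East", "Nuts", 50.0, 40.0),
        ("East", "Bolts", 200.0, 180.0),
        ("West", "Nuts", 75.0, 70.0),
    ],
)

# Actual and projected sales per product, restricted to the Eastern region.
summary = conn.execute(
    """
    SELECT product, SUM(actual), SUM(projected)
    FROM sales
    WHERE region = 'East'
    GROUP BY product
    ORDER BY product
    """
).fetchall()

for product, actual, projected in summary:
    print(product, actual, projected)
```

An OLAP front end would typically present the same aggregation graphically and let the user switch region or product interactively.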

 In recent times, the area of database applications has grown in importance and usage, to the point
that it has evolved into its own discipline: business intelligence.
 Business intelligence (BI): Captures and processes business data to generate information
that supports decision making.
 The lack of a skilled and proficient workforce is one of the biggest challenges most
organizations face when implementing BI tools, and it therefore acts as a major constraint on
the growth of this market.

By the degree of data structure:
 Unstructured data: Exists in its original (raw) state and
therefore does not lend itself to the processing that yields
information.
 Structured data: Results from formatting
 Structure is applied based on the type of processing to be
performed
 For example, the data value 37890 might refer to a zip code, a sales value, or a
product code. If this value represents a zip code or a product code and is stored
as text, you cannot perform mathematical computations with it. On the
other hand, if this value represents a sales transaction, it must be formatted
as numeric.
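
The 37890 example can be tried directly. In this sketch the variable names are illustrative; the point is only that the same digits behave differently depending on whether they are stored as text or as a number.

```python
zip_code = "37890"      # stored as text: an identifier, not a quantity
sale_amount = 37890.0   # stored as numeric: a quantity

# Arithmetic on the numeric value is meaningful.
total_with_second_sale = sale_amount + 110.0

# "Adding" text only concatenates; it does not compute anything.
concatenated = zip_code + "110"

# Mixing text and numbers in arithmetic is rejected outright.
try:
    zip_code + 110
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(total_with_second_sale, concatenated, mixed_ok)
```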


 To further illustrate the concept of structure, imagine a stack of printed paper
invoices.
 If you want to only store these invoices as images for future retrieval and display, you
can scan them and save them in a graphic format.
 On the other hand, if you want to derive information such as monthly totals and
average sales, such graphic storage would not be useful.
 Instead, you could store the invoice data in a (structured) spreadsheet format so that
you can perform the requisite computations.
 Most data you encounter is best classified as semistructured.
Database Design
 A problem that has evolved with the use of personal productivity tools such as
spreadsheets and desktop database programs is that users typically lack proper data-
modeling and database design skills.
 People naturally have a “narrow” view of the data in their environment.
 Database design refers to the activities that focus on the design of the database
structure that will be used to store and manage end-user data.
 Even a good DBMS will perform poorly with a badly designed database.
 Data is one of an organization’s most valuable assets.
 Because current-generation DBMSs are easy to use, an unfortunate side effect is that
many computer-savvy business users gain a false sense of confidence in their ability
to build a functional database.
 A well-designed database facilitates data management and generates accurate and
valuable information.
 A poorly designed database causes difficult-to-trace errors.

Example table columns: ID, Enum, Name, Title, HireDate, Skill1, Skill1Date, Skill2, Skill2Date, Skill3, Skill3Date

A better solution is…

Designing appropriate data repositories of integrated information using the
two-dimensional table structures found in most databases is a process of
decomposition. The integrated data must be decomposed properly into its
constituent parts, with each part stored in its own table. Further, the
relationships between these tables must be carefully considered and
implemented so the integrated view of the data can be recreated later as
information for the end user.
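
One possible decomposition of a table with repeating Skill1/Skill2/Skill3 columns is sketched below. The table names (`employee`, `skill`, `certified`), the column names, and the sample rows are assumptions made for illustration, and SQLite (via Python's `sqlite3` module) stands in for the DBMS. Each part is stored in its own table, and a join recreates the integrated view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, title TEXT);
    CREATE TABLE skill (skill_id INTEGER PRIMARY KEY, skill_name TEXT);
    -- One row per (employee, skill) pair replaces the Skill1..Skill3 columns.
    CREATE TABLE certified (
        emp_id      INTEGER REFERENCES employee(emp_id),
        skill_id    INTEGER REFERENCES skill(skill_id),
        date_earned TEXT,
        PRIMARY KEY (emp_id, skill_id)
    );
    INSERT INTO employee VALUES (1, 'A. Example', 'Analyst');
    INSERT INTO skill VALUES (10, 'SQL'), (20, 'Python');
    INSERT INTO certified VALUES (1, 10, '2020-01-15'), (1, 20, '2021-06-01');
    """
)

# Recreate the integrated view for the end user by joining the parts.
rows = conn.execute(
    """
    SELECT e.name, s.skill_name, c.date_earned
    FROM employee e
    JOIN certified c ON c.emp_id = e.emp_id
    JOIN skill s ON s.skill_id = c.skill_id
    ORDER BY c.date_earned
    """
).fetchall()

for row in rows:
    print(row)
```

With this layout an employee can hold any number of skills, and a skill name is stored only once.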

Basic File Terminology

➢ The CUSTOMER file contains 10 records.
➢ Each record is composed of 9
fields: C_NAME, C_PHONE,
C_ADDRESS, C_ZIP, A_NAME,
A_PHONE, TP, AMT, and REN.
➢ The 10 records are stored in a
named file.
➢ Because the file contains customer
data for the insurance company, its
filename is CUSTOMER.
Data Redundancy
 Unnecessarily storing the same data in different places.
 Islands of information: Scattered data locations.
 Increases the probability of having different versions of the same data.
 Uncontrolled data redundancy sets the stage for the following:
 Poor data security. Having multiple copies of data increases the chances for a
copy of the data to be susceptible to unauthorized access.
 Data inconsistency. Data inconsistency exists when different and conflicting
versions of the same data appear in different places. For example, suppose
you change an agent’s phone number in the AGENT file. If you forget to
make the corresponding change in the CUSTOMER file, the files contain
different data for the same agent.
 Data-entry errors. Data-entry errors are more likely to occur when complex
entries (such as 10-digit phone numbers) are made in several different files
or recur frequently in one or more files.
 Data integrity problems. It is possible to enter a nonexistent sales agent’s name
and phone number into the file.
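
The data integrity problem above (entering a nonexistent agent) is the kind of error a DBMS can be told to reject. The following is a minimal sketch assuming SQLite via Python's `sqlite3` module; the two-table layout, the lowercase column names, and the sample values are hypothetical. The agent's phone number is stored once, in the `agent` table, and every customer row must refer to an agent that exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE agent (a_name TEXT PRIMARY KEY, a_phone TEXT)")
conn.execute(
    "CREATE TABLE customer ("
    " c_name TEXT PRIMARY KEY,"
    " a_name TEXT NOT NULL REFERENCES agent(a_name))"
)
conn.execute("INSERT INTO agent VALUES ('Agent A', '555-0100')")

# Accepted: the customer refers to an existing agent.
conn.execute("INSERT INTO customer VALUES ('Customer 1', 'Agent A')")

# Rejected: the customer refers to an agent that does not exist.
try:
    conn.execute("INSERT INTO customer VALUES ('Customer 2', 'No Such Agent')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```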

Data Anomaly
Data anomaly: Develops when not all the required changes
in the redundant data are made successfully.

Update Anomalies

Insertion Anomalies

Deletion Anomalies
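
The three anomalies can be reproduced with a few lines of code. This sketch models a flat customer file as a list of records in which the agent's phone number is repeated for every customer; the field names follow the CUSTOMER file, while the values are invented for illustration.

```python
# Hypothetical flat CUSTOMER file: the agent's phone is repeated per customer.
customers = [
    {"C_NAME": "Customer 1", "A_NAME": "Agent A", "A_PHONE": "555-0100"},
    {"C_NAME": "Customer 2", "A_NAME": "Agent A", "A_PHONE": "555-0100"},
    {"C_NAME": "Customer 3", "A_NAME": "Agent B", "A_PHONE": "555-0200"},
]

# Update anomaly: the new phone number reaches only one of Agent A's rows.
customers[0]["A_PHONE"] = "555-0199"
phones_for_agent_a = {r["A_PHONE"] for r in customers if r["A_NAME"] == "Agent A"}
update_anomaly = len(phones_for_agent_a) > 1

# Deletion anomaly: removing Agent B's only customer also erases Agent B.
customers = [r for r in customers if r["C_NAME"] != "Customer 3"]
deletion_anomaly = "Agent B" not in {r["A_NAME"] for r in customers}

# Insertion anomaly: a new agent without customers can only be recorded
# by adding a placeholder customer row.
customers.append({"C_NAME": None, "A_NAME": "Agent C", "A_PHONE": "555-0300"})
insertion_anomaly = any(r["C_NAME"] is None for r in customers)

print(update_anomaly, deletion_anomaly, insertion_anomaly)
```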

Database Systems
 Logically related data stored in a single logical data
repository
 May be physically distributed among multiple storage
facilities
 The DBMS eliminates most of the file system’s problems
 Current generation DBMS software:
– Stores data structures, relationships between structures, and
access paths
– Defines, stores, and manages all access paths and components

The Database System Environment

The term database system refers to an organization of components that define and regulate
the collection, storage, management, and use of data within a database environment.

From a general management point of view, the database system is composed of five
major parts: hardware, software, people, procedures, and data.
Hardware. Refers to all the system’s physical devices, including computers (PCs,
tablets, workstations, servers, and supercomputers), storage devices, printers,
network devices (hubs, switches, routers, fiber optics), and all other devices.

Software. Three types of software are needed to make the database system function
fully: operating system software, DBMS software, and application programs and
utilities.

People. This component includes all users of the database system. Based on primary
job functions, five types of users can be identified in a database system:
system administrators (oversee the database system’s general operations), database
administrators (ensure that the database is functioning properly), database designers
(design the database structure), system analysts and programmers (design and
implement the application programs), and end users (the people who use the
application programs to run the organization’s daily operations).

Procedures. Procedures are the instructions and rules that govern the design and use
of the database system.

Data. The word data covers the collection of facts stored in the database.
DBMS Functions
Data dictionary management

Data dictionary: Stores definitions of the data elements and their relationships. The DBMS provides
data abstraction, and it removes structural and data dependence from the system.

Data storage management


Performance tuning: Ensures efficient performance of the database in terms of storage and access
speed.

Data transformation and presentation


Transforms entered data to conform to required data structures. For example, regardless of the
data presentation format, the DBMS must manage the date in the proper format for each country.
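
The date example can be sketched as one stored value with several presentations. The country codes and display patterns in `DISPLAY_FORMATS`, and the helper `present`, are assumptions chosen for illustration rather than a statement of how any DBMS is configured.

```python
from datetime import date

# One internal representation, stored once.
stored = date(2024, 3, 5)

# Hypothetical per-country display patterns (illustrative assumptions).
DISPLAY_FORMATS = {
    "US": "%m/%d/%Y",
    "UK": "%d/%m/%Y",
    "KR": "%Y-%m-%d",
}


def present(d: date, country: str) -> str:
    """Format the stored date using the pattern assumed for the country."""
    return d.strftime(DISPLAY_FORMATS[country])


for country in DISPLAY_FORMATS:
    print(country, present(stored, country))
```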

Security management
Enforces user security and data privacy.
Multiuser access control
Sophisticated algorithms ensure that multiple users can access the database
concurrently without compromising its integrity.

Backup and recovery management

Enables recovery of the database after a failure.
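
A minimal sketch of backup and recovery, assuming SQLite via Python's `sqlite3` module (its `Connection.backup` method is available from Python 3.7). The `account` table and the simulated failure are hypothetical; production systems rely on far more elaborate procedures.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
live.execute("INSERT INTO account VALUES (1, 250.0)")
live.commit()

# Take a backup copy of the live database.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulate a failure: the data is lost from the live database.
live.execute("DELETE FROM account")
live.commit()
rows_after_failure = live.execute("SELECT COUNT(*) FROM account").fetchone()[0]

# Recover by copying the backup over the damaged database.
backup_copy.backup(live)
restored = live.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

print(rows_after_failure, restored)
```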

Data integrity management


Minimizes redundancy and maximizes consistency.

Database access languages and application programming interfaces

Query language: Lets the user specify what must be done without having to specify
how.
Structured Query Language (SQL): De facto query language and data access
standard supported by the majority of DBMS vendors.
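
The difference between specifying what and specifying how can be sketched by answering the same question twice. The employee names and salaries are invented, and SQLite (via Python's `sqlite3` module) is assumed as the DBMS.

```python
import sqlite3

rows = [("Alice", 52000.0), ("Bob", 38000.0), ("Chen", 61000.0)]

# Procedural: the program spells out HOW to find the answer, step by step.
high_paid_loop = []
for name, salary in rows:
    if salary > 50000:
        high_paid_loop.append(name)
high_paid_loop.sort()

# Declarative: SQL states WHAT is wanted; the DBMS decides how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary REAL)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", rows)
high_paid_sql = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM employee WHERE salary > 50000 ORDER BY name"
    )
]

print(high_paid_loop, high_paid_sql)
```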

Database communication interfaces

Accept end-user requests via multiple, different network environments.


For example, the DBMS might provide access to the database via the Internet through
the use of web browsers such as Mozilla Firefox, Google Chrome, Microsoft Edge, or
Microsoft Internet Explorer.
Disadvantages of Database Systems
Increased costs (including training, licensing, and regulatory compliance costs)

Management complexity

Maintaining updates and applying patches

Vendor dependence

Frequent upgrade/replacement cycles
