Professional Documents
Culture Documents
DATA
WAREHOUSE:
Group 6
Table of contents
Defining features
-Integrated Data
01
-Time-Variant Data
-Nonvolatile Data
Data Warehouses and Data
Marts
-How Are They Different? 02
-Centralized Data Warehouse
-Independent Data Marts
-Data-Mart Bus
In data warehousing,
what is the term
“time variant”?
TIME-VARIANT Dat
data warehouse
specific time period.
TIME-VARIANT Dat
Operational System
The basic
components
are the same
Source Data Component
PRODUCTION
INTERNAL
ARCHIVED
EXTERNAL
Production
Data
This comes from the different operating
systems. Basis: Data requirements in
data warehouse → segments of data
from various operational modes
PRODUCTION DATA
This is data from many vertical applications.
DISPARITY
Noun
dis·par·i·ty | \ di-ˈsper-ə-tē , -ˈspa-rə- \
Definition
: a noticeable and usually significant difference or dissimilarity
Internal
Data
In each organization, the client keeps
their “private” spreadsheets, reports,
customer profiles, and sometimes even
department databases.
INTERNAL DATA
You cannot ignore the internal data held in private files in
your organization. It is a collective judgment call on how
much of the internal data should be included in the data
warehouse.
Stage:
1st: recent data → separate archival database
2nd: older data → flat files on disk storage
3rd: oldest data → tape cartridges or microfilm and kept
off-site
ARCHIVED DATA
Data staging provides a place and an area with a set of functions to clean,
change, combine, convert, deduplicate, and prepare source data for storage
and use in the data warehouse.
Three major functions need to be performed for getting the
data ready
Data
Data Extraction. Data Loading.
Transformation.
Data Extraction.
This function has to deal with numerous data sources. You have to
employ the appropriate technique for each data source. Source
data may be from differ- ent source machines in diverse data
formats. Part of the source data may be in relation- al database
systems. Some data may be on other legacy network and
hierarchical data models. Many data sources may still be in flat
files. You may want to include data from spreadsheets and local
departmental data sets. Data extraction may become quite
complex.
Data Transformation
Data transformation involves many forms of combining pieces of
data from the differ- ent sources. You combine data from a single
source record or related data elements from many source records.
On the other hand, data transformation also involves purging
source data that is not useful and separating out source records
into new combinations. Sorting and merging of data takes place
on a large scale in the data staging area.
Data Loading
Two distinct groups of tasks form the data loading function. When
you complete the design and construction of the data warehouse
and go live for the first time, you do the initial loading of the data
into the data warehouse storage. The initial load moves large
volumes of data using up substantial amounts of time. As the data
warehouse starts functioning, you continue to extract the changes
to the source data, transform the data revisions, and feed the
incremental data revisions on an ongoing basis.
Information Delivery Component
USERS OF INFORMATION FROM THE DATA WAREHOUSE