Lecture-3
Introduction and Background
What is a Data Warehouse?
A complete repository of historical
corporate data extracted from
transaction systems that is available for
ad-hoc access by knowledge workers.
What is a Data Warehouse?
❑ Complete repository
❑ History
❑ Transaction System
❑ Ad-Hoc access
❑ Knowledge workers
What is a Data Warehouse?
Transaction System
▪ Management Information System (MIS)
▪ Source data could also be typed sheets (which are NOT a transaction system)
Ad-Hoc access
▪ Does not have a fixed access pattern.
▪ Queries not known in advance.
▪ Difficult to write SQL in advance.
Knowledge workers
▪ Typically NOT IT literate (Executives, Analysts, Managers).
▪ NOT clerical workers.
▪ Decision makers.
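The ad-hoc access idea above can be illustrated with a minimal sketch. It assumes a hypothetical `sales` table held in an in-memory SQLite database; the table, its columns, the sample rows, and the helper `ad_hoc_total` are all made up for illustration and are not part of the lecture. The point is only that the grouping column and the filters are chosen at query time, so the SQL cannot be written in advance.

```python
import sqlite3

# Hypothetical sales table; names and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("North", "Pen", 2022, 100.0),
        ("North", "Ink", 2023, 50.0),
        ("South", "Pen", 2023, 70.0),
        ("South", "Ink", 2023, 30.0),
    ],
)

# Columns a knowledge worker may group or filter by (an assumption).
ALLOWED_DIMENSIONS = {"region", "product", "year"}


def ad_hoc_total(dimension, filters=None):
    """Total sales grouped by a dimension that is only known at query time."""
    filters = filters or {}
    # Column names cannot be bound as parameters, so check them against a whitelist.
    for column in [dimension, *filters]:
        if column not in ALLOWED_DIMENSIONS:
            raise ValueError(f"unknown dimension: {column}")
    sql = f"SELECT {dimension}, SUM(amount) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{column} = ?" for column in filters)
    sql += f" GROUP BY {dimension} ORDER BY {dimension}"
    # Filter values are passed as bound parameters.
    return conn.execute(sql, list(filters.values())).fetchall()


by_region = ad_hoc_total("region")
by_product_2023 = ad_hoc_total("product", {"year": 2023})
```

Here `by_region` totals sales per region over all years, while `by_product_2023` answers a different question from the same data without any pre-written report.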
Another View of a DWH
Subject-Oriented
Integrated
Time-Variant
Non-Volatile
Another view of DWH
◦ Subject-Oriented:
A data warehouse can be used to analyze a particular subject area. For example, "sales" can
be a particular subject.
◦ Integrated
A data warehouse integrates data from multiple data sources. For example, source A and
source B may have different ways of identifying a product, but in a data warehouse, there will
be only a single way of identifying a product.
◦ Non-Volatile
Once data is in the data warehouse, it will not change. So, historical data in a data
warehouse should never be altered.
◦ Time Variant
Historical data is kept in a data warehouse. For example, one can retrieve data from 3, 6, or
12 months ago, or even older, from a data warehouse.
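The properties above can be sketched in a few lines of code. This is a toy model under stated assumptions, not how a real warehouse is built: the two sources, their field names (`sku`, `item_code`), the mapping `PRODUCT_KEY`, and the `Warehouse` class are all hypothetical. It shows one product key shared across sources (integrated), a store that only appends (non-volatile), and a query over a date range of historical rows (time variant).

```python
from datetime import date

# Hypothetical records: source A and source B identify the same product differently.
SOURCE_A = [{"sku": "A-001", "sold_on": date(2024, 1, 15), "amount": 120.0}]
SOURCE_B = [{"item_code": "PEN_BLUE", "sold_on": date(2024, 6, 10), "amount": 80.0}]

# Assumed mapping from (source, source identifier) to a single warehouse product key.
PRODUCT_KEY = {("A", "A-001"): 1, ("B", "PEN_BLUE"): 1}


class Warehouse:
    """Toy append-only store of (product_key, sold_on, amount) facts."""

    def __init__(self):
        self._facts = []

    def load(self, source, id_field, rows):
        # Integrated: every source identifier is translated to the one warehouse key.
        for row in rows:
            key = PRODUCT_KEY[(source, row[id_field])]
            self._facts.append((key, row["sold_on"], row["amount"]))

    def facts(self):
        # Non-volatile: callers get a read-only snapshot; there is no update or delete.
        return tuple(self._facts)

    def total_between(self, product_key, start, end):
        # Time variant: history is kept, so any past date range can be queried.
        return sum(
            amount
            for key, sold_on, amount in self._facts
            if key == product_key and start <= sold_on <= end
        )


warehouse = Warehouse()
warehouse.load("A", "sku", SOURCE_A)
warehouse.load("B", "item_code", SOURCE_B)
full_year = warehouse.total_between(1, date(2024, 1, 1), date(2024, 12, 31))
```

Both source rows land under product key 1, so `full_year` sums sales that the two source systems recorded under different identifiers.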
What is a Data Warehouse?
It is a blend of many technologies, the basic concept
being:
[Figure: request and report cycle between the business user and IT people]
▪ User requests
▪ IT people do system analysis and design
▪ IT people create reports
▪ IT people send reports to business user
▪ Business user may get answers
▪ Answers result in more questions
How is it Different?
◦ Different patterns of hardware utilization
[Figure: hardware utilization on a 0% to 100% scale, Operational vs. DWH]
❑ Industry.
❑ Cost of storing historical data.
▪ Decision makers typically don’t work 24 hours a day, 7 days a week. An
ATM system does.
▪ Once decision makers start using the DWH and start reaping the benefits,
they start liking it.
▪ They start using the DWH more often, until they want it available 100% of the time.
▪ For businesses across the globe, 50% of the world may be sleeping at any
one time, but the businesses are up 100% of the time.
▪ 100% availability is not a trivial task; it needs to take into account loading
strategies, refresh rates, etc.
How is it Different?
Does not follow the traditional development model
[Figure: Requirements → Program]
Classical SDLC
▪ Requirements gathering
▪ Analysis
▪ Design
▪ Programming
▪ Testing
▪ Integration
▪ Implementation
How is it Different?
Does not follow the traditional development model
[Figure: DWH → Program → Requirements]
DWH SDLC (CLDS)
▪ Implement warehouse
▪ Integrate data
▪ Test for bias
▪ Program w.r.t. data
▪ Design DSS system
▪ Analyze results
▪ Understand requirements
Data Warehouse Vs. OLTP