
Unit 2

Data Warehouse and OLAP Technology


• A data warehouse is a single, complete, and consistent store of
data obtained from a variety of sources and made available to end
users in a way they can understand and use in a business context.
• A data warehouse is a subject-oriented, integrated, time-variant, and
nonvolatile collection of data in support of management's
decision-making process.
Data warehouse - subject oriented
• Oriented to the major subject areas of the corporation that have been
defined in the data model.
• For example, for an insurance company: customer, product,
transaction or activity, policy, claim, account, etc.
Data warehouse - integrated
• There is no consistency in encoding or naming conventions among
different data sources.
• The data sources are heterogeneous.
• When data is moved to the warehouse, it is converted to a single
consistent form.
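The conversion step can be sketched in Python. This is a minimal illustration, assuming two hypothetical source systems (`app_a` and `app_b`) that name and encode the same attribute differently; the field names and codes are invented for the example.

```python
# Hypothetical per-source rules: each source has its own field name and codes.
SOURCE_RULES = {
    "app_a": {"field": "gender", "codes": {"m": "M", "f": "F"}},
    "app_b": {"field": "sex", "codes": {1: "M", 0: "F"}},
}


def to_warehouse(source, record):
    """Convert one source record to the single warehouse convention."""
    rule = SOURCE_RULES[source]
    raw = record[rule["field"]]
    return {"customer_id": record["id"], "gender": rule["codes"][raw]}


rows = [
    to_warehouse("app_a", {"id": 1, "gender": "m"}),
    to_warehouse("app_b", {"id": 2, "sex": 0}),
]
```

After conversion, both rows use the same field name and the same codes, whichever source they came from.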
Data warehouse - nonvolatile
• Operational data is regularly accessed and manipulated a record at a
time, and updates are made to data in the operational environment.
• Warehouse data, by contrast, is loaded and accessed, but it is not
updated in place.
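A minimal sketch of the contrast, assuming the usual reading of "nonvolatile": the operational store overwrites values, while the warehouse is only loaded and read. The class names and the account data are hypothetical.

```python
class OperationalStore:
    """Operational style: one current value per key, updated in place."""

    def __init__(self):
        self.current = {}

    def update(self, account, balance):
        self.current[account] = balance  # the previous value is lost


class Warehouse:
    """Warehouse style: rows are loaded and read, never updated in place."""

    def __init__(self):
        self._rows = []

    def load(self, account, balance, loaded_on):
        self._rows.append((account, balance, loaded_on))

    def history(self, account):
        return [(b, d) for a, b, d in self._rows if a == account]


ops = OperationalStore()
dw = Warehouse()
for day, balance in [("2024-01-01", 100), ("2024-01-02", 80)]:
    ops.update("ACC-1", balance)
    dw.load("ACC-1", balance, day)
```

Here `ops.current` holds only the latest balance, while `dw.history("ACC-1")` still returns both loaded rows.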
Data warehouse - time variant
• The time horizon for the data warehouse is significantly longer than
that of operational systems.
• Operational database: current-value data.
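One way to picture the longer time horizon is an "as of" lookup over dated rows. The sketch below assumes a hypothetical history of one customer's city, where each row carries the date it took effect; an operational system would typically hold only the last value.

```python
from datetime import date

# Hypothetical history: (effective date, value) pairs for one customer.
history = [
    (date(2021, 1, 1), "Pune"),
    (date(2022, 6, 1), "Mumbai"),
    (date(2023, 3, 15), "Delhi"),
]


def value_as_of(history, when):
    """Return the value in effect on `when`, or None if none applies yet."""
    current = None
    for effective, value in sorted(history):
        if effective <= when:
            current = value
        else:
            break
    return current


city_end_of_2022 = value_as_of(history, date(2022, 12, 31))
current_city = history[-1][1]  # what a current-value system would hold
```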
Building blocks or components
• Metadata: good metadata is essential to the effective operation of a
data warehouse; it is used in data collection, data transformation,
and data access.
• Metadata maps the translation of information from the operational
system to the analytical system.
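The mapping role of metadata can be sketched as a small lookup table. This is an illustration only: the column names, source field names, and transformations are hypothetical.

```python
# Hypothetical metadata: for each warehouse column, the operational field it
# comes from and the transformation applied on the way in.
METADATA = {
    "customer_name": {"source": "CUST_NM", "transform": str.strip},
    "premium_usd": {"source": "PREM_CENTS", "transform": lambda c: c / 100},
}


def translate(operational_row):
    """Build a warehouse row by following the metadata mapping."""
    return {
        column: spec["transform"](operational_row[spec["source"]])
        for column, spec in METADATA.items()
    }


row = translate({"CUST_NM": "  Asha Rao ", "PREM_CENTS": 12550})
```

Because the mapping is data and not code, the same table can also serve as documentation when users ask where a warehouse column came from.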
Data marts
• Data marts are smaller than data warehouses and generally contain
information from a single department of a business or organisation.
The current trend in data warehousing is to develop a data
warehouse with several smaller, related data marts for specific kinds
of queries and reports.
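As a rough sketch, a data mart can be seen as a department-specific subset of the warehouse. The rows and the `department` field below are hypothetical.

```python
# Hypothetical warehouse rows covering more than one department.
warehouse = [
    {"department": "claims", "policy": "P1", "amount": 400},
    {"department": "sales", "policy": "P2", "amount": 900},
    {"department": "claims", "policy": "P3", "amount": 150},
]


def build_data_mart(rows, department):
    """Keep only the rows that belong to one department."""
    return [r for r in rows if r["department"] == department]


claims_mart = build_data_mart(warehouse, "claims")
total_claims = sum(r["amount"] for r in claims_mart)
```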
Security
• As with any information system, the security of data is determined by the
hardware, software, and procedures that created it. The
reliability and authenticity of the data and information extracted from
the warehouse will be a function of the reliability and authenticity of
the warehouse and the various source systems.
Construction
• The steps in planning a data warehouse are identical to the steps
for any other type of computer application. Users must be involved to
determine the scope of the warehouse and what business
requirements need to be met.
Why a warehouse
• Two approaches:
• 1. Query-driven (lazy)
• 2. Warehouse (eager)
• The traditional research approach is query-driven: lazy, on demand
Disadvantages of the query-driven approach
• Delay in query processing
• Slow or unavailable information sources
• Complex filtering and integration
• Inefficient and potentially expensive for frequent queries
• Competes with local processing at sources
• Has not caught on in industry
The warehousing approach
• Information is integrated in advance
• Stored in the warehouse for direct querying and analysis
Advantages of the warehousing approach
• High query performance
• But not necessarily the most current information
• Does not interfere with local processing at sources
• Complex queries can run at the warehouse
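The two approaches can be compared in a short sketch. It assumes two hypothetical sources modelled as plain functions; real sources would be remote systems, which is where the delay and availability problems come from.

```python
# Two hypothetical sources, modelled as functions that return rows when asked.
def source_a():
    return [{"region": "north", "sales": 10}, {"region": "south", "sales": 5}]


def source_b():
    return [{"region": "north", "sales": 7}]


SOURCES = [source_a, source_b]


def integrate(sources):
    """Fetch from every source and total the sales per region."""
    totals = {}
    for fetch in sources:
        for row in fetch():
            totals[row["region"]] = totals.get(row["region"], 0) + row["sales"]
    return totals


def query_lazy(region):
    """Query-driven: contact every source and integrate at query time."""
    return integrate(SOURCES).get(region, 0)


class EagerWarehouse:
    """Warehousing: integrate in advance, then answer from the stored copy."""

    def __init__(self, sources):
        self.sources = sources
        self.totals = {}

    def refresh(self):
        self.totals = integrate(self.sources)

    def query(self, region):
        return self.totals.get(region, 0)  # no source is contacted here


wh = EagerWarehouse(SOURCES)
wh.refresh()
```

Every call to `query_lazy` goes back to the sources, whereas `wh.query` reads only the stored totals. The trade-off is that the warehouse answer is only as current as the last `refresh`.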
Data warehouse architectures
• 1. Single layer
• Every data element is stored once only
• Virtual warehouse
• 2. Two layer
• Real-time + derived data
• Most commonly used approach in industry today
• 3. Three layer
• Transformation of real-time data to derived data really requires two steps
• View level: ‘particular informational needs’
• Physical implementation of the data warehouse
Issues in data warehouse
• Warehouse design
• Extraction
• Wrappers, monitors
• Integration
• Cleansing and merging
• Warehousing specification and maintenance
• Optimisation
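Several of these issues can be placed in one small pipeline sketch. The roles assumed here are: a wrapper translates a source-specific format into a common one, a monitor detects what changed at the source, and cleansing and merging prepare rows for the warehouse. The `id;name` line format and all function names are hypothetical.

```python
def wrapper_csv(line):
    """Wrapper: translate a source-specific 'id;name' line into a common dict."""
    ident, name = line.split(";")
    return {"id": int(ident), "name": name}


def monitor(previous, current):
    """Monitor: report rows that are new or changed since the last extraction."""
    return [row for row in current if row not in previous]


def cleanse(row):
    """Cleansing: normalise whitespace and capitalisation in names."""
    return {"id": row["id"], "name": " ".join(row["name"].split()).title()}


def merge(warehouse, rows):
    """Merging: keep one row per id; later rows replace earlier ones."""
    merged = {row["id"]: row for row in warehouse}
    for row in rows:
        merged[row["id"]] = row
    return sorted(merged.values(), key=lambda r: r["id"])


previous = [wrapper_csv("1;asha  rao")]
current = [wrapper_csv("1;asha  rao"), wrapper_csv("2; RAVI kumar")]
changes = monitor(previous, current)
warehouse = merge([cleanse(r) for r in previous], [cleanse(r) for r in changes])
```

Only the changed row is passed on by the monitor, so the warehouse load does not have to reprocess the whole source each time.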
