
Business Intelligence
Section One
Topic

01 Data Warehouse: How does a data warehouse differ from a database
02 Data marts: ODS, EDW, Data mart
03 OLAP
What is a Data Warehouse

● A pool of data produced to support decision making.


● A subject-oriented, integrated, time-variant, nonvolatile
collection of data in support of management’s decision-making
process.
Data warehouse vs database

● Technically speaking, a data warehouse is a database, albeit with certain
characteristics to facilitate its role in decision support.
● Most databases are highly normalized, in part to avoid update anomalies.
● Data warehouses are highly de-normalized for performance reasons. This is
acceptable because their content is never updated, just added to. Historical
data are static.
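
The contrast between a normalized and a de-normalized layout can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the table names (customers, orders, sales_wide) and the sample rows are made up for this example and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized (operational) layout: each customer's city is stored once,
# so changing it touches a single row and avoids update anomalies.
cur.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT)")
cur.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Cairo"), (2, "Giza")])
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0)],
)

# De-normalized (warehouse) layout: the city is copied onto every sales row,
# so analytical queries need no join. Rows are appended, not updated.
cur.execute(
    """CREATE TABLE sales_wide AS
       SELECT o.order_id, o.amount, c.city
       FROM orders o JOIN customers c ON c.customer_id = o.customer_id"""
)

# Same question, two layouts: total sales per city.
normalized_totals = cur.execute(
    """SELECT c.city, SUM(o.amount)
       FROM orders o JOIN customers c ON c.customer_id = o.customer_id
       GROUP BY c.city ORDER BY c.city"""
).fetchall()
denormalized_totals = cur.execute(
    "SELECT city, SUM(amount) FROM sales_wide GROUP BY city ORDER BY city"
).fetchall()

print(normalized_totals)
print(denormalized_totals)
```

Both queries return the same totals; the de-normalized one simply skips the join, at the cost of repeating the city on every row.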
Major components of a data warehouse
● Data sources. Data are sourced from operational systems and possibly
from external data sources.
● Data extraction and transformation. Data are extracted and properly
transformed using custom-written or commercial software called ETL.
● Data loading. Data are loaded into a staging area, where they are
transformed and cleansed. The data are then ready to load into the data
warehouse.
Major components of a data warehouse cont’d
● Comprehensive database. This is the EDW that supports decision
analysis by providing relevant summarized and detailed information.
● Metadata. Metadata are maintained for access by IT personnel and users.
Metadata include rules for organizing data summaries that are easy to
index and search.
● Middleware tools. Middleware tools enable access to the data warehouse
from a variety of front-end applications.
Middleware tools
● Enable access to the data warehouse. Power users such as
analysts may write their own SQL queries. Others may
access data through a managed query environment. There
are many front-end applications that business users can use
to interact with data stored in the data repositories, including
data mining, OLAP, reporting tools, and data visualization
tools. All these have their own data access requirements.
Those may not match with how a given data warehouse must
be accessed. Middleware translates between the two.
Data mart, ODS, EDW
● An ODS (Operational Data Store) is the database from
which a business operates on an ongoing basis.
● Both an EDW and a data mart are data warehouses.
An EDW (Enterprise Data Warehouse) is an all-
encompassing DW that covers all subject areas of
interest to the entire organization.
● A data mart is a smaller DW designed around one
problem, organizational function, topic, or other
suitable focus area.
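
One way to picture the relationship between an EDW and a data mart is a subject-specific slice of a larger store. The sketch below assumes a mart derived from the EDW, which is only one possible design. The names edw_facts and sales_mart and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical EDW table covering several subject areas of the organization.
conn.execute("CREATE TABLE edw_facts (subject_area TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO edw_facts VALUES (?, ?, ?)",
    [
        ("sales", "revenue", 120.0),
        ("sales", "units", 30.0),
        ("hr", "headcount", 12.0),
        ("finance", "budget", 500.0),
    ],
)

# Hypothetical data mart: a smaller view focused on one function (sales).
conn.execute(
    """CREATE VIEW sales_mart AS
       SELECT metric, value FROM edw_facts WHERE subject_area = 'sales'"""
)

mart_rows = conn.execute(
    "SELECT metric, value FROM sales_mart ORDER BY metric"
).fetchall()
print(mart_rows)
```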
OLAP
● Online analytical processing. In general, OLAP handles a low volume of
complex queries, typically generated by large reports. Response time is
its effectiveness measure. OLAP databases store aggregated, historical
data in multi-dimensional schemas and are used to analyze
multidimensional data from multiple sources and perspectives.
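
Analyzing the same data from several perspectives can be illustrated with two aggregations over one small fact table. This is a rough sketch with sqlite3, not a real OLAP engine. The table sales_cube, its dimensions (region, quarter, product), and the figures are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical fact table with three dimensions and one measure.
conn.execute(
    "CREATE TABLE sales_cube (region TEXT, quarter TEXT, product TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO sales_cube VALUES (?, ?, ?, ?)",
    [
        ("North", "Q1", "A", 100.0),
        ("North", "Q1", "B", 50.0),
        ("North", "Q2", "A", 70.0),
        ("South", "Q1", "A", 30.0),
        ("South", "Q2", "B", 90.0),
    ],
)

# Roll-up: collapse quarter and product, keep only the region dimension.
by_region = conn.execute(
    "SELECT region, SUM(revenue) FROM sales_cube GROUP BY region ORDER BY region"
).fetchall()

# Slice: fix quarter = 'Q1', then view revenue by product.
q1_by_product = conn.execute(
    """SELECT product, SUM(revenue) FROM sales_cube
       WHERE quarter = 'Q1' GROUP BY product ORDER BY product"""
).fetchall()

print(by_region)
print(q1_by_product)
```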
We have two ways to load data into our analytics database
● ETL: Extract, transform and load. This is the way to generate our data
warehouse. First, extract the data from the production database, transform
the data according to our requirements, and then load the data into our
data warehouse.
● ELT: Extract, load and transform. First, extract the data from the
production database, load it into the target database, and then transform
the data there. This approach is commonly associated with the data lake,
a newer concept for managing big data.
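
The difference between the two orderings can be sketched with two in-memory databases. This is a minimal illustration, assuming a tiny made-up source of (name, amount) strings. The helper clean and the table names customers and raw_customers are hypothetical.

```python
import sqlite3

# Hypothetical rows extracted from a production system (all values are raw text).
source_rows = [(" alice ", "100"), ("BOB", "250")]


def clean(name, amount):
    """Transform step: tidy the name and convert the amount to a number."""
    return name.strip().title(), float(amount)


# ETL: transform in the pipeline first, then load only the cleaned rows.
etl_db = sqlite3.connect(":memory:")
etl_db.execute("CREATE TABLE customers (name TEXT, amount REAL)")
etl_db.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [clean(name, amount) for name, amount in source_rows],
)

# ELT: load the raw rows as-is, then transform inside the target database.
elt_db = sqlite3.connect(":memory:")
elt_db.execute("CREATE TABLE raw_customers (name TEXT, amount TEXT)")
elt_db.executemany("INSERT INTO raw_customers VALUES (?, ?)", source_rows)
elt_db.execute(
    """CREATE TABLE customers AS
       SELECT UPPER(SUBSTR(TRIM(name), 1, 1)) || LOWER(SUBSTR(TRIM(name), 2)) AS name,
              CAST(amount AS REAL) AS amount
       FROM raw_customers"""
)

query = "SELECT name, amount FROM customers ORDER BY name"
etl_result = etl_db.execute(query).fetchall()
elt_result = elt_db.execute(query).fetchall()
print(etl_result)
print(elt_result)
```

Both paths end with the same cleaned table. The ELT path also keeps the untransformed raw_customers table in the target, which is what makes later re-transformation possible.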
Setup environment
Installation

01 Download pgAdmin
02 Download PostgreSQL
