Data Warehousing Mid

Published by Nikhil Boorla


Published by: Nikhil Boorla on Jun 15, 2012
Copyright:Attribution Non-commercial

Data Warehousing Mid-Term Answers (Tentative).
1. (a.) Describe STAR Schema?

The entity-relationship data model is commonly used in the design of relational databases, where a database schema consists of a set of entities and the relationships between them. Such a data model is appropriate for on-line transaction processing. A data warehouse, however, requires a concise, subject-oriented schema that facilitates on-line data analysis.

The most popular data model for a data warehouse is a multidimensional model. Such a model can exist in the form of a star schema, a snowflake schema, or a fact constellation schema.
Star Schema: The most common modeling paradigm is the star schema, in which the data warehouse contains (1) a large central table (fact table) containing the bulk of the data, with no redundancy, and (2) a set of smaller attendant tables (dimension tables), one for each dimension. The schema graph resembles a starburst, with the dimension tables displayed in a radial pattern around the central fact table.
Example: A star schema for All Electronics sales is shown in Figure 3.4. Sales are considered along four dimensions, namely, time, item, branch, and location. The schema contains a central fact table for sales that contains keys to each of the four dimensions, along with two measures: dollars sold and units sold. To minimize the size of the fact table, dimension identifiers (such as time key and item key) are system-generated identifiers.
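The example above can be sketched as table definitions. This is a minimal sketch using Python's built-in sqlite3 module; the table names and the single descriptive column in each dimension table (year, item_name, branch_name, city) are assumptions for illustration, since only the four dimensions, their keys, and the two measures are given above.

```python
import sqlite3

# In-memory database, used only to illustrate the shape of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One small dimension table per dimension (attribute columns are hypothetical).
CREATE TABLE time_dim     (time_key     INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE item_dim     (item_key     INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE branch_dim   (branch_key   INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT);

-- Central fact table: one key per dimension plus the two measures.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    branch_key   INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    dollars_sold REAL,
    units_sold   INTEGER
);
""")

# List the tables and the fact table's columns to confirm the star shape.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
fact_columns = [row[1] for row in conn.execute("PRAGMA table_info(sales_fact)")]
print(tables)
print(fact_columns)
```

The integer keys stand in for the system-generated identifiers mentioned above: the fact table stores only these compact keys and the measures, while descriptive attributes live in the dimension tables.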
 
(b.) Multiple Approaches for Data Integration?

ETL (Extract-Transform-Load): typically performed in an application server tier. Other flavors include:
    ELT (Extract-Load-Transform): with MPP (massively parallel processing) database engines, an alternative is to do the transformation in the MPP data warehouse (DW).
    ETLT (Extract-Transform-Load-Transform): some processing is done in the application server tier and the rest is done in the DW.
EII (Enterprise Information Integration): an optimized and transparent data access and transformation layer providing a single relational interface across all enterprise data (structured, semi-structured, and unstructured).
EAI (Enterprise Application Integration): message-based and transactional; it integrates at both the business process and data levels, including application-to-application integration.
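The difference between ETL and ELT is where the transformation runs. The sketch below is only an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse (it is not an MPP engine), and the sample rows, table names, and cleaning rules are hypothetical.

```python
import sqlite3

# Hypothetical raw extract: untrimmed product names and amounts stored as text.
raw_rows = [("2012-06-15", " tv ", "199.50"), ("2012-06-15", "Radio", "49.00")]

def etl(rows):
    """ETL: clean the rows in the application tier, then load the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL)")
    cleaned = [(d, p.strip().lower(), float(a)) for d, p, a in rows]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", cleaned)
    return conn.execute("SELECT * FROM sales ORDER BY product").fetchall()

def elt(rows):
    """ELT: load the raw rows into staging, then transform with SQL in the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (sale_date TEXT, product TEXT, amount TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)
    conn.execute("""
        CREATE TABLE sales AS
        SELECT sale_date,
               lower(trim(product)) AS product,
               CAST(amount AS REAL) AS amount
        FROM staging
    """)
    return conn.execute("SELECT * FROM sales ORDER BY product").fetchall()

print(etl(raw_rows))
print(elt(raw_rows))
```

Both functions produce the same cleaned rows; the only difference is whether the cleaning is done by application code before loading or by the database engine after loading. ETLT would split the cleaning between the two.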
2. How many types of schemas are there? Draw an essential schema using the star schema and write an SQL statement using that schema.
There are three types of schemas in data warehousing. They are:
1. Star Schema
2. Snowflake Schema
3. Fact Constellation Schema
 
1. Star Schema: The most common modeling paradigm is the star schema, in which the data warehouse contains (1) a large central table (fact table) containing the bulk of the data, with no redundancy, and (2) a set of smaller attendant tables (dimension tables), one for each dimension. The schema graph resembles a starburst, with the dimension tables displayed in a radial pattern around the central fact table.
 
2. Snowflake Schema: The snowflake schema is a variant of the star schema model, where some dimension tables are normalized, thereby further splitting the data into additional tables. The resulting schema graph forms a shape similar to a snowflake.
 
3. Fact Constellation Schema: Sophisticated applications may require multiple fact tables to share dimension tables. This kind of schema can be viewed as a collection of stars, and hence is called a galaxy schema or a fact constellation.
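To make the snowflake variant concrete, the sketch below normalizes one dimension: supplier attributes are moved out of an item dimension into their own table, so a query by supplier type needs one extra join. All table names, column names, and rows here are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Snowflaked item dimension: supplier attributes live in a separate table.
CREATE TABLE supplier_dim (supplier_key INTEGER PRIMARY KEY, supplier_type TEXT);
CREATE TABLE item_dim (
    item_key     INTEGER PRIMARY KEY,
    item_name    TEXT,
    supplier_key INTEGER REFERENCES supplier_dim(supplier_key)
);
CREATE TABLE sales_fact (
    item_key   INTEGER REFERENCES item_dim(item_key),
    units_sold INTEGER
);

-- Hypothetical sample data.
INSERT INTO supplier_dim VALUES (1, 'wholesale'), (2, 'retail');
INSERT INTO item_dim VALUES (10, 'tv', 1), (11, 'radio', 2), (12, 'phone', 1);
INSERT INTO sales_fact VALUES (10, 3), (11, 5), (12, 4), (10, 2);
""")

# Reaching supplier_type takes two joins: fact -> item -> supplier.
units_by_supplier_type = conn.execute("""
    SELECT s.supplier_type, SUM(f.units_sold)
    FROM sales_fact f
    JOIN item_dim i     ON f.item_key = i.item_key
    JOIN supplier_dim s ON i.supplier_key = s.supplier_key
    GROUP BY s.supplier_type
    ORDER BY s.supplier_type
""").fetchall()
print(units_by_supplier_type)
```

In a plain star schema, supplier_type would be a column of item_dim itself and the second join would disappear, at the cost of repeating the supplier attributes on every item row.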
 
Star Schema Diagram:
Consider a database of sales, perhaps from a store chain, classified by date, store and product. Fact_Sales is the fact table and there are three dimension tables: Dim_Date, Dim_Store and Dim_Product. Each dimension table has a primary key on its Id column, relating to one of the columns (viewed as rows in the example schema) of the Fact_Sales table's three-column (compound) primary key (Date_Id, Store_Id, Product_Id). The non-primary key Units_Sold column of the fact table in this example represents a measure or metric that can be used in calculations and analysis. The non-primary key columns of the dimension tables represent additional attributes of the dimensions (such as the Year of the Dim_Date dimension).
Query for the Star Schema:
SELECT P.Brand, S.Country, SUM(F.Units_Sold)
FROM Fact_Sales F
INNER JOIN Dim_Date D ON F.Date_Id = D.Id
INNER JOIN Dim_Store S ON F.Store_Id = S.Id
INNER JOIN Dim_Product P ON F.Product_Id = P.Id
-- The non-aggregated columns must be grouped for SUM to be valid.
GROUP BY P.Brand, S.Country
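A runnable sketch of this query, using Python's built-in sqlite3 module. The table and column names (Fact_Sales, Dim_Date, Dim_Store, Dim_Product, Id, Year, Country, Brand, Units_Sold) come from the description above; the sample rows are made up for illustration. The query groups by brand and country so that SUM is computed per group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dim_Date    (Id INTEGER PRIMARY KEY, Year INTEGER);
CREATE TABLE Dim_Store   (Id INTEGER PRIMARY KEY, Country TEXT);
CREATE TABLE Dim_Product (Id INTEGER PRIMARY KEY, Brand TEXT);
CREATE TABLE Fact_Sales (
    Date_Id    INTEGER REFERENCES Dim_Date(Id),
    Store_Id   INTEGER REFERENCES Dim_Store(Id),
    Product_Id INTEGER REFERENCES Dim_Product(Id),
    Units_Sold INTEGER,
    PRIMARY KEY (Date_Id, Store_Id, Product_Id)  -- compound key
);

-- Hypothetical sample data.
INSERT INTO Dim_Date    VALUES (1, 2011), (2, 2012);
INSERT INTO Dim_Store   VALUES (1, 'India'), (2, 'USA');
INSERT INTO Dim_Product VALUES (1, 'BrandA'), (2, 'BrandB');
INSERT INTO Fact_Sales  VALUES
    (1, 1, 1, 10), (2, 1, 1, 5), (1, 2, 1, 7), (2, 2, 2, 4);
""")

# Total units sold per brand and country across all dates.
rows = conn.execute("""
    SELECT P.Brand, S.Country, SUM(F.Units_Sold)
    FROM Fact_Sales F
    INNER JOIN Dim_Date D    ON F.Date_Id = D.Id
    INNER JOIN Dim_Store S   ON F.Store_Id = S.Id
    INNER JOIN Dim_Product P ON F.Product_Id = P.Id
    GROUP BY P.Brand, S.Country
    ORDER BY P.Brand, S.Country
""").fetchall()

for brand, country, units in rows:
    print(brand, country, units)
```

The join to Dim_Date does not change the result here because no date attribute is selected or filtered; it becomes useful once the query is restricted by an attribute such as Year.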
