Data Warehouse Ques

Published by: viswanath12 on Jun 29, 2012
Copyright:Attribution Non-commercial







What is a data warehouse?
A data warehouse is an electronic store of an organization's historical data, kept for the purpose of analysis and reporting. According to Inmon, a data warehouse should be subject-oriented, non-volatile, integrated and time-variant.
Explanatory Note
Note that non-volatile means that data, once loaded into the warehouse, will not be deleted later. Time-variant means the data will change with respect to time. The above definition of data warehousing is typically considered the "classical" definition. However, if you are interested, you may want to read the article "What is a data warehouse - A 101 guide to modern data warehousing", which opens up a broader definition of data warehousing.
What are the benefits of a data warehouse?
A data warehouse helps to integrate data (see Data integration) and store it historically, so that we can analyze different aspects of the business, including performance analysis, trends and prediction, over a given time frame, and use the results of that analysis to improve the efficiency of business processes.
Why is a data warehouse used?
For a long time, and still today, data warehouses have been built to facilitate reporting on an organization's key business processes through key performance indicators (KPIs). Data warehouses also help to integrate data from different sources and show single-point-of-truth values for the business measures.
A data warehouse can further be used for data mining, which helps with trend prediction, forecasting, pattern recognition etc. Check this article to know more about data mining.
What is the difference between OLTP and OLAP?
OLTP is the transaction system that collects business data, whereas OLAP is the reporting and analysis system that works on that data.
OLTP systems are optimized for INSERT and UPDATE operations and are therefore highly normalized. OLAP systems, on the other hand, are deliberately denormalized for fast data retrieval through SELECT operations.
Explanatory Note:
In a department store, when we pay at the check-out counter, the salesperson at the counter keys all the data into a "Point-Of-Sales" machine. That data is transaction data and the related system is an OLTP system.
On the other hand, the manager of the store might want to view a report on out-of-stock materials, so that he can place purchase orders for them. Such a report will come out of an OLAP system.
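The check-out and reporting scenario above can be sketched in a few lines. This is only an illustration under stated assumptions: the table names `stock` and `sale`, the items and the quantities are all hypothetical, and both workloads share one in-memory SQLite database for brevity, whereas in practice the report would usually run on a separate analytical system.

```python
import sqlite3

# Hypothetical schema: a tiny operational database for one store.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock (item TEXT PRIMARY KEY, on_hand INTEGER NOT NULL);
CREATE TABLE sale (
    sale_id  INTEGER PRIMARY KEY,
    item     TEXT REFERENCES stock(item),
    quantity INTEGER NOT NULL
);
INSERT INTO stock VALUES ('Rice', 2), ('Sugar', 5);
""")

def record_sale(item, quantity):
    """OLTP-style work: one small write transaction per check-out."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO sale (item, quantity) VALUES (?, ?)",
                     (item, quantity))
        conn.execute("UPDATE stock SET on_hand = on_hand - ? WHERE item = ?",
                     (quantity, item))

record_sale("Rice", 2)
record_sale("Sugar", 1)

# OLAP-style work: a read-only report for the store manager.
out_of_stock = [row[0] for row in conn.execute(
    "SELECT item FROM stock WHERE on_hand <= 0 ORDER BY item")]
print(out_of_stock)
```

The write path touches a couple of rows at a time, while the report only reads; that difference in access pattern is what drives the different optimizations described above.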
What is a data mart?
Data marts are generally designed for a single subject area. An organization may have data pertaining to different departments, such as Finance, HR and Marketing, stored in the data warehouse, and each department may have its own data mart. These data marts can be built on top of the data warehouse.
What is an ER model?
An ER model is an entity-relationship model, which is designed with the goal of normalizing the data.
What is dimensional modeling?
A dimensional model consists of dimension and fact tables. Fact tables store the transactional measurements along with foreign keys to the dimension tables that qualify those measurements. The goal of a dimensional model is not to achieve a high degree of normalization but to facilitate easy and fast data retrieval.
What is a dimension?
A dimension is something that qualifies a quantity (measure). If I just say “20kg”, it does not mean anything. But "20kg of Rice (product) is sold to Ramesh (customer) on 5th April (date)" makes sense. Here product, customer and date are the dimensions that qualify the measure. Dimensions are mutually independent.
Technically speaking, a dimension is a data element that categorizes each item in a data set into non-overlapping regions.
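As a rough sketch of the idea, the rice example can be written as rows in which one measure is qualified by three dimension values. Only the first row comes from the text; the year, the second customer, the second product and the helper `total_by` are hypothetical additions for illustration.

```python
from collections import defaultdict
from datetime import date

# Each row pairs one measure (quantity_kg) with the dimension values
# that qualify it. Only the first row is from the text; the rest, and
# the year 2012, are made up for the example.
sales = [
    {"product": "Rice", "customer": "Ramesh", "date": date(2012, 4, 5), "quantity_kg": 20},
    {"product": "Rice", "customer": "Suresh", "date": date(2012, 4, 5), "quantity_kg": 10},
    {"product": "Wheat", "customer": "Ramesh", "date": date(2012, 4, 6), "quantity_kg": 5},
]

def total_by(rows, dimension):
    """Sum the measure within each distinct value of one dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row["quantity_kg"]
    return dict(totals)

by_product = total_by(sales, "product")
by_customer = total_by(sales, "customer")
print(by_product)
print(by_customer)
```

Every row lands in exactly one group per dimension, which is the "non-overlapping regions" property mentioned above.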
What is a fact?
A fact is something that is quantifiable (or measurable). Facts are typically (but not always) numerical values that can be aggregated.
What are additive, semi-additive and non-additive measures?
Non-additive measures are those which cannot be used inside any numeric aggregation function (e.g. SUM(), AVG() etc.). One example of a non-additive fact is any kind of ratio or percentage, for example a 5% profit margin or a revenue-to-asset ratio. Non-numerical data can also be a non-additive measure when it is stored in fact tables.
Semi-additive measures are those to which only a subset of aggregation functions can be applied. Take account balance: a SUM() over balances does not give a useful result, but the MAX() or MIN() balance might be useful. Or consider a price rate or currency rate: a sum is meaningless on a rate; however, an average might be useful.
Additive measures can be used with any aggregation function, such as SUM() or AVG(). An example is sales quantity.
At this point, I will request you to pause and make some time to read this article on "Classifying data for successful modeling". This article helps you to understand the differences between dimensional data, factual data etc. from a fundamental perspective.
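The three categories can be illustrated with a small numeric sketch. All figures and variable names below are hypothetical, and the approach shown for the ratio (recomputing it from additive components) is one common way to handle non-additive measures, not something prescribed by the text.

```python
# Additive: sales quantity can be summed freely.
daily_quantity = [20, 10, 5]
total_quantity = sum(daily_quantity)

# Semi-additive: end-of-day account balances (hypothetical snapshots).
# Summing them over time yields a number that was never in the account,
# while max, min or the closing value are meaningful.
daily_balance = [100, 150, 120]
misleading_sum = sum(daily_balance)
max_balance = max(daily_balance)
closing_balance = daily_balance[-1]

# Non-additive: profit margin is a ratio. Aggregating the ratios
# directly misleads; recompute it from the additive components instead.
stores = [
    {"revenue": 1000, "profit": 50},   # 5% margin
    {"revenue": 100, "profit": 20},    # 20% margin
]
margins = [s["profit"] / s["revenue"] for s in stores]
naive_average_margin = sum(margins) / len(margins)
overall_margin = (sum(s["profit"] for s in stores)
                  / sum(s["revenue"] for s in stores))

print(total_quantity, max_balance, closing_balance)
print(round(naive_average_margin, 4), round(overall_margin, 4))
```

The naive average treats the small store and the large store equally, which is why it differs from the margin of the combined business.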
What is a star schema?
This schema is used in data warehouse models where one centralized fact table references a number of dimension tables, so that the primary keys of all the dimension tables flow into the fact table as foreign keys, alongside the stored measures. The entity-relationship diagram of such a model looks like a star, hence the name.
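A minimal star schema along these lines might look as follows, using Python's built-in `sqlite3` module. The table and column names (`dim_product`, `dim_customer`, `dim_date`, `fact_sales`) and all the rows are assumptions made up for this sketch, reusing the rice example from earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one primary key each, plus descriptive attributes.
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, product_name  TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE dim_date     (date_key     INTEGER PRIMARY KEY, calendar_date TEXT);

-- Central fact table: foreign keys to every dimension, plus the measure.
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    quantity_kg  REAL
);

INSERT INTO dim_product  VALUES (1, 'Rice'), (2, 'Wheat');
INSERT INTO dim_customer VALUES (1, 'Ramesh'), (2, 'Suresh');
INSERT INTO dim_date     VALUES (20120405, '2012-04-05');
INSERT INTO fact_sales   VALUES (1, 1, 20120405, 20),
                                (1, 2, 20120405, 10),
                                (2, 1, 20120405, 5);
""")

# A typical query: join the fact table to one dimension and aggregate.
rows = conn.execute("""
    SELECT p.product_name, SUM(f.quantity_kg)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(rows)
```

Each query needs only one join per dimension it uses, which is the retrieval benefit the dimensional model aims for.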
