DataStage Matter



Published by: mukesh on Aug 24, 2013
Copyright: Attribution Non-commercial


Data Warehouse Basic Questions
Centralized Data Warehouse
A Centralized Data Warehouse is a data warehousing implementation in which a single data warehouse serves the needs of several separate business units simultaneously, using a single data model that spans the needs of multiple business divisions.
What is Central Data Warehouse
A Central Data Warehouse is a repository of company data where a database is created from operational data extracts. This database adheres to a single, consistent enterprise data model to ensure consistency in decision-making support across the company. A Central Data Warehouse is a single physical database which contains business data for a specific function area, department, branch, division or the whole enterprise. The choice of a central data warehouse is commonly based on where there is the largest common need for informational data and where the largest number of end users are already connected to a central computer or a network. A central data warehouse employs the computing style of having all the information systems located and managed from one physical location, even if there are many data sources spread around the globe.
What is Active Data Warehouse
An Active Data Warehouse is a repository of any form of captured transactional data, kept so that it can be used to find trends and patterns for future decision making.
What is Active Metadata Warehouse
An Active Metadata Warehouse is a repository of metadata that helps speed up data reporting and analysis from an active data warehouse. In its simplest definition, metadata is data describing data.
What is Enterprise Data Warehouse
An Enterprise Data Warehouse is a centralized warehouse which provides service for the entire enterprise. A data warehouse is in essence a large repository of the historical and current transaction data of an organization. An Enterprise Data Warehouse is a specialized data warehouse, and the term has several interpretations.
To give a clear picture of an Enterprise Data Warehouse and how it differs from an ordinary data warehouse, five attributes are considered here. The list is not exclusive, but the attributes bring people closer to a focused meaning of the Enterprise Data Warehouse from among the many interpretations of the term. They mainly pertain to the overall philosophy as well as the underlying infrastructure of an Enterprise Data Warehouse.
The first attribute of an Enterprise Data Warehouse is that it should have a single version of the truth: the entire goal of the warehouse's design is to come up with a definitive representation of the organization's business data as well as the corresponding rules. Given the number and variety of systems and silos of company data that exist within any business organization, many business warehouses may not qualify as an Enterprise Data Warehouse.
The second attribute is that an Enterprise Data Warehouse should have multiple subject areas. In order to have a unified version of the truth for an organization, an Enterprise Data Warehouse should contain all subject areas related to the enterprise, such as marketing, sales, finance, human resources and others.
The third attribute is that an Enterprise Data Warehouse should have a normalized design. This is an arguable attribute, as both normalized and denormalized databases have their own advantages for a data warehouse.
In fact, many data warehouse designers have used denormalized models such as star or snowflake schemas for implementing data marts. But many also go for normalized databases for an Enterprise Data Warehouse, putting flexibility first and performance second.
The fourth attribute is that an Enterprise Data Warehouse should be implemented as a Mission-Critical Environment. The entire underlying infrastructure should be able to handle any unforeseen critical conditions, because failure in the data warehouse means stoppage of business operations and loss of income and revenue. An Enterprise Data Warehouse should have high availability features such as online parameter or database structural changes, business continuance features such as failover and disaster recovery, and security features.
Finally, an Enterprise Data Warehouse should be scalable across several dimensions. It should expect that a company's main objective is to grow, and the warehouse should be able to handle the growth of data as well as the growing complexity of processes that comes with the evolution of the business enterprise.
What is Functional Data Warehouse
Today's business environment is very data driven, and more companies are hoping to create a competitive advantage over other business organizations by creating a system whereby they can assess the current status of their operations at any given moment and, at the same time, analyze trends and patterns within the company's operations and their relation to the trends and patterns of the industry in a truly up-to-date fashion.
Breaking down the Enterprise Data Warehouse into several Functional Data Warehouses can have many big benefits. Since the organization as a data driven enterprise deals with very high volumes of data, having separate Functional Data Warehouses distributes the load and compartmentalizes the processes.
With this setup, the whole information system cannot break down at once: if there is a glitch in one of the functional data warehouses, only that part has to be temporarily halted while it is fixed. In a monolithic data warehouse setup, by contrast, if the central database breaks down, the whole system suffers.
What is Operational Data Store (ODS)
An Operational Data Store (ODS) is an integrated database of operational data. Its sources include legacy systems, and it contains current or near-term data. An ODS may contain 30 to 60 days of information, while a data warehouse typically contains years of data.
An operational data store is basically a database that serves as an interim area for a data warehouse. As such, its primary purpose is handling data that is actively in use, such as transactions, inventory and data collected from points of sale. It works with a data warehouse, but unlike a data warehouse, an operational data store does not contain static data. Instead, it contains data that is constantly updated through the course of business operations.
What is Operational Database
An operational database contains enterprise data that is up to date and modifiable. In an enterprise data management system, an operational database can be seen as the counterpart of a decision support database, which contains non-modifiable data extracted for the purpose of statistical analysis. For example, a decision support database provides data so that the average salary of many different kinds of workers can be determined, while the operational database contains the same data as used to calculate the workers' pay checks, depending on the number of days they have reported in a given period.
Data profiling
Data profiling is the process of examining the data available in an existing data source (e.g. a database or a file) and collecting statistics and information about that data. The purpose of these statistics may be to:
Find out whether existing data can easily be used for other purposes
Give metrics on data quality, including whether the data conforms to company standards
Assess the risk involved in integrating data for new applications, including the challenges of joins
Track data quality
Assess whether metadata accurately describes the actual values in the source database
Understand data challenges early in any data intensive project, so that late project surprises are avoided. Finding data problems late in the project can incur time delays and project cost overruns.
Have an enterprise view of all data, for uses such as Master Data Management, where key data is needed, or data governance, for improving data quality
Data governance is a quality control discipline for assessing, managing, using, improving, monitoring, maintaining, and protecting organizational information.[1] It is a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.[2]
What are derived facts and cumulative facts?
There are two kinds of derived facts. The first kind is additive and can be calculated entirely from the other facts in the same fact table row; such facts can be shown in a user view as if they existed in the real data, and the user will never know the difference.
The second kind of derived fact is a non-additive calculation, such as a ratio or a cumulative fact, that is typically expressed at a different level of detail than the base facts themselves. A cumulative fact might be a year-to-date or month-to-date fact. In any case, these kinds of derived facts cannot be presented in a simple view at the DBMS level because they violate the grain of the fact table. They need to be calculated at query time by the BI tool.
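As a minimal sketch of the two kinds of derived fact, the Python below uses a hypothetical daily sales fact table. The column names (`quantity`, `unit_price`, `cost`) and all values are assumptions made for illustration, not taken from any particular schema or tool. The additive derived fact is computed from values in the same row, while the ratio and the year-to-date figure are computed at query time over many rows.

```python
from datetime import date

# Hypothetical fact rows at a daily grain; names and values are illustrative.
fact_sales = [
    {"day": date(2013, 1, 15), "quantity": 10, "unit_price": 5.0, "cost": 30.0},
    {"day": date(2013, 2, 10), "quantity": 4, "unit_price": 5.0, "cost": 12.0},
    {"day": date(2013, 3, 5), "quantity": 6, "unit_price": 5.0, "cost": 18.0},
]


def with_revenue(row):
    """Additive derived fact: computed only from facts in the same row."""
    return {**row, "revenue": row["quantity"] * row["unit_price"]}


def margin_ratio(rows):
    """Non-additive derived fact: the ratio is taken over summed facts,
    because per-row ratios cannot simply be added up."""
    revenue = sum(r["revenue"] for r in rows)
    cost = sum(r["cost"] for r in rows)
    return (revenue - cost) / revenue


def year_to_date(rows, as_of):
    """Cumulative fact: spans many rows, so it does not fit the row's grain."""
    return sum(
        r["revenue"]
        for r in rows
        if r["day"].year == as_of.year and r["day"] <= as_of
    )


rows = [with_revenue(r) for r in fact_sales]
ytd_feb = year_to_date(rows, date(2013, 2, 28))  # January and February only
ratio = margin_ratio(rows)                       # computed over all rows
```

Here `revenue` could be exposed in a view alongside the stored facts, whereas `year_to_date` and `margin_ratio` depend on which rows the query selects, which is why they are left to query time.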
Question: What is the data type of the surrogate key?
Answer: The data type of the surrogate key is integer, numeric or number.
Question: What is a hybrid slowly changing dimension?
Answer: Hybrid SCDs are a combination of both SCD Type 1 and SCD Type 2. In a table, some columns may be important enough that we need to track changes to them, i.e. capture their historical data, whereas for other columns we do not care even if the data changes. For such tables we implement hybrid SCDs, in which some columns are Type 1 and some are Type 2.
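The hybrid approach can be sketched in a few lines of Python. This is an illustration under assumed names, not a DataStage implementation: the customer dimension, its columns, and the choice of `email` as a Type 1 column and `city` as a Type 2 column are all hypothetical.

```python
from itertools import count

TYPE1_COLS = {"email"}  # assumed Type 1: overwrite, no history kept
TYPE2_COLS = {"city"}   # assumed Type 2: new row per change, history kept

surrogate_keys = count(1)  # simple integer surrogate key generator


def apply_change(dim_rows, customer_id, changes, load_date):
    """Apply one source change to a dimension held as a list of dicts."""
    versions = [r for r in dim_rows if r["customer_id"] == customer_id]
    current = next(r for r in versions if r["is_current"])

    type1 = {c: v for c, v in changes.items() if c in TYPE1_COLS}
    type2 = {c: v for c, v in changes.items()
             if c in TYPE2_COLS and v != current[c]}

    # Type 1: overwrite the value on every stored version of the customer.
    for row in versions:
        row.update(type1)

    # Type 2: close the current version and append a new one.
    if type2:
        new_row = {**current, **type2,
                   "customer_sk": next(surrogate_keys),
                   "valid_from": load_date, "valid_to": None,
                   "is_current": True}
        current.update(valid_to=load_date, is_current=False)
        dim_rows.append(new_row)
    return dim_rows


dim = [{"customer_sk": next(surrogate_keys), "customer_id": "C1",
        "email": "a@example.com", "city": "Pune",
        "valid_from": "2013-01-01", "valid_to": None, "is_current": True}]

apply_change(dim, "C1", {"email": "b@example.com"}, "2013-03-01")  # Type 1 only
apply_change(dim, "C1", {"city": "Mumbai"}, "2013-06-01")          # Type 2
```

After the two calls the dimension holds two rows for customer C1: the email was overwritten in place, while the city change closed the old row and added a new current row with its own surrogate key. Overwriting Type 1 columns on all versions is one common design choice; some designs update only the current row.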
