What Are Critical Success Factors?
Critical success factors (CSFs) are the key areas of activity in which favorable results are necessary for a company to reach its goal.
There are four basic types of CSFs:
Industry CSFs
Strategy CSFs
Environmental CSFs
Temporal CSFs
2. What is data cube technology used for?
Data cubes are commonly used to make data easy to interpret. A cube represents data along
dimensions, together with measures that reflect business needs. Each dimension of the cube represents
some attribute of the database. E.g. a profit measure can be viewed per day, month or year along a time dimension.
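As a rough illustration of the idea, the sketch below rolls a profit measure up along a time dimension at three levels (day, month, year). The `sales` rows and the `roll_up` helper are made up for this example and do not come from any particular cube product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows: (sale date, profit). Values are invented for illustration.
sales = [
    (date(2023, 1, 5), 120.0),
    (date(2023, 1, 5), 80.0),
    (date(2023, 1, 20), 50.0),
    (date(2023, 2, 3), 200.0),
    (date(2024, 1, 9), 70.0),
]

def roll_up(rows, key):
    """Sum the profit measure for each value of the chosen time level."""
    totals = defaultdict(float)
    for day, profit in rows:
        totals[key(day)] += profit
    return dict(totals)

# The same measure viewed at three levels of the time dimension.
profit_per_day = roll_up(sales, lambda d: d)
profit_per_month = roll_up(sales, lambda d: (d.year, d.month))
profit_per_year = roll_up(sales, lambda d: d.year)
```

A real cube would normally precompute such totals for several dimensions at once; here only the time dimension is shown.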
3. What is data cleaning?
Data cleaning is also known as data scrubbing.
Data cleaning is a process that ensures a set of data is correct and accurate. Data accuracy,
consistency and integration are checked during data cleaning. Data cleaning can be applied
to a single set of records or to multiple sets of data that need to be merged.
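A minimal sketch of cleaning two record sets before merging them is shown below. The field names (`id`, `name`, `email`), the sample records and the rule "keep the first occurrence of each id" are assumptions chosen for illustration only.

```python
def clean_record(record):
    """Trim whitespace and normalise casing so equivalent values compare equal."""
    return {
        "id": int(record["id"]),
        "name": record["name"].strip().title(),
        "email": record["email"].strip().lower(),
    }

def merge_clean(*record_sets):
    """Clean every record and keep the first occurrence of each id."""
    merged = {}
    for records in record_sets:
        for raw in records:
            rec = clean_record(raw)
            merged.setdefault(rec["id"], rec)
    return list(merged.values())

# Two hypothetical source systems holding overlapping customers.
crm = [{"id": "1", "name": " alice smith ", "email": "Alice@Example.com"}]
billing = [
    {"id": 1, "name": "Alice Smith", "email": "alice@example.com "},
    {"id": 2, "name": "bob jones", "email": "BOB@example.com"},
]
customers = merge_clean(crm, billing)
```

After cleaning, the two spellings of the first customer collapse into one record, which is the consistency check described above.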
4. Explain how to mine an OLAP cube.
A data mining extension can be used to slice the data of the source cube according to what data
mining has discovered. The case table is treated as a dimension when a cube is mined.
5. What are different stages of Data mining?
Data mining is a logical process of searching through large amounts of information to find
important data. It has three stages.
Stage 1: Exploration. The data is explored and prepared. The goal of the exploration
stage is to find the important variables and determine their nature.
Stage 2: Pattern identification. The primary action in this stage is searching for patterns and
choosing the one that allows the best prediction.
Stage 3: Deployment. This stage cannot be reached until a consistent, highly predictive
pattern has been found in stage 2. The pattern found in stage 2 is then applied to
see whether the desired outcome is achieved.
6. What are the different problems that Data mining can solve?
Data mining can be used in a variety of fields and industries, such as marketing of products and services,
AI and government intelligence.
For example, the US FBI uses data mining to screen security and intelligence data in order to identify illegal and
incriminating electronic information distributed over the internet.
7. What is Data purging?
Deleting data from a data warehouse is known as data purging. Usually junk data, such as rows with
null values or blank spaces, is cleaned up.
Data purging is the process of cleaning out this kind of junk value.
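The following sketch shows one way such a purge might look, using Python's built-in `sqlite3` module with an in-memory database. The `customer` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Alice"), (2, None), (3, "   "), (4, "Bob")],
)

# Purge rows whose name is NULL or contains only spaces.
cursor = conn.execute(
    "DELETE FROM customer WHERE name IS NULL OR TRIM(name) = ''"
)
purged = cursor.rowcount  # number of rows removed by the DELETE
remaining = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
```

In a real warehouse the purge condition would depend on the retention rules of the project; the NULL-or-blank rule here simply mirrors the description above.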
8. What is BUS schema?
A BUS schema identifies the dimensions that are common across business processes, that is, the
conformed dimensions. It consists of conformed dimensions and standardized definitions of facts.
9. Define non-additive facts?
Non-additive facts are facts that cannot be summed across any of the dimensions present in the fact table.
Adding up these columns does not produce a meaningful result.
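A ratio such as profit margin is a common example of a non-additive fact. The small sketch below, with invented figures, shows why summing the margin column is meaningless and why the ratio has to be recomputed from additive facts (revenue and profit) instead.

```python
# Hypothetical rows: (product, revenue, profit)
rows = [("A", 100.0, 10.0), ("B", 300.0, 90.0)]

# Margin per product is a ratio, so it is non-additive.
margins = [profit / revenue for _, revenue, profit in rows]  # 10% and 30%
naive_sum = sum(margins)  # about 40%, which is not the margin of anything

# Revenue and profit are additive, so sum them first and then take the ratio.
total_revenue = sum(revenue for _, revenue, _ in rows)
total_profit = sum(profit for _, _, profit in rows)
overall_margin = total_profit / total_revenue  # 100 / 400
```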
10. What is conformed fact? What is conformed dimensions used for?
A conformed fact keeps the same name and definition in separate tables, so its values can be
compared and combined mathematically. Conformed dimensions can be used across multiple
data marts. They have a static structure. Any dimension table that is used by multiple fact tables
can be a conformed dimension.
11. What is real time data-warehousing?
In real time data-warehousing, the warehouse is updated every time the system performs a
transaction, so it reflects the real time business data. This means that when a query is run against the
warehouse, the state of the business at that moment is returned.
Explain the use of lookup tables and aggregate tables.
An aggregate table contains summarized view of data.
Lookup tables, using the primary key of the target, allow updating of records based on the
lookup condition.
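Both ideas can be sketched with plain Python dictionaries, as below. The sample rows, the `target` table keyed by primary key, and the update-or-insert rule are assumptions made for illustration, not the behaviour of any specific ETL tool.

```python
# Detail-level fact rows (hypothetical): (region, amount)
sales = [("north", 10), ("north", 15), ("south", 7)]

# Aggregate table: one summarised row per region instead of one row per sale.
aggregate = {}
for region, amount in sales:
    aggregate[region] = aggregate.get(region, 0) + amount

# Target table keyed by its primary key, plus incoming source rows.
target = {1: {"name": "Alice", "city": "Pune"}}
incoming = [
    (1, {"name": "Alice", "city": "Delhi"}),
    (2, {"name": "Bob", "city": "Goa"}),
]

actions = []
for key, row in incoming:
    # The lookup on the primary key decides whether the row is an update or an insert.
    actions.append("update" if key in target else "insert")
    target[key] = row
```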
Business Intelligence is used to analyze data from a business point of view in order to measure an
organization's success.
Factors such as sales, profitability, marketing campaign effectiveness, market share and
operational efficiency are analyzed using Business Intelligence tools such as Cognos and
Informatica.
What is snapshot in a data warehouse?
A snapshot is a complete view of the data as it was at the time of extraction. It occupies less
space and can be used to back up and restore data quickly.
What is ETL process in data warehousing?
ETL stands for extraction, transformation and loading.
The process extracts data from different sources such as flat files, databases or XML data, transforms
this data according to the application's needs, and loads it into a data warehouse.
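The three steps can be sketched end to end with the Python standard library. In this example the flat file is an inline string, the warehouse is an in-memory SQLite database, and the `sales` table and its columns are hypothetical.

```python
import csv
import io
import sqlite3

# Extract: read rows from a flat-file source (inlined as a string for this sketch).
flat_file = io.StringIO("id,name,amount\n1, alice ,10.5\n2,BOB,20\n")
extracted = list(csv.DictReader(flat_file))

# Transform: convert types and normalise names to suit the target table.
transformed = [
    (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
    for r in extracted
]

# Load: write the transformed rows into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
)
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?)", transformed)
loaded = warehouse.execute(
    "SELECT id, name, amount FROM sales ORDER BY id"
).fetchall()
```

An ETL tester would typically compare `extracted` against `loaded` to confirm that every source row arrived with the expected transformation applied.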
Explain the difference between data mining and data warehousing?
Data mining is a method for comparing large amounts of data in order to find patterns.
It is normally used for building models and forecasting.
A data warehouse is the central repository for the data of several business systems in an
enterprise. Data is extracted from various sources and selectively organized in the data warehouse
for analysis and accessibility.
What is an OLTP system and OLAP system?
OLTP = OnLine Transaction Processing.
An OLTP system supports and manages applications whose transactions involve high volumes of
data. OLTP is based on a client-server architecture and supports transactions
across networks.
OLAP = OnLine Analytical Processing.
OLAP performs business data analysis and complex calculations with a low volume of
transactions. With the support of OLAP, a user can gain insight into data coming from
various sources.
E.g. in a date dimension, the levels of granularity could be year, quarter, month, period, week and day.
The process consists of the following two steps:
- Determining the dimensions that are to be included
- Determining the location to place the hierarchy of each dimension of information
Difference between star and snowflake schema.
A snowflake schema is a more normalized form of a star schema. In a star schema, one fact table
is stored with a number of dimension tables. In a snowflake schema, one dimension table can have
multiple sub dimensions, whereas in a star schema each dimension table is independent
and has no sub dimensions.
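The difference can be sketched with SQLite from Python. All table and column names below (`fact_sales`, `star_product`, `snow_product`, `snow_category`) are hypothetical; the same category totals are produced from both layouts, but the snowflake version needs one extra join through the sub-dimension.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE fact_sales (product_id INTEGER, amount REAL);
-- Star: the dimension is denormalised, so category lives in the product table.
CREATE TABLE star_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- Snowflake: category is normalised out into a sub-dimension table.
CREATE TABLE snow_category (category_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE snow_product (product_id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);

INSERT INTO fact_sales VALUES (1, 10.0), (2, 5.0), (3, 20.0);
INSERT INTO star_product VALUES (1, 'Pen', 'Stationery'), (2, 'Ink', 'Stationery'), (3, 'Lamp', 'Home');
INSERT INTO snow_category VALUES (1, 'Stationery'), (2, 'Home');
INSERT INTO snow_product VALUES (1, 'Pen', 1), (2, 'Ink', 1), (3, 'Lamp', 2);
""")

# Star schema query: one join from the fact table to the dimension.
star_totals = db.execute("""
    SELECT p.category, SUM(f.amount) FROM fact_sales f
    JOIN star_product p ON p.product_id = f.product_id
    GROUP BY p.category ORDER BY p.category
""").fetchall()

# Snowflake schema query: a second join reaches the sub-dimension.
snow_totals = db.execute("""
    SELECT c.category, SUM(f.amount) FROM fact_sales f
    JOIN snow_product p ON p.product_id = f.product_id
    JOIN snow_category c ON c.category_id = p.category_id
    GROUP BY c.category ORDER BY c.category
""").fetchall()
```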
What is the difference between view and materialized view?
View:
A view provides a tailored representation of data for accessing data from its underlying table.
It has a logical structure and does not occupy storage space.
Materialized view
Modifying code to fix a bug or to implement new functionality can introduce new
errors.
These introduced errors are called regressions. Checking for regression effects is called
regression testing.
14) Retesting:
Re-executing the failed test cases after the bug has been fixed.
15) System Integration Testing:
Integration testing: after the programming process is complete, the developer integrates the
modules. There are 3 models:
a) Top Down
b) Bottom Up
c) Hybrid
Project
Here I am taking the emp table as an example. I will write test scenarios and test cases for it, which
means we are testing the emp table.
http://etltestingguide.blogspot.com/p/sql.html
The full form of ODS is Operational Data Store. ODS is a layer between the source and target
databases. ODS is used to store the recent data.
The staging layer is also a layer between the source and target databases. The staging layer is used for
cleansing and stores the data periodically.
ODS (Operational Data Store) is the first point in the data warehouse. It stores the real time
data of daily transactions as the first instance of the data.
The staging area is the later part, which comes after the ODS. Here the data is cleansed and
temporarily stored before being loaded into the data warehouse.
ODS is an Operational Data Store that contains real time data (because any changes should be
applied to real time data). The real time data is first dumped into the ODS, also called the landing area. Later
the data is moved into the staging area, which is where all the transformations are done.