
Data Analysis

Data analysis is a process of:


 Gathering;
 Modeling; and
 Transforming of data.

With the goal of:

 Highlighting useful information;
 Suggesting conclusions; and
 Supporting decision making.

11/06/09 XIDAS, Jabalpur 2


Major Data Analysis Techniques
 Correlation Analysis;
 Regression Analysis;
 Factor Analysis;
 Cluster Analysis;
 Correspondence Analysis (Brand Mapping);
 Conjoint Analysis;
 CHAID Analysis;
 Discriminant / Logistic Regression Analysis;
 Multidimensional Scaling; and
 Structural Equation Modeling.



CORRELATION ANALYSIS
 Correlation analysis, expressed by correlation coefficients, measures the degree of linear relationship between two variables.

 Features of the correlation coefficient:

 It lies between +1 and -1;
 The sign of the correlation coefficient (+, -) defines the direction of the relationship, positive or negative;
 A positive correlation coefficient means that as the value of one variable increases, the value of the other variable also increases; as one decreases, the other decreases; and
 A negative correlation coefficient indicates that as one variable increases, the other decreases, and vice versa.



Cont.
 The absolute value of the correlation coefficient measures the strength of the relationship.
 A correlation coefficient of r=0.50 indicates a
stronger degree of linear relationship than one
of r=0.40.
 Correlation coefficient of zero (r=0.0) indicates
the absence of a linear relationship.
 Correlation coefficients of r=+1.0 and r=-1.0
indicate a perfect linear relationship.
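
The properties listed above can be checked with a short Python sketch. The function name pearson_r and the sample data are illustrative assumptions, not part of the original slides; the formula is the standard Pearson product-moment coefficient.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and the two sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only to illustrate the sign and the extremes of r.
x = [1, 2, 3, 4, 5]
r_perfect_positive = pearson_r(x, [2, 4, 6, 8, 10])  # points on a rising line
r_perfect_negative = pearson_r(x, [10, 8, 6, 4, 2])  # points on a falling line
r_none = pearson_r(x, [2, 1, 3, 1, 2])               # no linear trend
```

The three results sit at the two ends and the middle of the range: a perfect positive relationship, a perfect negative relationship, and no linear relationship.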



Diagrammatic presentation of “r”

[Figure: plots for r = 1, r = -0.5 and r = 0.5]



Regression analysis
 Regression analysis measures the strength of the relationship between:
 a dependent variable (e.g. overall customer satisfaction); and
 one or more explanatory variables (e.g. satisfaction with product quality and price).

 Correlation provides a single numeric summary of a relation (called the correlation coefficient), while regression analysis results in a "prediction" equation.

 The regression equation describes the relation between the variables. If the relationship is strong (expressed by the R-squared value), it can be used to predict values of one variable given that the other variables have known values.
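
A minimal sketch of a "prediction" equation with a single explanatory variable, fitted by ordinary least squares. The function name fit_line and the survey scores are hypothetical and serve only to illustrate the idea; a real study would normally use several explanatory variables and a statistics package.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - ss_residual / ss_total
    return intercept, slope, r_squared

# Hypothetical survey scores on a 1-10 scale:
# satisfaction with product quality (x) and overall satisfaction (y).
quality = [2, 4, 5, 7, 9]
overall = [3, 4, 6, 7, 9]

intercept, slope, r_squared = fit_line(quality, overall)
# Use the fitted equation to predict overall satisfaction for a quality score of 6.
predicted = intercept + slope * 6
```

With these made-up scores the R-squared value is close to 1, which is the situation in which the equation is useful for prediction.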



Factor Analysis
 Factor analysis aims to describe a large number of variables or questions by using only a reduced set of underlying variables, called factors.

 It explains a pattern of similarity between observed variables. Questions which belong to one factor are highly correlated with each other.

Types of Factor Analysis
 Exploratory
 Confirmatory



Use of Factor Analysis
Factor analysis is often used in customer satisfaction studies to identify underlying service dimensions, and in profiling studies to determine core attitudes.
 For example, as part of a national survey on political opinions, respondents may answer three separate questions regarding environmental policy, reflecting issues at the local, regional and national level.
 Factor analysis can be used to establish whether the three measures do, in fact, measure the same thing.
 It can also prove useful when a lengthy questionnaire needs to be shortened while still retaining key questions.
 Factor analysis will indicate which questions can be omitted without losing too much information.



CLUSTER ANALYSIS
Cluster analysis is an exploratory tool designed to reveal natural
groupings within a large group of observations. Cluster analysis
segments the survey sample, i.e. respondents or companies,
into a small number of groups.
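
The slides do not name a clustering algorithm. The sketch below uses k-means purely as one common choice; the function name k_means, the starting centroids and the respondent scores are all hypothetical.

```python
def k_means(points, centroids, iterations=20):
    """Basic k-means for 2-D points given as (x, y) tuples.

    Returns (labels, centroids), where labels[i] is the cluster of points[i].
    """
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical respondents scored on two survey questions (1-10 scale).
respondents = [(1, 2), (2, 1), (2, 2), (8, 9), (9, 8), (9, 9)]
labels, centroids = k_means(respondents, centroids=[respondents[0], respondents[3]])
```

Here the six respondents fall into two segments, low scorers and high scorers. In practice the number of groups and the starting centroids are analyst choices, and results should be checked for stability.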



BRAND MAPPING (CORRESPONDENCE ANALYSIS)
 Correspondence analysis is a technique which allows the rows and columns of a data matrix, e.g. average satisfaction scores for several products, to be displayed as points in a two-dimensional space or map.
 It reduces a complicated set of data to a graphical display which is immediately and easily interpretable.
 Brand maps are based on correspondence analysis.
 Brand maps are often used to illustrate customers' images of the market by placing products and attributes together on a map.
CONJOINT ANALYSIS
 Conjoint analysis is a technique for measuring
respondent preferences about the attributes of a
product or service.
 It is the ideal tool for new/improved product
development.
 The conjoint analysis task asks the respondents to
make choices in the same fashion as consumers
normally do, by trading off features one against the
other, either by ranking or choosing one of several
product combinations.
 E.g. a task could be: do you prefer a "flight that is
cramped, costs £250 and has one stop" or a "flight
that is spacious, costs £500 and is direct"?



CHAID ANALYSIS
 CHAID (Chi-squared Automatic Interaction Detection) is used to build a predictive model based on a classification system.
 The analysis subdivides the sample into a series of subgroups that:
○ 1) share similar characteristics towards a specific response variable; and
○ 2) maximise our ability to predict the values of the response variable.
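
A rough sketch of the idea behind a single CHAID split: cross-tabulate each candidate predictor against the response, compute the chi-squared statistic, and split on the predictor with the strongest association. A full CHAID procedure also merges categories, tests significance with adjusted p-values and repeats the process within each subgroup; none of that is shown here. The function name chi_squared and the data are hypothetical.

```python
from collections import Counter

def chi_squared(predictor, response):
    """Pearson chi-squared statistic for the cross-tabulation of two categorical lists."""
    n = len(predictor)
    observed = Counter(zip(predictor, response))
    row_totals = Counter(predictor)
    col_totals = Counter(response)
    statistic = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            statistic += (observed[(row, col)] - expected) ** 2 / expected
    return statistic

# Hypothetical respondents: two candidate predictors and a yes/no response.
region = ["north", "north", "north", "north", "south", "south", "south", "south"]
age_band = ["young", "old", "young", "old", "young", "old", "young", "old"]
bought = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]

scores = {
    "region": chi_squared(region, bought),
    "age_band": chi_squared(age_band, bought),
}
# Both tables are 2 x 2, so the raw statistics can be compared directly.
best_split = max(scores, key=scores.get)
```

In this made-up sample the response differs by region but not by age band, so region would be chosen for the first split.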



DISCRIMINANT / LOGISTIC REGRESSION ANALYSIS
 Discriminant and logistic regression analysis are statistical techniques that point out the differences between two or more groups based on several characteristics (most often rating scales in discriminant analysis, while logistic regression can handle any type of variable).

They are often used:
 to determine which customers are likely to buy a company's product;
 to decide whether a bank should offer a loan to a new company; or
 to identify patients who may be at high risk of medical problems.
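
A minimal sketch of logistic regression for the first use case, with one predictor and a yes/no outcome. The coefficients are estimated here by plain gradient descent, which is an assumption made to keep the example dependency-free; statistical packages generally use other estimation routines. The function names, ratings and purchase outcomes are hypothetical.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, learning_rate=0.1, epochs=20000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        # Prediction error for every respondent under the current coefficients.
        errors = [sigmoid(b0 + b1 * xi) - yi for xi, yi in zip(x, y)]
        b0 -= learning_rate * sum(errors) / n
        b1 -= learning_rate * sum(e * xi for e, xi in zip(errors, x)) / n
    return b0, b1

# Hypothetical customers: satisfaction rating (1-8) and whether they bought (1) or not (0).
rating = [1, 2, 3, 4, 5, 6, 7, 8]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(rating, bought)
p_low = sigmoid(b0 + b1 * 2)   # estimated purchase probability at rating 2
p_high = sigmoid(b0 + b1 * 7)  # estimated purchase probability at rating 7
```

Customers can then be assigned to the "likely to buy" group when the estimated probability exceeds a chosen cut-off such as 0.5.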



MULTIDIMENSIONAL SCALING
 Multidimensional scaling (MDS) can be
considered to be an alternative to factor
analysis.
 In general, the goal of the analysis is to detect
meaningful underlying dimensions that allow
the researcher to explain observed similarities
or dissimilarities between the investigated
objects. In factor analysis, the similarities
between objects (e.g. variables) are expressed
in the correlation matrix.
 With MDS one may analyse any kind of
similarity or dissimilarity matrix, in addition to
correlation matrices.



STRUCTURAL EQUATION MODELING
 Structural Equation Modeling (SEM) is a very general, very powerful multivariate analysis technique that includes a number of other traditional analysis methods as special cases.
 It effectively includes a whole range of standard multivariate analysis methods, such as regression, factor analysis and analysis of variance.
 A structural equation model can consist of several regression and factor analysis models, which are estimated simultaneously.