
# RM -Multivariate Analysis

12/30/2013

By: KrIsHna

Multivariate Analysis
Multivariate analysis is the study of several dependent random variables simultaneously. These analyses are direct generalizations of univariate analyses. Certain distributional assumptions are required for proper analysis, and the mathematical framework is relatively complex compared with the univariate case. These methods are widely used around the world.

Multivariate Analysis Methods
Two general types of MVA technique:

Analysis of dependence
- One (or more) variables are dependent variables, to be explained or predicted by others
- E.g. multiple regression, PLS

Analysis of interdependence
- No variables are thought of as "dependent"
- Looks at the relationships among variables, objects or cases
- E.g. cluster analysis, factor analysis

Some Multivariate Measures
- The Mean Vector: the collection of the means of the variables under study
- The Covariance Matrix: the collection of the variances and covariances of the variables under study
- The Correlation Matrix: the collection of the correlation coefficients of the variables under study
- The Generalized Variance: the determinant of the covariance matrix
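As a minimal sketch of these four measures (hypothetical data matrix; numpy assumed to be available):

```python
import numpy as np

# Hypothetical data matrix: n = 6 observations on p = 3 variables.
X = np.array([
    [4.0, 2.0, 0.60],
    [4.2, 2.1, 0.59],
    [3.9, 2.0, 0.58],
    [4.3, 2.1, 0.62],
    [4.1, 2.2, 0.63],
    [4.4, 2.3, 0.61],
])

mean_vector = X.mean(axis=0)                 # the mean vector
cov_matrix = np.cov(X, rowvar=False)         # the covariance matrix (p x p)
corr_matrix = np.corrcoef(X, rowvar=False)   # the correlation matrix
gen_variance = np.linalg.det(cov_matrix)     # generalized variance = det of cov

print(mean_vector)
print(cov_matrix)
print(corr_matrix)
print(gen_variance)
```

The generalized variance collapses the whole covariance matrix into a single number; it shrinks toward zero as the variables become more strongly linearly related.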

Some Multivariate Tests of Significance
- Testing significance of a single mean vector
- Testing equality of two mean vectors
- Testing equality of several mean vectors
- Testing significance of a single covariance matrix
- Testing equality of two covariance matrices
- Testing equality of several covariance matrices
- Testing independence of sets of variates
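The first test in the list, the test of a single mean vector, is usually carried out with Hotelling's T-squared statistic. A sketch under the usual multivariate-normal assumption (hypothetical data; numpy and scipy assumed):

```python
import numpy as np
from scipy import stats

def hotelling_t2_one_sample(X, mu0):
    """One-sample Hotelling's T^2 test of H0: the population mean vector equals mu0."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)                  # sample covariance matrix
    diff = xbar - np.asarray(mu0)
    t2 = n * diff @ np.linalg.solve(S, diff)     # T^2 statistic
    # Under H0, (n - p) / (p * (n - 1)) * T^2 follows an F(p, n - p) distribution.
    f_stat = (n - p) / (p * (n - 1)) * t2
    p_value = stats.f.sf(f_stat, p, n - p)
    return t2, f_stat, p_value

# Hypothetical data: 30 observations on 3 variables, true mean (0, 0, 0).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
t2, f_stat, p_value = hotelling_t2_one_sample(X, [0.0, 0.0, 0.0])
print(t2, f_stat, p_value)
```

The tests of two or several mean vectors generalize this in the same way the two-sample t-test and ANOVA generalize the one-sample t-test.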

The Factor Analysis
- Deals with the grouping of like variables into sets.
- Sets are formed in decreasing order of importance.
- Sets are relatively independent of each other.
- Two types are commonly used:
  - The Exploratory Factor Analysis (one of the most commonly used techniques)
  - The Confirmatory Factor Analysis

The Exploratory Factor Analysis
This technique deals with exploring the structure of the data. The variables involved in the study are equally important. Variables are grouped together on the basis of their closeness. Groups are generally formed so that they are orthogonal to each other, but this assumption can be relaxed. The technique aims to explain the covariances of the variables.

Some Measures in Factor Analysis
The factor analysis model is:

$$X_i = \sum_{j=1}^{m} \lambda_{ij} f_j + e_i, \qquad i = 1, 2, \ldots, p$$

The quantity $\lambda_{ij}$ is the loading of the i-th variable on the j-th factor and measures the degree of dependence of a variable on a factor. The i-th communality, which measures the portion of the variation of the i-th variable explained by the m common factors, is given as

$$h_i^2 = \sum_{j=1}^{m} \lambda_{ij}^2$$

Factor Rotation
- Rotation is done to simplify the solution of factor analysis.
- Interpretations can be made easily from the rotated solution.
- Two types of rotation are available:
  - Orthogonal rotation: factors formed are orthogonal
  - Oblique rotation: factors formed are correlated
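Varimax is the most common orthogonal rotation. A sketch with simulated data (scikit-learn's `FactorAnalysis` is assumed to be available with its `rotation` parameter, present since version 0.24):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Hypothetical data: two latent factors driving six observed variables.
f = rng.normal(size=(300, 2))
W = np.array([[0.9, 0.8, 0.7, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.9, 0.8, 0.7]])
X = f @ W + 0.3 * rng.normal(size=(300, 6))

unrotated = FactorAnalysis(n_components=2).fit(X)
rotated = FactorAnalysis(n_components=2, rotation="varimax").fit(X)

# Loadings are stored in components_ (one row per factor). Varimax drives
# each variable's loading toward a single factor, which simplifies
# interpretation without changing the fitted covariance structure.
print(np.round(unrotated.components_, 2))
print(np.round(rotated.components_, 2))
```

An oblique rotation (e.g. promax) would additionally allow the rotated factors to correlate; scikit-learn only provides orthogonal options, so oblique rotation would need another library.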

Cluster Analysis
- Techniques for identifying separate groups of similar cases
- Similarity of cases is either specified directly in a distance matrix, or defined in terms of some distance function
- Also used to summarise data by defining segments of similar cases in the data; this use of cluster analysis is known as "dissection"
Clustering Techniques
Two main types of cluster analysis methods:

- Hierarchical cluster analysis
  - Each cluster (starting with the whole dataset) is divided into two, then divided again, and so on
- Iterative methods
  - k-means clustering (PROC FASTCLUS)
  - Analogous non-parametric density estimation methods

Also other methods:
- Overlapping clusters
- Fuzzy clusters
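The two main types can be sketched side by side in Python (hypothetical data; scipy and scikit-learn assumed; note that scipy's hierarchical clustering is agglomerative, building the tree bottom-up rather than by repeated division):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical data: three well-separated groups in two dimensions.
X = np.vstack([rng.normal(loc, 0.3, size=(20, 2))
               for loc in ([0, 0], [4, 0], [0, 4])])

# Hierarchical clustering: build the full tree, then cut it into 3 clusters.
tree = linkage(X, method="ward")
hier_labels = fcluster(tree, t=3, criterion="maxclust")

# Iterative k-means (the analogue of SAS PROC FASTCLUS); n_init restarts
# guard against bad local minima.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(len(np.unique(hier_labels)), len(np.unique(km_labels)))
```

The hierarchical tree gives solutions at every number of clusters at once; k-means must be rerun for each choice of k.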

Applications
- Market segmentation is usually conducted using some form of cluster analysis to divide people into segments
- Other methods such as latent class models or archetypal analysis are sometimes used instead
- It is also possible to cluster other items, such as products/SKUs, image attributes, or brands

Cluster Analysis Options
There are several choices of how to form clusters in hierarchical cluster analysis:

- Ward's method (like k-means, as in PROC FASTCLUS) tends to form equal-sized, roundish clusters
- Average linkage generally forms roundish clusters with equal variance
- Density linkage can identify clusters of different shapes
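The effect of the linkage choice can be seen by clustering a dataset containing one roundish and one elongated group (hypothetical data; scipy assumed; scipy has no density linkage, so single linkage, which can also follow elongated shapes, stands in for a shape-sensitive method here):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)
# One roundish cluster and one elongated cluster (hypothetical data).
round_blob = rng.normal([0.0, 0.0], 0.3, size=(30, 2))
elongated = np.column_stack([rng.uniform(3, 9, 60), rng.normal(0, 0.2, 60)])
X = np.vstack([round_blob, elongated])

# Cut each tree into 2 clusters and compare the cluster sizes.
for method in ("ward", "average", "single"):
    labels = fcluster(linkage(X, method=method), t=2, criterion="maxclust")
    sizes = np.bincount(labels)[1:]
    print(method, sizes)
```

Methods that favour roundish, equal-sized clusters may split the elongated group rather than separate the two true groups.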

Cluster Analysis Issues
- Distance definition
  - Weighted Euclidean distance often works well, if weights are chosen intelligently
- Cluster shape
  - The shape of the clusters found is determined by the method, so choose the method appropriately
- Computation time
  - Hierarchical methods usually take more computation time than k-means
  - However, multiple runs are more important for k-means, since it can be badly affected by local minima
- Adjusting for response styles can also be worthwhile
  - Some people give more positive responses overall than others
  - Clusters may simply reflect these response styles unless this is adjusted for, e.g. by standardising responses across attributes for each respondent
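The response-style adjustment mentioned above amounts to standardising each respondent's row of ratings. A minimal sketch (hypothetical ratings; numpy assumed):

```python
import numpy as np

# Hypothetical ratings: rows are respondents, columns are attributes.
ratings = np.array([
    [5.0, 4.0, 5.0, 4.0],   # a "positive" responder
    [2.0, 1.0, 2.0, 1.0],   # a "negative" responder with the same pattern
    [1.0, 5.0, 1.0, 5.0],   # a genuinely different pattern
])

# Standardise each respondent's row: subtract the row mean and divide by
# the row standard deviation, removing the overall response level.
row_mean = ratings.mean(axis=1, keepdims=True)
row_std = ratings.std(axis=1, keepdims=True)
adjusted = (ratings - row_mean) / row_std

# After adjustment the first two respondents have identical profiles.
print(np.allclose(adjusted[0], adjusted[1]))  # → True
```

Without this step, a clustering algorithm would likely separate the first two respondents even though their attribute preferences are the same.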

Cluster Means
(In the original slide, each row's maximum and minimum were highlighted as "max." and "min.")

|           | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
|-----------|-----------|-----------|-----------|-----------|
| Reason 1  | 4.55 | 2.65 | 4.21 | 4.50 |
| Reason 2  | 4.32 | 4.32 | 4.12 | 4.02 |
| Reason 3  | 4.43 | 3.28 | 3.90 | 4.06 |
| Reason 4  | 3.85 | 3.89 | 2.15 | 3.35 |
| Reason 5  | 4.10 | 3.77 | 2.19 | 3.80 |
| Reason 6  | 4.50 | 4.57 | 4.09 | 4.28 |
| Reason 7  | 3.93 | 4.10 | 1.94 | 3.66 |
| Reason 8  | 4.09 | 3.17 | 2.30 | 3.77 |
| Reason 9  | 4.17 | 4.27 | 3.51 | 3.82 |
| Reason 10 | 4.12 | 3.75 | 2.66 | 3.47 |
| Reason 11 | 4.58 | 3.79 | 3.84 | 4.37 |
| Reason 12 | 3.51 | 2.78 | 1.86 | 2.60 |
| Reason 13 | 4.14 | 3.95 | 3.06 | 3.45 |
| Reason 14 | 3.96 | 3.75 | 2.06 | 3.83 |
| Reason 15 | 4.19 | 2.42 | 2.93 | 4.04 |

Cluster Means
(In the original slide, each row's maximum and minimum were highlighted as "max." and "min.")

|          | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
|----------|-----------|-----------|-----------|-----------|
| Usage 1  | 3.43 | 3.66 | 3.48 | 4.00 |
| Usage 2  | 3.91 | 3.94 | 3.86 | 4.26 |
| Usage 3  | 3.07 | 2.95 | 2.61 | 3.13 |
| Usage 4  | 3.85 | 3.02 | 2.62 | 2.50 |
| Usage 5  | 3.86 | 3.55 | 3.52 | 3.56 |
| Usage 6  | 3.87 | 4.25 | 4.14 | 4.56 |
| Usage 7  | 3.88 | 3.29 | 2.78 | 2.59 |
| Usage 8  | 3.71 | 2.88 | 2.58 | 2.34 |
| Usage 9  | 4.09 | 3.38 | 3.19 | 2.68 |
| Usage 10 | 4.58 | 4.26 | 4.00 | 3.91 |

Thank You
