
CHAPTER 1

INTRODUCTION TO MULTIVARIATE ANALYSIS

1.1 Multivariate Statistics


A taxonomy of multivariate statistical analyses shows that most techniques fall
into one of the following categories:
a) Data reduction or structural simplification.
b) Sorting and grouping.
c) Investigation of the dependence among variables.
d) Prediction.
e) Hypothesis construction and testing.

11/02/2023
1.2 Data Organization
Multivariate data are a collection of observations (or measurements) of:
• $p$ variables ($k = 1, \ldots, p$).
• $n$ “items” ($j = 1, \ldots, n$).
• “Items” can also be thought of as subjects, examinees, or individuals, or as entities (when people are not under study).
• In some disciplines (such as educational measurement), “items” are considered the variables collected per individual.
1.3 Data Organization
$x_{jk}$ = measurement of the kth variable on the jth entity.

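          Variable 1   Variable 2   ...   Variable k   ...   Variable p
Item 1    $x_{11}$     $x_{12}$     ...   $x_{1k}$     ...   $x_{1p}$
Item 2    $x_{21}$     $x_{22}$     ...   $x_{2k}$     ...   $x_{2p}$
...
Item j    $x_{j1}$     $x_{j2}$     ...   $x_{jk}$     ...   $x_{jp}$
...
Item n    $x_{n1}$     $x_{n2}$     ...   $x_{nk}$     ...   $x_{np}$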
1.4 Arrays
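To represent the entire collection of items and variables, a rectangular array with $n$ rows and $p$ columns can be constructed:

\[
\mathbf{X} =
\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{j1} & x_{j2} & \cdots & x_{jk} & \cdots & x_{jp} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{pmatrix}
\]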
Example 1.1:
• Putting it all together, envision standing outside the Kansas Union Bookstore, asking people for their receipts. You are interested in two variables:
Variable 1: the total amount of the purchase.
Variable 2: the number of books purchased.
You find four people, and here is what you observe:

             Person 1   Person 2   Person 3   Person 4
Variable 1   42         52         48         58
Variable 2   4          5          4          3

\[
\mathbf{X} =
\begin{pmatrix}
42 & 4 \\
52 & 5 \\
48 & 4 \\
58 & 3
\end{pmatrix}
\]
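As a minimal sketch in plain Python (not part of the original slides), the data array of Example 1.1 can be stored as a list of rows, with one row per person and one column per variable. The names `X`, `n`, `p`, and `entry` are chosen here purely for illustration. Python indexes from 0, so the entry $x_{jk}$ corresponds to `X[j - 1][k - 1]`.

```python
# Data array for Example 1.1: rows are items (people), columns are variables.
X = [
    [42, 4],  # Person 1: purchase total, number of books
    [52, 5],  # Person 2
    [48, 4],  # Person 3
    [58, 3],  # Person 4
]

n = len(X)     # number of items (rows)
p = len(X[0])  # number of variables (columns)


def entry(X, j, k):
    """Return x_jk using the 1-based subscripts of the text."""
    return X[j - 1][k - 1]


x_32 = entry(X, 3, 2)  # number of books bought by Person 3
print(n, p, x_32)      # prints: 4 2 4
```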
Notice that for any entry $x_{jk}$:
• The first subscript ($j$) represents the row location in the data array.
• The second subscript ($k$) represents the column location in the data array.

1.5 Descriptive Statistics Review


• When we have a large amount of data, it is often hard to get a manageable description of the nature of the variables under study. For this reason (and as a way of reviewing topics from previous courses), descriptive statistics are used.
Such descriptive statistics include:
• Means.
• Variances.
• Covariances.
• Correlations.
1.5.1 Population / Sample Mean Vector
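The population mean is a measure of central tendency for the population. Here, the population mean for variable $k$ is

\[
\mu_k = E(x_k)
\]

where $E$ denotes expectation (the average over the population). For the kth variable, the sample mean is:

\[
\bar{x}_k = \frac{1}{n} \sum_{j=1}^{n} x_{jk}
\]

An array of the means for all $p$ variables then looks like this (which we will come to know as the mean vector):

\[
\bar{\mathbf{x}} =
\begin{pmatrix}
\bar{x}_1 \\
\bar{x}_2 \\
\vdots \\
\bar{x}_p
\end{pmatrix}
\]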
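The sample mean vector can be sketched in a few lines of plain Python. This is an illustration only: the helper name `mean_vector` is hypothetical, and the data are the four receipts from Example 1.1.

```python
# Sample mean vector: one mean per column (variable) of the data array.
X = [[42, 4], [52, 5], [48, 4], [58, 3]]  # Example 1.1 data


def mean_vector(X):
    """Return [x-bar_1, ..., x-bar_p], where x-bar_k = (1/n) * sum_j x_jk."""
    n = len(X)
    # zip(*X) transposes the rows, so each tuple is one variable's column.
    return [sum(column) / n for column in zip(*X)]


x_bar = mean_vector(X)
print(x_bar)  # prints: [50.0, 4.0]
```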
1.5.2 Population / Sample Variance Covariance
A variance measures the degree of spread (dispersion) in a variable’s values. Theoretically, a population variance is the average squared difference between a variable’s values and the mean for that variable. The population variance for variable $k$ is

\[
\sigma_{kk} = E\left[ (x_k - \mu_k)^2 \right]
\]

For the kth variable, the sample variance is:

\[
s_{kk} = \frac{1}{n} \sum_{j=1}^{n} (x_{jk} - \bar{x}_k)^2
\]

• Note the “$kk$” subscript; this will be important because the equation that produces the variance for a single variable is a special case of the equation for the covariance of a pair of variables.
• Also note the division by $n$. Reasons for this will become apparent in the near future.
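To make the division by $n$ concrete, here is a small sketch that computes $s_{kk}$ directly from the formula and cross-checks it against Python's standard `statistics` module. The function name `sample_variance` is an assumption made for this illustration; `statistics.pvariance` divides by $n$, while `statistics.variance` divides by $n - 1$.

```python
import statistics

X = [[42, 4], [52, 5], [48, 4], [58, 3]]  # Example 1.1 data


def sample_variance(X, k):
    """s_kk = (1/n) * sum_j (x_jk - x-bar_k)^2, with k a 1-based column index."""
    column = [row[k - 1] for row in X]
    n = len(column)
    mean = sum(column) / n
    return sum((x - mean) ** 2 for x in column) / n


s_11 = sample_variance(X, 1)  # variance of the purchase totals
s_22 = sample_variance(X, 2)  # variance of the number of books

# Cross-check on the purchase totals with the standard library.
purchases = [row[0] for row in X]
divide_by_n = statistics.pvariance(purchases)         # same divisor as s_kk
divide_by_n_minus_1 = statistics.variance(purchases)  # larger, divisor n - 1

print(s_11, s_22)  # prints: 34.0 0.5
```

With only four observations the two divisors give visibly different answers (34 versus 136/3), which is why it matters to know which convention a given formula or software routine uses.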
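The population covariance is a measure of the association between pairs of variables in a population. Here, the population covariance between variables $i$ and $k$ is

\[
\sigma_{ik} = E\left[ (x_i - \mu_i)(x_k - \mu_k) \right]
\]

For a pair of variables, $i$ and $k$, the sample covariance is:

\[
s_{ik} = \frac{1}{n} \sum_{j=1}^{n} (x_{ji} - \bar{x}_i)(x_{jk} - \bar{x}_k)
\]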
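The sample covariance formula translates almost term by term into code. The sketch below uses a hypothetical helper, `sample_covariance`, with 1-based variable indices to match the text; setting $i = k$ recovers the sample variance $s_{kk}$.

```python
X = [[42, 4], [52, 5], [48, 4], [58, 3]]  # Example 1.1 data


def sample_covariance(X, i, k):
    """s_ik = (1/n) * sum_j (x_ji - x-bar_i)(x_jk - x-bar_k); i, k are 1-based."""
    n = len(X)
    col_i = [row[i - 1] for row in X]
    col_k = [row[k - 1] for row in X]
    mean_i = sum(col_i) / n
    mean_k = sum(col_k) / n
    return sum((a - mean_i) * (b - mean_k) for a, b in zip(col_i, col_k)) / n


s_12 = sample_covariance(X, 1, 2)  # purchase total vs. number of books
s_21 = sample_covariance(X, 2, 1)  # same value: the covariance is symmetric
s_11 = sample_covariance(X, 1, 1)  # i == k gives the variance of variable 1

print(s_12, s_11)  # prints: -1.5 34.0
```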
1.5.3 Population / Sample Covariance Matrix
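• Making an array of all sample covariances gives us:

\[
\mathbf{S}_n =
\begin{pmatrix}
s_{11} & s_{12} & \cdots & s_{1p} \\
s_{21} & s_{22} & \cdots & s_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
s_{p1} & s_{p2} & \cdots & s_{pp}
\end{pmatrix}
\]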
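One way to assemble the full array is to center each column once and then average the cross-products for every pair of variables. The sketch below is illustrative only; the name `covariance_matrix` is an assumption, and the divisor is $n$, matching the sample covariance formula with division by $n$.

```python
X = [[42, 4], [52, 5], [48, 4], [58, 3]]  # Example 1.1 data


def covariance_matrix(X):
    """Return the p x p matrix whose (i, k) entry is s_ik (divisor n)."""
    n = len(X)
    p = len(X[0])
    means = [sum(column) / n for column in zip(*X)]
    # Subtract each variable's mean from its column.
    centered = [[row[k] - means[k] for k in range(p)] for row in X]
    # Average the cross-products of centered columns i and k.
    return [
        [sum(row[i] * row[k] for row in centered) / n for k in range(p)]
        for i in range(p)
    ]


S = covariance_matrix(X)
print(S)  # prints: [[34.0, -1.5], [-1.5, 0.5]]
```

The diagonal holds the variances and the matrix is symmetric. If NumPy is available, `numpy.cov(X, rowvar=False, bias=True)` should give the same matrix, since `bias=True` selects division by $n$ rather than $n - 1$.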
1.5.4 Sample Correlation
• Sample covariances depend on the scale of the variables under study.
• For this reason, the correlation is often used to describe the association between two variables.
• For a pair of variables, $i$ and $k$, the sample correlation is found by dividing the sample covariance by the product of the standard deviations of the two variables:

\[
r_{ik} = \frac{s_{ik}}{\sqrt{s_{ii}} \, \sqrt{s_{kk}}}
\]
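The sample correlation:

• Ranges from -1 to 1.
• Measures linear association.
• Is invariant under linear transformations of $x_i$ and $x_k$ (provided the two scale factors have the same sign; if the signs differ, only the sign of the correlation changes).
• Is a biased estimator of the population correlation.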
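The scale dependence of the covariance and the invariance of the correlation can be checked numerically. In this sketch the function `correlation` and the particular rescalings (multiplying by 100 and adding 5, doubling, negating) are arbitrary choices made for illustration.

```python
import math


def correlation(x, y):
    """r = s_xy / (sqrt(s_xx) * sqrt(s_yy)), with every s using divisor n."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    s_xx = sum((a - mean_x) ** 2 for a in x) / n
    s_yy = sum((b - mean_y) ** 2 for b in y) / n
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))


purchase = [42, 52, 48, 58]  # Variable 1 from Example 1.1
books = [4, 5, 4, 3]         # Variable 2 from Example 1.1

r = correlation(purchase, books)

# Linear transformations with positive scale factors leave r unchanged.
r_rescaled = correlation([100 * a + 5 for a in purchase], [2 * b for b in books])

# Negating one variable flips the sign of r but keeps its magnitude.
r_flipped = correlation([-a for a in purchase], books)

print(round(r, 4), round(r_rescaled, 4), round(r_flipped, 4))
# prints: -0.3638 -0.3638 0.3638
```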
1.5.5 Sample Correlation Matrix
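• Making an array of all sample correlations gives us:

\[
\mathbf{R} =
\begin{pmatrix}
1      & r_{12} & \cdots & r_{1p} \\
r_{21} & 1      & \cdots & r_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{pmatrix}
\]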
Example 1.2: Find the mean vector, the variance-covariance matrix, and the correlation matrix for the data in Example 1.1.
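A hand solution to this example can be checked with a short script. The sketch below is one possible check rather than a prescribed method: it computes the mean vector, the covariance matrix (divisor $n$), and then obtains each correlation by dividing a covariance by the product of the two standard deviations. All variable names are illustrative.

```python
import math

X = [[42, 4], [52, 5], [48, 4], [58, 3]]  # Example 1.1 data
n, p = len(X), len(X[0])

# Mean vector: average of each column.
x_bar = [sum(column) / n for column in zip(*X)]

# Covariance matrix: averaged cross-products of the centered columns.
centered = [[row[k] - x_bar[k] for k in range(p)] for row in X]
S = [
    [sum(row[i] * row[k] for row in centered) / n for k in range(p)]
    for i in range(p)
]

# Correlation matrix: r_ik = s_ik / (sqrt(s_ii) * sqrt(s_kk)).
R = [
    [S[i][k] / (math.sqrt(S[i][i]) * math.sqrt(S[k][k])) for k in range(p)]
    for i in range(p)
]

print("mean vector:", x_bar)
print("covariance matrix:", S)
print("correlation matrix:", [[round(r, 4) for r in row] for row in R])
```

For these data the script reports a mean vector of (50, 4), variances of 34 and 0.5, a covariance of -1.5, and a correlation of about -0.36 between purchase total and number of books.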

