Wolfgang HÄRDLE
Leopold SIMAR
Comparison of Batches
Example: (cont.)
The dataset consists of 200 measurements on Swiss bank notes.
The first half of these bank notes are genuine, the other half are
forged bank notes.
It is important to be able to decide whether a given banknote is
genuine.
We want to derive a good rule that separates the genuine and
counterfeit banknotes.
Which measurement is the most informative? We have to visualize
the difference.
Boxplots
A boxplot
is a graphical technique for displaying the distribution of variables,
helps us to see location, skewness, spread, tail length and outlying points,
is particularly useful for comparing different batches, and
is a graphical representation of the Five Number Summary.
Upper quartile F_U
Lower quartile F_L
Median = deepest point
Extremes
Consider the order statistics.
Depth of a data value x_(i): min{i, n − i + 1}
depth of fourth = ([depth of median] + 1) / 2
Median
Order statistics {x_(1), x_(2), ..., x_(n)} is the set of the ordered values
x_1, x_2, ..., x_n,
where x_(1) denotes the minimum and x_(n) the maximum.
Median M:
M = x_((n+1)/2)                      if n is odd,
M = {x_(n/2) + x_(n/2+1)} / 2        if n is even.
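As a small sketch (not from the slides), the median can be computed directly from the order statistics; the function name is ours:

```python
import numpy as np

def median_from_order_stats(x):
    """Median via the order statistics x_(1) <= ... <= x_(n)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    if n % 2 == 1:                      # n odd: M = x_((n+1)/2)
        return x[(n + 1) // 2 - 1]      # shift for 0-based indexing
    # n even: M = {x_(n/2) + x_(n/2+1)} / 2
    return 0.5 * (x[n // 2 - 1] + x[n // 2])

print(median_from_order_stats([3, 1, 2]))      # 2.0
print(median_from_order_stats([4, 1, 3, 2]))   # 2.5
```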
[Figure: Boxplot of the World Cities data (variable Values).]
[Figure: Boxplots of US, JAPAN and EU cars.]
[Figure: Boxplots of the diagonal of GENUINE and COUNTERFEIT bank notes.]
[Figure: Boxplots of a second bank note variable for GENUINE and COUNTERFEIT notes.]
Summary: Boxplots
Histograms
f̂_h(x) = n⁻¹ h⁻¹ Σ_{j∈Z} Σ_{i=1}^n I{x_i ∈ B_j(x_0, h)} I{x ∈ B_j(x_0, h)}
[Figure: Histograms of the diagonal of the bank notes for bandwidths h = 0.1, 0.2, 0.3 and 0.4.]
Summary: Histograms
Kernel densities
f̂_h(x) = n⁻¹ h⁻¹ Σ_{i=1}^n K{(x − x_i)/h}
K is the kernel.
Kernel functions
[Figure: Kernel functions K(·): Uniform, Triangle, Epanechnikov and two further kernels.]
[Figure: Kernel density estimates of the diagonals of counterfeit and genuine bank notes.]
Quartic kernel
K(u) = (15/16)(1 − u²)² I(|u| ≤ 1)
h_Q = 2.62 h_G
Sample standard deviation: σ̂ = {n⁻¹ Σ_{i=1}^n (x_i − x̄)²}^{1/2}
Applied Multivariate Statistical Analysis
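The kernel density estimator with the quartic kernel can be sketched in a few lines. The Gaussian rule-of-thumb bandwidth h_G = 1.06 σ̂ n^{−1/5} used below is an assumption (the slides only give the conversion h_Q = 2.62 h_G):

```python
import numpy as np

def quartic_kernel(u):
    """K(u) = (15/16)(1 - u^2)^2 for |u| <= 1, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def kde(x, data, h):
    """f_h(x) = n^{-1} h^{-1} sum_i K{(x - x_i)/h}."""
    data = np.asarray(data, dtype=float)
    return quartic_kernel((x - data) / h).sum() / (len(data) * h)

rng = np.random.default_rng(0)
sample = rng.normal(size=500)
sigma_hat = sample.std()                        # {n^{-1} sum (x_i - xbar)^2}^{1/2}
h_G = 1.06 * sigma_hat * len(sample) ** (-0.2)  # rule-of-thumb bandwidth (assumption)
h_Q = 2.62 * h_G                                # quartic bandwidth via the slide's rule

grid = np.linspace(-5, 5, 1001)
density = np.array([kde(x, sample, h_Q) for x in grid])
mass = float(np.sum(density) * (grid[1] - grid[0]))
print(round(mass, 2))   # 1.0 -- the estimate integrates to one
```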
[Figure: Contour plot of a two-dimensional density estimate of two bank note variables.]
Scatterplots
Rotation of data
Separation lines
Draftman’s plot
Brushing
Parallel coordinate plots
[Figure: Scatterplot of two bank note variables.]
[Figure: 3D scatterplot of the bank notes: lower inner frame (X4), upper inner frame (X5) and diagonal (X6).]
[Figure: Draftman’s plot (pairwise scatterplots) of the bank note variables.]
Summary: Scatterplots
Chernoff-Flury Faces
X1 = 1, 19 (eye sizes)
X2 = 2, 20 (pupil sizes)
X3 = 4, 22 (eye slants)
X4 = 11, 29 (upper hair lines)
X5 = 12, 30 (lower hair lines)
X6 = 13, 14, 31, 32 (face lines and darkness of hair)
[Figure: Flury faces for observations 1 to 50 of the bank notes. MVAfacebank50]
Observations 51 to 100
[Figure: Flury faces for observations 51 to 100 of the bank notes. MVAfacebank50]
Observations 101 to 150
[Figure: Flury faces for observations 101 to 150 of the bank notes. MVAfacebank50]
Observations 151 to 200
[Figure: Flury faces for observations 151 to 200 of the bank notes. MVAfacebank50]
Summary: Faces
Andrews’ Curves
Each multivariate observation X_i = (X_{i,1}, ..., X_{i,p}) ∈ R^p is
transformed into a curve as follows:
p odd:
f_i(t) = X_{i,1}/√2 + X_{i,2} sin(t) + X_{i,3} cos(t) + ...
         + X_{i,p−1} sin{(p − 1)t/2} + X_{i,p} cos{(p − 1)t/2}
p even:
f_i(t) = X_{i,1}/√2 + X_{i,2} sin(t) + X_{i,3} cos(t) + ... + X_{i,p} sin(pt/2)
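A minimal sketch of the Andrews transformation; the frequency pattern sin t, cos t, sin 2t, cos 2t, ... follows the formulas above (the function name is ours):

```python
import numpy as np

def andrews_curve(x, t):
    """f_i(t) for one observation x = (x_1, ..., x_p), t an array."""
    x = np.asarray(x, dtype=float)
    f = np.full_like(np.asarray(t, dtype=float), x[0] / np.sqrt(2))
    for j, xj in enumerate(x[1:], start=2):
        k = j // 2                 # frequency 1, 1, 2, 2, 3, 3, ...
        f += xj * (np.sin(k * t) if j % 2 == 0 else np.cos(k * t))
    return f

# At t = 0: x_1/sqrt(2) + x_3 * cos(0) for p = 3
print(round(float(andrews_curve([1.0, 2.0, 3.0], np.array([0.0]))[0]), 4))  # 3.7071
```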
Andrews’ Curves
Let us take the 96th observation of the Swiss bank note dataset.
[Figure: Andrews’ curves of bank note observations.]
Idea
Instead of plotting observations in an orthogonal coordinate system
one draws their coordinates in a system of parallel axes. This way
of representation is however sensitive to the order of the variables.
[Figure: Parallel coordinate plots of the bank note variables V1–V6.]
The full bank dataset. Genuine bank notes displayed as black lines.
The forged bank notes are shown as red lines. MVAparcoo2
A(n × p) = ( a_11 ... a_1p ; ... ; a_n1 ... a_np )
Definition Notation
Transpose A⊤
Sum A+B
Difference A−B
Scalar product c ·A
Product A·B
Rank rank(A)
Trace tr(A)
Determinant det(A) = |A|
Inverse A−1
Generalised Inverse A− : AA− A = A
scalar            p = n = 1      a = 3
column vector     p = 1          a = (1, 3)⊤
row vector        n = 1          a⊤ = (1, 3)
vector of ones    1_n = (1, ..., 1)⊤  (n entries)
vector of zeros   0_n = (0, ..., 0)⊤  (n entries)
square matrix     n = p          A(p × p), e.g. ( 2 0 ; 0 2 )
Spectral Decomposition
A = Γ Λ Γ⊤ = Σ_{j=1}^p λ_j γ_j γ_j⊤
Λ = diag(λ_1, ..., λ_p)
Γ = (γ_1, ..., γ_p)
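A quick numerical check of the spectral decomposition using NumPy's `eigh` for symmetric matrices (an illustrative sketch):

```python
import numpy as np

# Spectral decomposition of a symmetric matrix: A = Gamma Lambda Gamma^T
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])          # Sigma with rho = 0.5
lam, gamma = np.linalg.eigh(A)      # eigenvalues ascending, orthonormal columns
A_rebuilt = gamma @ np.diag(lam) @ gamma.T
print(np.round(lam, 10))            # 0.5 and 1.5, i.e. 1 - rho and 1 + rho
print(np.allclose(A, A_rebuilt))    # True
```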
Covariance matrix
Σ = ( 1 ρ ; ρ 1 )
Eigenvalues: solve
det( 1 − λ  ρ ; ρ  1 − λ ) = 0
λ_1 = 1 + ρ, λ_2 = 1 − ρ, Λ = diag(1 + ρ, 1 − ρ)
Eigenvectors:
( 1 ρ ; ρ 1 ) (x_1 ; x_2) = (1 + ρ) (x_1 ; x_2)    MVAspecdecomp
x_1 + ρ x_2 = x_1 + ρ x_1
ρ x_1 + x_2 = x_2 + ρ x_2
⇒ x_1 = x_2, hence
γ_1 = (1/√2 ; 1/√2),  γ_2 = (1/√2 ; −1/√2),
Γ = (γ_1, γ_2) = ( 1/√2  1/√2 ; 1/√2  −1/√2 )
Check: A = Γ Λ Γ⊤
Check: A = ΓΛΓ>
Eigenvectors
[Figure: Original data (x1, x2) and data rotated into the eigenvector coordinates (y1, y2).]
Singular Value Decomposition: A = Γ Λ ∆⊤
Quadratic Forms
Definiteness
Example:
Q(x) = x⊤Ax = x_1² + x_2²,  A = ( 1 0 ; 0 1 )
  Eigenvalues λ_1 = λ_2 = 1: positive definite
Q(x) = (x_1 − x_2)²,  A = ( 1 −1 ; −1 1 )
  Eigenvalues λ_1 = 2, λ_2 = 0: positive semidefinite
Q(x) = x_1² − x_2²
  Eigenvalues λ_1 = 1, λ_2 = −1: indefinite.
Theorem
If A is symmetric and Q(x) = x > Ax is the corresponding quadratic
form, then there exists a transformation x 7→ Γ> x = y such that
x⊤Ax = Σ_{i=1}^p λ_i y_i²,
A > 0 ⇔ λ_i > 0,  A ≥ 0 ⇔ λ_i ≥ 0,  i = 1, ..., p.
max_x (x⊤Ax)/(x⊤Bx) = λ_1 ≥ λ_2 ≥ ··· ≥ λ_p = min_x (x⊤Ax)/(x⊤Bx)
Derivatives
For f: R^p → R and a (p × 1) vector x:
∂f(x)/∂x    column vector of the partial derivatives ∂f(x)/∂x_j, j = 1, ..., p
∂f(x)/∂x⊤   row vector of the same derivatives
∂f(x)/∂x is called the gradient of f.
Example:
f: R^p → R, f(x) = a⊤x
∂a⊤x/∂x = ∂x⊤a/∂x = a
∂x⊤Ax/∂x = 2Ax
∂²x⊤Ax/(∂x ∂x⊤) = 2A
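The rule ∂x⊤Ax/∂x = 2Ax (for symmetric A) can be verified against central finite differences; this is an illustrative sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)); A = (A + A.T) / 2    # symmetric A
x = rng.normal(size=3)

grad_analytic = 2 * A @ x                          # d(x^T A x)/dx = 2 A x
eps = 1e-6
grad_numeric = np.array([
    ((x + eps * e) @ A @ (x + eps * e) - (x - eps * e) @ A @ (x - eps * e)) / (2 * eps)
    for e in np.eye(3)                             # central difference per coordinate
])
print(np.allclose(grad_analytic, grad_numeric, atol=1e-4))   # True
```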
Summary: Derivatives
The column vector ∂f(x)/∂x is called the gradient.
The gradient ∂a⊤x/∂x = ∂x⊤a/∂x equals a.
The derivative of the quadratic form, ∂x⊤Ax/∂x, equals 2Ax.
The Hessian of f: R^p → R is the (p × p) matrix of the second
derivatives ∂²f(x)/(∂x_i ∂x_j).
The Hessian of the quadratic form x⊤Ax equals 2A.
Partitioned Matrices
A(n × p), B(n × p),  A = ( A_11 A_12 ; A_21 A_22 )
A_ij (n_i × p_j), n_1 + n_2 = n and p_1 + p_2 = p
A + B = ( A_11 + B_11   A_12 + B_12 ; A_21 + B_21   A_22 + B_22 )
B⊤ = ( B_11⊤ B_21⊤ ; B_12⊤ B_22⊤ )
AB⊤ = ( A_11 B_11⊤ + A_12 B_12⊤   A_11 B_21⊤ + A_12 B_22⊤ ;
        A_21 B_11⊤ + A_22 B_12⊤   A_21 B_21⊤ + A_22 B_22⊤ )
For B = ( 1 b⊤ ; a A ) and for non-singular A we have
|B| = |A − ab⊤| = |A| |1 − b⊤A⁻¹a|,
(A − ab⊤)⁻¹ = A⁻¹ + (A⁻¹ab⊤A⁻¹) / (1 − b⊤A⁻¹a)
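The determinant identity |B| = |A − ab⊤| = |A||1 − b⊤A⁻¹a| can be checked numerically on random matrices (a sketch, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3)) + 3 * np.eye(3)   # well-conditioned, non-singular A
a = rng.normal(size=(3, 1))
b = rng.normal(size=(3, 1))

B = np.block([[np.ones((1, 1)), b.T],
              [a,               A ]])          # partitioned (4 x 4) matrix
lhs = np.linalg.det(B)
rhs = np.linalg.det(A) * (1 - (b.T @ np.linalg.inv(A) @ a).item())
print(np.isclose(lhs, rhs))                           # True
print(np.isclose(lhs, np.linalg.det(A - a @ b.T)))    # True
```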
Geometrical Aspects
For A = I_p we obtain the Euclidean distance:
E_d = {x ∈ R^p | (x − x_0)⊤(x − x_0) = d²}
Example: x ∈ R², x_0 = 0, x_1² + x_2² = 1
Norm of a vector w.r.t. the metric I_p:
||x||_{I_p} = d(0, x) = √(x⊤x)
Norm of a vector w.r.t. a metric A:
||x||_A = √(x⊤Ax)
Unit vectors: {x : ||x|| = 1}
Angle between two vectors (scalar product):
cos θ = x⊤y / (||x|| ||y||)
Example: Angle = Correlation
Observations {x_i}_{i=1}^n, {y_i}_{i=1}^n with x̄ = ȳ = 0:
r_XY = Σ x_i y_i / √(Σ x_i² Σ y_i²) = cos θ
Column space
X(n × p) data matrix
C(X) = {x ∈ R^n | ∃ a ∈ R^p so that X a = x}
Projection matrix
P(n × n), P = P⊤ = P² (P is idempotent)
Let b ∈ R^n; then a = P b is the projection of b on C(P).
Projection on C(X): P = X(X⊤X)⁻¹X⊤, Q = I_n − P, Q² = Q,
P X = X, Q X = 0.
Projection of x on y: p_x = y(y⊤y)⁻¹y⊤x = (y⊤x / ||y||²) y
A Short Excursion into Matrix Algebra 2-44
Covariance
Covariance is a measure of (linear) dependency between variables.
σXY = Cov(X , Y ) = E(XY ) − (E X )(E Y )
Covariance of X with itself:
σXX = Var(X ) = Cov(X , X )
Covariance matrix for p-dimensional X:
Σ = ( σ_{X1X1} ... σ_{X1Xp} ; ... ; σ_{XpX1} ... σ_{XpXp} )
Empirical versions:
s_XY = n⁻¹ Σ_{i=1}^n (x_i − x̄)(y_i − ȳ)
s_XX = n⁻¹ Σ_{i=1}^n (x_i − x̄)²
[Figure: Scatterplot of a bivariate sample.]
[Figure: Scatterplot of Sales (X1) vs. Price (X2).]
Summary: Covariance
Correlation
ρ_XY = Cov(X, Y) / √{Var(X) Var(Y)}
The empirical version of ρ_XY:
r_XY = s_XY / √(s_XX s_YY)
Correlation matrix:
P = ( ρ_{X1X1} ... ρ_{X1Xp} ; ... ; ρ_{XpX1} ... ρ_{XpXp} )
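A small sketch computing the empirical covariance and correlation matrices, using the relation R = D^{−1/2} S D^{−1/2}:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
X[:, 1] += X[:, 0]                      # induce correlation between columns 0 and 1

Xc = X - X.mean(axis=0)                 # centered data
S = Xc.T @ Xc / len(X)                  # empirical covariance (factor 1/n)
D_inv_sqrt = np.diag(1 / np.sqrt(np.diag(S)))
R = D_inv_sqrt @ S @ D_inv_sqrt         # R = D^{-1/2} S D^{-1/2}
print(np.allclose(np.diag(R), 1.0))                    # True
print(np.allclose(R, np.corrcoef(X, rowvar=False)))    # True
```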
Test of Correlation
H_0: ρ = 0 vs. H_1: ρ ≠ 0
Z = {W − E(W)} / √Var(W) → N(0, 1) in distribution
w = (1/2) log{(1 + r_{X2X8}) / (1 − r_{X2X8})} = −1.166,
z = (−1.166 − 0) / √(1/71) = −9.825
H_0: ρ = −0.75:
z = {−1.166 − (−0.973)} / √(1/71) = −1.627.
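The two z-values can be reproduced with Fisher's Z-transformation; n = 74 is inferred from the slide's factor √(1/71) = √{1/(n − 3)} and should be treated as an assumption:

```python
import numpy as np

def fisher_z_stat(r, n, rho0=0.0):
    """z = (w - w0) / sqrt(1/(n-3)) with w = artanh(r), w0 = artanh(rho0)."""
    w, w0 = np.arctanh(r), np.arctanh(rho0)
    return (w - w0) / np.sqrt(1.0 / (n - 3))

# The slide reports w = -1.166 for the car data, so recover r from it
r = np.tanh(-1.166)
print(round(fisher_z_stat(r, 74), 3))               # -9.825
print(round(fisher_z_stat(r, 74, rho0=-0.75), 3))   # -1.627
```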
Mileage (X2 ) vs. weight (X8 ) of U.S. (star), European (plus) and
Japanese (circle) cars. MVAscacar
Moving to Higher Dimensions 3-19
Summary: Correlation
Summary Statistics
X (n × p) data matrix
X = ( x_11 ... x_1p ; ... ; x_n1 ... x_np )
Mean
x̄ = (x̄_1, ..., x̄_p)⊤ = n⁻¹ X⊤ 1_n
Empirical covariance matrix
S = n⁻¹ X⊤ H X
Centering matrix
H = I_n − n⁻¹ 1_n 1_n⊤
Empirical correlation matrix
R = D^{−1/2} S D^{−1/2}
with D = diag(s_{XjXj}) and D^{−1/2} = diag(s_{XjXj}^{−1/2}) for j = 1, ..., p.
Linear Transformations
A (q × p) matrix
Example:
Let x = (1, 2)> and y = 4x, x ∈ R2
Then y = 4x = (4, 8)> .
Mahalanobis Transformation
Z = (z_1, ..., z_n)⊤
z_i = S^{−1/2}(x_i − x̄), i = 1, ..., n
S_Z = n⁻¹ Z⊤ H Z = I_p
z̄ = 0
One-sample t-test
H_0: μ = μ_0 vs. H_1: μ ≠ μ_0
Assume that σ² is known:
√n (x̄_n − μ_0)/σ ∼ N(0, 1)
Show that P(reject H_0 | H_0 is true) = α.
Test:
H_0: E(X) = μ_0 vs. H_1: E(X) ≠ μ_0
We reject H_0 if
√n |x̄_n − μ_0| / σ̂_n > t_{1−α/2; n−1}.
t_{1−α/2; n−1}: 1 − α critical value (i.e. 1 − α/2 quantile) of the
Student's t-distribution with (n − 1) degrees of freedom.
Example:
C̄_n = 222.11, σ̂_n = 123.22, n = 128
√n (C̄_n − 200)/σ̂_n = 2.0301 > t_{0.975; n−1} = 1.9788
We reject the hypothesis that average costs are equal to 200.
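The test statistic of the example can be reproduced directly (reading the slide's scale estimate as σ̂_n = 123.22, which matches the reported 2.0301):

```python
import numpy as np

# One-sample t-test for the slide's cost example
n, xbar, mu0, sigma_hat = 128, 222.11, 200.0, 123.22
t_stat = np.sqrt(n) * (xbar - mu0) / sigma_hat
print(round(t_stat, 4))   # 2.0301 > t_{0.975;127} = 1.9788, so reject H0
```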
Two-sample t-test
Test statistic
T = √{mn/(m + n)} · {(ȳ_1 − ȳ_2) − (μ_1 − μ_2)} / σ̂_P ∼ t_{n+m−2}
[Figure: Scatterplot of Sales (X1) vs. Price (X2).]
Regression of upper inner frame (X5 ) on lower inner frame (X4 ) for
genuine bank notes. MVAregbank
Total variation
[Figure: Scatterplot of Sales (X1) vs. Price (X2) with the fitted regression line.]
Coefficient of determination
r² = Σ_{i=1}^n (ŷ_i − ȳ)² / Σ_{i=1}^n (y_i − ȳ)² = SSTR/SSTO
t-Test for β1
H_0: β_1 = 0 (ρ_XY = 0) vs. H_1: β_1 ≠ 0
Var(β̂_1) = σ̂²/(n · s_XX),  SE(β̂_1) = σ̂/(n · s_XX)^{1/2},  t = β̂_1/SE(β̂_1)
t_{1−α/2; n−2}: 1 − α critical value (i.e. 1 − α/2 quantile) of the
Student's t-distribution with (n − 2) degrees of freedom.
Example: distance of the inner frame to the lower and to the upper
border, i.e. X4 vs. X5. Why is a negative slope to be expected?
β̂_0 = 14.666 and β̂_1 = s_XY/s_XX = −0.26347/0.41321 = −0.626.
The t-test for the hypothesis β_1 = 0 is t = β̂_1/SE(β̂_1), where
SE(β̂_1) = σ̂/(n · s_XX)^{1/2}.
The t-test rejects the null hypothesis β_1 = 0 at the level of
significance α if |t| ≥ t_{1−α/2; n−2}, where t_{1−α/2; n−2} is the
1 − α/2 quantile of the Student's t-distribution with (n − 2)
degrees of freedom.
The standard error SE(β̂_1) increases/decreases with less/more
spread in the X variables.
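A sketch of the slope estimate and its t-statistic on simulated data with a negative slope (variable and function names are ours):

```python
import numpy as np

def simple_ols_t(x, y):
    """Intercept, slope and t-statistic for H0: beta1 = 0."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sxx = ((x - x.mean()) ** 2).mean()                 # s_XX with factor 1/n
    sxy = ((x - x.mean()) * (y - y.mean())).mean()     # s_XY with factor 1/n
    b1 = sxy / sxx
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x
    sigma2 = (resid ** 2).sum() / (n - 2)              # residual variance estimate
    se_b1 = np.sqrt(sigma2 / (n * sxx))                # SE = sigma_hat / (n s_XX)^{1/2}
    return b0, b1, b1 / se_b1

rng = np.random.default_rng(4)
x = rng.normal(size=100)
y = -0.6 * x + rng.normal(scale=0.5, size=100)
b0, b1, t_stat = simple_ols_t(x, y)
print(b1 < 0 and abs(t_stat) > 2)   # True: clearly significant negative slope
```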
Assumptions
Note
I Each factor has a mean value µl
I Observation ykl equals the sum of µl and a zero mean random
error εkl
I Linear regression model: m = 1, p = n and µi = α + βxi ,
where xi is the i-th level value of the factor
Test
Variation under H_1:
SS(full) = Σ_{l=1}^p Σ_{k=1}^m (y_kl − ȳ_l)²,  ȳ_l = m⁻¹ Σ_{k=1}^m y_kl
F-test
F = [{SS(reduced) − SS(full)}/{df(r) − df(f)}] / [SS(full)/df(f)]
Degrees of freedom
I Number of observations minus the number of parameters
I Full model df (f ) = n − p
I Reduced model df (r ) = n − 1
ANOVA Table
            SS             df     MS                      F-stat                      p-value
explained   SS(explained)  p − 1  SS(explained)/(p − 1)   {SS(explained)/(p−1)}/MSE   p-value
full        SS(full)       n − p  SS(full)/(n − p) = MSE
reduced     SS(reduced)    n − 1
F ∼ F_{p−1, n−p}
Test: reject H_0 if F > F_{1−α; p−1, n−p}, or if p-value < α
Reduced model H_0: μ_l = μ, l = 1, 2, 3
Full model H_1: the μ_l differ
df(r) = n − #parameters(r) = 30 − 1 = 29
df(f) = n − #parameters(f) = 30 − 3 = 27
SS(reduced) = 260.3, SS(full) = 157.7
F = {(260.3 − 157.7)/(29 − 27)} / (157.7/27) = 8.78 > F_{2;27}(0.95) = 3.35
            SS     df    MS
explained   102.6   2   51.3
full        157.7  27    5.84
reduced     260.3  29
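The F-value follows directly from the sums of squares and degrees of freedom given above:

```python
# ANOVA F statistic for the slide's example (n = 30, p = 3 groups)
ss_reduced, ss_full = 260.3, 157.7
df_r, df_f = 29, 27
F = ((ss_reduced - ss_full) / (df_r - df_f)) / (ss_full / df_f)
print(round(F, 2))   # 8.78 > F_{0.95;2,27} = 3.35, so the group means differ
```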
Reduced model: y_i = β_0 + 0 · x_i + ε_i
SS(reduced) = Σ_{i=1}^n (y_i − ȳ)²
SS(full) = Σ_{i=1}^n (y_i − ŷ_i)² = RSS
F = [{SS(reduced) − SS(full)}/1] / [SS(full)/(n − 2)]
Explained Variation
Σ_{i=1}^n (ŷ_i − ȳ)² = Σ_{i=1}^n (β̂_0 + β̂_1 x_i − ȳ)²
                     = β̂_1² Σ_{i=1}^n (x_i − x̄)²
                     = β̂_1² n · s_XX
F = (β̂_1² n · s_XX) / {RSS/(n − 2)} = {β̂_1 / SE(β̂_1)}²
Summary: ANOVA
y_i = β_0 + β_1 x_{i1} + ... + β_p x_{ip} + ε_i,  i = 1, ..., n
can be written as
y = X*β* + ε
where
X* = (1_n  X)
β̂* = (β̂_0, β̂⊤)⊤ = (X*⊤X*)⁻¹ X*⊤ y
Remark:
The coefficient of determination is influenced by the number of
regressors: for a given sample size n, the r² value will increase by
adding more regressors into the linear model.
A corrected coefficient of determination for p regressors and
a constant intercept:
r²_adj = r² − p(1 − r²)/{n − (p + 1)}
Example:
r²_adj = 0.907 − 3(1 − 0.907²)/(10 − 3 − 1) = 0.818.
m = 10, p = 3, n = mp = 30; X(n × p)
β = (μ_1, μ_2, μ_3)⊤ parameter vector
y = Xβ + ε linear model
β̂_{H0} = ȳ
df(r) = n − 1
SS(reduced) = Σ_{i=1}^n (y_i − ŷ_i)² = ||y − X β̂_{H0}||²
SS(full) = ||y − X β̂_{H1}||²
F = [{SS(reduced) − SS(full)}/{df(r) − df(f)}] / [SS(full)/df(f)]
  = [{||y − X β̂_{H0}||² − ||y − X β̂_{H1}||²}/{df(r) − df(f)}] / [||y − X β̂_{H1}||²/df(f)]
r²_adj = r² − p(1 − r²)/{n − (p + 1)}.
Multivariate Distributions
Random vector X ∈ Rp
(Multivariate) distribution function:
F(x) = P(X ≤ x) = P(X_1 ≤ x_1, X_2 ≤ x_2, ..., X_p ≤ x_p)
f(x) denotes the density of X, i.e.
F(x) = ∫_{−∞}^x f(u) du,  ∫_{−∞}^∞ f(u) du = 1
P{X ∈ (a, b)} = ∫_a^b f(x) dx
Marginal density of X_1:
f_{X1}(x_1) = ∫_{−∞}^∞ f(x_1, x_2) dx_2
Example
f(x_1, x_2) = (1/2)x_1 + (3/2)x_2 for 0 ≤ x_1, x_2 ≤ 1, and 0 otherwise.
f(x_1, x_2) is a density since
∫∫ f(x_1, x_2) dx_1 dx_2 = (1/2)[x_1²/2]_0^1 + (3/2)[x_2²/2]_0^1 = 1/4 + 3/4 = 1.
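A numerical check (midpoint rule) that the example density integrates to one:

```python
import numpy as np

def f(x1, x2):
    """Joint density of the example: (1/2) x1 + (3/2) x2 on [0,1]^2."""
    return 0.5 * x1 + 1.5 * x2

m = 400
g = (np.arange(m) + 0.5) / m          # midpoints of an m x m grid on [0,1]^2
X1, X2 = np.meshgrid(g, g)
total = f(X1, X2).mean()              # average value times area 1 = integral
print(round(total, 6))                # 1.0
```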
Definition of independence
Example
[Figure: Univariate density estimates and two 3D plots of bivariate densities.]
Summary: Distributions
E X = (E X_1, ..., E X_p)⊤ = ∫ x f(x) dx = (∫ x_1 f(x) dx, ..., ∫ x_p f(x) dx)⊤ = μ.
X ∼ (μ, Σ)
Σ = (σ_{XiXj}),  σ_{XiXj} = Cov(X_i, X_j),  σ_{XiXi} = Var(X_i)
Var(a⊤X) = a⊤ Var(X) a = Σ_{i,j} a_i a_j σ_{XiXj}
Var(AX + b) = A Var(X) A⊤
Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)
Var(X + Y) = Var(X) + Cov(X, Y) + Cov(Y, X) + Var(Y)
Cov(AX, BY) = A Cov(X, Y) B⊤.
Example
f(x_1, x_2) = (1/2)x_1 + (3/2)x_2 for 0 ≤ x_1, x_2 ≤ 1, and 0 otherwise.
μ_1 = ∫∫ x_1 f(x_1, x_2) dx_1 dx_2 = ∫_0^1 ∫_0^1 x_1 {(1/2)x_1 + (3/2)x_2} dx_1 dx_2
    = ∫_0^1 x_1 {(1/2)x_1 + 3/4} dx_1 = (1/2)[x_1³/3]_0^1 + (3/4)[x_1²/2]_0^1
    = 1/6 + 3/8 = (4 + 9)/24 = 13/24,
μ_2 = ∫∫ x_2 f(x_1, x_2) dx_1 dx_2 = ∫_0^1 x_2 {1/4 + (3/2)x_2} dx_2
    = (1/4)[x_2²/2]_0^1 + (3/2)[x_2³/3]_0^1 = 1/8 + 1/2 = (1 + 4)/8 = 5/8.
Multivariate Distributions 4-18
Covariance Matrix
σ_{X1X1} = E X_1² − μ_1² with
E X_1² = ∫_0^1 ∫_0^1 x_1² {(1/2)x_1 + (3/2)x_2} dx_1 dx_2
       = (1/2)[x_1⁴/4]_0^1 + (3/4)[x_1³/3]_0^1 = 1/8 + 1/4 = 3/8
σ_{X1X2} = E(X_1 X_2) − μ_1 μ_2 with
E(X_1 X_2) = ∫_0^1 ∫_0^1 x_1 x_2 {(1/2)x_1 + (3/2)x_2} dx_1 dx_2
           = ∫_0^1 {(1/6)x_2 + (3/4)x_2²} dx_2
           = (1/6)[x_2²/2]_0^1 + (3/4)[x_2³/3]_0^1 = 1/12 + 1/4 = 1/3.
Σ = ( 0.0815  −0.0052 ; −0.0052  0.0677 )
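The moments of the example density can be checked by midpoint quadrature; note that the arithmetic gives a small negative covariance, σ_{X1X2} = 1/3 − (13/24)(5/8) ≈ −0.0052:

```python
import numpy as np

# Moments of f(x1, x2) = x1/2 + 3 x2/2 on [0,1]^2, by midpoint quadrature
m = 500
g = (np.arange(m) + 0.5) / m
X1, X2 = np.meshgrid(g, g)
F = 0.5 * X1 + 1.5 * X2

mu1 = (X1 * F).mean()                    # 13/24
mu2 = (X2 * F).mean()                    # 5/8
s11 = (X1**2 * F).mean() - mu1**2        # 3/8 - (13/24)^2
s12 = (X1 * X2 * F).mean() - mu1 * mu2   # 1/3 - (13/24)(5/8)
print(round(mu1, 4), round(mu2, 4))      # 0.5417 0.625
print(round(s11, 4), round(s12, 4))      # 0.0816 -0.0052
```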
Conditional Expectations
U = X2 − E(X2 | X1 )
(1) E(U) = 0
(2) E(X2 |X1 ) is the best approximation of X2 by a function h(X1 )
of X1 in the sense of mean squared error (MSE) when
MSE (h) = E[{X2 − h(X1 )}> {X2 − h(X1 )}] and
h : Rk −→ Rp−k .
Summary: Moments
The expectation of a random vector X is μ = ∫ x f(x) dx, the
covariance matrix is Σ = Var(X) = E(X − μ)(X − μ)⊤. We
denote X ∼ (μ, Σ).
Expectations are linear, i.e., E(αX + βY ) = α E X + β E Y . If
X , Y are independent then E(XY > ) = E X E Y > .
Characteristic Functions
Properties of the cf:
ϕ_X(0) = 1, |ϕ_X(t)| ≤ 1
If ϕ is absolutely integrable (∫_{−∞}^∞ |ϕ(x)| dx exists and is finite), then
f(x) = (2π)^{−p} ∫_{−∞}^∞ e^{−i t⊤x} ϕ_X(t) dt.
Example (standard normal):
ϕ_X(t) = (2π)^{−1/2} ∫_{−∞}^∞ e^{itx} exp(−x²/2) dx
       = exp(−t²/2) · (2π)^{−1/2} ∫_{−∞}^∞ exp{−(x − it)²/2} dx
       = exp(−t²/2),
since i² = −1 and (2π)^{−1/2} ∫ exp{−(x − it)²/2} dx = 1.
Theorem (Cramér-Wold)
The distribution of X ∈ Rp is completely determined by the set of
all (one-dimensional) distributions of t > X , t ∈ Rp .
This theorem says that we can determine the distribution of X in
Rp by specifying all the one-dimensional distributions of the linear
combinations
Σ_{j=1}^p t_j X_j = t⊤X,  t = (t_1, t_2, ..., t_p)⊤.
Cumulants
κ_1 = m_1.
For k = 2 we obtain
κ_2 = −det( m_1 1 ; m_2 m_1 ) = m_2 − m_1²
Moment–cumulant relations:
m_1 = κ_1
m_2 = κ_2 + κ_1²
m_3 = κ_3 + 3κ_2κ_1 + κ_1³
m_4 = κ_4 + 4κ_3κ_1 + 3κ_2² + 6κ_2κ_1² + κ_1⁴
Skewness γ_3 = E(X − μ)³/σ³, kurtosis γ_4 = E(X − μ)⁴/σ⁴.
Transformations
X = u(Y), a one-to-one transformation u: R^p → R^p
Jacobian:
J = (∂x_i/∂y_j) = (∂u_i(y)/∂y_j)
f_Y(y) = abs(|J|) f_X{u(y)}
Example
Y = 3X → X = (1/3)Y = u(Y)
J = diag(1/3, ..., 1/3),  abs(|J|) = (1/3)^p
Linear case: Y = AX + b, A nonsingular:
X = A⁻¹(Y − b),  J = A⁻¹
Summary: Transformations
Multinormal Distribution
X ∼ Np (µ, Σ)
Expected value is E X = µ,
Covariance matrix of X is Var(X ) = Σ > 0.
(What is the meaning of the quadratic form (x − µ)> Σ−1 (x − µ)
in the formula for density?)
(x − μ)⊤ Σ⁻¹ (x − μ) = d²
Scatterplot of a normal sample and contour ellipses for μ = (3, 2)⊤ and Σ = ( 1.0 −1.5 ; −1.5 4.0 ).  MVAcontnorm
Σ− = G-inverse
Limit Theorems
√n (x̄ − μ) → N_p(0, Σ) in distribution as n → ∞.
[Figure: Estimated densities versus normal densities (univariate and bivariate).]
The standard normal cdf and the empirical distribution function for
n = 100. MVAedfnormal
EDF and CDF
The standard normal cdf and the empirical distribution function for
n = 1000 MVAedfnormal
EDF and 2 bootstrap EDFs, n = 100
[Figure: EDF and two bootstrap EDFs.]
Transformation of Statistics
If √n(t − μ) → N_p(0, Σ) in distribution and f = (f_1, ..., f_q)⊤: R^p → R^q
are real-valued functions which are differentiable at μ ∈ R^p, then
f(t) is asymptotically normal with mean f(μ) and covariance D⊤ΣD, i.e.,
√n {f(t) − f(μ)} → N_q(0, D⊤ΣD) for n → ∞,
where
D = (∂f_j/∂t_i)|_{t=μ}
is the (p × q) matrix of all partial derivatives.
This theorem can be applied e.g. to find the “variance stabilizing”
transformation.
Example
Suppose {X_i}_{i=1}^n ∼ (μ, Σ) with μ = (0, 0)⊤, Σ = ( 1 0.5 ; 0.5 1 ), p = 2.
This yields
√n ( x̄_1² − x̄_2 ; x̄_1 + 3x̄_2 ) → N_2( (0 ; 0), ( 1 −7/2 ; −7/2 13 ) ) in distribution.
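The asymptotic covariance D⊤ΣD of the example can be reproduced by writing down D at μ = (0, 0)⊤:

```python
import numpy as np

# Delta method: f(x) = (x1^2 - x2, x1 + 3 x2), evaluated at mu = (0, 0)
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
# D_{ij} = d f_j / d t_i at mu:
D = np.array([[0.0, 1.0],     # d f1/d x1 = 2 x1 = 0,  d f2/d x1 = 1
              [-1.0, 3.0]])   # d f1/d x2 = -1,        d f2/d x2 = 3
print(D.T @ Sigma @ D)        # [[ 1.  -3.5] [-3.5 13. ]]
```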
Heavy-Tailed Distributions
Definition
Distribution Comparison
[Figure: Comparison of Gauss and Cauchy densities.]
Kurtosis
PDF of GH Distribution
The density of a one-dimensional generalised hyperbolic (GH)
distribution for x ∈ R is
f_GH(x; λ, α, β, δ, μ) = {√(α² − β²)/δ}^λ / {√(2π) K_λ(δ√(α² − β²))}
    · K_{λ−1/2}{α√(δ² + (x − μ)²)} / {√(δ² + (x − μ)²)/α}^{1/2−λ} · e^{β(x−μ)},
where K_λ is a modified Bessel function of the third kind with index λ:
K_λ(x) = (1/2) ∫_0^∞ y^{λ−1} exp{−(x/2)(y + y⁻¹)} dy
Parameters
With ζ = δ√(α² − β²):
E[X] = μ + {δβ/√(α² − β²)} · K_{λ+1}(ζ)/K_λ(ζ)
Var[X] = δ² [ K_{λ+1}(ζ)/{ζ K_λ(ζ)}
        + {β²/(α² − β²)} { K_{λ+2}(ζ)/K_λ(ζ) − (K_{λ+1}(ζ)/K_λ(ζ))² } ]
Figure: pdf (left) and cdf (right) of GH (λ = 0.5), HYP and NIG with
α = 1, β = 0, δ = 1, µ = 0 MVAghdis
Student’s t-distribution
f_t(x; n) = Γ{(n + 1)/2} / {√(nπ) Γ(n/2)} · (1 + x²/n)^{−(n+1)/2}
Figure: pdf (left) and cdf (right) of t-distribution with different degrees
of freedom (t3 stands for t-distribution with 3 degrees of freedom)
MVAtdis
µ = 0
σ² = n/(n − 2)   (n > 2)
Skewness = 0
Kurtosis = 3 + 6/(n − 4)   (n > 4)
Property
[Figure: Tails of t-densities (t1, t3, t9, t45) compared with the Gaussian.]
Laplace Distribution
µ = µ
σ 2 = 2θ2
Skewness = 0
Kurtosis = 6
Figure: pdf (left) and cdf (right) of Laplace distributions with zero mean
and different scale parameters (L1 stands for Laplace distribution with
θ = 1) MVAlaplacedis
f(x) = e^{−|x|}/2
F(x) = e^x/2            for x < 0
F(x) = 1 − e^{−x}/2     for x ≥ 0
Cauchy Distribution
f_Cauchy(x; m, s) = 1/(sπ) · 1/{1 + ((x − m)/s)²}
F_Cauchy(x; m, s) = 1/2 + (1/π) arctan{(x − m)/s}
Figure: pdf (left) and cdf (right) of Cauchy distributions with m = 0 and
different scale parameters (C1 stands for Cauchy distribution with s = 1)
MVAcauchy
Mixture Model
Mean, variance, skewness and kurtosis:
μ = Σ_{l=1}^n w_l μ_l
σ² = Σ_{l=1}^n w_l {σ_l² + (μ_l − μ)²}
Skewness = Σ_{l=1}^n w_l [ (σ_l/σ)³ SK_l + 3σ_l²(μ_l − μ)/σ³ + {(μ_l − μ)/σ}³ ]
Kurtosis = Σ_{l=1}^n w_l [ (σ_l/σ)⁴ K_l + 6(μ_l − μ)²σ_l²/σ⁴
         + 4(μ_l − μ)σ_l³ SK_l/σ⁴ + {(μ_l − μ)/σ}⁴ ],
where μ_l, σ_l, SK_l and K_l correspond to the l-th distribution.
Figure: pdf (left) and cdf (right) of a Gaussian mixture MVAmixture
Parameters of GHd
λ ∈ R, β, μ ∈ R^d
δ > 0, α² > β⊤∆β
∆ ∈ R^{d×d} positive definite matrix with |∆| = 1
Second Parameterization
Σ = δ²∆
where
R_λ(x) = K_{λ+1}(x)/K_λ(x)
S_λ(x) = {K_{λ+2}(x) K_λ(x) − K_{λ+1}²(x)} / K_λ²(x)
Multivariate t-distribution
E[X ] = m
Cov[X] = Σ + mm⊤
[Figure: Comparison of Laplace, NIG, Cauchy and Gaussian densities and their tails.]
Advantages
Dependency Structures
[Figure: Scatterplots of simulated bivariate samples with different dependency structures.]
Varying Dependency
[Figure: Scatterplots of Bayer vs. Siemens over two periods, showing varying dependency.]
Outline
1. Motivation X
2. Copulae
3. Parameter Estimation
4. Sampling from Copulae
5. Tail Dependence
6. Value-at-Risk with Copulae
7. Application
Copulae
F-volume
2-increasing Function
VF (B) ≥ 0 (3)
2-increasing Function
Lemma
Let U1 and U2 be non-empty sets in R and let F : U1 × U2 −→ R
be a two-increasing function. Let x1 , x2 be in U1 with x1 ≤ x2 , and
y1 , y2 be in U2 with y1 ≤ y2 . Then the function
t 7→ F (t, y2 ) − F (t, y1 ) is non-decreasing on U1 and the function
t 7→ F (x2 , t) − F (x1 , t) is non-decreasing on U2 .
Grounded Function
Distribution Function
A distribution function is a function F : R² → [0, 1] which:
is grounded
is 2-increasing
satisfies F (∞, ∞) = 1.
Margins
Bivariate Copulae
Fréchet-Hoeffding Bounds
M(u1 , u2 ) = min(u1 , u2 )
W (u1 , u2 ) = max(u1 + u2 − 1, 0)
Fréchet Copulae
Gauss Copula
C(u1, u2) = Φρ{Φ⁻¹(u1), Φ⁻¹(u2)}
          = ∫_{−∞}^{Φ⁻¹(u1)} ∫_{−∞}^{Φ⁻¹(u2)} 1/(2π√(1−ρ²)) exp{−(x² − 2ρxy + y²)/(2(1−ρ²))} dx dy
t-Student Copula
C(u1, u2) = t_{ρ,ν}{t_ν⁻¹(u1), t_ν⁻¹(u2)}
          = ∫_{−∞}^{t_ν⁻¹(u1)} ∫_{−∞}^{t_ν⁻¹(u2)} 1/(2π√(1−ρ²)) {1 + (x² − 2ρxy + y²)/(ν(1−ρ²))}^{−(ν+2)/2} dx dy
Archimedean Copulae
Archimedean copula: built from a continuous, decreasing and convex generator ψ with ψ(1) = 0, via its pseudo-inverse

ψ^{[−1]}(t) = { ψ⁻¹(t),  0 ≤ t ≤ ψ(0)
              { 0,       ψ(0) < t ≤ ∞.
The function ψ is a generator of the Archimedean copula.
For ψ(0) = ∞: ψ^{[−1]} = ψ⁻¹, and ψ is called a strict generator.
Gumbel Copula
C(u, v) = exp[−{(−log u)^θ + (−log v)^θ}^{1/θ}]
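The Gumbel copula above is easy to evaluate directly. A minimal sketch (the function name is ours, not from the slides); θ = 1 recovers the product (independence) copula and the margins are uniform:

```python
import numpy as np

def gumbel_copula(u, v, theta):
    """Gumbel copula C(u,v) = exp(-[(-log u)^theta + (-log v)^theta]^(1/theta)), theta >= 1."""
    return np.exp(-((-np.log(u)) ** theta + (-np.log(v)) ** theta) ** (1.0 / theta))

# theta = 1 gives the product copula: C(u, v) = u * v
assert abs(gumbel_copula(0.3, 0.7, 1.0) - 0.3 * 0.7) < 1e-12
# uniform margins: C(u, 1) = u
assert abs(gumbel_copula(0.4, 1.0, 2.5) - 0.4) < 1e-12
```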
Clayton Copula
C(u, v) = max{(u^{−θ} + v^{−θ} − 1)^{−1/θ}, 0}
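The Clayton copula can be sketched the same way (function name ours; restricted to θ > 0, where the inner term never falls below 1 and the max is inactive):

```python
import numpy as np

def clayton_copula(u, v, theta):
    """Clayton copula C(u,v) = max((u^-theta + v^-theta - 1)^(-1/theta), 0), theta > 0."""
    inner = u ** (-theta) + v ** (-theta) - 1.0
    return np.maximum(inner, 0.0) ** (-1.0 / theta)

# uniform margins: C(u, 1) = u and C(1, v) = v
assert abs(clayton_copula(0.3, 1.0, 2.0) - 0.3) < 1e-12
assert abs(clayton_copula(1.0, 0.8, 0.5) - 0.8) < 1e-12
```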
Frank Copula
[Figure: scatterplots of samples from the Clayton (left) and Gumbel (right) copulae on the unit square.]
Transformations of Margins
Product Copula
Independence implies that the product of the cdf’s FX1 and FX2
equals the joint distribution function F , i.e.:
Product Copula
P(X1 ≤ x1 , X2 ≤ x2 ) = H(x1 , x2 )
= C {F1 (x1 ), F2 (x2 )}
= F1 (x1 ) · F2 (x2 )
Partial Derivatives
Let C(u, v) be a copula. For any v ∈ I, the partial derivative ∂C(u, v)/∂v exists for almost all u ∈ I. For such u and v one has:

∂C(u, v)/∂v ∈ I (10)

The analogous statement is true for the partial derivative ∂C(u, v)/∂u:

∂C(u, v)/∂u ∈ I (11)

Moreover, the functions

u ↦ C_v(u) := ∂C(u, v)/∂v and
v ↦ C_u(v) := ∂C(u, v)/∂u

are defined and non-decreasing almost everywhere on I.
Applied Multivariate Statistical Analysis
Multivariate Distributions 4-131
Copulae in d-Dimensions
F -volume
d-increasing Function
VF (B) ≥ 0. (13)
Grounded Function
Multivariate Copula
Sklar’s Theorem
For a distribution function F with marginals FX1 . . . , FXd , there
exists a copula C : [0, 1]d → [0, 1], such that
F (x1 , . . . , xd ) = C {FX1 (x1 ), . . . , FXd (xd )} (15)
for all xi ∈ R, i = 1, . . . , d. If FX1 , . . . , FXd are continuous, then C is
unique. If C is a copula and FX1 , . . . , FXd are cdfs, then the
function F defined in (15) is a joint cdf with marginals
FX1 , . . . , FXd .
C(u1, . . . , ud) = F_X{F_{X1}⁻¹(u1), . . . , F_{Xd}⁻¹(ud)}
uj = FXj (xj ), j = 1, . . . , d
c(u1, . . . , ud) = ∂^d C(u1, . . . , ud) / (∂u1 ⋯ ∂ud)
the joint density fX is
f_X(x1, . . . , xd) = c{F_{X1}(x1), . . . , F_{Xd}(xd)} ∏_{j=1}^{d} f_j(xj)
M^d(u1, . . . , ud) = min(u1, . . . , ud)

W^d(u1, . . . , ud) = max(∑_{i=1}^{d} ui − d + 1, 0)

3. Product copula Π^d(u1, . . . , ud) = ∏_{j=1}^{d} uj
4. The functions M^d and Π^d are d-copulae for all d ≥ 2; the function W^d is not a d-copula for any d > 2.
Gauss
∫_{−∞}^{Φ⁻¹(u1)} ⋯ ∫_{−∞}^{Φ⁻¹(ud)} (2π)^{−d/2} |R|^{−1/2} exp(−½ r⊤R⁻¹r) dr1 ⋯ drd,
where r = (r1, . . . , rd)⊤
t-Student
∫_{−∞}^{t_ν⁻¹(u1)} ⋯ ∫_{−∞}^{t_ν⁻¹(ud)} (2π)^{−d/2} |R|^{−1/2} (1 + r⊤R⁻¹r/ν)^{−(ν+d)/2} dr1 ⋯ drd,
where r = (r1, . . . , rd)⊤
Gumbel
C(u1, . . . , ud) = exp[−{(−log u1)^θ + ⋯ + (−log ud)^θ}^{1/θ}]
Cook-Johnson
C(u1, . . . , ud) = (∑_{j=1}^{d} uj^{−θ} − d + 1)^{−1/θ}
Frank
C(u1, . . . , ud) = −(1/θ) log{1 + (e^{−θu1} − 1) ⋯ (e^{−θud} − 1) / (e^{−θ} − 1)^{d−1}}
Dimensionality
In d dimensions:
1. Elliptical copulae: correlation matrix with d(d−1)/2 parameters
2. Archimedean copulae: 1 parameter
Conclusions
Pluses of copulae
flexible and wide range of dependence
easy to simulate, estimate, implement
explicit form of densities of copulae
modelling of fat tails, asymmetries
Minuses of copulae
Elliptical: correlation matrix, symmetry
Archimedean: too restrictive, single parameter, exchangeable
selection of copula
E(X ) = µ, Var(X ) = Σ
Linear transformations
Theorem
X = (X1, X2)⊤ ∼ Np(µ, Σ), with X1 ∈ R^r and X2 ∈ R^{p−r}. Define

X2.1 = X2 − Σ21 Σ11⁻¹ X1   with   Σ = ( Σ11  Σ12
                                        Σ21  Σ22 ).

⇒ X1 ∼ Nr(µ1, Σ11) and X2.1 ∼ N_{p−r}(µ2.1, Σ22.1) are independent.
Corollary
Let X = (X1, X2)⊤ ∼ Np(µ, Σ).
Σ12 = 0 if and only if X1 is independent of X2 .
Corollary
If X ∼ Np (µ, Σ), A and B matrices, then AX and BX are
independent if and only if AΣB > = 0.
Theorem
If X ∼ Np (µ, Σ) and A(q × p), c ∈ Rq , q ≤ p, then Y = AX + c
is a q-variate Normal, i.e.,
Y ∼ Nq (Aµ + c, AΣA> ).
Theorem
The conditional distribution of X2 given X1 = x1 is normal with mean µ2 + Σ21 Σ11⁻¹ (x1 − µ1) and covariance Σ22.1, i.e.,
Example
p = 2, r = 1,  µ = (0, 0)⊤,  Σ = (  1    −0.8
                                   −0.8   2   )

Σ11 = 1, Σ21 = −0.8, Σ22.1 = 2 − (0.8)² = 1.36.

⇒ f_{X1}(x1) = (1/√(2π)) exp(−x1²/2)

⇒ f(x2 | x1) = (1/√(2π·1.36)) exp{−(x2 + 0.8 x1)²/(2·1.36)}.
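The conditional moments in this example can be checked numerically; a minimal sketch with numpy, using the slides' numbers:

```python
import numpy as np

# worked example from the slides: p = 2, r = 1
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, -0.8],
                  [-0.8, 2.0]])

s11, s12 = Sigma[0, 0], Sigma[0, 1]
s21, s22 = Sigma[1, 0], Sigma[1, 1]

# conditional variance of X2 | X1 = x1:  Sigma_{22.1} = s22 - s21 s11^{-1} s12
sigma22_1 = s22 - s21 * s12 / s11
# conditional mean: mu2 + s21 s11^{-1} (x1 - mu1) = -0.8 * x1 here
x1 = 1.0
cond_mean = mu[1] + s21 / s11 * (x1 - mu[0])

assert abs(sigma22_1 - 1.36) < 1e-12
assert abs(cond_mean + 0.8) < 1e-12
```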
Theorem
If X1 ∼ Nr(µ1, Σ11) and (X2 | X1 = x1) ∼ N_{p−r}(A x1 + b, Ω) where Ω does not depend on x1, then

X = (X1, X2)⊤ ∼ Np(µ, Σ),

where

µ = (      µ1
      A µ1 + b )

and

Σ = ( Σ11      Σ11 A⊤
      A Σ11    Ω + A Σ11 A⊤ ).
Conditional Approximations
X2 = β0 + β⊤ X1 + U,   with   Σ = ( Σ11  σ12
                                    σ21  σ22 ).

The proportion of the variance of X2 explained by X1 is the squared multiple correlation

ρ²_{2.1...r} = σ21 Σ11⁻¹ σ12 / σ22.

Example: the multiple correlation is ρ²_{1.234} = σ12 Σ22⁻¹ σ21 / σ11 = 0.907.
Mahalanobis Transform
If X ∼ Np (µ, Σ) then the Mahalanobis transform is
Y = Σ−1/2 (X − µ) ∼ Np (0, Ip )
and it holds that Y⊤Y = (X − µ)⊤ Σ⁻¹ (X − µ) ∼ χ²_p.
Wishart Distribution
X ∼ Np (µ, Σ), µ = 0
X (n × p) data matrix
Theorem
Theorem
M ∼ Wp(Σ, n), a ∈ R^p with a⊤Σa ≠ 0

⇒ a⊤Ma / a⊤Σa ∼ χ²_n
Theorem (Cochran)
nS = X > HX ∼ Wp (Σ, n − 1)
S is the sample covariance matrix
x̄ and S are independent
Hotelling’s T 2 -Distribution
Theory of Estimation
MLE
θb = arg max L(X ; θ)
θ
log-likelihood
`(X ; θ) = log L(X ; θ)
Example
Sample {xi }ni=1 from Np (µ, I), i.e. from the pdf
f(x; θ) = (2π)^{−p/2} exp{−½ (x − θ)⊤(x − θ)}

Each term in the log-likelihood decomposes as

(xi − θ)⊤(xi − θ) = (xi − x̄)⊤(xi − x̄) + (x̄ − θ)⊤(x̄ − θ) + 2(x̄ − θ)⊤(xi − x̄).
Example
If we sum up this term over i = 1, . . . , n we see that
∑_{i=1}^{n} (xi − θ)⊤(xi − θ) = ∑_{i=1}^{n} (xi − x̄)⊤(xi − x̄) + n (x̄ − θ)⊤(x̄ − θ).
Hence

ℓ(X; θ) = log(2π)^{−np/2} − ½ ∑_{i=1}^{n} (xi − x̄)⊤(xi − x̄) − (n/2) (x̄ − θ)⊤(x̄ − θ).

θ̂ = µ̂ = x̄.
and

ℓ(X; θ) = −(n/2) log|2πΣ| − ½ ∑_{i=1}^{n} (xi − µ)⊤ Σ⁻¹ (xi − µ).
(xi − µ)⊤ Σ⁻¹ (xi − µ) = (xi − x̄)⊤ Σ⁻¹ (xi − x̄) + (x̄ − µ)⊤ Σ⁻¹ (x̄ − µ) + 2(x̄ − µ)⊤ Σ⁻¹ (xi − x̄),

∑_{i=1}^{n} (xi − µ)⊤ Σ⁻¹ (xi − µ) = ∑_{i=1}^{n} (xi − x̄)⊤ Σ⁻¹ (xi − x̄) + n (x̄ − µ)⊤ Σ⁻¹ (x̄ − µ).

µ̂ = x̄,  Σ̂ = S.
Note that the unbiased covariance estimator Su = n/(n−1) S is not the MLE!
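The distinction between the MLE S (divide by n) and the unbiased estimator Su (divide by n − 1) can be sketched numerically (simulated data, numpy only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))      # hypothetical i.i.d. sample
xbar = X.mean(axis=0)

# MLE of Sigma divides by n; the unbiased estimator divides by n - 1
S = (X - xbar).T @ (X - xbar) / n
Su = (X - xbar).T @ (X - xbar) / (n - 1)

assert np.allclose(Su, n / (n - 1) * S)              # Su = n/(n-1) * S
assert np.allclose(Su, np.cov(X, rowvar=False))      # np.cov defaults to ddof=1
```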
and

ℓ(y; θ) = log{1/((2π)^{n/2} σ^n)} − (1/(2σ²)) ∑_{i=1}^{n} (yi − β⊤xi)²
        = −(n/2) log(2π) − n log σ − (1/(2σ²)) (y − Xβ)⊤(y − Xβ)
        = −(n/2) log(2π) − n log σ − (1/(2σ²)) (y⊤y + β⊤X⊤Xβ − 2β⊤X⊤y)

Differentiating w.r.t. the parameters yields

∂ℓ/∂β = −(1/(2σ²)) (2X⊤Xβ − 2X⊤y)

∂ℓ/∂σ = −n/σ + (1/σ³) (y − Xβ)⊤(y − Xβ).
Var(β̂) = σ² (X⊤X)⁻¹.

The MLEs of µ, Σ from a Np(µ, Σ) distribution are µ̂ = x̄ and Σ̂ = S. Note that the MLE for Σ is not unbiased.

The MLEs in a linear model y = Xβ + ε, ε ∼ Nn(0, σ²I) are given by the least squares estimator β̂ = (X⊤X)⁻¹X⊤y and σ̂² = (1/n) ‖y − Xβ̂‖². E(β̂) = β and Var(β̂) = σ² (X⊤X)⁻¹.
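The summary above says the MLE in the linear model coincides with least squares; a minimal numerical sketch (simulated data, numpy only):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# MLE = least squares estimator: beta_hat = (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
# MLE of sigma^2 divides by n (biased)
sigma2_hat = np.sum((y - X @ beta_hat) ** 2) / n

# same solution as the generic least-squares routine
assert np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0])
assert sigma2_hat > 0.0
```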
s(X; θ) = ∂ℓ(X; θ)/∂θ = −½ ∂/∂θ {∑_{i=1}^{n} (xi − θ)⊤(xi − θ)} = n(x̄ − θ),
Theorem
If s = s(X ; θ) is the score function and if θb = t = t(X , θ) is any
function of X and θ, then under regularity conditions
E(s t⊤) = ∂/∂θ E(t⊤) − E(∂t⊤/∂θ).
Corollary
If s = s(X ; θ) is the score function, and θb = t = t(X ) is any
unbiased estimator of θ (i.e., E(t) = θ), then
E{s(X ; θ)} = 0.
Remark
If x1 , · · · , xn are i.i.d., Fn = nF1 where F1 is the Fisher
information matrix for sample size n = 1.
All estimators which are unbiased and attain the Cramer-Rao lower
bound are minimum variance estimators.
Theorem (Cramer-Rao)
If θ̂ = t = t(X ) is any unbiased estimator for θ then under
regularity conditions
Var(t) ≥ Fn⁻¹,

where

Fn = E{s(X; θ) s(X; θ)⊤} = Var{s(X; θ)}
Proof.
Consider the correlation ρY ,Z between Y and Z where Y = a> t,
Z = c > s, and s is the score and the vectors a, c ∈ Rp . By the
Corollary Cov(s, t) = I and thus
Hence,
cont’d.
In particular, this holds for any c 6= 0. Therefore it holds also for
the maximum of the left-hand side with respect to c. Since
max_c (c⊤aa⊤c)/(c⊤Fn c) = max_{c⊤Fn c = 1} c⊤aa⊤c

and

max_{c⊤Fn c = 1} c⊤aa⊤c = a⊤Fn⁻¹a
Theorem
Suppose that the sample {xi }ni=1 is i.i.d. If θb is the MLE for
θ ∈ Rk , i.e., θb = arg max L(X ; θ), then under some regularity
θ
conditions, as n → ∞:
√n (θ̂ − θ) →_L N_k(0, F1⁻¹)
H 0 : θ ∈ Ω0
H 1 : θ ∈ Ω1 .
Example
Xi ∼ Np (θ, I)
H0 : θ = θ 0
H1 : no constraints for θ
or equivalently to Ω0 = {θ0 }, Ω1 = Rp .
Likelihood Ratio
Define L*_j = max_{θ ∈ Ωj} L(X; θ), the maxima of the likelihood for each of the hypotheses.

λ(X) = L*₀ / L*₁
Likelihood Ratio Test
rejection region:
R = {x : λ(x) < c}
sup_{θ ∈ Ω0} Pθ(x ∈ R) = α
Theorem (Wilks)
If Ω1 ⊂ Rq is a q-dimensional space and if Ω0 ⊂ Ω1 is an
r -dimensional subspace, then under regularity conditions for
n→∞
∀ θ ∈ Ω0 : −2 log λ →_L χ²_{q−r}.
Test problem 1
X1 , . . . , Xn , i.i.d. with Xi ∼ Np (µ, Σ)
H0 : µ = µ0 , Σ known, H1 : no constraints.
Ω0 = {µ0 }, r = 0, Ω1 = Rp , q = p
−2 log λ ∼ χ2p
H0 : µ = µ 0 , Σ unknown, H1 : no constraints.
This leads to
or equivalently
((n − p)/p) (x̄ − µ0)⊤ S⁻¹ (x̄ − µ0) ∼ F_{p,n−p}
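The F statistic above (with S the MLE of the covariance, dividing by n) is equivalent to the usual Hotelling T² route with the unbiased covariance Su; a minimal numerical sketch of the algebraic identity (simulated data, not from the book):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 40, 3
X = rng.normal(size=(n, p))          # hypothetical sample; H0 true with mu0 = 0
mu0 = np.zeros(p)

xbar = X.mean(axis=0)
S = (X - xbar).T @ (X - xbar) / n    # MLE of the covariance (divides by n)
F = (n - p) / p * (xbar - mu0) @ np.linalg.solve(S, xbar - mu0)

# equivalent route: Hotelling's T^2 with the unbiased covariance Su
Su = np.cov(X, rowvar=False)
T2 = n * (xbar - mu0) @ np.linalg.solve(Su, xbar - mu0)
assert np.allclose(F, (n - p) / (p * (n - 1)) * T2)
```

The identity holds because S = (n − 1)/n · Su, so S⁻¹ contributes an extra factor n/(n − 1).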
−2 log λ −→ χ2p ,
or equivalently
t²(a) = (n − 1) {a⊤(µ − x̄)}² / (a⊤Sa) ≤ F_{1−α;1,n−1}

which provides the (1 − α) confidence interval for a⊤µ:

a⊤x̄ − √(F_{1−α;1,n−1} a⊤Sa/(n−1)) ≤ a⊤µ ≤ a⊤x̄ + √(F_{1−α;1,n−1} a⊤Sa/(n−1)).
where Kα = (p/(n − p)) F_{1−α;p,n−p}.
Example
95% confidence region for µf , the mean of the forged banknotes, is
given by the ellipsoid:
{µ ∈ R⁶ : (µ − x̄f)⊤ Sf⁻¹ (µ − x̄f) ≤ (6/94) F_{0.95;6,94}}
214.692 ≤ µ1 ≤ 214.954
130.205 ≤ µ2 ≤ 130.395
130.082 ≤ µ3 ≤ 130.304
10.108 ≤ µ4 ≤ 10.952
10.896 ≤ µ5 ≤ 11.370
139.242 ≤ µ6 ≤ 139.658
Example (cont’d)
Comparison with µ0 = (214.9, 129.9, 129.7, 8.3, 10.1, 141.5)>
shows that almost all components (except the first one) are
responsible for the rejection of µ0 .
In addition, choosing e.g. a⊤ = (0, 0, 0, 1, −1, 0) gives the confidence interval −1.211 ≤ µ4 − µ5 ≤ 0.005, which shows that for the forged bills the lower border is essentially smaller than the upper border.
Test problem 3
Xi ∼ Np (µ, Σ)
H0 : Σ = Σ 0 , µ unknown, H1 : no constraints.
−2 log λ → χ²_m,  m = ½ p(p + 1)
H 0 : β = β0 , σ 2 unknown, H1 : no constraints.
Recall

F = ((n − p)/p) (‖y − Xβ0‖²/‖y − Xβ̂‖² − 1) ∼ F_{p,n−p}
The test statistic for the LR test is −2 log λ = 4.55, which is not significant under the χ²₂ distribution. However, the exact F-test statistic F = 5.93 is significant under the F₂,₈ distribution (F₂,₈;₀.₉₅ = 4.46).
Linear Hypothesis
We present a general procedure which allows a linear hypothesis to
be tested.
Linear hypotheses are of the form Aµ = a with known matrices
A(q × p) and a(q × 1) with q ≤ p.
Example
Suppose that X1 ∼ N(µ1 , σ) and X2 ∼ N(µ2 , σ) are independent
and that you want to test the hypothesis H0 : µ1 = µ2
This can be written as the linear hypothesis

H0 : Aµ = (1  −1) (µ1, µ2)⊤ = 0.
Test problem 5
Xi ∼ Np (µ, Σ)
H0 : Aµ = a, Σ known, H1 : no constraints.
Example
We consider hypotheses on the partitioned mean µ = (µ1, µ2)⊤.

H0 : µ1 = µ2,  H1 : no constraints,

for N_{2p}((µ1, µ2)⊤, ( Σ 0 ; 0 Σ )) with known Σ.

This is equivalent to A = (Ip, −Ip), a = (0, . . . , 0)⊤ ∈ R^p, and leads to
Example
Another example is the test whether µ1 = 0, i.e.

H0 : µ1 = 0,  H1 : no constraints,

for N_{2p}((µ1, µ2)⊤, ( Σ 0 ; 0 Σ )) with known Σ.

This is equivalent to Aµ = a with A = (I, 0) and a = (0, . . . , 0)⊤ ∈ R^p. Hence

−2 log λ = n x̄1⊤ Σ⁻¹ x̄1 ∼ χ²_p.
Test problem 6
Xi ∼ Np (µ, Σ)
H0 : Aµ = a, Σ unknown, H1 : no constraints.
Example
Consider the bank data set and test if µ4 = µ5, i.e., if the lower border mean equals the larger border mean for the forged bills.
A = (0, 0, 0, 1, −1, 0),  a = 0.
Repeated Measurements
Frequently, n independent sampling units are observed under p
different experimental conditions (different treatments,...).
X1 , . . . , Xn are i.i.d. with Xi ∼ Np (µ, Σ) given p repeated measures.
Repeated Measurements
Note that in many cases one of the experimental conditions is the
“control” (a placebo, standard drug or reference condition). In this
case,
C((p − 1) × p) = ( 1  −1   0  ⋯   0
                   1   0  −1  ⋯   0
                   ⋮   ⋮   ⋮  ⋱   ⋮
                   1   0   0  ⋯  −1 )

The null hypothesis will be rejected when

((n − p + 1)/(p − 1)) (C x̄)⊤ (C S C⊤)⁻¹ C x̄ > F_{1−α;p−1,n−p+1}
Repeated Measurements
Let b = C⊤a; then b⊤1p = ∑_{j=1}^{p} bj = 0. The result above thus provides, for all contrasts b⊤µ of µ, simultaneous confidence intervals of level (1 − α):

b⊤µ ∈ b⊤x̄ ± √(((p − 1)/(n − p + 1)) F_{1−α;p−1,n−p+1} b⊤Sb).
Example
40 children were randomly chosen and then followed from grade 8 to grade 11; the scores were obtained from a test of their vocabulary.
Example (cont’d)
The matrix C providing successive differences of µj is:
C = ( 1  −1   0   0
      0   1  −1   0
      0   0   1  −1 ).
−1.958 ≤ µ1 − µ2 ≤ −0.959
−0.949 ≤ µ2 − µ3 ≤ 0.335
−1.171 ≤ µ3 − µ4 ≤ 0.036.
Example (cont’d)
The rejection of the H0 is mainly due to the difference between the
first and the second year performance of children. The following
confidence intervals for the following contrasts may also be of
interest:
−2 log λ = 0.142
which is not significant for the χ21 distribution. The F -test statistic
F = 0.231
X1 = α + β1 X2 + β2 X3 + β3 X4 + ε
are
−2 log λ = 0.006,
the F statistic is
F = 0.007.
H 0 : µ1 = µ2 , H1 : no constraints.
Example
We want to compare the mean of the assets (X1 ) and of the sales
(X2 ) of the two sectors energy (group 1) and manufacturing (group
2). We have the following statistics: n1 = 15, n2 = 10, p = 2,

x̄1 = (4084, 2580.5)⊤,  x̄2 = (4307.2, 4925.2)⊤,

S1 = 10⁷ · ( 1.6635  1.2410
             1.2410  1.3747 ),

S2 = 10⁷ · ( 1.2248  1.1425
             1.1425  1.5112 ),

so that S = 10⁷ · ( 1.4880  1.2016
                    1.2016  1.4293 ).
Example
The observed value of the test statistic is F_obs = 2.7036. Since F_{0.95;2,22} = 3.4434, the hypothesis of equal means of the two groups is not rejected, although it would be rejected at a less severe level (p-value = 0.0892). The 95% simultaneous confidence intervals for the differences are given by
Example
Let us compare the vectors of means of the forged and the genuine
bank notes. The matrices Sf and Sg were already calculated and
since here nf = ng = 100, S is simply the mean of Sf and
Sg : S = 12 (Sf + Sg ).
Example
The 95% simultaneous confidence intervals for the differences
δj = µgj − µfj , j = 1, . . . , p are:
−0.0443 ≤ δ1 ≤ 0.3363
−0.5186 ≤ δ2 ≤ −0.1954
−0.6416 ≤ δ3 ≤ −0.3044
−2.6981 ≤ δ4 ≤ −1.7519
−1.2952 ≤ δ5 ≤ −0.6348
1.8072 ≤ δ6 ≤ 2.3268
H0 : Σ1 = Σ2 = · · · = Σk , H1 : no constraints.
nh Sh ∼ Wp (Σh , nh − 1)
Under H0, ∑_{h=1}^{k} nh Sh ∼ Wp(Σ, n − k), where Σ is the common covariance matrix and n = ∑_{h=1}^{k} nh. Let S = (n1 S1 + ⋯ + nk Sk)/n be the weighted average of the Sh (it is in fact the MLE of Σ when H0 is true). The likelihood ratio test leads to the statistic

−2 log λ = n log|S| − ∑_{h=1}^{k} nh log|Sh|
Example
Returning to the US companies data, where the means of assets and sales have been compared for companies from the energy and manufacturing sectors: the test of Σ1 = Σ2 leads to the value of the test statistic

−2 log λ = 0.9076
H0 : µ 1 = µ 2 , H1 : no constraints.
(x̄1 − x̄2) ∼ Np(δ, Σ1/n1 + Σ2/n2).

Therefore,

(x̄1 − x̄2 − δ)⊤ (Σ1/n1 + Σ2/n2)⁻¹ (x̄1 − x̄2 − δ) ∼ χ²_p
Example
Let us compare the forged and the genuine bank notes again (n1
and n2 are large). The test statistic turns out to be 2436.8 which
is highly significant. The 95% simultaneous confidence intervals
are now:
−0.0389 ≤ δ1 ≤ 0.3309
−0.5140 ≤ δ2 ≤ −0.2000
−0.6368 ≤ δ3 ≤ −0.3092
−2.6846 ≤ δ4 ≤ −1.7654
−1.2858 ≤ δ5 ≤ −0.6442
1.8146 ≤ δ6 ≤ 2.3194
showing that all the components except the first are different from
zero, the larger difference coming from X6 (length of the diagonal)
and X4 (lower border).
Profile analysis
Profile Analysis
[Figure: population profiles of two groups — mean response plotted against five treatments.]
Parallelism
Let C be a ((p − 1) × p) matrix defined as

C = ( 1  −1   0  ⋯   0
      0   1  −1  ⋯   0
      0   ⋯   0   1  −1 ).

The hypothesis to be tested is H0^(1) : C(µ1 − µ2) = 0. Under H0,

(n1 n2 (n1 + n2 − 2)/(n1 + n2)²) {C(x̄1 − x̄2)}⊤ (C S C⊤)⁻¹ C(x̄1 − x̄2) ∼ T²(p − 1, n1 + n2 − 2)

where S is the pooled covariance matrix. The hypothesis is rejected if

(n1 n2 (n1 + n2 − p)/((n1 + n2)² (p − 1))) (C x̄)⊤ (C S C⊤)⁻¹ C x̄ > F_{1−α;p−1,n1+n2−p}.
Applied Multivariate Statistical Analysis
Hypothesis Testing 7-55
(n1 n2 (n1 + n2 − 2)/(n1 + n2)²) {1p⊤(x̄1 − x̄2)}² / (1p⊤ S 1p) ∼ T²(1, n1 + n2 − 2) = F_{1,n1+n2−2}.
Treatment effect
If the parallelism between the profiles has been rejected, then two independent analyses should be done on the two groups using the repeated measurements approach (see above). But if the parallelism is accepted, we can exploit the information contained in both groups (possibly at different levels) to test a treatment effect, or the horizontality of the two profiles. This may be written as:

H0^(3) : C(µ1 + µ2) = 0.
We obtain
Example
Wechsler Adult Intelligence Scale (WAIS) for 2 categories of
people: in group 1 are n1 = 37 people who do not present a senile
factor, group 2 are those (n2 = 12) presenting a senile factor. The
four WAIS subtests are X1 (information), X2 (similarities), X3
(arithmetic) and X4 (picture completion). The relevant statistics
are
Example
S1 = ( 11.164
        8.840  11.759
        6.210   5.778  10.790
        2.020   0.529   1.743   3.594 )

S2 = (  9.688
        9.583  16.722
        8.875  11.083  12.083
        7.021   8.167   4.875  11.688 )

(lower triangles of the symmetric matrices shown).
Example
The test statistic for testing the parallelism of the two profiles is F_obs = 0.4634, which is not significant (p-value = 0.71), so we can accept the parallelism.
The second test (equality of the levels of the two profiles) gives F_obs = 17.2146, which is highly significant (p-value ≈ 10⁻⁴): the global level of the non-senile people is superior to that of the senile group.
Finally, the third test (horizontality of the average profile) gives F_obs = 53.317, which is also highly significant (p-value ≈ 10⁻¹⁴): there are significant differences among the means of the different subtests.
Regression Models
Linear Regression
y = Xβ + ε
X (n × p) explanatory variable
y (n × 1) response
Example
yi = β0 + β1 xi1 + β2 xi2 + β3 xi1² + β4 xi2² + β5 xi1 xi2 + εi,  i = 1, . . . , n

X = ( 1  x11  x12  x11²  x12²  x11 x12
      1  x21  x22  x21²  x22²  x21 x22
      ⋮   ⋮    ⋮    ⋮     ⋮      ⋮
      1  xn1  xn2  xn1²  xn2²  xn1 xn2 )
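Building this quadratic design matrix is a one-liner with numpy; a minimal sketch (the helper name is ours):

```python
import numpy as np

def quad_design(x1, x2):
    """Design matrix with intercept, linear, quadratic and interaction columns."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])

X = quad_design([1.0, 2.0], [3.0, 4.0])
assert X.shape == (2, 6)
assert np.allclose(X[0], [1, 1, 3, 1, 9, 3])
assert np.allclose(X[1], [1, 2, 4, 4, 16, 8])
```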
Example
ANOVA Models
One factor (p levels) model
Multiple-Factors Models
      A1            A2       A3
B1    18 15         5 8 8    10 14
B2    15 20 25 30   10 12    20 25

Table: A two-factor ANOVA data set: factor A with three levels of the marketing strategy and factor B with two levels for the location. The figures represent the resulting sales during the same period.
Example
y = (18, 15, 15, 20, 25, 30, 5, 8, 8, 10, 12, 10, 14, 20, 25)⊤

Without interactions:

X⊤ = ( 1  1   1   1   1   1  1  1  1   1   1   1   1   1   1
       1  1   1   1   1   1  0  0  0   0   0  −1  −1  −1  −1
       0  0   0   0   0   0  1  1  1   1   1  −1  −1  −1  −1
       1  1  −1  −1  −1  −1  1  1  1  −1  −1   1   1  −1  −1 )

With interactions:

X⊤ = ( 1  1   1   1   1   1  1  1  1   1   1   1   1   1   1
       1  1   1   1   1   1  0  0  0   0   0  −1  −1  −1  −1
       0  0   0   0   0   0  1  1  1   1   1  −1  −1  −1  −1
       1  1  −1  −1  −1  −1  1  1  1  −1  −1   1   1  −1  −1
       1  1  −1  −1  −1  −1  0  0  0   0   0  −1  −1   1   1
       0  0   0   0   0   0  1  1  1  −1  −1  −1  −1   1   1 )
Example
βb p-values
µ 15.25
α1 4.25 0.0218
α2 -6.25 0.0033
γ1 -3.42 0.0139
(αγ)11 0.42 0.7922
(αγ)21 1.42 0.8096
Table: The values of βb in the full model with interactions for the
marketing data (RSSfull = 158)
ANCOVA Models
Example: Consider the Car data and analyse the effect of weight
(W ) and displacement (D) on the mileage (M). Test if the origin
of the car (C ) has some effect on the response and if the effect of
the continuous variables is same for different factor levels.
Example
      β̂         p-value    β̃         p-value
µ     41.0066   0.0000     43.4031   0.0000
W     −0.0073   0.0000     −0.0074   0.0000
D      0.0118   0.2250      0.0081   0.4140
C     −0.9675   0.1250
Example
Categorical Responses
Two-Way Tables
No interaction
log m = X β
log m = ( log m11        ( 1   1   1   0           ( β0
          log m12          1   1   0   1             β1
          log m13    X =   1   1  −1  −1       β =   β2
          log m21          1  −1   1   0             β3 )
          log m22          1  −1   0   1
          log m23 ),       1  −1  −1  −1 ),
Likelihood
L(β) = ∑_{j=1}^{J} ∑_{k=1}^{K} yjk log mjk   s.t.   ∑_{j,k} mjk = n

α1 = β1,  α2 = −β1
γ1 = β2,  γ2 = β3,  γ3 = −(β2 + β3)

∑_{k=1}^{K} (αγ)jk = 0,  for j = 1, . . . , J

∑_{j=1}^{J} (αγ)jk = 0,  for k = 1, . . . , K
yk: count data
m̂k: value predicted by the model

Pearson chi-square

χ² = ∑_{k=1}^{K} (yk − m̂k)² / m̂k

Deviance

G² = 2 ∑_{k=1}^{K} yk log(yk / m̂k)
Degrees of freedom
Test
H0 : reduced model with r degrees of freedom
H1 : full model with f degrees of freedom
Reject H0 if

P{χ²_{r−f} > (G²_{H0} − G²_{H1})_observed} < α
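The two goodness-of-fit statistics are straightforward to compute; a minimal sketch (helper names ours; the counts reuse the men/DY row of the drug table below, and the constant-model fit is only illustrative):

```python
import numpy as np

def pearson_chi2(y, m_hat):
    """Pearson chi-square: sum of (y_k - m_hat_k)^2 / m_hat_k."""
    return np.sum((y - m_hat) ** 2 / m_hat)

def deviance(y, m_hat):
    """Deviance G^2 = 2 * sum of y_k * log(y_k / m_hat_k)."""
    return 2.0 * np.sum(y * np.log(y / m_hat))

y = np.array([21.0, 32.0, 70.0, 43.0, 19.0])
m_hat = np.full(5, y.mean())        # fitted values under a constant model

# both statistics vanish for a perfect fit and are positive otherwise
assert pearson_chi2(y, y) == 0.0
assert abs(deviance(y, y)) < 1e-12
assert pearson_chi2(y, m_hat) > 0.0
assert deviance(y, m_hat) > 0.0
```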
Example
M      A1   A2   A3   A4   A5
DY     21   32   70   43   19
DN    683  596  705  295   99

F      A1   A2   A3   A4   A5
DY     46   89  169   98   51
DN    738  700  847  336  196

Table: A three-way contingency table: top table for men and bottom table for women. MVAdrug
Example
Logit Models
p(xi) = P(yi = 1 | xi) = exp(β0 + ∑_{j=1}^{p} βj xij) / {1 + exp(β0 + ∑_{j=1}^{p} βj xij)}
Logit Models
Likelihood function

L(β0, β) = ∏_{i=1}^{n} p(xi)^{yi} {1 − p(xi)}^{1−yi}

Log-likelihood function

ℓ(β0, β) = ∑_{i=1}^{n} [yi log p(xi) + (1 − yi) log{1 − p(xi)}]
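The logit probability and its log-likelihood can be sketched directly (function names ours; tiny hypothetical data):

```python
import numpy as np

def logit_prob(X, beta0, beta):
    """p(x_i) = exp(beta0 + x_i' beta) / (1 + exp(beta0 + x_i' beta))."""
    eta = beta0 + X @ beta
    return 1.0 / (1.0 + np.exp(-eta))

def log_likelihood(y, X, beta0, beta):
    p = logit_prob(X, beta0, beta)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

X = np.array([[0.0], [1.0]])
y = np.array([0.0, 1.0])

# with beta0 = 0 and beta = 0 every p(x_i) = 1/2, so ell = 2 * log(1/2)
ell0 = log_likelihood(y, X, 0.0, np.array([0.0]))
assert abs(ell0 - 2 * np.log(0.5)) < 1e-12
```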
Example
β̂ p-values
β0 3.6042 0.0660
β3 -0.2031 0.0037
β4 -0.0205 0.0183
β5 -1.1841 0.3108
Two statistics to test the full model against the reduced model are:

X² = ∑_{k=1}^{K} (yk − m̂k)² / m̂k

G² = 2 ∑_{k=1}^{K} yk log(yk / m̂k)
Summary: Introduction
A closer look: p_{xi} = xi⊤u1 and ‖xi − p_{xi}‖² = ‖xi‖² − ‖p_{xi}‖² by Pythagoras' theorem, so minimizing the sum of squared distances is equivalent to:

max_{‖u1‖=1} ∑_{i=1}^{n} ‖p_{xi}‖²
Theorem
The vector u1 which minimizes (19) is the eigenvector of X > X
associated with largest eigenvalue λ1 of X > X .
Subspaces of Dimension 2
second factor direction = second eigenvector!
zj = X uj factorial variable
uj = j-th eigenvector, j = 1, 2

The eigenvector equation in R^n, (X X⊤)vk = λk vk, gives uk = ck X⊤vk.
Now consider the eigenvector equation in R^p: (X⊤X)uk = λk uk ⇒ vk = dk X uk.

uk = (1/√λk) X⊤vk,  vk = (1/√λk) X uk.
Practical Computation
[Figure: French food data — representation of the family types (ma, em, ca groups) in the factor space W[,1] × W[,2] (left) and of the goods (bread, milk, wine, vegetables, …) in Z[,1] × Z[,2] (right).]
Objective:
Reduce the dimension of a p-variate random variable X
through linear combinations.
These linear combinations should create the largest spread
among the values of X, i.e. we are looking for the linear
combinations with the largest variances.
Get:
first PC: Y1 = γ1> X
second PC: Y2 = γ2> X
...and so on with γi ⊥ γj ∀ i ≠ j.
In general:
The PC transformation of a random variable X with E(X ) = µ,
Var(X ) = Σ = ΓΛΓ> is:
Y = Γ> (X − µ)
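The empirical PC transformation can be sketched in Python (simulated data, numpy only): eigendecompose the sample covariance, project the centered data, and check that the resulting PCs are uncorrelated with variances equal to the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.multivariate_normal([0.0, 0.0], [[3.0, 1.0], [1.0, 2.0]], size=2000)

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
lam, G = np.linalg.eigh(S)           # eigh returns eigenvalues in ascending order
lam, G = lam[::-1], G[:, ::-1]       # sort descending

Y = (X - xbar) @ G                   # empirical PC transformation
SY = np.cov(Y, rowvar=False)

# PCs are uncorrelated and their variances are the eigenvalues of S
assert np.allclose(np.diag(SY), lam)
assert abs(SY[0, 1]) < 1e-10
```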
Direction in Data
[Figure: a direction (line) through a two-dimensional point cloud.]
Projection
[Figure: the projections of the data points onto that direction.]
Theorem
Let X ∼ (µ, Σ) and let Y = Γ> (X − µ) be the PC transformation.
Then, for j = 1, . . . , p:

E(Yj) = 0
Var(Yj) = λj
Cov(Yi, Yj) = 0, for i ≠ j
Var(Y1) ≥ ⋯ ≥ Var(Yp) ≥ 0
∑_{j=1}^{p} Var(Yj) = tr(Σ)
∏_{j=1}^{p} Var(Yj) = |Σ|.
Theorem
There exists no SLC that has larger variance than λ1 = Var(Y1 ).
Theorem
If Y = a> X is a SLC that is uncorrelated with the first k PCs of
X , then Var(Y ) is maximized by a = γk+1 .
Summary: SLC
µ becomes x̄, Σ changes to S = G L G⊤:

Y = (X − 1n x̄⊤) G

SY = n⁻¹ Y⊤HY = n⁻¹ G⊤(X − 1n x̄⊤)⊤ H (X − 1n x̄⊤) G = n⁻¹ G⊤X⊤HX G = G⊤SG = L

y1 = (X − 1n x̄⊤) g1,  with gj = j-th eigenvector of S.
Example
Assume the bank notes data set.
[Figure: scatterplots of the first principal components of the bank data (PC1 vs. PC2, PC2 vs. PC3) and the eigenvalues λj.]
Example (Scaling)
Rescaling of variables: X1 , X2 , X3 , and X6 are now measured in cm,
X4 and X5 remain in mm.
This leads to:
and
` = (2.101, 0.623, 0.005, 0.002, 0.001, 0.0004)> .
The result clearly differs from the preceding example (see Figure 25): the first PC is dominated by X4, the second PC by X5. The other variables have much less weight.
[Figure: PCs of the rescaled bank data — PC1 vs. PC2, PC2 vs. PC3, and the eigenvalue plot.]
The scale of the variables should be roughly the same for PCA.
For the practical implementation of principal components
analysis (PCA) we replace µ by the mean x and Σ by the
empirical covariance S. Then we compute the eigenvalues
`1 , . . . , `p and the eigenvectors g1 , . . . , gp of S.
The graphical representation of the PCs is obtained by
plotting the first PC vs. the second (and eventually vs. the
third).
The components of the eigenvectors gi are the weights of the original variables in the PCs.
MVApcabanki
[Figure: proportion of variance explained by each PC (scree plot) for the bank data.]
ψ1 = (λ1 + ⋯ + λq) / ∑_{j=1}^{p} λj
Plotting procedure:
1. Compute the covariance matrix
2. Compute the eigenvalues
3. Standardize the eigenvalues by the sum of eigenvalues
4. Plot the proportions on the y-axis
[Figure: correlations of the original variables X1, . . . , X6 with the first and second PC.]
Theorem
Let Σ > 0 with distinct eigenvalues, and let U ∼ m⁻¹ Wp(Σ, m) with spectral decompositions Σ = ΓΛΓ⊤ and U = G L G⊤. Then

(a) √m (ℓ − λ) →_L Np(0, 2Λ²),
(b) √m (gj − γj) →_L Np(0, Vj), with Vj = λj ∑_{k≠j} λk/(λk − λj)² γk γk⊤,
(c) Cov(gj, gk) = Vjk, where the (r, s)-element of Vjk is −λj λk γrk γsj / {m(λj − λk)²},
(d) the elements in ℓ are asymptotically independent of the elements in G.

√(n − 1) (ℓj − λj) →_L N(0, 2λj²),  j = 1, . . . , p.
ψ = (λ1 + ⋯ + λq) / ∑_{j=1}^{p} λj

ψ̂ = (ℓ1 + ⋯ + ℓq) / ∑_{j=1}^{p} ℓj
√(n − 1) (ψ̂ − ψ) →_L N(0, D⊤VD)

V = 2Λ²,  D = (d1, ⋯, dp)⊤

dj = ∂ψ/∂λj = { (1 − ψ)/tr(Σ)  for 1 ≤ j ≤ q,
              { −ψ/tr(Σ)       for q + 1 ≤ j ≤ p.
Theorem

√(n − 1) (ψ̂ − ψ) →_L N(0, ω²),

ω² = D⊤VD = (2/{tr(Σ)}²) {(1 − ψ)² (λ1² + ⋯ + λq²) + ψ² (λ_{q+1}² + ⋯ + λp²)}
          = (2 tr(Σ²)/{tr(Σ)}²) (ψ² − 2βψ + β)

β = (λ1² + ⋯ + λq²) / (λ1² + ⋯ + λp²).
ω̂² = (2 tr(S²)/{tr(S)}²) (ψ̂² − 2β̂ψ̂ + β̂) = 0.142

So:

0.668 ± 1.96 √(0.142/199) = (0.615, 0.720).
The hypothesis that ψ = 75% would thus be rejected!
XC = HX
XS = HX D^{−1/2}

R = GR LR GR⊤,  LR = diag(ℓ1^R, . . . , ℓp^R).

Z = XS GR = (z1, . . . , zp).

z̄ = 0,  SZ = GR⊤ S_{XS} GR = GR⊤ R GR = LR.
[Figure: French food data — second factor of the families (MA, EM, CA groups) and second factor of the goods (bread, milk, wine, vegetables, meat, poultry, fruits).]
the PCs are the factors representing the rows of the centered
data matrix;
the NPCs correspond to the factors of the standardized data
matrix.
XC = HX.

Y = XC G = (y1, . . . , yp).

ȳ = 0,  SY = G⊤ SX G = L = diag(ℓ1, . . . , ℓp).
Duality Relations
Geometric Representation
Observe that, with xC[j] and xC[k] denoting the j-th and k-th column of XC, if θjk is the angle between xC[j] and xC[k], then

cos θjk = xC[j]⊤ xC[k] / (‖xC[j]‖ ‖xC[k]‖) = r_{Xj Xk}

cos ζik = zi⊤ ek / (‖zi‖ ‖ek‖) = zik / ‖xSi‖

The values cos² ζik are sometimes called the relative contributions of the k-th axis to the representation of the i-th individual.
Flury (1988)
HCPC : Σi = ΓΛi Γ> , i = 1, ..., k
Σi population covariance matrix for group i
Γ = (γ1 , ..., γp ) transformation matrix
Λi = diag(λi1 , ..., λip ) eigenvalue matrix
Example
CPC analysis for the implied volatility surfaces of the DAX index in
1999. Day-by-day surface smoothing.
Figure: Factor loadings of the first (thick), the second (medium), and the
third (thin) PC MVAcpcaiv
Factor Analysis
xj = ∑_{ℓ=1}^{k} qjℓ fℓ + µj

The fℓ are the factors; aim at k small!
The random variables fj are ”unobserved underlying factors”.
Usually, factors cannot be uniquely determined. The choice of the
factors depends on the situation.
Applied Multivariate Statistical Analysis
Factor Analysis 11-3
Example
X = QF + µ

Y = Γ⊤(X − µ) ⇒ X − µ = ΓY = Γ1 Y1 + Γ2 Y2,

where

Y = ( Y1   = ( Γ1⊤   (X − µ),
      Y2 )     Γ2⊤ )

with

( Γ1⊤  (X − µ) ∼ ( 0, ( Λ1  0
  Γ2⊤ )                 0   0 ) ).

Therefore

X = Γ1 Λ1^{1/2} Λ1^{−1/2} Y1 + µ.

Defining

Q = Γ1 Λ1^{1/2},  F = Λ1^{−1/2} Y1,

we obtain X = QF + µ. The general model adds specific factors U:
X = QF + U + µ
Q = (p × k) loadings
F = (k × 1) common factors
U = (p × 1) specific factors
The object of factor analysis is to find the loadings Q and the
variance Ψ of the specific factors U. The estimates are deduced
from the covariance structure of X .
Assumptions:
E [F ] = 0
Var(F ) = Ik
E [U] = 0
Var(U) = Ψ = diag(ψ11 , . . . , ψpp )
Cov(Ui , Uj ) = 0, i 6= j
Cov(F , U) = 0
  σXj Xj = Var(Xj ) = qj1² + · · · + qjk² + ψjj
Notice that Var(Xj ) = hj² + ψjj , i.e., the communality hj² is the
part of the variance of Xj explained by the common factors; the
specific variance ψjj is the unexplained part. The goal of FA is to
make the explained part as large as possible with few factors.
Decomposition of covariance
All we know about the factors, factor loadings, and specific
variances is that

  Σ = QQ> + Ψ
The correlation matrix between X and F is

  PXF = D^{−1/2} Q,            (20)

where D = diag(σX1 X1 , . . . , σXp Xp ).
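A small simulation (an illustration, not from the slides; all numerical values are hypothetical) checks the decomposition Σ = QQ> + Ψ and the correlation PXF = D^{−1/2} Q empirically:

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical loadings and specific variances (p = 3, k = 1, mu = 0)
Q = np.array([[0.9], [0.8], [0.5]])
psi = np.array([1 - 0.81, 1 - 0.64, 1 - 0.25])
Psi = np.diag(psi)

n = 200_000
F = rng.normal(size=(n, 1))                 # E F = 0, Var(F) = I_k
U = rng.normal(size=(n, 3)) * np.sqrt(psi)  # specific factors, Var(U) = Psi
X = F @ Q.T + U                             # X = QF + U

S = np.cov(X, rowvar=False)                 # empirical covariance
Sigma = Q @ Q.T + Psi                       # model covariance QQ' + Psi
# here D = diag(Sigma) = I, so corr(X1, F1) should be q11 = 0.9
r = np.corrcoef(X[:, 0], F[:, 0])[0, 1]
```

With this choice of Ψ the variables have unit variance, so the loadings are directly the correlations between variables and factor.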
Invariance of Scale

Assume that we have the following FA model for X :

  Var(X ) = QX QX> + ΨX .

What happens if we change the scale of X , i.e.,

  Y = CX ,   C = diag(c1 , . . . , cp ) ?

Then Var(Y ) = C QX QX> C + C ΨX C , so Y follows a k-factor model
with QY = C QX and ΨY = C ΨX C : the factor model is invariant
under changes of scale.

Non-uniqueness: for any orthogonal matrix G,

  X = (QG)(G> F ) + U + µ.

This is a k-factor model with factor loadings QG and common
factors G> F . In practical analysis, we choose the rotation G which
gives a "desirable" interpretation.

For the purpose of estimation, the non-uniqueness can be resolved by
imposing additional constraints, e.g.
Example

p = 3, k = 1 ⇒ d = 0

       q1² + ψ11    q1 q2        q1 q3
Σ  =   q1 q2        q2² + ψ22    q2 q3
       q1 q3        q2 q3        q3² + ψ33

where Q = (q1 , q2 , q3 )> and Ψ = diag(ψ11 , ψ22 , ψ33 ).

We have

  q1² = σ12 σ13 / σ23 ;   q2² = σ12 σ23 / σ13 ;   q3² = σ13 σ23 / σ12

and

  ψ11 = σ11 − q1² ;   ψ22 = σ22 − q2² ;   ψ33 = σ33 − q3² .
Example

Suppose now p = 2 and k = 1; then d < 0.

       1   ρ       q1² + ψ11    q1 q2
Σ  =          =
       ρ   1       q1 q2        q2² + ψ22

For every α with ρ² ≤ α² ≤ 1 we obtain a valid solution:

  q1 = α ;  q2 = ρ/α ;  ψ11 = 1 − α² ;  ψ22 = 1 − (ρ/α)²
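The non-uniqueness can be seen numerically (an illustration with a hypothetical ρ, not from the slides): every admissible α reproduces Σ exactly.

```python
import numpy as np

rho = 0.7  # hypothetical correlation

def exact_solution(alpha):
    # one of infinitely many exact one-factor solutions for p = 2
    q = np.array([alpha, rho / alpha])
    Psi = np.diag(1 - q**2)
    return np.outer(q, q) + Psi

Sigma = np.array([[1.0, rho], [rho, 1.0]])
# any alpha with rho**2 <= alpha**2 <= 1 keeps both specific variances nonnegative
A = exact_solution(0.9)
B = exact_solution(np.sqrt(rho))
```

Both A and B reconstruct the same Σ, which is exactly the indeterminacy d < 0 expresses.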
S = Q̂ Q̂> + Ψ̂

  Q̂ Q̂>   part explained by the common factors
  Ψ̂      specific variances

Given an estimate Q̂ of Q:

  ψ̂jj = sXj Xj − (q̂j1² + · · · + q̂jk²),

and ĥj² = q̂j1² + · · · + q̂jk² is an estimate of the communality. The
problem can be solved exactly in the ideal case of d = 0.
Example

The data set consists of averaged marks (from 1 = low to 7 = high) for
31 car types. We consider price, security and easy handling.

       1   0.975   0.613
R =        1       0.620
                   1
Example

The equation

    1   rX1 X2   rX1 X3           q̂1² + ψ̂11   q̂1 q̂2       q̂1 q̂3
        1        rX2 X3   =  R  =              q̂2² + ψ̂22   q̂2 q̂3
                 1                                          q̂3² + ψ̂33

yields the communalities ĥi² = q̂i² .
Example
Together with ψb11 = 1 − qb12 , ψb22 = 1 − qb22 and ψb33 = 1 − qb32 we
get the solution
Since the first two communalities are close to one we can conclude
that the first two variables, namely price and security, are explained
by the factor very well.
This factor might be interpreted as a “price+security” factor.
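The closed-form solution can be replayed numerically; this sketch uses only the correlation values quoted in the text and the exact formulas for the qi²:

```python
import numpy as np

# correlations of price, security, easy handling (from the text)
r12, r13, r23 = 0.975, 0.613, 0.620

# exact one-factor solution for p = 3 (formulas for q_i^2 above)
q = np.array([np.sqrt(r12 * r13 / r23),
              np.sqrt(r12 * r23 / r13),
              np.sqrt(r13 * r23 / r12)])

h2 = q**2          # communalities
psi = 1 - h2       # specific variances
# the first two communalities are close to one, the third is not
```

By construction the products qi qj reproduce the off-diagonal correlations exactly.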
By substituting Σ = QQ> + Ψ, the log-likelihood becomes

  `(X ; µ̂, Q, Ψ) = − (n/2) [ log{| 2π(QQ> + Ψ) |} + tr{(QQ> + Ψ)^{−1} S} ] .
Require that Q> Ψ^{−1} Q is a diagonal matrix.
The maximum likelihood estimates of Q and Ψ are obtained using
an iterative numerical algorithm.
  Ψ̂ = S − Q̂ Q̂> .
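The ML fit itself needs an iterative routine. As a simpler non-iterative sketch (the principal component method, used here as an assumption, not the slides' ML algorithm), one can take Q̂ = Γ1 Λ1^{1/2} from the spectral decomposition of S and keep only the diagonal of S − Q̂Q̂> as Ψ̂:

```python
import numpy as np

rng = np.random.default_rng(2)
# hypothetical two-factor data (p = 4, k = 2)
Q_true = np.array([[0.9, 0.0], [0.8, 0.1], [0.1, 0.9], [0.0, 0.8]])
X = rng.normal(size=(5000, 2)) @ Q_true.T + rng.normal(size=(5000, 4)) * np.sqrt(0.3)

S = np.cov(X, rowvar=False)
lam, gam = np.linalg.eigh(S)                  # eigenvalues in ascending order
idx = np.argsort(lam)[::-1][:2]               # keep the k = 2 largest
Qhat = gam[:, idx] * np.sqrt(lam[idx])        # Qhat = Gamma_1 Lambda_1^{1/2}
Psihat = np.diag(np.diag(S - Qhat @ Qhat.T))  # keep only the diagonal
```

With this choice the diagonal of Q̂Q̂> + Ψ̂ matches the diagonal of S exactly; the off-diagonal fit is only approximate.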
Error of Approximation
Example
This example uses a consumer-preference study from Johnson and
Wichern (1998). Customers were asked to rate several attributes
of a new product. The responses were tabulated and the following
correlation matrix R was constructed:
Attribute (Variable)              1     2     3     4     5
Taste                     1     1.00  0.02  0.96  0.42  0.01
Good buy for money        2     0.02  1.00  0.13  0.71  0.85
Flavor                    3     0.96  0.13  1.00  0.50  0.11
Suitable for snack        4     0.42  0.71  0.50  1.00  0.79
Provides lots of energy   5     0.01  0.85  0.11  0.79  1.00
Example

λ1 = 2.85 and λ2 = 1.81 are the only eigenvalues of R greater
than one.

  (λ1 + λ2 )/p = (2.85 + 1.81)/5 = 0.93
Example

       0.56    0.82
       0.78   −0.53
Q̂  =   0.65    0.75
       0.94   −0.11
       0.80   −0.54

                          0.02  0     0     0     0
                          0     0.12  0     0     0
Q̂ Q̂> + Ψ̂ = Q̂ Q̂>  +       0     0     0.02  0     0
                          0     0     0     0.11  0
                          0     0     0     0     0.07

      1.00  0.01  0.97  0.44  0.00
      0.01  1.00  0.11  0.79  0.91
   =  0.97  0.11  1.00  0.53  0.11
      0.44  0.79  0.53  1.00  0.81
      0.00  0.91  0.11  0.81  1.00
The communalities (0.98, 0.88, 0.98, 0.89, 0.93) indicate that the
two factors account for a large percentage of the sample variance
of each variable.
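The reconstruction above can be replayed directly from the printed loadings and specific variances; only numbers quoted in the example are used:

```python
import numpy as np

# loadings Qhat and specific variances from the example above
Qhat = np.array([[0.56,  0.82],
                 [0.78, -0.53],
                 [0.65,  0.75],
                 [0.94, -0.11],
                 [0.80, -0.54]])
Psihat = np.diag([0.02, 0.12, 0.02, 0.11, 0.07])

approx = Qhat @ Qhat.T + Psihat   # reproduces R up to rounding
h2 = np.sum(Qhat**2, axis=1)      # communalities, approx (0.98, 0.88, 0.98, 0.89, 0.93)
```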
[Figure: the car marks data in the plane of the first two factors;
the variables shown are service, value, price, security, easy
(handling), economic, sporty and look.]
Rotation
Example
Let us return to the marketing example of Johnson and Wichern
(1998). The loadings of the first factor and of the second factor
given in Table 16 are almost identical, making it difficult to
interpret the factors. Applying the varimax rotation we obtain
the loadings q̃1 = (0.02, 0.94, 0.13, 0.84, 0.97)> and
q̃2 = (0.99, −0.01, 0.98, 0.43, −0.02)> . The high loadings,
indicated as bold entries, show that variables 2, 4, 5 define
factor 1, a nutritional factor. Variables 1 and 3 define factor 2,
which might be referred to as a taste factor.
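Varimax itself is a small iterative algorithm. A standard sketch (this implementation is ours, not from the slides) rotates a loadings matrix Q by an orthogonal G chosen to maximize the varimax criterion:

```python
import numpy as np

def varimax(Q, n_iter=100, tol=1e-8):
    """Standard varimax rotation of a (p x k) loadings matrix (a sketch)."""
    p, k = Q.shape
    G = np.eye(k)
    d = 0.0
    for _ in range(n_iter):
        L = Q @ G
        # gradient step of the varimax criterion, solved via an SVD
        u, s, vt = np.linalg.svd(
            Q.T @ (L**3 - L @ np.diag(np.sum(L**2, axis=0)) / p))
        G = u @ vt
        d_new = np.sum(s)
        if d_new < d * (1 + tol):   # stop when the criterion stalls
            break
        d = d_new
    return Q @ G, G

# loadings from the consumer-preference example
Qhat = np.array([[0.56, 0.82], [0.78, -0.53], [0.65, 0.75],
                 [0.94, -0.11], [0.80, -0.54]])
L, G = varimax(Qhat)   # rotation leaves Qhat @ Qhat.T unchanged
```

Since G is orthogonal, LL> = Q̂Q̂>: the rotation changes the interpretation of the factors, not the fitted covariance.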
Factor Scores

  f̂i = Q̂> S −1 (xi − x̄)

or, when the analysis is based on the correlation matrix,

  f̂i = Q̂> R−1 zi ,

where

  zi = DS^{−1/2} (xi − x̄),   DS = diag(s11 , . . . , spp ),

and Q̂ is obtained from R.
If the factors are rotated by the orthogonal matrix G, the factor
scores have to be rotated accordingly.
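The score formula can be sketched as follows (hypothetical data; the one-factor Q̂ below is just an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))  # hypothetical data

xbar = X.mean(axis=0)
s = np.var(X, axis=0)             # diagonal of D_S
Z = (X - xbar) / np.sqrt(s)       # standardized observations z_i
R = np.corrcoef(X, rowvar=False)

# hypothetical loadings: one principal factor of R, Qhat = gamma_1 lambda_1^{1/2}
lam, gam = np.linalg.eigh(R)
Qhat = gam[:, -1:] * np.sqrt(lam[-1])

F = Z @ np.linalg.inv(R) @ Qhat   # rows are fhat_i = Qhat' R^{-1} z_i
# if the loadings are rotated by G, rotate the scores as F @ G
```

Because the zi are centered, the scores have mean zero by construction.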
ML − Factor Loadings

[Figure: maximum likelihood factor loadings of the variables X1–X14,
plotted pairwise in the (q1 , q2 ), (q1 , q3 ) and (q2 , q3 ) planes.]
ML − Factor Scores

[Figure: maximum likelihood factor scores of the observations,
plotted pairwise in the (f1 , f2 ), (f1 , f3 ) and (f2 , f3 ) planes.]
PC − Factor Loadings

[Figure: factor loadings from the principal component method for the
variables X1–X14, plotted pairwise in the (q1 , q2 ), (q1 , q3 ) and
(q2 , q3 ) planes.]
[Figure: factor scores from the principal component method, plotted
pairwise in the (f1 , f2 ) and (f1 , f3 ) planes.]
153505650
144
149649
219
725
655 143
572
350
85666
348
854 549
589
289
7958343577
243
749
550
44
591
8571
294
800
580
508
274513
7
601
95
545
39 320
826
5
723
217
1008
502
584
78 83
173
679 120
626105
611
108
614
535
29
126
632110
616
1004
498
104
610 492
998 478
984405
911
400
906 572
348
85466 289
795 405
911
400
906
243
749
350
856 589
83
478
984 577
71
580
74
584
78583
77
549
43
513
7
545
39 105
611
320
826
601
95
108
614
120
626
535
29
294
800
110
616
550
44
1004
498
591
85
104
610 723
217
492
998 173
679
5081008
502
2 691 126
632
563 202 794
57 708 288849
590
84
343 753
247
594
88
565
59
334
840
579
73
331
837
603
97
748
242
335
841 248
754
185
691
295
801 323
829
209
715
518
544
3812
585
79
542
36
206
712 523
17
516
10
543
37
586
80
328
834
519
13
617
111
517
11628
122
1002
627
121
520496
211
717123
629
521
1415125
631
528 106
612
1006
500 137
643
134
640 135654
641
647
141
489
995 148366899
872 399
905
368
874
386
892
393
374
880413
919 57
563 202794
413708
919 288753
247
565
59
368
874 248
754 331
837
748
242590
84523
17 323
829
617
111
334
840
399
905
579
73
585
79
386
892
586
80 135
641
335
841
366
872
328
834
519
13
393
899
520
14
518
12
517
11
1002
496
516
10
521
15
543
37
374
880
209
715
594
542
206
712
295
801
343
849
544
38528
88
36
1006
500 137
643
134
640
603
97
106
612 211
717
647
141
489
995 185
503654
148628
122
627
121123
629
125
631
286
792351 560
857 54
558
52 329
835
551
45
87
336
842
574
68 578
339
845
593
566
576
706072
270
776
341
847
777
271
340
846
213
719318
824
1009
503
522
16
208
714
324
830
568
62
317
823
316
822
526
20
22538
32
107
613
1003
524
18497
529
23
630
124
534
28541
35 128
634
139 651
645
130
636 145 389
895
387
893
388
894 351
857
566
60286
792574
68
576
70
568
62
560
329
83554
541
35 558
388
894
534
28
52
522
16
389
895
578
72 339
845
336
842
387
893324
830
524
18
777
271341
847
526
20
340
846
22
538
32
551
45
593
270
776
52987
1003
497
23
317
823
318
824
107
613
316
822
213
719
208
714130
636128
634
139 1009
645 145
651
630
124
255 859
761 333557
839
573
67 51 337
843
552
46 298
804
554
481012 525
19 531
311
81725
533
27 353 573
67 255 751
761 525 557
1983951 913
333 337
843298
804
533
27
311
817531
25
552
46
353 346
852 338
844
553
47
347
853567
61
245
751
556
50
506 9716
515
215
210
718
212 633
127
530
24
540
532
2634
539
33
527
490
996 407
913
385
891 415
921 859 415
921 567
61539
245385
891
33 346
852
347
853532
407
26 537
338
844
540
34
215 9 48718
553554
47
490
996
530
24
515
556
50
527 212 1012
210
716506 633
127
256
762
287 862
793 575
69 246
752 721 21
537
31 491
997
648
142 375
881 287
793 256
762 246
752 575
69 721
375
881
31 21
491
997 648
142
356 356
862
−2
−2
332
83855
561 55
561 332
838
355
861 555
49 355
861 555
49
−2 −1 0 1 2 −3 −2 −1 0 1 2
f1 f3
PC − Factor Scores
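The factor scores plotted here come from a principal-component analysis. As a minimal sketch (assuming standardized data; the data matrix and dimensions are illustrative, not the bank-note data), the scores are the projections of the standardized observations onto the leading eigenvectors of the empirical correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))          # illustrative data matrix (n x p)

# Standardize each column, so PCA acts on the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Eigendecomposition of the empirical correlation matrix R = Z'Z / n.
R = Z.T @ Z / Z.shape[0]
eigvals, eigvecs = np.linalg.eigh(R)

# Sort eigenpairs by decreasing explained variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Factor scores: projections on the first k components (columns f1, f2, f3).
k = 3
f = Z @ eigvecs[:, :k]
```

By construction the score columns are uncorrelated and the variance of the l-th column equals the l-th eigenvalue, which is why the scatter plots of (f1, f2, f3) display the data in its directions of largest spread.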
PF − Factor Loadings
[Figure: factor loadings of the variables X1–X14, plotted pairwise on the factor axes q1, q2, q3]
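The loadings relate the original variables X1–X14 to the factors q1–q3. In the principal-component method, a sketch of the loadings (illustrative data and names, assuming standardized variables as before) scales each eigenvector by the square root of its eigenvalue, so that every entry equals the correlation between a variable and a factor score:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 14))          # illustrative data, 14 variables as in the plot

# Standardize and decompose the empirical correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = Z.T @ Z / Z.shape[0]
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings of the first 3 factors: q_{jl} = sqrt(lambda_l) * gamma_{jl},
# i.e. the correlation of variable X_j with factor l.
k = 3
Q = eigvecs[:, :k] * np.sqrt(eigvals[:k])
```

Plotting the rows of Q on the (q1, q2, q3) axes, as in the figure, shows which variables each factor picks up: a variable lying near +1 or −1 on an axis is strongly correlated with that factor.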
PF − Factor Scores
[Figure: pairwise scatter plots of the factor scores (f1 vs f2, f1 vs f3, f3 vs f2); observation labels omitted]
937
437
943 466
972
487
993
978
472977
471458
964456
962
431
937 4
9
437
94336
42
364
870
920
414
953 377
883
f2
447
f2
193
699 452
958
394
900
453
959
928
422 3
8
485
99179
85 193
699 452
958
453
959447
394
900
928
422 379
885
605
99
281
787 261
767
604 264 666
203
709
770
98 766
260 160 238
744
232
738
230
736
228
734
161
667 191
697
190
696
730
224 166
672
396
902
378
884
391
897954
449
955
983
477
403
909
397
903
409
915
48
446
952
395
901
479
985
450
956
468
974 382
888
457
963
413
919 420
926
930
424
947
441
380
886
445
951
440
946 4
9
429
935
438
944
430
936933
427
25
31
418
924
923
417
426
932 203
709 190 991
191 696
697 485
281
787 605
99
238
744
232
738
230
736 228
734
933
427
604
98425
931
730
224
983
477
468
974
457
963
930
424479
985
429
935
448
954
446
952
449
955
450
956
420
926
430
936
923
417
426
932
396
902
395
901
391
897261
767
397
903
947
441
438
944
378
884
382
888
403
909161 666
667
264
770
380
886
260
766
409
915
413
919
445
951
166
672
440
946
418
924 160
284
790 257
763 267
773
292
798
280 657
786 159
665304
810
169
675
235
741
188
694
200
706
223
729189
695
170
676 390
896
384
890
383
889475
981
402
908
476
982 416
922
404
910
478
984405
911
439
945406
912 200 759
706 257
763
189
695284
790
304
810 292 811
798
188
694 786 235
280741 223
729475
981 476
982 416
922
478
984 405
911
439
945402
908
4
9
267
773
404
910169
675
390
896 06
12
159
665
384
890
383
889
170
676
181
687
183
689 305
811
307
813
780
274
547
41 151221
727
345
851
154
660201
707 192
698
168
674
156
662
253
759
165
671 374
880 398
904
407
913387
893
393
899 401
907 201
707 192
698
253 345
851 547
41 305 307
813
780
274 221
727 181
687
183
689168
674398
904
401
907
165
671
393
899 407
913
387
893 156657
374
880
662 151
154
660
180
686 5
511 276
782
783
277 197
703
101
607
152
658 237
743300
806 375
881389
895
386
892399
905
400
906415
921 300703
806197 5
511 276
782
783
277 237
743 101
607
1
6 80
86 389
895
399
905
415
921
400
906 386
892 152
658
282
788179
685
278
784
4781
275 56705
176
682 133
639
102
608
199 252
758 388
894
385
891 56705
199758
252
4784
278
282
788 275
781 176
682 102
608
179
685 132881
133
639
388
894
385
891 375
0
100
606 562 155
661
309
815 653
147
132
638
1001
495
312
818
618
112
325
831 678
172
523
17
231
737 250
756 562 250
756 325
831 100
606
523
17
231
737 309
815
618
1121001
495
312
818 678
172 638 155
661
653
147
184
690
182
688 3 510
509 308
814
218
724
596
90 306
812
546
40
602
96
266
772
564
291
797
279
785
58
720
214315
821 326
832
198
704
296
802
153
659
131
637
319
825
1000
494
149
655
236
742
150
656
1005
499303
809
157
663
493
999
536
30
650
146
652
144
103
609
327
833
623
117677
171757
251
301
807
570
64
750
244755
249 352
858
753
247 352 564
858 58
301
807198757
704 251
753
247
755
249
570
750
24464303
809 510
291
797
546
40 326
832
81802
296
327
833 236
742
308
814
509
306
812
279
785 3
602
96 90536
720
214
596 315
821
30
103
609
623
117
319
825 184
690
182
688
1005
499
218
724
1000
494
266
772
493
999677
171 637
131 157
663 150
656
153
659 650
146
652
144
342
848 571
65178
684587
680
174
293
79981
548
42 321
827
314
820
216
722322
828
129
635
138
644
545
39 1002
140
646496
8617
109
615
514
207
713111
105
611
75
581
115
621
118
624
344
850
136
642 239
745
520
14
143
649
119
625
620
114
323
829
728 330
836
1004
110
616498
525
222
134135
640641
310
816 19
120
626565
569
522
1663
240
746
572
535
320
82629
116
622
104
610 66
59
582
76
299
805
248
754
299
805
572
66
565
59
571
65
330
836
344
850
239
745
569
63
240
746
342
848
248
754
587
293
799 75
581
548
42 321
827
322
828
323
829 178
684
520
14
545
39
522
514
816
216
722
525
207
71319118
624
320
826
115
621
314
820
680
174
1002
496
119
625
617
111
120
626 728
1004
498
222
535
29 620
114
109
615
105
611
110
616
116
622
104
610
310
816 136
642
135
641
134
640 140 655
129
635
138
644646 149143
649
1010 595
186700
5041011
692 89
512
194 220
726
6592
86
273
779
600
94
175
681
778
272 108
614
297
803
354
860683
177
559
549
43
173
67953
350
856137
643
313
819
588
82
589
83
106
612
550
44
209
715
528
577
71
513
107
613
128
634
294
8007580
22
579
73
349
855 74
521
15
1007
501
113
619
290
796
148
654
526
57
202
708
56320241
747
302
808
328
834
1003
497
348
854
1006
500
130
636489
995
311
817
243
749
584
78
583
77
288
794
524
18
289
795
145
651585
79331
837
492
998
538
32
586
80329
835
566
60
576
70
356 859
862 353 353
859 354
860
356
862
57
563 348
854350
856
349
855 290
796
288
794
289
795
202
708 331
837241
747
302
808
243
749 559
194
700
329
835
566
60
576
70 53 6582
580
74
577
71
579
589
8373
588
8276
297
803
512513
586
80328
834
549
43
584
78
550
44
294
800
585
79
683
592
786
177
778
272
583
77 521
15
175
681
595
89
273
779
600
94
209
715
528
526
20
524
1822
538
32
186
692311
817
1003
497
173
679 108 995
614
1007
501
220
726
113
619
1006
500
313
819106
612
107
613 489 137
643
130
636
4921011
998 128
634
1010
504 148
654
145
651
505 285
791
195
701 723
217
597
91 334
840
599
93
219
725
591
85 206
544
38
712
213
719
590
84
543
37 351
857
318
824
647
208
141
714 578
72
519
13574
324
83068
529
533
2723
748
242
531
518
1225
516
10541
35
534
28
567
61 245
751 351
857 285
791
574
68748
242
195
701
245
751567
61590
84 334
840
578
72519
13
518
12
516
10
591
85544
38
324
830206
712533
599
93
543
3727
534
28
213
719
318
824
208
714 529
531
597
912523
541
35
723
217 219
725 505 647
141648
1 598
185
691
1008
502
507 594
92 84588 542
335
841
603
97
336
842
601
95
339 36
211
717
560
54
777
271
551
45 316
822
317
823
139
645
295
801
215
721 648
142 532
530
2426
540
34
568
527
2162
573
67 246
752
355
861
539
33 355 761
861 67 752
573 246 560
5462841
568 335
295
801
336
842
551
45 594
88
777
271
507
339
845 542
36
197532
603 316
822
598
92
26 211
717
317
823
530
24
540
539
601
9533
215
7213 4
185
691
527
21 502 645
1008 139 142
341
847
340
846
593
87
337
843 210
716
558
52255
761
343
849
552
46
718
212
553
47 298
804
557
51 517
11
537
31575
69
55
561
347
853 490
996 255
55 852
561 575
347
85369 558
343
84952
51 804
557 298517
11
552
46341
847
340
846
593
87
337
843
553
47 270537
31210
716
718
212 490
996
1009
503 508
126
632
627
121 2 776
270
338
844333
839
286
792 556
50
256
762 346
852
515
9 491
997 256
762 346
286 839
792 333 556 338
844
50 515
92776
508 491
997
1009
503
627
121126
632
628
122
123
629
125 1012
631 554
48 554
48 628
122123
629
125
631
−2
332
838
−2
630
124633
127506 287
793 287
793 332
838 1012
506 633
127
630
124
555
49 555
49
−2 −1 0 1 2 −2 −1 0 1 2
f1 f3
628
122
627
121 150
656 648
142
152
658
151
657 154
660 653
147
153647
659
1010
5041011
1009 160506661
503666
155140
646
129
635128
634
138
644
141
139
645
157
663
156636
662 375
881
502 1012
505
1008 131
637137
643
136
642 130 491880
997
490
996 374407
913
159
665133638
639 134135
640641
132 492
998
489
995 386
892
387
893
1
219
725 384
890 385
891
389
895 399
905
388
894
406
912
258 668
764
259
765
265
771
158767
664 260
766
261264
770
373
879 267 667
773 161266675
772 169
106
612
170
676
166
672
108
614
313
819107
613
165
671678
677
172
171
408
914
493
999
378
884
377
883
409
915
376
882397
903
8 380
886
3889
383
79
85
390
896
440
946
403
909
413
919 393
899
402
908
398
904
382
888 401
907
400
906
404
910
415
921
405
911
262
768
167
673
163
669 372 691
878 185 220
726
723
217 168
674
109
615
105
611 210
716
1006
500
718
212
113
619
110
616366
872
1007
501 527
21
367
873368
874
443
949 396
902
950
444 395
901 947
441
381
887
445
951 439
945
418
924
438
944
263162
769 164
670
689
181
687 184
690
183688
182 369692
875
371
877 218
724 173
679 1005
499
211
717
310
816
1000
494
1001
495
314
820
1003
497
620
114
116
622
104
610
115
621317
823
215
721
311
817
530
410
916540
34
24
541
35
363
869
529
23 364
870
537
31357
863
539
33 391
897
948
442
927
421928
422
394
900
953
447
918
412 920
414
446
952
4942
9 48
54
450
956
449
955
436 416
922
419
925
923
417
426
932
478
984
430
936
180 876
370 186 598 92
680
174
101
607
597
91 309
815312
818
315
821
601
95
319
825
270
776213
719
617
111 1004
498
119
625
528
728
22
222
1002
496
318
824316
822
535
29524
526
2018538
32
531
25
534
28362
868
392
898
555
49
532
26
358
864 452
958
454
960 917
411
453
959 479
985 420
926
437
943
429
935
476
982
268 686
774 175
681618
112 118
624
208
103
609
714
623
536
30
117 120
626533
27 435
941
542
36 48 866
360 423
929
480
986 983
477
f3
595
89 102
608
2727
508
273
779 221
599
93209
715543
37320
826
521
15
525
19
365
871
361
867 359
865 462
968434
940
461
967 451
957
456
962
455
961 930
424
428
934
475
981
0
187
693 1685
179
596
90
507 600
94
720
214
178
684 216
722206
712
603
594
8897
207
713
223
729 777
271
341
847
340
846 522
16 554 515
9970460
966
463
969
464 468
974 457
963
432
938
431
937
459
965
226
732775
269
225
731509 780
3 733
274
100 814
606 228
734
602
30896 730
224
778
272 339
845
337
843
544
593
87
177838
338
844
545
39
514
683 520
14
553
47
231
737
523
17
236
552
74246324
830
556
50 517
11 971467
973 458931
964 425
933
427
604
98 227
307
813 176
682
592
86
306
812
232
738 591
85
235
741 322323
828829
321
827
336
842 237
743
551
45
513
7328
834 583
77
980
474
516
518
1210
298
804 465
473
979 433
939
977
471
466
972 469
975
605
99
234
740 275
781 279
785
230
736
238
744 549
43 325
831
550
44800
335
841
334
840 294 519
295
80113584
78 978
472470
976
487
993
281 735
229739 297
803 332
838
787 233305
811
5
511
282
788
276
782
280
786
278
7846 548
783
277
512 42296
802326
832
588
82
590
589
8384
327
833
577
71
560
54
73
333
839
558
527578
75
581
580
579 586
557
343
8495180 568
585
79
4582
76
72 62 482
988
483
989
488
994
485
991
481
987
283
789 510
4 587
293
799
291
797 81559
53 808 748
242567
61575
69 486
992
284711
790
292
798
342
848
547
195
701
194
700
41791546
40 188
694 303
809
344
850 302
569
63
240
746 566
241
747
574
6860
329
835
576
243
749
330
836 70
346
852347
853 246 990
752
245
751 484
204
710
205 285 304695
810 286
792 239
745
570
64
565
59 3
8 248
754
31
37
257
763 65 696
571 190345
851 189 287
793
290
796
202
708
255
761 755
249
289
795
750
244
288
794
256
762
757
251
250
756 753
247
573
6755
561
350
856 351
857 252
758
203 760
709 191
697 198
704 253572
192
698
349
855
759 66
348
854
−2
196
702 199
705
254193
699
197
703
562
56 354
860 355
861
564
58
201 563
707
200
706 57807
300
806 301805
299 356
862
352 859
858 353
−2 −1 0 1 2
f1
Cluster Analysis
Application in Medicine
Application in Biology
W N Central 4
S Atlantic 5
E S Central 6
W S Central 7
Mountain 8
Pacific 9
Data Mining
Proximity
dij = ‖xi − xj‖2 ,  xi , xj ∈ Rp
Mannheim distance
      0.0  1811.0  108574.0
D =        0.0     110381.0
                   0.0
Maximum distance
      0.0  669.0  49044.0
D =        0.0    49455.0
                  0.0
SMSdisthealth05
Applied Multivariate Statistical Analysis
Cluster Analysis 12-18
Projects
Example
Consider a data matrix X with column means x̄k. Define the indicator variable yik = I(xik > x̄k), i = 1, . . . , n. How do these "deviations from the means" cluster?
M   16−29   30−44   45−64   65−74   75+
N     683     596     705     295    99
Y      21      32      70      43    19

F   16−29   30−44   45−64   65−74   75+
N     738     700     847     336   196
Y      46      89     169      98    51
xik = xjk = 1,
xik = 0, xjk = 1,
xik = 1, xjk = 0,
xik = xjk = 0.
a1 = Σ_{k=1}^{p} I(xik = xjk = 1),
a2 = Σ_{k=1}^{p} I(xik = 0, xjk = 1),
a3 = Σ_{k=1}^{p} I(xik = 1, xjk = 0),
a4 = Σ_{k=1}^{p} I(xik = xjk = 0).
The following proximity measures are used in practice:

dij = (a1 + δ a4) / {a1 + δ a4 + λ (a2 + a3)}
Name                   δ    λ     Definition
Jaccard                0    1     a1 / (a1 + a2 + a3)
Tanimoto               1    2     (a1 + a4) / {a1 + 2(a2 + a3) + a4}
Simple Matching (M)    1    1     (a1 + a4) / p
Russel and Rao (RR)    –    –     a1 / p
Dice                   0    0.5   2 a1 / {2 a1 + (a2 + a3)}
Kulczynski             –    –     a1 / (a2 + a3)
Example: Let us consider a binary data set computed from the car data set:

yik = 1 if xik > x̄k , and 0 else,   for i = 1, . . . , n; k = 1, . . . , p.
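As a sketch of how the counts a1, . . . , a4 turn into similarity coefficients, here is a small Python example (the quantlets accompanying these slides are not Python; the vectors x and y below are made up for illustration):

```python
import numpy as np

def binary_proximity(x, y, delta, lam):
    """General coefficient d_ij = (a1 + delta*a4) / {a1 + delta*a4 + lam*(a2+a3)}."""
    x, y = np.asarray(x, dtype=bool), np.asarray(y, dtype=bool)
    a1 = int(np.sum(x & y))      # x_k = y_k = 1
    a2 = int(np.sum(~x & y))     # x_k = 0, y_k = 1
    a3 = int(np.sum(x & ~y))     # x_k = 1, y_k = 0
    a4 = int(np.sum(~x & ~y))    # x_k = y_k = 0
    return (a1 + delta * a4) / (a1 + delta * a4 + lam * (a2 + a3))

x = [1, 1, 0, 0, 1]              # hypothetical binary rows
y = [1, 0, 0, 1, 1]
jaccard  = binary_proximity(x, y, delta=0.0, lam=1.0)
tanimoto = binary_proximity(x, y, delta=1.0, lam=2.0)
matching = binary_proximity(x, y, delta=1.0, lam=1.0)   # simple matching
print(jaccard, tanimoto, matching)
```

Choosing (δ, λ) recovers Jaccard, Tanimoto and simple matching as in the table; Russel–Rao and Kulczynski do not fit this two-parameter scheme (the "–" entries) and need their own formulas.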
Example
French railway metric
Let (X , d) be a metric space (France), and fix xh ∈ X (Paris). Set
r = 1 in (1) and define a new metric dh on X by letting
dh(xi, xj) := 0            if xi = xj ,
              dih + dhj    otherwise.
Example
Mannheim metric
Let (X , d) be a metric space (Mannheim). Set r = 1 in (1) and
define a new metric dm on X by letting
dm(xi, xj) := dij

[Figure: path from A = 0 to B = (x1, x2): the direct Euclidean distance L2 vs. the Mannheim (grid) path LM.]

L2 = (x1² + x2²)^{1/2}
LM = LMannheim = |x1| + |x2| = L1 metric
Example
Karlsruhe metric
Let (X , d) be a metric space (Karlsruhe). Set r = 1 in (1) and
represent xi , xj in polar coordinates: xi = (dih , ϕih ), xj = (djh , ϕjh ),
where dih , djh are the distances from a fixed point xh and ϕih , ϕjh
the angles from a fixed direction. Define:
dk(xi, xj) := min(dih, djh) · δ(xi, xj) + |dih − djh|    if 0 ≤ δ(xi, xj) ≤ 2 ,
              dih + djh                                  if δ(xi, xj) > 2 .
Source: Wikipedia
Source: Proceedings of the 14th International Workshop on Graph-Theoretic Concepts in Computer Science,
Example
A = diag(s⁻¹_{X1 X1}, . . . , s⁻¹_{Xp Xp}) gives

d²ij = Σ_{k=1}^{p} (xik − xjk)² / s_{Xk Xk} .
Mahalanobis Distance
Take A = Σ−1 or S −1 in (2):
Example
X ∼ N2((0, 0)ᵀ, Σ),   Y ∼ N2((4, 0)ᵀ, Σ),

Σ = ( 1  ρ )
    ( ρ  1 ) ,   ρ = 0.9

[Figure: scatterplot of the two samples, centered at x1 = 0 and y1 = 4. SMSmdmv]

Calculate the D with A = I_100^{-1} and A = S_100^{-1}.
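The effect of switching from A = I to A = S⁻¹ can be sketched numerically; the example below simulates the two populations above (hypothetical seed and sample size) and compares the Euclidean and Mahalanobis distances between the group means:

```python
import numpy as np

rng = np.random.default_rng(0)        # hypothetical seed
rho, n = 0.9, 100
Sigma = np.array([[1.0, rho], [rho, 1.0]])
L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((n, 2)) @ L.T                 # N2((0,0)', Sigma)
Y = rng.standard_normal((n, 2)) @ L.T + [4.0, 0.0]    # N2((4,0)', Sigma)

mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
S = np.cov(np.vstack([X - mu_x, Y - mu_y]).T)         # pooled covariance estimate
diff = mu_x - mu_y
d_eucl = np.sqrt(diff @ diff)                         # A = I
d_maha = np.sqrt(diff @ np.linalg.solve(S, diff))     # A = S^{-1}
print(d_eucl, d_maha)
```

With ρ = 0.9 the mean difference lies across the main axis of correlation, so the Mahalanobis distance is considerably larger than the Euclidean one.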
If X is a contingency table:
row i is characterized by the conditional frequency xij / xi• ,
column j is characterized by the conditional frequency xij / x•j ,
where xi• = Σ_{j=1}^{p} xij , x•j = Σ_{i=1}^{n} xij , x•• = Σ_{j=1}^{p} Σ_{i=1}^{n} xij .
variable description
X1 : land area (land)
X2 : population 1985 (popu 1985)
X3 : murder (murd)
X4 : rape
X5 : robbery (robb)
X6 : assault (assa)
X7 : burglary (burg)
X8 : larceny (larc)
X9 : auto theft (auto)
X10 : U.S. states region number (reg)
X11 : U.S. states division number (div)
MVAclususcrime
Q-Correlation Distance
Cluster algorithms
Spectral Clustering
Group Building
Cost of partitioning Y into groups P and Q, Y = P + Q:

Cut(P, Q) = Σ_{xj∈P, xk∈Q} djk
cont’d
Ad.2
One has to show: (In − V^{−1/2} D V^{−1/2}) V^{1/2} 1n = 0n .
By definition of V it follows that 0 and 1n are the eigenvalue and eigenvector of In − V^{−1} D, respectively:
(In − V^{−1} D) 1n = 0n
(V^{1/2} − V^{−1/2} D) 1n = 0n
Ad.3
Follows from (1)
[Figure: The 8 points example, brand loyalty vs. price consciousness. MVAclus8p]
Example
Build Laplacian matrix Z for the 8 points
Z =
  3.26  1.95  4.07  3.47  1.06  1.91  0.71  0.33
  1.95  2.47  1.46  0.90  2.40  2.28  0.13  0.10
  4.07  1.46  3.56  4.22  1.01  2.14  1.40  0.71
  3.47  0.90  4.22  3.22  0.62  1.55  1.64  0.77
  1.06  2.40  1.01  0.62  2.42  3.24  0.24  0.30
  1.91  2.28  2.14  1.55  3.24  3.23  0.83  0.87
  0.71  0.13  1.40  1.64  0.24  0.83  2.17  2.41
  0.33  0.10  0.71  0.77  0.30  0.97  2.41  1.73
[Figure: the 8 points example, brand loyalty vs. price consciousness.]
Example
Calculate Cut and Ncut for a division of x1, . . . , x6, given by a similarity matrix D, into 2 subsets P and Q, where |P| = 1 or
P = {x1, x2, x3}, {x1, x3, x4}, {x1, x4, x5}, {x1, x5, x6}, {x1, x2, x4}, {x1, x2, x5}.

D =
  1.0  0.8  0.6  0.0  0.1  0.0
  0.8  1.0  0.8  0.0  0.0  0.0
  0.6  0.8  1.0  0.2  0.0  0.0
  0.0  0.0  0.2  1.0  0.8  0.7
  0.1  0.0  0.0  0.8  1.0  0.8
  0.0  0.0  0.0  0.7  0.8  1.0
Example
Table: Cut and Ncut for subsets: |P| = 1 (upper panel) and
P1 = {x1 , x2 , x3 }, P2 = {x1 , x3 , x4 }, P3 = {x1 , x4 , x5 }, P4 = {x1 , x5 , x6 },
P5 = {x1 , x2 , x4 } and P6 = {x1 , x2 , x5 } (lower panel).
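A direct computation of Cut and Ncut for the similarity matrix D above can be sketched as follows (assuming, as one common convention, Ncut(P, Q) = Cut(P, Q)(1/vol(P) + 1/vol(Q)) with volumes that include the diagonal self-similarities):

```python
import numpy as np

D = np.array([[1.0, 0.8, 0.6, 0.0, 0.1, 0.0],
              [0.8, 1.0, 0.8, 0.0, 0.0, 0.0],
              [0.6, 0.8, 1.0, 0.2, 0.0, 0.0],
              [0.0, 0.0, 0.2, 1.0, 0.8, 0.7],
              [0.1, 0.0, 0.0, 0.8, 1.0, 0.8],
              [0.0, 0.0, 0.0, 0.7, 0.8, 1.0]])

def cut(D, P):
    # sum of similarities crossing the partition
    Q = [k for k in range(len(D)) if k not in P]
    return sum(D[j, k] for j in P for k in Q)

def ncut(D, P):
    # Cut(P,Q) * (1/vol(P) + 1/vol(Q)); vol(.) sums all weights of the subset
    Q = [k for k in range(len(D)) if k not in P]
    vol = lambda S: float(D[S].sum())
    return cut(D, P) * (1.0 / vol(P) + 1.0 / vol(Q))

P = [0, 1, 2]                       # P = {x1, x2, x3}
print(cut(D, P), ncut(D, P))
```

For P = {x1, x2, x3} the only crossing weights are 0.1 and 0.2, so the cut is small, which is what makes this split attractive.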
Example
Build Laplacian matrix Z of the similarity matrix D
Z =
   1.5  −0.8  −0.6   0.0  −0.1   0.0
  −0.8   1.6  −0.8   0.0   0.0   0.0
  −0.6  −0.8   1.6  −0.2   0.0   0.0
   0.0   0.0  −0.2   1.7  −0.8  −0.7
  −0.1   0.0   0.0  −0.8   1.7  −0.8
   0.0   0.0   0.0  −0.7  −0.8   1.5
Example
Find eigenvalues Λ and eigenvectors Γ of Laplacian matrix Z
Γ =
  0.4   0.2   0.1   0.4  −0.2  −0.9
  0.4   0.2   0.1   0.0   0.4   0.3
  0.4   0.2  −0.2   0.0  −0.2   0.6
  0.4  −0.4   0.9   0.2  −0.4  −0.6
  0.4  −0.7  −0.4  −0.8  −0.6  −0.2
  0.4  −0.7  −0.2   0.5   0.8   0.9
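The Laplacian construction and its spectral bipartition can be sketched as follows (unnormalized Laplacian built from D with the diagonal self-similarities removed; the sign pattern of the eigenvector of the second-smallest eigenvalue gives the 2-way split):

```python
import numpy as np

D = np.array([[1.0, 0.8, 0.6, 0.0, 0.1, 0.0],
              [0.8, 1.0, 0.8, 0.0, 0.0, 0.0],
              [0.6, 0.8, 1.0, 0.2, 0.0, 0.0],
              [0.0, 0.0, 0.2, 1.0, 0.8, 0.7],
              [0.1, 0.0, 0.0, 0.8, 1.0, 0.8],
              [0.0, 0.0, 0.0, 0.7, 0.8, 1.0]])

W = D - np.diag(np.diag(D))          # off-diagonal similarities
V = np.diag(W.sum(axis=1))           # degree matrix
Z = V - W                            # unnormalized Laplacian, rows sum to 0
w, G = np.linalg.eigh(Z)             # eigenvalues in ascending order
fiedler = G[:, 1]                    # eigenvector of the 2nd-smallest eigenvalue
labels = (fiedler > 0).astype(int)   # sign pattern gives the two clusters
print(np.round(w, 2), labels)
```

The smallest eigenvalue is 0 with the constant eigenvector, matching the first column of Γ above; the split recovers {x1, x2, x3} vs. {x4, x5, x6}.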
Example
Figure: Parcellation results for the simulated data into 4 clusters by NCut
algorithm based on the Euclidean distance. Specclust
fMRI
wjk = max{Corrt(Yj, Yk), 0}    for ‖Xj − Xk‖ < C
wjk = 0                        otherwise
Figure: Parcellation results for the 1st subject’s brain into 1000 clusters
by NCut algorithm.
Example: orange spiral data
d(R, P + Q) =
δ1 d(R, P) + δ2 d(R, Q) + δ3 d(P, Q) + δ4 |d(R, P) − d(R, Q)|
δj weighting factors
Denote by nP = Σ_{i=1}^{n} I(xi ∈ P) the number of objects in group P.
Name               δ1                   δ2                   δ3                  δ4
Single linkage     1/2                  1/2                  0                  −1/2
Complete linkage   1/2                  1/2                  0                   1/2
Average linkage    1/2                  1/2                  0                   0
(unweighted)
Average linkage    nP/(nP+nQ)           nQ/(nP+nQ)           0                   0
(weighted)
Centroid           nP/(nP+nQ)           nQ/(nP+nQ)           −nP nQ/(nP+nQ)²     0
Median             1/2                  1/2                  −1/4                0
Ward               (nR+nP)/(nR+nP+nQ)   (nR+nQ)/(nR+nP+nQ)   −nR/(nR+nP+nQ)      0
where nP = Σ_{i=1}^{n} I(xi ∈ P) denotes the number of objects in group P.
Example:
x1 = (0, 0), x2 = (1, 0), x3 = (5, 5) and the squared Euclidean
distance matrix with single linkage weighting.
Recall:
D² =
   0   1  50
   1   0  41
  50  41   0
The algorithm starts with N = 3 Clusters.
P = {x1 }, Q = {x2 }, R = {x3 }.
d(R, P + Q) = ½ d(R, P) + ½ d(R, Q) − ½ |d(R, P) − d(R, Q)|
            = ½ d13 + ½ d23 − ½ |d13 − d23|
            = 50/2 + 41/2 − ½ |50 − 41|
            = 41

The reduced distance matrix is then
   0  41
  41   0 .
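The recursion can be checked numerically; the sketch below reproduces the worked example with the Lance–Williams weights for single linkage:

```python
import numpy as np

# squared Euclidean distances for x1 = (0,0), x2 = (1,0), x3 = (5,5)
D2 = np.array([[0.0, 1.0, 50.0],
               [1.0, 0.0, 41.0],
               [50.0, 41.0, 0.0]])

def lance_williams_single(dRP, dRQ):
    # delta1 = delta2 = 1/2, delta3 = 0, delta4 = -1/2, i.e. min(dRP, dRQ)
    return 0.5 * dRP + 0.5 * dRQ - 0.5 * abs(dRP - dRQ)

# merge P = {x1} and Q = {x2}; distance of R = {x3} to the new group P + Q
d = lance_williams_single(D2[2, 0], D2[2, 1])
print(d)
```

The single-linkage weights make the update collapse to the minimum of the two old distances, hence d(R, P + Q) = 41.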
Dendrogram
Figure: The dendrogram for the 8 points example, single linkage algorithm, Euclidean distance. If we decide to cut the tree at the level 10, we define three clusters: {1, 2}, {3, 4, 5} and {6, 7, 8}. MVAclus8p
Figure: Complete linkage algorithm on squared Euclidean distance for 8
point example with dendrogram. SMSclus8p
D(P, Q) = (1 / (nP nQ)) Σ_{i=1}^{nP} Σ_{j=1}^{nQ} d(xi, yj)
Centroid Algorithm
is very similar and uses the natural geometrical distance between R
and the weighted center of gravity of P and Q.
Figure: Average linkage algorithm on squared Euclidean distance for 8
point example with dendrogram. SMSclus8pa
Centroid Algorithm
[Diagram: distance from R to the weighted center of gravity of P + Q.]
Figure: Centroid algorithm on squared Euclidean distance for 8 point
example with dendrogram. SMSclus8pc
When two objects or groups P and Q are joined, the new group P + Q has a larger inertia IP+Q:

IP + IQ ≤ IP+Q
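The inertia inequality can be illustrated numerically (hypothetical random groups; inertia here is the sum of squared distances to the group's center of gravity):

```python
import numpy as np

def inertia(X):
    # sum of squared distances to the group's center of gravity
    return float(((X - X.mean(axis=0)) ** 2).sum())

rng = np.random.default_rng(1)               # hypothetical groups
P = rng.standard_normal((5, 2))
Q = rng.standard_normal((7, 2)) + 3.0
print(inertia(P) + inertia(Q) <= inertia(np.vstack([P, Q])))
```

The inequality always holds, since each group mean minimizes the within-group sum of squares for its own points.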
Figure: PCA for 20 randomly chosen bank notes MVAclusbank
Dendrogram for 20 Swiss bank notes 20 Swiss bank notes, cut height 60
(a) The dendrogram for the 20 bank notes, Ward algorithm. (b) PCA for 20 randomly chosen bank notes.
Figure: MVAclusbank
Ward dendrogram for French food; French food data, cut height 60
(a) The dendrogram for the standardized French food expenditures, Ward algorithm. (b) PCA for the standardized French food expenditures.
Figure: MVAclusfood
Clusters States
1 IL IN MI MO OH MA NJ GA NC TN VA
2 IA KS MN WI CT PR AL AR KY LA MD
MS OK SC WV AZ CO NV OR WA
3 NE ND SD ME NH RI VT VI GU AS MP
DE DC AK HI ID MT NM UT WY
4 NY PA FL TX CA
Figure: Cluster analysis of U.S. health data set using Ward algorithm and
Euclidean distance. SMSclushealth05
Table: The averages of the U.S. health data set within the 4 clusters.
SMSclushealth05
Figure: Plot of the first two principal components of the U.S. health
2005 data. SMSclushealth05
Ward dendrogram for US cereal data
Figure: Cluster analysis of U.S. cereal dataset using Ward algorithm and Euclidean distance. SMScluscereal
65 US cereals, cut height 15
Figure: Plot of the first two principal components of the U.S. cereal
dataset. SMScluscereal
k-means Clustering
Ŝ = argmin_S  Σ_{j=1}^{k} Σ_{i∈Sj} ‖xi − µj‖₂²    (23)
Standard Algorithm
Fix an initial set of centers {µj^(t)}_{j=1}^{k}, t = 1.
Assign each xi to the group Sj^(t) of its nearest center, then update:
µj^(t+1) = (#Sj^(t))⁻¹ Σ_{i∈Sj^(t)} xi
j
6 8
7
2
brand loyalty
4
0
5
-2
2
-4
-4 -2 0 2 4
price conciousness
Figure: SMSclus8km
US health data
Figure: SMScluskmhealth
US cereal data
Figure: SMScluskmcereal
Metric MDS for BCS quantlets; Metric MDS for MVA quantlets
[Figure: metric MDS projections of the quantlets, points marked by Cluster2, Cluster3 and Cluster4.]
[Figure: the 8 points example with pairwise distances annotated, brand loyalty vs. price consciousness.]
Spanning tree
Definition
A spanning tree is a connected, acyclic subgraph of G.
Try all possible spanning trees? NO!
Kruskal Algorithm
Prim Algorithm
Clustering into k-groups
Important in power networks, engineering, combinatorial optimization.
Be greedy!
Spanning tree
MST: Given connected graph G with positive edge weights, find a
min weight set of edges that connects all of the vertices.
Def.: A spanning tree of a graph G is a subgraph T that is connected and acyclic.

[Figure: minimum spanning tree for the 8 points example, brand loyalty vs. price consciousness.]
SMSclus8pmst2
Property: MST of G is always a spanning tree.
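Kruskal's greedy algorithm can be sketched with a union-find structure (hypothetical edge list; edges are sorted by weight and kept only if they join two different components):

```python
def kruskal_mst(n, edges):
    """Greedy MST: sort edges by weight, keep an edge iff it joins two
    different components (union-find). edges: list of (weight, u, v)."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    mst = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]  # hypothetical graph
mst = kruskal_mst(4, edges)
print(mst)
```

Greediness is safe here: the lightest edge across any cut always belongs to some minimum spanning tree.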
Source of slides: http://www.cs.princeton.edu/~rs/AlgsDS07/14MST.pdf
Summary:
Discriminant Analysis
Discriminant Rule
⋃_{j=1}^{J} Rj = Rp    (a partition)

Rj corresponds to the population Πj
ML Discriminant Rule
Density fj of population Πj .
Allocate x to Πj that gives the largest likelihood
Theorem
The rule minimizing the ECM is given by
R1 = { x : f1(x)/f2(x) ≥ C(1|2) π2 / (C(2|1) π1) }
R2 = { x : f1(x)/f2(x) < C(1|2) π2 / (C(2|1) π1) }
The ML rule is a special case of the ECM rule for equal
misclassification costs and equal prior probabilities πj .
Proof cont’d.
= −C(1|2) π2 + ∫ I(x ∈ R2) [−C(2|1) π1 f1(x) + {C(1|2) + γ} π2 f2(x)] dx

This is equivalent to

R2 = { x : f2(x)/f1(x) ≥ C(2|1) π1 / [{C(1|2) + γ} π2] }
Example
Suppose x ∈ {0, 1} and

Π1 : P(X = 0) = P(X = 1) = 1/2
Π2 : P(X = 0) = 1/4 = 1 − P(X = 1).
the sample space is the set {0, 1}.
Allocate x = 0 to Π1 and x = 1 to Π2 .
Hence R1 = {0}, R2 = {1}, R1 ∪ R2 = {0, 1}.
Example
Consider two normal populations
Π1 : N(µ1 , σ12 ),
Π2 : N(µ2 , σ22 ).
Then

Li(x) = (2π σi²)^{−1/2} exp{ −½ ((x − µi)/σi)² } .
Example (cont’d)
Note that L1(x) > L2(x) iff

(σ2/σ1) exp[ −½ { ((x − µ1)/σ1)² − ((x − µ2)/σ2)² } ] > 1 ,

i.e.

x² (1/σ1² − 1/σ2²) − 2x (µ1/σ1² − µ2/σ2²) + (µ1²/σ1² − µ2²/σ2²) < 2 log(σ2/σ1) .
Example
Suppose that µ1 = 0, σ1 = 1 and µ2 = 1, σ2 = 1/2:

R1 = { x : x < (1/3)(4 − √(4 + 6 log 2)) or x > (1/3)(4 + √(4 + 6 log 2)) } ,
R2 = R \ R1 .

[Figure: the two densities with allocation regions R1, R2, R1 along the x-axis from −3 to 3.]
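The cut points of R1 can be evaluated numerically and checked against the two likelihoods:

```python
import math

# allocation boundaries for mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 1/2
lo = (4 - math.sqrt(4 + 6 * math.log(2))) / 3
hi = (4 + math.sqrt(4 + 6 * math.log(2))) / 3

def L(x, mu, sigma):
    # normal density (likelihood)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

print(round(lo, 4), round(hi, 4))
# points outside (lo, hi) have L1 >= L2 and are allocated to Pi_1
print(L(0.0, 0, 1) >= L(0.0, 1, 0.5), L(1.0, 0, 1) < L(1.0, 1, 0.5))
```

The middle interval goes to Π2 because its density is narrower and taller around µ2 = 1, while the tails go back to the wider Π1 density.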
Hence x is allocated to Π1 (x ∈ R1) if L1(x) ≥ L2(x).
Theorem
(a) Suppose Πi = Np (µi , Σ). The ML rule allocates x to Πj ,
where j ∈ {1, . . . , J} is the value that minimizes the square
Mahalanobis distance between x and µi :
x ∈ R1 ⟺ αᵀ(x − µ) > 0 ,

(µ1 − µ2)ᵀ Σ⁻¹ {x − ½(µ1 + µ2)} > 0 ⇒ αᵀ(x − µ) > 0 .
is better if pii > p′ii for at least one i.
We call a discriminant rule admissible if there is no better one.
Example
20 randomly chosen banknotes, Su the pooled covariance estimate:

α̂ = Su⁻¹ (x̄1 − x̄2) = (−12.18, 20.54, −19.22, −15.55, −13.06, 21.43)ᵀ

x̄ = ½ (x̄1 + x̄2) = (214.79, 130.05, 129.92, 9.23, 10.48, 140.46)ᵀ
Example
Allocation regions for J = 3 groups:
h12(x) = (x̄1 − x̄2)ᵀ Su⁻¹ {x − ½(x̄1 + x̄2)}
h13(x) = (x̄1 − x̄3)ᵀ Su⁻¹ {x − ½(x̄1 + x̄3)}
h23(x) = (x̄2 − x̄3)ᵀ Su⁻¹ {x − ½(x̄2 + x̄3)} .
The ML rule is to allocate x to
Π1 if h12 (x) > 0 and h13 (x) > 0
Π2 if h12 (x) < 0 and h23 (x) > 0
Π3 if h13 (x) < 0 and h23 (x) < 0.
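The three-group ML rule is equivalent to allocating x to the group with the smallest squared Mahalanobis distance; a sketch with hypothetical group means and covariance:

```python
import numpy as np

def allocate(x, means, S_inv):
    # ML rule under Np(mu_j, Sigma): minimize the squared Mahalanobis
    # distance, equivalent to the pairwise h_jk(x) sign conditions
    d2 = [(x - m) @ S_inv @ (x - m) for m in means]
    return int(np.argmin(d2))

means = [np.array([0.0, 0.0]), np.array([3.0, 0.0]), np.array([0.0, 3.0])]
S_inv = np.linalg.inv(np.array([[1.0, 0.3], [0.3, 1.0]]))
print(allocate(np.array([2.5, 0.2]), means, S_inv))
```

With equal covariances, comparing the J(J − 1)/2 linear functions h_jk and comparing the J Mahalanobis distances pick out the same group.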
Probabilities of misclassification
Example
In the above classification problem for the Swiss bank notes we have the following situation:

                  predicted membership
                  genuine (Π1)   forged (Π2)
actual   Π1           100             1
         Π2             0            99

p̂21 = n21 / n1 .
Example
Suppose there are J = 2 groups and Yj ∈ Rnj .
Recall H = In − n⁻¹ 1n 1nᵀ , the centering matrix with H² = H, and calculate

Yjᵀ Hj Yj = Σ_{i=1}^{nj} (yj,i − ȳj)² ,
The total sum of squares

Σ_{i=1}^{n} (yi − ȳ)² = Yᵀ H Y = aᵀ Xᵀ H X a = aᵀ T a
can be now decomposed as
total SS = within SS + between SS
a> T a = a> Wa + a> Ba
Fisher's idea was to select an a that maximizes the ratio

aᵀ B a / aᵀ W a .
Theorem
The vector a that maximizes aᵀBa / aᵀWa is the eigenvector of W⁻¹B that corresponds to the largest eigenvalue.
Discrimination rule
We classify x into the group j for which a> x̄j is closest to a> x.
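Fisher's direction can be computed as sketched below (simulated two-group data with a hypothetical seed; a is the eigenvector of W⁻¹B for the largest eigenvalue, then a new point is classified by the closeness of projected means):

```python
import numpy as np

rng = np.random.default_rng(0)                      # hypothetical seed
X1 = rng.standard_normal((30, 2))                   # group 1 around (0, 0)
X2 = rng.standard_normal((30, 2)) + [3.0, 1.0]      # group 2 around (3, 1)
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
m = np.vstack([X1, X2]).mean(axis=0)

W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)              # within SS
B = 30 * np.outer(m1 - m, m1 - m) + 30 * np.outer(m2 - m, m2 - m)  # between SS

vals, vecs = np.linalg.eig(np.linalg.inv(W) @ B)
a = np.real(vecs[:, np.argmax(np.real(vals))])      # Fisher direction

# classify x into the group whose projected mean a' m_j is closest to a' x
x = np.array([2.8, 1.1])
j = int(np.argmin([abs(a @ x - a @ m1), abs(a @ x - a @ m2)]))
print(j)
```

For two groups B has rank one, so W⁻¹B has a single nonzero eigenvalue and a is proportional to W⁻¹(m1 − m2).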
[Figure: densities of projections of genuine and counterfeit bank notes by Fisher's discrimination rule. MVAdisfbank]
Note that the allocation rule (13.15) is exactly the same as the ML rule for J = 2 groups and for normal distributions with the same covariance. For J = 3 groups this rule will be …
Example
The resulting discriminant rule consists of allocating an
observation x0 to the genuine data if
Boston Housing
Example
Define groups according to the median value of houses Xe14 : in
group Π1 the value of Xe14 is greater than or equal to the median
of Xe14 and in group Π2 the value of Xe14 is less than the median of
Xe14 . Apply the linear discriminant rule (excluding Xe4 and Xe14 ).
                  True
                  Π1     Π2
Predicted  Π1    216     40
           Π2     34    216

Table: APER = 0.146 for price of Boston houses. MVAdiscbh
Boston Housing
Example
The APER is biased since we use the data twice. A more
appropriate measure of precision is the AER using the
leave-one-out technique.
                  True
                  Π1     Π2
Predicted  Π1    211     42
           Π2     39    214
Boston Housing
Example
Now define as in the cluster analysis chapter the groups via higher
quality of life and house, excluding Xe4 .
                  True
                  Π1     Π2
Predicted  Π1    244     13
           Π2      7    242
Boston Housing
Example
                  True
                  Π1     Π2
Predicted  Π1    244     14
           Π2      7    241
Boston Housing
Example
Figure: Discrimination scores for the two clusters created from the
Boston housing data. MVAdiscbh
Correspondence Analysis
Categorical scales
Attitudes, opinions and demographic characteristics, e.g.
gender, race, social class
Public health, ecology, education, marketing
Quality control: how soft to touch a certain fabric is, how
good a particular food product tastes, or how easy a worker
finds a certain task to be
Independence
The association between Z and Y is given by their joint distribution, the conditional distribution of Z given Y , or the conditional distribution of Y given Z .
Z and Y are independent if for all i and j:
πi|j = πij / π•j = πi• ,
πj|i = πij / πi• = π•j , or
πij = πi• π•j .
πij denotes the unknown true probabilities.
The sample relative frequencies are denoted by pij = xij /x•• , where
xij are the absolute frequencies and x•• is the sample size.
Measures of Independence
Sampling Distributions
The tests are often (not always) identical for all types of sampling.
Poisson sampling
(everything is random)
Multinomial sampling
(total number of observed subjects is fixed)
Independent multinomial sampling
(number of subject in each row or column is fixed)
Under independence, π̂ij = pi• p•j , so the expected counts are
Eij := x•• π̂ij = (xi• x•j )/x••    (25)
and the Pearson statistic is
t = Σ_{i=1}^{I} Σ_{j=1}^{J} (xij − Eij )² / Eij .    (26)
MVAcorrEyeHair
Example
Original table and values “expected under independence”.
MVAcorrEyeHair
Example
Contributions to Chi-Square statistic and its sum
> (E - X) ^ 2 / E
[,1] [,2] [,3] [,4]
[1,] 19.3459095 1.5214189 0.005621691 34.234171
[2,] 0.2278650 1.8313775 0.726334723 4.963290
[3,] 3.8168794 0.1190937 5.210886943 0.375399
[4,] 9.4210781 3.8004599 2.993334093 49.696722
> Chi2
[1] 138.2898
MVAcorrEyeHair
The Pearson chi-square statistic t has asymptotically a χ²(9) distribution.
The critical value (α = 0.05) is 16.919.
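The computation of (25) and (26) takes a few lines of NumPy; the 2 × 2 table below is a made-up illustration, not the eye/hair data:

```python
import numpy as np

# small illustrative contingency table (absolute frequencies x_ij)
X = np.array([[10., 20.],
              [30., 40.]])
xi = X.sum(axis=1)          # row totals x_i.
xj = X.sum(axis=0)          # column totals x_.j
n = X.sum()                 # sample size x_..
E = np.outer(xi, xj) / n    # expected counts E_ij = x_i. x_.j / x_.. , eq. (25)
t = ((X - E) ** 2 / E).sum()  # Pearson chi-square statistic, eq. (26)
```

The degrees of freedom are (I − 1)(J − 1), here 1.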
Example
> X3X4
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 0 0 0
[2,] 0 7 1 0 0
[3,] 2 2 19 5 0
[4,] 0 0 6 11 0
[5,] 0 1 1 4 5
> Chi2_X3X4
[1] 87.75024
MVAcorrCar
Critical value is 26.296 (α = 0.05)
Example
> X3X13
[,1] [,2] [,3]
[1,] 2 0 0
[2,] 8 0 0
[3,] 26 0 2
[4,] 8 5 4
[5,] 2 6 3
> Chi2_X3X13
[1] 31.32032
MVAcorrCar
Critical value is 15.507 (α = 0.05)
Example
> X4X13
[,1] [,2] [,3]
[1,] 2 0 1
[2,] 10 0 1
[3,] 21 1 5
[4,] 13 5 2
[5,] 0 5 0
> Chi2_X4X13
[1] 33.60534
MVAcorrCar
Critical value is 15.507 (α = 0.05)
Correspondence Analysis
Example
The French “baccalauréat” data: region (e.g. Ile-de-France) and
modality (e.g. Philosophy)
Question: Do students in certain regions prefer certain modalities
or vice versa?
Percentages of the eight modalities for the Lorraine region:
A B C D E F G H
Example
Percentage of the eight modalities for all regions
A B C D E F G H
Example
n types of companies and p locations.
Contingency Table
Example
Suppose that n = 3, p = 3 and

          Frankfurt  Berlin  Munich  (total)
Finance       4        0       2        6
Energy        0        1       1        2
HiTech        1        1       4        6
(total)       5        2       7       14
Example
Location index:
sj ∝ Σ_{i=1}^{n} ri (xij / x•j )   with (company) weight vector r = (r1 , . . . , rn )⊤
Company index:
ri* ∝ Σ_{j=1}^{p} sj* (xij / xi• )   with (location) weight vector s* = (s1* , . . . , sp* )⊤
Simultaneously find r = (r1 , . . . , rn )> and s = (s1 , . . . , sp )>
such that proximity (distance) between ri and sj would
indicate positive (negative) association between the i th row
and the j th column.
χ2 Decomposition
C √b = 0,   C⊤ √a = 0
C = ΓΛ∆⊤ (singular value decomposition)
R = rank(C) ≤ min(n − 1, p − 1)
Λ = diag(λ1^{1/2} , . . . , λR^{1/2} ),   λj the eigenvalues of CC⊤
cij = Σ_{k=1}^{R} λk^{1/2} γik δjk
tr(CC⊤) = Σ_{k=1}^{R} λk = Σi Σj cij² = t
Define
rk = A^{−1/2} C δk
sk = B^{−1/2} C⊤ γk
and observe
rk = λk^{−1/2} A^{−1/2} C B^{1/2} sk
sk = λk^{−1/2} B^{−1/2} C⊤ A^{1/2} rk
r̄k = (1/x•• ) rk⊤ a = 0
s̄k = (1/x•• ) sk⊤ b = 0
and
Var(rk ) = λk /x•• = Var(sk )
λk / Σi λi is the proportion of variance explained by factor k.
Ca (i, rk ) = xi• rki² /λk is the contribution of row i to the variance of the (row) factor rk .
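The whole decomposition can be sketched with NumPy, applied to the company/location table of the example. Here C is taken elementwise as cij = (xij − Eij)/√Eij, an assumption consistent with tr(CC⊤) = t and C√b = 0 above:

```python
import numpy as np

# company (rows) x location (columns) table from the example
X = np.array([[4., 0., 2.],   # Finance
              [0., 1., 1.],   # Energy
              [1., 1., 4.]])  # HiTech
a = X.sum(axis=1)             # row totals
b = X.sum(axis=0)             # column totals
E = np.outer(a, b) / X.sum()  # expected counts under independence
C = (X - E) / np.sqrt(E)      # c_ij = (x_ij - E_ij)/sqrt(E_ij)
G, sv, Dt = np.linalg.svd(C)  # C = Gamma Lambda Delta^T
lam = sv ** 2                 # eigenvalues of C C^T
t = ((X - E) ** 2 / E).sum()  # Pearson chi-square statistic
r1 = np.diag(a ** -0.5) @ C @ Dt[0]      # row factor  r_1 = A^{-1/2} C delta_1
s1 = np.diag(b ** -0.5) @ C.T @ G[:, 0]  # column factor s_1 = B^{-1/2} C^T gamma_1
```

The identities C√b = 0 and Σλk = t from the slides can be checked numerically.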
Example
In Belgium, a survey was done to account for people who regularly
read newspapers. The answers were classified according to regions
of residence and language of the newspaper (Flemish, French or
both).
We have 10 regions: Antwerp, Western Flanders, Eastern Flanders,
Hainant, Liège, Limbourg, Luxembourg, Flemish-Brabant,
Wallon-Brabant, city of Brussels.
The language of newspaper is denoted by the first letter.
v: Flemish (Vlaams)
f: French (Francais)
b: both (beide)
Altogether, we have 15 newspapers.
λj % variance cumulated %
183.40 0.653 0.653
43.75 0.156 0.809
25.21 0.090 0.898
11.74 0.042 0.940
8.04 0.029 0.969
4.68 0.017 0.985
2.13 0.008 0.993
1.20 0.004 0.997
0.82 0.003 1.000
0.00 0.000 1.000
Example
The tables show, for instance, the important role of Antwerp and the newspaper vb in determining the variance of both factors. Clearly the first axis expresses the linguistic differences between the three parts of Belgium; the second axis shows a larger dispersion among the Flemish regions than among the French-speaking regions.
Figure: Projection of the 15 newspapers and 10 regions on the first two factors (r2, s2).
Example (Interpretation)
High association between the regions and type of newspaper. In
particular vb (Gazet van Antwerp) is read in Antwerp (extremes in
the graph). The points on the left all belong to Flanders, whereas
those on the right all belong to Wallonia. Notice that Wallon-Brabant and Flemish-Brabant are not far from Brussels. Brussels lies near the center, behaving like an average, and it is also close to the bilingual newspapers.
Example
Apply correspondence analysis to the French baccalauréat data.
A: Philosophy, B: Economics and Social Sciences, C: Mathematics
and Physics, D: Mathematics and Natural Sciences,
E: Mathematics and Techniques, F: Industrial Techniques,
G: Economic Techniques, H: Computer Techniques.
The data were collected in 22 regions denoted by four-letter codes.
We have 202 100 observations arranged in a 22 × 8 contingency table.
We carried out the analysis twice (with and without Corsica) because the graphics suggest that Corsica is an outlier.
Baccalaureat Data
Figure: Correspondence analysis of the baccalauréat data including Corsica, projected on the first two factors (r2, s2); Corsica (cors) stands out as an outlier.
Baccalaureat Data
Figure: Correspondence analysis of the baccalauréat data excluding Corsica, projected on the first two factors (r2, s2).
Example
Region r1 r2 r3 Ca (i, r1 ) Ca (i, r2 ) Ca (i, r3 )
ILDF 0.1464 0.0677 0.0157 0.3839 0.2175 0.0333
CHAM -0.0603 -0.0410 -0.0187 0.0064 0.0078 0.0047
PICA 0.0323 -0.0258 -0.0318 0.0021 0.0036 0.0155
HNOR -0.0692 0.0287 0.1156 0.0096 0.0044 0.2035
CENT -0.0068 -0.0205 -0.0145 0.0001 0.0030 0.0043
BNOR -0.0271 -0.0762 0.0061 0.0014 0.0284 0.0005
BOUR -0.1921 0.0188 0.0578 0.0920 0.0023 0.0630
NOPC -0.1278 0.0863 -0.0570 0.0871 0.1052 0.1311
LORR -0.2084 0.0511 0.0467 0.1606 0.0256 0.0608
ALSA -0.2331 0.0838 0.0655 0.1283 0.0439 0.0767
FRAC -0.1304 -0.0368 -0.0444 0.0265 0.0056 0.0232
PAYL -0.0743 -0.0816 -0.0341 0.0232 0.0743 0.0370
BRET 0.0158 0.0249 -0.0469 0.0011 0.0070 0.0708
PCHA -0.0610 -0.1391 -0.0178 0.0085 0.1171 0.0054
AQUI 0.0368 -0.1183 0.0455 0.0055 0.1519 0.0643
MIDI 0.0208 -0.0567 0.0138 0.0018 0.0359 0.0061
LIMO -0.0540 0.0221 -0.0427 0.0033 0.0014 0.0154
RHOA -0.0225 0.0273 -0.0385 0.0042 0.0161 0.0918
AUVE 0.0290 -0.0139 -0.0554 0.0017 0.0010 0.0469
LARO 0.0290 -0.0862 -0.0177 0.0383 0.0595 0.0072
PROV 0.0469 -0.0717 0.0279 0.0142 0.0884 0.0383
Example
Example (Interpretation)
The baccalauréats B on one side and F on the other side are most
strongly responsible for the variation on the first axis. The second
axis mostly characterizes an opposition between baccalauréats A
and C. Regarding the regions, Ile de France plays an important role
on each axis. On the first axis it is opposed to Lorraine and Alsace,
whereas on the second axis it is opposed to Poitou-Charentes and
Aquitaine.
On the right are the more classical baccalauréats and on the left
the more technical ones. The regions on the left have thus bigger
weights in the technical baccalauréats.
Example
Note also that most of the southern regions of France are
concentrated in the lower part of the picture near the baccalauréat
A.
Finally, looking at the third axis, we see that it is dominated by the baccalauréat D (negative sign) and also, to a lesser degree, by E (negative), as opposed to A (positive sign). The dominating regions are HNOR (positive sign), opposed to NOPC (negative sign). So, for instance, HNOR is particularly poor in baccalauréat D.
Example
US crime data set: for one year (1985) we have the reported number of crimes in the 50 states of the US, classified according to 7 categories: murder, rape, robbery, assault, burglary, larceny and auto-theft.
Figure: Correspondence analysis of the US crime data, projecting the 50 states and the 7 crime categories on the first two factors (r2, s2).
Example (Interpretation)
It appears that the first axis is robbery (+) versus larceny (-)
and auto-theft (-) and that the second factor compares
assault (-) with auto-theft (+).
The dominating states for the first axis are the North-Eastern
States MA (+) and NY (+) compared with the Western
States WY (-)and ID (-). For the second axis the opposition
is between Northern States (MA (+) again and RI (+))
against the Southern States AL (-), MS (-) and AR (-).
          Frankfurt  Berlin  Munich  (total)
Finance       4        0       2        6
Energy        0        1       1        2
HiTech        1        1       4        6
(total)       5        2       7       14

We want row and column indices such that
sj ∝ Σ_{i=1}^{n} ri (xij / x•j )   and   ri ∝ Σ_{j=1}^{p} sj (xij / xi• ).
Example
Figure: Representation of the company types (Finance, Energy, HiTech) and the locations (Frankfurt, Berlin, Munich) in the factor space.
Biplots
Biplots are a low-dimensional display of a data matrix X where the
rows and the columns are represented by points.
Reconstitution formula
Recall (27) and check
xij = Eij { 1 + Σ_{k=1}^{R} λk^{1/2} γik δjk / √(xi• x•j /x•• ) }
Theorem
Fix r, 1 ≤ r ≤ k, and define fr = max_{a,b} a⊤ΣXY b under the constraints a⊤ΣXX a = 1 = b⊤ΣYY b.
Then the maximum of ρ(a, b) is given by fr = λr^{1/2} and is attained when
a = ar = ΣXX^{−1/2} γr   and   b = br = ΣYY^{−1/2} δr .
Theorem
Let η and ϕ be the canonical variables, i.e., the components of η are
ηi = (ΣXX^{−1/2} γi )⊤ X,   1 ≤ i ≤ k,
and analogously ϕi = (ΣYY^{−1/2} δi )⊤ Y . Then

Var( η ) = ( I  Λ )
   ( ϕ )   ( Λ  I ),

where Λ = diag(λ1^{1/2} , . . . , λk^{1/2} ).
Summary: CC Analysis
Summary: CC Analysis
Example
SXX = (  1.41  −1.11
        −1.11   1.19 )

SXY = (  0.78  −0.71  −0.90  −1.04  −0.95   0.18
        −0.42   0.82   0.77   0.90   1.12   0.11 )

SYY = (  0.75  −0.23  −0.45  −0.42  −0.28   0.28
        −0.23   0.66   0.52   0.57   0.85   0.14
        −0.45   0.52   0.72   0.77   0.68  −0.10
        −0.42   0.57   0.77   1.05   0.76  −0.15
        −0.28   0.85   0.68   0.76   1.26   0.22
         0.28   0.14  −0.10  −0.15   0.22   0.32 )
Example
Now we estimate K = ΣXX^{−1/2} ΣXY ΣYY^{−1/2} by
K̂ = SXX^{−1/2} SXY SYY^{−1/2}
and perform a singular value decomposition of K̂.
Example
The first canonical variables are
ηb1 = ab1> x = 1.602 x1 + 1.686 x2
ϕb1 = 0.568 y1 + 0.544 y2 − 0.012 y3 − 0.096 y4 − 0.014 y5 + 0.915 y6
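The estimation of K̂ and the canonical variables can be sketched as follows; the data here are simulated (sample size and coefficients are illustrative), not the car marks:

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root S^{-1/2} via the spectral decomposition."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

rng = np.random.default_rng(42)
X = rng.standard_normal((200, 2))                    # illustrative X-variables
Y = X @ rng.standard_normal((2, 3)) + rng.standard_normal((200, 3))
SXX = np.cov(X, rowvar=False)
SYY = np.cov(Y, rowvar=False)
SXY = np.cov(np.hstack([X, Y]), rowvar=False)[:2, 2:]
K = inv_sqrt(SXX) @ SXY @ inv_sqrt(SYY)              # K-hat
G, d, Dt = np.linalg.svd(K)                          # K = Gamma Lambda Delta^T
a1 = inv_sqrt(SXX) @ G[:, 0]                         # first canonical vector for X
b1 = inv_sqrt(SYY) @ Dt[0]                           # first canonical vector for Y
eta, phi = X @ a1, Y @ b1                            # first canonical variables
```

The empirical correlation of the two canonical variables equals the first singular value of K̂.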
Example
Consider the corresponding canonical variables η̂1 and ϕ̂1 for the car marks data:
Figure: The first canonical variables (η̂1 vs. ϕ̂1 ) for the car marks data. MVAcancarm
Example
US crimes data, X ∈ R7 (murder, rape, robbery, assault, burglary,
larceny, autotheft)
US health data, Y ∈ R7 (accident, cardiovascular, cancer,
pulmonary, pneumonia, diabetes, liver)
Estimated matrix K
b MVAcanus
Example
b=b
Singular value decomposition of K ΓΛb∆
b>
1 2 3 4 5 6 7
0.928 0.895 0.795 0.752 0.627 0.502 0.278
MVAcanus
Example
1 2 3 4 5 6 7
-0.173 -0.066 0.233 0.004 -0.269 0.000 0.275
-0.066 0.044 -0.012 -0.021 -0.014 0.201 -0.085
0.005 -0.006 -0.005 -0.007 -0.011 -0.002 -0.002
0.006 -0.002 0.001 -0.001 0.022 -0.014 -0.018
0.002 0.001 -0.001 -0.003 0.003 0.001 0.004
-0.001 0.001 0.000 0.000 -0.001 -0.002 -0.001
0.002 0.000 0.004 0.006 0.000 0.002 -0.001
MVAcanus
Example
1 2 3 4 5 6 7
-0.056 -0.039 0.048 0.016 0.056 0.038 -0.044
-0.008 -0.019 -0.011 -0.020 -0.014 -0.013 -0.012
0.014 0.008 0.034 0.028 0.049 0.052 0.061
-0.035 0.114 -0.078 -0.080 0.114 -0.084 0.022
0.063 0.075 -0.101 0.207 0.002 0.198 -0.140
0.036 -0.009 -0.010 0.282 0.059 -0.197 -0.264
0.215 -0.022 0.059 -0.169 -0.108 -0.003 -0.369
MVAcanus
Multidimensional Scaling
Definition
We say that a distance matrix D = (dij ) is Euclidean if for some points x1 , . . . , xn ∈ Rp , dij² = (xi − xj )⊤(xi − xj ).
Metric MDS
Multidimensional scaling based on Euclidean proximities is usually
referred to as metric MDS, whereas the more popular non-metric
MDS is used when the proximities are measured on an ordinal
scale.
Example (Intercity Distances)
Consider road distances between six German towns.
MDS can recreate the map from the set of distances.
In real-life applications, the problems are exceedingly more complex
because there are usually errors in the data and the dimensionality
is rarely known in advance.
Figure: Metric MDS solution for the inter-city road distances (Rostock, Hamburg, Berlin, Dresden, Koblenz, München) after reflection and 90° rotation; axes in km (EAST–WEST and NORTH–SOUTH direction). MVAMDScity2
Metric MDS
Figure: Metric MDS configuration of the car marks data.
Example
The dissimilarities in this table were in fact computed as Euclidean
distances from the original data containing car marks data on
economy, price, security, . . .
Plotting the correlation between the MDS and the original we see:
The first MDS direction is highly correlated with service(-),
value(-), design(-), sportiness(-), safety(-) and price(+). We
can interpret the first direction as the price direction since a
bad mark in price (“high price”) obviously corresponds with a
good mark, say, in sportiness (“very sportive”).
The second MDS direction is highly positively correlated with
practicability
Correlations MDS/Variables
Figure: Correlations of the two MDS directions with the original variables (easy, economic, security, service, price, value, sporty, look).
Theorem
Define A = (aij ) with aij = −½ dij², B = HAH, and the centering matrix H = In − n⁻¹ 1n 1n⊤.
Then the distance matrix D is Euclidean if and only if B is positive semidefinite.
If D is the distance matrix of a data matrix X , then B = HX X ⊤H.
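The theorem suggests the classical (metric) MDS algorithm: build B from the distances and read a configuration off its spectral decomposition. A sketch:

```python
import numpy as np

def classical_mds(D, k):
    """Recover a k-dimensional configuration from a Euclidean distance matrix D."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix H = I - n^{-1} 1 1^T
    B = -0.5 * H @ (D ** 2) @ H           # B = HAH with a_ij = -d_ij^2 / 2
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:k]         # largest eigenvalues first
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# check: recover a configuration whose pairwise distances match D
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
Z = classical_mds(D, 2)
DZ = np.linalg.norm(Z[:, None] - Z[None, :], axis=-1)
```

The recovered configuration is unique only up to rotation, reflection and translation, which is why the intercity example needed a reflection and a 90° rotation.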
MDS Solution
Similarity → Distance
Theorem
If C is negative definite, then the distance matrix defined by the
above formula is Euclidean.
Theorem
Among all projections X L1 , of X onto k-dimensional subspaces of
Rp the quantity φ is minimized when X is projected onto its first k
principal components.
Applied Multivariate Statistical Analysis
Multidimensional Scaling 17-15
Summary: MDS
Shepard-Kruskal algorithm
Monotonic Regression
Figure: Monotone regression of distance on rank.
Pool-Adjacent-Violator-Algorithm
Figure: Pool-adjacent-violators fit of distance on rank.
Shepard-Kruskal Algorithm
In a first step, called the initial phase, we calculate Euclidean distances dij(0) from an arbitrarily chosen initial configuration X0 in dimension p*, provided that all objects have different coordinates. One might use metric MDS to obtain these initial coordinates. The second step, or nonmetric phase, determines disparities d̂ij(0) from the distances dij(0) by constructing a monotone regression relationship.
Example
Dissimilarities δij of 4 objects based on the car marks data set.
j 1 2 3 4
i Mercedes Jaguar Ferrari VW
1 Mercedes -
2 Jaguar 3 -
3 Ferrari 2 1 -
4 VW 5 4 6 -
Example
Our aim is to find a p ∗ = 2 dimensional representation via MDS.
Assume we choose as initial configuration from metric MDS X0 as (MVAnmdscar1):

 i              xi1   xi2
 1  Mercedes      3     2
 2  Jaguar        2     7
 3  Ferrari       1     3
 4  VW           10     4

Figure: Initial configuration of the MDS of the car data. MVAnmdscar1
Example
The corresponding distances dij = √{(xi − xj )⊤(xi − xj )} are plotted against the dissimilarities δij .

Figure: Scatterplot of the distances dij against the dissimilarities δij ; the pairs (1,4), (1,2), (2,3) and (1,3) are labelled.
Example
The first violator of monotonicity is the second point, (1, 3). Therefore, average the distances d13 and d23 to get the disparities
d̂13 = d̂23 = (d13 + d23 )/2 = (2.2 + 4.1)/2 = 3.15.
Apply the same procedure to the pairs (2, 4) and (1, 4) to yield d̂24 = d̂14 = 7.9. The plot of δij versus the disparities d̂ij represents a monotone regression relationship.
In the initial configuration, the point 3 (Ferrari) could be moved to
reduce the distance to object 2 (Jaguar). However, this procedure
also alters the distance between objects 3 and 4.
In order to assess how well the derived configuration fits the given dissimilarities, Kruskal suggests a measure called STRESS1, given by
STRESS1 = { Σ_{i<j} (dij − d̂ij )² / Σ_{i<j} dij² }^{1/2} .
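The disparities above can be reproduced with a small pool-adjacent-violators routine. The distances below are the dij recomputed from the initial configuration of the car example, rounded to one decimal as on the slides, listed in increasing order of the dissimilarities δij:

```python
def pav(y):
    """Pool-adjacent-violators: non-decreasing (isotonic) fit of the sequence y."""
    merged = []                     # list of [block mean, block size]
    for v in y:
        merged.append([v, 1])
        while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
            m2, m1 = merged.pop(), merged.pop()   # pool adjacent violators
            cnt = m1[1] + m2[1]
            merged.append([(m1[0] * m1[1] + m2[0] * m2[1]) / cnt, cnt])
    return [b[0] for b in merged for _ in range(b[1])]

# distances for pairs (2,3),(1,3),(1,2),(2,4),(1,4),(3,4), ordered by dissimilarity
d = [4.1, 2.2, 5.1, 8.5, 7.3, 9.1]
dhat = pav(d)                       # disparities d-hat_ij
stress1 = (sum((a - b) ** 2 for a, b in zip(d, dhat)) /
           sum(a ** 2 for a in d)) ** 0.5
```

The pooled values reproduce d̂13 = d̂23 = 3.15 and d̂24 = d̂14 = 7.9 from the slides.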
Example
Let us compute the new point configuration for i = 3 (Ferrari). The initial coordinates are x31 = 1, x32 = 3. Applying the above formula yields
x31NEW = 1 + {3/(4 − 1)} Σ_{j=1, j≠3}^{4} (1 − d̂3j /d3j )(xj1 − 1)
       = 1 + (1 − 3.15/2.2)(3 − 1) + (1 − 3.15/4.1)(2 − 1) + (1 − 9.1/9.1)(10 − 1)
       = 1 − 0.86 + 0.23 + 0 = 0.37.
Similarly we obtain x32NEW = 4.36.
Figure: First iteration for Ferrari. MVAnmdscar3
Example
A car producer plans to introduce a new car. The elements are
safety components (airbag component just for the driver or also for
the second front seat) and sporty note (leather steering wheel vs.
leather interior). There are 4 lines of cars.
Example
For the car producer it is important to rank these cars and to find
out customers’ attitudes. A tester may rank the cars as follows:
car 1 2 3 4
ranking 1 2 4 3
The elementary utilities here are the safety equipment and the
sportiness outfit. CMA aims at explaining the rank order given by
the test person as a function of these elementary utilities.
Example
A margarine producer plans to create a new product and varies the
elements calories (low vs. high) and presentation (a plastic pot vs.
paper packed). One has in fact 4 products.
Product 1 2 3 4
tester’s rank 3 4 1 2
Aim
The profile method asks for the utility of each stimulus. This
may be time consuming and tiring for a test person if there
are too many factors and factor levels.
The two factor method is a simplification and considers only
two factors simultaneously. It is also called trade-off analysis.
Example
Add a product category (property) such as
1 bread
X3 (use) = 2 cooking
3 universal
to the margarine example ⇒ 3 × 2 × 2 = 12 stimuli.
Two factor method = consider two factors simultaneously.
Example
The trade-off matrices for the margarine example pair the levels of X1 with X2 (a 2 × 2 grid), X3 with X1 (3 × 2), and X3 with X2 (3 × 2).
Example
For the automobile example additional characteristics may be
engine power and the number of doors. These categories may be
coded as
1 50 kW
X3 (power of engine) = 2 70 kW
3 90 kW
and
1 2 doors
X4 (doors) = 2 4 doors
3 5 doors
Example
The trade-off matrices for the new car outfit pair X4 with X3 (a 3 × 3 grid), X4 with X2 (3 × 2), X4 with X1 (3 × 2), X3 with X2 (3 × 2), X3 with X1 (3 × 2), and X2 with X1 (2 × 2).
Yk = μ + Σ_{j=1}^{J} Σ_{l=1}^{Lj} βjl I(Xj = xjl ),   k = 1, . . . , K,   with   Σ_{l=1}^{Lj} βjl = 0 ∀j

K = Π_{j=1}^{J} Lj .
Example
X1 = use, X2 = calories
changed notation
x11 = 1, x12 = 2, x13 = 3, x21 = 1, x22 = 2 L1 = 3, L2 = 2
X2
1 2
1 2 1
X1 2 3 4
3 6 5
Ranked products
Example (cont’d)
Order the stimuli
Y1 = Utility (X1 = 1 ∧ X2 = 1)
Y2 = Utility (X1 = 1 ∧ X2 = 2)
Y3 = Utility (X1 = 2 ∧ X2 = 1)
Y4 = Utility (X1 = 2 ∧ X2 = 2)
Y5 = Utility (X1 = 3 ∧ X2 = 1)
Y6 = Utility (X1 = 3 ∧ X2 = 2)
Example (cont’d)
We obtain the decomposition
Y1 = β11 + β21 + µ
Y2 = β11 + β22 + µ
Y3 = β12 + β21 + µ
Y4 = β12 + β22 + µ
Y5 = β13 + β21 + µ
Y6 = β13 + β22 + µ
Metric Solution
In the above example the utilities Y1 , . . . , Y6 take the values 1, 2, . . . , 6.
Hence μ = ȳ = (1 + 2 + 3 + 4 + 5 + 6)/6 = 21/6 = 3.5.
ANOVA table (ranked products):

           X2 = 1   X2 = 2   p̄x1•    β1l
 X1 = 1       2        1      1.5     −2
 X1 = 2       3        4      3.5      0
 X1 = 3       6        5      5.5      2
 p̄x2•      3.66     3.33     3.5
 β2l        0.16    −0.16

The coefficients β̂jl are computed as p̄xjl − μ, where p̄xjl = d⁻¹ Σ_{k=1}^{d} Yk is the mean utility over the d stimuli with Xj = xjl .
Note that Σ_{l=1}^{Lj} βjl = 0, j = 1, . . . , J.
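For the metric solution, μ and the part worths βjl are just overall, row and column means of the ranked-utility table; a sketch for the example above:

```python
import numpy as np

# ranked utilities Y_k arranged as a (use x calories) table, as on the slide
Y = np.array([[2., 1.],
              [3., 4.],
              [6., 5.]])
mu = Y.mean()                   # overall mean utility mu = 3.5
beta1 = Y.mean(axis=1) - mu     # part worths of X1 (use), one per level
beta2 = Y.mean(axis=0) - mu     # part worths of X2 (calories)
```

This reproduces the ANOVA table: β1 = (−2, 0, 2) and β2 = (0.16, −0.16); both sum to zero, as required.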
Y = Xβ + ε
Nonmetric Solution
Example
For the car example the reported Yk values were Y = (1, 3, 2, 6, 4, 5)⊤.

Figure: Estimated rankings Ŷk plotted against the revealed rankings for the six cars.

We see that the estimated Ŷ6 = 5.16 is below the estimated Ŷ5 = 5.66, and thus an inconsistency in ranking the utilities occurs. The monotone transformation Ẑk = f (Ŷk ) is introduced to make the relationship monotone. A very simple procedure consists of averaging the violating values.
STRESS = Σ_{k=1}^{K} (Ẑk − Ŷk )² / Σ_{k=1}^{K} (Ŷk − Ŷ̄ )²    (28)
Portfolio Analysis
Risk of a portfolio
portfolio of p assets.
price of asset j at time i is pij .
Return:
xij = (pij − pi−1,j ) / pij
X ∼ (µ, Σ)
Return of a portfolio: Q = c⊤X with c⊤1p = 1
Expected value of Q: c⊤μ
Risk (squared volatility): ½ c⊤Σc
Efficient Portfolio
Example
Consider the returns from January 1978 to December 1987 of six
stocks traded on the New York stock exchange.
For each stock we have chosen the same scale on the vertical axis
(which gives the return of the stock). The return of some stocks,
such as Pan American Airways, Gerber, Texaco, and Delta Airlines
are more volatile than the returns of other stocks, such as IBM or
Consolidated Edison (Electric utilities).
We compare returns of two portfolios consisting from IBM and
PanAm.
Figure: Returns of the six stocks (IBM, PanAm, Gerber, Delta Airlines, Texaco, Consolidated Edison).

Figure 18.2: Portfolio of IBM and PanAm assets, equal and efficient weights. MVAportfol

The text windows on the right of Figure 18.2 show the exact weights which were used. We can clearly see that the returns of the portfolio with a higher share of the IBM assets (which have a low variance) are much less volatile.
Σc = λ1p
c = λΣ⁻¹1p
L = ½ λ² 1p⊤Σ⁻¹1p − λ(λ 1p⊤Σ⁻¹1p − 1)
  = λ − ½ λ² 1p⊤Σ⁻¹1p
Theorem
The variance efficient portfolio weights for returns X ∼ (μ, Σ) are
copt = Σ⁻¹1p / (1p⊤ Σ⁻¹ 1p ).
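The weights copt amount to one linear solve; a sketch, with a made-up diagonal covariance for illustration:

```python
import numpy as np

def variance_efficient_weights(Sigma):
    """c_opt = Sigma^{-1} 1_p / (1_p^T Sigma^{-1} 1_p): minimum-variance weights."""
    ones = np.ones(Sigma.shape[0])
    w = np.linalg.solve(Sigma, ones)   # Sigma^{-1} 1_p without forming the inverse
    return w / w.sum()                 # normalize so the weights sum to 1

# two uncorrelated assets with variances 1 and 4 (illustrative numbers)
c = variance_efficient_weights(np.diag([1.0, 4.0]))
```

With Σ = σ²Ip the function returns equal weights 1/p, matching the first corollary below; with unequal diagonal variances the weights are proportional to σj⁻².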
Notation
c = portfolio of risky assets
c0 = weight of the riskless asset
c0 = 1 − 1p⊤c (there is only one riskless asset!)
Σ = covariance of the risky assets (now invertible)
We see that
c = (λ1 /2) Σ⁻¹(μ − r 1p ).
Mean-variance efficient weight vector if there exists a riskless asset.
In the case of existence of a riskless asset, the mean-variance efficient portfolio weights are given by
c = μ̄ Σ⁻¹(μ − r 1p ) / {μ⊤ Σ⁻¹(μ − r 1p )}.
Corollary
A portfolio of uncorrelated assets whose returns have equal
variances (Σ = σ 2 Ip ) needs to be weighted equally:
1
copt = 1p .
p
Corollary
A portfolio of correlated assets whose returns have equal variances, i.e.,

        ( 1  ρ  ⋯  ρ )
Σ = σ²  ( ρ  1  ⋯  ρ )     −1/(p − 1) < ρ < 1,
        ( ⋮  ⋮  ⋱  ⋮ )
        ( ρ  ρ  ⋯  1 ),

also needs to be weighted equally: copt = p⁻¹ 1p .
Corollary
A portfolio of uncorrelated assets with returns of different variances (i.e., Σ = diag(σ1², . . . , σp²)) has the optimal weights
cj = σj⁻² / Σ_{j=1}^{p} σj⁻² ,   j = 1, . . . , p.
Corollary
A portfolio of assets with block-diagonal returns covariance
Σ = diag(Σ1 , Σ2 , . . . , Σr )
has, within block j, the optimal weights
cj = Σj⁻¹ 1 / (1⊤ Σj⁻¹ 1),   j = 1, . . . , r.
Efficient Portfolios in Practice

Figure: Portfolio of all six assets, equal and efficient weights. MVAportfol

Hence the optimal weighting is
ĉ = S⁻¹16 / (16⊤ S⁻¹ 16 ) = (0.2504, 0.0039, 0.0409, 0.5087, 0.0072, 0.1890)⊤.
As we can clearly see, the optimal weights are quite different from the equal weights (cj = 1/6).
The CAPM
Var(x0 ) = 0, Σ is singular.
Multiply by c⊤:
2c⊤Σc − λ1 μ̄ = λ2
μ = μ̄ 1p + {(μ̄ − y0 )/(c⊤Σc)} (Σc − c⊤Σc 1p )
μ = y0 1p + {Σc/(c⊤Σc)} (μ̄ − y0 )
μ = y0 1p + β(μ̄ − y0 )
with
β ≡ Σc / (c⊤Σc).
Simplicial Depth
Simplicial depth generalizes the notion of data depth. It allows us
to define a multivariate median and to visually present high
dimensional data in low dimension.
The mean and the mode can be easily extended to multivariate
random variables.
However, the median poses a problem, since in a multivariate
sense, we cannot interpret the element-wise median
xmed,j = x((n+1)/2),j   (n odd),   xmed,j = {x(n/2),j + x(n/2+1),j }/2   (n even)
as a point that is “most central”.
Simplicial depth

Figure 19.2: 10-point distribution ordered according to depth, with the median shown as a big star in the center. MVAsimdepex

A further example, also ordered according to depth, contains 100 data points with corresponding parameters controlling its spread. The deepest point, the two-dimensional median, is indicated as a big star in the center. The points with less depth are indicated via grey shades.
Projection Pursuit
f̂m (x) = g0 (x) Π_{k=1}^{m} gk (Λk⊤ x)

as a function of α. The entropy index maximizes
IE (α) = ∫ f log f .
Standardize x:
zi = Σ̂^{−1/2} (xi − x̄).
Divide the range of yi into S non-overlapping intervals (slices) Hs , s = 1, . . . , S. ns denotes the number of observations within slice Hs , and IHs the indicator function for this slice:
ns = Σ_{i=1}^{n} IHs (yi ).
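The slicing step leads directly to the SIR I algorithm: slice means of the standardized data, then an eigendecomposition of their weighted covariance. A sketch on simulated data with one true direction β (the data-generating model here is illustrative, not the MVAsirdata design):

```python
import numpy as np

def sir(X, y, n_slices=10):
    """Sliced Inverse Regression (SIR I): estimate EDR directions."""
    n, p = X.shape
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    Sinv_half = V @ np.diag(w ** -0.5) @ V.T
    Z = (X - X.mean(axis=0)) @ Sinv_half        # standardized data z_i
    slices = np.array_split(np.argsort(y), n_slices)   # S slices along y
    Vhat = np.zeros((p, p))
    for s in slices:                            # weighted cov of slice means
        zbar = Z[s].mean(axis=0)
        Vhat += len(s) * np.outer(zbar, zbar) / n
    lam, eta = np.linalg.eigh(Vhat)
    # EDR directions on the original scale, largest eigenvalue first
    return (Sinv_half @ eta)[:, ::-1], lam[::-1]

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 3))
beta = np.array([1.0, 2.0, 0.0])                # single true index direction
y = X @ beta + 0.1 * rng.standard_normal(1000)
B, lam = sir(X, y)
```

With a single-index model the first estimated EDR direction should align (up to sign) with β, and the first eigenvalue should dominate.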
Figure 19.5: Plot of the true response versus the first true index. The monotonic and the convex shapes can be clearly seen. MVAsirdata
True index vs Response
Figure 19.6: Plot of the true response versus the second true index. The monotonic and the convex shapes can be clearly seen. MVAsirdata
Figure 19.4: SIR: The left plots show the response versus the estimated EDR-directions. The upper right plot is a three-dimensional plot of the first two directions and the response. The lower right plot shows the eigenvalues λ̂i (∗) and the cumulative sum (◦). MVAsirdata
SIR II Algorithm

V̂s = (ns − 1)⁻¹ { Σ_{i=1}^{n} IHs (yi ) zi zi⊤ − ns z̄s z̄s⊤ }

V̄ = n⁻¹ Σ_{s=1}^{S} ns V̂s

V̂ = n⁻¹ Σ_{s=1}^{S} ns (V̂s − V̄ )² = n⁻¹ Σ_{s=1}^{S} ns V̂s² − V̄ ²
Figure 19.7: SIR II mainly sees the direction β2. The left plots show the response versus the estimated EDR-directions. The upper right plot is a three-dimensional plot of the first two directions and the response. The lower right plot shows the eigenvalues λ̂i (∗) and the cumulative sum (◦). MVAsir2data
Classification method
Nonparametric multivariate non-linear statistical technique
Applications: pattern recognition, medical diagnostics, text
classification, corporate bankruptcy analysis
Illustration
Figure: Loss, VC bound, and margin (d).
xi⊤w + b ≥ +1 for yi = +1
xi⊤w + b ≤ −1 for yi = −1
yi (xi⊤w + b) − 1 ≥ 0,   i = 1, 2, . . . , n

LP (w, b) = ½ ‖w‖² − Σ_{i=1}^{n} αi {yi (xi⊤w + b) − 1}

yi (xi⊤w + b) − 1 ≥ 0,   i = 1, . . . , n
αi ≥ 0
αi {yi (xi⊤w + b) − 1} = 0
min LP (w , b)
w ,b
max_α L_D(α)  s.t.  α_i ≥ 0,  Σ_{i=1}^n α_i y_i = 0
x_i^⊤ w + b ≥ +1 − ξ_i  for y_i = +1,
x_i^⊤ w + b ≤ −1 + ξ_i  for y_i = −1,
ξ_i ≥ 0
i.e.  y_i (x_i^⊤ w + b) ≥ 1 − ξ_i  and  ξ_i ≥ 0
L_P(w, b, ξ) = (1/2) ‖w‖² + C Σ_{i=1}^n ξ_i − Σ_{i=1}^n α_i {y_i (x_i^⊤ w + b) − 1 + ξ_i} − Σ_{i=1}^n μ_i ξ_i

min_{w,b,ξ} L_P(w, b, ξ)
s.t.  0 ≤ α_i ≤ C
Σ_{i=1}^n α_i y_i = 0
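The soft-margin primal above can also be minimized directly, without passing to the dual, via subgradient descent on the equivalent hinge-loss form (1/2)‖w‖² + C Σ_i max{0, 1 − y_i (x_i^⊤ w + b)}. A minimal sketch (the data, step size, and epoch count are illustrative assumptions, not values from the text):

```python
import numpy as np

def svm_primal_sgd(X, y, C=1.0, lr=0.01, epochs=200):
    """Full-batch subgradient descent on the soft-margin primal
    0.5*||w||^2 + C * sum_i max(0, 1 - y_i*(w'x_i + b))."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                      # points with active hinge loss
        grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy separable data: class +1 around (2, 2), class -1 around (-2, -2)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2, 0.5, (20, 2)), rng.normal(-2, 0.5, (20, 2))])
y = np.array([1.0] * 20 + [-1.0] * 20)
w, b = svm_primal_sgd(X, y, C=1.0)
pred = np.sign(X @ w + b)
print((pred == y).mean())  # training accuracy
```

On such well-separated data the separating hyperplane is found within a few epochs; the slack variables ξ_i never enter explicitly because the hinge term encodes them.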
Nonlinear Classification
Data Space → Feature Space
K(x_i, x_j) = exp{−‖x_i − x_j‖² / (2σ²)} – the Gaussian (radial basis) kernel with constant σ
K(x_i, x_j) = exp{−(x_i − x_j)^⊤ r^{−2} Σ^{−1} (x_i − x_j) / 2} – the stationary Gaussian kernel with an anisotropic radial basis with constant r and variance-covariance matrix Σ from the training set
K(x_i, x_j) = (x_i^⊤ x_j + 1)^p – the polynomial kernel of degree p
K(x_i, x_j) = tanh(k x_i^⊤ x_j − δ) – the hyperbolic tangent kernel with constants k and δ
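Each kernel is evaluated on the sample as an n × n Gram matrix K = (K(x_i, x_j)). A small sketch for two of the kernels above (parameter values are illustrative):

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Isotropic Gaussian kernel: K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # pairwise squared distances
    return np.exp(-sq / (2 * sigma ** 2))

def polynomial_kernel(X, p=2):
    """Polynomial kernel of degree p: K(x_i, x_j) = (x_i' x_j + 1)^p."""
    return (X @ X.T + 1) ** p

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
Kg = gaussian_kernel(X, sigma=1.0)
Kp = polynomial_kernel(X, p=2)
print(Kg[0, 0])  # diagonal entries of the Gaussian Gram matrix equal 1
```

Both Gram matrices are symmetric and positive semi-definite, which is what lets the dual problem be solved in feature space without ever computing the feature map itself.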
Simulated Data
Figure: SVM (r = 0.1, C = 10/200) for noisy spiral data (spread over 3π radians); distance between the spirals: 1, n_{−1} = n_{+1} = 100, n = 200. Injected noise ε_i ∼ N(0, 0.1² I).
MVASVMspiral
Scoring Companies
[Figure: SVM scores for the company data, variable x3 plotted against x24]
Effect of parameters
[Figures: x3 plotted against x24 under different SVM parameter settings; the color scale shows the resulting score]