Søren Højsgaard
February 15, 2005
Contents

1 Introduction
2 Vectors
  2.1 Vectors
  2.2 Transpose of vectors
  2.3 Multiplying a vector by a number
  2.4 Sum of vectors
  2.5 (Inner) product of vectors
  2.6 The length (norm) of a vector
  2.7 The 0–vector and 1–vector
  2.8 Orthogonal (perpendicular) vectors
3 Matrices
  3.1 Matrices
  3.2 Multiplying a matrix with a number
  3.3 Transpose of matrices
  3.4 Sum of matrices
  3.5 Multiplication of a matrix and a vector
  3.6 Multiplication of matrices
  3.7 Vectors as matrices
  3.8 Some special matrices
  3.9 Inverse of matrices
  3.10 Solving systems of linear equations
  3.11 Trace
  3.12 Determinant
  3.13 Some additional rules for matrix operations
  3.14 Details on inverse matrices*
    3.14.1 Inverse of a 2 × 2 matrix*
    3.14.2 Inverse of diagonal matrices*
    3.14.3 Generalized inverse*
    3.14.4 Inverting an n × n matrix*
4 Least squares
1 Introduction
This note has two goals: 1) introducing linear algebra (vectors and matrices) and 2)
showing how to work with these concepts in R.
2 Vectors
2.1 Vectors
A column vector is a list of numbers stacked on top of each other, e.g.

        ( 2 )
    a = ( 1 )
        ( 3 )
A row vector is a list of numbers written one after the other, e.g.
b = (2, 1, 3)
In both cases, the list is ordered, i.e.

    (2, 1, 3) ≠ (1, 2, 3).
We make the following convention:
• In what follows all vectors are column vectors unless otherwise stated.
• However, writing column vectors takes up more space than row vectors. There-
fore we shall frequently write vectors as row vectors, but with the understand-
ing that it really is a column vector.
A general n–vector has the form

        ( a1 )
    a = ( a2 )
        (  ⋮ )
        ( an )

where the ai are numbers, and this vector shall be written a = (a1, . . . , an).
A graphical representation of 2–vectors is shown in Figure 1. Note that row and column vectors look the same when plotted.

[Figure 1: the 2–vectors a1 = (2, 2) and a2 = (1, −0.5).]

Vectors are created in R with the c() function:

> a <- c(1, 3, 2)
> a
[1] 1 3 2

The vector a is in R printed “in row format” but can really be regarded as a column vector, cf. the convention above.
2.2 Transpose of vectors
Transposing a vector means turning a column (row) vector into a row (column)
vector. The transpose is denoted by “⊤”.
Example 1

    ( 1 )⊤                             ( 1 )
    ( 3 )  = (1, 3, 2)  and  (1, 3, 2)⊤ = ( 3 )
    ( 2 )                              ( 2 )
a = (a> )>
Transposing is done in R with t():

> t(a)
     [,1] [,2] [,3]
[1,]    1    3    2
See Figure 2.
2.3 Multiplying a vector by a number

If a is a vector and α is a number, then αa is the vector obtained by multiplying each entry of a by α.

Example 2

        ( 1 )   ( 7  )
    7 · ( 3 ) = ( 21 )
        ( 2 )   ( 14 )
> 7 * a
[1] 7 21 14
[Figure 2: the vectors a2 = (1, −0.5), −a2 = (−1, 0.5) and 2a2 = (2, −1), together with a1 = (2, 2).]
2.4 Sum of vectors

Let a and b be n–vectors. The sum a + b is the n–vector obtained by adding the corresponding entries of a and b. See Figures 3 and 4. Only vectors of the same dimension can be added.
Example 3

    ( 1 )   ( 2 )   ( 1 + 2 )   ( 3  )
    ( 3 ) + ( 8 ) = ( 3 + 8 ) = ( 11 )
    ( 2 )   ( 9 )   ( 2 + 9 )   ( 11 )
[Figure 3: the sum a1 + a2 = (3, 1.5) of a1 = (2, 2) and a2 = (1, −0.5).]
[Figure 4: vector subtraction: a1 + (−a2) = (1, 2.5), where −a2 = (−1, 0.5).]
In R, vectors are added with +:

> a <- c(1, 3, 2)
> b <- c(2, 8, 9)
> a + b
[1] 3 11 11
2.5 (Inner) product of vectors

Let a = (a1, . . . , an) and b = (b1, . . . , bn). The (inner) product of a and b is the number

    a · b = a1 b1 + · · · + an bn

Note that the result is a number, not a vector. In R the inner product is computed as:
> sum(a * b)
[1] 44
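Equivalently, the inner product can be computed as the matrix product t(a) %*% b, which gives the same number wrapped in a 1 × 1 matrix. A small sketch using the vectors a and b from this section:

```r
a <- c(1, 3, 2)
b <- c(2, 8, 9)

# Inner product as a plain number
sum(a * b)                       # 44

# Same value as a 1 x 1 matrix: t(a) is 1 x 3 and b is treated as 3 x 1
t(a) %*% b
```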
2.6 The length (norm) of a vector

The length (norm) of a vector a is

    ‖a‖ = √(a · a) = √(a1² + a2² + · · · + an²)

In R:

> sqrt(sum(a * a))
[1] 3.741657
2.7 The 0–vector and 1–vector
The 0–vector (1–vector) is a vector with 0 (1) in all entries. The 0–vector (1–vector) is frequently written simply as 0 (1) or as 0n (1n) to emphasize that its length is n.
> rep(0, 5)
[1] 0 0 0 0 0
> rep(1, 5)
[1] 1 1 1 1 1
2.8 Orthogonal (perpendicular) vectors

Two vectors v1 and v2 are orthogonal (perpendicular) if their inner product is zero:

    v1 ⊥ v2 ⇔ v1 · v2 = 0

For example, in R (the particular vectors v1 and v2 below are one possible pair):

> v1 <- c(1, 1)
> v2 <- c(-1, 1)
> sum(v1 * v2)
[1] 0
3 Matrices
3.1 Matrices
An r × c matrix A (read “an r times c matrix”) is a table with r rows and c columns
        ( a11 a12 . . . a1c )
    A = ( a21 a22 . . . a2c )
        (  ⋮    ⋮         ⋮ )
        ( ar1 ar2 . . . arc )
Note that one can regard A as consisting of c column vectors put after each other:
A = [a1 : a2 : · · · : ac ]
In R a matrix is created with the matrix() function:

> A <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 2)
> A
     [,1] [,2]
[1,]    1    2
[2,]    3    8
[3,]    2    9

Note that the numbers 1, 3, 2, 2, 8, 9 are read into the matrix column–by–column. To get the numbers read in row–by–row do

> A2 <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 2, byrow = TRUE)
3.2 Multiplying a matrix with a number

For a number α and a matrix A, the product αA is the matrix obtained by multiplying each element of A by α.

Example 4

        ( 1 2 )   ( 7  14 )
    7 · ( 3 8 ) = ( 21 56 )
        ( 2 9 )   ( 14 63 )
In R:

> 7 * A
     [,1] [,2]
[1,]    7   14
[2,]   21   56
[3,]   14   63
3.3 Transpose of matrices

A matrix is transposed by interchanging rows and columns; the transpose of A is denoted A⊤.

Example 5

    ( 1 2 )⊤
    ( 3 8 )  = ( 1 3 2 )
    ( 2 9 )    ( 2 8 9 )
> t(A)
     [,1] [,2] [,3]
[1,]    1    3    2
[2,]    2    8    9
3.4 Sum of matrices
Let A and B be r × c matrices. The sum A + B is the r × c matrix obtained
by adding A and B elementwise.
Only matrices with the same dimensions can be added.
Example 6

    ( 1 2 )   ( 5 4 )   ( 6  6  )
    ( 3 8 ) + ( 8 2 ) = ( 11 10 )
    ( 2 9 )   ( 3 7 )   ( 5  16 )
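In R, matrices of the same dimension are added elementwise with +. A sketch reproducing Example 6 (the name B and its construction are mine; only the numbers come from the example):

```r
# A has columns (1, 3, 2) and (2, 8, 9); B has columns (5, 8, 3) and (4, 2, 7)
A <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 2)
B <- matrix(c(5, 8, 3, 4, 2, 7), ncol = 2)

A + B   # elementwise sum, a 3 x 2 matrix
```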
3.5 Multiplication of a matrix and a vector

Let A be an r × c matrix and let a be a c–dimensional column vector. The product Aa is the r–vector whose ith entry is the inner product of the ith row of A with a.

Example 7

    ( 1 2 )         ( 1 · 5 + 2 · 8 )   ( 21 )
    ( 3 8 ) ( 5 ) = ( 3 · 5 + 8 · 8 ) = ( 79 )
    ( 2 9 ) ( 8 )   ( 2 · 5 + 9 · 8 )   ( 82 )
In R, matrix multiplication is done with the %*% operator (with A as above and a = (5, 8)):

> a <- c(5, 8)
> A %*% a
     [,1]
[1,]   21
[2,]   79
[3,]   82
3.6 Multiplication of matrices
Let A be an r × c matrix and B a c × t matrix, i.e. B = [b1 : b2 : · · · : bt]. The product AB is the r × t matrix given by:

    AB = A[b1 : b2 : · · · : bt] = [Ab1 : Ab2 : · · · : Abt]
Example 8 The columns of the product are Ab1 and Ab2:

    ( 1 2 ) ( 5 4 )   ( 1 · 5 + 2 · 8   1 · 4 + 2 · 2 )   ( 21  8 )
    ( 3 8 ) ( 8 2 ) = ( 3 · 5 + 8 · 8   3 · 4 + 8 · 2 ) = ( 79 28 )
    ( 2 9 )           ( 2 · 5 + 9 · 8   2 · 4 + 9 · 2 )   ( 82 26 )
Note that the product AB can only be formed if the number of rows in B and
the number of columns in A are the same. In that case, A and B are said to
be conformable.
In general AB and BA are not identical.
A mnemonic for matrix multiplication is to write B above and to the right of A; each entry of AB then sits at the intersection of the corresponding row of A and column of B:

              ( 5 4 )
              ( 8 2 )
    ( 1 2 )   ( 1 · 5 + 2 · 8   1 · 4 + 2 · 2 )   ( 21  8 )
    ( 3 8 )   ( 3 · 5 + 8 · 8   3 · 4 + 8 · 2 ) = ( 79 28 )
    ( 2 9 )   ( 2 · 5 + 9 · 8   2 · 4 + 9 · 2 )   ( 82 26 )
In R:

> B <- matrix(c(5, 8, 4, 2), ncol = 2)
> A %*% B
     [,1] [,2]
[1,]   21    8
[2,]   79   28
[3,]   82   26
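That AB and BA generally differ is easy to check in R; the two 2 × 2 matrices below are my own illustration:

```r
A <- matrix(c(1, 2, 3, 4), ncol = 2)   # rows (1, 3) and (2, 4)
B <- matrix(c(0, 1, 1, 0), ncol = 2)   # permutation matrix swapping the two coordinates

A %*% B   # swaps the columns of A
B %*% A   # swaps the rows of A -- a different matrix
```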
3.7 Vectors as matrices

A vector of length n can be regarded as an n × 1 matrix.

3.8 Some special matrices

– An n × n matrix is called a square matrix.
– A matrix with 0 in all entries is the 0–matrix, often written simply as 0.
– A matrix consisting of 1s in all entries is often written J.
– A square matrix with 0 on all off–diagonal entries and elements d1, d2, . . . , dn on the diagonal is called a diagonal matrix and is often written diag{d1, d2, . . . , dn}.
– A diagonal matrix with 1s on the diagonal is called the identity matrix
and is denoted I. The identity matrix satisfies that IA = AI = A.
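The property IA = AI = A can be checked directly in R (the 2 × 2 matrix A below is a hypothetical example):

```r
I <- diag(1, 2)                        # the 2 x 2 identity matrix
A <- matrix(c(1, 2, 3, 4), ncol = 2)

all(I %*% A == A)   # TRUE
all(A %*% I == A)   # TRUE
```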
In R, diag() creates identity and diagonal matrices:

> diag(1, 3)
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1

> D <- diag(c(1, 2, 3))      # the diagonal matrix diag{1, 2, 3}
> diag(D)                    # applied to a matrix, diag() extracts the diagonal
[1] 1 2 3

> diag(A)
[1] 1 8
3.9 Inverse of matrices
In general, the inverse of an n × n matrix A is the matrix B (which is also
n × n) which when multiplied with A gives the identity matrix I. That is,
AB = BA = I.
One says that B is A’s inverse and writes B = A−1. Likewise, A is B’s inverse.
Example 9 Let

    A = ( 1 3 )      B = ( −2    1.5 )
        ( 2 4 )          (  1   −0.5 )
Now AB = BA = I so B = A−1 .
– Only square matrices can have an inverse, but not all square matrices
have an inverse.
– When the inverse exists, it is unique.
– Finding the inverse of a large matrix A is numerically complicated (but
computers do it for us).
In Section 3.14 the issue of matrix inversion is discussed in more detail.
Finding the inverse of a matrix in R is done using the solve() function:

> A <- matrix(c(1, 2, 3, 4), ncol = 2)
> A
     [,1] [,2]
[1,]    1    3
[2,]    2    4
> B <- solve(A)
> B
     [,1] [,2]
[1,]   -2  1.5
[2,]    1 -0.5
> A %*% B
[,1] [,2]
[1,] 1 0
[2,] 0 1
3.10 Solving systems of linear equations
Example 11 Matrices are closely related to systems of linear equations. Consider the two equations
x1 + 3x2 = 7
2x1 + 4x2 = 10
[Figure 5: the two lines x1 + 3x2 = 7 and 2x1 + 4x2 = 10, plotted against x1; they intersect in the point (1, 2).]
From the figure it follows that there are three possible cases of solutions to the system:
1. Exactly one solution – when the lines intersect in one point
2. No solutions – when the lines are parallel but not identical
3. Infinitely many solutions – when the lines coincide.
In matrix form the system is Ax = b with

    A = ( 1 3 )    b = ( 7  )
        ( 2 4 )        ( 10 )

and, since A is invertible, the solution is x = A−1 b. In R:

> A <- matrix(c(1, 2, 3, 4), ncol = 2)
> b <- c(7, 10)
> solve(A) %*% b
     [,1]
[1,]    1
[2,]    2
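The no–solution case can also be seen numerically: solve() signals an error when the coefficient matrix is singular. The system below (two parallel lines) is my own illustration:

```r
A2 <- matrix(c(1, 2, 2, 4), ncol = 2)  # second row is twice the first row
b2 <- c(7, 10)                         # parallel lines: no solution exists
try(solve(A2, b2))                     # error: the system is singular
```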
3.11 Trace
Missing
3.12 Determinant
Missing
3.14.2 Inverse of diagonal matrices*

It is easy to find the inverse of a diagonal matrix: if

    A = diag(a1, a2, . . . , an)

where all ai ≠ 0, then A−1 = diag(1/a1, 1/a2, . . . , 1/an).
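For a diagonal matrix diag(a1, . . . , an) with all ai ≠ 0, the inverse is simply the diagonal matrix of reciprocals. A quick check in R (the diagonal entries 2, 4, 5 are my own choice):

```r
A <- diag(c(2, 4, 5))                      # the diagonal matrix diag{2, 4, 5}
solve(A)                                   # equals diag{1/2, 1/4, 1/5}
all.equal(solve(A), diag(1 / c(2, 4, 5)))  # TRUE
```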
3.14.4 Inverting an n × n matrix*
In the following we will illustrate one frequently applied method for matrix inversion: Gauss–Jordan elimination (row reduction). Many computer programs, including solve(), use variants of elimination for finding the inverse of an n × n matrix.
Consider the matrix A:
> A <- matrix(c(2, 2, 3, 3, 5, 9, 5, 6, 7), ncol = 3)
> A
     [,1] [,2] [,3]
[1,]    2    3    5
[2,]    2    5    6
[3,]    3    9    7
We want to find the matrix B = A−1 . To start, we append to A the identity matrix
and call the result AB:
> AB <- cbind(A, diag(c(1, 1, 1)))
> AB
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    2    3    5    1    0    0
[2,]    2    5    6    0    1    0
[3,]    3    9    7    0    0    1
• First we ensure that AB[1,1]=1 by dividing the first row by AB[1,1]. Then we subtract a suitable multiple of the first row from each of the other rows so that AB[2,1]=0 and AB[3,1]=0.

• Next we ensure that AB[2,2]=1 and, in the same way, subtract a constant times the second row from the other rows to obtain AB[1,2]=0 and AB[3,2]=0.

• Finally we ensure that AB[3,3]=1 and clear the rest of the third column, so that the three leftmost columns of AB hold the identity matrix.
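The whole elimination can be written out in a few lines of R. This is a sketch of the row operations described above (it assumes all pivots are nonzero, so no row interchanges are needed); it is not how solve() is actually implemented:

```r
A <- matrix(c(2, 2, 3, 3, 5, 9, 5, 6, 7), ncol = 3)
AB <- cbind(A, diag(3))                       # append the identity matrix

for (i in 1:3) {
  AB[i, ] <- AB[i, ] / AB[i, i]               # make the pivot AB[i,i] equal to 1
  for (j in setdiff(1:3, i)) {
    AB[j, ] <- AB[j, ] - AB[j, i] * AB[i, ]   # make AB[j,i] equal to 0
  }
}

B <- AB[, 4:6]    # the three rightmost columns now hold the inverse
A %*% B           # the identity matrix, up to rounding error
```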
Now we extract the three rightmost columns of AB into the matrix B. We claim that
B is the inverse of A, and this can be verified by a simple matrix multiplication
So, apart from rounding errors, the product is the identity matrix, and hence B = A−1. This example illustrates that numerical precision and rounding errors are an important issue when writing computer programs.
4 Least squares
Consider the table of pairs (xi, yi) below.

    x:  1    2    3    4    5
    y:  3.7  4.2  4.9  5.7  6.0

[Figure 6: Regression of y on x with a fitted straight line.]

We want the straight line that fits the points best in the least squares sense. With the design matrix X below, the least squares estimate is β̂ = (X⊤X)−1 X⊤y. In R the data are entered as:

> y <- c(3.7, 4.2, 4.9, 5.7, 6.0)
> x <- c(1, 2, 3, 4, 5)
> X <- cbind(1, x)
> X
       x
[1,] 1 1
[2,] 1 2
[3,] 1 3
[4,] 1 4
[5,] 1 5
> beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y
> beta.hat
[,1]
3.07
x 0.61
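As a check, the closed–form estimate (X⊤X)−1 X⊤y can be compared with R's built–in lm() function, which fits the same least squares regression. The y–values below are one data set consistent with the printed estimates (intercept 3.07, slope 0.61):

```r
x <- c(1, 2, 3, 4, 5)
y <- c(3.7, 4.2, 4.9, 5.7, 6.0)
X <- cbind(1, x)                                # design matrix with an intercept column

beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y    # closed-form least squares estimate
beta.hat                                        # intercept 3.07, slope 0.61

coef(lm(y ~ x))                                 # lm() gives the same estimates
```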