
Linear algebra in R

Søren Højsgaard
February 15, 2005

Contents
1 Introduction

2 Vectors
2.1 Vectors
2.2 Transpose of vectors
2.3 Multiplying a vector by a number
2.4 Sum of vectors
2.5 (Inner) product of vectors
2.6 The length (norm) of a vector
2.7 The 0–vector and 1–vector
2.8 Orthogonal (perpendicular) vectors

3 Matrices
3.1 Matrices
3.2 Multiplying a matrix with a number
3.3 Transpose of matrices
3.4 Sum of matrices
3.5 Multiplication of a matrix and a vector
3.6 Multiplication of matrices
3.7 Vectors as matrices
3.8 Some special matrices
3.9 Inverse of matrices
3.10 Solving systems of linear equations
3.11 Trace
3.12 Determinant
3.13 Some additional rules for matrix operations
3.14 Details on inverse matrices*
3.14.1 Inverse of a 2 × 2 matrix*
3.14.2 Inverse of diagonal matrices*
3.14.3 Generalized inverse*
3.14.4 Inverting an n × n matrix*

4 Least squares

5 A neat little exercise – from a bird’s perspective

1 Introduction
This note has two goals: 1) introducing linear algebra (vectors and matrices) and 2)
showing how to work with these concepts in R.

2 Vectors
2.1 Vectors
A column vector is a list of numbers stacked on top of each other, e.g.
$$a = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$$
A row vector is a list of numbers written one after the other, e.g.
b = (2, 1, 3)
In both cases, the list is ordered, i.e.
(2, 1, 3) ≠ (1, 2, 3).
We make the following convention:
• In what follows all vectors are column vectors unless otherwise stated.
• However, column vectors take up more space on the page than row vectors.
Therefore we shall frequently write vectors as row vectors, with the
understanding that they really are column vectors.
A general n–vector has the form
$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}$$
where the $a_i$ are numbers, and this vector shall be written $a = (a_1, \dots, a_n)$.
A graphical representation of 2–vectors is shown in Figure 1. Note that row and
column vectors are drawn the same way.

[Figure 1: Two 2-vectors, a1 = (2, 2) and a2 = (1, −0.5)]

> a <- c(1, 3, 2)


> a

[1] 1 3 2

The vector a is printed “in row format” in R but can really be regarded as a
column vector, cf. the convention above.

2.2 Transpose of vectors
Transposing a vector means turning a column (row) vector into a row (column)
vector. The transpose is denoted by “⊤”.

Example 1
$$\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}^\top = [1, 3, 2] \quad\text{and}\quad [1, 3, 2]^\top = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}$$

Hence transposing twice takes us back to where we started:

$$a = (a^\top)^\top$$

> t(a)

[,1] [,2] [,3]


[1,] 1 3 2

2.3 Multiplying a vector by a number


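If a is a vector and α is a number then αa is the vector
$$\alpha a = \begin{bmatrix} \alpha a_1 \\ \alpha a_2 \\ \vdots \\ \alpha a_n \end{bmatrix}$$

See Figure 2.

Example 2
$$7 \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 7 \\ 21 \\ 14 \end{bmatrix}$$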

> 7 * a

[1] 7 21 14

[Figure 2: Multiplication of a vector by a number, showing a1 = (2, 2), a2 = (1, −0.5), −a2 = (−1, 0.5) and 2a2 = (2, −1)]

2.4 Sum of vectors


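Let a and b be n–vectors. The sum a + b is the n–vector
$$a + b = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \\ \vdots \\ a_n + b_n \end{bmatrix} = b + a$$

See Figures 3 and 4. Only vectors of the same dimension can be added.

Example 3
$$\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} + \begin{bmatrix} 2 \\ 8 \\ 9 \end{bmatrix} = \begin{bmatrix} 1 + 2 \\ 3 + 8 \\ 2 + 9 \end{bmatrix} = \begin{bmatrix} 3 \\ 11 \\ 11 \end{bmatrix}$$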


[Figure 3: Addition of vectors, showing a1 = (2, 2), a2 = (1, −0.5) and a1 + a2 = (3, 1.5)]

[Figure 4: Addition of vectors and multiplication by a number, showing in addition −a2 = (−1, 0.5) and a1 + (−a2) = (1, 2.5)]

> a <- c(1, 3, 2)


> b <- c(2, 8, 9)
> a + b

[1] 3 11 11

2.5 (Inner) product of vectors


Let $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$. The (inner) product of a and b is

$$a \cdot b = a_1 b_1 + \cdots + a_n b_n$$

Note that the product is a number – not a vector.

> sum(a * b)

[1] 44
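The same inner product can also be obtained with matrix operations. The following is a small sketch using base R's `%*%` and `crossprod()`; the variable names `ip_sum`, `ip_mat` and `ip_cp` are only illustrative.

```r
a <- c(1, 3, 2)
b <- c(2, 8, 9)

# Elementwise product followed by a sum, as in the text
ip_sum <- sum(a * b)

# Row vector times column vector; drop() turns the 1 x 1 matrix into a number
ip_mat <- drop(t(a) %*% b)

# crossprod(a, b) computes t(a) %*% b
ip_cp <- drop(crossprod(a, b))

c(ip_sum, ip_mat, ip_cp)
```

All three approaches give the value 44 computed above.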

2.6 The length (norm) of a vector


The length (or norm) of a vector a is
$$\|a\| = \sqrt{a \cdot a} = \sqrt{\sum_{i=1}^{n} a_i^2}$$

> sqrt(sum(a * a))

[1] 3.741657
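If the length is needed repeatedly, the formula can be wrapped in a small function. This is a sketch only; the name `vec_norm` is made up for illustration and is not a base R function.

```r
# Hypothetical helper: Euclidean length of a numeric vector
vec_norm <- function(v) sqrt(sum(v * v))

a <- c(1, 3, 2)
len_a <- vec_norm(a)        # sqrt(1 + 9 + 4) = sqrt(14)

# Multiplying a vector by 7 multiplies its length by 7
len_7a <- vec_norm(7 * a)
all.equal(len_7a, 7 * len_a)
```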

2.7 The 0–vector and 1–vector
The 0–vector (1–vector) is a vector with 0 (1) in all entries. The 0–vector
(1–vector) is frequently written simply as 0 (1), or as $0_n$ ($1_n$) to emphasize
that its length is n.

> rep(0, 5)

[1] 0 0 0 0 0

> rep(1, 5)

[1] 1 1 1 1 1

2.8 Orthogonal (perpendicular) vectors


Two vectors v1 and v2 are orthogonal if their inner product is zero, written

v1 ⊥ v2 ⇔ v1 · v2 = 0

> v1 <- c(1, 1)


> v2 <- c(-1, 1)
> sum(v1 * v2)

[1] 0

3 Matrices
3.1 Matrices
An r × c matrix A (read “an r times c matrix”) is a table with r rows and c columns
$$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1c} \\ a_{21} & a_{22} & \dots & a_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ a_{r1} & a_{r2} & \dots & a_{rc} \end{bmatrix}$$

Note that one can regard A as consisting of c column vectors placed side by side:

$$A = [a_1 : a_2 : \cdots : a_c]$$

> A <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 3)


> A

[,1] [,2] [,3]


[1,] 1 2 8
[2,] 3 2 9

Note that the numbers 1, 3, 2, 2, 8, 9 are read into the matrix column–by–
column. To get the numbers read in row–by–row do

> A2 <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 3, byrow = TRUE)


> A2

[,1] [,2] [,3]


[1,] 1 3 2
[2,] 2 8 9

3.2 Multiplying a matrix with a number


For a number α and a matrix A, the product αA is the matrix obtained by
multiplying each element in A by α.

Example 4
$$7 \begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 9 \end{bmatrix} = \begin{bmatrix} 7 & 14 \\ 21 & 56 \\ 14 & 63 \end{bmatrix}$$


> 7 * A

[,1] [,2] [,3]


[1,] 7 14 56
[2,] 21 14 63

3.3 Transpose of matrices


A matrix is transposed by interchanging rows and columns. The transpose is denoted by
“⊤”.

Example 5
$$\begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 9 \end{bmatrix}^\top = \begin{bmatrix} 1 & 3 & 2 \\ 2 & 8 & 9 \end{bmatrix}$$

Note that if A is an r × c matrix then $A^\top$ is a c × r matrix.

> t(A)

[,1] [,2]
[1,] 1 3
[2,] 2 2
[3,] 8 9

3.4 Sum of matrices
Let A and B be r × c matrices. The sum A + B is the r × c matrix obtained
by adding A and B elementwise.
Only matrices with the same dimensions can be added.

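Example 6
$$\begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 9 \end{bmatrix} + \begin{bmatrix} 5 & 4 \\ 8 & 2 \\ 3 & 7 \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ 11 & 10 \\ 5 & 16 \end{bmatrix}$$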


> B <- matrix(c(5, 8, 3, 4, 2, 7), ncol = 3, byrow = TRUE)


> A + B

[,1] [,2] [,3]


[1,] 6 10 11
[2,] 7 4 16

3.5 Multiplication of a matrix and a vector


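Let A be an r × c matrix and let b be a c-dimensional column vector. The
product Ab is the r × 1 matrix
$$Ab = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1c} \\ a_{21} & a_{22} & \dots & a_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ a_{r1} & a_{r2} & \dots & a_{rc} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_c \end{bmatrix} = \begin{bmatrix} a_{11} b_1 + a_{12} b_2 + \cdots + a_{1c} b_c \\ a_{21} b_1 + a_{22} b_2 + \cdots + a_{2c} b_c \\ \vdots \\ a_{r1} b_1 + a_{r2} b_2 + \cdots + a_{rc} b_c \end{bmatrix}$$

Example 7
$$\begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 9 \end{bmatrix} \begin{bmatrix} 5 \\ 8 \end{bmatrix} = \begin{bmatrix} 1 \cdot 5 + 2 \cdot 8 \\ 3 \cdot 5 + 8 \cdot 8 \\ 2 \cdot 5 + 9 \cdot 8 \end{bmatrix} = \begin{bmatrix} 21 \\ 79 \\ 82 \end{bmatrix}$$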


> A %*% a

[,1]
[1,] 23
[2,] 27

Note the difference from


> A * a

[,1] [,2] [,3]


[1,] 1 4 24
[2,] 9 2 18

Figure out for yourself what is going on!
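One way to see what happens in A * a: R stores a matrix column by column, and `*` multiplies elementwise while recycling the shorter operand. The sketch below spells this out with the same A and a; the names `A_flat`, `a_recycled` and `elementwise` are only illustrative.

```r
A <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 3)
a <- c(1, 3, 2)

# The entries of A in storage (column-by-column) order
A_flat <- as.vector(A)

# a repeated until it is as long as A has entries
a_recycled <- rep(a, length.out = length(A))

# Multiply entry by entry and put the result back into the shape of A
elementwise <- matrix(A_flat * a_recycled, nrow = nrow(A))

all.equal(elementwise, A * a)
```

So A * a is not a matrix-vector product at all; it pairs the k-th stored entry of A with the k-th entry of the recycled a.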

3.6 Multiplication of matrices
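Let A be an r × c matrix and B a c × t matrix, i.e. $B = [b_1 : b_2 : \cdots : b_t]$. The
product AB is the r × t matrix given by:

$$AB = A[b_1 : b_2 : \cdots : b_t] = [Ab_1 : Ab_2 : \cdots : Ab_t]$$

Example 8
$$\begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 9 \end{bmatrix} \begin{bmatrix} 5 & 4 \\ 8 & 2 \end{bmatrix} = \left[ \begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 9 \end{bmatrix} \begin{bmatrix} 5 \\ 8 \end{bmatrix} : \begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 9 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \end{bmatrix} \right]$$
$$= \begin{bmatrix} 1 \cdot 5 + 2 \cdot 8 & 1 \cdot 4 + 2 \cdot 2 \\ 3 \cdot 5 + 8 \cdot 8 & 3 \cdot 4 + 8 \cdot 2 \\ 2 \cdot 5 + 9 \cdot 8 & 2 \cdot 4 + 9 \cdot 2 \end{bmatrix} = \begin{bmatrix} 21 & 8 \\ 79 & 28 \\ 82 & 26 \end{bmatrix}$$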

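Note that the product AB can only be formed if the number of rows in B and
the number of columns in A are the same. In that case, A and B are said to
be conformable.
In general AB and BA are not identical.
A mnemonic for matrix multiplication is to write B above and to the right of A; each
entry of the product is then the inner product of the row of A to its left and the
column of B above it:

$$\begin{array}{cc|cc} & & 5 & 4 \\ & & 8 & 2 \\ \hline 1 & 2 & 1 \cdot 5 + 2 \cdot 8 & 1 \cdot 4 + 2 \cdot 2 \\ 3 & 8 & 3 \cdot 5 + 8 \cdot 8 & 3 \cdot 4 + 8 \cdot 2 \\ 2 & 9 & 2 \cdot 5 + 9 \cdot 8 & 2 \cdot 4 + 9 \cdot 2 \end{array} \qquad\text{so}\qquad \begin{bmatrix} 1 & 2 \\ 3 & 8 \\ 2 & 9 \end{bmatrix} \begin{bmatrix} 5 & 4 \\ 8 & 2 \end{bmatrix} = \begin{bmatrix} 21 & 8 \\ 79 & 28 \\ 82 & 26 \end{bmatrix}$$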

> A <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 2)


> B <- matrix(c(5, 8, 4, 2), ncol = 2)
> A %*% B

[,1] [,2]
[1,] 21 8
[2,] 79 28
[3,] 82 26
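That AB and BA generally differ can be checked directly. The sketch below uses two small square matrices, P and Q, chosen only for illustration (they do not appear in the text), and also shows that for the A and B above the product BA cannot even be formed.

```r
P <- matrix(c(1, 2, 3, 4), ncol = 2)   # rows: (1, 3) and (2, 4)
Q <- matrix(c(0, 1, 1, 0), ncol = 2)   # rows: (0, 1) and (1, 0)

PQ <- P %*% Q   # multiplying by Q from the right swaps the columns of P
QP <- Q %*% P   # multiplying by Q from the left swaps the rows of P
PQ
QP

# With the 3 x 2 matrix A and the 2 x 2 matrix B, AB exists but BA does not
A <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 2)
B <- matrix(c(5, 8, 4, 2), ncol = 2)
BA_attempt <- try(B %*% A, silent = TRUE)   # fails: dimensions do not match
inherits(BA_attempt, "try-error")
```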

3.7 Vectors as matrices


One can regard a column vector of length r as an r × 1 matrix and a row
vector of length c as a 1 × c matrix.
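In R a plain vector has no dimensions, but it can be turned into a one-column or one-row matrix explicitly. A small sketch (the names `col_a` and `row_a` are only illustrative):

```r
a <- c(1, 3, 2)

col_a <- matrix(a, ncol = 1)   # a as a 3 x 1 matrix (column vector)
row_a <- matrix(a, nrow = 1)   # a as a 1 x 3 matrix (row vector)

dim(col_a)
dim(row_a)

# (1 x 3) times (3 x 1) gives a 1 x 1 matrix holding the inner product of a with itself
inner <- row_a %*% col_a

# (3 x 1) times (1 x 3) gives a 3 x 3 matrix
outer_prod <- col_a %*% row_a
dim(outer_prod)
```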

3.8 Some special matrices


– An n × n matrix is a square matrix.
– A matrix A is symmetric if $A = A^\top$.
– A matrix with 0 in all entries is the 0–matrix and is often written simply
as 0.
– A matrix consisting of 1s in all entries is often written J.
– A square matrix with 0 in all off–diagonal entries and elements $d_1, d_2, \dots, d_n$
on the diagonal is a diagonal matrix and is often written diag{$d_1, d_2, \dots, d_n$}.
– A diagonal matrix with 1s on the diagonal is called the identity matrix
and is denoted I. The identity matrix satisfies IA = AI = A.

• 0-matrix and 1-matrix


> matrix(0, nrow = 2, ncol = 3)

[,1] [,2] [,3]


[1,] 0 0 0
[2,] 0 0 0

> matrix(1, nrow = 2, ncol = 3)

[,1] [,2] [,3]


[1,] 1 1 1
[2,] 1 1 1

• Diagonal matrix and identity matrix

> diag(c(1, 2, 3))

[,1] [,2] [,3]


[1,] 1 0 0
[2,] 0 2 0
[3,] 0 0 3

> diag(1, 3)

[,1] [,2] [,3]


[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1

Note what happens when diag is applied to a matrix:

> diag(diag(c(1, 2, 3)))

[1] 1 2 3

> diag(A)

[1] 1 8
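Two of the properties listed above can be checked numerically. The following is a sketch using base R's `isSymmetric()` and `diag()`; the matrix S is made up for illustration. Note that for a non-square A the two identity matrices in IA = AI = A must have different sizes.

```r
# A symmetric matrix equals its own transpose
S <- matrix(c(2, 1, 1, 3), ncol = 2)
isSymmetric(S)
all.equal(S, t(S))

# Identity matrices of the appropriate sizes leave a 3 x 2 matrix unchanged
A <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 2)
I2 <- diag(2)   # 2 x 2 identity, used on the right
I3 <- diag(3)   # 3 x 3 identity, used on the left
all.equal(A %*% I2, A)
all.equal(I3 %*% A, A)
```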

3.9 Inverse of matrices
In general, the inverse of an n × n matrix A is the matrix B (which is also
n × n) which when multiplied with A gives the identity matrix I. That is,

$$AB = BA = I.$$

One says that B is A’s inverse and writes $B = A^{-1}$. Likewise, A is B’s inverse.

Example 9 Let
$$A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \qquad B = \begin{bmatrix} -2 & 1.5 \\ 1 & -0.5 \end{bmatrix}$$

Now AB = BA = I so $B = A^{-1}$.

Example 10 If A is a 1 × 1 matrix, i.e. a number, for example A = 4, then
$A^{-1} = 1/4$.

Some facts about inverse matrices are:

– Only square matrices can have an inverse, but not all square matrices
have an inverse.
– When the inverse exists, it is unique.
– Finding the inverse of a large matrix A is numerically complicated (but
computers do it for us).
In Section 3.14 the issue of matrix inversion is discussed in more detail.
Finding the inverse of a matrix in R is done using the solve() function:

> A <- matrix(c(1, 3, 2, 4), ncol = 2, byrow = TRUE)


> A

[,1] [,2]
[1,] 1 3
[2,] 2 4

> B <- solve(A)


> B

[,1] [,2]
[1,] -2 1.5
[2,] 1 -0.5

> A %*% B

[,1] [,2]
[1,] 1 0
[2,] 0 1
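Not every square matrix can be inverted. As a sketch of what happens in that case, the matrix S below (made up for illustration) has a second column equal to twice the first, and `solve()` signals an error, which is caught here with `tryCatch()`.

```r
# A square matrix without an inverse: column 2 is 2 times column 1
S <- matrix(c(1, 2, 2, 4), ncol = 2)

# solve() stops with an error for such a matrix; capture the message instead
result <- tryCatch(solve(S), error = function(e) conditionMessage(e))
result
```

Here `result` holds the error message rather than a matrix.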

3.10 Solving systems of linear equations
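Example 11 Matrices are closely related to systems of linear equations. Consider
the two equations

$$x_1 + 3x_2 = 7$$
$$2x_1 + 4x_2 = 10$$

The system can be written in matrix form

$$\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 10 \end{bmatrix} \quad\text{i.e.}\quad Ax = b$$

Multiplying both sides from the left by $A^{-1}$, and using that $A^{-1}A = I$ and $Ix = x$, we have

$$x = A^{-1}b = \begin{bmatrix} -2 & 1.5 \\ 1 & -0.5 \end{bmatrix} \begin{bmatrix} 7 \\ 10 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

A geometrical approach to solving these equations is as follows: Isolate $x_2$ in
the equations:
$$x_2 = \frac{7}{3} - \frac{1}{3} x_1 \qquad x_2 = \frac{10}{4} - \frac{2}{4} x_1$$
These two lines are shown in Figure 5, from which it can be seen that the
solution is $x_1 = 1$, $x_2 = 2$.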
[Figure 5: Solving two equations with two unknowns; the two lines are drawn with x1 on the horizontal axis]

From the figure it follows that there are three possible cases for the solutions of
the system:
1. Exactly one solution – when the lines intersect in one point
2. No solutions – when the lines are parallel but not identical
3. Infinitely many solutions – when the lines coincide.


> A <- matrix(c(1, 2, 3, 4), ncol = 2)


> b <- c(7, 10)
> x <- solve(A) %*% b
> x

[,1]
[1,] 1
[2,] 2
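The explicit inverse is not needed to solve the system: `solve()` also accepts the right-hand side as a second argument and then returns the solution of Ax = b directly. A short sketch with the same A and b:

```r
A <- matrix(c(1, 2, 3, 4), ncol = 2)
b <- c(7, 10)

# Solve A x = b without forming the inverse of A explicitly
x <- solve(A, b)
x

# Check: A x reproduces b
all.equal(drop(A %*% x), b)
```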

3.11 Trace
Missing

3.12 Determinant
Missing

3.13 Some additional rules for matrix operations


For matrices A, B and C whose dimensions match appropriately, the following rules
apply:
$(A + B)^\top = A^\top + B^\top$
$(AB)^\top = B^\top A^\top$
$A(B + C) = AB + AC$
$AB = AC \not\Rightarrow B = C$
In general $AB \neq BA$
$AI = IA = A$
If α is a number then $\alpha AB = A(\alpha B)$
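Some of these rules can be checked numerically on small examples. In the sketch below, A and B are the matrices from Section 3.6, while C, A0, B0 and C0 are made up for illustration.

```r
A <- matrix(c(1, 3, 2, 2, 8, 9), ncol = 2)   # 3 x 2
B <- matrix(c(5, 8, 4, 2), ncol = 2)         # 2 x 2
C <- matrix(c(1, 0, 2, 1), ncol = 2)         # 2 x 2

# Transpose of a product reverses the order of the factors
all.equal(t(A %*% B), t(B) %*% t(A))

# Distributive rule
all.equal(A %*% (B + C), A %*% B + A %*% C)

# A number can be moved between the factors (%*% binds tighter than *)
all.equal(3 * A %*% B, A %*% (3 * B))

# AB = AC does not imply B = C: A0 wipes out the second row of whatever it multiplies
A0 <- matrix(c(1, 0, 0, 0), ncol = 2)
B0 <- matrix(c(1, 3, 2, 4), ncol = 2)
C0 <- matrix(c(1, 5, 2, 6), ncol = 2)
all.equal(A0 %*% B0, A0 %*% C0)   # the products agree
isTRUE(all.equal(B0, C0))         # but B0 and C0 differ
```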

3.14 Details on inverse matrices*


3.14.1 Inverse of a 2 × 2 matrix*
It is easy to find the inverse of a 2 × 2 matrix. When
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
then the inverse is
$$A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$
under the assumption that $ad - bc \neq 0$. The number $ad - bc$ is called the determinant
of A, sometimes written |A|. If |A| = 0, then A has no inverse.
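The 2 × 2 formula is short enough to write out as a function and compare with `solve()`. This is a sketch only; `inv2x2` is a made-up name, not a base R function.

```r
# Hypothetical helper implementing the 2 x 2 inverse formula
inv2x2 <- function(M) {
  stopifnot(is.matrix(M), all(dim(M) == c(2, 2)))
  det_M <- M[1, 1] * M[2, 2] - M[1, 2] * M[2, 1]   # ad - bc
  if (det_M == 0) stop("determinant is zero: no inverse exists")
  # matrix() fills column by column: first column (d, -c), second column (-b, a)
  matrix(c(M[2, 2], -M[2, 1], -M[1, 2], M[1, 1]), ncol = 2) / det_M
}

A <- matrix(c(1, 3, 2, 4), ncol = 2, byrow = TRUE)
A_inv <- inv2x2(A)
A_inv
all.equal(A_inv, solve(A))
```

For this A the determinant is 1 · 4 − 3 · 2 = −2, and the result agrees with the matrix B of Example 9.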

3.14.2 Inverse of diagonal matrices*


Finding the inverse of a diagonal matrix is easy: Let

$$A = \mathrm{diag}(a_1, a_2, \dots, a_n)$$

where all $a_i \neq 0$. Then the inverse is

$$A^{-1} = \mathrm{diag}\left(\frac{1}{a_1}, \frac{1}{a_2}, \dots, \frac{1}{a_n}\right)$$
If one $a_i = 0$ then $A^{-1}$ does not exist.
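In R this amounts to inverting the diagonal entries one by one. A small sketch with made-up diagonal values:

```r
d <- c(2, 4, 5)           # diagonal entries, all non-zero
D <- diag(d)
D_inv <- diag(1 / d)      # invert each diagonal entry

all.equal(D %*% D_inv, diag(3))   # the product is the identity matrix
all.equal(D_inv, solve(D))        # agrees with the general-purpose solve()
```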

3.14.3 Generalized inverse*


Not all square matrices have an inverse. However, every square matrix has at least
one generalized inverse, and a matrix without an inverse has infinitely many. A
generalized inverse of a square matrix A is a matrix $A^-$ satisfying
$$A A^- A = A.$$
For many practical problems it suffices to find a generalized inverse.
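The defining property is easy to check numerically. In the sketch below, S is a made-up matrix without an inverse, and G1 and G2 are two different matrices, found by hand for this illustration, that both satisfy the condition.

```r
# S has no inverse: its second column is twice the first
S <- matrix(c(1, 2, 2, 4), ncol = 2)

# Two different candidates for a generalized inverse of S
G1 <- matrix(c(1, 0, 0, 0), ncol = 2)
G2 <- matrix(c(0, 0, 0, 0.25), ncol = 2)

# Both satisfy S G S = S
all.equal(S %*% G1 %*% S, S)
all.equal(S %*% G2 %*% S, S)
```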

3.14.4 Inverting an n × n matrix*
In the following we will illustrate one frequently applied method for matrix
inversion. The method is called Gauss–Jordan elimination, and many computer programs,
including solve(), use variants of Gaussian elimination for finding the inverse of
an n × n matrix.
Consider the matrix A:
> A <- matrix(c(2, 2, 3, 3, 5, 9, 5, 6, 7), ncol = 3)
> A

[,1] [,2] [,3]


[1,] 2 3 5
[2,] 2 5 6
[3,] 3 9 7

We want to find the matrix $B = A^{-1}$. To start, we append to A the identity matrix
and call the result AB:
> AB <- cbind(A, diag(c(1, 1, 1)))
> AB

[,1] [,2] [,3] [,4] [,5] [,6]


[1,] 2 3 5 1 0 0
[2,] 2 5 6 0 1 0
[3,] 3 9 7 0 0 1

On a matrix we allow ourselves to do the following three operations (sometimes
called elementary operations) as often as we want:
1. Multiply a row by a (non–zero) constant.
2. Multiply a row by a (non–zero) constant and add the result to another row.
3. Interchange two rows.
The aim is to perform such operations on AB in a way such that one ends up with
a 3 × 6 matrix which has the identity matrix in the three leftmost columns. The
three rightmost columns will then contain $B = A^{-1}$.
Recall that writing e.g. AB[1,] extracts the entire first row of AB.
• First, we make sure that AB[1,1]=1. Then we subtract a constant times the
first row from the second to obtain that AB[2,1]=0, and similarly for the third
row:
> AB[1, ] <- AB[1, ]/AB[1, 1]
> AB[2, ] <- AB[2, ] - 2 * AB[1, ]
> AB[3, ] <- AB[3, ] - 3 * AB[1, ]
> AB

[,1] [,2] [,3] [,4] [,5] [,6]


[1,] 1 1.5 2.5 0.5 0 0
[2,] 0 2.0 1.0 -1.0 1 0
[3,] 0 4.5 -0.5 -1.5 0 1

• Next we ensure that AB[2,2]=1. Afterwards we subtract a constant times the
second row from the third to obtain that AB[3,2]=0:

> AB[2, ] <- AB[2, ]/AB[2, 2]


> AB[3, ] <- AB[3, ] - 4.5 * AB[2, ]

• Now we rescale the third row such that AB[3,3]=1:

> AB[3, ] <- AB[3, ]/AB[3, 3]


> AB

[,1] [,2] [,3] [,4] [,5] [,6]


[1,] 1 1.5 2.5 0.5000000 0.0000000 0.0000000
[2,] 0 1.0 0.5 -0.5000000 0.5000000 0.0000000
[3,] 0 0.0 1.0 -0.2727273 0.8181818 -0.3636364

Then AB has zeros below the main diagonal.


• We then work our way up to obtain that AB has zeros above the main diagonal:

> AB[2, ] <- AB[2, ] - 0.5 * AB[3, ]


> AB[1, ] <- AB[1, ] - 2.5 * AB[3, ]
> AB

[,1] [,2] [,3] [,4] [,5] [,6]


[1,] 1 1.5 0 1.1818182 -2.04545455 0.9090909
[2,] 0 1.0 0 -0.3636364 0.09090909 0.1818182
[3,] 0 0.0 1 -0.2727273 0.81818182 -0.3636364

> AB[1, ] <- AB[1, ] - 1.5 * AB[2, ]


> AB

[,1] [,2] [,3] [,4] [,5] [,6]


[1,] 1 0 0 1.7272727 -2.18181818 0.6363636
[2,] 0 1 0 -0.3636364 0.09090909 0.1818182
[3,] 0 0 1 -0.2727273 0.81818182 -0.3636364

Now we extract the three rightmost columns of AB into the matrix B. We claim that
B is the inverse of A, and this can be verified by a simple matrix multiplication

> B <- AB[, 4:6]


> A %*% B

[,1] [,2] [,3]


[1,] 1.000000e+00 3.330669e-16 1.110223e-16
[2,] -4.440892e-16 1.000000e+00 2.220446e-16
[3,] -2.220446e-16 9.992007e-16 1.000000e+00

So, apart from rounding errors, the product is the identity matrix, and hence
$B = A^{-1}$. This example illustrates that numerical precision and rounding errors
are important issues when writing computer programs.
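The row operations carried out by hand above can be collected in a loop. The function below is a minimal sketch, not the code used by solve(); the name `gauss_jordan_inverse` is made up. It differs from the walk-through in two ways: it clears each column above and below the diagonal in one pass, and it interchanges rows (operation 3) so that the entry used for scaling is as large as possible.

```r
# Hypothetical helper: invert a square matrix by elementary row operations
gauss_jordan_inverse <- function(A) {
  stopifnot(is.matrix(A), nrow(A) == ncol(A))
  n <- nrow(A)
  AB <- cbind(A, diag(n))               # append the identity matrix
  for (j in seq_len(n)) {
    # Operation 3: move the row with the largest |entry| in column j up to row j
    p <- which.max(abs(AB[j:n, j])) + j - 1
    if (AB[p, j] == 0) stop("matrix has no inverse")   # crude exact-zero check
    if (p != j) AB[c(j, p), ] <- AB[c(p, j), ]
    # Operation 1: rescale row j so that AB[j, j] becomes 1
    AB[j, ] <- AB[j, ] / AB[j, j]
    # Operation 2: subtract multiples of row j to get zeros elsewhere in column j
    for (i in seq_len(n)[-j]) {
      AB[i, ] <- AB[i, ] - AB[i, j] * AB[j, ]
    }
  }
  AB[, (n + 1):(2 * n), drop = FALSE]   # the rightmost n columns hold the inverse
}

A <- matrix(c(2, 2, 3, 3, 5, 9, 5, 6, 7), ncol = 3)
B <- gauss_jordan_inverse(A)
all.equal(B, solve(A))
all.equal(A %*% B, diag(3))
```

Comparing with `all.equal()` rather than `==` allows for the small rounding errors seen above.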

4 Least squares
Consider the table of pairs (xi , yi ) below.

x   1.00   2.00   3.00   4.00   5.00
y   3.70   4.20   4.90   5.70   6.00

A plot of $y_i$ against $x_i$ is shown in Figure 6.

[Figure 6: Regression (y plotted against x)]

The plot in Figure 6 suggests an approximately linear relationship between y and
x, i.e.
$$y_i \approx \beta_0 + \beta_1 x_i \quad\text{for } i = 1, \dots, 5$$
Writing this in matrix form gives
$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_5 \end{bmatrix} \approx \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_5 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = X\beta$$
The first question is: Can we find a vector β such that y = Xβ? The answer is
clearly no, because that would require the points to lie exactly on a straight line.
A more modest question is: Can we find a vector $\hat\beta$ such that $X\hat\beta$ is in a sense “as
close to y as possible”? The answer is yes. The task is to find the vector β for which
the length of the vector
$$e = y - X\beta$$
is as small as possible. The solution is
$$\hat\beta = (X^\top X)^{-1} X^\top y$$

> y

[1] 3.7 4.2 4.9 5.7 6.0

> X

x
[1,] 1 1
[2,] 1 2
[3,] 1 3
[4,] 1 4
[5,] 1 5
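The objects y and X printed above are not constructed anywhere in the text. One way they might have been created from the table of pairs is sketched below; the use of `cbind(1, x)` is an assumption, chosen because it yields a column of ones followed by a column labelled x, as in the printout.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(3.7, 4.2, 4.9, 5.7, 6.0)

# Column of ones (for the intercept) next to the x values
X <- cbind(1, x)

dim(X)
colnames(X)
```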

> beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y
> beta.hat

[,1]
3.07
x 0.61
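As a cross-check, the same estimate can be obtained without forming the inverse explicitly, and compared with R's built-in `lm()`. The sketch below rebuilds x, y and X so that it runs on its own; the names `beta_normal`, `fit` and `e` are only illustrative.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(3.7, 4.2, 4.9, 5.7, 6.0)
X <- cbind(1, x)

# Solve (X'X) beta = X'y directly; crossprod(X) is t(X) %*% X
beta_normal <- solve(crossprod(X), crossprod(X, y))
beta_normal

# The built-in linear model fit gives the same intercept and slope
fit <- lm(y ~ x)
coef(fit)

# The residual vector is orthogonal to both columns of X (up to rounding)
e <- y - X %*% beta_normal
crossprod(X, e)
```

The last line ties back to Section 2.8: at the least squares solution the inner product of e with each column of X is zero.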

5 A neat little exercise – from a bird’s perspective


On a sunny day, two tables are standing in an English country garden. On each
table birds of unknown species are sitting having the time of their lives.
A bird from the first table says to those on the second table: “Hi – if one of you
comes to our table then there will be the same number of us on each table”. “Yeah,
right”, says a bird from the second table, “but if one of you comes to our table, then
we will be twice as many on our table as on yours”.
Question: How many birds are on each table? More specifically,
• Write up two equations with two unknowns.
• Solve these equations using the methods you have learned from linear algebra.
• Simply finding the solution by trial–and–error is considered cheating.
