
STATS 380: Matrices

Matrices

A matrix is a rectangular arrangement of values which are all
of the same basic type. For example,

\[
\begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}
\]

arranges the numbers 1, ..., 6 in three rows and two columns.
This is a 3 × 2 matrix of numbers.

In R, the elements of a matrix are stored in a vector with the
elements of the first column coming first followed by those of
the second, and so on.

This is known as column-major order storage.
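As a small illustration of column-major storage (a sketch using only base R, not taken from the slides), dropping the dimensions of a matrix recovers its elements column by column:

```r
# Build the 3 x 2 matrix from the slide and inspect its underlying vector.
m = matrix(1:6, nrow = 3, ncol = 2)

# as.vector() drops the dimensions and returns the stored elements in order.
v = as.vector(m)
print(v)

# In column-major order, element [i, j] sits at position i + (j - 1) * nrow(m).
i = 2; j = 2
pos = i + (j - 1) * nrow(m)
print(m[i, j] == v[pos])
```

The position formula is the reason a matrix can also be indexed with a single subscript.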

Creating Matrices

The matrix on the previous slide can be created as follows.

> matrix(1:6, nrow = 3, ncol = 2)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

An optional argument byrow=TRUE can be used to arrange the
vector of values by row.

> matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6

Matrix Elements and Recycling

The first argument to the matrix function contains the
elements which are to form the matrix. If there are not enough
elements in the argument to create the matrix, the recycling
rule is applied to obtain more.

A 2 × 3 matrix filled with 1s can be obtained as follows.

> matrix(1, nrow = 2, ncol = 3)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1

Optional Dimension Specifications

Often R can work out the number of rows in a matrix given the
number of columns and the elements, or the number of
columns in a matrix given the number of rows and the
elements. In such cases it is not necessary to specify both the
number of rows and the number of columns.

> matrix(1:6, nrow = 3)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

Determining Matrix Dimensions

The number of rows and columns of a matrix can be obtained
with the functions nrow and ncol, or together with
the function dim, which returns a vector containing the number
of rows as its first element and the number of columns as its
second.

> x = matrix(1:6, ncol = 2)
> nrow(x)
[1] 3
> ncol(x)
[1] 2
> dim(x)
[1] 3 2

Creating Matrices from Rows and Columns

Matrices can be created by gluing together rows with rbind
or gluing together columns with cbind.

> cbind(1:3, 4:6)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

> rbind(1:3, 4:6)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6

Binding Rows and Columns; Recycling

The arguments to rbind and cbind are not required to be the
same length. When they are not, a matrix is created
which is big enough to accommodate the largest argument,
and the others have the recycling rule applied to them to
supply additional elements. Apparent mismatches produce
warnings.

> rbind(1:2, 1:3)
     [,1] [,2] [,3]
[1,]    1    2    1
[2,]    1    2    3
Warning message:
In rbind(1:2, 1:3) :
  number of columns of result is not a multiple of vector length (arg 1)

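For contrast with the warning case, here is a brief sketch (the values are chosen only for illustration) of bindings where the shorter argument recycles evenly, so no warning is produced:

```r
# A length-1 argument is recycled to fill a whole column.
a = cbind(1:3, 0)
print(a)

# A length-2 argument recycles evenly into a row of length 4.
b = rbind(1:4, 1:2)
print(b)
```

In both cases the longer length is a whole multiple of the shorter one, which is what R checks before warning.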
Matrices and Naming

It is possible to attach row and column labels to matrices.
Row names can be attached with rownames, column names
with colnames, and both can be attached simultaneously with
dimnames.

> x = matrix(1:6, nrow = 2)
> dimnames(x) = list(c("First", "Second"),
                     c("A", "B", "C"))
> x
       A B C
First  1 3 5
Second 2 4 6

Extracting Names

Names can also be extracted with dimnames, rownames and
colnames.

> dimnames(x)
[[1]]
[1] "First"  "Second"

[[2]]
[1] "A" "B" "C"

> rownames(x)
[1] "First"  "Second"
> colnames(x)
[1] "A" "B" "C"

Extracting Matrix Elements

The ij-th element of a matrix x can be extracted with the
expression x[i, j].

This means, for example, that we could sum all the elements
in a matrix x with the following code.

> s = 0
> for(i in 1:nrow(x))
      for(j in 1:ncol(x))
          s = s + x[i, j]

Because matrices are just vectors with additional
dimensioning information it is actually more efficient to use

> s = sum(x)

Matrix Subsets

Expressions of the form x[i, j] can also be used to extract
more general subsets of the elements of a matrix by specifying
vector subscripts.

> (x = matrix(1:12, nrow = 3, ncol = 4))
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> x[1:2, c(2, 4)]
     [,1] [,2]
[1,]    4   10
[2,]    5   11

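The equivalence between the element-by-element loop and sum(x) can be checked directly. The following is a minimal sketch using the 3 × 4 matrix from the slides:

```r
x = matrix(1:12, nrow = 3, ncol = 4)

# Element-by-element sum using x[i, j].
s = 0
for (i in 1:nrow(x))
    for (j in 1:ncol(x))
        s = s + x[i, j]

# sum() works on the underlying vector and gives the same total.
print(s == sum(x))
```

The loop visits each element once through two subscripts, while sum works on the stored vector in a single call.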
Assigning to Matrix Subsets

It is also possible to assign to subsets of matrices.

> x[1:2, c(2, 4)] = 21:24
> x
     [,1] [,2] [,3] [,4]
[1,]    1   21    7   23
[2,]    2   22    8   24
[3,]    3    6    9   12
> x[2:1, c(2, 4)] = 21:24
> x
     [,1] [,2] [,3] [,4]
[1,]    1   22    7   24
[2,]    2   21    8   23
[3,]    3    6    9   12

Specifying Entire Rows and Columns

When a subscript is omitted, it is taken to correspond to all
possible values. This works for extracting values from matrices
and for assigning to them.

> x = matrix(1:12, nrow = 3, ncol = 4)
> x[1,] = 100
> x
     [,1] [,2] [,3] [,4]
[1,]  100  100  100  100
[2,]    2    5    8   11
[3,]    3    6    9   12

Subsetting Matrices as Vectors

Because matrices are just vectors with additional
dimensioning information they can be treated as vectors.

> x = matrix(1:6, nrow = 2, ncol = 3)
> x[7]
[1] NA

The functions row and col return matrices indicating the row
and column of each element. This can be used to extract or
change submatrices.

> x[row(x) < col(x)] = 0
> x
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    2    4    0

An Example

Here is how to create a 4 × 4 tridiagonal matrix with diagonal
values being 2 and the off-diagonal elements being 1.

> x = matrix(0, nrow = 4, ncol = 4)
> x[row(x) == col(x)] = 2
> x[abs(row(x) - col(x)) == 1] = 1
> x
     [,1] [,2] [,3] [,4]
[1,]    2    1    0    0
[2,]    1    2    1    0
[3,]    0    1    2    1
[4,]    0    0    1    2

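To see what row and col actually return, the index matrices can be printed for a small case. This is an illustrative sketch; the variable name upper is an assumption made for the example:

```r
x = matrix(1:6, nrow = 2, ncol = 3)

print(row(x))   # each entry holds its own row number
print(col(x))   # each entry holds its own column number

# Logical matrix marking the entries strictly above the diagonal.
upper = row(x) < col(x)

# Logical subscripting extracts the marked entries in column-major order.
print(x[upper])
```

Because the comparison produces a logical matrix of the same shape as x, it can be used both to extract and to assign.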
Simple Computations with Matrices

The basic mathematical manipulations on matrices are defined
by using the fact that the operations are defined for the
underlying vectors.

> (x = matrix(1:4, nrow = 2, ncol = 2))
     [,1] [,2]
[1,]    1    3
[2,]    2    4

> x + x^2
     [,1] [,2]
[1,]    2   12
[2,]    6   20

Combining Vectors and Matrices

When a vector is added to a matrix, the recycling rule is used
to expand the vector so that it matches the number of elements
in the matrix.

> x + 1
     [,1] [,2]
[1,]    2    4
[2,]    3    5

> x + 1:2
     [,1] [,2]
[1,]    2    4
[2,]    4    6

Combining Vectors and Matrices

Some checks are carried out to try to make sure that operations
are sensible. This can produce warnings.

> x + 1:3
     [,1] [,2]
[1,]    2    6
[2,]    4    5
Warning message:
In x + 1:3 :
  longer object length is not a multiple of shorter object length

Note that it is an error to try to combine a matrix with a vector
which has more elements than the matrix.

Row and Column Summaries

A common thing to want to do with matrices is to obtain a
vector containing a numerical summary for each row (or each
column). It is possible to do this by using for loops.

As a simple example, we'll look at computing row (or
column) means. We can do this as follows:

> rm = numeric(nrow(x))
> for(i in 1:nrow(x))
      rm[i] = mean(x[i,])

Notice that there is nothing special about using the mean
function here. The same code will work for any numerical
summary.

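The same looping pattern carries over to columns and to other summaries. A minimal sketch, assuming the 2 × 2 matrix x used on these slides and a hypothetical result vector cmax:

```r
x = matrix(1:4, nrow = 2, ncol = 2)

# Column maxima computed with an explicit loop over the columns.
cmax = numeric(ncol(x))
for (j in 1:ncol(x))
    cmax[j] = max(x[, j])

print(cmax)
```

Only the loop range, the subscript position and the summary function change relative to the row-means loop.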
The "Apply" Mechanism

Because this is such a common kind of task, R has a special
way of carrying out this kind of operation.

The function apply can be used to compute row or column
summaries. The expression

    apply(matrix, 1, summary)

computes row summaries of the type specified by summary for
the matrix specified by matrix. Similarly,

    apply(matrix, 2, summary)

computes column summaries.

Computing Row and Column Means

Row and column means of x can be computed as follows:

> apply(x, 1, mean)
[1] 2 3

> apply(x, 2, mean)
[1] 1.5 3.5

There is no difficulty in substituting any other summary
function in place of mean. For example, the row standard
deviations can be computed as follows.

> apply(x, 1, sd)
[1] 1.414214 1.414214

Apply and Anonymous Functions

In the preceding examples, we've used a function name to
specify which summary function the apply mechanism
should use. It is also possible to specify the function directly
by using its definition as the argument to apply. Here is how
we can compute the column sums of squares for x.

> apply(x, 2, function(x) sum(x^2))
[1]  5 25

Computing the sums of squares about the column means is
just as easy.

> apply(x, 2, function(x) sum((x - mean(x))^2))
[1] 0.5 0.5

More Complex Summaries

When the summary function returns a vector of several values
rather than a single value, the result of apply is a matrix.
The first subscript of the result ranges over the set of values
returned by the summary function while the second specifies
which row or column the summary function is being computed
for.

> apply(x, 1, range)
     [,1] [,2]
[1,]    1    2
[2,]    3    4

> apply(x, 2, range)
     [,1] [,2]
[1,]    1    3
[2,]    2    4

Additional Arguments to Summaries

It is also possible to specify additional arguments for the
summary function as extra arguments to the apply call. To see
an example, let's return to the mean function, which has an
optional argument trim, which specifies a fraction of the
observations to be trimmed from its argument before the mean is
computed. A fraction trim of the observations is removed from
each end (the smallest and the largest values) before the mean
is computed. A value of trim=.1 could be specified in a call to
apply as follows.

    apply(matrix, margin, mean, trim = .1)

Sweeping Out Summaries

Once summaries have been computed, it is common to want to
subtract out those summaries to obtain residual values. As
with computing the summaries themselves, this can be done
by looping over the rows or columns, as appropriate. Because
this is quite a common task, there is a special function called
sweep which can be used to sweep out computed summaries.
To subtract out row summaries from the rows of a matrix use a
call of the form

    sweep(matrix, 1, apply(matrix, 1, summary))

and to subtract out column summaries use a call of the form

    sweep(matrix, 2, apply(matrix, 2, summary))
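As a concrete instance of the row form of the call, here is a minimal sketch that centres each row of a small matrix on its mean. The data values and the names rmeans and centred are made up for illustration:

```r
x = matrix(c( 1,  2,  3,
             10, 20, 60), nrow = 2, byrow = TRUE)

# Row means, then subtract each row's mean from that row.
rmeans = apply(x, 1, mean)
centred = sweep(x, 1, rmeans)
print(centred)

# After sweeping, every row mean should be (essentially) zero.
print(apply(centred, 1, mean))
```

Subtraction is the default operation of sweep, so no fourth argument is needed here.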

Other Forms of Sweeping

By default, sweep will subtract out the summaries, but it is
possible to use any other binary operation instead. For
example, the following statement forms will divide through by
the summary, rather than subtracting.

    sweep(matrix, 1, apply(matrix, 1, summary), "/")
    sweep(matrix, 2, apply(matrix, 2, summary), "/")

As a concrete example, here is how to divide the columns of a
matrix by their column means.

> x = cbind(1:3, 9:11)
> sweep(x, 2, apply(x, 2, mean), "/")
     [,1] [,2]
[1,]  0.5  0.9
[2,]  1.0  1.0
[3,]  1.5  1.1

A Statistical Example

Consider the following data showing child mortality in regions
of the United States, all races, 1964–1966 (deaths per 1000
live births). This is typical of data encountered in statistical
data analysis.

> mortality
               Father's Education (Years)
Region           <=8 9-11   12 13-15 >=16
  Northeast     25.3 25.3 18.2  18.3 16.3
  North Central 32.1 29.0 18.8  24.3 19.0
  South         38.8 31.0 19.3  15.7 16.8
  West          25.4 21.1 20.3  24.0 17.5

Notice that in this case, the row and column names are
themselves named (by the strings "Region" and
"Father's Education (Years)" respectively).

Creating The Matrix

> mortality =
      matrix(c(25.3, 25.3, 18.2, 18.3, 16.3,
               32.1, 29.0, 18.8, 24.3, 19.0,
               38.8, 31.0, 19.3, 15.7, 16.8,
               25.4, 21.1, 20.3, 24.0, 17.5),
             nrow = 4, byrow = TRUE,
             dimnames = list(
                 Region =
                     c("Northeast", "North Central",
                       "South", "West"),
                 "Father's Education (Years)" =
                     c("<=8", "9-11", "12",
                       "13-15", ">=16")))

An "Overall + Rows + Columns" Model

A common way of dealing with a table of values like this is to
try to explain the pattern in the table in terms of some
simpler underlying structure. In the case of the mortality data,
we will look at how to fit an overall plus row-effect plus
column-effect model to describe these values. This means that
we will seek to represent the values $y_{ij}$ in the matrix by

\[
y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}
\]

where $\mu$ represents the overall level of the data, $\alpha_i$ is an effect
common to all the values in the i-th row, $\beta_j$ is a value common
to all values in the j-th column and the $\varepsilon_{ij}$ are (small) residuals
which describe the deviations between what the model
predicts and the observed values.

Fitting By Sweeping Out Effects

We will fit this model by progressively sweeping out the
effects from a set of working residuals.

To begin, we estimate the overall level as the average value
and then subtract it from the original values.

This gives a set of deviations about the overall mean.

> r = mortality
> (mu = mean(r))
[1] 22.825
> r = r - mu

Sweeping Out Row Effects

Next we compute the row effects as the row means of the
working residuals r and subtract them from r.

The easy way to do this is using apply and sweep.

> (alpha = apply(r, 1, mean))
    Northeast North Central         South
       -2.145         1.815         1.495
         West
       -1.165
> r = sweep(r, 1, alpha)

Sweeping Out Column Effects

Finally, we extract the column effects as the column means of
the working residuals r and subtract them from r.

Again, this can be done with sweep.

> (beta = apply(r, 2, mean))
   <=8   9-11     12  13-15   >=16
 7.575  3.775 -3.675 -2.250 -5.425
> r = sweep(r, 2, beta)

The Residuals

After completing the process, we are left with a set of
residuals from the fit.

> r
               Father's Education (Years)
Region             <=8   9-11     12 13-15   >=16
  Northeast     -2.955  0.845  1.195 -0.13  1.045
  North Central -0.115  0.585 -2.165  1.91 -0.215
  South          6.905  2.905 -1.345 -6.37 -2.095
  West          -3.835 -4.335  2.315  4.59  1.265

Remaining Effects?

To see that all the effects have been swept out of the residuals
we can compute the row and column means of the residuals.

> round(apply(r, 1, mean), 4)
    Northeast North Central         South
            0             0             0
         West
            0

> round(apply(r, 2, mean), 4)
  <=8  9-11    12 13-15  >=16
    0     0     0     0     0

These are (essentially) zero.

Packaging as a Function

> twoway =
      function(y)
      {
          mu = mean(y)
          y = y - mu
          alpha = apply(y, 1, mean)
          y = sweep(y, 1, alpha)
          beta = apply(y, 2, mean)
          y = sweep(y, 2, beta)
          list(overall = mu, rows = alpha,
               cols = beta, residuals = y)
      }

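A brief usage sketch for twoway follows. It repeats the function and the mortality values from the slides so that it runs on its own; the dimnames are left out for brevity, and the name fitted is an assumption made for this example. It checks that overall + row effect + column effect + residual reproduces the original table:

```r
mortality = matrix(c(25.3, 25.3, 18.2, 18.3, 16.3,
                     32.1, 29.0, 18.8, 24.3, 19.0,
                     38.8, 31.0, 19.3, 15.7, 16.8,
                     25.4, 21.1, 20.3, 24.0, 17.5),
                   nrow = 4, byrow = TRUE)

twoway = function(y) {
    mu = mean(y)                    # overall level
    y = y - mu
    alpha = apply(y, 1, mean)       # row effects
    y = sweep(y, 1, alpha)
    beta = apply(y, 2, mean)        # column effects
    y = sweep(y, 2, beta)
    list(overall = mu, rows = alpha, cols = beta, residuals = y)
}

fit = twoway(mortality)

# Fitted values: mu + alpha_i + beta_j for every cell of the table.
fitted = fit$overall + outer(fit$rows, fit$cols, "+")

# Fitted values plus residuals should give back the data.
print(all.equal(fitted + fit$residuals, mortality))
```

Returning the pieces in a list keeps the effects and the residuals available for later inspection.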
Commenting

I omitted comments from the function on the previous slide in
order to save space. When writing computer code it is vital to
include informative comments.

    ...

    ## ... Compute and remove the overall mean

    mu = mean(y)
    y = y - mu

    ## ... Compute and remove the row effects

    alpha = apply(y, 1, mean)
    y = sweep(y, 1, alpha)
    ...

(Generalized) Outer Products

The function outer provides another useful utility in
connection with matrix calculations.

In mathematics, the outer product of two vectors x and y is a
matrix whose ij-th element is

\[
x_i y_j .
\]

outer generalizes this so that, when given a function f, the
computed result is a matrix whose ij-th element is

\[
f(x_i, y_j).
\]

The default function value in R is $f(x, y) = xy$, which
produces an ordinary outer product.

Example: A Multiplication Table

Here is a small multiplication table (truncated at 8 so that it
fits on a slide).

> outer(1:8, 1:8)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    2    3    4    5    6    7    8
[2,]    2    4    6    8   10   12   14   16
[3,]    3    6    9   12   15   18   21   24
[4,]    4    8   12   16   20   24   28   32
[5,]    5   10   15   20   25   30   35   40
[6,]    6   12   18   24   30   36   42   48
[7,]    7   14   21   28   35   42   49   56
[8,]    8   16   24   32   40   48   56   64

A More Complex Example

• Using apply and outer together provides a very
  powerful way of carrying out quite complex
  computations.

• In this example we'll consider the problem of finding
  the closest match for each value $y_1, \ldots, y_m$ in a set of
  values $x_1, \ldots, x_n$ (a kind of nearest neighbour
  regression).

• While the problem can be carried out using nested for
  loops, there is a "one-liner" solution.

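As a small sketch of the generalized form of outer (the examples are illustrative and not taken from the slides), the third argument can name an operator or supply any vectorised function of two arguments:

```r
# Addition table: element [i, j] is i + j.
add = outer(1:3, 1:3, "+")
print(add)

# Absolute differences: element [i, j] is |i - j|.
d = outer(1:3, 1:4, function(a, b) abs(a - b))
print(d)
```

The function is called once on two long vectors rather than once per cell, so it needs to be vectorised.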
The which Function

• Recall that the which function accepts a vector of
  logical values and returns a vector that indicates which
  indices correspond to true values.

> x = c(-1, 0, 1)
> which(x >= 0)
[1] 2 3
> which(x == 0)
[1] 2

• The which.min (or which.max) function returns the
  index of the first minimum (or maximum) in a vector.

> which.min(x)
[1] 1
> which.max(x)
[1] 3

Nearest Neighbour Matches

The algorithm is as follows:

1. Form the matrix D so that the ij-th element of D is the
   distance between $y_i$ and $x_j$ (this can be done with
   outer).

2. Determine the column index of the smallest value in
   each row of D.

3. Return the x value corresponding to this index.
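For comparison with the outer-based approach, the nearest-neighbour search can also be written with nested for loops. This is a sketch only; the function name nearest_loop is hypothetical, and the y values are the ones that appear in the slides' sample run:

```r
# Loop-based nearest-neighbour match (hypothetical helper for illustration).
nearest_loop = function(x, y) {
    result = numeric(length(y))
    for (i in seq_along(y)) {
        best = 1                       # index of the closest x found so far
        for (j in seq_along(x))
            if (abs(y[i] - x[j]) < abs(y[i] - x[best]))
                best = j
        result[i] = x[best]
    }
    result
}

x = 0:10/10
y = c(0.1713, 0.9631, 0.0906, 0.5987)
print(nearest_loop(x, y))
```

The strict inequality keeps the first of any tied minima, which is also how which.min behaves.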

Nearest Neighbour Matches

> nearest2 = function(x, y) {   # two-liner
      d = outer(x, y, function(x, y) abs(x - y))
      x[apply(d, 2, which.min)]
  }

> nearest = function(x, y)      # one-liner
      x[apply(outer(x, y, function(x, y)
                    abs(x - y)), 2, which.min)]

> (x = 0:10/10)
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> (y = round(runif(4), 4))
[1] 0.1713 0.9631 0.0906 0.5987
> nearest(x, y)
[1] 0.2 1.0 0.1 0.6

Matrix Transposes

Often it is useful to be able to interchange the role of rows and
columns in a matrix. This is referred to as taking the transpose
of a matrix. R has a special purpose function called t which
can be used to compute matrix transposes. It is very simple to
use.

> (x = cbind(1:3, 11:13))
     [,1] [,2]
[1,]    1   11
[2,]    2   12
[3,]    3   13

> t(x)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]   11   12   13

Matrix Diagonals

• Given a matrix A, the function diag returns the
  diagonal elements of that matrix. Note that diag(A) is
  equivalent to A[row(A)==col(A)].

• Given a vector a, the expression diag(a) returns a
  diagonal matrix with the elements of a on the diagonal.

• Given a positive integer n the expression diag(n)
  returns an n × n identity matrix.

Matrix Products

Matrices are often found in mathematical expressions
involving matrix multiplication. Mathematically, the product
of an m × n matrix A whose ij-th element is $a_{ij}$ and an n × p
matrix B whose ij-th element is $b_{ij}$ is defined to be the m × p
matrix whose ij-th element is

\[
\sum_{k=1}^{n} a_{ik} b_{kj}
\]

The product of matrices A and B is computed in R by the
expression

    A %*% B
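A short sketch connecting %*% to the defining sum, using small made-up matrices. The names P and p12 are introduced only for this example:

```r
A = matrix(1:6, nrow = 2)     # 2 x 3
B = matrix(1:12, nrow = 3)    # 3 x 4
P = A %*% B                   # 2 x 4

# Element [1, 2] computed directly from the sum over k of a_1k * b_k2.
p12 = sum(A[1, ] * B[, 2])
print(P[1, 2] == p12)

# Multiplying by an identity matrix built with diag() leaves A unchanged.
print(all(A %*% diag(3) == A))
```

Note that A * B would instead attempt an elementwise product, which is a different operation from %*%.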

Systems of Linear Equations

The set of equations

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}
\]

can be written in the matrix form

\[
\begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix},
\]

or

\[
Ax = b.
\]

Solving Systems of Linear Equations

• Given a non-singular n × n matrix A and vector b of
  length n, the linear system Ax = b can be solved with
  the R expression solve(A, b).

• Solutions to multiple right-hand sides can be obtained
  by assembling them as the columns of a matrix B and
  using the expression solve(A, B).

• The expression solve(A) computes the inverse of A
  (because the default value for b is a suitable identity
  matrix).

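The following is a minimal sketch of solve on a 2 × 2 system with made-up coefficients (2x1 + x2 = 3 and x1 + 3x2 = 5), checking the solution by substituting it back:

```r
A = matrix(c(2, 1,
             1, 3), nrow = 2, byrow = TRUE)
b = c(3, 5)

# Solve A x = b.
x = solve(A, b)
print(x)

# Substituting the solution back should reproduce b.
print(all.equal(as.vector(A %*% x), b))

# With no right-hand side, solve() returns the inverse of A.
print(all.equal(solve(A) %*% A, diag(2)))
```

Comparisons use all.equal rather than == because the results are floating-point values.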
Example: Formulae for Sums of Powers

There is a well known formula for the sum of the first n
positive integers.

\[
1 + 2 + \cdots + n = \frac{n(n+1)}{2}
\]

Let's examine the more general problem of finding a formula
for the sum of the kth powers of the first n positive integers.

\[
S_{n,k} = 1^k + 2^k + \cdots + n^k
\]

Note that for all n

\[
S_{n,k} \le \underbrace{n^k + n^k + \cdots + n^k}_{n \text{ terms}} = n \times n^k = n^{k+1}
\]

so that $S_{n,k}$ is probably a polynomial of degree ≤ k + 1.

A System of Equations

Let's suppose that

\[
S_{n,k} = c_0 + c_1 n + c_2 n^2 + \cdots + c_{k+1} n^{k+1}
\]

By considering what happens for n = 1, 2, ..., k + 2 we get a
series of simultaneous equations which can be solved for
$c_0, \ldots, c_{k+1}$.

\[
\begin{aligned}
S_{1,k} &= c_0 + c_1 1 + c_2 1^2 + \cdots + c_{k+1} 1^{k+1} \\
S_{2,k} &= c_0 + c_1 2 + c_2 2^2 + \cdots + c_{k+1} 2^{k+1} \\
&\;\;\vdots \\
S_{k+2,k} &= c_0 + c_1 (k+2) + c_2 (k+2)^2 + \cdots + c_{k+1} (k+2)^{k+1}
\end{aligned}
\]

A Matrix System

The equations can be written in matrix form as

\[
\begin{pmatrix}
1 & 1 & 1^2 & \cdots & 1^{k+1} \\
1 & 2 & 2^2 & \cdots & 2^{k+1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & k+2 & (k+2)^2 & \cdots & (k+2)^{k+1}
\end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{k+1} \end{pmatrix}
=
\begin{pmatrix} S_{1,k} \\ S_{2,k} \\ \vdots \\ S_{k+2,k} \end{pmatrix}.
\]

To determine the values of $c_0, \ldots, c_{k+1}$, we only need to create
the coefficient matrix and vector of right-hand sides, and then
to use solve.

System Components

The coefficient matrix can be obtained in a number of ways.
One obvious way is to create the matrix and then fill the
elements in one by one.

> A = matrix(1, nrow = k + 2, ncol = k + 2)
> for(i in 1:(k+2))
      for(j in 1:(k+2))
          A[i, j] = i^(j-1)

This produces the right answer, but a much cleaner
implementation can be obtained by using the outer function.

> A = outer(1:(k+2), 0:(k+1), "^")

The right-hand sides can be generated quite simply using the
cumsum function.

> b = cumsum(seq(k+2)^k)

A Function

Using the statements on the previous slide, it is very easy to
create a function which returns the coefficients $c_0, \ldots, c_{k+1}$.

> sumpow =
      function(k)
          solve(outer(1:(k+2), 0:(k+1), "^"),
                cumsum(seq(k+2)^k))

For k = 1 the result is

> sumpow(1)
[1] 0.0 0.5 0.5

which corresponds to

\[
S_{n,1} = \frac{n}{2} + \frac{n^2}{2} = \frac{n(n+1)}{2}.
\]

The Formula for k = 2

For k = 2, we will be computing

\[
S_{n,2} = 1^2 + 2^2 + 3^2 + \cdots + n^2 .
\]

The coefficients are computed as:

> sumpow(2)
[1] 1.471046e-15 1.666667e-01 5.000000e-01
[4] 3.333333e-01

The first coefficient is very close to 0, the deviation being due to
small numerical errors which occurred during the calculation.
The other coefficients are very close to simple fractions.

The formula is

\[
S_{n,2} = \frac{n}{6} + \frac{n^2}{2} + \frac{n^3}{3}
        = \frac{n + 3n^2 + 2n^3}{6}
        = \frac{n(n+1)(2n+1)}{6}.
\]

The Formula for k = 3

For k = 3, we will be computing

\[
S_{n,3} = 1^3 + 2^3 + 3^3 + \cdots + n^3 .
\]

The coefficients are computed as:

> sumpow(3)
[1] 0.00 0.00 0.25 0.50 0.25

The formula is
\[
S_{n,3} = \frac{n^2}{4} + \frac{n^3}{2} + \frac{n^4}{4}
        = \frac{n^2 (1 + 2n + n^2)}{4}
        = \frac{n^2 (n+1)^2}{4}.
\]
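The closed form for k = 3 can be checked numerically against direct summation. This is a sketch under the assumption that sumpow is defined as on the earlier slide (it is repeated here so the block runs on its own); the names direct, closed, co and poly are introduced for illustration:

```r
# sumpow as defined on the earlier slide.
sumpow = function(k)
    solve(outer(1:(k+2), 0:(k+1), "^"),
          cumsum(seq(k+2)^k))

n = 1:20

# Direct sums 1^3 + ... + n^3 for n = 1, ..., 20.
direct = cumsum(n^3)

# Closed-form expression n^2 (n + 1)^2 / 4.
closed = n^2 * (n + 1)^2 / 4
print(all(direct == closed))

# Evaluate the polynomial with the computed coefficients c_0, ..., c_4.
co = sumpow(3)
poly = sapply(n, function(m) sum(co * m^(0:4)))
print(all.equal(poly, direct))
```

The polynomial check uses all.equal because the coefficients come from a floating-point solve and may carry small rounding errors.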
