The matrix on the previous slide can be created as follows.

> matrix(1:6, nrow = 3, ncol = 2)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

An optional argument byrow=TRUE can be used to arrange the
vector of values by row.

> matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6

The first argument to the matrix function contains the
elements which are to form the matrix. If there are not enough
elements in the argument to create the matrix, the recycling
rule is applied to obtain more.

A 2 × 3 matrix filled with 1s can be obtained as follows.

> matrix(1, nrow = 2, ncol = 3)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
Optional Dimension Specifications
Often R can work out the number of rows in a matrix given the
number of columns and the elements, or the number of
columns in a matrix given the number of rows and the
elements. In such cases it is not necessary to specify both the
number of rows and the number of columns.

Determining Matrix Dimensions
The number of rows and columns of a matrix can be obtained
with the functions nrow and ncol, or obtained together with
the function dim, which returns a vector containing the number
of rows as its first element and the number of columns as its
second.
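As a small illustration of both points (the matrices below are made up for this sketch and are not taken from the slides), a matrix can be built from one dimension only and then queried for its size:

```r
# Only the number of rows is given; R infers 6 / 2 = 3 columns.
x = matrix(1:6, nrow = 2)

# Only the number of columns is given; R infers 6 / 3 = 2 rows.
y = matrix(1:6, ncol = 3)

nrow(x)   # number of rows
ncol(x)   # number of columns
dim(x)    # both at once: rows first, then columns
```

Both calls describe the same 2 × 3 matrix, so x and y are identical.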
Creating Matrices from Rows and Columns
Matrices can be created by gluing together rows with rbind
or gluing together columns with cbind.

> cbind(1:3, 4:6)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

> rbind(1:3, 4:6)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6

Binding Rows and Columns; Recycling
The arguments to rbind and cbind are not required to be the
same length. When they are not, a matrix is created
which is big enough to accommodate the largest argument,
and the others have the recycling rule applied to them to
supply additional elements. Apparent mismatches produce
warnings.

> rbind(1:2, 1:3)
     [,1] [,2] [,3]
[1,]    1    2    1
[2,]    1    2    3
Warning message:
In rbind(1:2, 1:3) :
  number of columns of result is not
  a multiple of vector length (arg 1)
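For contrast, here is a sketch of two cases where recycling happens without a warning, because the longer length is an exact multiple of the shorter one. The values are chosen purely for illustration.

```r
# 1:2 is recycled to 1, 2, 1, 2; 4 is a multiple of 2, so no warning.
a = cbind(1:4, 1:2)
a

# The single value 0 is recycled along the whole first row.
b = rbind(0, 1:3)
b
```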
Matrices and Naming
It is possible to attach row and column labels to matrices.
Row names can be attached with rownames, column names
with colnames, and both can be attached simultaneously with
dimnames.

> x = matrix(1:6, nrow = 2)
> dimnames(x) = list(c("First", "Second"),
                     c("A", "B", "C"))
> x
       A B C
First  1 3 5
Second 2 4 6

Extracting Names
Names can also be extracted with dimnames, rownames and
colnames.

> dimnames(x)
[[1]]
[1] "First"  "Second"

[[2]]
[1] "A" "B" "C"

> rownames(x)
[1] "First"  "Second"
> colnames(x)
[1] "A" "B" "C"
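The slides show dimnames being assigned but not the separate rownames and colnames forms. A minimal sketch of those, using the same labels, might look like this. Subscripting by label in the last line is an extra that the slides do not cover.

```r
x = matrix(1:6, nrow = 2)
rownames(x) = c("First", "Second")   # attach row labels only
colnames(x) = c("A", "B", "C")       # attach column labels only

# The result carries the same labels as the dimnames assignment.
dimnames(x)

# Labels can then be used as subscripts; this is the same element as x[2, 2].
x["Second", "B"]
```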
The ij-th element of a matrix x can be extracted with the
expression x[i, j].

This means, for example, that we could sum all the elements
in a matrix x with the following code.

> s = 0
> for(i in 1:nrow(x))
      for(j in 1:ncol(x))
          s = s + x[i, j]

Because matrices are just vectors with additional
dimensioning information it is actually more efficient to use

> s = sum(x)

Expressions of the form x[i, j] can also be used to extract
more general subsets of the elements of a matrix by specifying
vector subscripts.

> (x = matrix(1:12, nrow = 3, ncol = 4))
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

> x[1:2, c(2, 4)]
     [,1] [,2]
[1,]    4   10
[2,]    5   11
Assigning to Matrix Subsets
It is also possible to assign to subsets of matrices.

> x[1:2, c(2, 4)] = 21:24
> x
     [,1] [,2] [,3] [,4]
[1,]    1   21    7   23
[2,]    2   22    8   24
[3,]    3    6    9   12
> x[2:1, c(2, 4)] = 21:24
> x
     [,1] [,2] [,3] [,4]
[1,]    1   22    7   24
[2,]    2   21    8   23
[3,]    3    6    9   12

Specifying Entire Rows and Columns
When a subscript is omitted, it is taken to correspond to all
possible values. This works for extracting values from matrices
and for assigning to them.

> x = matrix(1:12, nrow = 3, ncol = 4)
> x[1,] = 100
> x
     [,1] [,2] [,3] [,4]
[1,]  100  100  100  100
[2,]    2    5    8   11
[3,]    3    6    9   12
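The example above assigns to a whole row. The extraction side of the same rule can be sketched as follows, with variable names (r2, c4) that are only for this illustration:

```r
x = matrix(1:12, nrow = 3, ncol = 4)

r2 = x[2, ]   # row subscript given, column subscript omitted: all of row 2
c4 = x[, 4]   # column subscript given, row subscript omitted: all of column 4
r2
c4

x[, 2] = 0    # assign to every element of column 2 (0 is recycled)
x
```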
Because matrices are just vectors with additional
dimensioning information they can be treated as vectors.

Here is how to create a 4 × 4 tridiagonal matrix with diagonal
values being 2 and the off-diagonal elements being 1.
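The code for the tridiagonal matrix did not survive in this text. One way to build it, offered as a sketch rather than as the original slide's code, is to index the matrix with logical conditions on row(x) and col(x), which treats the matrix as a vector of elements:

```r
x = matrix(0, nrow = 4, ncol = 4)

# Main diagonal: row index equals column index.
x[row(x) == col(x)] = 2

# First sub- and super-diagonal: indices differ by exactly one.
x[abs(row(x) - col(x)) == 1] = 1

x
```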
The basic mathematical manipulations on matrices are defined
by using the fact that the operations are defined for the
underlying vectors.

When a vector is added to a matrix, the recycling rule is used
to expand the vector so that it matches the number of elements
in the matrix.
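Because the underlying vector of a matrix is stored column by column, recycling also runs down the columns. A short sketch with made-up values shows the effect:

```r
x = matrix(1:6, nrow = 2)   # columns are (1, 2), (3, 4), (5, 6)

# c(10, 20) is recycled to 10, 20, 10, 20, 10, 20, so 10 is added
# to every element of row 1 and 20 to every element of row 2.
z = x + c(10, 20)
z
```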
Some checks are carried out to try to make sure that operations
are sensible. This can produce warnings.

> x + 1:3
     [,1] [,2]
[1,]    2    6
[2,]    4    5
Warning message:
In x + 1:3 :
  longer object length is not a multiple of
  shorter object length

Note that it is an error to try to combine a matrix with a vector
which has more elements than the matrix.

A common thing to want to do with matrices is to obtain a
vector containing a numerical summary for each row (or each
column). It is possible to do this by using for loops.

As a simple example, we'll look at computing row (or
column) means. We can do this as follows:

> rm = numeric(nrow(x))
> for(i in 1:nrow(x))
      rm[i] = mean(x[i,])

Notice that there is nothing special about using the mean
function here. The same code will work for any numerical
summary.
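The loop above handles rows. The column version differs only in which subscript is left empty. In this sketch x is assumed to be the 2 × 2 matrix matrix(1:4, nrow = 2), which is consistent with the output shown around it but is not stated in the text. The name cm is chosen for illustration.

```r
x = matrix(1:4, nrow = 2)   # assumed example matrix

cm = numeric(ncol(x))       # one slot per column
for (j in 1:ncol(x))
    cm[j] = mean(x[, j])    # summary of the j-th column
cm
```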
The “Apply” Mechanism
Because this is such a common kind of task, R has a special
way of carrying out this kind of operation.

The function apply can be used to compute row or column
summaries. The expression

apply(matrix, 1, summary)

computes row summaries of the type specified by summary for
the matrix specified by matrix. Similarly,

apply(matrix, 2, summary)

computes column summaries.

Computing Row and Column Means
Row and column means of x can be computed as follows:

> apply(x, 1, mean)
[1] 2 3

> apply(x, 2, mean)
[1] 1.5 3.5

There is no difficulty in substituting any other summary
function in place of mean. For example, the row standard
deviations can be computed as follows.

> apply(x, 1, sd)
[1] 1.414214 1.414214
In the preceding examples, we've used a function name to
specify which summary function the apply mechanism
should use. It is also possible to specify the function directly
by using its definition as the argument to apply. Here is how
we can compute the column sums of squares for x.

> apply(x, 2, function(x) sum(x^2))
[1]  5 25

Computing the sums of squares about the column means is
just as easy.

> apply(x, 2, function(x) sum((x - mean(x))^2))
[1] 0.5 0.5

When the summary function returns several values, apply
returns a matrix. The first subscript of the result ranges over
the set of values returned by the summary function while the
second specifies which row or column the summary function is
being computed for.

> apply(x, 1, range)
     [,1] [,2]
[1,]    1    2
[2,]    3    4

> apply(x, 2, range)
     [,1] [,2]
[1,]    1    3
[2,]    2    4
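With a square matrix the orientation of the result is easy to miss. The sketch below uses a 3 × 4 matrix (chosen for illustration, not taken from the slides) so that the two cases give differently shaped results:

```r
x = matrix(1:12, nrow = 3, ncol = 4)

# range returns 2 values, so both results have 2 rows.
by_row = apply(x, 1, range)   # one column per row of x:    2 x 3
by_col = apply(x, 2, range)   # one column per column of x: 2 x 4

dim(by_row)
dim(by_col)
by_row[, 1]                   # minimum and maximum of row 1 of x
```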
Additional Arguments to Summaries
It is also possible to specify additional arguments for the
summary function as extra arguments to the apply call. To see
an example, let's return to the mean function, which has an
optional argument trim. This specifies the fraction of the
observations to be trimmed from each end of its argument
before the mean is computed, so the largest and smallest trim
fraction of the observations are both discarded. A
value of trim=.1 could be specified in a call to apply as
follows.

apply(matrix, margin, mean, trim = .1)

Sweeping Out Summaries
Once summaries have been computed, it is common to want to
subtract out those summaries to obtain residual values. As
with computing the summaries themselves, this can be done
by looping over the rows or columns, as appropriate. Because
this is quite a common task, there is a special function called
sweep which can be used to sweep out computed summaries.
To subtract out row summaries from the rows of a matrix use a
call of the form

sweep(matrix, 1, apply(matrix, 1, summary))

and to subtract out column summaries use a call of the form

sweep(matrix, 2, apply(matrix, 2, summary))
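The two ideas can be combined: pass an extra argument through apply to the summary function, then sweep the resulting summaries out. The sketch below uses made-up data with one outlier. R's documentation for mean describes trim as the fraction removed from each end, so trim = 0.2 on five values drops one value at each end.

```r
x = rbind(c(1, 2, 3, 4, 100),
          c(5, 6, 7, 8, 9))

plain   = apply(x, 1, mean)               # ordinary row means
trimmed = apply(x, 1, mean, trim = 0.2)   # trim = 0.2 is passed on to mean
plain
trimmed

# Sweep the trimmed row means out of the rows (subtraction is the default).
res = sweep(x, 1, trimmed)
res
```

The outlier 100 pulls the ordinary mean of the first row up to 22, while the trimmed mean stays at 3.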
By default, sweep will subtract out the summaries, but it is
possible to use any other binary operation instead. For
example, the following statement forms will divide through by
the summary, rather than subtracting.

sweep(matrix, 1, apply(matrix, 1, summary), "/")
sweep(matrix, 2, apply(matrix, 2, summary), "/")

As a concrete example, here is how to divide the columns of a
matrix by their column means.

> x = cbind(1:3, 9:11)
> sweep(x, 2, apply(x, 2, mean), "/")
     [,1] [,2]
[1,]  0.5  0.9
[2,]  1.0  1.0
[3,]  1.5  1.1

Consider the following data showing child mortality in regions
of the United States, all races, 1964–1966 (deaths per 1000
live births). This is typical of data encountered in statistical
data analysis.

> mortality
               Father's Education (Years)
Region           <=8 9-11   12 13-15 >=16
  Northeast     25.3 25.3 18.2  18.3 16.3
  North Central 32.1 29.0 18.8  24.3 19.0
  South         38.8 31.0 19.3  15.7 16.8
  West          25.4 21.1 20.3  24.0 17.5

Notice that in this case, the row and column names are
themselves named (by the strings "Region" and
"Father's Education (Years)" respectively).
Creating The Matrix

> mortality =
      matrix(c(25.3, 25.3, 18.2, 18.3, 16.3,
               32.1, 29.0, 18.8, 24.3, 19.0,
               38.8, 31.0, 19.3, 15.7, 16.8,
               25.4, 21.1, 20.3, 24.0, 17.5),
             nrow = 4, byrow = TRUE,
             dimnames = list(
                 Region =
                     c("Northeast", "North Central",
                       "South", "West"),
                 "Father's Education (Years)" =
                     c("<=8", "9-11", "12",
                       "13-15", ">=16")))

An “Overall + Rows + Columns” Model
A common way of dealing with a table of values like this is to
try to explain the pattern in the table in terms of some
simpler underlying structure. In the case of the mortality data,
we will look at how to fit an overall plus row-effect plus
column-effect model to describe these values. This means that
we will seek to represent the values $y_{ij}$ in the matrix by

$$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$$

where $\mu$ represents the overall level of the data, $\alpha_i$ is an effect
common to all the values in the i-th row, $\beta_j$ is a value common
to all values in the j-th column and the $\varepsilon_{ij}$ are (small) residuals
which describe the deviations between what the model
predicts and the observed values.
We will fit this model by progressively sweeping out the
effects from a set of working residuals.

To begin, we estimate the overall level as the average value
and then subtract it from the original values.
This gives a set of deviations about the overall mean.

> r = mortality
> (mu = mean(r))
[1] 22.825
> r = r - mu

Next we compute the row effects as the row means of the
working residuals r and subtract them from r.
The easy way to do this is using apply and sweep.

> (alpha = apply(r, 1, mean))
    Northeast North Central         South
       -2.145         1.815         1.495
         West
       -1.165
> r = sweep(r, 1, alpha)
Sweeping Out Column Effects
Finally, we extract the column effects as the column means of
the working residuals r and subtract them from r.

> beta = apply(r, 2, mean)
> r = sweep(r, 2, beta)

To see that all the effects have been swept out of the residuals
we can compute the row and column means of the residuals.

> round(apply(r, 1, mean), 4)
    Northeast North Central         South
            0             0             0
         West
            0

> round(apply(r, 2, mean), 4)
  <=8  9-11    12 13-15  >=16
    0     0     0     0     0

These are (essentially) zero.

The Residuals
After completing the process, we are left with a set of
residuals from the fit.

The whole procedure can be collected into a function.

> twoway =
      function(y)
      {
          mu = mean(y)
          y = y - mu
          alpha = apply(y, 1, mean)
          y = sweep(y, 1, alpha)
          beta = apply(y, 2, mean)
          y = sweep(y, 2, beta)
          list(overall = mu, rows = alpha,
               cols = beta, residuals = y)
      }
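Below is a commented copy of the twoway function, followed by a check that the fitted pieces add back up to the data. The check is an addition for illustration and is not part of the slides. The mortality values are entered without their labels to keep the sketch short.

```r
# Fit an "overall + rows + columns" model by sweeping out means.
twoway = function(y) {
    mu = mean(y)                  # overall level
    y = y - mu                    # deviations about the overall level
    alpha = apply(y, 1, mean)     # row effects
    y = sweep(y, 1, alpha)        # remove the row effects
    beta = apply(y, 2, mean)      # column effects
    y = sweep(y, 2, beta)         # remove the column effects; y holds residuals
    list(overall = mu, rows = alpha, cols = beta, residuals = y)
}

mortality = matrix(c(25.3, 25.3, 18.2, 18.3, 16.3,
                     32.1, 29.0, 18.8, 24.3, 19.0,
                     38.8, 31.0, 19.3, 15.7, 16.8,
                     25.4, 21.1, 20.3, 24.0, 17.5),
                   nrow = 4, byrow = TRUE)
fit = twoway(mortality)

# Rebuild the table: residuals + row effects + column effects + overall level.
rebuilt = sweep(sweep(fit$residuals, 1, fit$rows, "+"),
                2, fit$cols, "+") + fit$overall
all.equal(rebuilt, mortality)
```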
Commenting
I omitted comments from the function on the previous slide in
order to save space. When writing computer code it is vital to
include informative comments.

(Generalized) Outer Products
The function outer provides another useful utility in
connection with matrix calculations.

In mathematics, the outer product of two vectors x and y is a
matrix whose ij-th element is $x_i y_j$.
Here is a small multiplication table (truncated at 8 so that it
fits on a slide).

> outer(1:8, 1:8)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    2    3    4    5    6    7    8
[2,]    2    4    6    8   10   12   14   16
[3,]    3    6    9   12   15   18   21   24
[4,]    4    8   12   16   20   24   28   32
[5,]    5   10   15   20   25   30   35   40
[6,]    6   12   18   24   30   36   42   48
[7,]    7   14   21   28   35   42   49   56
[8,]    8   16   24   32   40   48   56   64

• Using apply and outer together provides a very
  powerful way of carrying out quite complex
  computations.

• In this example we'll consider the problem of finding
  the closest match for each value $y_1, \ldots, y_m$ in a set of
  values $x_1, \ldots, x_n$ (a kind of nearest neighbour
  regression).

• While the problem can be solved using nested for
  loops, there is a “one-liner” solution.
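The "generalized" part of outer is its optional third argument, which replaces multiplication with any vectorised function of two arguments. A brief sketch follows; the particular functions are chosen only for illustration.

```r
# Addition table: element [i, j] is i + j.
add = outer(1:3, 1:4, "+")
add

# Any vectorised function of two arguments can be supplied.
# Here element [i, j] is i^2 + j.
sq = outer(1:3, 1:4, function(a, b) a^2 + b)
sq
```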
The which Function
• Recall that the which function accepts a vector of
  logical values and returns a vector that indicates which
  indices correspond to true values.

> x = c(-1, 0, 1)
> which(x >= 0)
[1] 2 3
> which(x == 0)
[1] 2
> which.min(x)
[1] 1
> which.max(x)
[1] 3

Nearest Neighbour Matches
The algorithm is as follows:

1. Form the matrix D so that the ij-th element of D is the
   distance between $y_i$ and $x_j$ (this can be done with
   outer).

2. Determine the column index of the smallest value in
   each row of D.

3. Return the x value corresponding to this index.
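The three steps can be written out one per line. The sketch below follows the orientation in the list (rows of D index the y values, columns index the x values). The function name nearest_steps is hypothetical, and the y values are copied from the random draw shown in the slides' own example so that the result can be compared.

```r
nearest_steps = function(x, y) {   # hypothetical name, for illustration only
    # Step 1: d[i, j] is the distance between y[i] and x[j].
    d = outer(y, x, function(y, x) abs(y - x))
    # Step 2: column index of the smallest value in each row of d.
    j = apply(d, 1, which.min)
    # Step 3: the x values at those indices.
    x[j]
}

x = 0:10/10
y = c(0.1713, 0.9631, 0.0906, 0.5987)
nearest_steps(x, y)
```

The functions in the slides build the transposed matrix with outer(x, y, ...) and take which.min over columns instead, which gives the same matches.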
> nearest2 = function(x, y) {      # two-liner
      d = outer(x, y, function(x, y) abs(x - y))
      x[apply(d, 2, which.min)]
  }

> nearest = function(x, y)         # one-liner
      x[apply(outer(x, y, function(x, y)
          abs(x - y)), 2, which.min)]

> (x = 0:10/10)
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> (y = round(runif(4), 4))
[1] 0.1713 0.9631 0.0906 0.5987
> nearest(x, y)
[1] 0.2 1.0 0.1 0.6

Often it is useful to be able to interchange the role of rows and
columns in a matrix. This is referred to as taking the transpose
of a matrix. R has a special purpose function called t which
can be used to compute matrix transposes. It is very simple to
use.

> (x = cbind(1:3, 11:13))
     [,1] [,2]
[1,]    1   11
[2,]    2   12
[3,]    3   13

> t(x)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]   11   12   13
Matrix Diagonals
• Given a matrix A, the function diag returns the
  diagonal elements of that matrix. Note that diag(A) is
  equivalent to A[row(A)==col(A)].

• Given a vector a, the expression diag(a) returns a
  diagonal matrix with the elements of a on the diagonal.

• Given a positive integer n, the expression diag(n)
  returns an n × n identity matrix.

Matrix Products
Matrices are often found in mathematical expressions
involving matrix multiplication. Mathematically, the product
of an m × n matrix A whose ij-th element is $a_{ij}$ and an n × p
matrix B whose ij-th element is $b_{ij}$ is defined to be the m × p
matrix whose ij-th element is

$$\sum_{k=1}^{n} a_{ik} b_{kj}$$

In R, the matrix product is computed with the %*% operator.

A %*% B
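A small sketch ties the definition to the operator and to diag. The matrices A, B and S are made up for illustration.

```r
A = matrix(1:6, nrow = 2)     # 2 x 3
B = matrix(1:12, nrow = 3)    # 3 x 4
P = A %*% B                   # 2 x 4
dim(P)

# Element [1, 2] from the definition: sum over k of A[1, k] * B[k, 2].
sum(A[1, ] * B[, 2])
P[1, 2]

# Multiplying by an identity matrix leaves A unchanged.
all(A %*% diag(3) == A)

# diag picks out the elements whose row and column indices agree.
S = matrix(1:9, nrow = 3)
identical(diag(S), S[row(S) == col(S)])
```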
There is a well known formula for the sum of the first n
positive integers.

$$1 + 2 + \cdots + n = \frac{n(n + 1)}{2}$$

Let's examine the more general problem of finding a formula
for the sum of the kth powers of the first n positive integers.

$$S_{n,k} = 1^k + 2^k + \cdots + n^k$$

Note that for all n

$$S_{n,k} \le \underbrace{n^k + n^k + \cdots + n^k}_{n \text{ terms}} = n \times n^k = n^{k+1}$$

Let's suppose that

$$S_{n,k} = c_0 + c_1 n + c_2 n^2 + \cdots + c_{k+1} n^{k+1}$$

By considering what happens for n = 1, 2, ..., k + 2 we get a
series of simultaneous equations which can be solved for
$c_0, \ldots, c_{k+1}$.

$$\begin{aligned}
S_{1,k} &= c_0 + c_1 1 + c_2 1^2 + \cdots + c_{k+1} 1^{k+1} \\
S_{2,k} &= c_0 + c_1 2 + c_2 2^2 + \cdots + c_{k+1} 2^{k+1} \\
&\;\;\vdots \\
S_{k+2,k} &= c_0 + c_1 (k + 2) + c_2 (k + 2)^2 + \cdots + c_{k+1} (k + 2)^{k+1}
\end{aligned}$$
Using the statements on the previous slide, it is very easy to
create a function which returns the coefficients $c_0, \ldots, c_{k+1}$.

> sumpow =
      function(k)
      solve(outer(1:(k+2), 0:(k+1), "^"),
            cumsum(seq(k+2)^k))

For k = 1 the result is

> sumpow(1)
[1] 0.0 0.5 0.5

which corresponds to

$$S_{n,1} = \frac{n}{2} + \frac{n^2}{2} = \frac{n(n + 1)}{2}.$$

For k = 2, we will be computing

$$S_{n,2} = 1^2 + 2^2 + 3^2 + \cdots + n^2.$$

The coefficients are computed as:

> sumpow(2)
[1] 1.471046e-15 1.666667e-01 5.000000e-01
[4] 3.333333e-01

The first coefficient is very close to 0, the deviation being due to
small numerical errors which occurred during the calculation.
The other coefficients are very close to simple fractions.

The formula is

$$S_{n,2} = \frac{n}{6} + \frac{n^2}{2} + \frac{n^3}{3} = \frac{n + 3n^2 + 2n^3}{6} = \frac{n(n + 1)(2n + 1)}{6}.$$

For k = 3, we will be computing

$$S_{n,3} = 1^3 + 2^3 + 3^3 + \cdots + n^3.$$

> sumpow(3)
[1] 0.00 0.00 0.25 0.50 0.25

The formula is

$$S_{n,3} = \frac{n^2}{4} + \frac{n^3}{2} + \frac{n^4}{4} = \frac{n^2 (1 + 2n + n^2)}{4} = \frac{n^2 (n + 1)^2}{4}.$$
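The fitted coefficients can be checked against a direct sum at a value of n that was not used to fit them. In this sketch sumpow is repeated from above so that the block runs on its own, and the helper check is a hypothetical name added for illustration.

```r
sumpow = function(k)
    solve(outer(1:(k+2), 0:(k+1), "^"),
          cumsum(seq(k+2)^k))

# Evaluate c0 + c1 n + ... + c_{k+1} n^(k+1) and compare with 1^k + ... + n^k.
check = function(k, n) {      # hypothetical helper
    cf = sumpow(k)
    c(fitted = sum(cf * n^(0:(k + 1))),
      direct = sum((1:n)^k))
}

check(2, 10)   # direct sum of squares is 385
check(3, 10)   # direct sum of cubes is 3025
```

The fitted values agree with the direct sums up to small numerical error, since the coefficients themselves are computed in floating point.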