The purpose of this handout is to give a brief review of some of the basic concepts in linear algebra. If you are unfamiliar with the material and/or would like to do some further reading, you may consult, e.g., the books [1, 2, 3].
1 Vector Spaces
We denote the set of real numbers (also referred to as scalars) by R. A vector space that we may be familiar with is R^2. We can think of it as a plane, and any point in R^2 can be represented by an ordered list of its coordinates, that is,

    R^2 = { (x, y)⊤ : x, y ∈ R }.
We can easily define algebraic manipulations on F^n. For example, the addition of two elements of F^n is defined by adding corresponding coordinates,

    (x1, . . . , xn)⊤ + (y1, . . . , yn)⊤ = (x1 + y1, . . . , xn + yn)⊤,

and scalar multiplication by a scalar a ∈ F is defined coordinatewise,

    a (x1, . . . , xn)⊤ = (a x1, . . . , a xn)⊤.

The motivation for the definition of a vector space comes from the properties possessed by the addition and scalar multiplication operations on F^n. Formally, we say that a set V is a vector space over a field F¹ if there is a binary operation "+" on V, called addition, and a map F × V → V, called scalar multiplication, such that the following properties hold:
¹See https://en.wikipedia.org/wiki/Field_(mathematics) for more details.
• Commutativity. x + y = y + x for all x, y ∈ V;
A collection C = {x1, x2, . . . , xm} of vectors in R^n is said to be linearly dependent if there exist scalars a1, . . . , am ∈ R, not all of them zero, such that a1 x1 + · · · + am xm = 0. The collection C is said to be linearly independent if it is not linearly dependent.
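As a quick numerical check, linear independence can be tested by stacking the vectors as the columns of a matrix and comparing its rank with the number of vectors. A small NumPy sketch (the helper name is illustrative):

```python
import numpy as np

def is_linearly_independent(vectors):
    """Return True if the given collection of vectors in R^n is linearly independent.

    The vectors are stacked as the columns of a matrix; they are independent
    exactly when the matrix rank equals the number of vectors.
    """
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == A.shape[1]

# (1, 0) and (0, 1) are independent; (1, 2) and (2, 4) are not.
print(is_linearly_independent([np.array([1.0, 0.0]), np.array([0.0, 1.0])]))  # True
print(is_linearly_independent([np.array([1.0, 2.0]), np.array([2.0, 4.0])]))  # False
```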
A linear combination of a collection {x1, x2, . . . , xm} of vectors in R^n is a vector of the form

    a1 x1 + · · · + am xm,

where a1, . . . , am ∈ R. The set of all linear combinations of {x1, x2, . . . , xm} is called the span of {x1, x2, . . . , xm}. A basic fact about spans is that the span of a collection of vectors in R^n is a subspace of R^n.
A basis of R^n is a collection of vectors in R^n that is linearly independent and spans R^n. For example,

    { (1, 2)⊤, (3, 4)⊤ }

is a basis of R^2, and

    { (1, 0)⊤, (0, 1)⊤ }

is the standard basis of R^2.
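As a sanity check, a collection of n vectors in R^n forms a basis exactly when the square matrix having them as columns has nonzero determinant (equivalently, rank n). A small NumPy sketch for the basis {(1, 2)⊤, (3, 4)⊤} from the example above:

```python
import numpy as np

# Columns are the candidate basis vectors (1, 2)^T and (3, 4)^T.
B = np.array([[1.0, 3.0],
              [2.0, 4.0]])

# A square matrix's columns form a basis iff its determinant is nonzero.
print(np.linalg.det(B))               # -2.0, so the columns form a basis of R^2
print(np.linalg.matrix_rank(B) == 2)  # True
```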
We denote the set of all linear maps from R^n to R^m as L(R^n, R^m). For T ∈ L(R^n, R^m), the null space (also referred to as the kernel) of T, denoted null(T), is the subset of R^n consisting of the vectors mapped to 0 by T:

    null(T) = {x ∈ R^n : T x = 0}.
The range of T, denoted range(T), is the subset of R^m consisting of the vectors that are of the form T x for some x ∈ R^n:

    range(T) = {T x ∈ R^m : x ∈ R^n}.

A basic fact about null spaces and ranges is that the null space of T is a subspace of R^n and the range of T is a subspace of R^m.
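Numerically, a basis for the null space can be obtained from the singular value decomposition: the right singular vectors associated with (numerically) zero singular values span null(T). A small NumPy sketch, with an illustrative map given by a 2 × 3 matrix:

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Return an orthonormal basis (as columns) for null(A) = {x : A x = 0}.

    Uses the SVD: the right singular vectors whose singular values are
    (numerically) zero span the null space.
    """
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

# A maps (x, y, z) to (x + z, y + z); its null space is spanned by (1, 1, -1).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
N = null_space_basis(A)
print(N.shape)                # (3, 1): the null space is one-dimensional
print(np.allclose(A @ N, 0))  # True: the basis vector is mapped to 0
```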
We use R^{m×n} to denote the set of m × n arrays whose components are from R. We can write an element A ∈ R^{m×n} as

    A = [ a11  a12  · · ·  a1n
          a21  a22  · · ·  a2n
           ⋮    ⋮    ⋱     ⋮
          am1  am2  · · ·  amn ] ,        (1)
where aij ∈ R for i = 1, . . . , m and j = 1, . . . , n. For a linear map T ∈ L(R^n, R^m), suppose that {y1, . . . , yn} is a basis of R^n and {x1, . . . , xm} a basis of R^m. For each k = 1, . . . , n, we can write T yk (uniquely) as a linear combination of x1, . . . , xm:

    T yk = a1k x1 + · · · + amk xm.

Then the scalars ajk completely determine the linear map T. We call the m × n array in (1) the matrix of T with respect to the bases {y1, . . . , yn} and {x1, . . . , xm}. A row vector is a matrix with m = 1, and a column vector is a matrix with n = 1.
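This recipe, building the matrix column by column from the images of the basis vectors, can be carried out directly. The NumPy sketch below uses the standard bases and an illustrative map T, so column k is simply T e_k:

```python
import numpy as np

def matrix_of_map(T, n, m):
    """Build the m x n matrix of a linear map T : R^n -> R^m
    with respect to the standard bases: column k is T e_k."""
    A = np.zeros((m, n))
    for k in range(n):
        e_k = np.zeros(n)
        e_k[k] = 1.0
        A[:, k] = T(e_k)
    return A

# Illustrative linear map T(x, y, z) = (x + 2y, 3z).
T = lambda v: np.array([v[0] + 2.0 * v[1], 3.0 * v[2]])
A = matrix_of_map(T, n=3, m=2)
print(A)  # [[1. 2. 0.]
          #  [0. 0. 3.]]

# The matrix reproduces the map: A @ v equals T(v) for any v.
v = np.array([1.0, -1.0, 2.0])
print(np.allclose(A @ v, T(v)))  # True
```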
Now, given an m × n matrix A of the form (1), its transpose A⊤ is defined as the following n × m matrix:

    A⊤ = [ a11  a21  · · ·  am1
           a12  a22  · · ·  am2
            ⋮    ⋮    ⋱     ⋮
           a1n  a2n  · · ·  amn ] .
An m × m real matrix A is said to be symmetric if A = A⊤ .
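In NumPy, for example, the transpose of an array A is A.T, and symmetry can be checked by comparing a square matrix with its transpose:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
print(A.T.shape)  # (3, 2): the transpose of an m x n matrix is n x m

# A square matrix is symmetric exactly when it equals its transpose.
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(np.array_equal(S, S.T))  # True
```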
We use x ≥ 0 to indicate that all the components of x are non-negative, and x ≥ y to mean that x − y ≥ 0.
The matrix-matrix product between A ∈ R^{m×n} and B ∈ R^{n×p} is defined as

    C = AB,  where  cij = ∑_{k=1}^n aik bkj,  i = 1, . . . , m,  j = 1, . . . , p.
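The defining sum can be spelled out as a triple loop and compared against NumPy's built-in product (the @ operator); a minimal sketch:

```python
import numpy as np

def matmul_naive(A, B):
    """Compute C = A B entrywise: c_ij = sum_k a_ik * b_kj."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must agree"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
print(matmul_naive(A, B))                      # [[19. 22.]
                                               #  [43. 50.]]
print(np.allclose(matmul_naive(A, B), A @ B))  # True
```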
4 Rank and Matrix Inversion
The rank of a matrix A ∈ R^{m×n}, denoted by rank(A), is defined as the number of elements of a maximal linearly independent subset of its columns (equivalently, of its rows). Some facts about the rank of a matrix:
• rank(A) = rank(A⊤);
An n × n square matrix A is said to be invertible if its columns are linearly independent, i.e., if A has full rank. The inverse of the matrix A is denoted as A^{-1}, and we have
A A^{-1} = A^{-1} A = I.
Some facts:
• (A^{-1})^{-1} = A.
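These facts are easy to verify numerically; for instance, with NumPy:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# A is invertible because its columns are linearly independent (full rank).
print(np.linalg.matrix_rank(A))  # 2

A_inv = np.linalg.inv(A)
# A A^{-1} = A^{-1} A = I:
print(np.allclose(A @ A_inv, np.eye(2)))  # True
print(np.allclose(A_inv @ A, np.eye(2)))  # True
# (A^{-1})^{-1} = A:
print(np.allclose(np.linalg.inv(A_inv), A))  # True
```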
We now introduce a common family of norms on R^n, the Hölder p-norms, 1 ≤ p ≤ ∞. For 1 ≤ p < ∞, the p-norm is defined as

    ∥x∥p = ( ∑_{i=1}^n |xi|^p )^{1/p},

and for p = ∞ it is defined as ∥x∥∞ = max_{1≤i≤n} |xi|.
When p = 2, the 2-norm of x, given by ∥x∥2 = √(x⊤ x) = ( ∑_{i=1}^n xi^2 )^{1/2}, is also referred to as the Euclidean norm of x. A fundamental inequality that relates the inner product of two vectors to their Euclidean norms is the Cauchy-Schwarz inequality:

    |x⊤ y| ≤ ∥x∥2 ∥y∥2    for all x, y ∈ R^n.

Equality holds if and only if the vectors x and y are linearly dependent, i.e., x = αy for some α ∈ R or y = 0.
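Both the p-norms and the Cauchy-Schwarz inequality are easy to experiment with in NumPy; a small sketch (the particular vectors are arbitrary):

```python
import numpy as np

x = np.array([3.0, 4.0])
y = np.array([1.0, 2.0])

# Hölder p-norms via np.linalg.norm: p = 1, 2, and infinity.
print(np.linalg.norm(x, 1))       # 7.0  (sum of |x_i|)
print(np.linalg.norm(x, 2))       # 5.0  (Euclidean norm, sqrt(x^T x))
print(np.linalg.norm(x, np.inf))  # 4.0  (max |x_i|)

# Cauchy-Schwarz: |x^T y| <= ||x||_2 ||y||_2.
lhs = abs(x @ y)                              # 11.0
rhs = np.linalg.norm(x) * np.linalg.norm(y)   # about 11.18
print(lhs <= rhs)                             # True

# Equality holds for linearly dependent vectors, e.g. z = 2 x.
z = 2.0 * x
print(np.isclose(abs(x @ z), np.linalg.norm(x) * np.linalg.norm(z)))  # True
```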
References
[1] Strang, G. (1993). Introduction to linear algebra (Vol. 3). Wellesley, MA: Wellesley-Cambridge
Press.
[2] Horn, R. A., & Johnson, C. R. (2012). Matrix analysis. Cambridge University Press.
[3] Axler, S. J. (1997). Linear algebra done right (Vol. 2). New York: Springer.