
Linear Algebra

Department of Mathematics
Indian Institute of Technology Guwahati

January – May 2019

MA 102 (RA, RKS, MGPP, KVK)

The Vector Space R^n

Topics:
The vector space R^n
Inner product, length and angle
Linear combination
Matrices and matrix-vector multiplication

The vector space R^n

We define R^n to be the set of all ordered n-tuples of real numbers. Thus an n-tuple v ∈ R^n can be written as a

$$\text{row vector: } v = [v_1, \ldots, v_n] \quad \text{or a} \quad \text{column vector: } v = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}.$$

We always write an n-tuple in R^n as a column vector. Thus

$$\mathbb{R}^n := \left\{ \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} : v_1, \ldots, v_n \in \mathbb{R} \right\}.$$

Transpose:

$$[v_1, \ldots, v_n]^\top = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}^\top = [v_1, \ldots, v_n].$$
Algebraic properties of R^n

Define addition and scalar multiplication on R^n as follows:

$$\begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix} + \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ \vdots \\ u_n + v_n \end{bmatrix} \quad \text{and} \quad \alpha \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} \alpha u_1 \\ \vdots \\ \alpha u_n \end{bmatrix} \ \text{for } \alpha \in \mathbb{R}.$$

Then for u, v, w in R^n and scalars α, β in R, the following hold:

1. Commutativity: u + v = v + u
2. Associativity: (u + v) + w = u + (v + w)
3. Identity: u + 0 = u
4. Inverse: u + (−u) = 0
5. Distributivity: α(u + v) = αu + αv
6. Distributivity: (α + β)u = αu + βu
7. Associativity: α(βu) = (αβ)u
8. Identity: 1u = u.
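A minimal NumPy sketch (not part of the original slides) illustrating the componentwise operations and spot-checking a few of the eight properties:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
w = np.array([7.0, 8.0, 9.0])
alpha, beta = 2.0, -3.0

# Addition and scalar multiplication act componentwise.
print(u + v)       # [5. 7. 9.]
print(alpha * u)   # [2. 4. 6.]

# Spot-check a few of the eight properties.
assert np.allclose(u + v, v + u)                          # commutativity
assert np.allclose((u + v) + w, u + (v + w))              # associativity
assert np.allclose(alpha * (u + v), alpha*u + alpha*v)    # distributivity
assert np.allclose((alpha + beta) * u, alpha*u + beta*u)  # distributivity
```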
The vector space R^n

The set R^n equipped with vector addition “+” and scalar multiplication “·” is called a vector space.

Exercise: Let u, v, x be vectors in R^3.
(a) Simplify 3u + (5v − 2u) + 2(u − v).
(b) Solve 5x − u = 2(u + 2x) for x.

Length and angle: Length, distance and angle can all be described using the notion of the inner product (dot product) of two vectors.

Definition: If u := [u_1, ..., u_n]^T and v := [v_1, ..., v_n]^T are vectors in R^n, then the inner product ⟨u, v⟩ is defined by

$$\langle u, v \rangle := u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.$$

The inner product ⟨u, v⟩ is also written as the dot product u • v.

Example: If u := [1, 2, −3]^T and v := [−3, 5, 2]^T, then

$$\langle u, v \rangle = 1 \cdot (-3) + 2 \cdot 5 + (-3) \cdot 2 = 1.$$
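As a quick check of this example, here is a short NumPy sketch (not part of the original slides) computing the same dot product:

```python
import numpy as np

u = np.array([1, 2, -3])
v = np.array([-3, 5, 2])

# <u, v> = 1*(-3) + 2*5 + (-3)*2 = 1
print(np.dot(u, v))   # 1
print(u @ v)          # the @ operator computes the same dot product
```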
Length and angle

Theorem: Let u, v, and w be vectors in R^n and let α ∈ R. Then
1. ⟨u, u⟩ ≥ 0 and ⟨u, u⟩ = 0 ⟺ u = 0.
2. ⟨u, v⟩ = ⟨v, u⟩
3. ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩
4. ⟨αu, v⟩ = α⟨u, v⟩.

Definition: The norm (or length) of a vector v := [v_1, ..., v_n]^T in R^n is the nonnegative number ‖v‖ defined by

$$\|v\| := \sqrt{\langle v, v \rangle} = \sqrt{v_1^2 + \cdots + v_n^2}.$$

Cauchy-Schwarz Inequality: Let u and v be vectors in R^n. Then

$$|\langle u, v \rangle| \le \|u\| \, \|v\|.$$

Proof: Define p(t) := ⟨u + tv, u + tv⟩ for t ∈ R. Then p(t) = ‖u‖² + 2t⟨u, v⟩ + t²‖v‖² ≥ 0 for all t ∈ R, so the discriminant of this quadratic in t satisfies 4⟨u, v⟩² − 4‖u‖²‖v‖² ≤ 0, which yields the result.
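A small numerical illustration of the norm and the Cauchy-Schwarz inequality (a sketch, not from the slides; the vectors are the ones from the earlier inner-product example):

```python
import numpy as np

u = np.array([1.0, 2.0, -3.0])
v = np.array([-3.0, 5.0, 2.0])

# ||u|| = sqrt(<u, u>)
norm_u = np.sqrt(u @ u)          # equivalently np.linalg.norm(u)
norm_v = np.linalg.norm(v)

# Cauchy-Schwarz: |<u, v>| <= ||u|| ||v||
print(abs(u @ v))                # 1.0
print(norm_u * norm_v)           # ~23.07, so the inequality holds
```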
Length and angle

Theorem: Let u and v be vectors in R^n and let α ∈ R. Then
1. Positive definite: ‖u‖ = 0 ⟺ u = 0
2. Positive homogeneity: ‖αu‖ = |α| ‖u‖
3. Triangle inequality: ‖u + v‖ ≤ ‖u‖ + ‖v‖.

Unit vector: A vector v in R^n is called a unit vector if ‖v‖ = 1. If u is a nonzero vector, then v := (1/‖u‖) u is a unit vector in the direction of u. Here v is referred to as the normalization of u.

Example: The vectors e_1 := [1, 0, 0]^T, e_2 := [0, 1, 0]^T and e_3 := [0, 0, 1]^T are unit vectors in R^3. These vectors are called the standard unit vectors.

Distance: The distance d(u, v) between two vectors u := [u_1, ..., u_n]^T and v := [v_1, ..., v_n]^T in R^n is defined by

$$d(u, v) := \|u - v\| = \sqrt{(u_1 - v_1)^2 + \cdots + (u_n - v_n)^2}.$$
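A hedged NumPy sketch of normalization and distance (the vectors are arbitrary examples, not from the slides):

```python
import numpy as np

u = np.array([3.0, 4.0, 0.0])

# Normalization: v = u / ||u|| is a unit vector in the direction of u.
v = u / np.linalg.norm(u)
print(v, np.linalg.norm(v))      # [0.6 0.8 0. ] 1.0

# Distance: d(u, w) = ||u - w||
w = np.array([0.0, 0.0, 2.0])
print(np.linalg.norm(u - w))     # sqrt(3^2 + 4^2 + 2^2) = sqrt(29) ≈ 5.385
```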
Angle and orthogonality

Let u and v be nonzero vectors in R^n. Consider the triangle with sides u, v and u − v. Let θ be the angle between u and v. Then the law of cosines applied to the triangle yields

$$\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\,\|v\| \cos\theta.$$

Expanding ‖u − v‖² = ‖u‖² − 2⟨u, v⟩ + ‖v‖² gives us

$$\langle u, v \rangle = \|u\|\,\|v\| \cos\theta \implies \cos\theta = \frac{\langle u, v \rangle}{\|u\|\,\|v\|}.$$

Definition: Two vectors u and v in R^n are said to be orthogonal to each other if ⟨u, v⟩ = 0.

Example: The vectors u := [1, 1, −2]^T and v := [3, 1, 2]^T in R^3 are orthogonal as ⟨u, v⟩ = 0.

Pythagoras' Theorem: Let u and v be vectors in R^n. Then

$$\|u + v\|^2 = \|u\|^2 + \|v\|^2 \iff \langle u, v \rangle = 0.$$
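The following sketch (not part of the original slides) checks the orthogonality example and Pythagoras' theorem numerically:

```python
import numpy as np

u = np.array([1.0, 1.0, -2.0])
v = np.array([3.0, 1.0, 2.0])

# cos(theta) = <u, v> / (||u|| ||v||); here <u, v> = 3 + 1 - 4 = 0,
# so u and v are orthogonal and theta = pi/2.
cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(cos_theta, np.arccos(cos_theta))   # 0.0 1.5707963...

# Pythagoras: ||u + v||^2 = ||u||^2 + ||v||^2 holds since <u, v> = 0.
lhs = np.linalg.norm(u + v)**2
rhs = np.linalg.norm(u)**2 + np.linalg.norm(v)**2
assert np.isclose(lhs, rhs)
```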
Matrices

Definition: A matrix is a rectangular array of numbers, called the entries or elements of the matrix. The size of a matrix A is a description of the numbers of rows and columns of A. An m × n matrix A has m rows and n columns and is of the form

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \cdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$

Let a_j := [a_{1j}, ..., a_{mj}]^T be the j-th column of A for j = 1 : n. Then we represent A by its columns as $A = [a_1 \ a_2 \ \cdots \ a_n]$.

Let A_i := [a_{i1}, a_{i2}, ..., a_{in}] be the i-th row of A for i = 1 : m. Then we represent A by its rows as

$$A = \begin{bmatrix} A_1 \\ \vdots \\ A_m \end{bmatrix}.$$
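In NumPy the column and row representations correspond to slicing; a quick sketch (not from the slides, with an arbitrary example matrix):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # m = 2 rows, n = 3 columns

print(A.shape)     # (2, 3)
print(A[:, 0])     # first column a_1: [1 4]
print(A[1, :])     # second row A_2:  [4 5 6]
```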
Linear combination

Let u and v be vectors in R^n and let α and β be scalars. Adding αu and βv gives the linear combination αu + βv.

Example: Let u := [1, 1, −1]^T, v := [2, 3, 4]^T and w := [4, 5, 2]^T. Then w = 2u + v. Thus w is a linear combination of u and v.

Definition: Let v_1, ..., v_m be vectors in R^n and let α_1, ..., α_m be scalars. Then the vector u := α_1 v_1 + · · · + α_m v_m is called a linear combination of v_1, ..., v_m.

Problem: Let a_1, ..., a_n and b be vectors in R^m. Find scalars x_1, ..., x_n, if they exist, such that x_1 a_1 + · · · + x_n a_n = b.

Example: Vector equation (solved numerically in the sketch after this slide)

$$x_1 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}.$$
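As a sanity check on this example, here is a sketch that finds the scalars numerically. Note that np.linalg.solve is used only as a black box here, since solving linear systems has not yet been covered in these slides:

```python
import numpy as np

# Columns a_1, a_2, a_3 of the vector equation above, and the right-hand side b.
A = np.array([[ 1,  0, 0],
              [-1,  1, 0],
              [ 0, -1, 1]], dtype=float)
b = np.array([1.0, 3.0, 5.0])

x = np.linalg.solve(A, b)
print(x)                     # [1. 4. 9.]

# Check: x_1 a_1 + x_2 a_2 + x_3 a_3 = b.
assert np.allclose(x[0]*A[:, 0] + x[1]*A[:, 1] + x[2]*A[:, 2], b)
```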
Matrix times vector

We rewrite the linear combination x_1 a_1 + · · · + x_n a_n using a matrix. Set A := [a_1 · · · a_n] and x := [x_1, ..., x_n]^T. We define the matrix A times the vector x to be the same as the combination x_1 a_1 + · · · + x_n a_n.

Definition: Matrix-vector multiplication

$$Ax = \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = x_1 a_1 + \cdots + x_n a_n.$$

The matrix A acts on the vector x, and the result Ax is a linear combination of the columns of A.

Example: Compact notation for the vector equation above:

$$\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}.$$
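The column picture of Ax is easy to verify numerically; a sketch (not from the slides), reusing the matrix and solution from the previous example:

```python
import numpy as np

A = np.array([[ 1,  0, 0],
              [-1,  1, 0],
              [ 0, -1, 1]])
x = np.array([1, 4, 9])

# Column picture: Ax = x_1 a_1 + ... + x_n a_n.
col_combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))
print(A @ x)                         # [1 3 5]
assert np.array_equal(A @ x, col_combo)
```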
Matrix times vector

A row vector [a_{i1} · · · a_{in}] is a 1 × n matrix. Therefore

$$\begin{bmatrix} a_{i1} & \cdots & a_{in} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = a_{i1} x_1 + \cdots + a_{in} x_n.$$

Example: Matrix-vector multiplication in two ways

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_1 + x_2 \\ x_2 + x_3 \end{bmatrix} = \begin{bmatrix} [1\ 0\ 0]\,x \\ [1\ 1\ 0]\,x \\ [0\ 1\ 1]\,x \end{bmatrix}.$$
Matrix-vector multiplication

More generally,

$$\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \cdots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = x_1 \begin{bmatrix} a_{11} \\ \vdots \\ a_{m1} \end{bmatrix} + \cdots + x_n \begin{bmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{bmatrix} = \begin{bmatrix} a_{11} x_1 + \cdots + a_{1n} x_n \\ \vdots \\ a_{m1} x_1 + \cdots + a_{mn} x_n \end{bmatrix}.$$

Now represent A := [a_1 · · · a_n] by its rows: $A = \begin{bmatrix} A_1 \\ \vdots \\ A_m \end{bmatrix}$. Then we have

$$Ax = x_1 a_1 + \cdots + x_n a_n = \begin{bmatrix} a_{11} x_1 + \cdots + a_{1n} x_n \\ \vdots \\ a_{m1} x_1 + \cdots + a_{mn} x_n \end{bmatrix} = \begin{bmatrix} A_1 x \\ \vdots \\ A_m x \end{bmatrix}.$$
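The row picture can be checked the same way; a sketch (not from the slides), using the matrix from the "two ways" example with a concrete x:

```python
import numpy as np

A = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1]])
x = np.array([2, 3, 4])

# Row picture: the i-th entry of Ax is the dot product A_i x.
row_picture = np.array([A[i, :] @ x for i in range(A.shape[0])])
print(row_picture)                   # [2 5 7]
assert np.array_equal(A @ x, row_picture)
```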
Matrices and Vectors in Data Mining

Problem: Given a few key words, retrieve relevant information from a large database.

Term-Document Matrix: Term-document matrices are used in information retrieval. Consider the following five documents.

Doc. 1: The Google matrix G is a model of the Internet.
Doc. 2: Gij is nonzero if there is a link from web page j to i.
Doc. 3: The Google matrix G is used to rank all web pages.
Doc. 4: The ranking is done by solving a matrix eigenvalue problem.
Doc. 5: England dropped out of the top 10 in the FIFA ranking.

The key words, or terms, are colored blue. The set of terms is called a dictionary. Counting the frequency of terms in each document, we obtain a term-document matrix.
Term-document matrix

Term         Doc. 1   Doc. 2   Doc. 3   Doc. 4   Doc. 5
eigenvalue      0        0        0        1        0
England         0        0        0        0        1
FIFA            0        0        0        0        1
Google          1        0        1        0        0
Internet        1        0        0        0        0
link            0        1        0        0        0
matrix          1        0        1        1        0
page            0        1        1        0        0
rank            0        0        1        1        1
web             0        1        1        0        0

Each document is represented by a column of the 10 × 5 term-document matrix A, which is a vector in R^10.
Query vector

Suppose that we want to find all documents that are relevant to the query "ranking of web pages". This is represented by a query vector, constructed in the same way as the term-document matrix, using the same dictionary:

$$v := [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]^\top \in \mathbb{R}^{10}.$$

Thus the query itself is a document. The information retrieval task can now be formulated as a mathematical problem.

Problem: Find the columns of A that are close (in some sense) to the query vector v.
Query matching (use of dot product)

Query matching is the process of finding all documents that are relevant to a particular query v. The cosine of the angle between two vectors is often used to determine the relevant documents:

$$\cos\theta_j := \frac{\langle A e_j, v \rangle}{\|v\| \, \|A e_j\|} > \text{tol},$$

where Ae_j is the j-th column of A and tol is a predefined tolerance. Thus cos θ_j > tol ⇒ Ae_j is relevant.

Consider the term-document matrix A and the query ("ranking of web pages") vector v. Then the cosines between the query and the document vectors are

$$[0, 0.6667, 0.7746, 0.3333, 0.3333]^\top,$$

which shows that Docs. 2 and 3 are the most relevant.
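The whole pipeline fits in a few lines of NumPy. The sketch below (not part of the original slides) rebuilds the term-document matrix and the query vector and reproduces the cosines above; the tolerance tol = 0.5 is an assumed value chosen to separate Docs. 2 and 3 from the rest:

```python
import numpy as np

# Term-document matrix A: rows are the dictionary terms in the order
# eigenvalue, England, FIFA, Google, Internet, link, matrix, page, rank, web;
# columns are Doc. 1 ... Doc. 5.
A = np.array([
    [0, 0, 0, 1, 0],   # eigenvalue
    [0, 0, 0, 0, 1],   # England
    [0, 0, 0, 0, 1],   # FIFA
    [1, 0, 1, 0, 0],   # Google
    [1, 0, 0, 0, 0],   # Internet
    [0, 1, 0, 0, 0],   # link
    [1, 0, 1, 1, 0],   # matrix
    [0, 1, 1, 0, 0],   # page
    [0, 0, 1, 1, 1],   # rank
    [0, 1, 1, 0, 0],   # web
], dtype=float)

# Query vector for "ranking of web pages": page, rank, web.
v = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=float)

# cos(theta_j) = <A e_j, v> / (||v|| ||A e_j||), computed for all columns at once.
cosines = (A.T @ v) / (np.linalg.norm(v) * np.linalg.norm(A, axis=0))
print(np.round(cosines, 4))             # [0. 0.6667 0.7746 0.3333 0.3333]

tol = 0.5                               # assumed tolerance
print(np.where(cosines > tol)[0] + 1)   # [2 3] -> Docs. 2 and 3 are relevant
```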

