Professional Documents
Culture Documents
Learning
Lecture 4
Semester 1
2020/2021
Acknowledgement:
EE2211 development team
(Thomas, Kar-Ann, Chen Khong, Helen, Robby and Haizhou)
2
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Module II Contents
• Notations, Vectors, Matrices
• Operations on Vectors and Matrices
• Systems of Linear Equations
• Functions, Derivative and Gradient
• Least Squares, Linear Regression
• Multiple Outputs
• Ridge Regression
• Polynomial Regression
3
© Copyright EE, NUS. All Rights Reserved.
Fundamental ML Algorithms:
Linear Regression
4
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
• A scalar is a simple numerical value, like 15 or −3.25.
• Variables or constants that take scalar values are
denoted by an italic letter, like x or a.
• We shall focus on real numbers.
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.
5
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
• A vector is an ordered list of scalar values, called
attributes. We denote a vector as a bold character, for
example, x or w.
• Vectors can be visualized as arrows that point to some
directions as well as points in a multi-dimensional space.
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.
6
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
2 −2 1
Illustrations of three two-dimensional vectors, 𝐚 = ,𝐛 = , and 𝐜 =
3 5 0
are given in Figure 1 below.
7
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
• We denote an attribute of a vector as an italic value
with an index, like this: 𝑤 𝑗 𝑜𝑟 𝑥 𝑗 . The index j denotes
a specific dimension of the vector, the position of an
attribute in the list.
• For instance, in the vector a shown in red in Figure 1,
𝑎1 2 or more commonly 𝑎1 2
𝐚= 2
= 𝐚= 𝑎 = .
𝑎 3 2 3
Note:
• The notation 𝑥 𝑗 should not be confused with the power operator, such
as the 2 in 𝑥 2 (squared) or 3 in 𝑥 3 (cubed).
• If we want to apply a power operator, say square, to an indexed attribute
of a vector, we write like this: (𝑥 𝑗 )2 .
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.
8
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
• A matrix is a rectangular array of numbers arranged in
rows and columns. Below is an example of a matrix with
two rows and three columns,
2 4 −3
𝐗= .
21 −6 −1
• Matrices are denoted with bold capital letters, such as X
or W.
Note:
𝑗 𝑘
• A variable can have two or more indices, like this: 𝑥𝑖 or like this 𝑥𝑖,𝑗 .
𝑗
• For example, in neural networks, we denote as 𝑥𝑙,𝑢 the input feature j of unit u in
layer l.
• For the elements in matrix 𝐗, we shall use the indexing 𝑥1,1 where the first and the
second indices respectively indicate the row and the column positions.
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.
9
© Copyright EE, NUS. All Rights Reserved.
Recall this Example (continuous)
https://archive.ics.uci.edu/ml/datasets/iris
Operations on Vectors:
𝑥1 𝑦1 𝑥1 + 𝑦1
𝐱+𝐲= 𝑥 + 𝑦 = 𝑥 +𝑦
2 2 2 2
𝑥1 𝑦1 𝑥1 − 𝑦1
𝐱−𝐲= 𝑥 − 𝑦 = 𝑥 −𝑦
2 2 2 2
11
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
Operations on Vectors:
𝑥1 𝑎𝑥1
𝑎 𝐱 = 𝑎 𝑥 = 𝑎𝑥
2 2
1
𝑥1 𝑥
1 1 𝑎 1
𝑎
𝐱 =
𝑎 𝑥2 = 1
𝑥
𝑎 2
12
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
Matrix or Vector Transpose:
𝑥1
𝐱= 𝑥 , 𝐱 𝑇 = 𝑥1 𝑥2
2
𝑥1,1 𝑥1,2 𝑥1,3 𝑥1,1 𝑥2,1 𝑥3,1
𝐗 = 𝑥2,1 𝑥2,2 𝑥2,3 , 𝐗 𝑇 = 𝑥1,2 𝑥2,2 𝑥3,2
𝑥3,1 𝑥3,2 𝑥3,3 𝑥1,3 𝑥2,3 𝑥3,3
13
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
Dot Product or Inner Product of Vectors:
𝐱 · 𝐲 = 𝐱𝑇 𝐲
𝑦1
= 𝑥1 𝑥2 𝑦
2
= 𝑥1 𝑦1 + 𝑥2 𝑦2
Geometric Definition: 𝐱
𝐱 · 𝐲 = 𝐱 𝐲 cos𝜃
where 𝜃 is the angle
between 𝐱 and 𝐲,
and 𝐳 = 𝐳 ⋅ 𝐳 is the 𝜃
Euclidean length of 𝐲
vector 𝐳. 𝐱 cos𝜃
14
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
Matrix-Vector Product
15
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
Vector-Matrix Product
16
© Copyright EE, NUS. All Rights Reserved.
Notations, Vectors, Matrices
Matrix-Matrix Product
17
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Matrix inverse
Definition:
a d-by-d square matrix A is said to be invertible (also nonsingular)
if there exists a d-by-d square matrix B such that 𝐀𝐁 = 𝐁𝐀 = 𝐈
(identity matrix) given by
1 0…0 0
0 1 0 0
⁞ ⋱ ⁞ of d-by-d dimension.
0 0 1 0
0 0…0 1
Ref: https://en.wikipedia.org/wiki/Invertible_matrix
18
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
−1
1
𝐀 = adj(𝐀)
det 𝐀
• det 𝐀 is the determinant of 𝐀.
• adj(𝐀) is the adjugate or adjoint of 𝐀 which is the transpose
of its cofactor matrix.
• The 𝑖, 𝑗 entry of the cofactor matrix 𝐂 is the 𝑖, 𝑗 -minor
times a sign factor.
𝑎 𝑏 𝑑 −𝑏
• E.g., 𝐀 = , then adj 𝐀 = = 𝐂𝑇 .
𝑐 𝑑 −𝑐 𝑎
Ref: https://en.wikipedia.org/wiki/Invertible_matrix
19
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Determinant computation
Ref: https://en.wikipedia.org/wiki/Invertible_matrix
20
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Determinant computation
Example: 3x3 matrix
Ref: https://en.wikipedia.org/wiki/Determinant
21
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
𝑎22 𝑎23
The minor 𝐀11 = 𝑎 𝑎33
32
Ref: https://en.wikipedia.org/wiki/Determinant
22
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
𝑎21 𝑎23
The minor 𝐀12 = 𝑎 𝑎33
31
Ref: https://en.wikipedia.org/wiki/Determinant
23
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
𝑎21 𝑎22
The minor 𝐀13 = 𝑎 𝑎32
31
Ref: https://en.wikipedia.org/wiki/Determinant
24
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
𝑎12 𝑎13
The minor 𝐀21 = 𝑎 𝑎33
32
Ref: https://en.wikipedia.org/wiki/Determinant
25
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
𝑎11 𝑎13
The minor 𝐀22 = 𝑎 𝑎33
31
Ref: https://en.wikipedia.org/wiki/Determinant
26
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Example
1 2 3
Find the cofactor matrix of 𝐀 given that 𝐀 = 0 4 5 .
1 0 6
Solution:
4 5 0 5 0 4
𝑎11 = = 24, 𝑎12 = − = 5, 𝑎13 = = −4,
0 6 1 6 1 0
2 3 1 3 1 2
𝑎21 =− = −12, 𝑎22 = = 3, 𝑎23 = − = 2,
0 6 1 6 1 0
2 3 1 3 1 2
𝑎31 = = −2, 𝑎32 = − = −5, 𝑎33 = = 4,
4 5 0 5 0 4
24 5 −4
The cofactor matrix is thus −12 3 2 .
−2 − 5 4
Ref: https://www.mathwords.com/c/cofactor_matrix.htm
27
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
28
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
𝑥2 𝑥3 𝐱3
𝐱1 𝐱2
𝐱1
𝐱2 𝑥2
𝑥1 𝑥1
𝛽1 𝐱1 + 𝛽2 𝐱 2 = 0 𝛽1 𝐱1 + 𝛽2 𝐱 2 ≠ 𝛽3 𝐱3
29
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
30
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
These equations can be written compactly in matrix-
vector notation:
𝐗𝐰 = 𝐲
Where
𝑥1,1 𝑥1,2 … 𝑥1,𝑑 𝑤1 𝑦1
𝐗= ⁞ ⁞ ⋱ ⁞ , 𝐰= ⁞ , 𝐲= ⁞ .
𝑥𝑚,1 𝑥𝑚,2 … 𝑥𝑚,𝑑 𝑤𝑑 𝑦𝑚
31
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
32
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Example 1 𝑤1 + 𝑤2 = 4 (1) Two unknowns and
𝑤1 − 2𝑤2 = 1 (2) two equations
1 1 𝑤1 4
=
1 − 2 𝑤2 1
𝐗 𝐰 𝐲
ෝ = 𝐗 −1 𝐲
𝐰
−1
1 1 4 3
= =
1 −2 1 1
33
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
(ii) Over-determined system: 𝑚 > 𝑑 in 𝐗𝐰 = 𝐲, 𝐗 ∈ 𝓡𝑚×𝑑 ,
𝐰 ∈ 𝓡𝑑×1 , 𝐲 ∈ 𝓡𝑚×1 .
(i.e., there are more equations than unknowns)
This set of linear equations has NO exact solution (𝐗 is non-square and
hence not invertible). However, an approximated solution is yet available.
34
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Example 2 𝑤1 + 𝑤2 = 1 (1) Two unknowns and
𝑤1 − 𝑤2 = 0 (2) three equations
𝑤1 = 2 (3)
1 1 𝑤 1
1
1 −1 𝑤 = 0
2
1 0 2
𝐗 𝐰 𝐲
This set of linear equations has NO exact solution.
ෝ = 𝐗 † 𝐲 = (𝐗 𝑇 𝐗)−𝟏 𝐗 𝑇 𝐲
𝐰 Here (𝐗 𝑇 𝐗 is invertible)
3 0 −1
1 1 1 1 1
= 0 = (Approximation).
0 2 1 −1 0 2 0.5
35
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
(iii) Under-determined system: 𝑚 < 𝑑 in 𝐗𝐰 = 𝐲, 𝐗 ∈ 𝓡𝑚×𝑑 ,
𝐰 ∈ 𝓡𝑑×1 , 𝐲 ∈ 𝓡𝑚×1 .
(i.e., there are more unknowns than equations => infinite number of
solutions)
36
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Derivation:
Under-determined system: 𝑚 < 𝑑 in 𝐗𝐰 = 𝐲, 𝐗 ∈ 𝓡𝑚×𝑑
(i.e., there are more unknowns than equations => infinite number of
solutions => but a unique solution is yet possible by constraining the
search using 𝐰 = 𝐗 𝑇 𝐚 !)
𝐗†
right-inverse
37
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
38
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
39
© Copyright EE, NUS. All Rights Reserved.
The End
40
© Copyright EE, NUS. All Rights Reserved.