
EE2211 Introduction to Machine Learning
Lecture 4
Semester 1, 2020/2021

Kar-Ann Toh / Helen Juan Zhou


eletka@nus.edu.sg / helen.zhou@nus.edu.sg

Electrical and Computer Engineering Department


National University of Singapore

Acknowledgement:
EE2211 development team
(Thomas, Kar-Ann, Chen Khong, Helen, Robby and Haizhou)

© Copyright EE, NUS. All Rights Reserved.


Course Contents
• Introduction and Preliminaries (Haizhou)
– Introduction
– Data Engineering
– Introduction to Probability and Statistics
• Fundamental Machine Learning Algorithms I (Kar-Ann / Helen)
– Systems of linear equations
– Least squares, Linear regression
– Ridge regression, Polynomial regression
• Fundamental Machine Learning Algorithms II (Thomas)
– Over-fitting, bias/variance trade-off
– Optimization, Gradient descent
– Decision Trees, Random Forest
• Performance and More Algorithms (Haizhou)
– Performance Issues
– K-means Clustering
– Neural Networks

Systems of Linear Equations

Module II Contents
• Notations, Vectors, Matrices
• Operations on Vectors and Matrices
• Systems of Linear Equations
• Functions, Derivative and Gradient
• Least Squares, Linear Regression
• Multiple Outputs
• Ridge Regression
• Polynomial Regression

Fundamental ML Algorithms:
Linear Regression

References for Lectures 4-6:


Main
• [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019.
(read first, buy later: http://themlbook.com/wiki/doku.php)
• [Book2] Andreas C. Müller and Sarah Guido, “Introduction to Machine
Learning with Python: A Guide for Data Scientists”, O’Reilly Media, Inc.,
2017.
Supplementary
• [Book3] Jeff Leek, “The Elements of Data Analytic Style: A guide for people
who want to analyze data”, Lean Publishing, 2015.
• [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied
Linear Algebra”, Cambridge University Press, 2018 (available online)
http://vmls-book.stanford.edu/ .

Notations, Vectors, Matrices
• A scalar is a simple numerical value, like 15 or −3.25.
• Variables or constants that take scalar values are
denoted by an italic letter, like x or a.
• We shall focus on real numbers.

• The summation over a collection $\{x_1, x_2, x_3, x_4, \ldots, x_m\}$
is denoted like this:
$\sum_{i=1}^{m} x_i = x_1 + x_2 + \cdots + x_{m-1} + x_m$.
• The product over a collection $\{x_1, x_2, x_3, x_4, \ldots, x_m\}$ is
denoted like this:
$\prod_{i=1}^{m} x_i = x_1 \cdot x_2 \cdots x_{m-1} \cdot x_m$.

Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.
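These two notations map directly onto Python built-ins; a minimal sketch, with illustrative collection values:

```python
import math

# A collection {x_1, ..., x_m} as a Python list (illustrative values).
x = [2, 3, 5, 7]

# Summation over the collection: sum over i = 1..m of x_i
total = sum(x)          # 2 + 3 + 5 + 7 = 17

# Product over the collection: product over i = 1..m of x_i
product = math.prod(x)  # 2 * 3 * 5 * 7 = 210
```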

Notations, Vectors, Matrices
• A vector is an ordered list of scalar values, called
attributes. We denote a vector as a bold character, for
example, x or w.
• Vectors can be visualized as arrows that point in some
direction, as well as points in a multi-dimensional space.

• In many books, vectors are written column-wise:

$\mathbf{a} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} -2 \\ 5 \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$

Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.

Notations, Vectors, Matrices

Illustrations of the three two-dimensional vectors
$\mathbf{a} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} -2 \\ 5 \end{bmatrix}$, and $\mathbf{c} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$
are given in Figure 1 below.

Figure 1: Three vectors visualized as directions and as points.


Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.

Notations, Vectors, Matrices
• We denote an attribute of a vector as an italic value
with an index, like this: $w^{(j)}$ or $x^{(j)}$. The index j denotes
a specific dimension of the vector, the position of an
attribute in the list.
• For instance, in the vector a shown in red in Figure 1,

$\mathbf{a} = \begin{bmatrix} a^{(1)} \\ a^{(2)} \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, or more commonly $\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$.

Note:
• The notation $x^{(j)}$ should not be confused with the power operator, such
as the 2 in $x^2$ (squared) or 3 in $x^3$ (cubed).
• If we want to apply a power operator, say square, to an indexed attribute
of a vector, we write like this: $(x^{(j)})^2$.

Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.

Notations, Vectors, Matrices
• A matrix is a rectangular array of numbers arranged in
rows and columns. Below is an example of a matrix with
two rows and three columns,

$\mathbf{X} = \begin{bmatrix} 2 & 4 & -3 \\ 21 & -6 & -1 \end{bmatrix}.$

• Matrices are denoted with bold capital letters, such as X
or W.

Note:
• A variable can have two or more indices, like this: $x_i^{(j)}$ or like this $x_{i,j}^{(k)}$.
• For example, in neural networks, we denote as $x_{l,u}^{(j)}$ the input feature j of unit u in
layer l.
• For the elements in matrix $\mathbf{X}$, we shall use the indexing $x_{1,1}$, where the first and the
second indices respectively indicate the row and the column positions.
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.

Recall this Example (continued)

Iris data set

• Measurement features
can be packed as
$\mathbf{X} \in \mathcal{R}^{150 \times 4}$

• Labels can be written as a vector $\mathbf{y} \in \mathcal{R}^{150 \times 1}$
https://archive.ics.uci.edu/ml/datasets/iris

Notations, Vectors, Matrices

Operations on Vectors:

$\mathbf{x} + \mathbf{y} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \end{bmatrix}$

$\mathbf{x} - \mathbf{y} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 - y_1 \\ x_2 - y_2 \end{bmatrix}$

Notations, Vectors, Matrices

Operations on Vectors:

$a\mathbf{x} = a \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a x_1 \\ a x_2 \end{bmatrix}$

$\dfrac{1}{a}\mathbf{x} = \dfrac{1}{a} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{a} x_1 \\ \frac{1}{a} x_2 \end{bmatrix}$
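These element-wise vector operations map directly onto NumPy's `+`, `-`, `*`, and `/` operators; a minimal sketch with illustrative values:

```python
import numpy as np

# Two 2-vectors (represented as 1-D NumPy arrays for simplicity).
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

s = x + y    # element-wise addition:        [4.0, 1.0]
d = x - y    # element-wise subtraction:     [-2.0, 3.0]
m = 3 * x    # scalar multiplication a*x:    [3.0, 6.0]
q = x / 2    # scalar division (1/a)*x:      [0.5, 1.0]
```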

Notations, Vectors, Matrices
Matrix or Vector Transpose:

$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad \mathbf{x}^T = \begin{bmatrix} x_1 & x_2 \end{bmatrix}$

$\mathbf{X} = \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ x_{2,1} & x_{2,2} & x_{2,3} \\ x_{3,1} & x_{3,2} & x_{3,3} \end{bmatrix}, \quad \mathbf{X}^T = \begin{bmatrix} x_{1,1} & x_{2,1} & x_{3,1} \\ x_{1,2} & x_{2,2} & x_{3,2} \\ x_{1,3} & x_{2,3} & x_{3,3} \end{bmatrix}$

Notations, Vectors, Matrices
Dot Product or Inner Product of Vectors:

$\mathbf{x} \cdot \mathbf{y} = \mathbf{x}^T \mathbf{y} = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = x_1 y_1 + x_2 y_2$

Geometric Definition:

$\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\| \|\mathbf{y}\| \cos\theta$

where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$, and
$\|\mathbf{z}\| = \sqrt{\mathbf{z} \cdot \mathbf{z}}$ is the Euclidean length of vector $\mathbf{z}$.

[Figure: vectors x and y separated by angle θ, with the projection ‖x‖ cos θ of x onto y.]
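Both definitions of the dot product can be checked numerically; a minimal NumPy sketch using the vectors a and b from Figure 1 (the two definitions agree up to floating-point rounding):

```python
import numpy as np

x = np.array([2.0, 3.0])
y = np.array([-2.0, 5.0])

# Algebraic definition: x . y = x1*y1 + x2*y2
dot = x @ y                                    # -4 + 15 = 11

# Geometric definition: x . y = ||x|| ||y|| cos(theta)
norm_x = np.linalg.norm(x)                     # Euclidean length sqrt(x . x)
norm_y = np.linalg.norm(y)
cos_theta = dot / (norm_x * norm_y)
dot_geometric = norm_x * norm_y * cos_theta    # equals the algebraic value
```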

Notations, Vectors, Matrices

Matrix-Vector Product

$\mathbf{X}\mathbf{w} = \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ x_{2,1} & x_{2,2} & x_{2,3} \\ x_{3,1} & x_{3,2} & x_{3,3} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} x_{1,1} w_1 + x_{1,2} w_2 + x_{1,3} w_3 \\ x_{2,1} w_1 + x_{2,2} w_2 + x_{2,3} w_3 \\ x_{3,1} w_1 + x_{3,2} w_2 + x_{3,3} w_3 \end{bmatrix}$

Notations, Vectors, Matrices

Vector-Matrix Product

$\mathbf{x}^T \mathbf{W} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} w_{1,1} & w_{1,2} & w_{1,3} \\ w_{2,1} & w_{2,2} & w_{2,3} \\ w_{3,1} & w_{3,2} & w_{3,3} \end{bmatrix}$
$= \begin{bmatrix} (x_1 w_{1,1} + x_2 w_{2,1} + x_3 w_{3,1}) & (x_1 w_{1,2} + x_2 w_{2,2} + x_3 w_{3,2}) & (x_1 w_{1,3} + x_2 w_{2,3} + x_3 w_{3,3}) \end{bmatrix}$

Notations, Vectors, Matrices

Matrix-Matrix Product

$\mathbf{X}\mathbf{W} = \begin{bmatrix} x_{1,1} & \cdots & x_{1,d} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \cdots & x_{m,d} \end{bmatrix} \begin{bmatrix} w_{1,1} & \cdots & w_{1,h} \\ \vdots & \ddots & \vdots \\ w_{d,1} & \cdots & w_{d,h} \end{bmatrix}$

$= \begin{bmatrix} (x_{1,1} w_{1,1} + \cdots + x_{1,d} w_{d,1}) & \cdots & (x_{1,1} w_{1,h} + \cdots + x_{1,d} w_{d,h}) \\ \vdots & \ddots & \vdots \\ (x_{m,1} w_{1,1} + \cdots + x_{m,d} w_{d,1}) & \cdots & (x_{m,1} w_{1,h} + \cdots + x_{m,d} w_{d,h}) \end{bmatrix}$

$= \begin{bmatrix} \sum_{i=1}^{d} x_{1,i} w_{i,1} & \cdots & \sum_{i=1}^{d} x_{1,i} w_{i,h} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{d} x_{m,i} w_{i,1} & \cdots & \sum_{i=1}^{d} x_{m,i} w_{i,h} \end{bmatrix}$
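The entry-wise formula above can be checked against NumPy's `@` operator; a minimal sketch with illustrative 2×3 and 3×2 matrices:

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # m = 2, d = 3
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # d = 3, h = 2

P = X @ W                         # (m x d) @ (d x h) -> (m x h)

# Entry (i, j) is the sum over i' = 1..d of X[i, i'] * W[i', j].
p00 = sum(X[0, k] * W[k, 0] for k in range(3))   # matches P[0, 0]
```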

Systems of Linear Equations

Matrix inverse

Definition:
A d-by-d square matrix A is said to be invertible (also nonsingular)
if there exists a d-by-d square matrix B such that $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} = \mathbf{I}$
(identity matrix), given by

$\mathbf{I} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{bmatrix}$ of d-by-d dimension.

Ref: https://en.wikipedia.org/wiki/Invertible_matrix

Systems of Linear Equations

Matrix inverse computation

$\mathbf{A}^{-1} = \dfrac{1}{\det \mathbf{A}} \operatorname{adj}(\mathbf{A})$

• $\det \mathbf{A}$ is the determinant of $\mathbf{A}$.
• $\operatorname{adj}(\mathbf{A})$ is the adjugate or adjoint of $\mathbf{A}$, which is the transpose
of its cofactor matrix.
• The $(i, j)$ entry of the cofactor matrix $\mathbf{C}$ is the $(i, j)$-minor
times a sign factor $(-1)^{i+j}$.
• E.g., if $\mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then $\operatorname{adj}(\mathbf{A}) = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \mathbf{C}^T$.

Ref: https://en.wikipedia.org/wiki/Invertible_matrix
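The adjugate formula can be verified numerically in the 2×2 case; a minimal sketch with an illustrative invertible matrix:

```python
import numpy as np

# 2x2 inverse via A^{-1} = adj(A) / det(A), for A = [[a, b], [c, d]].
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
a, b = A[0]
c, d = A[1]

det_A = a * d - b * c             # ad - bc = 24 - 14 = 10
adj_A = np.array([[d, -b],
                  [-c, a]])       # transpose of the cofactor matrix

A_inv = adj_A / det_A             # agrees with np.linalg.inv(A)
```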

Systems of Linear Equations

Determinant computation

Example: 2x2 matrix

$\det \mathbf{A} = |\mathbf{A}| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$

Ref: https://en.wikipedia.org/wiki/Invertible_matrix

Systems of Linear Equations

Determinant computation
Example: 3x3 matrix

$\det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg)$

Ref: https://en.wikipedia.org/wiki/Determinant
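The first-row cofactor expansion can be cross-checked against `np.linalg.det`; a minimal sketch with an illustrative 3×3 matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
(a, b, c), (d, e, f), (g, h, i) = A

# Cofactor expansion along the first row:
# det A = a(ei - fh) - b(di - fg) + c(dh - eg)
det_manual = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
```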

Systems of Linear Equations

The minor $\mathbf{A}_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations

The minor $\mathbf{A}_{12} = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations

The minor $\mathbf{A}_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations

The minor $\mathbf{A}_{21} = \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations

The minor $\mathbf{A}_{22} = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations
Example

Find the cofactor matrix of $\mathbf{A}$ given that $\mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 1 & 0 & 6 \end{bmatrix}$.

Solution (writing $c_{ij}$ for the cofactors, to avoid clashing with the entries $a_{ij}$ of $\mathbf{A}$):

$c_{11} = \begin{vmatrix} 4 & 5 \\ 0 & 6 \end{vmatrix} = 24$, $\quad c_{12} = -\begin{vmatrix} 0 & 5 \\ 1 & 6 \end{vmatrix} = 5$, $\quad c_{13} = \begin{vmatrix} 0 & 4 \\ 1 & 0 \end{vmatrix} = -4$,
$c_{21} = -\begin{vmatrix} 2 & 3 \\ 0 & 6 \end{vmatrix} = -12$, $\quad c_{22} = \begin{vmatrix} 1 & 3 \\ 1 & 6 \end{vmatrix} = 3$, $\quad c_{23} = -\begin{vmatrix} 1 & 2 \\ 1 & 0 \end{vmatrix} = 2$,
$c_{31} = \begin{vmatrix} 2 & 3 \\ 4 & 5 \end{vmatrix} = -2$, $\quad c_{32} = -\begin{vmatrix} 1 & 3 \\ 0 & 5 \end{vmatrix} = -5$, $\quad c_{33} = \begin{vmatrix} 1 & 2 \\ 0 & 4 \end{vmatrix} = 4$.

The cofactor matrix is thus $\begin{bmatrix} 24 & 5 & -4 \\ -12 & 3 & 2 \\ -2 & -5 & 4 \end{bmatrix}$.

Ref: https://www.mathwords.com/c/cofactor_matrix.htm
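The worked example can be reproduced with a small generic routine (a sketch; `cofactor_matrix` is a helper written here, not a NumPy built-in):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 0.0, 6.0]])

def cofactor_matrix(A):
    """Entry (i, j) is the (i, j)-minor times the sign factor (-1)^(i+j)."""
    n = A.shape[0]
    C = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            # Delete row i and column j to form the minor's submatrix.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

C = cofactor_matrix(A)
# Matches the worked example: [[24, 5, -4], [-12, 3, 2], [-2, -5, 4]]
```

As a sanity check, the adjugate identity gives A · adj(A) = A · Cᵀ = det(A) · I.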

Systems of Linear Equations

Linear dependence and independence

• A collection of d-vectors $\mathbf{x}_1, \ldots, \mathbf{x}_m$ (with $m \ge 1$) is called
linearly dependent if
$\beta_1 \mathbf{x}_1 + \cdots + \beta_m \mathbf{x}_m = \mathbf{0}$
holds for some $\beta_1, \ldots, \beta_m$ that are not all zero.

• A collection of d-vectors $\mathbf{x}_1, \ldots, \mathbf{x}_m$ (with $m \ge 1$) is called
linearly independent if it is not linearly dependent, which
means that
$\beta_1 \mathbf{x}_1 + \cdots + \beta_m \mathbf{x}_m = \mathbf{0}$
only holds for $\beta_1 = \cdots = \beta_m = 0$.

Note: a vector of size/dimension/length d is called a d-vector;
$\beta_1, \ldots, \beta_m$ are scalars.

Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to
Applied Linear Algebra”, Cambridge University Press, 2018 (Chp5)
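In practice, linear independence is usually checked via the matrix rank: vectors stacked as columns are independent exactly when the matrix has full column rank. A minimal NumPy sketch with illustrative vectors:

```python
import numpy as np

x1 = np.array([2.0, 3.0])
x2 = np.array([-2.0, 5.0])
X = np.column_stack([x1, x2])
rank_independent = np.linalg.matrix_rank(X)    # 2 -> independent

x3 = np.array([4.0, 6.0])                      # x3 = 2 * x1 (dependent on x1)
X2 = np.column_stack([x1, x3])
rank_dependent = np.linalg.matrix_rank(X2)     # 1 -> dependent
```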

Systems of Linear Equations

Geometry of dependency and independency

[Figure: left panel shows two 2-vectors $\mathbf{x}_1$ and $\mathbf{x}_2$ that are linearly dependent
($\beta_1 \mathbf{x}_1 + \beta_2 \mathbf{x}_2 = \mathbf{0}$ for some nonzero $\beta_1, \beta_2$); right panel shows three 3-vectors
$\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ that are linearly independent ($\beta_1 \mathbf{x}_1 + \beta_2 \mathbf{x}_2 \ne \beta_3 \mathbf{x}_3$).]

Systems of Linear Equations

• Consider a system of m linear equations in d unknowns
$w_1, \ldots, w_d$:

$x_{1,1} w_1 + x_{1,2} w_2 + \cdots + x_{1,d} w_d = y_1$
$x_{2,1} w_1 + x_{2,2} w_2 + \cdots + x_{2,d} w_d = y_2$
$\vdots$
$x_{m,1} w_1 + x_{m,2} w_2 + \cdots + x_{m,d} w_d = y_m.$

Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to


Applied Linear Algebra”, Cambridge University Press, 2018 (Chp8.3)

Systems of Linear Equations
These equations can be written compactly in matrix-
vector notation:

$\mathbf{X}\mathbf{w} = \mathbf{y}$

where

$\mathbf{X} = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & \cdots & x_{m,d} \end{bmatrix}, \quad \mathbf{w} = \begin{bmatrix} w_1 \\ \vdots \\ w_d \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}.$

Note: $\mathbf{X}$ is of m-by-d dimension (denoted as $\mathbf{X} \in \mathcal{R}^{m \times d}$ for real matrix $\mathbf{X}$).

Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to


Applied Linear Algebra”, Cambridge University Press, 2018 (Chp8.3)

Systems of Linear Equations

(i) Square or even-determined system: $m = d$ in $\mathbf{X}\mathbf{w} = \mathbf{y}$,
$\mathbf{X} \in \mathcal{R}^{m \times d}$, $\mathbf{w} \in \mathcal{R}^{d \times 1}$, $\mathbf{y} \in \mathcal{R}^{m \times 1}$.
(equal number of equations and unknowns, i.e., $\mathbf{X} \in \mathcal{R}^{d \times d}$)

If $\mathbf{X}$ is invertible (i.e., $\mathbf{X}^{-1}\mathbf{X} = \mathbf{I}$), then pre-multiplying both sides by $\mathbf{X}^{-1}$
gives
$\mathbf{X}^{-1}\mathbf{X}\mathbf{w} = \mathbf{X}^{-1}\mathbf{y}$
$\Rightarrow \hat{\mathbf{w}} = \mathbf{X}^{-1}\mathbf{y}.$
(Note: we use a hat on top of $\mathbf{w}$ to indicate that it is a specific point in the space of $\mathbf{w}$.)

If all rows or columns of $\mathbf{X}$ are linearly independent, then $\mathbf{X}$ is
invertible.

Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to


Applied Linear Algebra”, Cambridge University Press, 2018 (Chp11.2)

Systems of Linear Equations
Example 1: two unknowns and two equations
$w_1 + w_2 = 4$  (1)
$w_1 - 2w_2 = 1$  (2)

$\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 4 \\ 1 \end{bmatrix}}_{\mathbf{y}}$

$\hat{\mathbf{w}} = \mathbf{X}^{-1}\mathbf{y} = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}^{-1} \begin{bmatrix} 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$

Here, the rows or columns of $\mathbf{X}$ are linearly independent, hence $\mathbf{X}$ is invertible.
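Example 1 can be reproduced in a few lines of NumPy:

```python
import numpy as np

# Example 1: square (even-determined) system Xw = y.
X = np.array([[1.0, 1.0],
              [1.0, -2.0]])
y = np.array([4.0, 1.0])

w_hat = np.linalg.inv(X) @ y      # w_hat = X^{-1} y = [3, 1]
# np.linalg.solve(X, y) is the numerically preferred equivalent.
```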

Systems of Linear Equations
(ii) Over-determined system: $m > d$ in $\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathcal{R}^{m \times d}$,
$\mathbf{w} \in \mathcal{R}^{d \times 1}$, $\mathbf{y} \in \mathcal{R}^{m \times 1}$.
(i.e., there are more equations than unknowns)
This set of linear equations generally has NO exact solution ($\mathbf{X}$ is non-square and
hence not invertible). However, an approximate solution is still available.

If the left-inverse of $\mathbf{X}$ exists such that $\mathbf{X}^{\dagger}\mathbf{X} = \mathbf{I}$, then pre-multiplying both
sides by $\mathbf{X}^{\dagger}$ results in
$\mathbf{X}^{\dagger}\mathbf{X}\mathbf{w} = \mathbf{X}^{\dagger}\mathbf{y}$
$\Rightarrow \hat{\mathbf{w}} = \mathbf{X}^{\dagger}\mathbf{y}.$

Definition: a matrix B that satisfies $\mathbf{B}\mathbf{A} = \mathbf{I}$ (identity matrix) is called a
left-inverse of $\mathbf{A}$. (Note: A is m-by-d and B is d-by-m.)
Note: The left-inverse can be computed as $\mathbf{X}^{\dagger} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$ given that $\mathbf{X}^T\mathbf{X}$
is invertible. Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to
Applied Linear Algebra”, Cambridge University Press, 2018 (Chp11.1-11.2, 11.5)

Systems of Linear Equations
Example 2: two unknowns and three equations
$w_1 + w_2 = 1$  (1)
$w_1 - w_2 = 0$  (2)
$w_1 = 2$  (3)

$\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 0 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}}_{\mathbf{y}}$

This set of linear equations has NO exact solution.

$\hat{\mathbf{w}} = \mathbf{X}^{\dagger}\mathbf{y} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$  (here $\mathbf{X}^T\mathbf{X}$ is invertible)
$= \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}^{-1} \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0.5 \end{bmatrix}$  (approximation).
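Example 2 can be reproduced with the left-inverse formula (in practice `np.linalg.lstsq` computes the same least-squares solution more stably):

```python
import numpy as np

# Example 2: over-determined system (3 equations, 2 unknowns).
X = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [1.0, 0.0]])
y = np.array([1.0, 0.0, 2.0])

# Left-inverse (least-squares) solution: w_hat = (X^T X)^{-1} X^T y
w_hat = np.linalg.inv(X.T @ X) @ X.T @ y      # [1.0, 0.5]
```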

Systems of Linear Equations
(iii) Under-determined system: $m < d$ in $\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathcal{R}^{m \times d}$,
$\mathbf{w} \in \mathcal{R}^{d \times 1}$, $\mathbf{y} \in \mathcal{R}^{m \times 1}$.
(i.e., there are more unknowns than equations => infinite number of
solutions)

If the right-inverse of $\mathbf{X}$ exists such that $\mathbf{X}\mathbf{X}^{\dagger} = \mathbf{I}$, then the d-vector
$\mathbf{w} = \mathbf{X}^{\dagger}\mathbf{y}$ (one of the infinitely many solutions) satisfies the equation $\mathbf{X}\mathbf{w} = \mathbf{y}$, i.e.,
$\mathbf{X}\mathbf{X}^{\dagger}\mathbf{y} = \mathbf{y}$
$\Rightarrow \mathbf{y} = \mathbf{y}.$

Definition: a matrix B that satisfies $\mathbf{A}\mathbf{B} = \mathbf{I}$ (identity matrix) is called a
right-inverse of $\mathbf{A}$. (Note: A is m-by-d and B is d-by-m.)

Note: The right-inverse can be computed as $\mathbf{X}^{\dagger} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}$ given that $\mathbf{X}\mathbf{X}^T$
is invertible. Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to
Applied Linear Algebra”, Cambridge University Press, 2018 (Chp11.1-11.2, 11.5)

Systems of Linear Equations

Derivation:
Under-determined system: $m < d$ in $\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathcal{R}^{m \times d}$
(i.e., there are more unknowns than equations => infinite number of
solutions => but a unique solution is still possible by constraining the
search using $\mathbf{w} = \mathbf{X}^T\mathbf{a}$!)

If $\mathbf{X}\mathbf{X}^T$ is invertible, let $\mathbf{w} = \mathbf{X}^T\mathbf{a}$; then
$\mathbf{X}\mathbf{X}^T\mathbf{a} = \mathbf{y}$
$\Rightarrow \hat{\mathbf{a}} = (\mathbf{X}\mathbf{X}^T)^{-1}\mathbf{y}$
$\Rightarrow \hat{\mathbf{w}} = \mathbf{X}^T\hat{\mathbf{a}} = \underbrace{\mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}}_{\mathbf{X}^{\dagger},\ \text{right-inverse}} \mathbf{y}.$

Systems of Linear Equations

Example 3: three unknowns and two equations
$w_1 + 2w_2 + 3w_3 = 2$  (1)
$w_1 - 2w_2 + 3w_3 = 1$  (2)

$\underbrace{\begin{bmatrix} 1 & 2 & 3 \\ 1 & -2 & 3 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 2 \\ 1 \end{bmatrix}}_{\mathbf{y}}$

This set of linear equations has infinitely many solutions along the
intersection line.

$\hat{\mathbf{w}} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}\mathbf{y}$  (here $\mathbf{X}\mathbf{X}^T$ is invertible)
$= \begin{bmatrix} 1 & 1 \\ 2 & -2 \\ 3 & 3 \end{bmatrix} \begin{bmatrix} 14 & 6 \\ 6 & 14 \end{bmatrix}^{-1} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.15 \\ 0.25 \\ 0.45 \end{bmatrix}$  (constrained solution).
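Example 3 can be reproduced with the right-inverse formula:

```python
import numpy as np

# Example 3: under-determined system (2 equations, 3 unknowns).
X = np.array([[1.0, 2.0, 3.0],
              [1.0, -2.0, 3.0]])
y = np.array([2.0, 1.0])

# Right-inverse (constrained) solution: w_hat = X^T (X X^T)^{-1} y
w_hat = X.T @ np.linalg.inv(X @ X.T) @ y      # [0.15, 0.25, 0.45]
# It satisfies the system exactly: X @ w_hat == y.
```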

Systems of Linear Equations

Example 4: three unknowns and two equations
$w_1 + 2w_2 + 3w_3 = 2$  (1)
$3w_1 + 6w_2 + 9w_3 = 1$  (2)

$\underbrace{\begin{bmatrix} 1 & 2 & 3 \\ 3 & 6 & 9 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 2 \\ 1 \end{bmatrix}}_{\mathbf{y}}$

Here both $\mathbf{X}\mathbf{X}^T$ and $\mathbf{X}^T\mathbf{X}$ are NOT invertible: the rows of $\mathbf{X}$
are linearly dependent (row 2 is 3 times row 1), yet $1 \ne 3 \times 2$,
so the two equations are inconsistent.

There is NO solution for the system.
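The failure in Example 4 shows up numerically as a rank deficiency; a minimal NumPy sketch:

```python
import numpy as np

# Example 4: the rows of X are linearly dependent (row 2 = 3 * row 1),
# so X X^T (and X^T X) is singular and no left/right inverse exists.
X = np.array([[1.0, 2.0, 3.0],
              [3.0, 6.0, 9.0]])
y = np.array([2.0, 1.0])

rank_X = np.linalg.matrix_rank(X)           # 1 < m = 2
det_XXT = np.linalg.det(X @ X.T)            # ~0: X X^T is singular

# The augmented matrix [X | y] has rank 2 > rank(X), so the
# system is inconsistent: no w satisfies Xw = y exactly.
rank_aug = np.linalg.matrix_rank(np.column_stack([X, y]))
```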

The End

