
EE2211 Introduction to Machine Learning
Lecture 4
Semester 1, 2020/2021

Kar-Ann Toh / Helen Juan Zhou


eletka@nus.edu.sg / helen.zhou@nus.edu.sg

Electrical and Computer Engineering Department


National University of Singapore

Acknowledgement:
EE2211 development team
(Thomas, Kar-Ann, Chen Khong, Helen, Robby and Haizhou)

© Copyright EE, NUS. All Rights Reserved.


Course Contents
• Introduction and Preliminaries (Haizhou)
– Introduction
– Data Engineering
– Introduction to Probability and Statistics
• Fundamental Machine Learning Algorithms I (Kar-Ann / Helen)
– Systems of linear equations
– Least squares, Linear regression
– Ridge regression, Polynomial regression
• Fundamental Machine Learning Algorithms II (Thomas)
– Over-fitting, bias/variance trade-off
– Optimization, Gradient descent
– Decision Trees, Random Forest
• Performance and More Algorithms (Haizhou)
– Performance Issues
– K-means Clustering
– Neural Networks

Systems of Linear Equations

Module II Contents
• Notations, Vectors, Matrices
• Operations on Vectors and Matrices
• Systems of Linear Equations
• Functions, Derivative and Gradient
• Least Squares, Linear Regression
• Multiple Outputs
• Ridge Regression
• Polynomial Regression

Fundamental ML Algorithms:
Linear Regression

References for Lectures 4-6:


Main
• [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019.
(read first, buy later: http://themlbook.com/wiki/doku.php)
• [Book2] Andreas C. Müller and Sarah Guido, “Introduction to Machine
Learning with Python: A Guide for Data Scientists”, O’Reilly Media, Inc.,
2017.
Supplementary
• [Book3] Jeff Leek, “The Elements of Data Analytic Style: A guide for people
who want to analyze data”, Lean Publishing, 2015.
• [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied
Linear Algebra”, Cambridge University Press, 2018 (available online)
http://vmls-book.stanford.edu/ .

Notations, Vectors, Matrices
• A scalar is a simple numerical value, like 15 or −3.25.
• Variables or constants that take scalar values are
denoted by an italic letter, like x or a.
• We shall focus on real numbers.

• The summation over a collection $\{x_1, x_2, x_3, x_4, \ldots, x_m\}$
is denoted like this:
$\sum_{i=1}^{m} x_i = x_1 + x_2 + \cdots + x_{m-1} + x_m$.
• The product over a collection $\{x_1, x_2, x_3, x_4, \ldots, x_m\}$ is
denoted like this:
$\prod_{i=1}^{m} x_i = x_1 \cdot x_2 \cdots x_{m-1} \cdot x_m$.

Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.
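These two notations map directly onto Python built-ins; a minimal sketch, with illustrative collection values:

```python
import math

# A collection {x_1, ..., x_m} as a Python list (illustrative values).
x = [2, 3, 5, 7]

# Summation over the collection: sum over i = 1..m of x_i
total = sum(x)          # 2 + 3 + 5 + 7 = 17

# Product over the collection: product over i = 1..m of x_i
product = math.prod(x)  # 2 * 3 * 5 * 7 = 210
```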

Notations, Vectors, Matrices
• A vector is an ordered list of scalar values, called
attributes. We denote a vector as a bold character, for
example, x or w.
• Vectors can be visualized as arrows that point in some
direction, as well as points in a multi-dimensional space.

• In many books, vectors are written column-wise:

$\mathbf{a} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} -2 \\ 5 \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$

Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.

Notations, Vectors, Matrices

Illustrations of the three two-dimensional vectors
$\mathbf{a} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} -2 \\ 5 \end{bmatrix}$, and $\mathbf{c} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$
are given in Figure 1 below.

Figure 1: Three vectors visualized as directions and as points.


Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.

Notations, Vectors, Matrices
• We denote an attribute of a vector as an italic value
with an index, like this: $w^{(j)}$ or $x^{(j)}$. The index j denotes
a specific dimension of the vector, the position of an
attribute in the list.
• For instance, in the vector a shown in red in Figure 1,

$\mathbf{a} = \begin{bmatrix} a^{(1)} \\ a^{(2)} \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, or more commonly $\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$.

Note:
• The notation $x^{(j)}$ should not be confused with the power operator, such
as the 2 in $x^2$ (squared) or 3 in $x^3$ (cubed).
• If we want to apply a power operator, say square, to an indexed attribute
of a vector, we write like this: $(x^{(j)})^2$.

Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.

Notations, Vectors, Matrices
• A matrix is a rectangular array of numbers arranged in
rows and columns. Below is an example of a matrix with
two rows and three columns,

$\mathbf{X} = \begin{bmatrix} 2 & 4 & -3 \\ 21 & -6 & -1 \end{bmatrix}.$

• Matrices are denoted with bold capital letters, such as X
or W.

Note:
• A variable can have two or more indices, like this: $x_i^{(j)}$ or like this $x_{i,j}^{(k)}$.
• For example, in neural networks, we denote as $x_{l,u}^{(j)}$ the input feature j of unit u in
layer l.
• For the elements in matrix $\mathbf{X}$, we shall use the indexing $x_{1,1}$, where the first and the
second indices respectively indicate the row and the column positions.
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019, Chp2.

Recall this Example (continued)

Iris data set

• Measurement features
can be packed as
$\mathbf{X} \in \mathcal{R}^{150 \times 4}$

• Labels can be written as a vector $\mathbf{y} \in \mathcal{R}^{150 \times 1}$
https://archive.ics.uci.edu/ml/datasets/iris

Notations, Vectors, Matrices

Operations on Vectors:

$\mathbf{x} + \mathbf{y} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \end{bmatrix}$

$\mathbf{x} - \mathbf{y} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 - y_1 \\ x_2 - y_2 \end{bmatrix}$

Notations, Vectors, Matrices

Operations on Vectors:

$a\mathbf{x} = a \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a x_1 \\ a x_2 \end{bmatrix}$

$\dfrac{1}{a}\mathbf{x} = \dfrac{1}{a} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{a} x_1 \\ \frac{1}{a} x_2 \end{bmatrix}$
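These element-wise vector operations map directly onto NumPy's `+`, `-`, `*`, and `/` operators; a minimal sketch with illustrative values:

```python
import numpy as np

# Two 2-vectors (represented as 1-D NumPy arrays for simplicity).
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

s = x + y    # element-wise addition:        [4.0, 1.0]
d = x - y    # element-wise subtraction:     [-2.0, 3.0]
m = 3 * x    # scalar multiplication a*x:    [3.0, 6.0]
q = x / 2    # scalar division (1/a)*x:      [0.5, 1.0]
```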

Notations, Vectors, Matrices
Matrix or Vector Transpose:

$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad \mathbf{x}^T = \begin{bmatrix} x_1 & x_2 \end{bmatrix}$

$\mathbf{X} = \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ x_{2,1} & x_{2,2} & x_{2,3} \\ x_{3,1} & x_{3,2} & x_{3,3} \end{bmatrix}, \quad \mathbf{X}^T = \begin{bmatrix} x_{1,1} & x_{2,1} & x_{3,1} \\ x_{1,2} & x_{2,2} & x_{3,2} \\ x_{1,3} & x_{2,3} & x_{3,3} \end{bmatrix}$

Notations, Vectors, Matrices
Dot Product or Inner Product of Vectors:

$\mathbf{x} \cdot \mathbf{y} = \mathbf{x}^T \mathbf{y} = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = x_1 y_1 + x_2 y_2$

Geometric Definition:

$\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\| \|\mathbf{y}\| \cos\theta$

where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$, and
$\|\mathbf{z}\| = \sqrt{\mathbf{z} \cdot \mathbf{z}}$ is the Euclidean length of vector $\mathbf{z}$.

[Figure: vectors x and y separated by angle θ, with the projection ‖x‖ cos θ of x onto y.]
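Both definitions of the dot product can be checked numerically; a minimal NumPy sketch using the vectors a and b from Figure 1 (the two definitions agree up to floating-point rounding):

```python
import numpy as np

x = np.array([2.0, 3.0])
y = np.array([-2.0, 5.0])

# Algebraic definition: x . y = x1*y1 + x2*y2
dot = x @ y                                    # -4 + 15 = 11

# Geometric definition: x . y = ||x|| ||y|| cos(theta)
norm_x = np.linalg.norm(x)                     # Euclidean length sqrt(x . x)
norm_y = np.linalg.norm(y)
cos_theta = dot / (norm_x * norm_y)
dot_geometric = norm_x * norm_y * cos_theta    # equals the algebraic value
```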

Notations, Vectors, Matrices

Matrix-Vector Product

$\mathbf{X}\mathbf{w} = \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ x_{2,1} & x_{2,2} & x_{2,3} \\ x_{3,1} & x_{3,2} & x_{3,3} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} x_{1,1} w_1 + x_{1,2} w_2 + x_{1,3} w_3 \\ x_{2,1} w_1 + x_{2,2} w_2 + x_{2,3} w_3 \\ x_{3,1} w_1 + x_{3,2} w_2 + x_{3,3} w_3 \end{bmatrix}$

Notations, Vectors, Matrices

Vector-Matrix Product

$\mathbf{x}^T \mathbf{W} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} w_{1,1} & w_{1,2} & w_{1,3} \\ w_{2,1} & w_{2,2} & w_{2,3} \\ w_{3,1} & w_{3,2} & w_{3,3} \end{bmatrix}$
$= \begin{bmatrix} (x_1 w_{1,1} + x_2 w_{2,1} + x_3 w_{3,1}) & (x_1 w_{1,2} + x_2 w_{2,2} + x_3 w_{3,2}) & (x_1 w_{1,3} + x_2 w_{2,3} + x_3 w_{3,3}) \end{bmatrix}$

Notations, Vectors, Matrices

Matrix-Matrix Product

$\mathbf{X}\mathbf{W} = \begin{bmatrix} x_{1,1} & \cdots & x_{1,d} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \cdots & x_{m,d} \end{bmatrix} \begin{bmatrix} w_{1,1} & \cdots & w_{1,h} \\ \vdots & \ddots & \vdots \\ w_{d,1} & \cdots & w_{d,h} \end{bmatrix}$

$= \begin{bmatrix} (x_{1,1} w_{1,1} + \cdots + x_{1,d} w_{d,1}) & \cdots & (x_{1,1} w_{1,h} + \cdots + x_{1,d} w_{d,h}) \\ \vdots & \ddots & \vdots \\ (x_{m,1} w_{1,1} + \cdots + x_{m,d} w_{d,1}) & \cdots & (x_{m,1} w_{1,h} + \cdots + x_{m,d} w_{d,h}) \end{bmatrix}$

$= \begin{bmatrix} \sum_{i=1}^{d} x_{1,i} w_{i,1} & \cdots & \sum_{i=1}^{d} x_{1,i} w_{i,h} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{d} x_{m,i} w_{i,1} & \cdots & \sum_{i=1}^{d} x_{m,i} w_{i,h} \end{bmatrix}$
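The entry-wise formula above can be checked against NumPy's `@` operator; a minimal sketch with illustrative 2×3 and 3×2 matrices:

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # m = 2, d = 3
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # d = 3, h = 2

P = X @ W                         # (m x d) @ (d x h) -> (m x h)

# Entry (i, j) is the sum over i' = 1..d of X[i, i'] * W[i', j].
p00 = sum(X[0, k] * W[k, 0] for k in range(3))   # matches P[0, 0]
```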

Systems of Linear Equations

Matrix inverse

Definition:
A d-by-d square matrix A is said to be invertible (also nonsingular)
if there exists a d-by-d square matrix B such that $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} = \mathbf{I}$
(identity matrix), given by

$\mathbf{I} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{bmatrix}$ of d-by-d dimension.

Ref: https://en.wikipedia.org/wiki/Invertible_matrix

Systems of Linear Equations

Matrix inverse computation

$\mathbf{A}^{-1} = \dfrac{1}{\det \mathbf{A}} \operatorname{adj}(\mathbf{A})$

• $\det \mathbf{A}$ is the determinant of $\mathbf{A}$.
• $\operatorname{adj}(\mathbf{A})$ is the adjugate or adjoint of $\mathbf{A}$, which is the transpose
of its cofactor matrix.
• The $(i, j)$ entry of the cofactor matrix $\mathbf{C}$ is the $(i, j)$-minor
times a sign factor $(-1)^{i+j}$.
• E.g., if $\mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then $\operatorname{adj}(\mathbf{A}) = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \mathbf{C}^T$.

Ref: https://en.wikipedia.org/wiki/Invertible_matrix
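The adjugate formula can be verified numerically in the 2×2 case; a minimal sketch with an illustrative invertible matrix:

```python
import numpy as np

# 2x2 inverse via A^{-1} = adj(A) / det(A), for A = [[a, b], [c, d]].
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
a, b = A[0]
c, d = A[1]

det_A = a * d - b * c             # ad - bc = 24 - 14 = 10
adj_A = np.array([[d, -b],
                  [-c, a]])       # transpose of the cofactor matrix

A_inv = adj_A / det_A             # agrees with np.linalg.inv(A)
```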

Systems of Linear Equations

Determinant computation

Example: 2x2 matrix

$\det \mathbf{A} = |\mathbf{A}| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$

Ref: https://en.wikipedia.org/wiki/Invertible_matrix

Systems of Linear Equations

Determinant computation
Example: 3x3 matrix

$\det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg)$

Ref: https://en.wikipedia.org/wiki/Determinant
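The first-row cofactor expansion can be cross-checked against `np.linalg.det`; a minimal sketch with an illustrative 3×3 matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
(a, b, c), (d, e, f), (g, h, i) = A

# Cofactor expansion along the first row:
# det A = a(ei - fh) - b(di - fg) + c(dh - eg)
det_manual = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
```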

Systems of Linear Equations

The minor $\mathbf{A}_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations

The minor $\mathbf{A}_{12} = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations

The minor $\mathbf{A}_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations

The minor $\mathbf{A}_{21} = \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations

The minor $\mathbf{A}_{22} = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix}$

Ref: https://en.wikipedia.org/wiki/Determinant

Systems of Linear Equations
Example

Find the cofactor matrix of $\mathbf{A}$ given that $\mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 1 & 0 & 6 \end{bmatrix}$.

Solution (writing $c_{ij}$ for the cofactors, to avoid clashing with the entries $a_{ij}$ of $\mathbf{A}$):

$c_{11} = \begin{vmatrix} 4 & 5 \\ 0 & 6 \end{vmatrix} = 24$, $\quad c_{12} = -\begin{vmatrix} 0 & 5 \\ 1 & 6 \end{vmatrix} = 5$, $\quad c_{13} = \begin{vmatrix} 0 & 4 \\ 1 & 0 \end{vmatrix} = -4$,
$c_{21} = -\begin{vmatrix} 2 & 3 \\ 0 & 6 \end{vmatrix} = -12$, $\quad c_{22} = \begin{vmatrix} 1 & 3 \\ 1 & 6 \end{vmatrix} = 3$, $\quad c_{23} = -\begin{vmatrix} 1 & 2 \\ 1 & 0 \end{vmatrix} = 2$,
$c_{31} = \begin{vmatrix} 2 & 3 \\ 4 & 5 \end{vmatrix} = -2$, $\quad c_{32} = -\begin{vmatrix} 1 & 3 \\ 0 & 5 \end{vmatrix} = -5$, $\quad c_{33} = \begin{vmatrix} 1 & 2 \\ 0 & 4 \end{vmatrix} = 4$.

The cofactor matrix is thus $\begin{bmatrix} 24 & 5 & -4 \\ -12 & 3 & 2 \\ -2 & -5 & 4 \end{bmatrix}$.

Ref: https://www.mathwords.com/c/cofactor_matrix.htm
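The worked example can be reproduced with a small generic routine (a sketch; `cofactor_matrix` is a helper written here, not a NumPy built-in):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 0.0, 6.0]])

def cofactor_matrix(A):
    """Entry (i, j) is the (i, j)-minor times the sign factor (-1)^(i+j)."""
    n = A.shape[0]
    C = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            # Delete row i and column j to form the minor's submatrix.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

C = cofactor_matrix(A)
# Matches the worked example: [[24, 5, -4], [-12, 3, 2], [-2, -5, 4]]
```

As a sanity check, the adjugate identity gives A · adj(A) = A · Cᵀ = det(A) · I.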

Systems of Linear Equations

Linear dependence and independence

• A collection of d-vectors $\mathbf{x}_1, \ldots, \mathbf{x}_m$ (with $m \ge 1$) is called
linearly dependent if
$\beta_1 \mathbf{x}_1 + \cdots + \beta_m \mathbf{x}_m = \mathbf{0}$
holds for some $\beta_1, \ldots, \beta_m$ that are not all zero.

• A collection of d-vectors $\mathbf{x}_1, \ldots, \mathbf{x}_m$ (with $m \ge 1$) is called
linearly independent if it is not linearly dependent, which
means that
$\beta_1 \mathbf{x}_1 + \cdots + \beta_m \mathbf{x}_m = \mathbf{0}$
only holds for $\beta_1 = \cdots = \beta_m = 0$.

Note: a vector of size/dimension/length d is called a d-vector;
$\beta_1, \ldots, \beta_m$ are scalars.

Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to
Applied Linear Algebra”, Cambridge University Press, 2018 (Chp5)
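In practice, linear independence is usually checked via the matrix rank: vectors stacked as columns are independent exactly when the matrix has full column rank. A minimal NumPy sketch with illustrative vectors:

```python
import numpy as np

x1 = np.array([2.0, 3.0])
x2 = np.array([-2.0, 5.0])
X = np.column_stack([x1, x2])
rank_independent = np.linalg.matrix_rank(X)    # 2 -> independent

x3 = np.array([4.0, 6.0])                      # x3 = 2 * x1 (dependent on x1)
X2 = np.column_stack([x1, x3])
rank_dependent = np.linalg.matrix_rank(X2)     # 1 -> dependent
```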

Systems of Linear Equations

Geometry of dependency and independency

[Figure: left panel shows two 2-vectors $\mathbf{x}_1$ and $\mathbf{x}_2$ that are linearly dependent
($\beta_1 \mathbf{x}_1 + \beta_2 \mathbf{x}_2 = \mathbf{0}$ for some nonzero $\beta_1, \beta_2$); right panel shows three 3-vectors
$\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ that are linearly independent ($\beta_1 \mathbf{x}_1 + \beta_2 \mathbf{x}_2 \ne \beta_3 \mathbf{x}_3$).]

Systems of Linear Equations

• Consider a system of m linear equations in d unknowns
$w_1, \ldots, w_d$:

$x_{1,1} w_1 + x_{1,2} w_2 + \cdots + x_{1,d} w_d = y_1$
$x_{2,1} w_1 + x_{2,2} w_2 + \cdots + x_{2,d} w_d = y_2$
$\vdots$
$x_{m,1} w_1 + x_{m,2} w_2 + \cdots + x_{m,d} w_d = y_m.$

Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to


Applied Linear Algebra”, Cambridge University Press, 2018 (Chp8.3)

Systems of Linear Equations
These equations can be written compactly in matrix-
vector notation:

$\mathbf{X}\mathbf{w} = \mathbf{y}$

where

$\mathbf{X} = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & \cdots & x_{m,d} \end{bmatrix}, \quad \mathbf{w} = \begin{bmatrix} w_1 \\ \vdots \\ w_d \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}.$

Note: $\mathbf{X}$ is of m-by-d dimension (denoted as $\mathbf{X} \in \mathcal{R}^{m \times d}$ for real matrix $\mathbf{X}$).

Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to


Applied Linear Algebra”, Cambridge University Press, 2018 (Chp8.3)

Systems of Linear Equations

(i) Square or even-determined system: $m = d$ in $\mathbf{X}\mathbf{w} = \mathbf{y}$,
$\mathbf{X} \in \mathcal{R}^{m \times d}$, $\mathbf{w} \in \mathcal{R}^{d \times 1}$, $\mathbf{y} \in \mathcal{R}^{m \times 1}$.
(equal number of equations and unknowns, i.e., $\mathbf{X} \in \mathcal{R}^{d \times d}$)

If $\mathbf{X}$ is invertible (i.e., $\mathbf{X}^{-1}\mathbf{X} = \mathbf{I}$), then pre-multiplying both sides by $\mathbf{X}^{-1}$
gives
$\mathbf{X}^{-1}\mathbf{X}\mathbf{w} = \mathbf{X}^{-1}\mathbf{y}$
$\Rightarrow \hat{\mathbf{w}} = \mathbf{X}^{-1}\mathbf{y}.$
(Note: we use a hat on top of $\mathbf{w}$ to indicate that it is a specific point in the space of $\mathbf{w}$.)

If all rows or columns of $\mathbf{X}$ are linearly independent, then $\mathbf{X}$ is
invertible.

Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to


Applied Linear Algebra”, Cambridge University Press, 2018 (Chp11.2)

Systems of Linear Equations
Example 1: two unknowns and two equations
$w_1 + w_2 = 4$  (1)
$w_1 - 2w_2 = 1$  (2)

$\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 4 \\ 1 \end{bmatrix}}_{\mathbf{y}}$

$\hat{\mathbf{w}} = \mathbf{X}^{-1}\mathbf{y} = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}^{-1} \begin{bmatrix} 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$

Here, the rows or columns of $\mathbf{X}$ are linearly independent, hence $\mathbf{X}$ is invertible.
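Example 1 can be reproduced in a few lines of NumPy:

```python
import numpy as np

# Example 1: square (even-determined) system Xw = y.
X = np.array([[1.0, 1.0],
              [1.0, -2.0]])
y = np.array([4.0, 1.0])

w_hat = np.linalg.inv(X) @ y      # w_hat = X^{-1} y = [3, 1]
# np.linalg.solve(X, y) is the numerically preferred equivalent.
```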

Systems of Linear Equations
(ii) Over-determined system: $m > d$ in $\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathcal{R}^{m \times d}$,
$\mathbf{w} \in \mathcal{R}^{d \times 1}$, $\mathbf{y} \in \mathcal{R}^{m \times 1}$.
(i.e., there are more equations than unknowns)
This set of linear equations generally has NO exact solution ($\mathbf{X}$ is non-square and
hence not invertible). However, an approximate solution is still available.

If the left-inverse of $\mathbf{X}$ exists such that $\mathbf{X}^{\dagger}\mathbf{X} = \mathbf{I}$, then pre-multiplying both
sides by $\mathbf{X}^{\dagger}$ results in
$\mathbf{X}^{\dagger}\mathbf{X}\mathbf{w} = \mathbf{X}^{\dagger}\mathbf{y}$
$\Rightarrow \hat{\mathbf{w}} = \mathbf{X}^{\dagger}\mathbf{y}.$

Definition: a matrix B that satisfies $\mathbf{B}\mathbf{A} = \mathbf{I}$ (identity matrix) is called a
left-inverse of $\mathbf{A}$. (Note: A is m-by-d and B is d-by-m.)
Note: The left-inverse can be computed as $\mathbf{X}^{\dagger} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$ given that $\mathbf{X}^T\mathbf{X}$
is invertible. Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to
Applied Linear Algebra”, Cambridge University Press, 2018 (Chp11.1-11.2, 11.5)

Systems of Linear Equations
Example 2: two unknowns and three equations
$w_1 + w_2 = 1$  (1)
$w_1 - w_2 = 0$  (2)
$w_1 = 2$  (3)

$\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 0 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}}_{\mathbf{y}}$

This set of linear equations has NO exact solution.

$\hat{\mathbf{w}} = \mathbf{X}^{\dagger}\mathbf{y} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$  (here $\mathbf{X}^T\mathbf{X}$ is invertible)
$= \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}^{-1} \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0.5 \end{bmatrix}$  (approximation).
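Example 2 can be reproduced with the left-inverse formula (in practice `np.linalg.lstsq` computes the same least-squares solution more stably):

```python
import numpy as np

# Example 2: over-determined system (3 equations, 2 unknowns).
X = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [1.0, 0.0]])
y = np.array([1.0, 0.0, 2.0])

# Left-inverse (least-squares) solution: w_hat = (X^T X)^{-1} X^T y
w_hat = np.linalg.inv(X.T @ X) @ X.T @ y      # [1.0, 0.5]
```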

Systems of Linear Equations
(iii) Under-determined system: $m < d$ in $\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathcal{R}^{m \times d}$,
$\mathbf{w} \in \mathcal{R}^{d \times 1}$, $\mathbf{y} \in \mathcal{R}^{m \times 1}$.
(i.e., there are more unknowns than equations => infinite number of
solutions)

If the right-inverse of $\mathbf{X}$ exists such that $\mathbf{X}\mathbf{X}^{\dagger} = \mathbf{I}$, then the d-vector
$\mathbf{w} = \mathbf{X}^{\dagger}\mathbf{y}$ (one of the infinitely many solutions) satisfies the equation $\mathbf{X}\mathbf{w} = \mathbf{y}$, i.e.,
$\mathbf{X}\mathbf{X}^{\dagger}\mathbf{y} = \mathbf{y}$
$\Rightarrow \mathbf{y} = \mathbf{y}.$

Definition: a matrix B that satisfies $\mathbf{A}\mathbf{B} = \mathbf{I}$ (identity matrix) is called a
right-inverse of $\mathbf{A}$. (Note: A is m-by-d and B is d-by-m.)

Note: The right-inverse can be computed as $\mathbf{X}^{\dagger} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}$ given that $\mathbf{X}\mathbf{X}^T$
is invertible. Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to
Applied Linear Algebra”, Cambridge University Press, 2018 (Chp11.1-11.2, 11.5)

Systems of Linear Equations

Derivation:
Under-determined system: $m < d$ in $\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathcal{R}^{m \times d}$
(i.e., there are more unknowns than equations => infinite number of
solutions => but a unique solution is still possible by constraining the
search using $\mathbf{w} = \mathbf{X}^T\mathbf{a}$!)

If $\mathbf{X}\mathbf{X}^T$ is invertible, let $\mathbf{w} = \mathbf{X}^T\mathbf{a}$; then
$\mathbf{X}\mathbf{X}^T\mathbf{a} = \mathbf{y}$
$\Rightarrow \hat{\mathbf{a}} = (\mathbf{X}\mathbf{X}^T)^{-1}\mathbf{y}$
$\Rightarrow \hat{\mathbf{w}} = \mathbf{X}^T\hat{\mathbf{a}} = \underbrace{\mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}}_{\mathbf{X}^{\dagger},\ \text{right-inverse}} \mathbf{y}.$

Systems of Linear Equations

Example 3: three unknowns and two equations
$w_1 + 2w_2 + 3w_3 = 2$  (1)
$w_1 - 2w_2 + 3w_3 = 1$  (2)

$\underbrace{\begin{bmatrix} 1 & 2 & 3 \\ 1 & -2 & 3 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 2 \\ 1 \end{bmatrix}}_{\mathbf{y}}$

This set of linear equations has infinitely many solutions along the
intersection line.

$\hat{\mathbf{w}} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}\mathbf{y}$  (here $\mathbf{X}\mathbf{X}^T$ is invertible)
$= \begin{bmatrix} 1 & 1 \\ 2 & -2 \\ 3 & 3 \end{bmatrix} \begin{bmatrix} 14 & 6 \\ 6 & 14 \end{bmatrix}^{-1} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.15 \\ 0.25 \\ 0.45 \end{bmatrix}$  (constrained solution).
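Example 3 can be reproduced with the right-inverse formula:

```python
import numpy as np

# Example 3: under-determined system (2 equations, 3 unknowns).
X = np.array([[1.0, 2.0, 3.0],
              [1.0, -2.0, 3.0]])
y = np.array([2.0, 1.0])

# Right-inverse (constrained) solution: w_hat = X^T (X X^T)^{-1} y
w_hat = X.T @ np.linalg.inv(X @ X.T) @ y      # [0.15, 0.25, 0.45]
# It satisfies the system exactly: X @ w_hat == y.
```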

Systems of Linear Equations

Example 4: three unknowns and two equations
$w_1 + 2w_2 + 3w_3 = 2$  (1)
$3w_1 + 6w_2 + 9w_3 = 1$  (2)

$\underbrace{\begin{bmatrix} 1 & 2 & 3 \\ 3 & 6 & 9 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 2 \\ 1 \end{bmatrix}}_{\mathbf{y}}$

Here both $\mathbf{X}\mathbf{X}^T$ and $\mathbf{X}^T\mathbf{X}$ are NOT invertible: the rows of $\mathbf{X}$
are linearly dependent (row 2 is 3 times row 1), yet $1 \ne 3 \times 2$,
so the two equations are inconsistent.

There is NO solution for the system.
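The failure in Example 4 shows up numerically as a rank deficiency; a minimal NumPy sketch:

```python
import numpy as np

# Example 4: the rows of X are linearly dependent (row 2 = 3 * row 1),
# so X X^T (and X^T X) is singular and no left/right inverse exists.
X = np.array([[1.0, 2.0, 3.0],
              [3.0, 6.0, 9.0]])
y = np.array([2.0, 1.0])

rank_X = np.linalg.matrix_rank(X)           # 1 < m = 2
det_XXT = np.linalg.det(X @ X.T)            # ~0: X X^T is singular

# The augmented matrix [X | y] has rank 2 > rank(X), so the
# system is inconsistent: no w satisfies Xw = y exactly.
rank_aug = np.linalg.matrix_rank(np.column_stack([X, y]))
```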

The End

