Mlba Slides 5 Linear-Independence-note

Machine Learning
for Business Analytics

Prof. Sunny S. Yang
5. Linear independence
Outline
Linear independence
Basis
Orthonormal vectors
Gram–Schmidt algorithm
Linear independence 3
Linear dependence
• set of n-vectors {a1, . . . , ak} (with k≥ 1) is linearly dependent if
β1a1 + . . . + βk ak = 0 Iinearcomhination : 0
holds for some β1, . . . , βk , that are not all zero 線性獨立

• equivalent to: at least one ai is a linear combination of the others
• we say ‘a1, . . . , ak are linearly dependent’
• {a1} is linearly dependent only if a1 = 0

• {a1 , a 2} is linearly dependent only if one ai is a multiple of the other
如果以的線性組合中有 a. , 則 a.az 兩个叫是線性獨立
"
,
• for more than two vectors, there is no simple to state condition
Linear independence—Linear independence 4

Example
• the vectors
0.2 −0.1 0
[8.6] [ −1 ] [ 2.2]
a1 = −7 , a2 = 2 , a 3 = −1
are linearly dependent, since a1 + 2a 2 − 3a 3 = 0

• can express any of them as linear combination of the other two, e.g.,
a 2 = (−1/2)a1 + (3/2)a 3
每⼀
个 vector 都可以⽤其他兩个 veūor 組合
Linear independence
• set of n-vectors {a1, . . . , ak}(with k ≥ 1) is linearly independent if it is not

linearly dependent, i.e. ,
β1a1 + . . . + βk ak = 0
holds only when β1 = . . . = βk = 0 不泉性相依
• we say ‘a1, . . . , ak are linearly independent’
• equivalent to: no ai is a linear combination of the others
• example: the unit n-vectors e1, . . . , en are linearly independent

Linear combinations of
linearly independent vectors
• suppose x is linear combination of linearly independent vectors a1, . . . , ak:

x = β1a1 + . . . + βk ak
• the coefficients β1, . . . , βk are unique , i.e. , if
x = γ1a1 + . . . + γk ak
then βi = γi for i = 1,...,k
• this means that (in principle) we can deduce the coefficients from x
• to see why, note that
(β1 − γ1)a1 + . . . + (βk − γk )ak = 0
β1 − γ1 = . . . = βk − γk = 0
and so (by linear independence)
Outline
Linear independence
Basis
Orthonormal vectors
Independence-dimension inequality
kgelement , 每 gelement 有 ngvector

• a linearly independent set of n-vectors can have at most n elements
• put another way: any set of n + 1 or more n-vectors is linearly dependent
Linear independence—Basis 9
Basis
• a set of n linearly independent n-vectors a1, . . . , an is called a basis

• any n-vector b can be expressed as a linear combination of them:
b = β1a1 + . . . + βn an
for some β1, . . . , βn
• and these coefficients are unique
• formula above is called expansion of b in the a1, . . . , an basis
• example: e1, . . . , en is a basis, expansion of b is
b = β1e1 + . . . + βn en
Linear independence—Basis 10
Outline
Linear independence
Basis
Orthonormal vectors
Orthonormal vectors
• set of n-vectors a1, . . . , ak are (mutually) orthogonal if ai ⊥ aj for i ≠ j

• they are normalized if ∥ai∥ = 1 for i = 1,...,k 單位向量 + 互相垂直
• they are orthonormal if both hold
• can be expressed using inner products as
{0 i ≠ j
1 i =j ⾃⼰ ✗ ⾃⼰
aiT aj =
政
• orthonormal sets of vectors are linearly independent

• by independence-dimension inequality, must have k ≤ n
• when k = n, a1, . . . , an are an orthonormal basis
Linear independence—Orthonormal vectors 12

Examples of orthonormal bases
• standard unit n-vectors e1, . . . , en
• the 3-vectors
0 1 1
[−1] 2 [0] 2[0]
1 1
0 , 1 , −1 ,
• the 2-vectors shown below
Orthonormal expansion
• if a1, . . . , an is an orthonormal basis, we have for any n-vector x
x = (a1T x)a1 + . . . + (anT x)an
• called orthonormal expansion of x (in the orthonormal basis)

• to verify formula, take inner product of both sides with ai

Outline
Linear independence
Basis
Orthonormal vectors
Gram–Schmidt (orthogonalization) algorithm
⽔ • an algorithm to check if a1, . . . , ak are linearly independent
• we’ll see later it has many other uses
Linear independence—Gram-Schmidt algorithm 16

given n-vectors a1, . . . , ak

for i = 1,...,k
1. Orthogonalization: q̃ i = ai − (q1T ai )q1 − . . . − (qi−1
T
ai )qi−1
2. Test for linear dependence: if q̃ i = 0 , quit
3. Normalization: qi = q̃i /∥q̃i∥
• if G–S does not stop early (in step 2) , a1, . . . , ak are linearly independent
• if G–S stops early in iteration i = j then aj is a linear combination of

a1, . . . , aj−1 (so a1, . . . , ak are linearly dependent)
Example
unitvector
Gtep Ii

Analysis
let’s show by induction that q1, . . . , qi are orthonormal
• assume it’s true for i − 1

• orthogonalization step ensures that
q̃i ⊥ q1, . . . , q̃i ⊥ qi−1

• to see this, take inner product of both sides with qj , j < i
qjT q̃i = qjT ai − (q1T ai )(qjT q1) − . . . − (qi−1

T
ai )(qjT qi−1)
= qjT ai − qjT ai = 0
• so qi ⊥ q1, . . . , qi ⊥ qi−1
• normalization step ensures that ∥qi∥ = 1
Analysis
assuming G–S has not terminated before iteration i
• ai is a linear combination of q1, . . . , qi :

ai = ∥q̃i∥qi + (q1T ai )q1 + . . . + (qi−1
T
ai )qi−1
• qi is a linear combination of a1, . . . , ai : by induction on i,
qi = (1/∥q̃i∥)(ai − (q1T ai )q1 − . . . − (qi−1
T
ai )qi−1)
and (by induction assumption) each q1, . . . , qi−1 is a linear combination of
a1, . . . , ai−1

Early termination
suppose G–S terminates in step j
• aj is linear combination of q1, . . . , qj−1

aj = (q1T aj )q1 + . . . + (qj−1
T
aj )qj−1
• and each of q1, . . . , qj−1 is linear combination of a a1, . . . , aj−1
• so aj is a linear combination of a1, . . . , aj−1
Complexity of Gram–Schmidt algorithm
• step 1 of iteration i requires i − 1 inner products,

q1T ai, . . . , qi−1
T
ai
which costs (i − 1)(2n − 1) flops
• 2n(i − 1) flops to compute q̃i
• 3n flops to compute ∥q̃i∥ and qi
• total is
k
k (k − 1)
+ 3n k ≈ 2n k 2
∑
((4n − 1)(i − 1) + 3n) = (4n − 1)
i=1
2
k
using ∑
(i − 1) = k (k − 1) / 2
i=1

Mlba Slides 5 Linear-Independence-note

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Mlba Slides 5 Linear-Independence-note

Uploaded by

Copyright:

Available Formats

Machine Learning

for Business Analytics

• set of n-vectors {a1, . . . , ak} (with k≥ 1) is linearly dependent if

holds for some β1, . . . , βk , that are not all zero 線 性 獨立

• {a1} is linearly dependent only if a1 = 0

• for more than two vectors, there is no simple to state condition

Linear independence—Linear independence 4

are linearly dependent, since a1 + 2a 2 − 3a 3 = 0

Linear independence—Linear independence 5

• set of n-vectors {a1, . . . , ak}(with k ≥ 1) is linearly independent if it is not

• example: the unit n-vectors e1, . . . , en are linearly independent

Linear independence—Linear independence 6

• suppose x is linear combination of linearly independent vectors a1, . . . , ak:

• the coefficients β1, . . . , βk are unique , i.e. , if

then βi = γi for i = 1,...,k

Linear independence—Linear independence 7

kgelement , 每 gelement 有 ngvector

• a set of n linearly independent n-vectors a1, . . . , an is called a basis

• set of n-vectors a1, . . . , ak are (mutually) orthogonal if ai ⊥ aj for i ≠ j

• they are orthonormal if both hold

• can be expressed using inner products as

• orthonormal sets of vectors are linearly independent

• when k = n, a1, . . . , an are an orthonormal basis

Linear independence—Orthonormal vectors 12

• the 2-vectors shown below

Linear independence—Orthonormal vectors 13

• if a1, . . . , an is an orthonormal basis, we have for any n-vector x

x = (a1T x)a1 + . . . + (anT x)an

• called orthonormal expansion of x (in the orthonormal basis)

Linear independence—Orthonormal vectors 14

Gram–Schmidt (orthogonalization) algorithm

⽔ • an algorithm to check if a1, . . . , ak are linearly independent

• we’ll see later it has many other uses

Linear independence—Gram-Schmidt algorithm 16

given n-vectors a1, . . . , ak

• if G–S stops early in iteration i = j then aj is a linear combination of

Linear independence—Gram-Schmidt algorithm 17

Linear independence—Gram-Schmidt algorithm 18

let’s show by induction that q1, . . . , qi are orthonormal

• assume it’s true for i − 1

q̃i ⊥ q1, . . . , q̃i ⊥ qi−1

qjT q̃i = qjT ai − (q1T ai )(qjT q1) − . . . − (qi−1

Linear independence—Gram-Schmidt algorithm 19

assuming G–S has not terminated before iteration i

• ai is a linear combination of q1, . . . , qi :

Linear independence—Gram-Schmidt algorithm 20

suppose G–S terminates in step j

• aj is linear combination of q1, . . . , qj−1

• so aj is a linear combination of a1, . . . , aj−1

Linear independence—Gram-Schmidt algorithm 21

Complexity of Gram–Schmidt algorithm

• step 1 of iteration i requires i − 1 inner products,

Linear independence—Gram-Schmidt algorithm 22

You might also like

holds for some β1, . . . , βk , that are not all zero 線性獨立