Tut01 - Geometry Crash Course
[Figure: a model consuming feature vectors, shown as strings of bits, labelled "Model" and "Features"]
Vectors
A vector is an ordered list of numbers. The order of the entries matters:
(1.0, -1.5, 2.0, -2.5, 0.0) ≠ (2.0, 1.0, 0.0, -1.5, -2.5)
Since we have become such good friends, let me teach you a bit more about my mother tongue: a geometry crash course.
(Can time series models predict how long we need to wait till the next video gets uploaded? Just asking for a friend…)
Vectors
[Figure: vectors drawn on 2D (x, y) and 3D (x, y, z) coordinate axes]
To draw a vector, it does not matter in which order we move along the coordinate axes: moving along x first and then y reaches the same point as moving along y first.
𝐮 = (3.5, 4.0) and 𝐯 = (-1.5, 2.5) are 2D vectors; 𝐰 = (3.5, 2.5, 4.0) is a 3D vector.
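The slide's vectors can be represented as plain Python tuples — a minimal sketch, using no libraries; the variable names are just the slide's labels:

```python
# Vectors as ordered lists of numbers (tuples of floats).
u = (3.5, 4.0)        # a 2D vector
v = (-1.5, 2.5)       # another 2D vector
w = (3.5, 2.5, 4.0)   # a 3D vector

# Two vectors are equal only if they agree entry by entry, in order.
print((1.0, -1.5, 2.0, -2.5, 0.0) == (2.0, 1.0, 0.0, -1.5, -2.5))  # False
print(len(w))  # 3 -> w lives in 3 dimensions
```

Tuples keep the entries in a fixed order, which is exactly the property the definition demands.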
How to stretch, shrink and flip vectors
Scalar multiplication: multiply every coordinate by the scalar.
𝐮 = (1.5, 2.0)
2 ⋅ 𝐮 = (3.0, 4.0)
0.5 ⋅ 𝐮 = (0.75, 1.0)
−1.5 ⋅ 𝐮 = (-2.25, -3.0)
−0.75 ⋅ 𝐮 = (-1.125, -1.5)
The sign of the scalar decides if the vector will get flipped or not. A magnitude of less than one will shrink the vector and more than one will stretch the vector.
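The coordinate-wise rule above translates directly into code — a minimal sketch in plain Python; the helper name `scale` is my own, not from the slides:

```python
def scale(c, u):
    """Scalar multiplication: multiply every coordinate of u by the scalar c."""
    return tuple(c * x for x in u)

u = (1.5, 2.0)
print(scale(2.0, u))    # (3.0, 4.0)    stretched
print(scale(0.5, u))    # (0.75, 1.0)   shrunk
print(scale(-1.5, u))   # (-2.25, -3.0) flipped and stretched
```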
How to add vectors
Vector addition/subtraction: add/subtract coordinate-wise. Geometrically: complete the parallelogram.
𝐮 = (1.5, 2.0)
𝐯 = (2.0, -2.5)
𝐮 + 𝐯 = (3.5, -0.5)
−0.5 ⋅ 𝐮 + 0.5 ⋅ 𝐯 = (0.25, -2.25)
The coordinate-wise rule remains the same even if adding/subtracting more than 2 vectors in more than 2 dimensions.
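Coordinate-wise addition and linear combinations can be sketched the same way — plain Python, helper names (`add`, `combine`) my own:

```python
def add(u, v):
    """Coordinate-wise sum; works in any number of dimensions."""
    return tuple(a + b for a, b in zip(u, v))

def combine(c1, u, c2, v):
    """Linear combination c1*u + c2*v, also coordinate-wise."""
    return tuple(c1 * a + c2 * b for a, b in zip(u, v))

u, v = (1.5, 2.0), (2.0, -2.5)
print(add(u, v))                  # (3.5, -0.5)
print(combine(-0.5, u, 0.5, v))   # (0.25, -2.25)
```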
How to measure the length of a vector
Euclidean length: ‖𝐮‖₂ = √(Σᵢ uᵢ²)
𝐮 = (-3.0, 4.0) ⇒ ‖𝐮‖₂ = √(9 + 16) = 5
𝐰 = (6.0, -2.0, -3.0) ⇒ ‖𝐰‖₂ = √(36 + 4 + 9) = 7
How to measure the length of a vector
Taxicab/Manhattan length: ‖𝐮‖₁ = Σᵢ |uᵢ|
𝐮 = (-3.0, 4.0) ⇒ ‖𝐮‖₁ = 3 + 4 = 7
𝐰 = (6.0, -2.0, -3.0) ⇒ ‖𝐰‖₁ = 6 + 2 + 3 = 11
These notions of length are also called norms. There is an entire family of so-called ℓₚ norms defined as ‖𝐮‖ₚ = (Σᵢ |uᵢ|ᵖ)^(1/p) for p ≥ 1.
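The whole ℓₚ family fits in one function — a minimal sketch, checked against the slide's two examples; the name `lp_norm` is my own:

```python
def lp_norm(u, p):
    """The l_p norm: (sum_i |u_i|^p)^(1/p), for p >= 1."""
    return sum(abs(x) ** p for x in u) ** (1.0 / p)

u = (-3.0, 4.0)
w = (6.0, -2.0, -3.0)
print(lp_norm(u, 2))  # 5.0  Euclidean length
print(lp_norm(u, 1))  # 7.0  taxicab/Manhattan length
print(lp_norm(w, 2))  # 7.0
print(lp_norm(w, 1))  # 11.0
```

Setting p = 2 recovers the Euclidean length and p = 1 the taxicab length.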
How to measure angles
Let us give you a simple proof of why dot products can be used to calculate angles. In particular, two vectors are perpendicular if their dot product is zero.
Dot products help us measure angles
Take 𝐮 = (a, b) and 𝐯 = (p, q), with lengths ‖𝐮‖₂ = √(a² + b²) and ‖𝐯‖₂ = √(p² + q²), and let θ be the angle between them.
Claim: 𝐮 ⋅ 𝐯 = ‖𝐮‖₂ ‖𝐯‖₂ cos θ
Proof: Let's rotate the vectors so that 𝐮 lies along the x-axis. This doesn't change the angle between them or the dot product (proof of the latter later). After the rotation, 𝐮 = (‖𝐮‖₂, 0) and 𝐯 = (‖𝐯‖₂ cos θ, ‖𝐯‖₂ sin θ) (cos = base/hyp). Then
𝐮 ⋅ 𝐯 = ‖𝐮‖₂ ⋅ ‖𝐯‖₂ cos θ + 0 ⋅ ‖𝐯‖₂ sin θ = ‖𝐮‖₂ ‖𝐯‖₂ cos θ
Clearly, we do have the claimed identity. In particular, θ = 90° gives cos θ = 0, so perpendicular vectors have dot product zero.
An application of norms in ML: anomaly or attack detection
Define the Euclidean ball with centre 𝐜 and radius r:
ℬ₂(𝐜, r) ≝ { 𝐮 : ‖𝐮 − 𝐜‖₂ ≤ r }
If normal data points live inside the ball, a data point 𝐝 that falls outside it, i.e., ‖𝐝 − 𝐜‖₂ > r, can be flagged as an anomaly.
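A minimal sketch of this ball-membership test in plain Python — the function name and the example centre/radius are my own, not from the slides:

```python
import math

def is_anomaly(point, centre, radius):
    """Flag a point as anomalous if it lies outside the ball B_2(centre, radius),
    i.e., if its Euclidean distance to the centre exceeds the radius."""
    return math.dist(point, centre) > radius

centre, radius = (0.0, 0.0), 5.0
print(is_anomaly((3.0, 4.0), centre, radius))  # False: distance 5 <= 5
print(is_anomaly((6.0, 8.0), centre, radius))  # True:  distance 10 > 5
```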
Linear models
A linear model scores a point 𝐱 as 𝐰⊤𝐱 + b, where 𝐰 is the weight vector and b is the bias. Changing b does not change the slope of the line; it just changes the intercept.
𝐰⊤𝐱 + b > 0 ⇒ 𝐱 is assigned to one class.
The set { 𝐱 : 𝐰⊤𝐱 + b > 0 } is called a halfspace. The other set { 𝐱 : 𝐰⊤𝐱 + b < 0 } is also a halfspace. Linear models solve binary classification using a model that divides the entire space into two halfspaces, one for each of the two classes. A line or hyperplane separates the two halfspaces.
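The two-halfspace rule can be sketched as a tiny classifier — plain Python; the function name, class labels, and example weights are my own illustrations:

```python
def classify(w, b, x):
    """Which halfspace does x fall in? The sign of w.x + b decides."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "positive" if score > 0 else "negative"

w, b = (1.0, -2.0), 0.5
print(classify(w, b, (3.0, 0.0)))  # w.x + b = 3.5 > 0  -> "positive"
print(classify(w, b, (0.0, 2.0)))  # w.x + b = -3.5 < 0 -> "negative"
```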
Linear models in higher dimensions
The same trick works in higher dimensions by learning a hyperplane classifier, where the hyperplane is the set 𝐰⊤𝐱 + b = 0. The hyperplane itself is often called the "decision boundary".
The vector 𝐰 is the normal or perpendicular vector of the hyperplane. Consider any two vectors 𝐱, 𝐲 on the hyperplane, i.e., 𝐰⊤𝐱 + b = 0 = 𝐰⊤𝐲 + b. Note that this means 𝐰⊤(𝐱 − 𝐲) = 0. The vector 𝐱 − 𝐲 is parallel to the hyperplane, and 𝐰 is perpendicular to all such vectors.
Changing 𝐰 rotates the hyperplane. Changing b shifts the hyperplane. Note that if we decrease b but keep 𝐰 the same, then fewer points may satisfy 𝐰⊤𝐱 + b > 0, i.e., decreasing the bias makes the model more picky about classifying points as green!
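The perpendicularity argument can be checked numerically — a hypothetical example of my own, with w = (2, 1) and b = -4, so the line is 2x + y = 4:

```python
w, b = (2.0, 1.0), -4.0
x = (0.0, 4.0)  # on the line: 2*0 + 1*4 - 4 = 0
y = (2.0, 0.0)  # on the line: 2*2 + 1*0 - 4 = 0

diff = tuple(a - c for a, c in zip(x, y))       # x - y, parallel to the line
w_dot_diff = sum(wi * di for wi, di in zip(w, diff))  # w^T (x - y)
print(w_dot_diff)  # 0.0 -> w is perpendicular to every such difference
```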
To b or not to b – that is the question
Sometimes, ML algos are simpler if we do not have a bias term.
However, having a bias term is often critical, so we cheat a bit and hide it.
Create another dimension in the feature vector and fill it with 1, i.e., 𝐱̃ = [𝐱, 1].
Note that features are now (d+1)-dimensional, so must be the model.
Learn a (d+1)-dimensional linear model 𝐰̃, but without a bias term.
Let the new model be 𝐱 ↦ 𝐰̃⊤𝐱̃.
If we denote 𝐰̃ = [𝐰, b], then 𝐰̃⊤𝐱̃ = 𝐰⊤𝐱 + b.
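This bias-hiding trick can be verified in a few lines — plain Python, with example weights of my own choosing:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def augment(x):
    """Append a constant 1 so the bias can hide inside the weight vector."""
    return tuple(x) + (1.0,)

w, b = (1.0, -2.0), 0.5
w_tilde = tuple(w) + (b,)  # (d+1)-dim model with the bias folded in

x = (3.0, 4.0)
print(dot(w, x) + b)             # -4.5  model with explicit bias
print(dot(w_tilde, augment(x)))  # -4.5  same score, no bias term
```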
Convex functions
A function f : ℝᵈ → ℝ is convex on a convex set 𝒞 if for all 𝐱, 𝐲 ∈ 𝒞 and all λ ∈ [0, 1],
f(λ 𝐱 + (1 − λ) 𝐲) ≤ λ f(𝐱) + (1 − λ) f(𝐲)
i.e., at any intermediate point 𝐳 = λ 𝐱 + (1 − λ) 𝐲, the function lies on or below the chord joining (𝐱, f(𝐱)) and (𝐲, f(𝐲)). A non-convex function violates this for some choice of 𝐱, 𝐲, λ.
[Figure: a convex function sitting below its chord vs. a non-convex function crossing above it]
Think of common functions that are convex.
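The defining inequality can be probed numerically for 1D functions — a sketch of my own, using f(x) = x² (convex) and sin(x) (non-convex):

```python
import math

def chord_above(f, x, y, lam):
    """True if f satisfies the convexity inequality at this x, y, lambda."""
    z = lam * x + (1 - lam) * y
    return f(z) <= lam * f(x) + (1 - lam) * f(y) + 1e-12  # tolerance for rounding

# x^2: the inequality holds for every lambda we try.
print(all(chord_above(lambda t: t * t, -3.0, 2.0, k / 10) for k in range(11)))  # True

# sin(x) around its peak: the chord dips below the curve, so sin is not convex.
print(chord_above(math.sin, math.pi / 2 - 1, math.pi / 2 + 1, 0.5))  # False
```

A single failing (x, y, λ) is enough to certify non-convexity; passing a finite sample, of course, only suggests convexity rather than proving it.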