01. Straight Line

i. Slope-intercept form: y = mx + c, where m = slope and c = intercept.
ii. Point-slope form: y − y₁ = m(x − x₁), where (x₁, y₁) is a point on the line.
iii. Two-point form: y − y₁ = ((y₂ − y₁) / (x₂ − x₁)) (x − x₁), where (x₁, y₁) and (x₂, y₂) are two points on the line.
iv. Intercept form: x/a + y/b = 1, where a and b are the x- and y-intercepts.
v. General form: ax + by + c = 0, where a, b, c are real numbers.

Distance of the line from the origin: d = |c| / √(a² + b²).
Distance of the line from the point (x₀, y₀): d = |ax₀ + by₀ + c| / √(a² + b²).

Parallel lines: m₁ = m₂. Perpendicular lines: m₁ · m₂ = −1.

Half spaces: the line ax + by + c = 0 divides the plane into a +ve half space (points where ax + by + c > 0) and a −ve half space (points where ax + by + c < 0).
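The half-space sign test and the point-to-line distance formula above can be checked numerically. A minimal sketch in plain Python; the line coefficients and test points are made up for illustration:

```python
import math

def side_of_line(a, b, c, x0, y0):
    """Sign of a*x0 + b*y0 + c: +1 in the +ve half space,
    -1 in the -ve half space, 0 if the point lies on the line."""
    v = a * x0 + b * y0 + c
    return (v > 0) - (v < 0)

def distance_from_point(a, b, c, x0, y0):
    """Distance from the point (x0, y0) to the line ax + by + c = 0."""
    return abs(a * x0 + b * y0 + c) / math.sqrt(a * a + b * b)

# Example line: x + y - 2 = 0  (a=1, b=1, c=-2)
print(side_of_line(1, 1, -2, 3, 3))         # 3+3-2 = 4 > 0 -> prints 1 (+ve half space)
print(distance_from_point(1, 1, -2, 0, 0))  # |-2|/sqrt(2) = sqrt(2)
```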
05. Equation of a Circle

(x − h)² + (y − k)² = r², where (h, k) = centre coordinates and r = radius.

Points inside the circle give −ve values when substituted in the circle equation (x − h)² + (y − k)² − r², and points outside the circle give +ve values.
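The inside/outside sign rule can be demonstrated directly; a small sketch with a made-up circle:

```python
def circle_value(h, k, r, x, y):
    """(x-h)^2 + (y-k)^2 - r^2: negative inside the circle,
    zero on the circle, positive outside."""
    return (x - h) ** 2 + (y - k) ** 2 - r ** 2

# Circle centred at (0, 0) with radius 5
print(circle_value(0, 0, 5, 1, 1))  # 1+1-25 = -23 -> inside
print(circle_value(0, 0, 5, 6, 0))  # 36-25  =  11 -> outside
print(circle_value(0, 0, 5, 3, 4))  # 9+16-25 =  0 -> on the circle
```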
03. Vectors

i. Dot product: a · b = Σ aᵢbᵢ = |a| |b| cos θ.
ii. Unit vector: û = u / |u|.
iii. Distance between two points x and y: d = √(Σ (xᵢ − yᵢ)²).
iv. Angle between two vectors: cos θ = (a · b) / (|a| |b|).
v. Projection of a on b: (a · b) / |b| (the length of the projection).

Rotation of axes: let's say we have a coordinate system x-y initially and a point P. If the axes are rotated by an angle θ in the anticlockwise direction (giving new axes x′-y′), P in the new system would be:
x′ = x cos θ + y sin θ
y′ = −x sin θ + y cos θ
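The vector operations and the axis-rotation formulas above can be sketched in plain Python (no external library assumed); the example vectors are made up:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    """Unit vector u / |u|."""
    n = norm(a)
    return [x / n for x in a]

def angle(a, b):
    """Angle between two vectors, in radians."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

def projection_length(a, b):
    """Length of the projection of a onto b: (a . b) / |b|."""
    return dot(a, b) / norm(b)

def rotate_axes(x, y, theta):
    """Coordinates of the point (x, y) in axes rotated anticlockwise by theta."""
    return (x * math.cos(theta) + y * math.sin(theta),
            -x * math.sin(theta) + y * math.cos(theta))

a, b = [3.0, 0.0], [1.0, 1.0]
print(angle(a, b))              # pi/4 ~ 0.7854
print(projection_length(a, b))  # 3/sqrt(2) ~ 2.1213
print(rotate_axes(1.0, 0.0, math.pi / 2))  # ~(0.0, -1.0)
```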
07. Differentiation

Derivative from first principles: f′(x) = lim (h→0) (f(x + h) − f(x)) / h.

f(x) is continuous at a point x = a if: lim (x→a) f(x) = f(a).

1. The first derivative gives the slope of the tangent line to the function at a point.
   +ve first derivative ⇒ the function is increasing.
   −ve first derivative ⇒ the function is decreasing.
2. The second derivative of a function represents its concavity.
   If the second derivative is +ve ⇒ concave upwards.
   If the second derivative is −ve ⇒ concave downwards.

Common derivatives: d/dx (xⁿ) = n xⁿ⁻¹, d/dx (eˣ) = eˣ, d/dx (ln x) = 1/x, d/dx (sin x) = cos x, d/dx (cos x) = −sin x.

Sum/Difference rule: (f ± g)′ = f′ ± g′.
Constant multiple rule: (c · f)′ = c · f′.
Chain rule: d/dx f(g(x)) = f′(g(x)) · g′(x).
Partial derivative: for f(x, y), ∂f/∂x is computed by treating y as a constant. The del operator ∇ collects the partial derivatives into the gradient: ∇f = (∂f/∂x₁, …, ∂f/∂xₙ).

Steps to find the optima:
1. Given a function f(x), calculate its derivative f′(x).
2. Put f′(x) = 0 to obtain the stationary points x = c.
3. Check the second derivative at each stationary point:
   i. If f″(c) > 0, f(x) has a minimum at x = c.
   ii. If f″(c) < 0, f(x) has a maximum at x = c.
   iii. If f″(c) = 0, then f(x) may or may not have a maxima or minima at x = c.

[Figure: a curve with local maxima, a local minimum, and the absolute minimum marked.]
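The optima-finding steps above can be sketched numerically. This uses a central-difference approximation rather than the exact first-principles limit, and the function f is a made-up example:

```python
def derivative(f, x, h=1e-6):
    """Numerical f'(x) via a central difference (approximates the limit)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Numerical f''(x) via a second-order difference."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

f = lambda x: (x - 3) ** 2 + 1   # example function; stationary point at x = 3

print(derivative(f, 3.0))         # ~0     -> x = 3 is a stationary point
print(second_derivative(f, 3.0))  # ~2 > 0 -> concave upwards, so a minimum
```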
Gradient Descent

An iterative algorithm to reach the optima of a function.

GD algorithm to optimize:
Step 1: Initially, pick the starting point (the initial weight θ) randomly.
Step 2: Compute the gradient ∇f at the current point.
Step 3: Update θ_new = θ_old − η ∇f(θ_old).
Step 4: Repeat step 3 until the values converge (θ_new ≈ θ_old).

Here, η ⇒ learning rate. If η is small, the algorithm converges slowly; if it is a large value, it may overshoot the minima.

[Figure: gradient descent on a cost curve — starting from an initial weight, the steps follow the gradient down to the global cost minimum.]
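The four steps above can be sketched as a short loop; the objective f(x) = (x − 3)² and the parameter values are made-up illustrations:

```python
def gradient_descent(df, theta0, eta=0.1, tol=1e-8, max_iter=10_000):
    """Minimise f by repeatedly stepping against its derivative df."""
    theta_old = theta0
    for _ in range(max_iter):
        theta_new = theta_old - eta * df(theta_old)   # Step 3: update
        if abs(theta_new - theta_old) < tol:          # Step 4: stop on convergence
            break
        theta_old = theta_new
    return theta_new

# f(x) = (x - 3)^2, so f'(x) = 2(x - 3); the minimum is at x = 3
print(gradient_descent(lambda x: 2 * (x - 3), theta0=0.0))  # ~3.0
```

With η = 0.1 the error shrinks by a factor of 0.8 per step; a much larger η would make the updates overshoot and diverge, as the note on the learning rate warns.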
Variants of Gradient Descent

Batch GD: computes the gradient using the entire data set at every step.
Stochastic gradient descent: calculates the partial derivative using only a few (random) data points from the data set at each step.
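A minimal sketch of the stochastic variant, fitting a one-parameter model y = w·x by sampling one random data point per update; the data set, learning rate, and iteration count are made up for illustration:

```python
import random

random.seed(0)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # generated with y = 2x, so the best w is 2

w = 0.0
eta = 0.05
for step in range(2000):
    i = random.randrange(len(xs))            # pick one random data point
    grad = 2 * (w * xs[i] - ys[i]) * xs[i]   # d/dw of (w*x - y)^2 at that point
    w -= eta * grad                          # stochastic update

print(w)   # converges to ~2.0
```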
09. Method of Lagrange Multipliers

A method for finding the local minima or local maxima of a function subject to constraints.

The optima of f subject to a constraint g = 0 can be found by putting ∇f − λ∇g equal to a null matrix of the same dimensions as ∇f, where λ is the Lagrange multiplier. The problem can be rewritten as:
L(x, λ) = f(x) − λ g(x), whose stationary points satisfy ∇L = 0.
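The Lagrange-multiplier condition ∇f = λ∇g can be verified on a toy problem: maximise f(x, y) = x·y subject to g(x, y) = x + y − 10 = 0. Here ∇f = (y, x) and ∇g = (1, 1), so the condition gives x = y = 5 with λ = 5. The problem and its solution are a standard illustration, not from the source:

```python
# Maximise f(x, y) = x*y subject to g(x, y) = x + y - 10 = 0.
# Lagrange condition: grad f = lmb * grad g  =>  (y, x) = (lmb, lmb)
# so x = y, and the constraint x + y = 10 gives x = y = 5, lmb = 5.

x, y, lmb = 5.0, 5.0, 5.0
grad_f = (y, x)
grad_g = (1.0, 1.0)
residual = tuple(gf - lmb * gg for gf, gg in zip(grad_f, grad_g))
print(residual)   # (0.0, 0.0) -> grad f - lmb * grad g is the null matrix

# Sanity check: along the constraint y = 10 - x, f is largest at x = 5
candidates = [(t, 10 - t, t * (10 - t)) for t in range(11)]
best = max(candidates, key=lambda c: c[2])
print(best)       # (5, 5, 25)
```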
10. Eigen Vector and Eigen Value

An eigen vector of a square matrix A is a vector v such that when this vector is multiplied with the matrix, we get a new vector in the same direction having a different magnitude:
A v = λ v
where v is an eigen vector of matrix A and λ is the corresponding eigen value.

There can be multiple eigen vectors; for a symmetric matrix they are orthogonal to each other. The eigen vector associated with the largest eigen value indicates the direction in which the data has the most variance.

Explained variance: consider a vector x in the space representing one of the points in our data and a unit vector u. The best u will be where the summation of the lengths of the projections of all such points xᵢ on the vector u is maximum; the variance captured along that u is the explained variance.
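The dominant eigen vector (the direction of most variance) can be found by power iteration: repeatedly multiply a vector by the matrix and normalise. A sketch in plain Python for a 2×2 symmetric matrix; the matrix (eigenvalues 3 and 1, dominant eigenvector along [1, 1]) is a made-up example:

```python
import math

def power_iteration(A, iters=100):
    """Dominant eigen vector/eigen value of a 2x2 matrix by power iteration."""
    v = [1.0, 0.0]
    for _ in range(iters):
        w = [A[0][0] * v[0] + A[0][1] * v[1],   # w = A v
             A[1][0] * v[0] + A[1][1] * v[1]]
        n = math.sqrt(w[0] ** 2 + w[1] ** 2)
        v = [w[0] / n, w[1] / n]                # normalise to a unit vector
    Av = [A[0][0] * v[0] + A[0][1] * v[1],
          A[1][0] * v[0] + A[1][1] * v[1]]
    lam = v[0] * Av[0] + v[1] * Av[1]           # Rayleigh quotient v^T A v
    return v, lam

A = [[2.0, 1.0], [1.0, 2.0]]   # symmetric; eigenvalues are 3 and 1
v, lam = power_iteration(A)
print(v, lam)   # ~[0.707, 0.707], ~3.0 -> direction of most variance
```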