Our formula for a single parameter θ1 is, repeated until convergence:

    θ1 := θ1 − α (d/dθ1) J(θ1)

Regardless of the sign of the slope d/dθ1 J(θ1), θ1 eventually converges to its minimum value.
The following graph shows that when the slope is negative, the value of θ1 increases and
when it is positive, the value of θ1 decreases.
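A minimal sketch of this update rule in Python may make the behavior concrete. The cost function J(θ1) = (θ1 − 3)², the learning rate, and the starting points are illustrative assumptions, not values from the lesson:

    # Sketch of one-parameter gradient descent on an assumed toy cost.

    def J(theta1):
        return (theta1 - 3) ** 2          # convex cost with its minimum at theta1 = 3

    def dJ(theta1):
        return 2 * (theta1 - 3)           # the slope d/dtheta1 J(theta1)

    def gradient_descent(theta1, alpha=0.1, iterations=50):
        for _ in range(iterations):
            theta1 = theta1 - alpha * dJ(theta1)   # theta1 := theta1 - alpha * (d/dtheta1) J(theta1)
        return theta1

    # Starting left of the minimum: the slope is negative, so theta1 increases.
    print(gradient_descent(theta1=0.0))   # approaches 3
    # Starting right of the minimum: the slope is positive, so theta1 decreases.
    print(gradient_descent(theta1=6.0))   # approaches 3

Either starting point converges to the same minimum, matching the behavior described above.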
On a side note, we should adjust our parameter α to ensure that the gradient descent algorithm converges in a reasonable time. Failure to converge, or taking too long to reach the minimum value, implies that our step size is wrong.
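As a sketch of this trade-off, the following snippet reuses the assumed toy cost J(θ1) = (θ1 − 3)² and compares a step size that is too small, one that is reasonable, and one that is large enough to diverge; the specific α values are illustrative:

    # Effect of the step size alpha on convergence, under the assumed toy cost.

    def dJ(theta1):
        return 2 * (theta1 - 3)           # slope of J(theta1) = (theta1 - 3)**2

    def run(alpha, steps=25, theta1=0.0):
        for _ in range(steps):
            theta1 -= alpha * dJ(theta1)
        return theta1

    print(run(alpha=0.01))   # too small: still far from 3 after 25 steps
    print(run(alpha=0.1))    # reasonable: very close to 3
    print(run(alpha=1.1))    # too large: each step overshoots, and theta1 diverges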
The intuition behind the convergence is that d/dθ1 J(θ1) approaches 0 as we approach the bottom of our convex function. At the minimum, the derivative will always be 0, and thus we get:
    θ1 := θ1 − α ∗ 0 = θ1

so once θ1 reaches the minimum, the update leaves it unchanged.
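A quick sanity check of this fixed point, again under the assumed toy cost from the earlier sketches:

    # At the minimum of the assumed cost J(theta1) = (theta1 - 3)**2,
    # the derivative is 0, so the update is a no-op.

    alpha = 0.1
    theta1 = 3.0                      # the minimum of the toy cost
    slope = 2 * (theta1 - 3)          # d/dtheta1 J(theta1) = 0 here
    theta1 = theta1 - alpha * slope   # theta1 := theta1 - alpha * 0
    print(theta1)                     # still 3.0: gradient descent has converged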