
Cost function
For the parameter vector $\theta \in \mathbb{R}^{n+1}$ (equivalently $\mathbb{R}^{(n+1)\times 1}$), the cost function is:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

The vectorized version is:

$$J(\theta) = \frac{1}{2m} (X\theta - \vec{y})^T (X\theta - \vec{y})$$

where $\vec{y}$ denotes the vector of all $y$ values.
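
As a concrete illustration, here is a minimal NumPy sketch of the vectorized form; the names `X`, `y`, and `theta` are illustrative, with `X` assumed to be the $m \times (n+1)$ design matrix whose first column is all ones:

```python
import numpy as np

def compute_cost(X, y, theta):
    """Vectorized cost: J(theta) = 1/(2m) * (X theta - y)^T (X theta - y)."""
    m = len(y)
    residuals = X @ theta - y          # h_theta(x^(i)) - y^(i) for every example at once
    return (residuals @ residuals) / (2 * m)

# Tiny check: theta = [1, 2] fits these points exactly, so the cost is 0.
X = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])  # leading column of ones for theta_0
y = np.array([5.0, 7.0, 9.0])
print(compute_cost(X, y, np.array([1.0, 2.0])))     # 0.0
```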

Gradient Descent for Multiple Variables


The gradient descent equation itself has the same general form; we just have to repeat it for our 'n' features:

repeat until convergence: {

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_0^{(i)}$$
$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_1^{(i)}$$
$$\theta_2 := \theta_2 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_2^{(i)}$$
$$\vdots$$

}

In other words:

repeat until convergence: {

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_j^{(i)} \qquad \text{for } j := 0 \ldots n$$

}
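
Since the per-feature updates differ only in the index $j$, they collapse into a single matrix expression, $\theta := \theta - \frac{\alpha}{m} X^T (X\theta - \vec{y})$, which updates all $\theta_j$ simultaneously. Below is a minimal NumPy sketch of that update, assuming a fixed iteration count stands in for "repeat until convergence"; `alpha` and `num_iters` are illustrative parameters, not names from the course:

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters):
    """Apply theta_j := theta_j - alpha/m * sum_i (h(x^(i)) - y^(i)) * x_j^(i)
    for all j = 0..n at once, via theta := theta - (alpha/m) * X^T (X theta - y)."""
    m = len(y)
    for _ in range(num_iters):
        residuals = X @ theta - y                        # prediction errors over all m examples
        theta = theta - (alpha / m) * (X.T @ residuals)  # simultaneous update of every theta_j
    return theta

# Example: recover theta = [1, 2] from the same toy data as above.
X = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([5.0, 7.0, 9.0])
theta = gradient_descent(X, y, np.zeros(2), alpha=0.05, num_iters=5000)
print(theta)  # approaches [1.0, 2.0]
```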
