Summary: Linear Regression with Multiple Variables.

Table of Contents

1. Multivariate Linear Regression

– 1a. Multiple Features (Variables)

– 1b. Gradient Descent for Multiple Variables

– 1d. Gradient Descent: Checking

– 1e. Gradient Descent: Learning Rate

– 1f. Features and Polynomial Regression

– 2a. Normal Equation

– 2b. Normal Equation Non-invertibility

I would like to give full credit to the respective authors, as these are my personal Python notebooks taken from deep learning courses from Andrew Ng, Data School and Udemy :) This is a simple Python notebook hosted generously through GitHub Pages that is on my main personal notes repository on https://github.com/ritchieng/ritchieng.github.io . They are meant for my personal review, but I have open-sourced my repository of personal notes as a lot of people found it useful.

Features: x1, x2, x3, x4 and more

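New hypothesis: h(x) = theta0 * x0 + theta1 * x1 + ... + thetan * xn, with x0 = 1

The hypothesis can be reduced to a single number, theta_transpose * x: the transposed theta vector multiplied by the x vector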
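As a small illustration of the vectorized hypothesis, here is a sketch using NumPy. The parameter and feature values are made up for the example, and x0 = 1 is assumed to be prepended to the feature vector so that theta0 acts as the intercept.

```python
import numpy as np

# Hypothetical parameter vector [theta0, theta1, theta2] (made-up values).
theta = np.array([1.0, 2.0, 3.0])

# One example with two features x1 = 4 and x2 = 5; x0 = 1 is prepended.
x = np.array([1.0, 4.0, 5.0])

# h(x) = theta_transpose * x, which collapses to a single number.
prediction = float(theta @ x)
print(prediction)
```

For 1-D arrays the `@` operator computes the inner product, so no explicit transpose is needed.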

Summary

New Algorithm
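A minimal sketch of gradient descent for multiple variables, assuming the standard batch update theta_j := theta_j - alpha * (1/m) * sum((h(x_i) - y_i) * x_ij) applied to all theta_j simultaneously. The function name `gradient_descent`, the data, and the default settings are illustrative choices, not taken from the course material.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Batch gradient descent for linear regression.

    X is m x (n + 1) with a leading column of ones; y has length m.
    """
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    for _ in range(iterations):
        errors = X @ theta - y              # h(x) - y for every example
        gradient = (X.T @ errors) / m       # one partial derivative per theta_j
        theta = theta - alpha * gradient    # simultaneous update of all theta_j
    return theta

# Made-up data generated from y = 1 + 2 * x1 + 3 * x2.
X = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
y = np.array([1.0, 3.0, 4.0, 6.0])
theta = gradient_descent(X, y, alpha=0.1, iterations=5000)
print(theta)
```

Computing the whole gradient before updating theta is what makes the update simultaneous: every theta_j is changed using the same old values.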

Ensure features are on a similar scale

Gradient descent will take longer to reach the global minimum when the features are not on a similar scale

Mean normalization
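One possible way to apply mean normalization is sketched below. It assumes the common form (x - mean) / (max - min) per feature; dividing by the standard deviation instead is an equally common variant. The helper name `mean_normalize` and the house-style data are made up for illustration.

```python
import numpy as np

def mean_normalize(features):
    """Rescale each column to (x - mean) / (max - min)."""
    mu = features.mean(axis=0)
    spread = features.max(axis=0) - features.min(axis=0)
    return (features - mu) / spread, mu, spread

# Made-up features: house size in square feet and number of bedrooms.
features = np.array([[2000.0, 3.0],
                     [1000.0, 2.0],
                     [3000.0, 4.0]])
scaled, mu, spread = mean_normalize(features)
print(scaled)
```

The function returns `mu` and `spread` so that new examples can be scaled with the same values at prediction time. It should be applied to the feature columns only, not to the x0 = 1 column, whose range is zero.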

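Can use a graph: plot the cost J(theta) against the number of iterations; the cost should decrease after every iteration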
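A rough sketch of this check: record the cost after every iteration and confirm that it keeps going down. It assumes the squared-error cost J(theta) = (1 / 2m) * sum((h(x) - y)^2); the function names and data are hypothetical.

```python
import numpy as np

def cost(X, y, theta):
    """Squared-error cost J(theta) = (1 / 2m) * sum((h(x) - y)^2)."""
    errors = X @ theta - y
    return float(errors @ errors) / (2 * len(y))

def cost_history(X, y, alpha, iterations):
    """Run gradient descent and record J(theta) after every iteration."""
    theta = np.zeros(X.shape[1])
    history = []
    for _ in range(iterations):
        theta = theta - alpha * (X.T @ (X @ theta - y)) / len(y)
        history.append(cost(X, y, theta))
    return history

# Made-up data generated from y = 1 + 2 * x1.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
history = cost_history(X, y, alpha=0.1, iterations=200)

# With a suitable alpha the cost should not increase between iterations.
decreasing = all(b <= a for a, b in zip(history, history[1:]))
print(decreasing, history[0], history[-1])
```

To see the curve rather than a flag, `history` can be passed to a plotting library, for example `matplotlib.pyplot.plot(history)`.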

Alpha (Learning Rate) too small: slow convergence

Start with alpha = 0.001 and increase it roughly 3x each time (for example 0.001, 0.003, 0.01, ...) until you reach an acceptable alpha
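The search over learning rates could be sketched as follows. The candidates here are exact multiples of 3 starting from 0.001, and each one is judged by the cost after a fixed number of iterations. The helper `final_cost`, the iteration count, and the data are assumptions made for the example.

```python
import numpy as np

def final_cost(X, y, alpha, iterations=50):
    """Cost J(theta) after a fixed number of gradient descent steps."""
    theta = np.zeros(X.shape[1])
    for _ in range(iterations):
        theta = theta - alpha * (X.T @ (X @ theta - y)) / len(y)
    errors = X @ theta - y
    return float(errors @ errors) / (2 * len(y))

# Made-up data generated from y = 1 + 2 * x1.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

# Candidate learning rates: 0.001, 0.003, 0.009, ... (x3 each time).
alphas = [0.001 * 3 ** k for k in range(8)]
costs = {alpha: final_cost(X, y, alpha) for alpha in alphas}
for alpha, value in costs.items():
    print(f"alpha={alpha:.3f}  cost after 50 iterations={value:.6g}")
```

On this toy data the smallest candidates barely reduce the cost, while the largest one makes it blow up. An acceptable alpha lies between those two extremes.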

Ensure the features capture the pattern in the data, for example by adding polynomial terms such as x^2 or x^3 of an existing feature
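As an illustration of polynomial regression, the sketch below adds an x^2 column to the design matrix so that a linear model can follow a quadratic pattern. The data is made up, and `np.linalg.lstsq` is used here only as a convenient way to fit theta; it stands in for whichever fitting method is preferred.

```python
import numpy as np

# Made-up data following a quadratic pattern: y = 2 + 0.5 * x^2.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 0.5 * x ** 2

# Design matrix with columns x0 = 1, x, x^2.
X_poly = np.column_stack([np.ones_like(x), x, x ** 2])

# Least-squares fit of theta on the polynomial features.
theta, *_ = np.linalg.lstsq(X_poly, y, rcond=None)
print(theta)
```

The x^2 column spans a much wider range than x (1 to 25 against 1 to 5 here), so feature scaling matters even more when polynomial terms are fitted with gradient descent.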

Method to solve for theta analytically

If theta is not

X: m x (n + 1)

n: number of features

X_transpose: (n + 1) x m

X_transpose * X: (n + 1) x m * m x (n + 1) = (n + 1) x (n + 1)

theta = (X_transpose * X)^-1 * X_transpose * y: (n + 1) x (n + 1) * (n + 1) x m * m x 1 = (n + 1) x 1
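The dimensions above can be checked with a short sketch of the normal equation, theta = (X_transpose * X)^-1 * X_transpose * y. The data is made up (m = 4 examples, n = 2 features), and `np.linalg.inv` is used to mirror the formula directly.

```python
import numpy as np

# Made-up data generated from y = 1 + 2 * x1 + 3 * x2.
X = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])              # m x (n + 1)
y = np.array([[1.0], [3.0], [4.0], [6.0]])   # m x 1

XtX = X.T @ X                                # (n + 1) x (n + 1)
theta = np.linalg.inv(XtX) @ X.T @ y         # (n + 1) x 1
print(XtX.shape, theta.shape)
print(theta.ravel())
```

In practice `np.linalg.solve(XtX, X.T @ y)` is usually preferred over forming the inverse explicitly, since it is numerically more stable. The result is the same on well-conditioned data.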

Gradient descent compared with the normal equation (table not recovered; surviving entries: alpha, iterations, (10,000), > 1000)

What happens if X_transpose * X is non-invertible (singular or degenerate)?

Intuition of non-invertibility

Causes of non-invertibility
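One well-known cause of non-invertibility is a redundant feature, meaning one that is a linear function of another. The sketch below builds such a case with made-up data (x2 = 2 * x1) and falls back to the pseudo-inverse via `np.linalg.pinv`. The explicit tolerances are assumptions chosen to keep the example robust to rounding.

```python
import numpy as np

# Made-up data where x2 = 2 * x1, so the two features are linearly dependent.
x1 = np.array([1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x1), x1, 2.0 * x1])
y = 1.0 + 3.0 * x1

XtX = X.T @ X

# Rank below n + 1 = 3 means X_transpose * X is singular.
rank = np.linalg.matrix_rank(XtX, tol=1e-8)

# The pseudo-inverse still returns a usable (minimum-norm) theta.
theta = np.linalg.pinv(XtX, rcond=1e-10) @ X.T @ y
predictions = X @ theta
print(rank, predictions)
```

Many different theta vectors fit this data equally well because the weight can be split arbitrarily between x1 and x2. The pseudo-inverse picks one of them, and the predictions are unaffected by which one is chosen.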

