
1- You need feature scaling if you are using Gradient Descent,
but not if you find the minimum analytically by solving dY/dX = 0
(the normal equation).
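A minimal sketch of note 1, assuming least-squares linear regression on made-up data (the names size, rooms, gradient_descent and cost are illustrative, not from any course code). It fits the same data with the normal equation and with Gradient Descent, once on raw features and once on standardized features, using the same iteration budget.

```python
import numpy as np

def gradient_descent(X, y, lr, iters):
    """Batch gradient descent for least-squares linear regression."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        theta -= lr * (X.T @ (X @ theta - y)) / m
    return theta

def cost(X, y, theta):
    """Half mean squared error."""
    r = X @ theta - y
    return r @ r / (2 * len(y))

# Synthetic data (an assumption): two features on very different scales.
rng = np.random.default_rng(0)
m = 200
size = rng.uniform(500, 3000, m)
rooms = rng.uniform(1, 5, m)
y = 0.1 * size + 20 * rooms + 50 + rng.normal(0, 1, m)

raw = np.column_stack([size, rooms])
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0)  # standardize each column
X_raw = np.column_stack([np.ones(m), raw])
X_scaled = np.column_stack([np.ones(m), scaled])

# Normal equation: solve (X^T X) theta = X^T y, with and without scaling.
theta_ne_raw = np.linalg.solve(X_raw.T @ X_raw, X_raw.T @ y)
theta_ne_scaled = np.linalg.solve(X_scaled.T @ X_scaled, X_scaled.T @ y)
cost_ne_raw = cost(X_raw, y, theta_ne_raw)
cost_ne_scaled = cost(X_scaled, y, theta_ne_scaled)

# Gradient descent, 500 iterations each. The raw features need a tiny
# learning rate to stay stable, so most parameters barely move.
cost_gd_scaled = cost(X_scaled, y, gradient_descent(X_scaled, y, lr=0.1, iters=500))
cost_gd_raw = cost(X_raw, y, gradient_descent(X_raw, y, lr=1e-7, iters=500))

print("normal equation :", cost_ne_raw, cost_ne_scaled)
print("gradient descent:", cost_gd_raw, cost_gd_scaled)
```

In this sketch the normal equation should reach the same cost either way, while Gradient Descent on the raw features is left far from the minimum after the same number of steps.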

2- If the cost function increases with more iterations of Gradient
Descent, then the learning rate may be too high for the model to
converge to the minimum.
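A small illustration of note 2, again assuming least-squares linear regression on synthetic data (cost_history and both learning rates are choices made for this sketch). It records the cost at every iteration for a moderate learning rate and for one that is too large.

```python
import numpy as np

def cost_history(X, y, lr, iters):
    """Run batch gradient descent and return the cost before each update."""
    m, n = X.shape
    theta = np.zeros(n)
    history = []
    for _ in range(iters):
        residual = X @ theta - y
        history.append(residual @ residual / (2 * m))
        theta -= lr * (X.T @ residual) / m
    return history

# Synthetic one-feature problem (an assumption for the sketch).
rng = np.random.default_rng(1)
x = rng.normal(0, 1, 100)
X = np.column_stack([np.ones(100), x])
y = 3.0 + 2.0 * x + rng.normal(0, 0.1, 100)

good = cost_history(X, y, lr=0.1, iters=20)  # cost shrinks every iteration
bad = cost_history(X, y, lr=3.0, iters=20)   # steps overshoot, cost blows up

print("lr=0.1:", good[0], "->", good[-1])
print("lr=3.0:", bad[0], "->", bad[-1])
```

Plotting the cost against the iteration number is the usual way to spot this: a curve that climbs is a hint to try a smaller learning rate.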

3- Time complexity of inverting an N x N matrix is O(N^3)

4- Common causes of Transpose(X) * X being non-invertible:

a- Redundant features (one feature is linearly dependent on
some other features)
b- Too many features (m <= n, i.e. no more training examples m
than features n)
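Both causes in note 4 can be checked numerically. The sketch below uses two tiny hand-made design matrices (illustrative values only) and NumPy's matrix_rank and pinv; the explicit tolerances are choices made here so the rank test does not depend on rounding noise.

```python
import numpy as np

# a- Redundant feature: the third column is exactly 2 * the second column.
X_redundant = np.array([[1.0, 1.0, 2.0],
                        [1.0, 2.0, 4.0],
                        [1.0, 3.0, 6.0],
                        [1.0, 4.0, 8.0]])
XtX_redundant = X_redundant.T @ X_redundant
rank_redundant = np.linalg.matrix_rank(XtX_redundant, tol=1e-8)

# b- Too many features: m = 2 examples, n = 3 columns (intercept included).
X_wide = np.array([[1.0, 2.0, 3.0],
                   [1.0, 5.0, 1.0]])
y = np.array([1.0, 2.0])
XtX_wide = X_wide.T @ X_wide
rank_wide = np.linalg.matrix_rank(XtX_wide, tol=1e-8)

# Both 3 x 3 matrices have rank 2, so neither has an ordinary inverse.
print(rank_redundant, rank_wide)

# The pseudo-inverse still gives a usable (minimum-norm) solution.
theta = np.linalg.pinv(XtX_wide, rcond=1e-10) @ X_wide.T @ y
print(X_wide @ theta)  # reproduces y up to rounding
```

The usual remedies are to drop the redundant feature, reduce the number of features, or add regularization.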
5- LSTM
# Need to read in more detail

LR - assumptions
LR gradient descent
Conjugate gradient
BFGS, L-BFGS
