Machine Learning
→ The central idea is that you can represent reality using a mathematical function.
→ The machine learning algorithm does not know this function in advance but can
guess it after having seen some data.
Regression
→ Regression is used across many domains: statistics, economics, psychology, the social
sciences, and political science
→ Linear regression works by combining the numeric features through a weighted summation
o y = bX + a
o where y is the vector of response values
o X is the matrix of features used to predict y
o b is the vector of coefficients and a is the bias
→ Suppose the response vector y represents the sales of a product
o sales = advertising·b_adv + shops·b_shop + price·b_price + a
o The bias 'a' represents the prediction baseline when all the features have
values of zero.
→ Each b is a numeric value that describes the intensity of the relationship between a
feature and the response.
→ Its sign shows the direction of the effect when the feature's value changes.
→ Using the gradient descent algorithm, linear regression can find the best set of
coefficients and bias
o by minimizing a cost function given by the squared difference between the
predictions and the real values:
o Cost = (1/n) ∑ᵢ (yᵢ − ŷᵢ)², where n = number of examples
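As a sketch of this procedure, the short script below fits coefficients and a bias by gradient descent on synthetic data; the dataset, learning rate, and iteration count are illustrative assumptions, not values from the notes:

```python
import numpy as np

# Synthetic data: 100 examples, 2 numeric features (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([3.0, -2.0]) + 5.0 + rng.normal(scale=0.1, size=100)

b = np.zeros(2)  # coefficients, one per feature
a = 0.0          # bias
lr = 0.1         # learning rate (assumed)

for _ in range(500):
    err = (X @ b + a) - y                 # predictions minus real values
    b -= lr * (2 / len(y)) * (X.T @ err)  # gradient of the mean squared error
    a -= lr * (2 / len(y)) * err.sum()

print(b.round(1), round(a, 1))  # should be close to [3, -2] and 5
```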
Statistical Approach: Curve Fitting
→ Two variables: we want to find the relation between them
→ Examples: height and weight of a person; expenditure and income
→ The relation is expressed by means of some mathematical equation
→ Straight line: y = a + bx
Parabola: y = a + bx + cx²
Cubic curve: y = a + bx + cx² + dx³
Exponential: y = abˣ
Logistic: y = k/(1 + e^(a+bx))
→ y: dependent variable
x: independent variable
Method of Least Squares
Fitting straight line
→ The sum of the squares of the differences between the observed and the corresponding
estimated values should be the minimum possible
→ Observed value: yᵢ
→ Estimated value using the straight line y = a + bx: a + bxᵢ
→ Difference: yᵢ − (a + bxᵢ)
→ The "best fitting" straight line is the one which minimises the sum of the
squares of the differences, ∑(yᵢ − (a + bxᵢ))²
→ The problem is tackled by mathematical methods. This leads to a set of equations,
called normal equations, solving which we get the desired values of a and b
→ How the normal equations are derived
Minimise S = ∑(yᵢ − a − bxᵢ)²
∂S/∂a = −2∑(yᵢ − a − bxᵢ) = 0
or, ∑yᵢ − na − b∑xᵢ = 0 or, ∑y = na + b∑x
∂S/∂b = −2∑xᵢ(yᵢ − a − bxᵢ) = 0
or, ∑xᵢyᵢ − a∑xᵢ − b∑xᵢ² = 0 or, ∑xy = a∑x + b∑x²
1st Equation
→ ∑y = na + b∑x
2nd Equation
→ ∑xy = a∑x + b∑x²
→ Determine the equation of a straight line which best fits the following data
x → 10 12 13 16 17 20 25
y → 19 22 24 27 29 33 37
n = 7, ∑x = 113, ∑y = 191, ∑x² = 1983, ∑xy = 3276
Solution:
Normal equations: 191 = 7a + 113b and 3276 = 113a + 1983b
Solving, b = 1349/1112 ≈ 1.213 and a ≈ 7.702, so the best-fitting line is y ≈ 7.702 + 1.213x
Accuracy
→ Both sides of the fitted equation are equal when x and y are replaced by their means x̄
and ȳ
→ Here x̄ = 113/7 ≈ 16.143 and ȳ = 191/7 ≈ 27.286; substituting x̄ into the fitted line gives y = 7.702 + 1.213 × 16.143 ≈ 27.283
→ The small discrepancy in the last place is due to rounding error.
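The sums and the fitted line can be verified with a short script (numpy assumed); the normal equations are solved directly and cross-checked against numpy's built-in least-squares polynomial fit:

```python
import numpy as np

x = np.array([10, 12, 13, 16, 17, 20, 25], dtype=float)
y = np.array([19, 22, 24, 27, 29, 33, 37], dtype=float)
n = len(x)

# Normal equations:  sum(y)  = n*a      + b*sum(x)
#                    sum(xy) = a*sum(x) + b*sum(x^2)
A = np.array([[n, x.sum()], [x.sum(), (x**2).sum()]])
rhs = np.array([y.sum(), (x * y).sum()])
a, b = np.linalg.solve(A, rhs)
print(round(a, 2), round(b, 2))  # a ≈ 7.70, b ≈ 1.21

# Cross-check against numpy's least-squares fit (returns slope first).
b2, a2 = np.polyfit(x, y, 1)
assert np.allclose([a, b], [a2, b2])
```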
Fitting Geometric Curve
→ Geometric curve: y = axᵇ
Taking logarithms: log y = log a + b log x, i.e. Y = A + bX, where Y = log y, A = log a, X = log x
Normal Equations:
∑Y = nA + b∑X
∑XY = A∑X + b∑X²
Data:
X → 0.6021 0.6990 0.7782 0.8451 0.9031
Y → 0.9031 1.0969 1.2553 1.3892 1.5057
Calculation:
X² → 0.3600 0.4900 0.6084 0.7225 0.8100
XY → 0.5400 0.7700 0.9828 1.1815 1.3590
Substituting in the equations (with ∑X ≈ 3.83, ∑Y ≈ 6.16, ∑X² = 2.9909, ∑XY = 4.8333):
6.16 = 5A + 3.83b
4.8333 = 3.83A + 2.9909b
Solution: b ≈ 2.01 and A ≈ −0.307, so a = 10^A ≈ 0.49 and the fitted curve is y ≈ 0.49x^2.01 (close to y = 0.5x²)
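A minimal sketch of the same fit, done in the log-log domain with numpy; the back-transform a = 10^A assumes base-10 logs, as in the table above:

```python
import numpy as np

# X = log10(x), Y = log10(y), as given in the table above.
X = np.array([0.6021, 0.6990, 0.7782, 0.8451, 0.9031])
Y = np.array([0.9031, 1.0969, 1.2553, 1.3892, 1.5057])
n = len(X)

# Normal equations for the transformed line Y = A + bX.
M = np.array([[n, X.sum()], [X.sum(), (X**2).sum()]])
rhs = np.array([Y.sum(), (X * Y).sum()])
A, b = np.linalg.solve(M, rhs)
a = 10**A  # back-transform the intercept
print(round(a, 2), round(b, 2))  # roughly a ≈ 0.5, b ≈ 2
```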
Fitting Parabola (Polynomial of degree 2)
→ y = a + bx + cx²
→ Normal equations
∑y = na + b∑x + c∑x²
∑xy = a∑x + b∑x² + c∑x³
∑x²y = a∑x² + b∑x³ + c∑x⁴
Data:
x      y      x²   x³    x⁴    xy     x²y
1      2.18   1    1     1     2.18   2.18
2      2.44   4    8     16    4.88   9.76
3      2.78   9    27    81    8.34   25.02
4      3.25   16   64    256   13.00  52.00
5      3.83   25   125   625   19.15  95.75
Total  15     14.48 55   225   979    47.55  184.71
Substituting in the equations
14.48 = 5a + 15b + 55c
47.55 = 15a + 55b + 225c
184.71 = 55a + 225b + 979c
Solution: c = 0.055, b = 0.081, a = 2.048, so the fitted parabola is y = 2.048 + 0.081x + 0.055x²
Accuracy:
Calculate ȳ = 14.48/5 = 2.896, x̄ = ∑x/n = 3, and the mean of x², ∑x²/n = 55/5 = 11
Substituting, y = 2.048 + 0.081 × 3 + 0.055 × 11 = 2.896 = ȳ
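The three normal equations can be assembled from the column sums and solved with numpy, as a quick check of the worked solution:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.18, 2.44, 2.78, 3.25, 3.83])
n = len(x)

# Normal equations for y = a + b*x + c*x^2, built from the power sums.
S = lambda p: (x**p).sum()
M = np.array([[n,    S(1), S(2)],
              [S(1), S(2), S(3)],
              [S(2), S(3), S(4)]])
rhs = np.array([y.sum(), (x * y).sum(), (x**2 * y).sum()])
a, b, c = np.linalg.solve(M, rhs)
print(round(a, 3), round(b, 3), round(c, 3))  # ≈ 2.048, 0.081, 0.055
```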
Fitting Exponential
→ y = abˣ, so taking logarithms: log y = log a + x log b
Normal equations:
∑log y = n log a + (log b)∑x
∑x log y = (log a)∑x + (log b)∑x²
Data:
Y:
Substituting in the equations
Solution:
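Since the data for this section is not reproduced above, the sketch below fits the same model to made-up points generated from y = 2 · 1.5ˣ; the log-transform and normal equations follow the text:

```python
import numpy as np

# Made-up data (not from the notes), generated from y = 2 * 1.5**x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 * 1.5**x

# Fit log10(y) = log10(a) + x*log10(b) with the normal equations.
Y = np.log10(y)
n = len(x)
M = np.array([[n, x.sum()], [x.sum(), (x**2).sum()]])
rhs = np.array([Y.sum(), (x * Y).sum()])
logA, logB = np.linalg.solve(M, rhs)
a, b = 10**logA, 10**logB
print(round(a, 3), round(b, 3))  # recovers a = 2.0, b = 1.5
```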