Suppose a person increases their daily calorie intake by 100 units.
How much will this affect the person's weight?
Regression Analysis
Regression Analysis is
[Figure: scatter plot of son's height (inches) against father's height (inches)]
Regression
• Meaning: “Stepping back towards the average”
• Galton found that the offspring of abnormally tall or short parents tend to “regress”
or “step back” to the average population height.
Income Expenditure
Marks in Mathematics    Marks in Statistics
75                      85
30                      45
60                      54
80                      91
53                      58
35                      63
15                      35
40                      45
35                      45
48                      44
[Figure: scatter plot of Marks in Statistics against Marks in Mathematics]
Line of best fit, or we can say, a line which best fits the data
[Figure: scatter plot of Marks in Statistics against Marks in Mathematics with the line of best fit drawn through the points]
How to find a line which best fits the data?
One method is
$Y_i = a + bX_i + \epsilon_i, \quad i = 1, 2, \dots, n$
Where
• $(X_i, Y_i)$ is the $i$-th pair of observations on the independent variable $X$ and
the dependent variable $Y$,
• $n$ is the sample size, i.e. the total number of pairs of observations on $X$ and $Y$,
• $a$ and $b$ are the unknown constants of the equation (intercept and slope),
• $\epsilon_i$ is the error term.
Principle of Least Squares
Estimate the values of unknown constants by minimizing the residual sum of squares.
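Also, let
$e_i = Y_i - \hat{Y}_i, \quad i = 1, 2, \dots, n,$
where $\hat{Y}_i = a + bX_i$ is the value of $Y$ given by the line at $X_i$.
$e_i$ is the residual associated with the $i$-th observation.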
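The minimization behind the principle of least squares can be written out as a short derivation. This is the standard textbook argument, sketched here with the symbol $S(a, b)$ introduced only for this illustration (it does not appear in the original slides); the residuals are taken as $e_i = Y_i - a - bX_i$.

```latex
% Residual sum of squares as a function of the unknown constants
S(a, b) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - a - bX_i)^2

% Setting the partial derivatives to zero gives the two normal equations
\frac{\partial S}{\partial a} = -2 \sum_{i=1}^{n} (Y_i - a - bX_i) = 0
  \;\Longrightarrow\; \sum_{i=1}^{n} Y_i = na + b \sum_{i=1}^{n} X_i

\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{n} X_i (Y_i - a - bX_i) = 0
  \;\Longrightarrow\; \sum_{i=1}^{n} X_i Y_i = a \sum_{i=1}^{n} X_i + b \sum_{i=1}^{n} X_i^2

% Dividing the first normal equation by n
\bar{Y} = a + b\bar{X}
```

The last line is the first normal equation divided by $n$: the fitted line takes the value $\bar{Y}$ at $X = \bar{X}$.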
The first normal equation tells us that the line passes through the sample means $(\bar{X}, \bar{Y})$.
The regression line (line of best fit) of $Y$ on $X$ is a line which has slope $b$ and passes
through $(\bar{X}, \bar{Y})$, i.e.
$Y - \bar{Y} = r \frac{\sigma_Y}{\sigma_X} (X - \bar{X})$
$Y - \bar{Y} = b_{YX} (X - \bar{X})$
Where $b_{YX} = r \frac{\sigma_Y}{\sigma_X}$ is called the regression coefficient of $Y$ on $X$.
The regression coefficient $b_{YX}$ gives the change in the value of $Y$ for a unit change in $X$.
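The slope and intercept of the line of Y on X can be computed directly from the correlation and the standard deviations. The following is a minimal sketch, not a definitive implementation: the function name `regression_y_on_x` and the small data set are made up for illustration, and the standard deviations divide by n (population form).

```python
import math

def regression_y_on_x(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations and covariance (divide by n).
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov_xy / (sd_x * sd_y)   # correlation coefficient
    b = r * sd_y / sd_x          # regression coefficient of Y on X
    a = mean_y - b * mean_x      # the line passes through the sample means
    return a, b

# Illustrative data (made up): points lying exactly on Y = 1 + 2X.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = regression_y_on_x(xs, ys)
print(a, b)
```

Because the illustrative points lie exactly on a line, the fitted intercept and slope should recover that line up to floating-point error; a unit increase in X then changes the fitted Y by b.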
Regression Line of X on Y
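Interchanging the roles of $X$ and $Y$ in the line of $Y$ on $X$ gives the corresponding line of $X$ on $Y$. The block below is the standard result written in the same notation; the symbol $b_{XY}$ is assumed here by analogy with $b_{YX}$.

```latex
% Regression line of X on Y: passes through the sample means with slope b_{XY}
X - \bar{X} = r \frac{\sigma_X}{\sigma_Y} (Y - \bar{Y})
X - \bar{X} = b_{XY} (Y - \bar{Y}), \qquad b_{XY} = r \frac{\sigma_X}{\sigma_Y}

% The product of the two regression coefficients is the squared correlation
b_{YX} \, b_{XY} = r \frac{\sigma_Y}{\sigma_X} \cdot r \frac{\sigma_X}{\sigma_Y} = r^2
```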
Example
Marks in Mathematics    Marks in Statistics
75                      85
30                      45
60                      54
80                      91
53                      58
35                      63
15                      35
40                      45
35                      45
48                      44
Total: 471              565
[Figure: scatter plot of Marks in Statistics against Marks in Mathematics]
$r(X, Y) = 0.86$
Marks in Mathematics ($X$)    Marks in Statistics ($Y$)    $X^2$    $Y^2$    $XY$
75                            85                           5625     7225     6375
30                            45                           900      2025     1350
60                            54                           3600     2916     3240
80                            91                           6400     8281     7280
Example
$\bar{X} = 47.1, \quad \bar{Y} = 56.5$
$\sigma_X = 19.31, \quad \sigma_Y = 17.52$
$r(X, Y) = 0.86$
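These summary values can be reproduced from the ten pairs of marks in the example using the column totals of $X$, $Y$, $X^2$, $Y^2$ and $XY$. The sketch below assumes population formulas (dividing by n), which is what matches the figures quoted above; the variable names are chosen for illustration only.

```python
import math

# Marks of the ten students: X = Mathematics, Y = Statistics.
maths = [75, 30, 60, 80, 53, 35, 15, 40, 35, 48]
stats = [85, 45, 54, 91, 58, 63, 35, 45, 45, 44]
n = len(maths)

# Column totals of the computation table.
sum_x = sum(maths)
sum_y = sum(stats)
sum_x2 = sum(x * x for x in maths)
sum_y2 = sum(y * y for y in stats)
sum_xy = sum(x * y for x, y in zip(maths, stats))

mean_x = sum_x / n
mean_y = sum_y / n
# Population standard deviations: sqrt(mean of squares - square of mean).
sd_x = math.sqrt(sum_x2 / n - mean_x ** 2)
sd_y = math.sqrt(sum_y2 / n - mean_y ** 2)
# Correlation coefficient: covariance divided by the product of the SDs.
r = (sum_xy / n - mean_x * mean_y) / (sd_x * sd_y)

print(f"mean X = {mean_x:.1f}, mean Y = {mean_y:.1f}")
print(f"sd X = {sd_x:.2f}, sd Y = {sd_y:.2f}")
print(f"r = {r:.2f}")
```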
Marks in Mathematics ($X$)    Marks in Statistics ($Y$)
75                            85
30                            45
60                            54
80                            91
53                            58
35                            63
15                            35
40                            45
35                            45
48                            44
Example
Estimated Marks in Statistics when a student gets 85 marks in Mathematics.
$Y = 19.56 + 0.78X$
$X = -6.748 + 0.95Y$
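Both regression lines, and the estimate for a student with 85 marks in Mathematics, can be recomputed from the example data. This is a sketch under the assumption that population variances (dividing by n) are used; the variable names are illustrative. The printed coefficients should agree with the two lines above up to rounding in the last digit.

```python
# Marks of the ten students: X = Mathematics, Y = Statistics.
maths = [75, 30, 60, 80, 53, 35, 15, 40, 35, 48]
stats = [85, 45, 54, 91, 58, 63, 35, 45, 45, 44]
n = len(maths)

mean_x = sum(maths) / n
mean_y = sum(stats) / n

# Population variances and covariance (divide by n).
var_x = sum((x - mean_x) ** 2 for x in maths) / n
var_y = sum((y - mean_y) ** 2 for y in stats) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(maths, stats)) / n

# Regression coefficients: r*sd_Y/sd_X = cov/var_X and r*sd_X/sd_Y = cov/var_Y.
b_yx = cov_xy / var_x
b_xy = cov_xy / var_y

# Each line passes through the point of sample means.
a_yx = mean_y - b_yx * mean_x
a_xy = mean_x - b_xy * mean_y

print(f"Y = {a_yx:.2f} + {b_yx:.2f} X")
print(f"X = {a_xy:.3f} + {b_xy:.2f} Y")

# To estimate Y from X, use the line of Y on X.
estimate = a_yx + b_yx * 85
print(f"Estimated Statistics marks at X = 85: {estimate:.2f}")
```

The line of Y on X is the one to use here because Statistics marks are being estimated from Mathematics marks; the line of X on Y would be used for the reverse question.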
Summary
“A statistical relationship,