
Chapter No 11 (Simple Linear Regression)

Definitions
Pearson product moment correlation coefficient: A numerical measure of the strength of the linear
relationship between two variables is called the Pearson product moment correlation coefficient, total
correlation, or coefficient of simple correlation. It is denoted by r.
$$r = \frac{\sum(X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum(X - \bar{X})^2 \sum(Y - \bar{Y})^2}}$$

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[\,n\sum X^2 - (\sum X)^2\,][\,n\sum Y^2 - (\sum Y)^2\,]}}$$
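The two formulas above are algebraically equivalent; a minimal sketch, using a small hypothetical data set invented here (X as hours studied, Y as marks scored), shows that both give the same value of r:

```python
# Pearson's r computed two ways: the deviation form and the
# computational (raw-sum) form. Both should agree.
# Hypothetical data: hours studied (X) vs. marks scored (Y).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

mx = sum(X) / n
my = sum(Y) / n

# Deviation form: sum of cross-deviations over the root of
# the product of the squared-deviation sums.
num = sum((x - mx) * (y - my) for x, y in zip(X, Y))
den = (sum((x - mx) ** 2 for x in X) * sum((y - my) ** 2 for y in Y)) ** 0.5
r_dev = num / den

# Computational form: built from raw sums only.
sxy = sum(x * y for x, y in zip(X, Y))
sx, sy = sum(X), sum(Y)
sxx = sum(x * x for x in X)
syy = sum(y * y for y in Y)
r_raw = (n * sxy - sx * sy) / ((n * sxx - sx ** 2) * (n * syy - sy ** 2)) ** 0.5

print(round(r_dev, 4), round(r_raw, 4))  # both ≈ 0.7746
```

The computational form avoids storing the means, which is why it appears in textbook calculation tables.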

Properties of correlation coefficient r:

• The correlation coefficient r is symmetrical with respect to the variables X and Y.


• The correlation coefficient r lies between -1 and +1.
• The correlation coefficient r is independent of the origin and scale.
Correlation and causation: Correlation refers to a relationship between two variables in which they tend to
change together, while causation means that a change in one variable directly causes a change in the other.
Relation between correlation coefficient and regression coefficients: The correlation coefficient is the
geometric mean of the two regression coefficients.

$$r = \sqrt{b_{YX} \times b_{XY}}$$
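This relation can be checked numerically; the sketch below uses the same small hypothetical data as before and verifies that the geometric mean of the two regression coefficients reproduces |r|:

```python
# Verify r = sqrt(b_YX * b_XY) on hypothetical data.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n

Sxy = sum((x - mx) * (y - my) for x, y in zip(X, Y))  # cross-deviation sum
Sxx = sum((x - mx) ** 2 for x in X)
Syy = sum((y - my) ** 2 for y in Y)

b_yx = Sxy / Sxx   # regression coefficient of Y on X
b_xy = Sxy / Syy   # regression coefficient of X on Y
r = Sxy / (Sxx * Syy) ** 0.5

# Geometric mean of the two coefficients equals |r|;
# r carries the common sign of the two slopes.
assert abs((b_yx * b_xy) ** 0.5 - abs(r)) < 1e-12
```

Note that b_YX and b_XY always share the same sign (both contain Sxy), so the square root is well defined, and r is given that sign.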

Scatter Diagram: The graphical representation of a set of n pairs of bivariate data is called a scatter plot
or scatter diagram. In a scatter diagram we take the independent variable X along the x-axis and the dependent
variable Y along the y-axis. A scatter diagram can show a positive linear relationship, a negative linear
relationship, no relationship, or a curvilinear relationship.
Regression: The dependence of a dependent variable on one or more independent variables is called
regression.
Simple regression: The dependence of one dependent variable on one independent variable is called
simple regression.
Multiple regression: The dependence of one dependent variable on two or more independent variables is
called multiple regression.
Coefficient of determination: The coefficient of determination is denoted by 𝑅 2 . It is a statistical measure
that indicates the proportion of the variance in the dependent variable that is predictable from the
independent variable. In simple words, it tells us how well the independent variable explains the variability
of the dependent variable. The coefficient of determination lies between 0 and 1.
$$R^2 = 1 - \frac{SSE}{SST}$$
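A minimal sketch of the computation, fitting a least-squares line to small hypothetical data and forming R² from the error and total sums of squares:

```python
# R^2 = 1 - SSE/SST for a fitted least-squares line (hypothetical data).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n

# Least-squares slope and intercept.
b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
a = my - b * mx
Yhat = [a + b * x for x in X]

SSE = sum((y - yh) ** 2 for y, yh in zip(Y, Yhat))   # unexplained variation
SST = sum((y - my) ** 2 for y in Y)                  # total variation
R2 = 1 - SSE / SST
print(round(R2, 4))  # ≈ 0.6
```

For simple linear regression R² equals r², so with r ≈ 0.7746 from the earlier data we get R² ≈ 0.6, consistent with this direct computation.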
Properties of least square regression line:

• The least squares regression line always passes through the point (𝑋̅, 𝑌̅), the means of the data.
• The sum of the deviations of the observed values Y from the least squares regression line is always
equal to 0: 𝛴(𝑌 − 𝑌̂) = 0.
• The sum of the squared deviations of the observed values Y from the least squares regression
line is a minimum: 𝛴(𝑌 − 𝑌̂)2 = minimum.
• The least squares regression line obtained from a random sample is the line of best fit because a
and b are unbiased estimates of the parameters α and 𝛽.
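The first two properties can be checked numerically; a sketch, again on small hypothetical data:

```python
# Check two properties of the least-squares line:
#   1. it passes through (X-bar, Y-bar)
#   2. its residuals sum to zero
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n

b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
a = my - b * mx   # this step itself forces the line through the means

# Property 1: plugging X-bar into the fitted line returns Y-bar.
assert abs((a + b * mx) - my) < 1e-12

# Property 2: the residuals cancel out (up to rounding error).
residuals = [y - (a + b * x) for x, y in zip(X, Y)]
assert abs(sum(residuals)) < 1e-12
```

Both properties follow from the normal equations of least squares: the intercept formula a = 𝑌̅ − b𝑋̅ is exactly the statement that the line passes through the means.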
Standard deviation of regression or standard error of estimate:

$$s_{YX} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}}$$
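A sketch of the computation on the same hypothetical data, dividing the error sum of squares by n − 2 (two degrees of freedom are lost estimating a and b):

```python
# Standard error of estimate: s_YX = sqrt(SSE / (n - 2)).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n

b = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
a = my - b * mx

# SSE uses the fitted values Y-hat, not the mean of Y.
SSE = sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y))
s_yx = (SSE / (n - 2)) ** 0.5
print(round(s_yx, 4))  # ≈ 0.8944
```

s_YX measures the typical scatter of the observed Y values around the fitted line, in the units of Y.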

Assumptions of error term:

• 𝐸(𝜀) = 0, the expected value of error term is zero.


• Var (𝜀)= E (𝜀 2 )=𝜎 2 , the variance of the error term is constant. This means the distribution of the errors
has the same variance for all values of X (homoscedasticity assumption).
• E (𝜀𝑖 , 𝜀𝑗 ) = 0 for all i ≠ 𝑗, the error terms are independent of each other (assumption of no serial or
autocorrelation).
• E (𝑋, 𝜀𝑖 ) = 0, X and 𝜀𝑖 are also independent of each other.
• 𝜀𝑖′ 𝑠 are normally distributed with a mean of zero and constant variance.

Regression line: The line or curve around which the points cluster is called the regression line.
Dependent variable: The dependent variable is denoted by Y and is the variable being explained or
predicted. It is also called the regressand, explained variable, predictand, or response. For example,
marks scored in a test.
Independent variable: The independent variable is denoted by X and is the variable used to make
predictions. It is also called the regressor, explanatory variable, or predictor. For example, the number of
hours studied.
Simple linear regression: If the simple regression describes the dependence of the expected value of the
dependent variable as a linear function of the independent variable, then the regression is called simple
linear regression.
Simple linear regression coefficient: The simple linear regression coefficient is the change in the
expected value of the dependent variable for a one-unit increase in the independent non-random
variable.

MCQs point of view


• r = +1 indicates perfect positive correlation.
• r = -1 indicates perfect negative correlation.
• r = 0 indicates zero or no correlation.
• The sign of r shows the direction of the correlation or relation.
• The population correlation coefficient is denoted by 𝜌.
• An 𝑅 2 near 1 is best.
• 𝑅 2 is used to check which model is the best model.
• If SSE = 0 then 𝑅 2 = 1.
• If SSE is close to SST then 𝑅 2 approaches 0.
• The principle of least squares estimator is abbreviated LS.
• Expectation always applies to a random variable.
• The distribution of the error term determines the distribution of Y.
• α is the intercept.
• 𝛽 is the regression coefficient, slope, or rate of change.
• Prediction (within the domain of X).
• Forecasting (for a particular variable).
• Dependent variable (random variable).
• Independent variable (fixed or non-random variable).
