A simple linear regression model is given by the following equation:

Y = β0 + β1X + ϵ

where
• Y is called the dependent variable;
• X is called the independent variable;
• ϵ is called the error variable;
• β0 is called the y-intercept coefficient;
• β1 is called the slope coefficient.

The required conditions for the error variable:
• ϵ ∼ N(0, σϵ²);
• ϵ and X are independent;
• for n observations, the errors ϵ1, ϵ2, …, ϵn are independent and have the same variance.

2. The least squares regression line

The data consist of n observations (x1, y1), (x2, y2), …, (xn, yn).

Notation:

SSxy = Σᵢ₌₁ⁿ xᵢyᵢ − n x̄ ȳ;   SSx = Σᵢ₌₁ⁿ xᵢ² − n x̄²;   SSy = Σᵢ₌₁ⁿ yᵢ² − n ȳ²

• The slope estimate is: β̂1 = SSxy / SSx
• The y-intercept estimate is: β̂0 = ȳ − β̂1 x̄
• The least squares regression line is ŷ = β̂0 + β̂1 x
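As a concrete illustration, here is a minimal Python sketch of the formulas above. The function name fit_least_squares, the simulated data, and the parameter values (β0 = 1, β1 = 2, σϵ = 1) are illustrative assumptions, not part of the notes.

```python
import numpy as np

def fit_least_squares(x, y):
    """Least squares estimates computed from the SS formulas above."""
    n = x.size
    x_bar, y_bar = x.mean(), y.mean()
    ss_xy = np.sum(x * y) - n * x_bar * y_bar   # SSxy
    ss_x = np.sum(x ** 2) - n * x_bar ** 2      # SSx
    beta1_hat = ss_xy / ss_x                    # slope estimate
    beta0_hat = y_bar - beta1_hat * x_bar       # y-intercept estimate
    return beta0_hat, beta1_hat

# Simulate data satisfying the required conditions: eps ~ N(0, sigma^2),
# independent of x, iid across observations (all values hypothetical).
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 30)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=x.size)
b0, b1 = fit_least_squares(x, y)
print(f"least squares line: y_hat = {b0:.3f} + {b1:.3f} x")
```

With a reasonable sample size the printed coefficients should land close to the simulated values β0 = 1 and β1 = 2.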
3. Assessing the regression model
• The sum of squared errors is:

SSE = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² = SSy − SSxy² / SSx

(remark that ŷᵢ = β̂0 + β̂1 xᵢ)

• The estimate of the standard error σϵ is:

sϵ = √(SSE / (n − 2))

• The coefficient of determination is:

R² = Σ(ŷᵢ − ȳ)² / Σ(yᵢ − ȳ)² = 1 − SSE/SSy = SSxy² / (SSx SSy)

R² measures the proportion of the variation in y that is explained by the linear regression model. (These quantities are computed in the sketch at the end of this section.)

• Testing on the slope:

H0 : β1 = 0 (there is no linear relationship between Y and X)
HA : β1 ≠ 0 (there is a linear relationship between Y and X)
Or HA : β1 > 0 (there is a positive linear relationship between Y and X)
Or HA : β1 < 0 (there is a negative linear relationship between Y and X)
- The test statistic is Tobs = β̂1 / sβ̂1, where sβ̂1 = sϵ / √SSx.
- The decision rule: we reject H0 if:
Tobs < −t_{n−2, α/2} or Tobs > t_{n−2, α/2} (for HA : β1 ≠ 0)
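Continuing the sketch above, the following Python fragment computes SSE, sϵ, R², and carries out the two-sided test on the slope. The helper name assess_regression, the sample data, and the significance level α = 0.05 are illustrative assumptions; SciPy's t distribution supplies the critical value t_{n−2, α/2}.

```python
import numpy as np
from scipy import stats

def assess_regression(x, y, alpha=0.05):
    """SSE, s_eps, R^2 and the two-sided t-test on the slope."""
    n = x.size
    x_bar, y_bar = x.mean(), y.mean()
    ss_xy = np.sum(x * y) - n * x_bar * y_bar
    ss_x = np.sum(x ** 2) - n * x_bar ** 2
    ss_y = np.sum(y ** 2) - n * y_bar ** 2
    beta1_hat = ss_xy / ss_x
    sse = ss_y - ss_xy ** 2 / ss_x            # SSE = SSy - SSxy^2 / SSx
    s_eps = np.sqrt(sse / (n - 2))            # estimate of sigma_eps
    r2 = ss_xy ** 2 / (ss_x * ss_y)           # coefficient of determination
    s_beta1 = s_eps / np.sqrt(ss_x)           # standard error of the slope
    t_obs = beta1_hat / s_beta1               # test statistic
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    reject = abs(t_obs) > t_crit              # reject H0 if |T_obs| > t_{n-2, alpha/2}
    return sse, s_eps, r2, t_obs, t_crit, reject

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sse, s_eps, r2, t_obs, t_crit, reject = assess_regression(x, y)
print(f"SSE = {sse:.3f}, s_eps = {s_eps:.3f}, R^2 = {r2:.3f}")
print(f"T_obs = {t_obs:.2f}, critical value = {t_crit:.2f}, reject H0: {reject}")
```

Since the rejection region is symmetric, the code checks |Tobs| > t_{n−2, α/2}, which is equivalent to the two inequalities in the decision rule above.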