
TARABA STATE UNIVERSITY, JALINGO

STA212 LABORATORY FOR INFERENCE II (2 UNITS) LECTURE NOTES


LINEAR REGRESSION AND CORRELATION
When comparing two different variables, two questions come to mind: "Is there a relationship
between the two variables?" and "How strong is that relationship?" These questions can be answered
using regression and correlation. Regression answers whether there is a relationship (these notes
explore linear relationships only), and correlation answers how strong the linear relationship is.
The independent variable, also called the explanatory variable or predictor variable, is the x -
value in the equation. The independent variable is the one that you use to predict what the other
variable is. The dependent variable depends on what independent value you pick. It also
responds to the explanatory variable and is sometimes called the response variable.
The population equation looks like:
y = β₀ + β₁x
β₀ = y-intercept
β₁ = slope
ŷ is used to predict y

Assumptions of the regression line:


a) The set ( x , y ) of ordered pairs is a random sample from the population of all such possible (
x , y ) pairs.
b) For each fixed value of x , the y -values have a normal distribution. All of the y distributions
have the same variance, and for a given x -value, the distribution of y values has a mean that
lies on the least squares line. You also assume that for a fixed y , each x has its own normal
distribution. This is difficult to figure out, so you can use the following to determine if you
have a normal distribution.
i. Look to see if the scatter plot has a linear pattern.
ii. Examine the residuals to see if there is randomness in the residuals. If there is a
pattern to the residuals, then there is an issue in the data.

SIMPLE LINEAR REGRESSION


We consider modelling the relationship between the dependent variable and one independent
variable. When there is only one independent variable in the linear regression model, the model
is termed a simple linear regression model. When there is more than one independent variable in
the model, the linear model is termed a multiple linear regression model.
The linear model is
y = β₀ + β₁x + ε
where
 x = independent variable
 y = dependent variable
 β₁ = slope of the regression line
 β₀ = intercept of the regression line with the y-axis
 n = number of cases or individuals
 ∑xy = sum of the products of the dependent and independent variables
 ∑x = sum of the independent variable
 ∑y = sum of the dependent variable
 ∑x² = sum of squares of the independent variable
The least-squares estimates are
β₁ = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)
β₀ = ȳ − β₁x̄

Example – linear regression of patients' age and blood pressure. A study is conducted
involving 10 patients to investigate the relationship between patients' age and their blood
pressure.

Obs    x (Age)   y (BP)   xy      x²
1      35        112      3920    1225
2      40        128      5120    1600
3      38        130      4940    1444
4      44        138      6072    1936
5      67        158      10586   4489
6      64        162      10368   4096
7      59        140      8260    3481
8      69        175      12075   4761
9      25        125      3125    625
10     50        142      7100    2500
Total  491       1410     71566   26157

∑x = 491, ∑y = 1410, ∑xy = 71566, ∑x² = 26157


Calculating the means x̄ and ȳ:
x̄ = ∑x / n = 491 / 10 = 49.1
ȳ = ∑y / n = 1410 / 10 = 141
β₁ = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)
β₁ = (10 × 71566 − 491 × 1410) / (10 × 26157 − (491)²)
β₁ = (715660 − 692310) / (261570 − 241081)
β₁ = 23350 / 20489 = 1.140
β₀ = ȳ − β₁x̄ = 141 − 1.140 × 49.1
β₀ = 141 − 55.974 = 85.026

Then substitute the regression coefficients into the regression model:
Estimated blood pressure: Ŷ = 85.026 + 1.140 × age

Interpretation of the equation:

The constant (intercept) β₀ = 85.026 is the estimated blood pressure at age zero.
The regression coefficient β₁ = 1.140 indicates that as age increases by one year, blood
pressure increases by 1.140.
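
The estimates above can be reproduced in a few lines of code. Below is a minimal Python sketch (not part of the original notes) that computes β₁ and β₀ from the raw age/blood-pressure data using the summation formulas:

```python
# Minimal sketch: least-squares slope and intercept for the age/BP data above.
x = [35, 40, 38, 44, 67, 64, 59, 69, 25, 50]            # age
y = [112, 128, 130, 138, 158, 162, 140, 175, 125, 142]  # blood pressure

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# beta1 = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
beta1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
beta0 = sum_y / n - beta1 * sum_x / n  # beta0 = y_bar - beta1 * x_bar

# beta1 ≈ 1.140; beta0 ≈ 85.04 (the notes round beta1 to 1.140 first, giving 85.026)
print(f"beta1 = {beta1:.3f}, beta0 = {beta0:.3f}")
```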

Applying the values of age to the regression model gives the estimated blood pressures (Ŷ),
from which the coefficient of determination (R²) is calculated as follows:

Obs    x (Age)  y (BP)  Ŷ        Ŷ−ȳ      (Ŷ−ȳ)²    y−Ŷ      (y−Ŷ)²   y−ȳ   (y−ȳ)²
1      35       112     124.926  -16.074  258.373   -12.926  167.081  -29   841
2      40       128     130.626  -10.374  107.620   -2.626   6.896    -13   169
3      38       130     128.346  -12.654  160.124   1.654    2.736    -11   121
4      44       138     135.186  -5.814   33.803    2.814    7.919    -3    9
5      67       158     161.406  20.406   416.405   -3.406   11.601   17    289
6      64       162     157.986  16.986   288.524   4.014    16.112   21    441
7      59       140     152.286  11.286   127.374   -12.286  150.946  -1    1
8      69       175     163.686  22.686   514.655   11.314   128.007  34    1156
9      25       125     113.526  -27.474  754.821   11.474   131.653  -16   256
10     50       142     142.026  1.026    1.053     -0.026   0.001    1     1
Total  491      1410    1410     0.000    2662.750  0.000    622.950  0     3284

Equation of the ANOVA table for simple linear regression

Source of Variation   Sums of Squares   df    Mean Square      F
Regression            ∑(Ŷ−ȳ)²           1     SSreg / 1        MSreg / MSres
Residual              ∑(y−Ŷ)²           n−2   SSres / (n−2)
Total                 ∑(y−ȳ)²           n−1

Calculating the ANOVA table values for simple linear regression

Source of Variation   Sums of Squares   df   Mean Square           F
Regression            2662.75           1    2662.75/1 = 2662.75   2662.75/77.869 = 34.195
Residual              622.95            8    622.95/8 = 77.869
Total                 3284              9

Calculating the coefficient of determination (R²):

R² = Explained Variation / Total Variation = Regression Sum of Squares (SSR) / Total Sum of Squares (SST)
Then substitute the values from the ANOVA table:
R² = 2662.75 / 3284 = 0.810
We can say that 81% of the variation in blood pressure is explained by age.
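
As a check on the tables above, here is a hedged Python sketch that rebuilds the ANOVA quantities, F and R² from the data and the rounded coefficients; small differences from the hand calculation come only from rounding:

```python
# Sketch: ANOVA decomposition and R^2 for the fitted line Y_hat = b0 + b1*x.
x = [35, 40, 38, 44, 67, 64, 59, 69, 25, 50]
y = [112, 128, 130, 138, 158, 162, 140, 175, 125, 142]
b0, b1 = 85.026, 1.140            # coefficients as rounded in the notes

n = len(x)
y_bar = sum(y) / n
y_hat = [b0 + b1 * xi for xi in x]

ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)           # regression SS
ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual SS
ss_tot = sum((yi - y_bar) ** 2 for yi in y)               # total SS

F = (ss_reg / 1) / (ss_res / (n - 2))
r2 = ss_reg / ss_tot
print(f"SSreg = {ss_reg:.2f}, SSres = {ss_res:.2f}, SStot = {ss_tot:.0f}")
print(f"F = {F:.3f}, R^2 = {r2:.3f}")  # about 34.2 and 0.81
```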

SIMPLE CORRELATION
A correlation exists between two variables when the values of one variable are somehow
associated with the values of the other variable. When you see a pattern in the data you say there
is a correlation in the data. Patterns can be exponential, logarithmic, or periodic, all are linear
patterns. To see this pattern, you can draw a scatter plot of the data. Remember to read graphs
from left to right, the same as you read words. If the graph goes up the correlation is positive and
if the graph goes down the correlation is negative. The words “weak”, “moderate”, and “strong”
are used to describe the strength of the relationship between the two variables.
a. Strong positive correlation between x and y. The points lie close to a straight line with y
increasing as x increases.
b. Weak, positive correlation between x and y. The trend shown is that y increases as x
increases but the points are not close to a straight line
c. No correlation between x and y; the points are distributed randomly on the graph.
d. Weak, negative correlation between x and y. The trend shown is that y decreases as x
increases but the points do not lie close to a straight line
e. Strong, negative correlation. The points lie close to a straight line, with y decreasing as x
increases
Correlation can have a value:
1. 1 is a perfect positive correlation
2. 0 is no correlation (the values don't seem linked at all)
3. -1 is a perfect negative correlation
The value shows how good the correlation is (not how steep the line is), and if it is positive or
negative. Usually, in statistics, there are three types of correlations: Pearson correlation, Kendall
rank correlation and Spearman correlation.
The Pearson correlation coefficient is given by the following equation:
r = ∑(xᵢ − x̄)(yᵢ − ȳ) / √[∑(xᵢ − x̄)² ∑(yᵢ − ȳ)²]
where x̄ is the mean of the x values and ȳ is the mean of the y values.
Example – correlation of statistics and science tests. A study is conducted involving 10 students
to investigate the association between statistics and science tests. The question here is: is
there a relationship between the marks gained by the 10 students in the statistics and science tests?

Students' marks in statistics and science
Students     1   2   3   4   5   6   7   8   9   10
Statistics   20  23  8   29  14  12  11  21  17  18
Science      20  25  11  24  23  16  12  21  22  26
Note: the marks are out of 30.
Let (x) denote the statistics marks and (y) the science marks.
Calculating the means (x̄, ȳ):
x̄ = ∑x / n = 173 / 10 = 17.3,  ȳ = ∑y / n = 200 / 10 = 20
So the mean of the statistics marks is x̄ = 17.3 and the mean of the science marks is ȳ = 20.
Calculating the equation parameters:

x     y     x−x̄    (x−x̄)²   y−ȳ   (y−ȳ)²   (x−x̄)(y−ȳ)
20    20    2.7    7.29     0     0        0
23    25    5.7    32.49    5     25       28.5
8     11    -9.3   86.49    -9    81       83.7
29    24    11.7   136.89   4     16       46.8
14    23    -3.3   10.89    3     9        -9.9
12    16    -5.3   28.09    -4    16       21.2
11    12    -6.3   39.69    -8    64       50.4
21    21    3.7    13.69    1     1        3.7
17    22    -0.3   0.09     2     4        -0.6
18    26    0.7    0.49     6     36       4.2
173   200   0      356.1    0     252      228

∑(x−x̄)² = 356.1,  ∑(y−ȳ)² = 252,  ∑(x−x̄)(y−ȳ) = 228


Calculating the Pearson correlation coefficient:
r = ∑(x−x̄)(y−ȳ) / √[∑(x−x̄)² ∑(y−ȳ)²] = 228 / (√356.1 × √252)
  = 228 / (18.8706 × 15.8745) = 228 / 299.5614 = 0.761
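
As a quick check, the Pearson coefficient can be computed directly; the following Python sketch mirrors the hand calculation above:

```python
# Sketch: Pearson correlation for the statistics (x) and science (y) marks.
x = [20, 23, 8, 29, 14, 12, 11, 21, 17, 18]
y = [20, 25, 11, 24, 23, 16, 12, 21, 22, 26]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))  # 228
sxx = sum((a - x_bar) ** 2 for a in x)                      # 356.1
syy = sum((b - y_bar) ** 2 for b in y)                      # 252

r = sxy / (sxx * syy) ** 0.5
print(f"r = {r:.3f}")  # 0.761
```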
Spearman rank correlation: Spearman rank correlation is a non-parametric test that is used to
measure the degree of association between two variables. It was developed by Spearman, and thus
it is called the Spearman rank correlation. The test makes no assumptions about the distribution
of the data and is the appropriate correlation analysis when the variables are measured on a
scale that is at least ordinal.
The following formula is used to calculate the Spearman rank correlation coefficient:
ρ = 1 − 6∑dᵢ² / [n(n² − 1)]
Where:
ρ = Spearman rank correlation coefficient
dᵢ = the difference between the ranks of corresponding values Xᵢ and Yᵢ
n = number of values in each data set
The Spearman correlation coefficient, ρ, can take values from +1 to −1. A ρ of +1 indicates a
perfect positive association of ranks, a ρ of zero indicates no association between ranks, and a
ρ of −1 indicates a perfect negative association of ranks. The closer ρ is to zero, the weaker
the association between the ranks.
An example of calculating Spearman's correlation
To calculate a Spearman rank-order correlation coefficient on data without any ties, use the
following data:
Students     1   2   3   4   5   6   7   8   9   10
Statistics   20  23  8   29  14  12  11  21  17  18
Science      20  25  11  24  23  16  12  21  22  26

Calculating the parameters of the Spearman rank equation:

Statistics   Science   Rank           Rank
(marks)      (marks)   (statistics)   (science)   |d|   d²
20           20        4              7           3     9
23           25        2              2           0     0
8            11        10             10          0     0
29           24        1              3           2     4
14           23        7              4           3     9
12           16        8              8           0     0
11           12        9              9           0     0
21           21        3              6           3     9
17           22        6              5           1     1
18           26        5              1           4     16
173          200                                        48

where d = the absolute difference between ranks and d² = the difference squared.
Then substitute into the main equation as follows:
ρ = 1 − 6∑dᵢ² / [n(n² − 1)]
ρ = 1 − (6 × 48) / [10(10² − 1)] = 1 − 288/990 = 1 − 0.2909 = 0.71

Hence, we have ρ = 0.71; this indicates a strong positive relationship between the ranks
individuals obtained in the statistics and science exams. This means the higher you ranked in
statistics, the higher you ranked in science, and vice versa. The Pearson correlation
coefficient r = 0.761 and Spearman's correlation ρ = 0.71 for the same data, which means the
correlation coefficients for both techniques are approximately equal.
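
The Spearman calculation is equally short in code. The sketch below ranks each list (rank 1 = highest mark, as in the table above; the simple ranking function assumes no ties, as in this example) and applies the formula:

```python
# Sketch: Spearman rank correlation for the same marks (no tied values).
x = [20, 23, 8, 29, 14, 12, 11, 21, 17, 18]   # statistics
y = [20, 25, 11, 24, 23, 16, 12, 21, 22, 26]  # science

def ranks(values):
    # Rank 1 = largest value; simple indexing works because there are no ties.
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

rx, ry = ranks(x), ranks(y)
n = len(x)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # 48
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(f"sum d^2 = {sum_d2}, rho = {rho:.2f}")  # 48, 0.71
```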
INFERENCE FOR REGRESSION AND CORRELATION
How do you really say you have a correlation? Can you test to see if there really is a correlation?
Of course, the answer is yes. The hypothesis test for correlation is as follows:
Hypothesis Test for Correlation:
1. State the random variables in words.
x = independent variable
y = dependent variable
2. State the null and alternative hypotheses and the level of significance
H0: ρ = 0 (There is no correlation)
Ha: ρ ≠ 0 (There is a correlation)
Or
Ha: ρ < 0 (There is a negative correlation)
Or
Ha: ρ > 0 (There is a positive correlation)
Also, state your α level here.
3. State and check the assumptions for the hypothesis test. The assumptions for the hypothesis
test are the same assumptions for regression and correlation.
4. Find the test statistic and p-value:
t = r / √[(1 − r²)/(n − 2)]
with degrees of freedom df = n − 2.

5. Conclusion: This is where you write reject H0 or fail to reject H0. The rule is: if the
p-value < α, then reject H0. If the p-value ≥ α, then fail to reject H0.
6. Interpretation: This is where you interpret the conclusion of the test in real-world terms.
The conclusion for a hypothesis test is that you either have enough evidence to show Ha is
true, or you do not have enough evidence to show Ha is true.
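
If software is available, steps 4 and 5 reduce to a few lines. The sketch below (assuming SciPy is installed) computes the test statistic and two-tailed p-value for the Pearson example from earlier (r = 0.761, n = 10):

```python
# Sketch: t-test for a correlation coefficient (two-tailed).
from scipy import stats

r, n = 0.761, 10                          # from the Pearson example above
t = r / ((1 - r ** 2) / (n - 2)) ** 0.5   # test statistic with df = n - 2
p = 2 * stats.t.sf(abs(t), df=n - 2)      # combined area in both tails
print(f"t = {t:.3f}, p = {p:.4f}")        # t ≈ 3.32, p ≈ 0.011: reject H0 at α = 0.05
```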

TESTS CONCERNING CORRELATION AND REGRESSION COEFFICIENTS


The correlation coefficient, r, tells us about the strength and direction of the linear
relationship between X1 and X2.
The sample data are used to compute r, the correlation coefficient for the sample. If we had data
for the entire population, we could find the population correlation coefficient. But because we
have only sample data, we cannot calculate the population correlation coefficient. The sample
correlation coefficient, r , is our estimate of the unknown population correlation coefficient.
ρ = population correlation coefficient (unknown)
r = sample correlation coefficient (known; calculated from sample data)
The hypothesis test lets us decide whether the value of the population correlation coefficient ρ is
"close to zero" or "significantly different from zero". We decide this based on the sample
correlation coefficient r and the sample size n.
If the test concludes that the correlation coefficient is significantly different from zero, we say
that the correlation coefficient is "significant."
Conclusion: There is sufficient evidence to conclude that there is a significant linear
relationship between X1 and X2 because the correlation coefficient is significantly different
from zero.
What the conclusion means: There is a significant linear relationship between X1 and X2. If the
test concludes that the correlation coefficient is not significantly different from zero (it is
close to zero), we say that the correlation coefficient is "not significant".
Performing the Hypothesis Test
 Null Hypothesis: H0: ρ = 0
 Alternate Hypothesis: Ha: ρ ≠ 0
What the Hypotheses Mean in Words
Null Hypothesis H0: The population correlation coefficient IS NOT significantly different from
zero. There IS NOT a significant linear relationship (correlation) between X1 and X2 in the
population.
Alternate Hypothesis Ha: The population correlation coefficient IS significantly different from
zero. There IS a significant linear relationship (correlation) between X1 and X2 in the population.
Drawing a Conclusion
There are two methods of making the decision concerning the hypothesis. The two methods are
equivalent and give the same result. The test statistic to test this hypothesis is:
t = r / √[(1 − r²)/(n − 2)]
or, equivalently,
t = r√(n − 2) / √(1 − r²)
Method 1: Using the p-value
 The p-value is calculated using the t-distribution with n − 2 degrees of freedom.
 The value of the test statistic, t, is shown in the computer or calculator output along with
the p-value. The test statistic t has the same sign as the correlation coefficient r.
 The p-value is the combined area in both tails.
If the p-value is less than the significance level (α = 0.05):
Decision: Reject the null hypothesis.
Conclusion: "There is sufficient evidence to conclude that there is a significant linear
relationship between X1 and X2 because the correlation coefficient is significantly different
from zero."
If the p-value is NOT less than the significance level (α = 0.05):
Decision: DO NOT REJECT the null hypothesis.
Conclusion: "There is insufficient evidence to conclude that there is a significant linear
relationship between X1 and X2 because the correlation coefficient is NOT significantly
different from zero."
E.g. Consider an example where the line of best fit is ŷ = −173.51 + 4.83x, with r = 0.6631 and
n = 11 data points.
Can the regression line be used for prediction? Given a third exam score (x value), can we use
the line to predict the final exam score (predicted y value)?
 H0: ρ = 0
 Ha: ρ ≠ 0
 α = 0.05
The p-value is 0.026 (from your calculator or from computer software). The p-value, 0.026, is
less than the significance level of α = 0.05.
Decision: Reject the null hypothesis H0.
Conclusion: There is sufficient evidence to conclude that there is a significant linear
relationship between the third exam score (x) and the final exam score (y) because the
correlation coefficient is significantly different from zero.
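
The quoted p-value of 0.026 can be reproduced from r and n alone; here is a short sketch (again assuming SciPy):

```python
# Sketch: reproducing the p-value for r = 0.6631, n = 11 (df = 9).
from scipy import stats

r, n = 0.6631, 11
t = r * (n - 2) ** 0.5 / (1 - r ** 2) ** 0.5  # equivalent form of the test statistic
p = 2 * stats.t.sf(abs(t), df=n - 2)
print(f"t = {t:.3f}, p = {p:.3f}")  # t ≈ 2.66, p ≈ 0.026 < 0.05, so reject H0
```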
Method 2: Using a table of critical values
The 95% Critical Values of the Sample Correlation Coefficient table can be used to give you a
good idea of whether the computed value of r is significant or not. Compare r to the appropriate
critical value in the table. If r is not between the positive and negative critical values, then
the correlation coefficient is significant. If r is significant, then you may want to use the
line for prediction.
We will always use a significance level of 5%, α = 0.05.
Example:
Suppose you computed r = 0.801 using n = 10 data points. df = n − 2 = 10 − 2 = 8. The critical
values associated with df = 8 are −0.632 and +0.632. If r < the negative critical value or
r > the positive critical value, then r is significant. Since r = 0.801 and 0.801 > 0.632, r is
significant and the line may be used for prediction.
FITTING A STRAIGHT LINE BY THE METHOD OF LEAST SQUARES
Let (xᵢ, yᵢ), i = 1, 2, …, n be the n sets of observations and let the assumed relation be
y = ax + b. Now we have to select a and b so that the straight line is the best fit to the data.
As explained earlier, the residual at x = xᵢ is
dᵢ = yᵢ − f(xᵢ) = yᵢ − (axᵢ + b), i = 1, 2, …, n
E = ∑ᵢ dᵢ² = ∑ᵢ [yᵢ − (axᵢ + b)]²

By the principle of least squares, E is minimum when

∂E/∂a = 0 and ∂E/∂b = 0

i.e. 2∑[yᵢ − (axᵢ + b)](−xᵢ) = 0 and 2∑[yᵢ − (axᵢ + b)](−1) = 0

i.e. ∑(xᵢyᵢ − axᵢ² − bxᵢ) = 0 and ∑(yᵢ − axᵢ − b) = 0

i.e. a∑xᵢ² + b∑xᵢ = ∑xᵢyᵢ  (1)

and a∑xᵢ + nb = ∑yᵢ  (2)

Since xᵢ, yᵢ are known, equations (1) and (2) give two equations in a and b. Solve for a and b
from (1) and (2) and obtain the best fit y = ax + b.
Note:
 Equations (1) and (2) are called normal equations.
 Dropping the suffix i from (1) and (2), the normal equations are
a∑x + nb = ∑y and a∑x² + b∑x = ∑xy
which are obtained by taking ∑ on both sides of y = ax + b, and by taking ∑ on both sides after
multiplying both sides of y = ax + b by x.
 Transformations like X = (x − a)/h, Y = (y − b)/h reduce the linear equation y = ax + b to the
form Y = AX + B. Hence, a linear fit in one system of coordinates is again a linear fit in the
other.
Example 1:
By the method of least squares, find the straight line of best fit to the data given below.
x   5   10   15   20   25
y   16  19   23   26   30
Solution:
Let the straight line be y = ax + b.

The normal equations are a∑x + 5b = ∑y  (1)

a∑x² + b∑x = ∑xy  (2)

To calculate ∑x, ∑x², ∑y, ∑xy we form the table below.

x      y     x²     xy
5      16    25     80
10     19    100    190
15     23    225    345
20     26    400    520
25     30    625    750
Total  75    114    1375   1885

The normal equations become 75a + 5b = 114  (1)

1375a + 75b = 1885  (2)
To eliminate b, multiply (1) by 15:
1125a + 75b = 1710  (3)
Equation (2) − (3) gives 250a = 175, or a = 0.7; hence b = 12.3.
Hence, the best fitting line is y = 0.7x + 12.3.
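
For comparison, the normal equations can also be set up and solved numerically; below is a minimal NumPy sketch for this example:

```python
# Sketch: solving the normal equations for Example 1 with NumPy.
import numpy as np

x = np.array([5, 10, 15, 20, 25])
y = np.array([16, 19, 23, 26, 30])

# a*sum(x^2) + b*sum(x) = sum(xy)   (1)
# a*sum(x)   + n*b      = sum(y)    (2)
A = np.array([[np.sum(x ** 2), np.sum(x)],
              [np.sum(x),      len(x)]])
rhs = np.array([np.sum(x * y), np.sum(y)])
a, b = np.linalg.solve(A, rhs)
print(f"y = {a:.1f}x + {b:.1f}")  # y = 0.7x + 12.3
```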
Alternatively, let X = (x − x_mid)/h = (x − 15)/5, Y = (y − y_mid)/h = (y − 23)/5
and let the line in the new variables be Y = AX + B.

x      y     X    X²   Y     XY
5      16    -2   4    -1.4  2.8
10     19    -1   1    -0.8  0.8
15     23    0    0    0     0
20     26    1    1    0.6   0.6
25     30    2    4    1.4   2.8
Total  75    114  0    10    -0.2   7

The normal equations are A∑X + 5B = ∑Y  (4)

A∑X² + B∑X = ∑XY  (5)

Since ∑X = 0, equation (4) gives 5B = −0.2 → B = −0.04

and equation (5) gives 10A = 7 → A = 0.7
The equation is Y = 0.7X − 0.04

i.e. (y − 23)/5 = 0.7 × (x − 15)/5 − 0.04 → y − 23 = 0.7x − 10.5 − 0.2

i.e. y = 0.7x + 12.3

which is the same equation as obtained before.
Example 2:
Fit a straight line to the data given below. Also estimate the value of y at x = 2.5.
x   0   1    2    3    4
y   1   1.8  3.3  4.5  6.3
Solution:
Here n = 5, ∑x = 10, ∑y = 16.9, ∑x² = 30, ∑xy = 47.1. Substituting into the normal equations,
we get

10a + 5b = 16.9  (1)
30a + 10b = 47.1  (2)
Solving (1) and (2), we get a = 1.33, b = 0.72.
Hence, the equation is y = 1.33x + 0.72
y (at x = 2.5) = 1.33(2.5) + 0.72 = 4.045

POLYNOMIAL AND REGRESSION PLANE


Some data, although exhibiting a marked pattern, are poorly represented by a straight line.
One method to accomplish this objective is to use transformations. An alternative method is
to fit polynomials to the data using polynomial regression.
The least squares procedure can be readily extended to fit the data to a higher-order polynomial.
For example, suppose that we fit a second-order polynomial, or quadratic:
y = a₀ + a₁x + a₂x² + ε
For this case the sum of the squares of the residuals is:
Sr = ∑ᵢ (yᵢ − a₀ − a₁xᵢ − a₂xᵢ²)²  (1)
We take the derivative of (1) with respect to each of the unknown coefficients of the
polynomial, as in
∂Sr/∂a₀ = −2∑(yᵢ − a₀ − a₁xᵢ − a₂xᵢ²)
∂Sr/∂a₁ = −2∑xᵢ(yᵢ − a₀ − a₁xᵢ − a₂xᵢ²)
∂Sr/∂a₂ = −2∑xᵢ²(yᵢ − a₀ − a₁xᵢ − a₂xᵢ²)
These equations can be set equal to zero and rearranged to develop the following set of normal
equations (2):
(n)a₀ + (∑xᵢ)a₁ + (∑xᵢ²)a₂ = ∑yᵢ
(∑xᵢ)a₀ + (∑xᵢ²)a₁ + (∑xᵢ³)a₂ = ∑xᵢyᵢ
(∑xᵢ²)a₀ + (∑xᵢ³)a₁ + (∑xᵢ⁴)a₂ = ∑xᵢ²yᵢ
where all summations are from i = 1 through n. Note that the above equations are linear and have
three unknowns: a₀, a₁ and a₂. The coefficients of the unknowns can be calculated directly from
the observed data.
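
In code, these three normal equations form a 3×3 linear system that any solver can handle. Below is a hedged NumPy sketch that builds and solves the system for arbitrary data (the function name is ours, not from the notes):

```python
# Sketch: normal equations for a quadratic least-squares fit.
import numpy as np

def fit_quadratic(x, y):
    """Return (a0, a1, a2) solving the quadratic normal equations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.array([[len(x),          x.sum(),         (x ** 2).sum()],
                  [x.sum(),         (x ** 2).sum(),  (x ** 3).sum()],
                  [(x ** 2).sum(),  (x ** 3).sum(),  (x ** 4).sum()]])
    rhs = np.array([y.sum(), (x * y).sum(), (x ** 2 * y).sum()])
    return np.linalg.solve(A, rhs)  # a0, a1, a2
```

For the worked example below, fit_quadratic returns the same coefficients as Gauss elimination.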
The two-dimensional case can be easily extended to an mth-order polynomial as:
y = a₀ + a₁x + a₂x² + … + aₘxᵐ + ε
The foregoing analysis can be easily extended to this general case. Thus, we recognize that
determining the coefficients of an mth-order polynomial is equivalent to solving a system of
m + 1 simultaneous linear equations. For this case, the standard error is formulated as
S_(y/x) = √[Sr / (n − (m + 1))]  (3)

Example: Fit a second-order polynomial to the data in the first two columns of the table below.
Computations for an error analysis of the quadratic least-squares fit

xᵢ   yᵢ     (yᵢ − ȳ)²   (yᵢ − a₀ − a₁xᵢ − a₂xᵢ²)²
0    2.1    544.44      0.14332
1    7.7    314.47      1.00286
2    13.6   140.03      1.08158
3    27.2   3.12        0.80491
4    40.9   239.22      0.61951
5    61.1   1272.11     0.09439
∑    152.6  2513.39     3.74657

Solution
From the given data,
m = 2, n = 6, ∑xᵢ = 15, ∑xᵢ² = 55, ∑xᵢ³ = 225, ∑xᵢ⁴ = 979
∑yᵢ = 152.6, ∑xᵢyᵢ = 585.6, ∑xᵢ²yᵢ = 2488.8, x̄ = 2.5, ȳ = 25.433
Therefore, the simultaneous linear equations are
[ 6    15    55  ] [a₀]   [ 152.6 ]
[ 15   55    225 ] [a₁] = [ 585.6 ]
[ 55   225   979 ] [a₂]   [ 2488.8]
Solving these equations through a technique such as Gauss elimination gives a₀ = 2.47857,
a₁ = 2.35929 and a₂ = 1.86071. Therefore, the least-squares quadratic equation for this case is
y = 2.47857 + 2.35929x + 1.86071x²
The standard error of the estimate based on the regression polynomial is
S_(y/x) = √[3.74657 / (6 − 3)] = 1.12
The coefficient of determination is
r² = (2513.39 − 3.74657) / 2513.39 = 0.99851
and the correlation coefficient is r = 0.99925.
These results indicate that 99.851 percent of the original uncertainty has been explained by the
model. This result supports the conclusion that the quadratic equation represents an excellent fit.
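
These results are easy to verify with a library routine; the sketch below uses numpy.polyfit and then recomputes S_(y/x) and r² from the residuals:

```python
# Sketch: verifying the quadratic fit and its error statistics with NumPy.
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])

a2, a1, a0 = np.polyfit(x, y, 2)        # polyfit returns highest power first
y_hat = a0 + a1 * x + a2 * x ** 2

sr = np.sum((y - y_hat) ** 2)           # residual sum of squares, ≈ 3.74657
st = np.sum((y - y.mean()) ** 2)        # total sum of squares, ≈ 2513.39
syx = np.sqrt(sr / (len(x) - 3))        # standard error, m + 1 = 3 coefficients
r2 = (st - sr) / st
print(f"a0 = {a0:.5f}, a1 = {a1:.5f}, a2 = {a2:.5f}")  # 2.47857, 2.35929, 1.86071
print(f"Syx = {syx:.2f}, r^2 = {r2:.5f}")              # 1.12, 0.99851
```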
