In this lesson, you will learn that statistics can be used in making predictions.
These predictions are based on the fact that two variables are related.
There is a strong association between cigarette smoking and death rate from
lung cancer. A study of British doctors found that smokers had 20 times the risk of
nonsmokers, and a larger study of American men aged 40 to 79 found death rates
11 times higher among smokers. Does this mean that cigarette smoking causes
lung cancer?
Example:
In Taguig City, a law requiring residents to wear face masks because of
COVID-19 went into effect in March 2020. As time passed, an increasing
percentage of people complied. A study found a high positive correlation
between the percentage of people wearing masks and the percentage reduction
in positive cases. This is a clear instance of cause and effect: wearing masks
prevents positive cases of COVID-19, so an increase in mask use causes a drop in
positive cases.
The tobacco industry has argued that the association between smoking and
lung cancer may be an instance of common response. Perhaps there is a genetic
factor that predisposes people both to nicotine addiction and to lung cancer. Then
smoking and lung cancer would be positively associated even if smoking had no
direct effect on health.
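The common-response argument can be illustrated with a small simulation. The sketch below is a hypothetical model, not data from any study: a hidden factor z drives both x and y, and x never enters the formula for y. The names z, x, y and the noise levels are assumptions chosen for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: z is a hidden common factor. x and y each depend on z
# plus independent noise, and x has no direct effect on y.
z = [rng.gauss(0, 1) for _ in range(5000)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)
print(f"correlation between x and y: {r_xy:.2f}")
```

Under these assumptions x and y come out positively correlated (about 0.5 in theory) even though neither causes the other, which is why an association alone cannot settle the question of causation.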
How can we detect causation?
B. Prediction
A strong relationship between two variables can be used to predict the value
of one when the value of the other is known.
This section tackles the simplest type of prediction, that of predicting one
variable (Y) from knowledge of another variable (X). Research in the
behavioral sciences is largely concerned with problems of prediction.
Y = a + bX
where:
Y = predicted score
a = the y-intercept
The slope of the regression line for predicting y from x is represented by b, and the
point where the line cuts the y-axis, or simply the y-intercept, is represented by a.
The intercept can be determined from the slope through the formula:
$$ a = \bar{y} - b\bar{x} $$
where $\bar{y}$ and $\bar{x}$ are the means of the y and x values.
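x = level of administrative capability, y = level of productivity

Admin.    x     y     xy    x²
1         4     4     16    16
2         4     3     12    16
3         3     3      9     9
4         5     4     20    25
5         3     5     15     9
6         3     4     12     9
7         4     4     16    16
8         5     4     20    25
9         5     5     25    25
10        4     5     20    16
n = 10    ∑x = 40    ∑y = 41    ∑xy = 165    ∑x² = 166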
Formula:
$$ b = \frac{\Delta Y}{\Delta X} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} $$
$$ b = \frac{1650 - 1640}{1660 - 1600} = \frac{10}{60} \approx 0.17 $$
$$ a = \bar{y} - b\bar{x} $$
$$ a = 4.1 - 0.17(4) = 4.1 - 0.68 = 3.42 $$
Y = 3.42 + 0.17 X
Therefore, Productivity = 3.42 + 0.17 (administrative capability)
This means that for every one-unit increase in the level of administrative
capability of administrators, their level of productivity increases by 0.17 units.
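The computation above can be checked with a short script. This is a minimal sketch: the function name fit_line and the variable names are assumptions introduced here, while the data are the ten (x, y) pairs from the table.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean of y minus slope times mean of x
    a = sum_y / n - b * (sum_x / n)
    return a, b


# x = administrative capability, y = productivity (from the table)
capability = [4, 4, 3, 5, 3, 3, 4, 5, 5, 4]
productivity = [4, 3, 3, 4, 5, 4, 4, 4, 5, 5]

a, b = fit_line(capability, productivity)
print(f"b = {b:.4f}, a = {a:.4f}")
```

Run without intermediate rounding, the slope is 10/60 ≈ 0.1667 and the intercept is about 3.43. The worked solution obtains 3.42 because it rounds b to 0.17 before computing a.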
2. Anthony, an engineering statistics student, has a summer job with the division of
DENR. A new variety of tree was planted 6 years ago, and the trunk diameters were
taken each year of growth. Neglect all environmental factors.
Year 1 2 3 4 5 6
Solution :
a. The year is the independent variable x, since it is fixed, and the response
variable y is the diameter.
Year (x)    Diameter (y)    xy    x²    Y    (y - Y)²
n = 6
$$ \bar{x} = \frac{\sum x}{n} = \frac{21}{6} = 3.5 $$
$$ \bar{y} = \frac{\sum y}{n} = \frac{26.4}{6} = 4.4 $$
$$ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = 1.223 $$
$$ a = \bar{y} - b\bar{x} = 4.4 - 1.223(3.5) = 0.120 $$
Computing Y:
Y = a + bx
a. The regression equation is
Y = 0.12 + 1.223x
At the mean year, x = 3.5, this gives Y = 0.12 + 1.223(3.5) = 4.40.
Therefore the regression line passes through the point $(\bar{x}, \bar{y}) = (3.5, 4.4)$.
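Once the regression equation is known, prediction is a single substitution. The sketch below evaluates Y = 0.12 + 1.223x at the mean year and, as a hypothetical illustration, at year 7. The function name predict_diameter is an assumption, and predicting beyond the six observed years presumes the linear trend continues.

```python
# Coefficients taken from the fitted equation Y = 0.12 + 1.223x in the text.
INTERCEPT = 0.12
SLOPE = 1.223


def predict_diameter(year):
    """Predicted trunk diameter for a given year of growth."""
    return INTERCEPT + SLOPE * year


at_mean = predict_diameter(3.5)  # the mean year
next_year = predict_diameter(7)  # hypothetical: one year past the data

print(f"Y at x = 3.5: {at_mean:.4f}")
print(f"Y at x = 7:   {next_year:.3f}")
```

The value at x = 3.5 is 4.4005, which rounds to the 4.40 in the solution. The year-7 figure is an extrapolation and should be read with that caveat.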
$$ \sum e^2 = \sum (y - Y)^2 = 0.1919 \quad \text{(from the table)} $$
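The sum of squared errors is found by subtracting each predicted Y from the observed y, squaring, and adding. The yearly diameters are not listed in this section, so the sketch below applies the same steps to the administrative capability data from the first example. The variable names are assumptions introduced for illustration.

```python
# Data from the first example (x = administrative capability, y = productivity)
xs = [4, 4, 3, 5, 3, 3, 4, 5, 5, 4]
ys = [4, 3, 3, 4, 5, 4, 4, 4, 5, 5]
n = len(xs)

# Fit Y = a + bx with the same slope and intercept formulas used in the text
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * (sum_x / n)

# Predicted value Y for each observed x, then the squared errors (y - Y)^2
predicted = [a + b * x for x in xs]
squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(ys, predicted)]
sse = sum(squared_errors)

print(f"sum of squared errors = {sse:.4f}")
```

A smaller sum means the observed points lie closer to the regression line. Filling in the Y and (y - Y)² columns of a table by hand follows exactly these steps.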
Exercise 8
1. An FX service operator wants to determine the length of time it takes to
transport passengers from within Manila to NAIA during non-peak times. A sample of 9
trips on a particular day during non-peak times yielded the following:
x y
10 19.75
11 18.1
12 21.9
14 24.1
15 27.15
16 22.95
18 29.4
20 37.25
24 40.5
2. The kilometer-per-liter (km/L) figures for a new engine are recorded for fixed
speeds between 56 and 112 km/hr.
Speed(km/hr) Mileage (km/L)
x y
56 14.7
104 13.2
64 14.5
88 13.2
112 12.8
96 13.4
84 13.3
68 14.5
80 13.8
100 13
60 14.6
72 4.3