
Statistical Inference and Hypothesis Testing

By Associate Prof. Idris Ahmed Aliyu


• We could randomly select a sample of five employees at ABU, get the
number of years of service for each sampled employee and compute the
mean years of service;

• We could then use the sample mean to estimate the mean years of
service for all employees;

• In other words, we use a sample statistic to estimate a population
parameter
What is statistical inference?
• Statistical inference is the process of drawing conclusions about populations.
Technically, inference may be defined as the selection of a probabilistic
model to resemble the process you wish to investigate;
• In statistical inference the procedure is to test the validity of a statement
about a population parameter;
• What is a hypothesis?
• A Hypothesis is a statement about the value of a population parameter
developed for the purpose of testing; i.e. a statement that is yet to be proven;
• Hypothesis testing is a procedure, based on sample evidence and
probability theory, used to determine whether the hypothesis is a
reasonable statement and should not be rejected;
Example
• The manufacturer of a mobile handset claims that the average recharge
period for the battery of its newly launched handset is 7 days, beyond
which it has to be recharged. As a person who travels frequently, Mr R
found the claim interesting but wanted to be assured that it is true.
He then formulated the following statements:

• H0: µ = 7 (i.e. the average recharge period is 7 days);

• H1: µ ≠ 7 (i.e. the average recharge period is not 7 days)


Steps of Hypothesis Testing
• Step 1: State the null hypothesis (H0) and the alternate hypothesis (H1);

• Step 2: Select the significance level, α;

• Step 3: Select the test statistic (i.e. obtain the sampling distribution of the statistic 𝜃 when H0 is true);

• Step 4: Formulate the decision rule, i.e. find the critical value that marks the rejection region;

• Step 5: Compute the test statistic, compare it with the critical value, and make a decision
• Step 1: State the Null Hypothesis (H0) and the Alternate Hypothesis
(H1)
• For example, suppose that the Packaging Department of Dangote
Cement is concerned that some bags of cement are significantly
overweight. The bags are packaged at 50kg, so

• the null hypothesis is that the mean weight of a bag of cement is
less than or equal to 50kg.
• The null hypothesis would be written H0: µ ≤ 50, while the alternate
hypothesis would be H1: µ > 50;

• Note that under a “one-tailed test” the rejection region is in either the
right or the left tail of the curve;
One-tail vs. Two-tail Test

Left-tail or Right-tail Test?
• The direction of the test involving claims that use the words “has improved”,
“is better than”, and the like will depend upon the variable being measured.
• For instance, if the variable involves the time for a certain medication to take
effect, the words “better”, “improve” or “more effective” are translated as “<”
(less than, i.e. faster relief).
• On the other hand, if the variable refers to a test score, then the words “better”,
“improve” or “more effective” are translated as “>” (greater than, i.e. higher
test scores).

Keywords                 Inequality Symbol   Part of
Larger (or more) than    >                   H1
Smaller (or less)        <                   H1
No more than             ≤                   H0
At least                 ≥                   H0
Has increased            >                   H1
Is there difference?     ≠                   H1
Has not changed          =                   H0
Important Things to Remember about H0 and H1

• H0 and H1 are mutually exclusive and collectively exhaustive


• H0 is always presumed to be true
• H1 has the burden of proof
• A random sample (n) is used to “reject H0”
• If we conclude 'do not reject H0', this does not necessarily mean that the null
hypothesis is true, it only suggests that we don’t have sufficient evidence to
reject H0; rejecting the null hypothesis then, suggests that the alternative
hypothesis may be true.
• Equality is always part of H0 (e.g. “=” , “≥” , “≤”).
• “≠” “<” and “>” always part of H1

Step 2: Select the Significance Level
• The next step is to state the level of significance. The level of
significance is the probability of rejecting the null hypothesis when it is
true.
• The level of significance is designated 𝛼 the Greek letter alpha. It is also
sometimes called the level of risk.
• There is no one level of significance that is applied to all tests. A decision
is made to use the .05 level (often stated as the 5 percent level), the .01
level, the .10 level, or any other level between 0 and 1. Traditionally, the
.05 level is selected for studies that have to do with human behaviour, and
the .01 level for quality assurance.
Is it possible to reject a true hypothesis?
• Suppose you are given a contract by an organisation to supply computer
circuit boards and the quality control department of the organisation
specifies that it will take a sample of all incoming shipments with a
specification that if more than 6% of the sample is defective the shipment
will be rejected.

• H0: The incoming shipment of circuit boards contains ≤ 6% substandard
circuit boards

• H1: The incoming shipment of circuit boards contains > 6% substandard
circuit boards
• A sample of 50 circuit boards revealed that 4 boards, or 8 percent,
were substandard. The shipment was rejected because it violated the
specification of the quality control department;

• However, suppose the 4 substandard circuit boards selected in the sample
of 50 were the only substandard boards in the shipment of 4,000 boards. Then
only 0.1% of the shipment is defective, so rejecting the shipment was an
error (Type I error). The probability of committing a Type I error is the level
of significance. A Type I error is simply rejecting H0 when it is true.
Step 3: Select the Test Statistic
• There are many test statistics. They include z, t, F and chi-square.
• TEST STATISTIC: A value, determined from sample information,
used to determine whether to reject the null hypothesis.
• In hypothesis testing for the mean when the standard deviation is
known, the test statistic z is computed by:

• $Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

• The z value is based on the sampling distribution of $\bar{X}$, which follows the
normal distribution with a mean ($\mu_{\bar{X}}$) equal to $\mu$ and a standard
deviation equal to $\sigma / \sqrt{n}$ (i.e. the standard error);
• We can thus determine whether the difference between the sample mean
and the population mean is statistically significant by finding the number of
standard errors the sample mean lies away from the population mean, using the Z formula.
• When the sample size n ≥ 30 and σ is known, use the z-distribution
• When the sample size n ≥ 30 and σ is unknown, use the z-distribution
• When the sample size n < 30 and σ is known, use the z-distribution
• When the sample size n < 30 and σ is unknown, use the t-distribution
Two Tailed Test
Confidence Level Critical Value
0.90 1.645
0.91 1.70
0.92 1.75
0.93 1.81
0.94 1.88
0.95 1.96
0.96 2.05
0.97 2.17
0.98 2.33
0.99 2.575
One-Tailed Test
Confidence Level Critical Value
0.90 1.282
0.95 1.645
0.975 1.960
0.99 2.326
0.995 2.576
0.999 3.090
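The tabulated critical values can be reproduced numerically. The sketch below is illustrative only: it uses `statistics.NormalDist` from the Python standard library, and the helper name `z_critical` is an assumption introduced here, not something from the slides.

```python
from statistics import NormalDist

def z_critical(confidence: float, two_tailed: bool = True) -> float:
    """Standard normal critical value for a given confidence level."""
    alpha = 1.0 - confidence
    # A two-tailed test splits alpha equally between both tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)

for level in (0.90, 0.95, 0.99):
    two = round(z_critical(level), 3)
    one = round(z_critical(level, two_tailed=False), 3)
    print(f"confidence {level}: two-tailed {two}, one-tailed {one}")
```

Small differences from the tables in the third decimal place (for example 2.576 against 2.575) come from rounding in printed tables.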
Step 4: Formulate the Decision Rule
• A decision rule is a statement of the specific conditions under
which the null hypothesis is rejected and the conditions under
which it is not rejected. The region or area of rejection defines the
location of all those values that are so large or so small that the
probability of their occurrence under a true null hypothesis is
rather remote.

• CRITICAL VALUE: The dividing point between the region
where the null hypothesis is rejected and the region where it is
not rejected.
Step 5: Make a Decision

Compute the test statistic and compare it to the critical value, and
make a decision to reject or not to reject the null hypothesis.

Parts of a Distribution in Hypothesis Testing

One Sample Test: Testing for a Population Mean with a
Known Population Standard Deviation
ABC wood assembly factory produces desks and other office equipment for
both public and private organisations. The weekly production of the Model 325
desk at the Zaria Plant follows a normal probability distribution with a mean
of 200 and a standard deviation of 16. Recently, because of market expansion,
new production methods have been introduced and new employees hired. A random
sample of 50 weeks of desk production was selected, and the sample mean is
203.5. The manager of the company would like to investigate whether there has
been a change in the weekly production of the Model 325 desk. Is the mean
number of desks produced at the Zaria Plant different from 200 at the .01
significance level?
Testing for a Population Mean with a Known Population Standard Deviation

Step 1: State the null hypothesis and the alternate hypothesis.


H0: µ = 200
H1: µ ≠ 200
(note: keyword in the problem “has changed”)

Step 2: Select the level of significance.


α = 0.01 as stated in the problem

Step 3: Select the test statistic.


Use Z-distribution since σ is known

Step 4: Formulate the decision rule.
Reject H0 if $|Z| > Z_{\alpha/2}$

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{203.5 - 200}{16 / \sqrt{50}} = 1.55, \qquad Z_{\alpha/2} = Z_{.01/2} = 2.58$$

1.55 is not > 2.58

Step 5: Make a decision and interpret the result.


Because 1.55 does not fall in the rejection region, H0 is not rejected. We
conclude that the population mean is not different from 200. So we would
report to the manager that the sample evidence does not show that the
production rate at the Zaria Plant has changed from 200 per week.
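The five steps for the desk example can be checked with a few lines of Python. This is a minimal sketch using only the numbers given in the problem; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 200      # hypothesized population mean (H0: mu = 200)
sigma = 16     # known population standard deviation
n = 50         # number of sampled weeks
x_bar = 203.5  # sample mean
alpha = 0.01   # significance level

# Test statistic: distance of the sample mean from mu0 in standard errors.
z = (x_bar - mu0) / (sigma / sqrt(n))

# Two-tailed critical value: alpha is split between both tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```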
One Tailed Test
Suppose in the previous problem the manager wants to know whether there
has been an increase in the number of units assembled. To put it another
way, can we conclude, because of the improved production methods, that
the mean number of desks assembled in the last 50 weeks was more than
200?
Recall: µ = 200, σ = 16, n = 50, α = .01
Step 1: State the null hypothesis and the alternate hypothesis.
H0: µ ≤ 200
H1: µ > 200
(note: keyword in the problem “an increase”)
Step 2: Select the level of significance.
α = 0.01 as stated in the problem
Step 3: Select the test statistic.
Use Z-distribution since σ is known

Step 4: Formulate the decision rule.
Reject H0 if Z > Zα, where Zα = Z.01 = 2.33

Step 5: Make a decision and interpret the result.


Because 1.55 does not fall in the rejection region, H0 is not rejected. We conclude
that the average number of desks assembled in the last 50 weeks is not more than
200
p-Value in Hypothesis Testing
• In recent years, the availability of computer software has made it possible
to obtain additional information concerning the strength of the rejection or
non-rejection. That is, how confident are we in rejecting the null
hypothesis?
• p-VALUE is the probability of observing a sample value as
extreme as, or more extreme than, the value observed, given that
the null hypothesis is true.
• In testing a hypothesis, we can also compare the p-value with the
significance level (α);
• If the p-value is < the significance level, H0 is rejected; otherwise H0 is not
rejected.

What does it mean when p-value < α?

(a) .10, we have some evidence that H0 is not true.

(b) .05, we have strong evidence that H0 is not true.

(c) .01, we have very strong evidence that H0 is not true.

(d) .001, we have extremely strong evidence that H0 is not true.

p-Value in Hypothesis Testing - Example
Recall the last problem where the hypotheses and decision rule were set up as:
H0: µ ≤ 200
H1: µ > 200
Reject H0 if Z > Zα
where Z = 1.55 and Zα = 2.33
Reject H0 if p-value < α
0.0606 is not < 0.01
Conclude: Fail to reject H0
The probability of finding a z value of 1.55 or more is
.0606, found by 1 - 0.9394. To put it another way, the
probability of obtaining a sample mean greater than 203.5 if the
population mean is equal to 200 is .0606. For a two-tailed test, the p-value =
2(0.0606) = 0.1212.
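The p-value lookup can also be done in software instead of a z-table. The sketch below assumes the rounded test statistic z = 1.55 from the example and uses the standard normal CDF from Python's standard library.

```python
from statistics import NormalDist

z = 1.55      # rounded test statistic from the desk example
alpha = 0.01  # significance level

# One-tailed p-value: area under the standard normal curve to the right of z.
p_one_tailed = 1 - NormalDist().cdf(z)

# Two-tailed p-value: the same area counted in both tails.
p_two_tailed = 2 * p_one_tailed

print(f"one-tailed p = {p_one_tailed:.4f}, two-tailed p = {p_two_tailed:.4f}")
print("reject H0" if p_one_tailed < alpha else "fail to reject H0")
```

The two-tailed figure may differ from 0.1212 in the last digit, because the slide doubles the already rounded value 0.0606.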
Testing for the Population Mean: When the Population
Standard Deviation is Unknown
• When the population standard deviation (σ) is unknown, the
sample standard deviation (s) is used in its place. The t statistic
is then used as the test statistic, computed using the formula:

• $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$ with $n - 1$ degrees of freedom,

• where:
• $\bar{x}$ is the sample mean;
• $\mu$ is the hypothesized population mean;
• $s$ is the sample standard deviation;
• $n$ is the number of observations in the sample
Testing for the Population Mean: Population Standard Deviation Unknown
The claims department of ABC insurance company reports that the mean cost to
process a claim is N60. An industry comparison shows this amount to be
larger than that of most other insurance companies. As such, the company
instituted cost-cutting measures. To evaluate the effect of the cost-cutting
measures, the manager randomly took a sample of 26 claims processed
last month. The sample information is reported below:
45 49 62 40 43 61
48 53 67 63 78 64
48 54 51 56 63 69
58 51 58 59 56 57
38 76

At the 0.01 significance level, is it reasonable to conclude that the mean cost to process a claim
is now less than N60?

Note: When you compute the sample mean, you will get 56.42
Solution:

Step 1: State the null hypothesis and the alternate hypothesis.


𝐻0 : 𝜇 ≥ 60
𝐻1 : 𝜇 < 60
(note: keyword in the problem “now less than”)

Step 2: Select the level of significance;


α = 0.01 as stated in the problem

Step 3: Select the test statistic.


Use t-distribution since σ is unknown

t-Distribution Table (portion)

Testing for the Population Mean: Population Standard Deviation Unknown

Step 4: Formulate the decision rule.

Reject H0 if computed t < $-t_{\alpha,\,n-1}$, i.e. t < $-t_{.01,\,25}$ = -2.485

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{56.42 - 60}{10.041 / \sqrt{26}} = \frac{-3.58}{1.969} = -1.818$$

Step 5: Make a decision and interpret the result.

Because -1.818 does not fall in the rejection region, H0 is not rejected at the .01 significance
level.
We have not demonstrated that the cost-cutting measures reduced the mean cost per claim
to less than 60. The difference of -3.58 (56.42 - 60) between the sample mean and the
hypothesized population mean could be due to sampling error.
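The sample mean, sample standard deviation and t statistic for the 26 claims can be recomputed from the raw data. This is a sketch, not a full t-test routine: the critical value 2.485 is hard-coded from a t-table (one-tailed, α = 0.01, 25 degrees of freedom) because the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# The 26 sampled claim costs from the problem statement.
claims = [45, 49, 62, 40, 43, 61, 48, 53, 67, 63, 78, 64, 48,
          54, 51, 56, 63, 69, 58, 51, 58, 59, 56, 57, 38, 76]

mu0 = 60            # hypothesized mean cost (H0: mu >= 60)
n = len(claims)
x_bar = mean(claims)
s = stdev(claims)   # sample standard deviation (divides by n - 1)

t = (x_bar - mu0) / (s / sqrt(n))

t_crit = 2.485      # assumption: read from a t-table, alpha = .01, df = 25
reject_h0 = t < -t_crit  # left-tailed test

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.3f}, t = {t:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Working from unrounded values gives a t of about -1.82, in agreement with the hand calculation, which rounds the mean to 56.42 first.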

Testing for a Population Mean with an Unknown Population Standard Deviation-
Example

The current rate for producing 5 amp fuses at Hyper Electric Co.
is 250 per hour. A new machine has been purchased and
installed that, according to the supplier, will increase the
production rate. A sample of 10 randomly selected hours from
last month revealed the mean hourly production on the new
machine was 256 units, with a sample standard deviation of 6
per hour. At the .05 significance level, can Hyper Electric
conclude that the new machine is faster?
Testing for a Population Mean with an Unknown Population Standard Deviation - Example continued

Step 1: State the null and the alternate hypothesis.


H0: µ ≤ 250; H1: µ > 250

Step 2: Select the level of significance.


It is .05.
Step 3: Find a test statistic. Use the t distribution because the population standard
deviation is not known and the sample size is less than 30.

Step 4: State the decision rule.

There are 10 - 1 = 9 degrees of freedom. The null hypothesis is rejected if t > 1.833.

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{256 - 250}{6 / \sqrt{10}} = 3.162$$

Step 5: Make a decision and interpret the results.


The null hypothesis is rejected. The mean number produced is more than 250 per hour.
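When only summary statistics are available, as in the fuse example, the same t formula applies directly. A short sketch follows; the critical value 1.833 is the one quoted in Step 4 and is entered as a constant.

```python
from math import sqrt

mu0 = 250     # current production rate (H0: mu <= 250)
x_bar = 256   # sample mean on the new machine
s = 6         # sample standard deviation
n = 10        # number of sampled hours

t = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

t_crit = 1.833           # from Step 4: alpha = .05, one-tailed, df = 9
reject_h0 = t > t_crit   # right-tailed test

print(f"t = {t:.3f} with {df} degrees of freedom, reject H0: {reject_h0}")
```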

Problem
• Suppose that it is known from experience that the standard deviation
of the weight of 8-ounce packages of cookies made by a certain
bakery is 0.16 ounce. To check whether its production is under
control, a random sample of 25 packages was selected, and the
mean weight computed for the sample is 8.091 ounces. The
baker stands to lose money when the mean is greater than 8, and the
customer loses out when the mean is less than 8. Test the null
hypothesis that the mean equals 8 at the 0.01 significance level.
T- Tests: Comparing Population Means
Comparing two populations – Some Examples
• Imagine you want to find out whether people who read statistics books
are better at analyzing data than those who do not.
1. You set up an experiment with two groups (A and B), group A was
allowed to read a statistics book while group B was not. The two
groups were asked to analyze the same data.
2. You run an experiment with one group only, test their analysis skills,
make them read a statistics book, and then test their skills again.
3. You find some people who read statistics books and other people
who do not and compare their analysis skills
All the 3 problems would end up with one variable that tells you
whether a person reads statistics books or not—a dichotomous variable
that defines group membership—and one continuous variable that
summarizes people’s analysis skills (or statistics performance).
t-test continues
• The above research question concerning whether people who
read statistics books are better at analyzing data can be
answered using a t-test.
• A t-test takes the average score of one group as a reference
point and determines whether a second group’s average score
differs from it by estimating the distance between
the two means and comparing that distance to the variance in the
data.
• Put simply, a t-test calculates the difference between two group means and
compares it to the average distance of all data points from the mean
Types of t-test
1. Independent sample t-test: This compares the average scores of two
independent samples.
• With respect to the research question raised, you can address the
question as follows:
i. Set up two groups, one experimental group and one control group
ii. Get people who read statistics books and other people who do not, and
compare their analysis skills
2. Paired sample t-test: This is where the experiment is run with
one group only. Get a group and test their ability at data analysis, then
allow them to read a statistics book and test their ability again, i.e. test
them before the reading and test them after it.
One Sample t-test
• The one-sample test is used to determine whether a sample comes from
a population with a specific mean. This population mean is not always
known, but is sometimes hypothesized.
• Though it compares two means, it does so based on data from one sample.
For example, you allow a group to read a statistics book, give them a
test and compare their mean score with the available national average score.
• A one-sample test cannot distinguish between groups; rather, it detects
a difference between the mean of a selected dependent variable and a
specified mean.
• Your sample would be the group that read statistics books and your
population mean would be the national average score.
t -Test of Hypothesis
• Assumptions
• When you choose to analyze your data using a one-sample t-test,
ensure that the data meet the following assumptions:
1. The dependent variable should be measured at the interval or ratio
level;
2. The data are independent (i.e., not correlated/related), which means
that there is no relationship between the observations. This is more
of a study design issue than something you can test for;
3. There should be no significant outliers;
4. The dependent variable should be approximately normally
distributed.
ID Read stats books Analytic skills
1 No 32
2 Yes 61
3 Yes 93
4 No 65
5 Yes 75
6 No 70
7 No 62
8 No 61
9 Yes 65
10 Yes 78
11 Yes 94
12 Yes 96
13 No 46
14 Yes 79
... ... ...
Two Sample test: Independent Sample
• The samples are from independent populations.

• The formula for computing the value of z, used if the sample sizes are ≥ 30
or if σ1 and σ2 are known, is:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
EXAMPLE 1
The U-Scan facility was recently installed at the Shoprite Mall in Abuja.
The store manager would like to know if the mean checkout time using
the standard checkout method is longer than using the U-Scan. The
manager gathered the following sample information. The time is
measured from when the customer enters the line until their bags are in
the cart. Hence the time includes both waiting in line and checking out.
Test the null hypothesis at 0.01 significance level.
Customer Type    Sample Mean     Population STD    Sample Size
Standard         5.50 minutes    0.40 minutes      50
U-Scan           5.30 minutes    0.30 minutes      100
EXAMPLE 1 continued

Step 1: State the null and alternate hypotheses.


H0: µS ≤ µU
H1: µS > µU

Step 2: State the level of significance.


The .01 significance level is stated in the problem.

Step 3: Find the appropriate test statistic.


Because both samples are more than 30, we can use z-distribution as the
test statistic.

Example 1 continued
Step 4: State the decision rule.
Reject H0 if Z > Zα, i.e. Z > Z.01 = 2.33
Example 1 continued
Step 5: Compute the value of z and make a decision

$$z = \frac{\bar{X}_s - \bar{X}_u}{\sqrt{\dfrac{\sigma_s^2}{n_s} + \dfrac{\sigma_u^2}{n_u}}} = \frac{5.5 - 5.3}{\sqrt{\dfrac{0.40^2}{50} + \dfrac{0.30^2}{100}}} = \frac{0.2}{0.064} = 3.13$$

The computed value of 3.13 is larger than the critical value of 2.33. Our
decision is to reject the null hypothesis. The difference of .20 minutes between
the mean checkout times of the standard method and the U-Scan method is too
large to have occurred by chance. We conclude the U-Scan method is faster.
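The two-sample z computation for the checkout example can be sketched in Python as follows, using only the figures from the table in the problem. The function name `two_sample_z` is introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sigma1, n1, mean2, sigma2, n2):
    """z statistic for the difference of two independent sample means."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (mean1 - mean2) / standard_error

# Standard checkout versus U-Scan (times in minutes).
z = two_sample_z(5.50, 0.40, 50, 5.30, 0.30, 100)

alpha = 0.01
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tailed test
reject_h0 = z > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

Without intermediate rounding the statistic comes out at about 3.12; the 3.13 in the hand calculation results from rounding the denominator to 0.064. The decision is the same either way.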

Comparing Population Means with Unknown Population Standard Deviations (the
Pooled t-test)

The t distribution is used as the test statistic if one or more of the
samples have fewer than 30 observations. The required
assumptions are:
1. Both populations must follow the normal distribution.
2. The populations must have equal standard deviations.
3. The samples are from independent populations.
Finding the value of the test statistic requires two steps.

1. Pool the sample standard deviations.

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

2. Use the pooled standard deviation in the formula.

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
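The two pooling steps can be sketched as a small Python function. For illustration only, it is applied to the 14 rows listed in the "Read stats books / Analytic skills" table earlier in these notes; that table is truncated, so the result describes the listed rows only and is not the outcome of the full study. The function name `pooled_t` is an assumption made here.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Step 1: pool the two sample variances, weighted by n - 1.
    s2_pooled = ((n1 - 1) * variance(sample1)
                 + (n2 - 1) * variance(sample2)) / df
    # Step 2: use the pooled variance in the t formula.
    t = (mean(sample1) - mean(sample2)) / sqrt(s2_pooled * (1 / n1 + 1 / n2))
    return t, df

# Analytic-skill scores from the 14 listed rows only (table is truncated).
readers = [61, 93, 75, 65, 78, 94, 96, 79]   # "Yes" rows
non_readers = [32, 65, 70, 62, 61, 46]       # "No" rows

t, df = pooled_t(readers, non_readers)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The computed t would then be compared with the tabulated critical value for the chosen significance level and degrees of freedom.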
Comparing Population Means with Unknown Population Standard Deviations (the
Pooled t-test)
Owens Lawn Care, Inc., manufactures and assembles lawnmowers
that are shipped to dealers throughout the United States and Canada.
Two different procedures have been proposed for mounting the
engine on the frame of the lawnmower. The question is: Is there a
difference in the mean time to mount the engines on the frames of the
lawnmowers?
The first procedure was developed by longtime Owens employee
Herb Welles (designated as procedure 1), and the other procedure
was developed by Owens Vice President of Engineering William
Atkins (designated as procedure 2). To evaluate the two methods, it
was decided to conduct a time and motion study.

A sample of five employees was timed using the Welles method and
six using the Atkins method. The results, in minutes, are shown on
the right. Is there a difference in the mean mounting times? Use the
.10 significance level.

Comparing Population Means with Unknown Population Standard Deviations (the Pooled t-test) -
Example
Step 1: State the null and alternate hypotheses.
H0: µ1 = µ2
H1: µ1 ≠ µ2

Step 2: State the level of significance. The .10 significance level is stated in
the problem.

Step 3: Find the appropriate test statistic.


Because the population standard deviations are not known but are
assumed to be equal, we use the pooled t-test.

Step 4: State the decision rule.

Reject H0 if $t > t_{\alpha/2,\,n_1+n_2-2}$ or $t < -t_{\alpha/2,\,n_1+n_2-2}$
t > t.05,9 or t < -t.05,9
t > 1.833 or t < -1.833

Step 5: Compute the value of t and make a decision

(a) Calculate the sample standard deviations

The computed value of t is -0.662. The decision is not to reject the null
hypothesis, because -0.662 falls in the
region between -1.833 and 1.833.

We conclude that there is no
difference in the mean times to mount
the engine on the frame using the two
methods.
Comparing Population Means with Unequal Population Standard Deviations

If it is not reasonable to assume the population
standard deviations are equal, then we compute
the t-statistic as:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

The sample standard deviations s1 and s2 are
used in place of the respective population
standard deviations.
In addition, the degrees of freedom are adjusted
downward by a rather complex approximation
formula:

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

The effect is to reduce the number of
degrees of freedom in the test, which will
require a larger value of the test statistic to
reject the null hypothesis.
Comparing Population Means with Unequal Population Standard
Deviations
Personnel in a consumer testing laboratory are evaluating the absorbency
of paper towels. They wish to compare a set of store brand towels to a
similar group of name brand ones. For each brand they dip a ply of the
paper into a tub of fluid, allow the paper to drain back into the vat for two
minutes, and then evaluate the amount of liquid the paper has taken up
from the vat. A random sample of 9 store brand paper towels absorbed the
following amounts of liquid in milliliters.
8 8 3 1 9 7 5 5 12
An independent random sample of 12 name brand towels absorbed the
following amounts of liquid in milliliters:
12 11 10 6 8 9 9 10 11 9 8 10
Use the .10 significance level and test if there is a difference in the mean
amount of liquid absorbed by the two types of paper towels.

Comparing Population Means with Unequal Population Standard Deviations

Step 1: State the null and alternate hypotheses.


There is no difference in the mean amount of liquid
absorbed between the two types of paper towels.
H0: µ1 = µ2
H1: µ1 ≠ µ2

Step 2: State the level of significance.


The .10 significance level is stated in the problem.

Step 3: Find the appropriate test statistic.


use unequal variances t-test

Comparing Population Means with Unequal Population Standard Deviations -
Example

Step 4: State the decision rule.

Reject H0 if
$t > t_{\alpha/2,\,d.f.}$ or $t < -t_{\alpha/2,\,d.f.}$
t > t.05,10 or t < -t.05,10
t > 1.812 or t < -1.812

Step 5: Compute the value of t and make a decision

The computed value of t is less than the lower critical value, so our
decision is to reject the null hypothesis. We conclude that the
mean absorption rate for the two towels is not the same.
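The computation behind this decision can be sketched in Python from the towel data in the problem. The degrees-of-freedom expression below is the usual Welch-Satterthwaite approximation, which is assumed here to be the "rather complex approximation formula" the slides refer to; the result is truncated to a whole number before the t-table lookup.

```python
from math import sqrt
from statistics import mean, variance

store = [8, 8, 3, 1, 9, 7, 5, 5, 12]                 # store brand, ml absorbed
name = [12, 11, 10, 6, 8, 9, 9, 10, 11, 9, 8, 10]    # name brand, ml absorbed

n1, n2 = len(store), len(name)
v1 = variance(store) / n1   # s1^2 / n1
v2 = variance(name) / n2    # s2^2 / n2

# Unequal-variance t statistic.
t = (mean(store) - mean(name)) / sqrt(v1 + v2)

# Welch-Satterthwaite degrees of freedom (assumed approximation), truncated.
df = int((v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1)))

t_crit = 1.812  # from Step 4: alpha/2 = .05, df = 10
reject_h0 = abs(t) > t_crit

print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```

The statistic is about -2.47 with 10 degrees of freedom, which lies below -1.812 and is consistent with the decision to reject the null hypothesis.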

Two-Sample Tests of Hypothesis: Dependent Samples
Dependent samples are samples that are paired or related in some
fashion.
For example:
• If you wished to buy a car you would look at the same car at two
(or more) different dealerships and compare the prices.
• If you wished to measure the effectiveness of a new diet you
would weigh the dieters at the start and at the finish of the
program.

Comparing Dependent and Independent Samples
• Before and after studies: Suppose you want to measure whether the
production output of workers is influenced by music. First you select a
sample of workers and measure their output under the current
environmental conditions. Then you install the musical device and
again measure the output of the same workers
• Pairing observations: Assume you want to evaluate which teaching
method is best; you apply 2 different methods to the same sample and compare
the scores.
Why do we prefer dependent samples to
independent samples?
• By using dependent samples we are able to reduce the variation in
sampling distributions.
Hypothesis Testing Involving Paired Observations
Use the following test when the samples are dependent:

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

Where
$\bar{d}$ is the mean of the differences
$s_d$ is the standard deviation of the differences
$n$ is the number of pairs (differences)
Hypothesis Testing Involving Paired
Observations - Example
• ASO Savings and Loan wishes to
compare the two companies it uses to
appraise the value of residential
homes. The company selected a
sample of 10 residential properties
and scheduled both firms for an
appraisal. The results, reported in
Naira are shown on the table (right).
• At the .05 significance level, can we
conclude there is a difference in the
mean appraised values of the homes?

Hypothesis Testing Involving Paired
Observations - Example
Step 1: State the null and alternate hypotheses.
H0: µd = 0
H1: µd ≠ 0

Step 2: State the level of significance.


The .05 significance level is stated in the problem.

Step 3: Find the appropriate test statistic.


We will use the t-test

Hypothesis Testing Involving Paired Observations -
Example
Step 4: State the decision rule.
Reject H0 if
$t > t_{\alpha/2,\,n-1}$ or $t < -t_{\alpha/2,\,n-1}$
t > t.025,9 or t < -t.025,9
t > 2.262 or t < -2.262
Hypothesis Testing Involving Paired Observations -
Example
Step 5: Compute the value of t and make a decision

The computed value of t is greater
than the higher critical value, so our
decision is to reject the null
hypothesis. We conclude that there
is a difference in the mean
appraised values of the homes.
