
Chapter 9

Preliminary Concepts on Statistical Inference

In descriptive statistics, the use of single measures to describe a set of data or a distribution was
introduced. These are the measures of central tendency and the measures of variability. The measures of
central tendency include the mean, the median and the mode, while the range, quartile range, mean
deviation, variance, standard deviation and coefficient of variation are the measures of variability.
The other branch of statistics is inferential statistics. This branch enables us to make estimates
of population values, called parameters, and to make statements about computed statistics that are
acceptable to some degree of confidence. The statistical method concerned with making estimates of
population values is called statistical inference. This method helps us determine how accurate and
acceptable our generalizations are.

Statistics plays an important role in the field of applied scientific research. To get data or
information about a population, the researcher may use only a portion of it in order to reduce the
cost and time required. Statistics offers varied tools and techniques that help the researcher draw
reliable and valid inferences or generalizations about the population using the sample as basis.

At this point, certain basic concepts need to be clarified in order to understand and appreciate
statistical inference better. The following sections discuss the two sub-areas of inferential
statistics, namely statistical estimation and the test of hypothesis.

Statistical Estimation

In statistical estimation, we consider a population and a sample. Recall that a population is
an aggregate of persons, objects, events, places and actions to certain stimuli that have a unique pattern
of qualities. Sometimes, this is referred to as the universe in statistical investigation. A sample, on
the other hand, is a portion or a smaller part of the population that truly represents the unique
qualities or characteristics of the population. The acceptability of the sample depends on how well the
sampling technique has been selected and employed.

For example, in a study on the adjustment problems of freshmen students from the
Department of Liberal Arts, the researchers considered all the students coming from all seven (7)
programs. However, they opted not to use all the students on account of their large number. They used
Slovin's formula to determine the appropriate sample size. Stratified proportional sampling was
employed to determine the number of students per program. A Personal Adjustment inventory test
was administered, and they used the test results to report the problems encountered regarding the
adjustment of the entire group of freshmen students belonging to the department.
Inferential statistics helps facilitate our work. Imagine, for instance, that the population of the
department cited in the example above was 1,000 students distributed among seven programs. It would
be extremely laborious for the researchers to involve all the students. They can make their work easier
by drawing a representative sample from each program that is proportional to the population size of
that program. Applying Slovin's formula with a 5% margin of error, the total sample size will be only
286 students. And yet, they can report the results as reflective of the adjustment of all the
freshmen students.
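
The sample size quoted above can be checked with a short sketch. It assumes the usual form of Slovin's formula, n = N / (1 + N e^2), which the text names but does not write out. The function names and the per-program enrolment figures are hypothetical and serve only to illustrate proportional allocation.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Slovin's formula n = N / (1 + N * e^2), rounded up to a whole respondent."""
    return math.ceil(population_size / (1 + population_size * margin_of_error ** 2))

def proportional_allocation(strata_sizes, total_sample):
    """Give each stratum a share of the sample proportional to its population size."""
    population = sum(strata_sizes.values())
    return {name: round(total_sample * size / population)
            for name, size in strata_sizes.items()}

# 1,000 students and a 5% margin of error, as in the example above.
n = slovin_sample_size(1000, 0.05)
print(n)  # 286

# Hypothetical enrolment per program (illustrative only, not from the study).
programs = {"A": 250, "B": 200, "C": 150, "D": 150, "E": 100, "F": 100, "G": 50}
print(proportional_allocation(programs, n))
```

Because each stratum is rounded separately, the allocated counts may differ from the total by a unit or two, so a final manual adjustment is sometimes needed.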

Parameter and Statistics

Statistical inference also deals with the concepts of parameters and statistics, which are
involved in estimation and, further, in hypothesis testing.

Parameters exist whether they are computed or not. In practice, these parameters are the
attributes of a population. The numerical descriptive measures of central tendency such as the mean μ,
median μ̃ and mode μ̂, and the measures of variation including the variance σ² and the standard
deviation σ, are generally not known and are estimated from samples drawn through probability sampling
techniques. The Greek symbols used here are read as mu for the central measures and sigma for the
measures of variation.

Statistics are the measures computed from the sample. The sample mean is symbolized as 𝑥̅
and the sample standard deviation as s. The term statistics is synonymous with estimates.

Sampling Methods Revisited

The degree to which a particular statistic approximates its corresponding parameter value
depends upon how impartially we have drawn our sample. Sampling theory is based on the theoretical
use of the word “random”.
Random sampling, which is the most commonly used sampling technique, has two properties.
The first is equiprobability, which means that each member of the population has an equal chance of
being drawn and included in the sample. If, for instance, there are 500 members in the population, the
probability of each member being drawn is 1/500. This is especially true when sampled cases are
replaced or returned to the original pool.

The second property is independence. This means that the chance of one member being drawn
does not affect the chances of the other members getting chosen. For example, in a population where
there are father and son members not necessarily paired, when the father is drawn, this does not mean
that the son will automatically be included in the sample. Hence, the selection of the father is
independent of the inclusion of the son in this sample selection.
We know from Chapter 2 that sampling can be broken down into two types. The first is
nonprobability sampling, where there is no way of estimating the probability that each individual or
element will be included in the sample. Examples of this type are accidental sampling, quota
sampling and purposive sampling. Convenience and economy are the advantages of this type.

The second type is probability sampling, wherein every individual has an equal chance of
becoming a part of the sample. Examples of this type include simple random sampling,
stratified random sampling, cluster sampling, systematic sampling and multistage sampling. Note that
whenever the sampling is not carried out as in probability sampling, the result is a
biased sample.

The sampling error is the difference between a parameter value and its corresponding statistic.
Suppose that someone administered an intelligence test to a random sample of incoming freshmen in
a certain college and computed the mean. This statistic, the sample mean 𝑥̅ , is an estimate of the
parameter mean μ. The sampling error, denoted as e, is the difference between the population mean
and the sample mean, thus μ - 𝑥̅ = e.

Point and Interval Estimation

There are two types of estimator: the point estimator and the interval estimator. The
point estimator is a rule or formula that gives a single value, computed from the gathered
information, for estimating a particular parameter. An example of a point estimator is the sample
mean 𝑥̅ , which estimates the value of the population mean, μ. The interval estimator is a
rule or formula that gives a set of values, computed from the gathered information, indicating the
range of values in which the parameter to be estimated will lie.

Standard Error

The standard error of the distribution of the means is denoted by SE(𝑥̅ ). If it were possible to
draw a sample from a population 100 times and compute the mean of each sample,
we would be able to compute 100 means. And if we were to construct a frequency distribution of these
100 means, this distribution would be referred to as a sampling distribution of the means.
Furthermore, if we compute the standard deviation of this distribution, the value we get is
the standard error of the means. In simple terms, the standard error is the standard deviation of the
sampling distribution of the means. It is an index number that guides us in making certain conclusions
concerning how far or how close the mean of a given sample is from the measure we would get had we
involved the entire population.
The formula for obtaining the standard error of the sample mean is given by:

\[ SE(\bar{x}) = \frac{s}{\sqrt{n}} \tag{1} \]

The acceptability of the representativeness of a particular sample is determined by the
magnitude of the standard error. Furthermore, the formula above indicates that the magnitude of the
standard error of the distribution of the means depends on two measures. The first is the standard
deviation, or variability of scores around the mean, and the second is the size of the sample, or the
number of cases being studied.
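
The idea of drawing 100 samples and taking the standard deviation of their means can be tried out in a small simulation. This is only a sketch: the population is synthetic (normal with mean 100 and standard deviation 15, a hypothetical choice), and the sample size of 25 is likewise assumed for illustration.

```python
import math
import random
import statistics

def standard_error(s, n):
    """Formula (1): sample standard deviation divided by the square root of n."""
    return s / math.sqrt(n)

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 scores.
population = [random.gauss(100, 15) for _ in range(10_000)]

# Draw 100 samples of size 25 (with replacement) and record each sample mean.
sample_means = [statistics.mean(random.choices(population, k=25))
                for _ in range(100)]

# The standard deviation of the 100 means approximates the standard error.
empirical_se = statistics.stdev(sample_means)

# Formula (1) applied to a single sample should give a similar value.
one_sample = random.choices(population, k=25)
formula_se = standard_error(statistics.stdev(one_sample), len(one_sample))

print(round(empirical_se, 2), round(formula_se, 2))
```

Both numbers should land near 15/√25 = 3, and increasing the sample size in either calculation shrinks the standard error.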

Hypothesis Testing

The goal of hypothesis testing is not to question the computed value of the sample statistic but
to make a judgment about the difference between the sample statistic and a hypothesized population
parameter.
Hypothesis testing enables a researcher to generalize about a population from relatively small
samples. In many instances, a researcher can only rely on the information provided by a part of the
population.
A hypothesis is a tentative explanation for certain events, phenomena or behaviors. It is a
statement predicting the relationship between or among variables. It is also the most specific
statement of a problem, in which the variables are considered measurable and the statement
specifies how these variables are related. Furthermore, this statement is testable, which means that
the relationship between the variables can be put to the test by applying an appropriate
statistical test to the data gathered about the variables.

Null and Alternative Hypothesis

There are two kinds of hypotheses: the null hypothesis and the alternative hypothesis. The null
hypothesis, which is denoted as H0, is the statement of equality indicating that no relationship exists
between the variables under study. This statement is tested for the purpose of being accepted or
rejected. Examples of a null hypothesis are given below.

Example 1 The Mathematics ability test scores of the control group do not differ from those of the
experimental group.
Example 2 The job performance of a group of employees working in a class A hotel is independent
of their working condition.
Example 3 The scholastic competition among Freshmen students has no relationship with their
academic achievement.
Example 4 There is no difference in the college entrance examination scores obtained by the
students in the public and private schools.

The alternative hypothesis, which is denoted as Ha, is also termed the research hypothesis. It
is a statement of the expectation derived from the theory under study. When it specifies only the
existence of a difference, it is termed a non-directional alternative hypothesis. Examples of a
non-directional alternative hypothesis are given below.
Example 5 The Mathematics ability test scores of the control group differ from those of the
experimental group.
Example 6 The job performance of a group of employees working in a class A hotel is related to
their working condition.
Example 7 The scholastic competition among Freshmen students has a relationship with their
academic achievement.
Example 8 There is a difference in the college entrance examination scores obtained by the
students in the public and private schools.

The alternative hypothesis can also be a predictive hypothesis, which specifies that one group
performs better than the other, and is then termed a directional alternative hypothesis.

Example 9 The Mathematics ability test scores of the control group are lower than those of the
experimental group.
Example 10 High scores on the mental ability test correspond to high scores on the self-concept
test.
Example 11 Exposure to time pressure has a negative effect on students' reading
comprehension skills.
Example 12 The brand of cellular phone used by college students in XYZ University has a positive
effect on developing one's self-image.

Directional and Non-directional Tests of Hypothesis

The non-directional test of hypothesis is also referred to as the two-tailed test. It makes use of
the two opposite sides or tails of the statistical model or distribution. This indicates that no assertion
is made as to whether the difference falls within the positive or negative end of the distribution. The
illustration of this test at 5% level of significance is presented in Figure 9.1.

The directional test of hypothesis is also referred to as the one-tailed test. It makes use of only
one side or tail of the statistical model or distribution, and can be a left-tailed or a right-tailed test.
This indicates that an assertion is made as to whether the difference falls within the positive end of the
distribution (right-tailed test) or the negative end (left-tailed test). The illustration of the
right-tailed test at 5% level of significance is presented in Figure 9.2 and the left-tailed test at the
same 5% level of significance is presented in Figure 9.3.

Critical Region and Critical Value

The critical region is a set of values of the test statistic (computed from the gathered data set)
that is chosen before the experiment to define the conditions under which the null hypothesis will be
rejected.
The critical value or values separate the critical region from the values of the test statistic that
would not lead to the rejection of the null hypothesis. The critical values depend on the nature of the
null hypothesis, the relevant sampling distribution and the level of significance.
In two-tailed tests, the level of significance α is divided equally between the two tails that
constitute the critical region. For example, in a two-tailed test with a significance level of α = 5%,
there is an area of 2.5% in each of the two tails, as shown in Figure 9.1. On the other hand, for
one-tailed tests, the whole level of significance constitutes the critical region, which is found on
either the left or the right tail of the distribution.
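
How the tail area translates into a critical value can be sketched for the standard normal distribution with Python's standard library. The helper name z_critical is hypothetical; it simply looks up the point that leaves the required area in the upper tail.

```python
from statistics import NormalDist

def z_critical(alpha, tails=2):
    """Positive critical value of the standard normal distribution.

    A two-tailed test puts alpha/2 in each tail; a one-tailed test puts
    all of alpha in a single tail.
    """
    tail_area = alpha / 2 if tails == 2 else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(round(z_critical(0.05, tails=2), 3))  # 1.96  (2.5% in each tail)
print(round(z_critical(0.05, tails=1), 3))  # 1.645 (5% in one tail)
print(round(z_critical(0.01, tails=2), 3))  # 2.576 (0.5% in each tail)
```

For a left-tailed test the critical value is the negative of the one-tailed value, since the standard normal distribution is symmetric.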

Type I and Type II Errors

In testing the null hypothesis, our conclusion is either to reject or to accept it. Correct decisions
happen when we reject a null hypothesis that is false or when we accept a null hypothesis that is
true. Otherwise, the decisions are wrong; that is, when we reject a true null hypothesis or when we
accept a false null hypothesis. These two possible ways of making a wrong decision give two
different types of error in statistical decision making. These errors are not miscalculations or
procedural missteps. They are actual errors that can occur when a rare event happens by chance.
The first type is the type I error. This is the error of rejecting the null hypothesis when it is
true. The probability of committing a type I error is referred to as the significance level and is
denoted by the Greek symbol alpha (α). The common values for α are 1%, 5% and 10%.
The second type is the type II error. This is the error of failing to reject the null hypothesis
when it is false. The probability of committing a type II error is denoted by the Greek symbol beta (β).

Confidence Interval

In the previous section, we discussed the chance of committing a type I error, which is denoted
as α. This is also referred to as the level of significance. A confidence level is denoted as 1 – α, which
represents the chance of accepting the null hypothesis when in fact it is true. This is usually attached to
the notion of interval estimation, in which we attach a degree of certainty that the parameter we are
estimating will lie in the interval between the lower and upper bounds. The common values used for
the confidence level are 90%, 95% and 99%.
A confidence interval is constructed when we attach a confidence level to the interval estimate
of a particular parameter. A general formula for constructing a 100 x (1 – α)% confidence interval for the
parameter being estimated is given by:

Estimator ± S.E.(estimator) x critical value (2)

where the critical value is the one we obtain from the tables, which depends on the nature of the null
hypothesis, the relevant sampling distribution and the level of significance. For confidence
intervals, we only consider a critical value whose level of significance is α/2.

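
Formula (2) can be sketched as a small function. The sketch assumes the critical value comes from the standard normal distribution; when a t-table applies, the tabular value would be used instead. The function name and the figures in the usage lines (an estimate of 50 with a standard error of 2) are hypothetical.

```python
from statistics import NormalDist

def confidence_interval(estimate, std_error, confidence=0.95):
    """Formula (2): estimator +/- S.E.(estimator) x critical value.

    The critical value is the normal value that leaves alpha/2 in each tail.
    """
    alpha = 1 - confidence
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    margin = std_error * critical
    return estimate - margin, estimate + margin

# Hypothetical estimate and standard error, for illustration only.
lower, upper = confidence_interval(50, 2, confidence=0.95)
print(round(lower, 2), round(upper, 2))  # 46.08 53.92
```

Raising the confidence level to 99% widens the interval, because the critical value grows from about 1.96 to about 2.576.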
Steps in Performing Hypothesis Testing

The following are the steps in performing hypothesis testing:


1. State the null and the alternative hypothesis of the given problem.

2. Determine the level of significance and the direction of the test. The direction is based on
whether the alternative hypothesis is stated as a left-tailed, right-tailed or two-tailed test.

3. Determine the appropriate statistical test based on the level of measurement of the gathered
data.

4. Write the decision rule expressing how we accept or reject the null hypothesis.

5. Compute the test statistic and compare it with the critical value. The test statistic determines
whether the null hypothesis will be rejected or accepted.

6. State the decision based on the computed value when compared to the critical value.

7. Write the conclusion for the given problem.

Testing Hypothesis for the Mean for Single Sample Case

Following the procedures for testing the hypothesis for a single mean, we have a
hypothesized mean (μ0). Symbolically, the null hypothesis is written as:

Ho: μ = μ0 (3)

stating that the mean is equal to the hypothesized mean. On the other hand, the alternative
hypothesis is written symbolically as:

Ha: μ ≠ μ0 (4)

Ha: μ > μ0 (5)

Ha: μ < μ0 (6)

The expression in (4) is used when the alternative hypothesis is non-directional and hence
undergoing a two-tailed test. The remaining expressions (5) and (6) are used when the alternative
hypothesis is directional and hence undergoing a one-tailed test, which is either a right-tailed or a
left-tailed test respectively.

The decision rule is stated as follows: Reject the null hypothesis if the absolute value of the test
statistic exceeds the critical value. Otherwise, accept the null hypothesis.

Population Variance is Known

In order to draw inference on a mean in the one-population case, assuming that the entries are
normally distributed and the variance is known, we use the Z-test. The Z-statistic, Zc, is the test
statistic used to decide on the rejection of the null hypothesis in favor of the alternative hypothesis.
This is computed as follows:

\[ Z_c = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \tag{7} \]

where 𝑥̅ is the computed mean of the gathered data, μ0 is the hypothesized mean, σ is the population
standard deviation which is known or given and n is the sample size.

The critical value is obtained using the Z-tabular value located in Appendix 2. For a two-sided
test we consider the value of 1 - α/2, and the critical value is symbolically written as Zα/2. Otherwise,
for a one-sided test, we consider the value of 1 – α and the critical value is symbolically written as Zα.

Example 1 A random sample of 100 students enrolled in a Statistics course under Professor XYZ
shows that the average grade in the midterm examination is 85%. Professor XYZ claims
that the average grade of the students in the midterm examination is at least 80%, with a
standard deviation of 16%. Is there evidence to say that the claim of Professor XYZ
is correct at 5% level of significance?

Solution:

We first state symbolically the null hypothesis as Ho: μ = 80%, which means that
the average grade of Professor XYZ's students is 80%, and the alternative
hypothesis as Ha: μ > 80%, which means that the average grade of Professor XYZ's
students is greater than 80%. Since the claim in the given problem places the difference at the
positive end of the distribution, we consider this a one-tailed test. Thus, our decision
rule is to reject the null hypothesis if ǀZcǀ > Zα. Otherwise, accept the null hypothesis.

The test statistic together with the Z-tabular value is computed as follows:

\[ Z_c = \frac{0.85 - 0.80}{0.16 / \sqrt{100}} = \frac{0.05}{0.016} = 3.125 \]

So that ǀZcǀ = 3.125 against the Z-tabular value of Z0.05 = 1.645. Our decision
based on this computation is to reject the null hypothesis in favor of the alternative
hypothesis. Thus, we conclude that the claim of Professor XYZ is true since the average
grade of his students in Statistics is greater than 80% at 5% level of significance.
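
The computation in Example 1 can be reproduced with a short sketch of the one-sample Z-test. The function name and its tail argument are hypothetical conveniences; the critical value is taken from the standard normal distribution rather than from the appendix table.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="right"):
    """Return (Zc, critical value, reject?) for a one-sample Z-test, formula (7)."""
    z_c = (sample_mean - mu0) / (sigma / math.sqrt(n))
    if tail == "two":
        critical = NormalDist().inv_cdf(1 - alpha / 2)
        reject = abs(z_c) > critical
    else:
        critical = NormalDist().inv_cdf(1 - alpha)
        # Right-tailed tests reject for large Zc, left-tailed for small Zc.
        reject = z_c > critical if tail == "right" else z_c < -critical
    return z_c, critical, reject

# Example 1: sample mean 85%, hypothesized mean 80%, sigma 16%, n = 100.
z_c, critical, reject = one_sample_z_test(0.85, 0.80, 0.16, 100,
                                          alpha=0.05, tail="right")
print(round(z_c, 3), round(critical, 3), reject)  # 3.125 1.645 True
```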

Example 2 A random sample of 100 recorded deaths in the U.S. during the past year showed an average
life span of 71.8 years. Assuming a population standard deviation of 8.9 years, does this seem to
indicate that the mean life span today is greater than 70 years? Use a 0.05 level of significance.

1.) HO: µ = 70
    HA: µ > 70
2.) Z-Test (right-tailed)
3.) α = 0.05
    1 - 0.05 = 0.95
    Ztab = 1.645
4.) Reject HO if ZC > Ztab
5.) \[ Z_c = \frac{(71.8 - 70)\sqrt{100}}{8.9} = 2.02 \]

    P(z > 2.02) = 1 – P(z < 2.02)
                = 1 – 0.9783
                = 0.0217

    **Note: The P-value is the lowest level of significance at which the observed value of the
    test statistic is significant.

6.) Zc > Ztab; Reject HO
7.) The mean life span today is greater than 70 years.
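
The P-value in Example 2 can be checked with the standard normal cumulative distribution function. The helper name z_p_value is hypothetical; the two-tailed branch doubles the one-tail area.

```python
from statistics import NormalDist

def z_p_value(z, tail="right"):
    """P-value of an observed Z statistic under the standard normal distribution."""
    cdf = NormalDist().cdf
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    return 2 * (1 - cdf(abs(z)))  # two-tailed: double the one-tail area

# Example 2: Zc = 2.02 in a right-tailed test.
p = z_p_value(2.02, tail="right")
print(round(p, 4))  # 0.0217
```

Since 0.0217 is below the 0.05 level of significance, the P-value leads to the same decision as the comparison of Zc with the tabular value.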

Example 3 A manufacturer of sports equipment has developed a new synthetic fishing line that he
claims has a mean breaking strength of 8 kg with σ = 0.5 kg. Test the hypothesis that µ = 8 kg against
µ ≠ 8 kg if a random sample of 50 lines is tested and found to have a mean breaking strength of 7.8 kg.
Use a 0.01 level of significance.

1.) HO: µ = 8
    HA: µ ≠ 8
2.) Z-Test (two-tailed)
3.) α = 0.01
    α/2 = 0.005
    Ztab1 = -2.575
    Ztab2 = 2.575
4.) Reject HO if ZC < Ztab1
    Reject HO if ZC > Ztab2
5.) \[ Z_c = \frac{(7.8 - 8)\sqrt{50}}{0.5} = -2.828 \]

    P-value = 2 P(z < -2.828) = 2(0.0023) = 0.0046

6.) Zc < Ztab1; Reject HO
7.) The mean breaking strength is not equal to 8 kg.

Two-Sample Mean Test

\[ Z_c = \frac{X_1 - X_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

Where:

X1, X2 = sample means of groups 1 and 2

σ1², σ2² = variances of groups 1 and 2

n1, n2 = sample sizes of groups 1 and 2

Example 1 An admission test was administered to incoming freshmen in 2 colleges. Two independent
samples of 150 students each are randomly selected and the mean scores of the given samples are X1 =
88 and X2 = 85. Assume that the variances of the test scores are 40 and 35 respectively. Is the
difference between the mean scores significant or can it be attributed to chance? Use a 0.01 level of
significance.

1.) HO: µ1 = µ2
    HA: µ1 ≠ µ2
2.) Z-Test (two-tailed)
3.) α = 0.01
    Ztab1 = -2.575
    Ztab2 = 2.575
4.) Reject HO if ZC < Ztab1
    Reject HO if ZC > Ztab2
5.) \[ Z_c = \frac{88 - 85}{\sqrt{\dfrac{40}{150} + \dfrac{35}{150}}} = 4.2426 \]
6.) Zc > Ztab2; Reject HO
7.) There is a significant difference between the mean scores, and it cannot be attributed to chance.
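
The two-sample statistic for this admission test example can be verified with a few lines. The function name is hypothetical, and the critical value is taken from the standard normal distribution instead of the appendix table.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Z statistic for the difference of two means with known variances."""
    return (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)

# Means 88 and 85, variances 40 and 35, 150 students per sample.
z_c = two_sample_z(88, 85, 40, 35, 150, 150)
critical = NormalDist().inv_cdf(1 - 0.01 / 2)  # two-tailed, alpha = 0.01

print(round(z_c, 4))          # 4.2426
print(abs(z_c) > critical)    # True, so the null hypothesis is rejected
```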

Population Variance is Unknown

In order to draw inference on a mean in the one-population case, assuming that the entries are
normally distributed but the variance is unknown, we use the t-test. The test statistic used to decide
on the rejection of the null hypothesis in favor of the alternative hypothesis is the t-statistic, tc,
which is computed as follows:

\[ t_c = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

where 𝑥̅ is the computed mean of the gathered data, μ0 is the hypothesized mean, 𝑠 is the sample
standard deviation computed from the gathered data and n is the sample size.

The critical value is obtained using the t-tabular value located in Appendix 5. For a two-sided
test, we look under the column of α/2 and the row of df = n - 1, where df refers to the degrees of
freedom; this is symbolically written as tα/2(n−1). Otherwise, for a one-sided test, we look under the
column of α and the row of df; this is symbolically written as tα(n−1).

Example 1 A random sample of 100 students enrolled in a Statistics course under
Professor XYZ shows that the average grade in the midterm examination is 85%
with a computed standard deviation of 25%. Professor XYZ claims that the
average grade of the students in the midterm examination is at least 80%. Test
the claim of the professor at 5% level of significance.

Solution:

We first state symbolically the null hypothesis as Ho: μ = 80%, which
means that the average grade of Professor XYZ's students is 80%, and the
alternative hypothesis as Ha: μ ≠ 80%, which means that the average grade of
Professor XYZ's students is not 80%. Since the given problem does not assert
whether the difference falls within the positive or negative end, we
consider this a non-directional test. Thus, our decision rule is to reject the
null hypothesis if ǀtcǀ > tα/2(n−1). Otherwise, accept the null hypothesis.

The test statistic together with the t-tabular value is computed as
follows:

\[ |t_c| = \left| \frac{0.85 - 0.80}{0.25 / \sqrt{100}} \right| = \left| \frac{0.05}{0.025} \right| = 2.000 \]

So that ǀtcǀ = 2.000 against the t-tabular value of t0.025(99) = 1.960. Notice
that the rows of the t-table run only up to 30 and are then followed by INF
(infinity), which is used for values greater than 30.

Our decision based on this computation is to reject the null hypothesis
in favor of the alternative hypothesis. Thus, we conclude that the claim of
Professor XYZ is not true since the average grade obtained by his students in
Statistics is not 80%.

A 100 x (1-α)% confidence interval is constructed whenever the null
hypothesis of the two-tailed test is rejected. Otherwise, this confidence interval
will not be constructed.

In order to determine the possible values within which the true average grade
will lie, we construct a 95% confidence interval. Using formula (2), we have
the estimator for the mean, which is the average grade of the students, 85%,
the tabular value of 1.960, and the standard error of the estimate, given by
0.25/√100 = 0.025, which is equivalent to 2.5%.

Using the results above, the resulting confidence interval is
given as 85% ± (2.5%)(1.96) = 85% ± 4.9%. That is, with an attached 95%
confidence coefficient, the true average grade obtained by Professor XYZ's
students lies within 80.1% to 89.9%, an interval that is entirely above the
hypothesized value of 80%.
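
The t-test and the confidence interval for the grades example can be sketched together. Python's standard library has no t-distribution, so the tabular value 1.960 used in the example is passed in by hand; the function name is hypothetical.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """Return (tc, standard error) for a one-sample t-test."""
    std_error = s / math.sqrt(n)
    return (sample_mean - mu0) / std_error, std_error

# Sample mean 85%, hypothesized mean 80%, s = 25%, n = 100.
t_c, std_error = one_sample_t(0.85, 0.80, 0.25, 100)

t_tab = 1.960                     # tabular value quoted in the example
reject = abs(t_c) > t_tab         # two-tailed decision rule

# Formula (2): estimator +/- S.E.(estimator) x critical value.
lower = 0.85 - std_error * t_tab
upper = 0.85 + std_error * t_tab

print(round(t_c, 3), reject)              # 2.0 True
print(round(lower, 3), round(upper, 3))   # 0.801 0.899
```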

Example 2 A random sample of 25 female high school students shows that their
average body mass index (BMI) is about 18 points with a standard deviation of
4.5 points. Test the hypothesis that the average BMI of female high school
students is lower than 19 points at 5% level of significance.

Solution:

The null hypothesis is stated as Ho: μ = 19, against the alternative
hypothesis Ha: μ < 19. The given problem places the difference at the
negative end of the distribution, so we consider this a left-tailed test.
Thus, our decision rule is to reject the null hypothesis if ǀtcǀ > tα(n−1).
Otherwise, accept the null hypothesis. The t-statistic together with the
t-tabular value is computed as follows:

\[ |t_c| = \left| \frac{18 - 19}{4.5 / \sqrt{25}} \right| = \left| \frac{-1}{0.9} \right| = 1.111 \]

versus

\[ t_{0.05}(25 - 1) = t_{0.05}(24) = 1.711 \]

Our decision based on this computation is to accept the null hypothesis.
Thus, we conclude that there is not enough evidence that the average BMI of
female high school students is lower than 19 points.

Example 3 The manager of a car rental agency claims that the average mileage of cars rented is less
than 8000. A sample of 5 automobiles has an average mileage of 7723, with a standard deviation of 500
miles. At α = 0.01, is there enough evidence to support the manager's claim?

1.) HO: µ = 8000
    HA: µ < 8000
2.) T-Test (left-tailed)
3.) α = 0.01
    V = n-1 = 5-1 = 4
    Ttab = -3.747
4.) Reject HO if TC < Ttab
5.) \[ t_c = \frac{(7723 - 8000)\sqrt{5}}{500} = -1.239 \]
6.) Since tc > ttab; Accept HO
7.) There is not enough evidence to support the manager's claim that the average mileage is less
    than 8000.

In testing two small samples:

\[ t_c = \frac{X_1 - X_2}{\sqrt{\left( \dfrac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2} \right) \left( \dfrac{n_1 + n_2}{n_1 n_2} \right)}} \]

with n1 + n2 - 2 degrees of freedom.

Example 4 Two samples are randomly selected from two groups of students who have been taught
using different teaching methods. An examination is given and the results are shown below.

Group 1        Group 2
n1 = 8         n2 = 10
X1 = 85        X2 = 87
S1² = 46       S2² = 66

Using a 0.05 level of significance, can we conclude that the two different teaching
methods are equally effective?

1.) HO: µ1 = µ2
    HA: µ1 ≠ µ2
2.) T-Test (two-tailed test)
3.) α = 0.05
    V = n1+n2-2 = 8+10-2 = 16
    Ttab1 = -2.120
    Ttab2 = 2.120
4.) Reject HO if TC < Ttab1
    Reject HO if TC > Ttab2
5.) \[ t_c = \frac{85 - 87}{\sqrt{\left( \dfrac{8(46) + 10(66)}{8 + 10 - 2} \right) \left( \dfrac{8 + 10}{8(10)} \right)}} = -0.526 \]
6.) Since Ttab1 < tc < Ttab2; Accept HO
7.) There is no significant difference between the two teaching methods; they can be regarded as
    equally effective.
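
The small-sample statistic can be sketched as a function for checking hand computations. The sketch assumes the pooled form in which the combined variance term (n1·S1² + n2·S2²)/(n1 + n2 − 2) is multiplied by (n1 + n2)/(n1·n2); the function name is hypothetical, and the inputs are the summary figures tabulated for Groups 1 and 2.

```python
import math

def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """t statistic for two small independent samples (assumed pooled form)."""
    pooled_variance = (n1 * var1 + n2 * var2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_variance * (n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / std_error

# Group 1: n = 8, mean 85, variance 46.  Group 2: n = 10, mean 87, variance 66.
t_c = pooled_t(85, 87, 46, 66, 8, 10)
degrees_of_freedom = 8 + 10 - 2

t_tab = 2.120  # two-tailed tabular value at alpha = 0.05 with 16 df
print(round(t_c, 3), degrees_of_freedom, abs(t_c) > t_tab)
```

A statistic whose absolute value stays below the tabular value falls outside the critical region, which corresponds to accepting the null hypothesis.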

EXERCISES

1. Suppose that an allergist wishes to test the hypothesis that at least 30% of the public is allergic to
some cheese products. Explain how the allergist could commit

(a) a type I error;

(b) a type II error.

Answer: ((a) Conclude that fewer than 30% of the public are allergic to some cheese products when, in
fact, 30% or more are allergic.

(b) Conclude that at least 30% of the public are allergic to some cheese products when, in fact,
fewer than 30% are allergic.)

2. A sociologist is concerned about the effectiveness of a training course designed to get more drivers to
use seat belts in automobiles.

(a) What hypothesis is she testing if she commits a type I error by erroneously concluding that the
training course is ineffective?

(b) What hypothesis is she testing if she commits a type II error by erroneously concluding that the
training course is effective?

Answer: ((a) The training course is effective.

(b) The training course is effective.)

3. A large manufacturing firm is being charged with discrimination in its hiring practices.

(a) What hypothesis is being tested if a jury commits a type I error by finding the firm guilty?

(b) What hypothesis is being tested if a jury commits a type II error by finding the firm guilty?

Answer: ((a) The firm is not guilty.

(b) The firm is guilty.)

4. The proportion of adults living in a small town who are college graduates is estimated to be p = 0.6.
To test this hypothesis, a random sample of 15 adults is selected. If the number of college graduates in
our sample is anywhere from 6 to 12, we shall not reject the null hypothesis that p = 0.6; otherwise, we
shall conclude that p ≠ 0.6.

(a) Evaluate α assuming that p = 0.6. Use the binomial distribution.

(b) Evaluate β for the alternatives p = 0.5 and p = 0.7.

(c) Is this a good test procedure?

Answer: ((a) α = P(X ≤ 5 | p = 0.6)+P(X ≥ 13 | p = 0.6) = 0.0338+(1− 0.9729) = 0.0609.

(b) β = P(6 ≤ X ≤ 12 | p = 0.5) = 0.9963 − 0.1509 = 0.8454.

β = P(6 ≤ X ≤ 12 | p = 0.7) = 0.8732 − 0.0037 = 0.8695.

(c) This test procedure is not good for detecting differences of 0.1 in p.)
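
The probabilities in this answer can be checked without tables by summing binomial terms directly. This is a sketch using only the standard library; the helper names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for a binomial random variable with n trials and success rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def prob_not_reject(low, high, n, p):
    """P(low <= X <= high): the chance that X falls in the fail-to-reject region."""
    return binom_cdf(high, n, p) - binom_cdf(low - 1, n, p)

# Fail-to-reject region 6 <= X <= 12 with n = 15.
alpha = 1 - prob_not_reject(6, 12, 15, 0.6)    # type I error when p = 0.6
beta_05 = prob_not_reject(6, 12, 15, 0.5)      # type II error when p = 0.5
beta_07 = prob_not_reject(6, 12, 15, 0.7)      # type II error when p = 0.7

print(round(alpha, 4), round(beta_05, 4), round(beta_07, 4))
```

The exact sums may differ from the tabulated answers in the last decimal place, because the tables are themselves rounded to four places.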

5. Repeat Exercise 4 when 200 adults are selected and the fail-to-reject region is defined to be
110 < x < 130, where x is the number of college graduates in our sample. Use the normal approximation.

Answer: ((a) α = P(X < 110 | p = 0.6) + P(X > 130 | p = 0.6) = P(Z < −1.52) + P(Z > 1.52)
= 2(0.0643) = 0.1286.

(b) β = P(110 < X < 130 | p = 0.5) = P(1.34 < Z < 4.31) = 0.0901.

β = P(110 < X < 130 | p = 0.7) = P(−4.71 < Z < −1.47) = 0.0708.

(c) The probability of a Type I error is somewhat high for this procedure, although
Type II errors are reduced dramatically.)

6. A fabric manufacturer believes that the proportion of orders for raw material arriving late is p = 0.6.

If a random sample of 10 orders shows that 3 or fewer arrived late, the hypothesis that p = 0.6 should be
rejected in favor of the alternative p < 0.6. Use the binomial distribution.

(a) Find the probability of committing a type I error if the true proportion is p = 0.6.

(b) Find the probability of committing a type II error for the alternatives p = 0.3, p = 0.4, and p = 0.5.

Answer: ((a) α = P(X ≤ 3 | p = 0.6) = 0.0548.

(b) β = P(X > 3 | p = 0.3) = 1 − 0.6496 = 0.3504.


β = P(X > 3 | p = 0.4) = 1 − 0.3823 = 0.6177.

β = P(X > 3 | p = 0.5) = 1 − 0.1719 = 0.8281.)

7. Repeat Exercise 6 when 50 orders are selected and the critical region is defined to be x ≤ 24, where
x is the number of orders in our sample that arrived late. Use the normal approximation.

Answer: ((a) α = P(X ≤ 24 | p = 0.6) = P(Z < − 1.59) = 0.0559.

(b) β = P(X > 24 | p = 0.3) = P(Z > 2.93) = 1 − 0.9983 = 0.0017.

β = P(X > 24 | p = 0.4) = P(Z > 1.30) = 1 − 0.9032 = 0.0968.

β = P(X > 24 | p = 0.5) = P(Z > − 0.14) = 1 − 0.4443 = 0.5557.)

8. An electrical firm manufactures light bulbs that have a lifetime that is approximately normally
distributed with a mean of 800 hours and a standard deviation of 40 hours. Test the hypothesis that μ =
800 hours against the alternative μ ≠ 800 hours if a random sample of 30 bulbs has an average life of 788
hours. Use a P-value in your answers.
Answer: (The hypotheses are

H0 : μ = 800,

H1 : μ ≠ 800.

Now, $z = \frac{788 - 800}{40/\sqrt{30}} = -1.64$, and P-value = 2P(Z < −1.64) = (2)(0.0505) = 0.1010.

Hence, the mean is not significantly different from 800 for α < 0.101.)
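As a rough check of this two-sided z test, the statistic and P-value can be computed directly. The sketch below is illustrative only; the helper name `phi` and the variable names are assumptions. Because z is kept unrounded, the P-value comes out near 0.100 rather than the table-based 0.1010.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


mu0, sigma, n, xbar = 800, 40, 30, 788

# Test statistic for a mean with known population standard deviation
z = (xbar - mu0) / (sigma / sqrt(n))

# Two-sided P-value: probability of a value at least this extreme in either tail
p_value = 2 * phi(-abs(z))

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```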

9. A random sample of 64 bags of white Cheddar popcorn weighed, on average, 5.23 ounces with a
standard deviation of 0.24 ounces. Test the hypothesis that μ = 5.5 ounces against the alternative
hypothesis, μ < 5.5 ounces at the 0.05 level of significance.
Answer: (The hypotheses are

H0 : μ = 5.5,

H1 : μ < 5.5.

The White Cheddar Popcorn, on average, weighs less than 5.5oz.)


10. In a research report by Richard H. Weindruch of the UCLA Medical School, it is claimed that mice
with an average life span of 32 months will live to be about 40 months old when 40% of the calories in
their food are replaced by vitamins and protein. Is there any reason to believe that μ < 40 if 64 mice
that are placed on this diet have an average life of 38 months with a standard deviation of 5.8 months?
Use a P-value in your conclusion.
Answer: (The hypotheses are

H0 : μ = 40 months,

H1 : μ < 40 months.

Decision: reject H0.)

10.25 Test the hypothesis that the average content of containers of a particular lubricant is 10 liters if
the contents of a random sample of 10 containers are 10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, and
9.8 liters. Use a 0.01 level of significance and assume that the distribution of contents is normal.
Answer: (The hypotheses are

H0 : μ = 10,

H1 : μ ≠ 10.

Decision: Fail to reject H0.)

10.26 According to a dietary study, a high sodium intake may be related to ulcers, stomach cancer, and
migraine headaches. The human requirement for salt is only 220 milligrams per day, which is surpassed
in most single servings of ready-to-eat cereals. If a random sample of 20 similar servings of a certain
cereal has a mean sodium content of 244 milligrams and a standard deviation of 24.5 milligrams, does
this suggest at the 0.05 level of significance that the average sodium content for a single serving of such
cereal is greater than 220 milligrams? Assume the distribution of sodium contents to be normal.
Answer: (The hypotheses are

H0 : μ = 220 milligrams,

H1 : μ > 220 milligrams.

Decision: Reject H0 and claim μ > 220 milligrams.)

10.27 A study at the University of Colorado at Boulder shows that running increases the percent resting
metabolic rate (RMR) in older women. The average RMR of 30 elderly women runners was 34.0% higher
than the average RMR of 30 sedentary elderly women and the standard deviations were reported to be
10.5% and 10.2%, respectively. Was there a significant increase in RMR of the women runners over the
sedentary women? Assume the populations to be approximately normally distributed with equal
variances. Use a P-value in your conclusions.
Answer: (The hypotheses are
H0 : μ1 = μ2,

H1 : μ1 > μ2.

Hence, the conclusion is that running increases the mean RMR in older women)

10.28 According to Chemical Engineering an important property of fiber is its water absorbency. The
average percent absorbency of 25 randomly selected pieces of cotton fiber was found to be 20 with a
standard deviation of 1.5. A random sample of 25 pieces of acetate yielded an average percent of 12
with a standard deviation of 1.25. Is there strong evidence that the population mean percent
absorbency for cotton fiber is significantly higher than the mean for acetate. Assume that the percent
absorbency is approximately normally distributed and that the population variances in percent
absorbency for the two fibers are the same. Use a significance level of 0.05.
Answer: (The hypotheses are

H0 : μC = μA,

H1 : μC > μA,

The mean percent absorbency for the cotton fiber is significantly higher than the mean percent
absorbency for acetate.)

10.29 Past experience indicates that the time for high school seniors to complete a standardized test is a
normal random variable with a mean of 35 minutes. If a random sample of 20 high school seniors took
an average of 33.1 minutes to complete this test with a standard deviation of 4.3 minutes, test the
hypothesis at the 0.05 level of significance that μ = 35 minutes against the alternative that μ < 35
minutes.
Answer: (The hypotheses are

H0 : μ = 35 minutes,

H1 : μ < 35 minutes.

Decision: Reject H0 and conclude that it takes less than 35 minutes, on the average, to take the
test.)

10.31 A manufacturer claims that the average tensile strength of thread A exceeds the average tensile
strength of thread B by at least 12 kilograms. To test his claim, 50 pieces of each type of thread are
tested under similar conditions. Type A thread had an average tensile strength of 86.7 kilograms with
known standard deviation of σA = 6.28 kilograms, while type B thread had an average tensile strength
of 77.8 kilograms with known standard deviation of σB = 5.61 kilograms. Test the manufacturer's claim
at α = 0.05.
Answer: (The hypotheses are

H0 : μA − μB = 12 kilograms,
H1 : μA − μB > 12 kilograms.

The average tensile strength of thread A does not exceed the average tensile strength of thread
B by 12 kilograms.)

NONPARAMETRIC TESTS

Nonparametric tests are sometimes called distribution-free tests because they are based on
fewer assumptions (e.g., they do not assume that the outcome is approximately normally
distributed). Parametric tests involve specific probability distributions (e.g., the normal
distribution) and the tests involve estimation of the key parameters of that distribution (e.g.,
the mean or difference in means) from the sample data. The cost of fewer assumptions is that
nonparametric tests are generally less powerful than their parametric counterparts (i.e., when
the alternative is true, they may be less likely to reject H0).

It can sometimes be difficult to assess whether a continuous outcome follows a normal
distribution and, thus, whether a parametric or nonparametric test is appropriate. There are
several statistical tests that can be used to assess whether data are likely from a normal
distribution. The most popular are the Kolmogorov-Smirnov test, the Anderson-Darling test,
and the Shapiro-Wilk test¹. Each test is essentially a goodness of fit test and compares observed
data to quantiles of the normal (or other specified) distribution. The null hypothesis for each
test is H0: Data follow a normal distribution versus H1: Data do not follow a normal distribution.
If the test is statistically significant (e.g., p<0.05), then there is evidence that the data do not
follow a normal distribution, and a nonparametric test is warranted. It should be noted that these tests for normality can be
subject to low power. Specifically, the tests may fail to reject H0: Data follow a normal
distribution when in fact the data do not follow a normal distribution. Low power is a major
issue when the sample size is small - which unfortunately is often when we wish to employ
these tests. The most practical approach to assessing normality involves investigating the
distributional form of the outcome in the sample using a histogram and augmenting that with
data from other studies, if available, that may indicate the likely distribution of the outcome in
the population.

There are some situations when it is clear that the outcome does not follow a normal
distribution. These include situations:

• when the outcome is an ordinal variable or a rank,
• when there are definite outliers, or
• when the outcome has clear limits of detection.

Using an Ordinal Scale


Consider a clinical trial where study participants are asked to rate their symptom severity
following 6 weeks on the assigned treatment. Symptom severity might be measured on a 5
point ordinal scale with response options: Symptoms got much worse, slightly worse, no
change, slightly improved, or much improved. Suppose there are a total of n=20 participants in
the trial, randomized to an experimental treatment or placebo, and the outcome data are
distributed as shown in the figure below.

Distribution of Symptom Severity in Total Sample

The distribution of the outcome (symptom severity) does not appear to be normal as more
participants report improvement in symptoms as opposed to worsening of symptoms.

When the Outcome is a Rank

In some studies, the outcome is a rank. For example, in obstetrical studies an APGAR score is
often used to assess the health of a newborn. The score, which ranges from 0-10, is the sum of
five component scores based on the infant's condition at birth. APGAR scores generally do not
follow a normal distribution, since most newborns have scores of 7 or higher (normal range).

When There Are Outliers

In some studies, the outcome is continuous but subject to outliers or extreme values. For
example, days in the hospital following a particular surgical procedure is an outcome that is
often subject to outliers. Suppose in an observational study investigators wish to assess
whether there is a difference in the days patients spend in the hospital following liver
transplant in for-profit versus nonprofit hospitals. Suppose we measure days in the hospital
following transplant in n=100 participants, 50 from for-profit and 50 from non-profit hospitals.
The number of days in the hospital are summarized by the box-whisker plot below.

Distribution of Days in the Hospital Following Transplant

Note that 75% of the participants stay at most 16 days in the hospital following transplant,
while at least 1 stays 35 days which would be considered an outlier. Recall from page 8 in the
module on Summarizing Data that we used Q1-1.5(Q3-Q1) as a lower limit and Q3+1.5(Q3-Q1) as
an upper limit to detect outliers. In the box-whisker plot above, Q1=12 and Q3=16; thus,
outliers are values below 12-1.5(16-12) = 6 or above 16+1.5(16-12) = 22.
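The outlier limits quoted above follow directly from the quartiles. A minimal Python sketch of the rule is given here for illustration; the function name `outlier_limits` is an assumption, and the multiplier 1.5 is the one stated in the text.

```python
def outlier_limits(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) limits Q1 - k*(Q3 - Q1) and Q3 + k*(Q3 - Q1)."""
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


lower, upper = outlier_limits(12, 16)

# A stay of 35 days, as mentioned in the text, lies above the upper limit
stay = 35
is_outlier = stay < lower or stay > upper

print(lower, upper, is_outlier)
```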

Limits of Detection

In some studies, the outcome is a continuous variable that is measured with some imprecision
(e.g., with clear limits of detection). For example, some instruments or assays cannot measure
presence of specific quantities above or below certain limits. HIV viral load is a measure of the
amount of virus in the body and is measured as the amount of virus per a certain volume of
blood. It can range from "not detected" or "below the limit of detection" to hundreds of
millions of copies. Thus, in a sample some participants may have measures like 1,254,000 or
874,050 copies and others are measured as "not detected." If a substantial number of
participants have undetectable levels, viral load is not normally distributed.

Hypothesis Testing with Nonparametric Tests


In nonparametric tests, the hypotheses are not about population parameters (e.g., μ=50 or
μ1=μ2). Instead, the null hypothesis is more general. For example, when comparing two
independent groups in terms of a continuous outcome, the null hypothesis in a parametric test is
H0: μ1=μ2. In a nonparametric test the null hypothesis is that the two populations are equal;
often this is interpreted as the two populations being equal in terms of their central tendency.

Advantages of Nonparametric Tests

Nonparametric tests have some distinct advantages. With outcomes such as those described above,
nonparametric tests may be the only way to analyze these data. Outcomes that are ordinal, ranked,
subject to outliers or measured imprecisely are difficult to analyze with parametric methods
without making major assumptions about their distributions as well as decisions about coding some
values (e.g., "not detected"). As described here, nonparametric tests can also be relatively
simple to conduct.

Introduction to Nonparametric Testing

This module will describe some popular nonparametric tests for continuous outcomes.
Interested readers should see Conover³ for a more comprehensive coverage of nonparametric
tests.

Key Concept:

Parametric tests are generally more powerful and can test a wider range of alternative
hypotheses. It is worth repeating that if data are approximately normally distributed then
parametric tests (as in the modules on hypothesis testing) are more appropriate. However, there
are situations in which assumptions for a parametric test are violated and a nonparametric test
is more appropriate.

The techniques described here apply to outcomes that are ordinal, ranked, or continuous
outcome variables that are not normally distributed. Recall that continuous outcomes are
quantitative measures based on a specific measurement scale (e.g., weight in pounds, height in
inches). Some investigators make the distinction between continuous, interval and ordinal
scaled data. Interval data are like continuous data in that they are measured on a constant
scale (i.e., there exists the same difference between adjacent scale scores across the entire
spectrum of scores). Differences between interval scores are interpretable, but ratios are not.
Temperature in Celsius or Fahrenheit is an example of an interval scale outcome. The difference
between 30° and 40° is the same as the difference between 70° and 80°, yet 80° is not twice as
warm as 40°. Ordinal outcomes can be less specific as the ordered categories need not be
equally spaced. Symptom severity is an example of an ordinal outcome and it is not clear
whether the difference between much worse and slightly worse is the same as the difference
between no change and slightly improved. Some studies use visual scales to assess participants'
self-reported signs and symptoms. Pain is often measured in this way, from 0 to 10 with 0
representing no pain and 10 representing agonizing pain. Participants are sometimes shown a
visual scale such as that shown in the upper portion of the figure below and asked to choose
the number that best represents their pain state. Sometimes pain scales use visual anchors as
shown in the lower portion of the figure below.

Visual Pain Scale

In the upper portion of the figure, certainly 10 is worse than 9, which is worse than 8; however,
the difference between adjacent scores may not necessarily be the same. It is important to
understand how outcomes are measured to make appropriate inferences based on statistical
analysis and, in particular, not to overstate precision.

Assigning Ranks

The nonparametric procedures that we describe here follow the same general procedure. The
outcome variable (ordinal, interval or continuous) is ranked from lowest to highest and the
analysis focuses on the ranks as opposed to the measured or raw values. For example, suppose
we measure self-reported pain using a visual analog scale with anchors at 0 (no pain) and 10
(agonizing pain) and record the following in a sample of n=6 participants:

7 5 9 3 0 2
The ranks, which are used to perform a nonparametric test, are assigned as follows: First, the
data are ordered from smallest to largest. The lowest value is then assigned a rank of 1, the
next lowest a rank of 2 and so on. The largest value is assigned a rank of n (in this example,
n=6). The observed data and corresponding ranks are shown below:

Ordered Observed Data: 0 2 3 5 7 9


Ranks: 1 2 3 4 5 6

A complicating issue that arises when assigning ranks occurs when there are ties in the sample
(i.e., the same values are measured in two or more participants). For example, suppose that the
following data are observed in our sample of n=6:

Observed Data: 7 7 9 3 0 2

The 4th and 5th ordered values are both equal to 7. When assigning ranks, the recommended
procedure is to assign the mean rank of 4.5 to each (i.e. the mean of 4 and 5), as follows:

Ordered Observed Data: 0 2 3 7 7 9

Ranks: 1 2 3 4.5 4.5 6

Suppose that there are three values of 7. In this case, we assign a rank of 5 (the mean of 4, 5
and 6) to the 4th, 5th and 6th values, as follows:

Ordered Observed Data: 0 2 3 7 7 7


Ranks: 1 2 3 5 5 5

Using this approach of assigning the mean rank when there are ties ensures that the sum of the
ranks is the same in each sample (for example, 1+2+3+4+5+6=21, 1+2+3+4.5+4.5+6=21 and
1+2+3+5+5+5=21). Using this approach, the sum of the ranks will always equal n(n+1)/2. When
conducting nonparametric tests, it is useful to check the sum of the ranks before proceeding
with the analysis.
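The ranking procedure described above, including the mean-rank rule for ties and the n(n+1)/2 check, can be sketched in Python. This is only an illustration under the assumptions of the text; the function name `mean_ranks` is hypothetical.

```python
def mean_ranks(values: list[float]) -> list[float]:
    """Rank values from lowest (1) to highest (n); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the one at `pos`
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = ((pos + 1) + (end + 1)) / 2  # mean of the 1-based positions
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


data = [7, 7, 9, 3, 0, 2]
ranks = mean_ranks(data)
n = len(data)

# Check recommended in the text: the ranks must sum to n(n+1)/2
assert sum(ranks) == n * (n + 1) / 2
print(ranks)
```

The ranks are returned in the original order of the observations, so each participant's rank stays attached to that participant's value.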

To conduct nonparametric tests, we again follow the five-step approach outlined in the
modules on hypothesis testing.

1. Set up hypotheses and select the level of significance α. Analogous to parametric
testing, the research hypothesis can be one- or two-sided (one- or two-tailed),
depending on the research question of interest.
2. Select the appropriate test statistic. The test statistic is a single number that summarizes
the sample information. In nonparametric tests, the observed data is converted into
ranks and then the ranks are summarized into a test statistic.
3. Set up decision rule. The decision rule is a statement that tells under what
circumstances to reject the null hypothesis. Note that in some nonparametric tests we
reject H0 if the test statistic is large, while in others we reject H0 if the test statistic is
small. We make the distinction as we describe the different tests.
4. Compute the test statistic. Here we compute the test statistic by summarizing the ranks
into the test statistic identified in Step 2.
5. Conclusion. The final conclusion is made by comparing the test statistic (which is a
summary of the information observed in the sample) to the decision rule. The final
conclusion is either to reject the null hypothesis (because it is very unlikely to observe
the sample data if the null hypothesis is true) or not to reject the null hypothesis
(because the sample data are not very unlikely if the null hypothesis is true).

What is Chi Square Test?

Any statistical test that uses the chi-square distribution can be called a chi-square test. The
chi-square statistic is denoted by χ². The chi-square test measures the difference between a
statistically generated expected result and an actual result to see if there is a statistically
significant difference between them. It measures the goodness of fit between an expected and an
actual result.

Chi Square Test Formula

The formula for Chi Square is defined as follows:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where,

χ² - Chi-square statistic

O - Observed frequency in each category.

E - Expected frequency in the corresponding category.

Chi Square Test Degrees of Freedom

Each type of two-way table has its own chi-square distribution, depending on the number of rows
and columns, and each chi-square distribution is identified by its degrees of freedom. A two-way
table with r rows and c columns uses a chi-square distribution with (r - 1)*(c - 1) degrees of
freedom.
1. For one degree of freedom, the distribution looks like a hyperbola.
2. For three or more degrees of freedom, it looks like a mound that has a long right tail (with
two degrees of freedom the curve is still strictly decreasing).

Chi Square Test of Independence

Chi square test is applied when we have two categorical variables from a single population. It is
used to determine whether there is a significant association between the two variables. This
test is applicable when the observations are independent (random). The Chi-square test for
independence is also called a contingency table Chi-square test.

Chi Square Test of Independence Example

For a given population, we consider two attributes and ask whether they are dependent.
Suppose we have a set of workers in a factory and we classify them as smokers and non-smokers.
The same workers are classified again as 'men' and 'women'. Here, we may find that
the proportion of smokers is higher among men than among women. If so, we say that the attributes
'smoking' and 'sex' are dependent (associated).

Chi Square Goodness of Fit Test

This test is applicable when the observations are independent (random) and the total frequency
is large. It tests whether the observed frequencies in the categories of a single variable agree
with the frequencies expected under an assumed model. An attractive feature of the chi-square
goodness of fit test is that it can be applied to any univariate distribution for which you can
calculate the cumulative distribution function. The chi-square goodness-of-fit test can also be
applied to discrete distributions such as the binomial and the Poisson.

Chi-square test statistic is of the form

$$\chi^2 = \sum \frac{(\text{Observed value} - \text{Expected value})^2}{\text{Expected value}}$$


Degree of Freedom for the Chi-Square Test for Goodness of Fit

The number of degrees of freedom for the Chi-square test for goodness of fit
equals the number of categories that we are comparing minus one.

Degrees of freedom (df) = c - 1, where c is the number of categories.


Chi Square Difference Test

The chi square difference test is very useful both for making simpler models more complex and
for making complex models simpler. The test compares two nested models and consists of three steps:

• Estimating the original model.
• Estimating the revised model in which a new path has been added.
• Calculating the difference between the two resulting chi square values.

The resulting chi square difference statistic also has a chi square distribution. The degrees of
freedom for the chi square difference test are equal to the difference between the degrees of
freedom associated with the two models. When the chi square difference is statistically significant,
the model with the smaller chi-square is considered to fit the data better than the model with the
higher chi-square.

Chi Square Test of Homogeneity

The chi square test of homogeneity is used to test whether two or more populations are
homogeneous with respect to some characteristic. In this test the categories are assumed to be
mutually exclusive and exhaustive. The test statistic for the chi square test of homogeneity is the
same as that for the chi square test of association.

$$\chi^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

Where, df = (m - 1)(n - 1) for a table with m rows and n columns.

Chi Square Test of Association

The Chi-square test of association is equivalent to the Chi-square test of independence and the
Chi-square test of homogeneity. It is used to determine whether there is an association between
two or more categorical variables. In the Chi-square goodness of fit test the expected proportions
are known a priori; for the Chi-square test of association the expected proportions are not known
a priori but must be estimated from the sample data.

Chi Square Test for Trend

The chi-square test for trend tests for a linear trend between the rows and the columns of the table.
It only makes sense when the rows are arranged in a natural order (such as by age or time), and
are equally spaced. A large chi-square statistic indicates that the observed frequencies in the table
differ markedly from the expected frequencies. When a chi-square is high, examine the table to
determine which cells are responsible. In the chi-squared test for trend, we not only use the
order of the categories, but also attach a numerical value to each. The chi-squared for trend
statistic is always less than the chi-squared for association statistic. The difference between the
two chi-squared statistics follows a Chi-squared distribution if the null hypothesis is true, with
degrees of freedom equal to the difference between the two degrees of freedom.

One Sample Chi Square Test

The one-sample Chi-square test compares the distribution of cases across the categories of a
variable with a hypothesized distribution. The Chi-square test used with one sample is
described as a "goodness of fit" test. It can help you decide whether a distribution of
frequencies for a variable in a sample is representative of, or "fits", a specified population
distribution. The one-sample Chi-square test is used to test a hypothesis such as 'the suicide rate
varies significantly from month to month'. If the hypothesis is false, the suicide rate will be the
same for each of the twelve months. The one-sample Chi-square test can be used to compare
observed suicide rates per month with what would be expected if the rate were equal for
all months.

Chi Square Test Interpretation

The chi-square test measures the difference between a statistically generated expected result
and an actual result to see if there is a statistically significant difference between them. Once
the Chi-square value and the degrees of freedom are known, a standard table of Chi-square values
can be consulted to determine the corresponding p-value. The p-value indicates the probability
that a Chi-square value at least that large would have resulted from chance alone if the null
hypothesis were true.

Chi Square Test Assumptions

The chi-square test has some important assumptions.

• For the chi-square test to be meaningful it is imperative that each person, item or entity
contributes to only one cell of the contingency table.
• Both independent and dependent variables are categorical with two or more levels.
• The data consist of frequencies, not scores.
• Each randomly selected observation can be classified into only one category for the
independent variable and only one category for the dependent variable.

Purpose of Chi Square Test

The chi-square test is one of the simplest and most widely used non-parametric tests. It is the
most commonly used method for comparing frequencies or proportions. It is a statistical test used
to compare observed data with data that would be expected according to a given hypothesis. It is
popularly known as a test of "goodness of fit" because it enables us to ascertain how well a
theoretical distribution fits the observed data.

Chi Square Test Table

Table for Chi square test is given below:

Chi Square Test Example

Chi-Square Test

1.) Goodness-of-fit test
Ho: fo = fe
Ha: fo ≠ fe
where fo = observed frequency
fe = expected frequency
df = c-1

$$\chi_C^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
Example:

The city distributor of air conditioners in the city of Manila has divided the area into four
sub-areas. A prospective buyer of the distributorship was told that the installations of the
equipment are equally distributed among the sub-areas. The prospective buyer took a random sample
of 40 installations performed during the past years from the corporation's files and found the
following:

SUB-AREA               A    B    C    D    TOTAL
NO. OF INSTALLATIONS   6   12   14    8     40

Based on the information can we say that the units are equally distributed? Use α= 0.05

1.) Ho: fo = fe
Ha: fo ≠ fe

2.) Chi-square test

3.) α = 0.05
df = c-1
= 4-1
= 3
χ²tab = 7.815

4.) Reject Ho if χ²C > χ²tab

5.) fe = 40/4
= 10

$$\chi_C^2 = \frac{(6-10)^2 + (12-10)^2 + (14-10)^2 + (8-10)^2}{10} = \frac{40}{10} = 4$$

6.) Since χ²C < χ²tab, do not reject Ho

7.) There is no significant evidence that the units are not equally distributed.
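The arithmetic of this goodness-of-fit example can be reproduced with a few lines of Python. The sketch below is illustrative only; the function name `chi_square_gof` is an assumption, and the critical value 7.815 is simply copied from step 3 rather than computed.

```python
def chi_square_gof(observed: list[float], expected: list[float]) -> float:
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [6, 12, 14, 8]  # installations in sub-areas A, B, C, D
n = sum(observed)

# Under Ho the installations are spread equally over the four sub-areas
expected = [n / len(observed)] * len(observed)

chi_c = chi_square_gof(observed, expected)
df = len(observed) - 1
chi_tab = 7.815  # tabular value for df = 3 and alpha = 0.05, taken from the text

reject = chi_c > chi_tab
print(chi_c, df, reject)
```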

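2.) Test of Independence

$$f_e = \frac{(\sum r)(\sum c)}{n}$$

Where: ∑r = row total
∑c = column total
n = grand total
df = (r-1)(c-1), with r = number of rows and c = number of columns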
Example: A survey was conducted to determine whether gender and age are related among
stereo shop customers. A total of 200 respondents was taken and the results are presented in
the table.

Age            Male   fe   Female   fe   TOTAL
Under 30        60    77     50     33    110
30 and over     80    63     10     27     90
TOTAL          140           60           200

Conduct a test of whether gender and age of stereo shop customers are independent at the 1% level
of significance.

1.) Ho: Gender and age of stereo shop customers are independent
Ha: Gender and age of stereo shop customers are dependent

2.) Chi-square test

3.) α = 0.01
df = (2-1)(2-1)
= 1
χ²tab = 6.635

4.) Reject Ho if χ²C > χ²tab

5.) $$\chi_C^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\chi_C^2 = \frac{(60 - 77)^2}{77} + \frac{(80 - 63)^2}{63} + \frac{(50 - 33)^2}{33} + \frac{(10 - 27)^2}{27}$$

χ²C = 27.80

6.) Since χ²C > χ²tab, reject Ho

7.) Gender and age of stereo shop customers are dependent
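The expected frequencies and the test statistic for this 2 x 2 table can be checked with a short Python sketch. It is offered only as an illustration: the function names are assumptions, and the expected frequency of each cell is computed as (row total)(column total)/n, as in the formula for the test of independence.

```python
def expected_frequencies(table: list[list[float]]) -> list[list[float]]:
    """fe for each cell = (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table: list[list[float]]):
    """Return (chi-square statistic, degrees of freedom, expected table)."""
    expected = expected_frequencies(table)
    stat = sum(
        (fo - fe) ** 2 / fe
        for row_o, row_e in zip(table, expected)
        for fo, fe in zip(row_o, row_e)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected


# Rows: under 30, 30 and over; columns: male, female
table = [[60, 50], [80, 10]]
stat, df, expected = chi_square_independence(table)

chi_tab = 6.635  # tabular value for df = 1 and alpha = 0.01, taken from the text
print(round(stat, 2), df, stat > chi_tab)
```

Since the computed statistic exceeds the tabular value, Ho is rejected, which corresponds to the decision in step 6.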

A certain school classified 1725 students according to intelligence and family economic
level. The results are as follows:

Economic Level   Dull     fe     Average     fe     Intelligent     fe     TOTAL
Rich              81    128.67     322     347.31       233       160.01     636
Middle class     141    151.94     457     410.11       153       188.95     751
Poor             127     68.38     163     184.58        48        85.04     338
TOTAL            349               942                  434                 1725

Using these results, can we conclude that intelligence is related to the economic level? Use the 1%
level of significance.

1.) Ho: Intelligence is not related to the economic level
Ha: Intelligence is related to the economic level

2.) Chi-square test

3.) α = 0.01
df = (3-1)(3-1)
= 4
χ²tab = 13.277

4.) Reject Ho if χ²C > χ²tab

5.) $$\chi_C^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\chi_C^2 = \frac{(81 - 128.67)^2}{128.67} + \frac{(322 - 347.31)^2}{347.31} + \frac{(233 - 160.01)^2}{160.01}
+ \frac{(141 - 151.94)^2}{151.94} + \frac{(457 - 410.11)^2}{410.11} + \frac{(153 - 188.95)^2}{188.95}
+ \frac{(127 - 68.38)^2}{68.38} + \frac{(163 - 184.58)^2}{184.58} + \frac{(48 - 85.04)^2}{85.04}$$

χ²C ≈ 134.70

6.) Since χ²C > χ²tab, reject Ho

7.) Intelligence is related to the economic level

Given below are some of the examples on chi square test.

Exercises
Question 1:
Find the chi square for the following given data

Color                Blue   Black   Brown   Yellow
Observed frequency     5     15      10      20
Expected frequency    10     20       5      30

Answer:

(For blue, Observed frequency - Expected frequency = 5-10 = -5

For black, Observed frequency - Expected frequency = 15-20 = -5

For brown, Observed frequency - Expected frequency = 10-5 = 5

For yellow, Observed frequency - Expected frequency = 20-30 = -10

χ² = (-5)²/10 + (-5)²/20 + (5)²/5 + (-10)²/30 = 2.5 + 1.25 + 5 + 3.3333 = 12.0833)

Question 2:

Find the chi square for the following given data

Color                Blue   Black   Brown   Yellow
Observed frequency    10      5      25      35
Expected frequency    15     30      30      25

Answer:
(27.3333)

Question 3:

Find the chi square for the following given data

Color                Blue   Black   Brown   Yellow
Observed frequency    23     24      32      23
Expected frequency    12     32      25      21

Answer:

(14.2338)

Question 4:

Determine whether gender and shoe size are dependent among the students
of section ChE 4102 and instructors from the Chemical Engineering Department of
Batangas State University. Use 0.01 as the level of significance.

Data
                 Shoe Size
Gender    below 8   8 and above   TOTAL
Male         2          13          15
Female      23          11          34
TOTAL       25          24          49

Answer:
(Gender and shoe size are dependent.)

Question 5:
49 samples are selected from the group of male and female students of section ChE
4102 from the Chemical Engineering Department of Batangas State University with their
instructors. Determine whether gender and height are independent among the students
and their instructors. Use the 0.01 level of significance. Given data below:

                      HEIGHT
GENDER    below 160 cm   160 cm and above   TOTAL
Male           1               14             15
Female        16               18             34
TOTAL         17               32             49

Answer:
(Gender and height are dependent.)

Question 6:

Reference:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Nonparametric/BS704_Nonparametric_print.html
http://math.tutorvista.com/statistics/chi-square-test.html
