TEST OF STATISTICS
Yunisa Astiarani
Dept. Public Health & Nutrition
NULL HYPOTHESIS
The null hypothesis reflects that there will be no observed effect for our experiment/study.
In a mathematical formulation of the null hypothesis there will typically be an equal
sign (=, ≥, ≤). This hypothesis is denoted by H0.
We hope to obtain a small enough p-value that it is lower than our level of significance
alpha and we are justified in rejecting the null hypothesis. If our p-value is greater than
alpha, then we fail to reject the null hypothesis.
ALTERNATIVE HYPOTHESIS
The alternative or experimental hypothesis reflects that there
will be an observed effect for our experiment (the opposite
of H0).
In a mathematical formulation of the alternative hypothesis
there will typically be an inequality or not-equal-to symbol (≠,
>, <). This hypothesis is denoted by either Ha or H1.
BASIC CONCEPT OF HYPOTHESIS
TESTING
A hypothesis may be defined
simply as a statement about one
or more populations.
By means of hypothesis testing
one determines whether or not
such statements are compatible
with the available data.
PRECAUTION
It should be pointed out that neither hypothesis testing nor
statistical inference, in general, leads to the proof of a hypothesis;
it merely indicates whether the hypothesis is supported or is not
supported by the available data.
When we fail to reject a null hypothesis, therefore, we do not
say that it is true, but that it may be true. When we speak of
accepting a null hypothesis, we have this limitation in mind and do
not wish to convey the idea that accepting implies proof.
The administrative or clinical decision usually depends on the statistical decision. If the
null hypothesis is rejected, the administrative or clinical decision usually reflects this,
in that the decision is compatible with the alternative hypothesis. The reverse is usually
true if the null hypothesis is not rejected.
The administrative or clinical decision, however, may take other forms, such as a
decision to gather more data.
HYPOTHESIS MAKING STEPS
1) Data
• The nature of the data that form the basis of the testing procedures must be
understood, since this determines the particular test to be employed.
Whether the data consist of counts or measurements, for example, must be
determined.
2) Assumptions
• These include assumptions about the normality of the population
distribution, equality of variances, and independence of samples.
3) Hypotheses
• There are two statistical hypotheses involved in hypothesis testing, and
these should be stated explicitly.
• The null hypothesis (H0) is the hypothesis to be tested.
• The alternative hypothesis (H1) is a statement of what we will believe is
true if our sample data cause us to reject the null hypothesis.
4) Statistical test
5) Distribution of test statistic
6) Decision rule
7) Calculation of test statistic
8) Statistical decision
9) Conclusion
10) P value
LEVEL OF SIGNIFICANCE
The term level of significance reflects the
fact that hypothesis tests are sometimes
called significance tests, and a computed
value of the test statistic that falls in the
rejection region is said to be significant.
The level of significance, α, specifies the
area under the curve of the distribution of
the test statistic that is above the values on
the horizontal axis constituting the rejection
region.
The level of significance α is a probability
and, in fact, is the probability of rejecting
a true null hypothesis (commonly 0.1, 0.05, or 0.01).
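As a rough illustration of how α fixes the rejection region, the sketch below computes critical values of the standard normal distribution for the three α values listed above. It assumes a z-distributed test statistic; the helper name critical_z is made up for this example.

```python
from statistics import NormalDist


def critical_z(alpha: float, two_sided: bool = True) -> float:
    """Return the positive critical z value that bounds the rejection region.

    Hypothetical helper: assumes the test statistic is standard normal under H0.
    """
    # A two-sided test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.01):
    two = round(critical_z(alpha), 3)
    one = round(critical_z(alpha, two_sided=False), 3)
    print(f"alpha={alpha}: two-sided ±{two}, one-sided {one}")
```

A smaller α pushes the critical value further into the tail, so stronger evidence is needed before H0 is rejected.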
TEST STATISTIC
In hypothesis testing, the test statistic serves as
a decision maker, since the decision to
reject or not to reject the null hypothesis
depends on the magnitude of the test
statistic.
Note, however, that the outcome of the statistical
test is only one piece of evidence that
influences the administrative or clinical
decision.
The statistical decision should not be
interpreted as definitive but should be
considered along with all the other relevant
information available to the experimenter.
INFERENTIAL STATISTICS
• Parametric methods
• Nonparametric methods
PARAMETRIC METHODS
• The hypothesis testing procedures highlighted in the remainder of this
lecture generally examine the case of normally distributed data or cases
where the procedures are appropriate because the central limit theorem
applies
• In practice, it is not uncommon for samples to be small relative to the size of
the population, or to have samples that are highly skewed, so that the
assumption of normality is violated. Methods that handle this situation are
called distribution-free or nonparametric methods.
NORMAL
DISTRIBUTION
• It is symmetrical about its
mean
• The mean, the median, and
the mode are all equal.
• The total area under the
curve above the x-axis is
one square unit. Because
of the symmetry already
mentioned, 50 percent of
the area is to the right of a
perpendicular erected at
the mean, and 50 percent
is to the left.
CENTRAL LIMIT THEOREM
• For the case where sampling is from a
nonnormally distributed population,
we refer to an important mathematical
theorem known as the central limit
theorem:
“Given a population of any nonnormal
functional form with a mean µ and finite
variance σ², the sampling distribution of
x̄, computed from samples of size n
from the population, will be
approximately normally distributed
when the sample size is large.”
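A small simulation can make the theorem concrete. The sketch below is only an illustration: the exponential population, the sample size of 50, the 2000 repetitions, and the random seed are all choices made for this example, not part of the lecture.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 50                   # size of each sample (assumed "large enough")
num_samples = 2000       # number of repeated samples

# Population: exponential with rate 1, which is strongly right-skewed
# and has mean 1 and variance 1.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)

print("mean of sample means:", round(mean_of_means, 3))
print("sd of sample means:  ", round(sd_of_means, 3))
print("sigma / sqrt(n):     ", round(1 / sqrt(n), 3))
```

Even though the population is skewed, the sample means cluster around µ with a spread close to σ/√n, which is what justifies using normal-theory tests for large samples.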
TEST OF
STATISTICS
A single population mean
The difference of two population means
Paired comparisons
A single population proportion
The difference of two population proportions
The ratio of two population variances
Chi square distribution (categorical variable)
ANOVA (One way, Two way)
A SINGLE POPULATION
MEAN
• Sampling from Normally Distributed
Populations: Population Variances
Known
• One-sided or two-sided tests may be
made, depending on the question being
asked.
ONE AND TWO
SIDED TEST
• The alternative hypothesis can be
either two sided (not equal),
HA: µ ≠ 10, or one sided on either
side (less than/more than),
HA: µ < 10 or HA: µ > 10
• The choice determines the decision
rule for rejecting the null hypothesis,
because the rejection region (and so
the critical value) is different
CASE 1 (TWO SIDED)
• Researchers are interested in the mean age of a certain population. Let
us say that they are asking the following question: Can we conclude that
the mean age of this population is different from 30 years?
• Data. A simple random sample of 10 individuals was drawn from the
population of interest. From this sample a mean of x̄ = 27 has been
computed.
• Assumptions. It is assumed that the sample comes from a population
whose ages are approximately normally distributed. Let us also
assume that the population has a known variance of σ² = 20.
CASE 1 CONT.
• Hypotheses: H0: µ = 30, HA: µ ≠ 30
• Test statistic: z = (x̄ − µ0) / (σ / √n)
• Distribution of test statistic. When H0 is true, the test statistic
follows the standard normal (z) distribution.
• The decision rule: let α = 0.05. Since we have a two-sided
test, we put α/2 = 0.025 in each tail of the distribution of
our test statistic. The critical value of z is 1.96. Reject H0 if
z ≥ 1.96 or z ≤ −1.96.
CASE 1 CONT.
• Calculation of test statistic. From our sample we compute
z = (27 − 30) / √(20/10) = −3 / 1.4142 = −2.12
• Statistical decision. Abiding by the decision rule, we are able to reject the null
hypothesis since −2.12 < −1.96, which is in the rejection region.
• Conclusion. We conclude that µ is not equal to 30 and let our administrative or
clinical actions be in accordance with this conclusion.
• p value. p < 0.05 (the p value is the quantity most often reported to show whether
the test rejects or fails to reject H0)
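The Case 1 numbers can be checked with a few lines of Python. This is a minimal sketch using only the values given in the case (x̄ = 27, µ0 = 30, σ² = 20, n = 10, α = 0.05); the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma2, n, alpha = 27, 30, 20, 10, 0.05

# Test statistic for a single mean with known population variance.
z = (x_bar - mu0) / sqrt(sigma2 / n)

# Two-sided test: alpha/2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * NormalDist().cdf(-abs(z))
reject_h0 = abs(z) >= z_crit

print("z =", round(z, 2))
print("critical value = ±", round(z_crit, 2))
print("two-sided p value =", round(p_value, 4))
print("reject H0:", reject_h0)
```

The p value comes out below 0.05, which agrees with the statistical decision reached from the critical value.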
CASE 1(ONE SIDED)
• Suppose, instead of asking if they could conclude that µ≠ 30, the researchers had
asked: Can we conclude that µ < 30? To this question we would reply that they
can so conclude if they can reject the null hypothesis that µ ≥ 30.
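For the one-sided question the same test statistic is used, but all of α goes into the lower tail. The sketch below reuses the Case 1 values and assumes α = 0.05, as in the two-sided version; the hypotheses are H0: µ ≥ 30 and HA: µ < 30.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma2, n, alpha = 27, 30, 20, 10, 0.05

z = (x_bar - mu0) / sqrt(sigma2 / n)

# One-sided (lower-tail) test: the whole of alpha sits in the left tail.
z_crit = NormalDist().inv_cdf(alpha)
p_value = NormalDist().cdf(z)
reject_h0 = z <= z_crit

print("z =", round(z, 2))
print("critical value =", round(z_crit, 3))
print("one-sided p value =", round(p_value, 4))
print("reject H0:", reject_h0)
```

Because the rejection region is now only on one side, the critical value moves from −1.96 to about −1.645 and the p value is half the two-sided one.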
A SINGLE POPULATION MEAN
• Sampling from a Normally Distributed Population: Population Variance
Unknown
CASE 2
• Nakamura et al. studied subjects with medial collateral ligament (MCL) and
anterior cruciate ligament (ACL) tears. The study included 17 consecutive
patients with combined acute ACL and grade III MCL injuries treated by the
same physician at the research center. The variable of interest was the length
of time in days between the occurrence of the injury and the first magnetic
resonance imaging (MRI). We wish to know if we can conclude that the
mean number of days between injury and initial MRI is not 15 days in a
population presumed to be represented by these sample data.
CASE 2 CONT.
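• Population variance unknown, so the test statistic is t = (x̄ − µ0) / (s / √n),
which follows Student's t distribution with n − 1 = 16 degrees of freedom when H0 is true.
• Hypotheses: H0: µ = 15, HA: µ ≠ 15
• Let α = 0.05. Since we have a two-sided test, we put α/2 = 0.025 in each tail of the
distribution of our test statistic.
• The decision rule. Reject H0 if the computed t ≥ 2.1199 or t ≤ −2.1199.
• Calculation of test statistic. The computed value is t = −0.791.
• Statistical decision. Do not reject H0, since −0.791 falls in the nonrejection region.
• p value. p > 0.05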
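The mechanics of the one-sample t test can be sketched as follows. The 17 values below are hypothetical and are NOT the Nakamura et al. data, so the result differs from the t = −0.791 reported for Case 2; only the critical value 2.1199 is taken from the slide. The function name one_sample_t is made up for this example.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return (t, degrees of freedom) for H0: population mean == mu0."""
    n = len(data)
    standard_error = stdev(data) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(data) - mu0) / standard_error, n - 1


# Hypothetical days-to-MRI values for 17 patients (illustration only).
days = [10, 14] * 8 + [12]

t, df = one_sample_t(days, mu0=15)
T_CRITICAL = 2.1199  # two-sided, alpha = 0.05, 16 df (value quoted in Case 2)
reject_h0 = abs(t) >= T_CRITICAL

print("t =", round(t, 3), "with", df, "degrees of freedom")
print("reject H0:", reject_h0)
```

With the real study data the same function would be called in the same way, and the decision would follow from comparing |t| with 2.1199.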
THE DIFFERENCE OF TWO
POPULATION MEANS
• Sampling from Normally Distributed Populations: Population Variances
Known
• Sampling from Normally Distributed Populations: Population Variances
Unknown
  • Equal variances
  • Unequal variances
HYPOTHESIS IN DIFFERENCE OF 2
POPULATION MEANS
CASE 3. VARIANCE KNOWN
P value < 0.05
PAIRED COMPARISONS
• In our previous discussion involving the
difference between two population means, it was
assumed that the samples were independent.
• A method frequently employed for assessing the
effectiveness of a treatment or experimental
procedure is one that makes use of related
observations resulting from nonindependent
samples. A hypothesis test based on this type of
data is known as a paired comparisons test.
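A paired comparisons test reduces to a one-sample t test on the within-pair differences. The sketch below uses invented before/after measurements for five subjects (not data from the lecture) and a hypothetical helper named paired_t; the critical value 2.776 is the standard two-sided table value for 4 degrees of freedom at α = 0.05.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, degrees of freedom) for H0: mean difference == 0."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical paired measurements on the same five subjects.
before = [22, 25, 30, 28, 26]
after = [27, 29, 33, 34, 32]

t, df = paired_t(before, after)
T_CRITICAL = 2.776  # two-sided, alpha = 0.05, 4 df (standard t table)
reject_h0 = abs(t) >= T_CRITICAL

print("t =", round(t, 3), "with", df, "degrees of freedom")
print("reject H0:", reject_h0)
```

Working on the differences removes the between-subject variation, which is why the pairs must not be analysed as two independent samples.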
EXAMPLE TABLE FOR PAIRED
COMPARISON
RESULT FROM STATS SOFTWARE
Gallbladder function in postoperative patients is significantly increased.
A SINGLE POPULATION
PROPORTION
• Testing hypotheses about population
proportions is carried out in much the same
way as for means when the conditions
necessary for using the normal curve are
met. One-sided or two-sided tests may be
made, depending on the question being
asked.
CASE 3
• Wagenknecht et al. collected data on a sample of 301
Hispanic women living in San Antonio, Texas. One variable
of interest was the percentage of subjects with impaired
fasting glucose (IFG). In the study, 24 women were
classified in the IFG stage. The article cites population
estimates for IFG among Hispanic women in Texas as 6.3
percent. Is there sufficient evidence to indicate that the
population of Hispanic women in San Antonio has a
prevalence of IFG higher than 6.3 percent?
CASE 3 CONT.
RESULT FROM STATS SOFTWARE
There is not sufficient evidence to indicate that the population of Hispanic women in San
Antonio has a prevalence of IFG higher than 6.3 percent.
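The conclusion can be checked by hand with the normal approximation for a single proportion. This is a sketch using only the numbers given in the case (n = 301, 24 women with IFG, p0 = 0.063); α = 0.05 is assumed, since the case does not state it, and the statistics software behind the slide may have used a slightly different method.

```python
from math import sqrt
from statistics import NormalDist

n, cases, p0 = 301, 24, 0.063
alpha = 0.05  # assumed level of significance

p_hat = cases / n

# z statistic for H0: p <= p0 against HA: p > p0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)  # upper-tail area
reject_h0 = p_value < alpha

print("sample proportion =", round(p_hat, 4))
print("z =", round(z, 2))
print("one-sided p value =", round(p_value, 4))
print("reject H0:", reject_h0)
```

The p value is well above 0.05, so H0 is not rejected, in line with the stated conclusion.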
TEST OF STATISTICS - THE
DIFFERENCE BETWEEN TWO
POPULATION PROPORTIONS
• The most frequent test employed relative to
the difference between two population
proportions is that their difference is zero. It
is possible, however, to test that the
difference is equal to some other value.
• Both one-sided and two-sided tests may be
made.
CASE 4
• One study examined the stature of men and women with Noonan
syndrome. The study contained 29 male and 44 female adults. One
of the cut-off values used to assess stature was the third percentile
of adult height. Eleven of the males fell below the third percentile of
adult male height, while 24 of the females fell below the third
percentile of female adult height. Does this study provide sufficient
evidence for us to conclude that among subjects with Noonan
syndrome, females are more likely than males to fall below the
respective third percentile of adult height? Let α = 0.05.
CASE 4 CONT.
RESULT OF STATS SOFTWARE
There is not sufficient evidence that females are more likely than males to fall below the
respective third percentile of adult height.
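The Case 4 result can be reproduced with the usual pooled z test for two proportions. This is a sketch based on the counts in the case (24 of 44 females, 11 of 29 males) and α = 0.05; the variable names are chosen for this example, and the software output on the slide may differ slightly in rounding.

```python
from math import sqrt
from statistics import NormalDist

x_f, n_f = 24, 44   # females below the third percentile
x_m, n_m = 11, 29   # males below the third percentile
alpha = 0.05

p_f = x_f / n_f
p_m = x_m / n_m

# Under H0 the two proportions are equal, so they are pooled.
p_pooled = (x_f + x_m) / (n_f + n_m)
standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_f + 1 / n_m))

# H0: p_f <= p_m against HA: p_f > p_m (upper-tail test).
z = (p_f - p_m) / standard_error
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < alpha

print("z =", round(z, 2))
print("one-sided p value =", round(p_value, 4))
print("reject H0:", reject_h0)
```

The computed z is below the one-sided critical value of 1.645, so H0 is not rejected.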
THE RATIO OF TWO
POPULATION VARIANCES
• Variance Ratio Test. Decisions regarding the
comparability of two population variances are
usually based on the variance ratio test, which
is a test of the null hypothesis that two
population variances are equal. When we test
the hypothesis that two population variances
are equal, we are, in effect, testing the
hypothesis that their ratio is equal to 1
• When H0 is true, the test statistic follows the F distribution (F test)
CASE 5
• Borden et al. compared meniscal repair techniques using cadaveric
knee specimens. One of the variables of interest was the load at
failure (in newtons) for knees fixed with the FasT-FIX technique
(group 1) and the vertical suture method (group 2). Each technique
was applied to six specimens. The standard deviation for the FasT-
FIX method was 30.62, and the standard deviation for the vertical
suture method was 11.37. Can we conclude that, in general, the
variance of load at failure is higher for the FasT-FIX technique than
the vertical suture method?
CASE 5 CONT.
• Hypotheses. H0: σ1² ≤ σ2², HA: σ1² > σ2²
• Decision rule. Let α = 0.05. With 5 and 5 degrees of freedom, the
critical value of F is 5.05. Reject H0 if the computed variance
ratio (V.R.) is ≥ 5.05.
• Calculation of test statistic. V.R. = s1² / s2² = (30.62)² / (11.37)² = 7.25
• Statistical decision. We reject H0, since
7.25 > 5.05; that is, the computed ratio
falls in the rejection region.
• Conclusion. The failure load variability
is higher when using the FasT-FIX
method than the vertical suture method.
• p value. Because the computed V.R. of
7.25 is greater than 5.05, the p value for
this test is less than 0.05.
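The variance ratio in Case 5 is easy to verify. The sketch below uses the two standard deviations from the case and takes the critical value 5.05 from the slide rather than computing it, since the Python standard library has no F distribution.

```python
s1 = 30.62  # standard deviation, FasT-FIX (group 1)
s2 = 11.37  # standard deviation, vertical suture (group 2)
n1 = n2 = 6

# Variance ratio: the larger hypothesised variance goes in the numerator.
variance_ratio = s1 ** 2 / s2 ** 2
df_numerator, df_denominator = n1 - 1, n2 - 1

F_CRITICAL = 5.05  # alpha = 0.05 with 5 and 5 df, as quoted in Case 5
reject_h0 = variance_ratio >= F_CRITICAL

print("V.R. =", round(variance_ratio, 2))
print("degrees of freedom:", df_numerator, "and", df_denominator)
print("reject H0:", reject_h0)
```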
CHI-SQUARE DISTRIBUTION
• The chi-square distribution is the most frequently employed
statistical technique for the analysis of count or frequency
data (categorical variable)
• It is used for tests of goodness-of-fit, tests of independence, and tests of
homogeneity
OBSERVED FREQUENCIES AND
EXPECTED FREQUENCIES
• The observed frequencies are the number of subjects or objects in our
sample that fall into the various categories of the variable of interest. For
example, if we have a sample of 100 hospital patients, we may observe that
50 are married, 30 are single, 15 are widowed, and 5 are divorced.
• Expected frequencies are the number of subjects or objects in our sample
that we would expect to observe if some null hypothesis about the variable
is true. For example, our null hypothesis might be that the four categories
of marital status are equally represented in the population from which we
drew our sample. In that case we would expect our sample to contain 25
married, 25 single, 25 widowed, and 25 divorced.
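CHI-SQUARE FORMULA
χ² = Σ [(O − E)² / E], where O is an observed frequency, E is the
corresponding expected frequency, and the sum is taken over all categories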
DECISION RULE
• The quantity χ² will be
small if the observed and expected
frequencies are close together and will
be large if the differences are large.
• The decision rule, then, is: Reject H0 if
the computed χ² is greater than or equal to the
tabulated χ² for the chosen value of α.
A GOODNESS-OF-FIT TEST
• A goodness-of-fit test is appropriate when one
wishes to decide if an observed distribution of
frequencies is incompatible with some
preconceived or hypothesized distribution.
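The marital status example from the observed/expected frequencies slide can be worked through as a goodness-of-fit calculation. This is a sketch: the observed counts and the "equally represented" null hypothesis come from that example, α = 0.05 is assumed, and 7.815 is the standard tabulated chi-square value for 3 degrees of freedom at that level.

```python
observed = {"married": 50, "single": 30, "widowed": 15, "divorced": 5}

n = sum(observed.values())
# H0: the four categories are equally represented, so each expects n / 4.
expected = {status: n / len(observed) for status in observed}

# Sum of (O - E)^2 / E over all categories.
chi_square = sum(
    (observed[s] - expected[s]) ** 2 / expected[s] for s in observed
)
df = len(observed) - 1

CHI2_CRITICAL = 7.815  # alpha = 0.05, 3 df (standard chi-square table)
reject_h0 = chi_square >= CHI2_CRITICAL

print("chi-square =", chi_square, "with", df, "degrees of freedom")
print("reject H0:", reject_h0)
```

The observed counts are far from 25 per category, so the statistic is large and the hypothesis of equal representation is rejected.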
TEST OF INDEPENDENCE
• Perhaps the most frequent use of the chi-square
distribution is to test the null hypothesis that two
criteria of classification, when applied to the same
set of entities, are independent
• We say that two criteria of classification are
independent if the distribution of one criterion is
the same no matter what the distribution of the
other criterion is.
• For example, if socioeconomic status and area of
residence of the inhabitants of a certain city are
independent, we would expect to find the same
proportion of families in the low, medium, and
high socioeconomic groups in all areas of the city
THE CONTINGENCY TABLE
• The classification, according to two criteria, of a set of entities, say,
people, can be shown by a table in which the r (rows) represent the
various levels of one criterion of classification and the c (columns)
represent the various levels of the second criterion. Such a table is
generally called a contingency table, with dimension r x c.
HYPOTHESIS TESTING - TEST OF
INDEPENDENCE
• We will be interested in testing the null hypothesis that in the
population the two criteria of classification are independent. If the
hypothesis is rejected, we will conclude that the two criteria of
classification are not independent.
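For a test of independence, each expected frequency is (row total × column total) / n, and the degrees of freedom are (r − 1)(c − 1). The sketch below applies this to an invented 2 x 2 table (not data from the lecture); the function name chi_square_independence is hypothetical, and 3.841 is the standard tabulated value for 1 degree of freedom at α = 0.05.

```python
def chi_square_independence(table):
    """Return (chi-square, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two criteria are independent.
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df


# Hypothetical counts: rows = two areas of a city, columns = two status groups.
table = [[30, 20],
         [20, 30]]

chi_square, df = chi_square_independence(table)
CHI2_CRITICAL = 3.841  # alpha = 0.05, 1 df (standard chi-square table)
reject_h0 = chi_square >= CHI2_CRITICAL

print("chi-square =", chi_square, "with", df, "degree(s) of freedom")
print("reject H0:", reject_h0)
```

The same function works for any r x c table; only the tabulated critical value changes with the degrees of freedom.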
CASE EXAMPLE
CASE EXAMPLE CONT.
RESULT FROM STATISTICAL SOFTWARE
TEST OF HOMOGENEITY
• The test of independence is concerned with the question: Are the
two criteria of classification independent?
• The homogeneity test is concerned with the question: Are the
samples drawn from populations that are homogeneous with
respect to some criterion of classification?
• The null hypothesis of this test states that the samples are drawn
from the same population.
THANK YOU