
Confidence Interval: a range of values that is likely to include a population value with a certain degree of confidence.

Suppose a group of researchers is studying the heights of high-school children. The researchers take
a random sample from the population and establish a mean height of 74 inches.

The mean of 74 inches is a point estimate of the population mean.

What is the true population mean? Is it the same as the sample mean, or is it greater or smaller? Within
what range of values must the population mean lie? And how confident are you in your claim?

Confidence intervals provide more information than point estimates. By establishing a 95% confidence
interval using the sample's mean and standard deviation, and assuming a normal distribution as represented
by the bell curve, the researchers arrive at an upper and lower bound that contains the true mean 95% of the
time.

Assume the interval is between 72 inches and 76 inches. If the researchers take 100 random samples from
the population of high-school children as a whole and construct a confidence interval from each, the true
population mean should fall within the interval in about 95 of those 100 samples.

If the researchers want even greater confidence, they can expand the interval to 99% confidence. Doing so
invariably creates a broader range, as it makes room for a greater number of sample means. If they establish
the 99% confidence interval as being between 70 inches and 78 inches, they can expect about 99 of 100 such
intervals to contain the true mean.

A 90% confidence level, on the other hand, implies that we would expect 90% of the interval estimates to
include the population parameter, and so forth.
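As a quick illustration, such an interval can be computed from the sample statistics with SciPy. This is a minimal sketch: the sample size and standard deviation below are hypothetical (the text gives only the mean of 74), chosen so the result roughly matches the 72-76 inch example.

import math
from scipy import stats

n = 50        # assumed sample size (not given in the example)
mean = 74.0   # sample mean height in inches, from the example
sd = 7.2      # assumed sample standard deviation (not given)

se = sd / math.sqrt(n)  # standard error of the mean

# 95% confidence interval based on the t distribution with n-1 df
lower, upper = stats.t.interval(0.95, n - 1, loc=mean, scale=se)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")  # roughly 72 to 76 inches

Widening the confidence level to 0.99 in the same call produces the broader interval described above.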

Degrees of Freedom: Degrees of freedom are the number of values in a statistical analysis that are free to
vary; they tell you how many items can be chosen at random before constraints determine the remaining
values.

Simple formula for the one-sample situation: df = n - 1

Example 1: Consider a data sample consisting of five positive integers whose values must have an average
of six. If four items within the data set are {3, 8, 5, 4}, the fifth number must be 10. Because the first
four numbers can be chosen at random, the degrees of freedom are four (df = 5 - 1 = 4).
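For completeness, a two-line Python check of this arithmetic (the numbers come straight from the example):

known = [3, 8, 5, 4]        # the four freely chosen values
fifth = 5 * 6 - sum(known)  # the mean must be 6, so the total must be 30
print(fifth)                # 10 -- the fifth value is fully determined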
There are three common t-tests:

 One-sample t-test

 Two-independent-sample t-test

 Paired-sample t-test

One-sample t-test

Why one sample t-test?


To test whether the sample mean is equal (or nearly equal) to a hypothesized population mean.

H0: x̅ = μ0

H0: The sample mean is equal to the true population mean.

H1: x̅ ≠ μ0; x̅ > μ0; x̅ < μ0

H1: The sample mean is not equal to / is greater than / is smaller than the true population mean.

One-Sample Statistics

       N     Mean    Std. Deviation   Std. Error Mean
Age    200   35.29   8.571            .606

The table indicates that there were 200 participants, whose mean age and standard deviation were 35.29 and
8.571, respectively.

One-Sample Test

Test Value = 70
                                                         95% Confidence Interval
                                                         of the Difference
       t         df    Sig. (2-tailed)  Mean Difference  Lower     Upper
Age    -57.271   199   .000             -34.710          -35.91    -33.51

The table reveals that t199 = -57.271, p < .001 for a two-tailed test. Therefore, the mean age of the sample
differs significantly from the test value of 70.
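The SPSS output above can be reproduced from the summary statistics alone. The following minimal Python/SciPy sketch uses only the values in the tables; no raw data are assumed:

import math
from scipy import stats

n, mean, sd = 200, 35.29, 8.571  # from the One-Sample Statistics table
mu0 = 70                         # the test value

se = sd / math.sqrt(n)                          # ≈ .606 (Std. Error Mean)
t_stat = (mean - mu0) / se                      # ≈ -57.271
p_two_tailed = 2 * stats.t.sf(abs(t_stat), df=n - 1)

print(f"t({n - 1}) = {t_stat:.3f}, p = {p_two_tailed:.3g}")

With the raw ages available, scipy.stats.ttest_1samp(ages, popmean=70) gives the same result directly.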
Independent Samples t-test

The Independent Samples t Test is commonly used to test the following:

 Statistical differences between the means of two groups


 Statistical differences between the means of two interventions
 Statistical differences between the means of two change scores

Note: The Independent Samples t Test can only compare the means for two (and only two)
groups. It cannot make comparisons among more than two groups. If you wish to compare the
means across more than two groups, you will likely want to run an ANOVA.

Data Requirements

Your data must meet the following requirements:

1. Dependent variable that is continuous (i.e., interval or ratio level)


2. Independent variable that is categorical and has exactly two categories
3. Cases that have values on both the dependent and independent variables
4. Independent samples/groups (i.e., independence of observations)
 There is no relationship between the participants in each sample. This means that:
 Participants in the first group cannot also be in the second group
 No participant in either group can influence participants in the other group
 No group can influence the other group
 Violation of this assumption will yield an inaccurate p value
5. Random sample of data from the population
6. Normal distribution (approximately) of the dependent variable for each group
 Non-normal population distributions, especially those that are thick-tailed or heavily skewed,
considerably reduce the power of the test
 Among moderate or large samples, a violation of normality may still yield accurate p values
7. Homogeneity of variances (i.e., variances approximately equal across groups)
 When this assumption is violated and the sample sizes for each group differ, the p value is not
trustworthy. However, the Independent Samples t Test output also includes an approximate t statistic
that is not based on assuming equal population variances. This alternative statistic, called the
Welch t Test statistic, may be used when equal variances among populations cannot be assumed. The
Welch t Test is also known as the Unequal Variance t Test or Separate Variances t Test.
8. No outliers
Note: When one or more of the assumptions for the Independent Samples t Test are not met, you
may want to run the nonparametric Mann-Whitney U Test instead.

Researchers often follow several rules of thumb:

 Each group should have at least 6 subjects, ideally more. Inferences for the population will be more tenuous
with too few subjects.
 A balanced design (i.e., same number of subjects in each group) is ideal. Extremely unbalanced designs
increase the possibility that violating any of the requirements/assumptions will threaten the validity of the
Independent Samples t Test.
Hypotheses
The null hypothesis (H0) and alternative hypothesis (H1) of the Independent Samples t Test can be
expressed in two different but equivalent ways:
H0: µ1 = µ2 ("the two population means are equal")
H1: µ1 ≠ µ2 ("the two population means are not equal")

OR

H0: µ1 - µ2 = 0 ("the difference between the two population means is equal to 0")
H1: µ1 - µ2 ≠ 0 ("the difference between the two population means is not 0")
where µ1 and µ2 are the population means for group 1 and group 2, respectively. Notice that the
second set of hypotheses can be derived from the first set by simply subtracting µ2 from both sides
of the equation.
Levene’s Test for Equality of Variances
Recall that the Independent Samples t Test requires the assumption of homogeneity of variance --
i.e., both groups have the same variance. SPSS conveniently includes a test for the homogeneity of
variance, called Levene's Test, whenever you run an independent samples t test.

The hypotheses for Levene’s test are:

H0: σ1² - σ2² = 0 ("the population variances of group 1 and 2 are equal")
H1: σ1² - σ2² ≠ 0 ("the population variances of group 1 and 2 are not equal")

 This implies that if we reject the null hypothesis of Levene's Test, it suggests that the
variances of the two groups are not equal; i.e., that the homogeneity of variances
assumption is violated.

 If Levene’s test indicates that the variances are equal across the two groups (i.e., p-value
large), you will rely on the first row of output, Equal variances assumed, when you look
at the results for the actual Independent Samples t Test (under the heading t-test for
Equality of Means).
 If Levene’s test indicates that the variances are not equal across the two groups (i.e., p-value
small), you will need to rely on the second row of output, Equal variances not assumed,
when you look at the results of the Independent Samples t Test (under the heading t-test for
Equality of Means).
 The difference between these two rows of output lies in the way the independent
samples t test statistic is calculated.
When equal variances are assumed, the calculation uses pooled variances; when equal variances
cannot be assumed, the calculation utilizes un-pooled variances and a correction to the degrees
of freedom.
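As an illustration, here is a minimal Python/SciPy sketch of this workflow. The two score vectors are simulated to resemble the group statistics in the example below; they are not the study's actual data, so the exact statistics will differ:

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Simulated wellbeing scores shaped like the example's group statistics
male = rng.normal(loc=139.61, scale=14.576, size=145)
female = rng.normal(loc=132.87, scale=14.244, size=55)

# Levene's test for homogeneity of variances
lev_stat, lev_p = stats.levene(male, female)

# Pooled-variance t test if variances look equal, Welch t test otherwise
equal_var = lev_p > 0.05
t_stat, p_val = stats.ttest_ind(male, female, equal_var=equal_var)

print(f"Levene p = {lev_p:.3f}, equal variances assumed = {equal_var}")
print(f"t = {t_stat:.3f}, p (2-tailed) = {p_val:.3f}")

Setting equal_var=False selects the Welch (separate variances) version, which applies the degrees-of-freedom correction described above.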

Example:

H0: µ1 = µ2 ("There is no significant difference between wellbeing of male and female")


H1: µ1 ≠ µ2 ("The participants’ wellbeing differed based on their gender”)
=.05/.01

Group Statistics

        Sex      N     Mean     Std. Deviation   Std. Error Mean
Total   Male     145   139.61   14.576           1.210
        Female   55    132.87   14.244           1.921

Independent Samples Test

                                   Levene's Test for
                                   Equality of Variances         t-test for Equality of Means
                                                                                                    95% Confidence Interval
                                                                                                    of the Difference
                                   F      Sig.    t      df      Sig.        Mean        Std. Error  Lower   Upper
                                                                 (2-tailed)  Difference  Difference
Total  Equal variances assumed     .728   .394    2.938  198     .004        6.741       2.294       2.217   11.265
       Equal variances not                        2.969  99.529  .004        6.741       2.270       2.237   11.245
       assumed

Levene's Test for Equality of Variances: This section has the test results for Levene's Test.
From left to right:
 The p-value of Levene's test is .394 (p > .05), so we fail to reject the null hypothesis of
Levene's test.
 This tells us that we should look at the "Equal variances assumed" row for
the t test (and corresponding confidence interval) results.
 (If this test result had been significant -- that is, if we had observed p < α -- then we
would have used the "Equal variances not assumed" output.)

t-test for Equality of Means provides the results for the actual Independent Samples t Test.
From left to right:
 t is the computed test statistic, using the formula for the equal-variances-assumed test statistic (first
row of table) or the formula for the equal-variances-not-assumed test statistic (second row of table)
 df is the degrees of freedom, using the equal-variances-assumed degrees of freedom formula (first
row of table) or the equal-variances-not-assumed degrees of freedom formula (second row of table)
 Sig (2-tailed) is the p-value corresponding to the given test statistic and degrees of freedom
 Mean Difference is the difference between the sample means, i.e., x̅1 − x̅2.
 Std. Error Difference is the standard error of the mean difference estimate; it also corresponds to the
denominator of the test statistic for that test
Confidence Interval of the Difference: This part of the t-test output complements the
significance test results. Typically, if the CI for the mean difference contains 0 within the interval --
i.e., if the lower boundary of the CI is a negative number and the upper boundary of the CI is a
positive number -- the results are not significant at the chosen significance level.
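As a check, the "Equal variances assumed" interval can be recomputed from the table's own numbers (a sketch using the t critical value for df = 198):

from scipy import stats

mean_diff, se_diff, df = 6.741, 2.294, 198   # from the output table
t_crit = stats.t.ppf(0.975, df)              # ≈ 1.972 for a 95% CI

lower = mean_diff - t_crit * se_diff         # ≈ 2.217
upper = mean_diff + t_crit * se_diff         # ≈ 11.265
print(lower, upper)  # the interval excludes 0, so the difference is significant at .05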
DECISION AND CONCLUSIONS
Since p = .004 is less than our chosen significance level α = 0.05, we can reject the null hypothesis
and conclude that mean wellbeing differed based on the participants' gender.

Based on the results, we can state the following:

 There was a significant difference in mean wellbeing score between males and females (t198 = 2.938, p < .01,
two-tailed test)
 The average wellbeing score for males (Mean = 139.61; SD = 14.576) was significantly higher than the average
wellbeing score for females (Mean = 132.87; SD = 14.244).

Sex      N     M        SD       MD      t-value   df    p-value

Male     145   139.61   14.576   6.741   2.938*    198   .004
Female   55    132.87   14.244

*t198 = 2.938, p < .01, two-tailed test

p=; H0 Rejected

p>; H0 Accepted

p<; H0 Rejected
