
Correlation is a statistical measure that expresses the extent to which two variables are linearly related (meaning they change together at a constant rate). It is a common tool for describing simple relationships without making a statement about cause and effect.

Correlation means association; more precisely, it is a measure of the extent to which two variables are related. There are three possible results of a correlational study: a positive correlation, a negative correlation, and no correlation.

 A positive correlation is a relationship between two variables in which both variables move in the same direction: as one variable increases, the other increases, and as one decreases, the other decreases. An example of positive correlation is height and weight; taller people tend to be heavier.

 A negative correlation is a relationship between two variables in which an increase in one variable is associated with a decrease in the other. An example of negative correlation is height above sea level and temperature: as you climb a mountain (increase in height), it gets colder (decrease in temperature).

 A zero correlation exists when there is no relationship between two variables. For example, there is no relationship between the amount of tea drunk and the level of intelligence.
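A minimal sketch of these three situations (in Python, assuming NumPy and SciPy are available; the data are simulated only to illustrate the direction of each relationship, they are not real measurements):

```python
# Illustrative sketch: Pearson's r for positive, negative, and near-zero relationships.
# All data below are simulated, purely for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

height = rng.normal(170, 10, 100)                             # cm
weight = 0.9 * height + rng.normal(0, 6, 100)                 # tends to rise with height
altitude = rng.uniform(0, 3000, 100)                          # metres above sea level
temperature = 20 - 0.0065 * altitude + rng.normal(0, 1, 100)  # falls with altitude
tea_cups = rng.poisson(3, 100)                                # unrelated to IQ
iq = rng.normal(100, 15, 100)

for name, (x, y) in {
    "height vs weight (positive)": (height, weight),
    "altitude vs temperature (negative)": (altitude, temperature),
    "tea drunk vs intelligence (near zero)": (tea_cups, iq),
}.items():
    r, _ = stats.pearsonr(x, y)   # correlation coefficient between -1 and +1
    print(f"{name}: r = {r:+.2f}")
```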

Some uses of Correlations

Prediction
 If there is a relationship between two variables, we can make predictions about one from the other.

Validity
 Concurrent validity (correlation between a new measure and an established measure).
 Predictive validity (correlation between a measure and a later outcome it is meant to predict).

Reliability
 Test-retest reliability (are measures consistent over time?).
 Inter-rater reliability (are observers consistent with one another?).

Theory verification
 Testing whether two variables that a theory predicts to be related are in fact correlated.

…..

Basic Inferential Statistics: Theory and Application


The heart of statistics is inferential statistics. Descriptive statistics are typically
straightforward and easy to interpret. Unlike descriptive statistics, inferential statistics
are often complex and may have several different interpretations.
The goal of inferential statistics is to discover some property or general pattern about a
large group by studying a smaller group of people in the hopes that the results will
generalize to the larger group. For example, we may ask residents of New York City
their opinion about their mayor. We would probably poll a few thousand individuals in
New York City in an attempt to find out how the city as a whole views their mayor. The
following section examines how this is done.

Importance of Inferential Statistics

The foundation of Data Science and Machine Learning algorithms is Mathematics and Statistics, and within Statistics we use two types of statistical methods: Descriptive and Inferential Statistics.

Inferential Statistics allows you to make decisions based on extrapolation. In that respect it is fundamentally different from Descriptive Statistics, which only summarize the measured data.

There are numerous types of Inferential Statistics, and each is suitable for a specific research design and sample characteristics.

Inferential Statistics uses random samples for testing and, hence, allows us to have confidence that the sample represents the population.

Here are some reasons why Inferential Statistics is important:

1. Infer from sample to population: You can use Inferential Statistics to draw inferences about a population from a sample. Inferential Statistics aims to draw conclusions from the sample and generalize them to the population data. It assesses whether the selected sample is statistically representative of the whole population, and it uses measurements from the sample of subjects in an experiment to compare the treatment groups and generalize to the larger population of subjects.
2. Compare models: It can compare two models to find which one is more statistically significant than the other.
3. Estimation: It frequently involves estimating the characteristics of a population from a sample of that population, and hypothesis testing (i.e., finding evidence for or against an explanation or theory).
4. Population characteristics conclusion: It is the tool that statisticians use to draw conclusions about the characteristics of a population from the characteristics of a sample, and to decide how certain they can be of the reliability of those conclusions.
Inferential statistics are an extension of the natural human tendency
toward inference. They are powerful tools that can help answer
questions such as how much, how many, or how often. An
understanding of the process of statistics can help us be better
consumers of research, prevent us from being misled by invalid or
misinterpreted statistics, and give us another tool in the search for
knowledge.

A sample is a subset of the population. Just as you may sample different types of ice cream at the grocery store, a sample of a population should be a smaller version of that population.

It is extremely important to understand how the sample being studied was drawn from the population. The sample should be as representative of the population as possible. There are several valid ways of creating a sample from a population, but inferential statistics works best when the sample is drawn at random. Given a large enough sample, drawing at random ensures a fair and representative sample of the population, as the short sketch below illustrates.
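A minimal sketch (Python with NumPy; the "population" here is simulated, so the numbers are purely illustrative) of how a random sample's mean tracks the population mean as the sample grows:

```python
# Illustrative sketch: a random sample's mean approximates the population mean.
import numpy as np

rng = np.random.default_rng(42)
population = rng.normal(loc=50, scale=12, size=1_000_000)  # simulated population

for n in (10, 100, 10_000):
    sample = rng.choice(population, size=n, replace=False)  # simple random sample
    print(f"n = {n:>6}: sample mean = {sample.mean():.2f} "
          f"(population mean = {population.mean():.2f})")
```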
COMPARING TWO OR MORE GROUPS
Much of statistics, especially in medicine and psychology, is used to compare two or
more groups and attempts to figure out if the two groups are different from one another.
Example: Drug X
Let us say that a drug company has developed a pill which they think reduces the recovery time from the common cold. How would they actually find out whether the pill works? They might take two groups of people from the same population (say, people from a small town in Indiana who had just caught a cold), administer the pill to one group, and give the other group a placebo. They could then measure how many days each group took to recover (typically by calculating the mean of each group). Let's say that the mean recovery time for the group with the new drug was 5.4 days, and the mean recovery time for the group with the placebo was 5.8 days.
The question becomes: is this difference due to random chance, or does taking the pill actually help you recover from the cold faster? The means of the two groups alone do not help us answer this question. We need additional information.
SAMPLE SIZE
If our example study only consisted of two people (one from the drug group and one
from the placebo group) there would be so few participants that we would not have
much confidence that there is a difference between the two groups. That is to say, there
is a high probability that chance explains our results (any number of explanations might
account for this, for example, one person might be younger, and thus have a better
immune system). However, if our sample consisted of 1,000 people in each group, then
the results become much more robust (while it might be easy to say that one person is
younger than another, it is hard to say that 1,000 random people are younger than
another 1,000 random people). If the sample is drawn at random from the population,
then these 'random' variations in participants should be approximately equal in the two
groups, given that the two groups are large. This is why inferential statistics works best
when there are lots of people involved.
P-VALUES
In classic inferential statistics, we make two hypotheses before we start our study, the
null hypothesis, and the alternative hypothesis.
Null Hypothesis: States that the two groups we are studying are the same.
Alternative Hypothesis: States that the two groups we are studying are different.
The goal in classic inferential statistics is to find evidence against the null hypothesis. The logic says that if the two groups are not the same, then they must be different. A low p-value indicates that data as extreme as those observed would be unlikely if the null hypothesis were true (thus providing evidence for the alternative hypothesis).
Remember: it is good to have low p-values.
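A rough sketch of how such a comparison might be run for the Drug X example (Python with SciPy; the recovery times are simulated so the group means land near 5.4 and 5.8 days, they are not data from the example):

```python
# Illustrative sketch of the Drug X comparison: an independent two-sample t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
drug = rng.normal(loc=5.4, scale=1.0, size=1000)     # recovery days, drug group (simulated)
placebo = rng.normal(loc=5.8, scale=1.0, size=1000)  # recovery days, placebo group (simulated)

t_stat, p_value = stats.ttest_ind(drug, placebo)
print(f"mean(drug) = {drug.mean():.2f}, mean(placebo) = {placebo.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value (e.g. below 0.05) counts as evidence against the null hypothesis
# that the two groups have the same mean recovery time.
```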
Airline Example

To better clarify the process associated with many statistical inferences, consider the data in Table 1 (R Core Team, 2016).

Table 1. Airline Passenger Data

1959   1960   Difference
 360    417       57
 342    391       49
 406    419       13
 396    461       65
 420    472       52
 472    535       63
 548    622       74
 559    606       47
 463    508       45
 407    461       54
 362    390       28
 405    432       27

This is the number of passengers that flew each month on a certain airline in 1959 and 1960, as well as the differences between the two. A researcher may want to know if there was a difference in passengers between the two years. This researcher would first need to state the null and alternative hypotheses and set the alpha level (the threshold the p-value must fall below before we reject the null hypothesis).

H0: The average of 1959 = the average of 1960 (i.e., the difference = 0).

Ha: The average of 1959 ≠ the average of 1960 (i.e., the difference ≠ 0).

α = 0.05

In other words, the researcher assumes the two years are the same but will have enough evidence to support that they are different if the p-value is less than 0.05.

The differences between the two years are given in the third column. Our first step is to find the mean of this column by adding all of the values and dividing by the number of data points. This gives a mean of 47.83. This sample mean is a point estimate, or an approximation, of the true difference. We know this kind of data follows a certain pattern (Figure 1), called a normal distribution.
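The same calculation can be sketched directly from the differences in Table 1 with a one-sample t-test against zero (Python with SciPy; the only inputs are the differences printed in the table, and the alpha of 0.05 follows the example above):

```python
# Sketch: test whether the mean 1959-to-1960 difference in Table 1 is zero.
import numpy as np
from scipy import stats

differences = np.array([57, 49, 13, 65, 52, 63, 74, 47, 45, 54, 28, 27])

print(f"point estimate (mean difference) = {differences.mean():.2f}")  # about 47.83

# One-sample t-test of the differences against 0 (equivalent to a paired t-test
# on the 1959 and 1960 columns).
t_stat, p_value = stats.ttest_1samp(differences, popmean=0)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the average monthly passenger count differs between the years.")
else:
    print("Fail to reject H0 at alpha = 0.05.")
```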

References

Brase, C. H., & Brase, C. P. (2016). Understanding basic statistics (7th ed.). Boston, MA: Cengage Learning.

Moses, L. E. (1986). Think and explain with statistics. Addison-Wesley Publishing Company.

Panik, M. J. (2012). Statistical inference: A short course. Hoboken, NJ: John Wiley & Sons.

R Core Team (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Retrieved from https://www.R-project.org/

Vaughan, S. (2013). Scientific inference: Learning from data. Cambridge, United Kingdom: Cambridge University Press.

Keywords: inference, inferential statistics, null hypothesis significance testing
Example: Inferential statistics. You randomly select a sample of 11th graders in your state and collect data on their SAT scores and other characteristics. You can use inferential statistics to make estimates and test hypotheses about the whole population of 11th graders in the state based on your sample data.

Example: Point estimate and confidence interval. You want to know the average number of paid vacation days that employees at an international company receive. After collecting survey responses from a random sample, you calculate a point estimate and a confidence interval.

Your point estimate of the population mean paid vacation days is the sample mean of 19 paid vacation days.

With random sampling, a 95% confidence interval of [16, 22] means you can be reasonably confident that the average number of vacation days is between 16 and 22.
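A minimal sketch of how such a point estimate and 95% confidence interval might be computed (Python with SciPy; the survey responses are simulated, so only the idea, not the exact numbers, matches the example):

```python
# Sketch: point estimate and 95% confidence interval for mean paid vacation days.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
vacation_days = rng.normal(loc=19, scale=6, size=35).round()  # hypothetical survey sample

mean = vacation_days.mean()                      # point estimate of the population mean
sem = stats.sem(vacation_days)                   # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, len(vacation_days) - 1, loc=mean, scale=sem)

print(f"point estimate = {mean:.1f} days")
print(f"95% CI = [{ci_low:.1f}, {ci_high:.1f}]")
```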
…..

Hypothesis testing
Hypothesis testing is a formal process of statistical analysis using inferential statistics.
The goal of hypothesis testing is to compare populations or assess relationships
between variables using samples.

Hypotheses, or predictions, are tested using statistical tests. Statistical tests also estimate sampling error so that valid inferences can be made.

Statistical tests can be parametric or non-parametric. Parametric tests are considered more statistically powerful because they are more likely to detect an effect if one exists.

Null hypothesis testing relies on a form of deductive reasoning: predictions are derived from hypotheses and then checked against sample data, so its conclusions are probabilistic rather than irrefutable.

Parametric tests make assumptions that include the following:

 the population that the sample comes from follows a normal distribution of scores
 the sample size is large enough to represent the population
 the variances, a measure of spread, of each group being compared are similar

When your data violates any of these assumptions, non-parametric tests are more
suitable. Non-parametric tests are called “distribution-free tests” because they don’t
assume anything about the distribution of the population data.

Statistical tests come in three forms: tests of comparison, correlation or regression.
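As a rough illustration of the parametric versus non-parametric choice, the sketch below (Python with SciPy, on simulated data) checks normality with a Shapiro-Wilk test and falls back to a distribution-free comparison when the check fails; this is one common approach, not the only one:

```python
# Sketch: choose a parametric or non-parametric comparison of two groups
# depending on a normality check. Data are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
group_a = rng.normal(50, 10, 40)
group_b = rng.exponential(scale=50, size=40)   # deliberately skewed

def normal_enough(x, alpha=0.05):
    """Shapiro-Wilk test: True if we fail to reject normality at the given alpha."""
    return stats.shapiro(x).pvalue > alpha

if normal_enough(group_a) and normal_enough(group_b):
    stat, p = stats.ttest_ind(group_a, group_b)      # parametric test
    test = "independent t-test"
else:
    stat, p = stats.mannwhitneyu(group_a, group_b)   # distribution-free test
    test = "Mann-Whitney U"

print(f"{test}: statistic = {stat:.2f}, p = {p:.4f}")
```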

Steps in Hypothesis Testing for Quantitative Research Designs

Hypothesis testing is a four-phase procedure.
Phase I: Research hypotheses, design, and variables.

1. State your research hypotheses.
2. Decide on a research design based on your research problem, your hypotheses, and what you really want to be able to say about your results (e.g., if you want to say that A caused B, you will need an experimental or time-series design; if probable cause is sufficient, a quasi-experimental design would be appropriate).
3. Operationally define your variables. Recall that one variable can have more than one operational definition.

Phase II: Statistical hypotheses.

1. Consider your chosen statistical procedures.
2. Write one statistical null hypothesis for each operational definition of each variable, reflecting the statistical operations to be performed.

Phase III: Hypothesis testing.

Complete the following steps for each statistical null hypothesis.
1. Select a significance level (alpha).
2. Compute the value of the test statistic (e.g., F, r, t).
3. Compare the obtained value of the test statistic with the critical value associated with the selected significance level, or compare the obtained p-value with the pre-selected alpha value.
4. If the obtained value of the test statistic is greater than the critical value (or if the obtained p-value is less than the pre-selected alpha value), reject the null hypothesis. If the obtained value is less than the critical value, fail to reject the null hypothesis. Another way of looking at it: if p is less than or equal to alpha, reject the null hypothesis (see the code sketch at the end of this section).

Phase IV: Decision/interpretation.

1. For each research hypothesis, consider the decisions regarding the statistical null hypotheses.
2. For each research hypothesis, consider qualitative contextual information relating to potential plausibility.
3. Cautiously explain your findings with respect to the research hypotheses.
4. List and discuss the limitations (threats to valid inference).

Note: Null hypothesis testing is currently under scrutiny (see e.g., Cohen, 1994; Kirk,
1996).

It is generally recommended that you report the effect size along with the value
of the test statistic and the p-value. An alternative is to report confidence
intervals.
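The decision rule from Phase III can be written out compactly. A minimal sketch (Python with SciPy; the obtained t value and degrees of freedom below are placeholders, not results from a real analysis):

```python
# Sketch of the Phase III decision rule for a two-tailed t-test.
from scipy import stats

alpha = 0.05
t_obtained = 2.31    # hypothetical test statistic
df = 24              # hypothetical degrees of freedom

t_critical = stats.t.ppf(1 - alpha / 2, df)    # two-tailed critical value
p_value = 2 * stats.t.sf(abs(t_obtained), df)  # two-tailed p-value

print(f"critical value = {t_critical:.3f}, p = {p_value:.4f}")
if p_value <= alpha:                           # equivalently: |t| > t_critical
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```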
The Logic of Hypothesis Testing
As just stated, the logic of hypothesis testing in statistics involves four steps.
1. State the Hypothesis: We state a hypothesis (guess) about a population. Usually the hypothesis concerns the value of a population parameter.
2. Define the Decision Method: We define a method to make a decision about the hypothesis. The method involves sample data.
3. Gather Data: We obtain a random sample from the population.
4. Make a Decision: We compare the sample data with the hypothesis about the population. Usually we compare the value of a statistic computed from the sample data with the hypothesized value of the population parameter.
 If the data are consistent with the hypothesis, we conclude that the hypothesis is reasonable. NOTE: We do not conclude it is right, but reasonable! AND: We actually do this by rejecting the opposite hypothesis (called the NULL hypothesis). More on this later.
 If there is a big discrepancy between the data and the hypothesis, we conclude that the hypothesis was wrong.

We expand on those steps in this section:

First Step: State the Hypothesis

Stating the hypothesis actually involves stating two opposing hypotheses about the value of a population parameter.

Example: Suppose we are interested in the effect of prenatal exposure to alcohol on the birth weight of rats. Also, suppose that we know that the mean birth weight of the population of untreated lab rats is 18 grams.

Here are the two opposing hypotheses:

 The Null Hypothesis (H0). This hypothesis states that the treatment has no effect. For our example, we formally state: the null hypothesis (H0) is that prenatal exposure to alcohol has no effect on the birth weight for the population of lab rats; the birth weight will be equal to 18 grams. This is denoted H0: µ = 18.

 The Alternative Hypothesis (H1). This hypothesis states that the treatment does have an effect. For our example, we formally state: the alternative hypothesis (H1) is that prenatal exposure to alcohol has an effect on the birth weight for the population of lab rats; the birth weight will be different from 18 grams. This is denoted H1: µ ≠ 18.

Second Step: Define the Decision Method

We must define a method that lets us decide whether the sample mean is different from the hypothesized population mean. The method will let us conclude whether the treatment (prenatal alcohol) has an effect on birth weight (reject the null hypothesis) or not (fail to reject the null hypothesis).

We will go into details later.

Third Step: Gather Data

Now we gather data. We do this by obtaining a random sample from the population.

Example: A random sample of rats receives daily doses of alcohol during pregnancy. At birth, we measure the weight of the sample of newborn rats. The weights, in grams, are shown in the table.

We calculate the mean birth weight.

Experiment 1: Sample Mean = 13

Fourth Step: Make a Decision

We make a decision about whether the mean of the sample is consistent with our null hypothesis about the population mean.

 If the data are consistent with the null hypothesis, we conclude that the null hypothesis is reasonable. Formally: we do not reject the null hypothesis.

 If there is a big discrepancy between the data and the null hypothesis, we conclude that the null hypothesis was wrong. Formally: we reject the null hypothesis.

Example: We compare the observed mean birth weight with the hypothesized value, under the null hypothesis, of 18 grams.

 If a sample of rat pups which were exposed to prenatal alcohol has a birth weight "near" 18 grams, we conclude that the treatment does not have an effect. Formally: we do not reject the null hypothesis that prenatal exposure to alcohol has no effect on the birth weight for the population of lab rats.

 If our sample of rat pups has a birth weight "far" from 18 grams, we conclude that the treatment does have an effect. Formally: we reject the null hypothesis that prenatal exposure to alcohol has no effect on the birth weight for the population of lab rats.

For this example, we would probably decide that the observed mean birth weight of 13 grams is "different" from the value of 18 grams hypothesized under the null hypothesis.
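A minimal sketch of the corresponding test (Python with SciPy), using hypothetical weights chosen only to average 13 grams, since the point here is the logic of the decision rather than the original data:

```python
# Sketch of the rat birth-weight decision: one-sample t-test against mu = 18 grams.
# The weights below are hypothetical, chosen to have a mean of 13 grams.
import numpy as np
from scipy import stats

weights = np.array([12, 14, 13, 11, 15, 13, 12, 14])   # hypothetical sample, mean = 13
t_stat, p_value = stats.ttest_1samp(weights, popmean=18)

print(f"sample mean = {weights.mean():.1f} g, t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: prenatal alcohol exposure appears to affect birth weight.")
else:
    print("Fail to reject H0.")
```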
When interpreting an experimental finding, a natural question arises as to whether the finding could
have occurred by chance. Hypothesis testing is a statistical procedure for testing whether chance is
a plausible explanation of an experimental finding.
The logic of hypothesis testing has posed a large problem in research. See articles by Cohen (1994) and Dar, Serlin, and Omer (1993). Unfortunately, since it remains the standard practice of the field, you will need to understand it. Therefore, while suggesting that you consider the use of exploratory data analysis and confidence intervals, I will, nonetheless, continue to explore the logic with you at this time.

We could extensively discuss the probability behind hypothesis testing. However, my major concern is
the LOGIC of hypothesis testing, an unusually fertile source of error.

The purpose of hypothesis testing is to test whether the null hypothesis (there is no difference, no effect) can be rejected or retained. If the null hypothesis is rejected, then the research hypothesis can be accepted. If the null hypothesis is retained, then the research hypothesis is rejected.


…..

Q3

7.2.2 Using and Interpreting Pearson Correlation

First let us have a brief discussion about where and why we use correlation. The discussion falls under the following headings.

i) Prediction

If two variables are known to be related in some systematic way, it is possible to use one variable to make predictions about the other. For example, when a student seeks admission to a college, he is required to submit a great deal of personal information, including his scores in the SSC annual/supplementary examination. The college officials want this information so that they can predict that student's chance of success in college.

ii) Validity

Suppose a researcher develops a new test for measuring intelligence. It is necessary that he show that this new test is valid and truly measures what it claims to measure. One common technique for demonstrating validity is to use correlation.

If the newly constructed test actually measures intelligence, then scores on this test should be related to other already established measures of intelligence – for example, standardized IQ tests, performance on learning tasks, problem-solving ability, and so on. The newly constructed test can be correlated with each of these measures to demonstrate that the new test is valid.

iii) Reliability

Apart from determining validity, correlations are also used to determine reliability. A measurement procedure is reliable if it produces stable and consistent measurements; that is, a reliable measurement procedure will produce the same (or nearly the same) scores when the same individuals are measured under the same conditions. One common way to evaluate reliability is to use correlation to determine the relationship between two sets of scores.

iv) Theory Verification

Many psychological theories make specific predictions about the relationship between two variables. For example, a theory may predict a relationship between brain size and learning ability, or between parent IQ and child IQ. In each case, the prediction of the theory can be tested by determining the correlation between the two variables.

Now let us have a few words on interpreting correlation. When interpreting a correlation, the following considerations should be kept in mind.

Correlation simply describes a relationship between two variables. It does not explain why the two variables are related. That is why a correlation cannot be interpreted as proof of a cause-and-effect relationship between the two variables.

The value of a correlation can be affected by the range of scores represented in the data (a restricted range usually lowers the correlation).

One or two extreme data points, often called outliers, can have a dramatic effect on the value of the correlation.
correlation.
When judging how good a relationship is, it is tempting to focus on the numerical value of the correlation. For example, a correlation of +.5 is halfway between 0 and 1.00 and therefore appears to represent a moderate degree of relationship. Here it should be noted that we cannot interpret a correlation as a proportion. Although a correlation of 1.00 means that there is a 100% perfectly predictable relationship between variables X and Y, a correlation of .5 does not mean that we can make predictions with 50% accuracy. The appropriate way to describe how accurately one variable predicts the other is to square the correlation. Thus a correlation of r = .5 gives r² = (.5)² = .25, or 25% accuracy. (The value r² is called the coefficient of determination because it measures the proportion of variability in one variable that can be determined from its relationship with the other variable.)
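A minimal sketch of computing r and the coefficient of determination (Python with SciPy; the paired scores are hypothetical, for illustration only):

```python
# Sketch: Pearson r and the coefficient of determination r^2 for hypothetical paired scores.
import numpy as np
from scipy import stats

x = np.array([10, 12, 14, 16, 18, 20, 22, 24])   # e.g. admission-test scores (hypothetical)
y = np.array([11, 15, 13, 18, 17, 22, 20, 25])   # e.g. later achievement scores (hypothetical)

r, p = stats.pearsonr(x, y)
print(f"r = {r:.2f}, r^2 = {r**2:.2f}, p = {p:.4f}")
# r^2 is the proportion of variability in y that is predictable from x;
# e.g. r = .5 gives r^2 = .25, i.e. 25% of the variability, as noted above.
```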

….

Q4

Introduction

Analysis of Variance (ANOVA) is a statistical procedure used to test the degree to which two or more groups vary or differ in an experiment. This unit will give you an insight into ANOVA: its logic, one-way ANOVA, and its assumptions and procedure. The F-distribution, interpretation of the F-distribution, and multiple comparison procedures will also be discussed.

Introduction to Analysis of Variance (ANOVA)

The t-tests have one very serious limitation – they are restricted to tests of the significance of the difference between only two groups. There are many times when we would like to see if there are significant differences among three, four, or even more groups. For example, we may want to investigate which of three teaching methods is best for teaching ninth-class algebra. In such cases, we cannot use a t-test because more than two groups are involved. To deal with such cases, one of the most useful techniques in statistics is analysis of variance (abbreviated as ANOVA). This technique was developed by the British statistician Ronald A. Fisher (Dietz & Kalof, 2009; Bartz, 1981).
Analysis of Variance (ANOVA) is a hypothesis-testing procedure that is used to evaluate mean differences between two or more treatments (or populations). Like all other inferential procedures, ANOVA uses sample data as a basis for drawing general conclusions about populations. Sometimes it may appear that ANOVA and the t-test are two different ways of doing exactly the same thing: testing for mean differences. In some cases this is true – both tests use sample data to test hypotheses about population means. However, ANOVA has important advantages over the t-test. t-tests are used when we compare only two groups or conditions (one independent and one dependent variable). ANOVA, on the other hand, is used when there are more than two groups or treatment conditions to compare. Suppose we want to study the effects of three different models of teaching on the achievement of students. In this case we have three different samples to be treated using three different treatments, so ANOVA is the suitable technique to evaluate the difference.

8.1.1 Logic of ANOVA

Let us take the hypothetical data given in the table below.

Table 8.1
Hypothetical data from an experiment examining learning performance under three temperature conditions

Treatment 1 (50°)   Treatment 2 (70°)   Treatment 3 (90°)
    Sample 1            Sample 2            Sample 3
                           4                   1
                           3                   2
       3                   6                   2
       1                   3                   0
       0                   4                   0
There are three separate samples, with n = 5 in each sample. The dependent variable is the number of problems solved correctly.

These data represent the results of an independent-measures experiment comparing learning performance under three temperature conditions. The scores are variable, and we want to measure the amount of variability (i.e., the size of the differences) to explain where it comes from. To measure the total variability, we combine all the scores from all the separate samples into one group and then obtain one general measure of variability for the complete experiment. Once we have measured the total variability, we can begin to break it into separate components. The word analysis means breaking something into smaller parts. Because we are going to analyze the variability, the process is called analysis of variance (ANOVA). This analysis process divides the total variability into two basic components:

i) Between-Treatment Variance

Variance simply means difference, and calculating the variance is a process of measuring how big the differences are for a set of numbers. The between-treatment variance measures how much difference exists between the treatment conditions. Beyond simply measuring these differences, the overall goal of ANOVA is to evaluate the differences between treatments. Specifically, the purpose of the analysis is to distinguish between two alternative explanations:

The differences between the treatments have been caused by the treatment effects.

The differences between the treatments are simply due to chance.

Thus, there are always two possible explanations for the variance (difference) that exists between treatments:

Treatment Effect: The differences are caused by the treatments. For the data in Table 8.1, the scores in Sample 1 are obtained at a room temperature of 50° and those of Sample 2 at 70°. It is possible that the difference between samples is caused by the difference in room temperature.

Chance: The differences are simply due to chance. If there is no treatment effect, we can still expect some difference between samples. Chance differences are unplanned and unpredictable differences that are not caused or explained by any action of the researcher. Researchers commonly identify two primary sources of chance differences.

Individual Differences

Each participant in the study has his or her own individual characteristics. Although it is reasonable to expect that different subjects will produce different scores, it is impossible to predict exactly what the difference will be.

Experimental Error

In any measurement there is a chance of some degree of error. Thus, if a researcher measures the same individuals twice under the same conditions, there is a good possibility of obtaining two different measurements. Often these differences are unplanned and unpredictable, so they are attributed to chance.

Thus, when we calculate the between-treatment variance, we are measuring differences that could be due either to a treatment effect or simply to chance. In order to demonstrate that a difference is really a treatment effect, we must establish that the differences between treatments are bigger than would be expected by chance alone. To accomplish this goal, we determine how big the differences are when there is no treatment effect involved; that is, we measure how much difference (variance) occurs by chance. To measure chance differences, we compute the variance within treatments.

ii) Within-Treatment Variance

Within each treatment condition, we have a set of individuals who are all treated exactly the same, and the researcher does not do anything that would cause these individual participants to have different scores. For example, in Table 8.1 the data show that five individuals were treated at a 70° room temperature. Although these five participants were all treated exactly the same, their scores are different. Why are the scores different? The plain answer is that the differences are due to chance. Figure 8.1 shows the overall analysis of variance and identifies the sources of variability that are measured by each of the two basic components.

One-Way ANOVA (Logic and Procedure)

The one-way analysis of variance (ANOVA) is an extension of the independent two-sample t-test. It is a statistical technique by which we can test whether three or more means are equal. It tests whether the value of a single variable differs significantly among three or more levels of a factor. We can also say that one-way ANOVA is a procedure for testing the hypothesis that K population means are equal, where K ≥ 2. It compares the means of the samples or groups in order to make inferences about the population means. Specifically, it tests the null hypothesis:

H0: µ1 = µ2 = µ3 = ... = µk

where µ = group mean and k = number of groups.

If one-way ANOVA yields a statistically significant result, we accept the alternative hypothesis (HA), which states that at least two group means are statistically significantly different from each other. Here it should be kept in mind that one-way ANOVA cannot tell which specific groups were significantly different from each other; to determine which specific groups differ, a researcher has to use a post hoc test.

As there is only one independent variable or factor in one-way ANOVA, it is also called single-factor ANOVA. The independent variable has nominal levels or a few ordinal levels. Also, there is only one dependent variable, and hypotheses are formulated about the means of the groups on the dependent variable. The dependent variable differentiates individuals on some quantitative dimension.
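A minimal sketch of a one-way ANOVA (Python with SciPy), using three small hypothetical treatment groups of the kind shown in Table 8.1:

```python
# Sketch: one-way ANOVA comparing three hypothetical treatment groups.
import numpy as np
from scipy import stats

treatment_1 = np.array([0, 1, 3, 1, 0])   # hypothetical scores, condition 1
treatment_2 = np.array([4, 3, 6, 3, 4])   # hypothetical scores, condition 2
treatment_3 = np.array([1, 2, 2, 0, 0])   # hypothetical scores, condition 3

f_stat, p_value = stats.f_oneway(treatment_1, treatment_2, treatment_3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A significant F says only that at least two group means differ; a post hoc test
# (e.g. Tukey's HSD) is needed to identify which specific groups differ.
```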

3.2 Logic Behind One-Way ANOVA

In order to test whether pairs of sample means differ by more than would be expected by chance, we might conduct a series of t-tests on the K sample means – however, this approach has a major problem.

When we use a t-test once, there is a chance of a Type I error. The magnitude of this error is usually 5%. By running two tests on the same data we increase the chance of making an error to roughly 10%; for a third test, it rises to roughly 15%, and so on. These are unacceptable error rates. The number of t-tests needed to compare all possible pairs of means would be:

K(K − 1) / 2, where K = number of means

When more than one t-test is run, each at a specific level of significance such as α = .05, the probability of making one or more Type I errors in the series of t-tests is greater than α. The inflated Type I error rate is determined as:

1 − (1 − α)^c

where α = the level of significance for each separate t-test and c = the number of independent t-tests.

An ANOVA controls for these errors so that the Type I error rate remains at 5% and a researcher can become more confident about the results.
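A quick calculation makes the inflation problem concrete (a sketch; K = 4 and α = .05 are just example values):

```python
# Sketch: number of pairwise t-tests and the familywise Type I error rate.
alpha = 0.05
K = 4                                   # example: four group means

n_tests = K * (K - 1) // 2              # number of pairwise comparisons, K(K-1)/2
familywise_error = 1 - (1 - alpha) ** n_tests   # 1 - (1 - alpha)^c with c = n_tests

print(f"pairwise t-tests needed: {n_tests}")                    # 6
print(f"familywise Type I error rate: {familywise_error:.3f}")  # about 0.265
```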

8.7 Bibliography

Dietz, T., & Kalof, L. (2009). Introduction to Social Statistics. UK: Wiley-Blackwell.

Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to Design and Evaluate Research in Education (8th ed.). New York: McGraw-Hill.

Pallant, J. (2005). SPSS Survival Manual – A Step by Step Guide to Data Analysis Using SPSS for Windows (Version 12). Australia: Allen & Unwin.

…..

Q5

Introduction

The chi-square (χ²) statistic is commonly used for testing relationships between categorical variables. It is intended to test how likely it is that an observed difference is due to chance. In many situations it can be used as a quick test of significance.

9.1 The Chi-Square Distribution

The chi-square (χ²) distribution is a special case of the gamma distribution (the gamma distribution is a family of right-skewed, continuous probability distributions; these distributions are useful in real life where something has a natural minimum of 0). A chi-square distribution with n degrees of freedom is equal to a gamma distribution with shape a = n/2 and rate b = 0.5 (or scale β = 2).

Chi-Square Distribution

A chi-square distribution is a continuous distribution with k degrees of freedom. It is used to describe the distribution of a sum of squared random variables. It is also used to test the goodness of fit of a distribution of data, to test whether data series are independent, and to estimate confidence intervals surrounding the variance and standard deviation of a random variable from a normal distribution. Additionally, the chi-square distribution is a special case of the gamma distribution.

Let us consider a random sample taken from a normal distribution. The chi-square distribution is the distribution of the sum of these squared random samples. The degrees of freedom (say k) are equal to the number of samples being summed. For example, if 10 samples are taken from the standard normal distribution, then the degrees of freedom are df = 10. Chi-square distributions are always right-skewed. The greater the degrees of freedom, the more the chi-square distribution looks like a normal distribution.
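A small simulation illustrates this construction (Python with NumPy; the sample size and seed are arbitrary):

```python
# Sketch: a chi-square variable with k degrees of freedom arises as the sum of k
# squared standard normal variables. The simulation below is purely illustrative.
import numpy as np

rng = np.random.default_rng(5)
k = 10                                               # degrees of freedom
sums = (rng.standard_normal((100_000, k)) ** 2).sum(axis=1)

# For a chi-square distribution with k degrees of freedom, the mean is k and the
# variance is 2k; the simulated values should land close to those figures.
print(f"simulated mean = {sums.mean():.2f} (theory: {k})")
print(f"simulated variance = {sums.var():.2f} (theory: {2 * k})")
```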

9.1.1 Uses of Chi-Square (χ²) Distribution

The chi-square distribution has many uses, which include:

i) Confidence interval estimation for a population standard deviation of a normal distribution from a sample standard deviation.
ii) Independence of two criteria of classification of qualitative variables (contingency tables).
iii) Relationships between categorical variables.
iv) Sample variance study when the underlying distribution is normal.
v) Tests of deviations of differences between expected and observed frequencies (one-way tables).
vi) The chi-square test (a goodness-of-fit test).

Meaning of Chi-Square Test

The chi-square (χ²) test represents a useful method of comparing experimentally obtained results with those to be expected theoretically on some hypothesis. Thus chi-square is a measure of the actual divergence of the observed frequencies from the expected frequencies. It is very obvious that the importance of such a measure would be very great in sampling studies, where we invariably have to study the divergence between theory and fact.

Chi-square, as we have seen, is a measure of divergence between the expected and observed frequencies; as such, if there is no difference between expected and observed frequencies, the value of chi-square is 0.

If there is a difference between the observed and the expected frequencies, then the value of chi-square will be more than 0. That is, the larger the chi-square, the greater the probability of a real divergence of the experimentally observed results from the expected results.

If the calculated value of chi-square is very small compared with its table value, it indicates that the divergence between actual and expected frequencies is very small and consequently the fit is good. If, on the other hand, the calculated value of chi-square is very large compared with its table value, it indicates that the divergence between expected and observed frequencies is very great and consequently the fit is poor.

To evaluate chi-square, we enter Table E with the computed value of chi-square and the appropriate number of degrees of freedom. The number of degrees of freedom is df = (r – 1)(c – 1), in which r is the number of rows and c the number of columns in which the data are tabulated.

Using the Chi-Square Statistic in Research

The chi-square statistic is commonly used for testing relationships between categorical variables. The null hypothesis of the chi-square test is that no relationship exists between the categorical variables in the population; they are independent. An example research question that could be answered using a chi-square analysis would be: Is there a significant relationship between voter intent and political party membership?

How does the Chi-Square statistic work?

The chi-square statistic is most commonly used to evaluate tests of independence when using a crosstabulation (also known as a bivariate table). A crosstabulation presents the distributions of two categorical variables simultaneously, with the intersections of the categories of the variables appearing in the cells of the table. The test of independence assesses whether an association exists between the two variables by comparing the observed pattern of responses in the cells to the pattern that would be expected if the variables were truly independent of each other. Calculating the chi-square statistic and comparing it against a critical value from the chi-square distribution allows the researcher to assess whether the observed cell counts are significantly different from the expected cell counts.
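A minimal sketch of a test of independence (Python with SciPy; the 2×2 table of counts is hypothetical, e.g. group membership by opinion):

```python
# Sketch: chi-square test of independence on a hypothetical 2x2 crosstabulation.
import numpy as np
from scipy import stats

observed = np.array([[30, 70],    # rows: group A / group B (hypothetical counts)
                     [55, 45]])   # columns: oppose / support

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")   # df = (2-1)(2-1) = 1
print("expected counts if independent:\n", expected.round(1))
# If p < .05, we conclude the two categorical variables are not independent.
```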

What are special concerns with regard to the Chi-Square statistic?

There are a number of important considerations when using the chi-square statistic to evaluate a crosstabulation. Because of how the chi-square value is calculated, it is extremely sensitive to sample size – when the sample size is large (roughly 500 or more), almost any small difference will appear statistically significant. It is also sensitive to the distribution within the cells, and SPSS gives a warning message if cells have expected counts of fewer than 5 cases. This can be addressed by using categorical variables with a limited number of categories (e.g., by combining categories if necessary to produce a smaller table).

How is the Chi-Square statistic run in SPSS and how is the output interpreted?

The chi-square statistic appears as an option when requesting a crosstabulation in SPSS. The output is labeled Chi-Square Tests; the chi-square statistic used in the test of independence is labeled Pearson Chi-Square. This statistic can be evaluated by comparing its value against a critical value from a chi-square distribution (where the degrees of freedom are calculated as (# of rows – 1) × (# of columns – 1)), but it is easier simply to examine the p-value provided by SPSS. To make a conclusion about the hypothesis with 95% confidence, the value labeled Asymp. Sig. (which is the p-value of the chi-square statistic) should be less than .05 (which is the alpha level associated with a 95% confidence level).

Is the p-value (labeled Asymp. Sig.) less than .05? If so, we can conclude that the variables are not independent of each other and that there is a statistical relationship between the categorical variables.

In this example, there is an association between fundamentalism and views on teaching sex education in public schools. While 17.2% of fundamentalists oppose teaching sex education, only 6.5% of liberals are opposed. The p-value indicates that these variables are not independent of each other and that there is a statistically significant relationship between the categorical variables.

Bibliography

Agresti, A., & Finlay, B. (1997). Statistical Methods for the Social Sciences (3rd ed.). Prentice Hall.

Anderson, T. W., & Sclove, S. L. (1974). Introductory Statistical Analysis. Houghton Mifflin Company.

Bartz, A. E. (1981). Basic Statistical Concepts (2nd ed.). Minnesota: Burgess Publishing Company.

Dietz, T., & Kalof, L. (2009). Introduction to Social Statistics. UK: Wiley-Blackwell.

Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to Design and Evaluate Research in Education (8th ed.). New York: McGraw-Hill.

Gay, L. R., Mills, G. E., & Airasian, P. W. (2010). Educational Research: Competencies for Analysis and Application (10th ed.). New York: Pearson.

Gravetter, F. J., & Wallnau, L. B. (2002). Essentials of Statistics for the Behavioral Sciences (4th ed.). California: Wadsworth.

Pallant, J. (2005). SPSS Survival Manual – A Step by Step Guide to Data Analysis Using SPSS for Windows (Version 12). Australia: Allen & Unwin.
