
Inferential Statistics

Inferential statistics are often used to compare the differences
between the treatment groups. Inferential statistics use measurements
from the sample of subjects in the experiment to compare the
treatment groups and make generalizations about the larger
population of subjects.
There are many types of inferential statistics and each is appropriate
for a specific research design and sample characteristics.
Researchers should consult the numerous texts on experimental
design and statistics to find the right statistical test for their
experiment. However, most inferential statistics are based on the
principle that a test-statistic value is calculated on the basis of a
particular formula. That value along with the degrees of freedom, a
measure related to the sample size, and the rejection criteria are used
to determine whether differences exist between the treatment groups.
The larger the sample size, the more likely a statistic is to detect
differences between the treatment groups when such differences truly exist. Thus, the larger the
sample of subjects, the more powerful the statistic is said to be.
Virtually all inferential statistics have an important underlying
assumption. Each replication in a condition is assumed to be
independent. That is, each value in a condition is thought to be
unrelated to any other value in the sample. This assumption of
independence can create a number of challenges for animal behavior
researchers.
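
To illustrate how the test statistic, the degrees of freedom, and the sample size relate, here is a minimal sketch. It assumes two equally sized groups with a common standard deviation and uses the pooled two-group t statistic as one example of a "particular formula". The function name and the numbers are hypothetical and chosen only for illustration.

```python
import math
from typing import Tuple


def two_group_t(mean_diff: float, sd: float, n_per_group: int) -> Tuple[float, int]:
    """Pooled two-group t statistic, assuming equal group sizes and a common SD."""
    # Standard error of the difference between two group means.
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    t_value = mean_diff / standard_error
    # Degrees of freedom for the pooled two-group case.
    degrees_of_freedom = 2 * n_per_group - 2
    return t_value, degrees_of_freedom


# Same observed difference and spread, increasing sample size (hypothetical values).
for n in (5, 20, 80):
    t_value, df = two_group_t(mean_diff=1.0, sd=2.0, n_per_group=n)
    print(f"n per group = {n:3d}  df = {df:3d}  t = {t_value:.2f}")
```

With the difference and spread held fixed, the t value grows as the groups get larger, which is why larger samples are more likely to meet the rejection criterion.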

Statistical inference is the process of using data analysis to infer properties of an
underlying distribution of probability.[1] Inferential statistical analysis infers properties of
a population, for example by testing hypotheses and deriving estimates. It is assumed
that the observed data set is sampled from a larger population.
Inferential statistics can be contrasted with descriptive statistics. Descriptive statistics is
solely concerned with properties of the observed data, and it does not rest on the
assumption that the data come from a larger population. In machine learning, the
term inference is sometimes used instead to mean "make a prediction, by evaluating an
already trained model";[2] in this context inferring properties of the model is referred to
as training or learning (rather than inference), and using a model for prediction is
referred to as inference (instead of prediction); see also predictive inference.

Introduction
Statistical inference makes propositions about a population, using data drawn from the population
with some form of sampling. Given a hypothesis about a population, for which we wish to draw
inferences, statistical inference consists of (first) selecting a statistical model of the process that
generates the data and (second) deducing propositions from the model.
Konishi & Kitagawa state, "The majority of the problems in statistical inference can be considered to
be problems related to statistical modeling".[3] Relatedly, Sir David Cox has said, "How [the]
translation from subject-matter problem to statistical model is done is often the most critical part of
an analysis".[4]
The conclusion of a statistical inference is a statistical proposition.[5] Some common forms of
statistical proposition are the following:

• a point estimate, i.e. a particular value that best approximates some parameter of
  interest;
• an interval estimate, e.g. a confidence interval (or set estimate), i.e. an interval
  constructed using a dataset drawn from a population so that, under repeated sampling of
  such datasets, such intervals would contain the true parameter value with
  the probability at the stated confidence level;
• a credible interval, i.e. a set of values containing, for example, 95% of posterior belief;
• rejection of a hypothesis;[note 1]
• clustering or classification of data points into groups.
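
The first two forms of proposition in this list, a point estimate and a confidence interval, can be sketched for a population mean as follows. This is an illustration only: it assumes a simple random sample and uses a normal approximation for the interval (for small samples a t quantile would normally replace the normal one). The function name and data are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_confidence_interval(
    values: Sequence[float], confidence: float = 0.95
) -> Tuple[float, Tuple[float, float]]:
    """Point estimate and normal-approximation confidence interval for a mean."""
    n = len(values)
    point_estimate = statistics.mean(values)
    # Standard error of the mean, using the sample standard deviation.
    standard_error = statistics.stdev(values) / math.sqrt(n)
    # Two-sided normal quantile, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * standard_error
    return point_estimate, (point_estimate - margin, point_estimate + margin)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
estimate, (low, high) = mean_confidence_interval(sample)
print(f"point estimate = {estimate:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```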

Inferential Statistics

When you have quantitative data, you can analyze it using either descriptive or
inferential statistics. Descriptive statistics do exactly what the name suggests: they describe
the data. Descriptive statistics include measures of central tendency (mean, median,
mode), measures of variation (standard deviation, variance), and relative position
(quartiles, percentiles). There are times, however, when you want to draw conclusions
about the data. This may include making comparisons across time, comparing different
groups, or trying to make predictions based on data that has been collected. Inferential
statistics are used when you want to move beyond simple description or
characterization of your data and draw conclusions based on your data. There are
several kinds of inferential statistics that you can calculate; here are a few of the more
common types:

t-tests
A t-test is a statistical test that can be used to compare means. There are three basic
types of t-tests: one-sample t-test, independent-samples t-test, and dependent-samples
(or paired-samples) t-test. For all t-tests, you are simply looking at the difference
between the means and dividing that difference by some measure of variation.
One-sample t-test 
A one-sample t-test can be used to compare your data to the mean of some known
population.

- Example: Suppose you are interested in knowing whether students who are
utilizing the Career Services office are generally the students with higher GPAs. You
would take the mean GPA of the students who use Career Services and compare it to
the mean GPA of all students at the institution, taken from the registrar’s records.

- Thus, use a one-sample t-test when:

  • You have one data set or one mean that you are interested in
  • You know the mean of the population (the entire population, not a
    sample!) you wish to compare your mean to
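
As a rough sketch of the calculation behind a one-sample t-test, the function below divides the difference between the sample mean and the known population mean by the standard error of the sample mean. The function name, the GPA values, and the population mean of 3.0 are hypothetical. Turning the t value into a p-value needs a t distribution table or a statistics package, which is not shown here.

```python
import math
import statistics
from typing import Sequence, Tuple


def one_sample_t(sample: Sequence[float], population_mean: float) -> Tuple[float, int]:
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    # Measure of variation: standard error of the sample mean.
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_value = (statistics.mean(sample) - population_mean) / standard_error
    return t_value, n - 1


# Hypothetical GPAs of Career Services users vs. a hypothetical institution-wide mean.
career_services_gpas = [3.1, 3.4, 3.6, 3.2, 3.7]
t_value, df = one_sample_t(career_services_gpas, population_mean=3.0)
print(f"t = {t_value:.2f}, df = {df}")
```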

Independent-samples t-test
An independent-samples t-test can be used to compare data from two separate,
unrelated samples.

- Example: Suppose you are interested in knowing how your institution compares to
other institutions in terms of hours of community service per capita. You would take
your students’ mean community service hours per person and compare it to other
institutions’ mean community service hours per person.

- Example: Suppose you are interested in determining whether there is a difference
between students in Greek organizations and students who are not in Greek
organizations on a measure of satisfaction with weekend programming. You could
issue a survey to students and then compare the mean satisfaction of Greeks with the
mean satisfaction of non-Greeks.

- Thus, use an independent-samples t-test when:

  • You have two separate, non-overlapping groups or data sets that you want
    to compare. That is, different people provided the data for each group.
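
A minimal sketch of the independent-samples calculation follows. It assumes the classic pooled-variance form, which treats the two groups as having roughly equal variances. Other variants, such as Welch's t-test, relax that assumption. The function name and satisfaction scores are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def independent_samples_t(
    group_a: Sequence[float], group_b: Sequence[float]
) -> Tuple[float, int]:
    """Pooled-variance t statistic for two non-overlapping groups."""
    n_a, n_b = len(group_a), len(group_b)
    degrees_of_freedom = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / degrees_of_freedom
    standard_error = math.sqrt(pooled_variance * (1.0 / n_a + 1.0 / n_b))
    t_value = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t_value, degrees_of_freedom


# Hypothetical satisfaction ratings from two different sets of students.
greek = [4.0, 5.0, 6.0]
non_greek = [1.0, 2.0, 3.0]
t_value, df = independent_samples_t(greek, non_greek)
print(f"t = {t_value:.2f}, df = {df}")
```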

Dependent-samples t-test
A dependent-samples t-test can be used to compare data from related groups or the
same people over time. This is most often used when you have a pretest/posttest
setup.

- Example: You want to know whether students’ attitudes toward diversity change
from their freshman to senior years. You could ask incoming freshmen to indicate their
level of agreement with various statements related to diversity and then administer the
same survey to them again in their senior year and compare their answers.

- Thus, use a dependent-samples t-test when:

  • You have two separate data sets that are provided by the same people, just
    at different times (e.g. pre/post)
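
The dependent-samples calculation can be sketched as a one-sample t-test on each person's change score. The example assumes the two lists are in the same order, so that position i in both lists belongs to the same person. The function name and the pretest/posttest scores are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_samples_t(
    pretest: Sequence[float], posttest: Sequence[float]
) -> Tuple[float, int]:
    """t statistic for paired data, computed on per-person differences."""
    if len(pretest) != len(posttest):
        raise ValueError("Each person needs both a pretest and a posttest score.")
    # One difference per person: posttest minus pretest.
    differences = [post - pre for pre, post in zip(pretest, posttest)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    t_value = statistics.mean(differences) / standard_error
    return t_value, n - 1


# Hypothetical agreement scores for the same four students, freshman vs. senior year.
freshman_scores = [3.0, 4.0, 2.0, 5.0]
senior_scores = [4.0, 6.0, 3.0, 5.0]
t_value, df = paired_samples_t(freshman_scores, senior_scores)
print(f"t = {t_value:.2f}, df = {df}")
```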

For more information about t-tests, visit:

http://www.socialresearchmethods.net/kb/stat_t.php

ANOVA (Analysis of Variance)


An ANOVA is a statistical test that is also used to compare means. The difference
between a t-test and an ANOVA is that a t-test can only compare two means at a time,
whereas with an ANOVA, you can compare multiple means at the same time. ANOVAs
also allow you to compare the effects of different factors on the same measure.
ANOVAs can become very complicated, and the analysis should only be done by
someone who has been trained in statistics. There are several types of ANOVAs,
including: one-way ANOVA, within-groups (or repeated-measures) ANOVA, and factorial
ANOVA.

One-way ANOVA
A one-way ANOVA is used to compare three or more groups/levels along the same
dimension. It is similar to an independent-samples t-test, just with more groups.

- Example: Suppose you want to know whether leadership skills differ between
Freshmen, Sophomores, Juniors, and Seniors. You would take the mean for each group
and compare them to each other.

- Thus, use a one-way ANOVA when:

  • You have three or more separate, non-overlapping groups or data sets that
    you want to compare.
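
As an illustration of what a one-way ANOVA computes, the sketch below forms the F statistic as the ratio of between-group variation to within-group variation. It is a bare-bones version under the usual textbook assumptions (independent observations, similar variances across groups). The function name and scores are hypothetical, and no p-value is calculated.

```python
import statistics
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return the F statistic with between-group and within-group degrees of freedom."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (value - statistics.mean(group)) ** 2 for group in groups for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical leadership scores for three non-overlapping groups.
scores = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df_between, df_within = one_way_anova_f(scores)
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```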

Within-groups (Repeated measures) ANOVA


A within-groups ANOVA is used to compare data from related groups or the same
people over time. This is similar to a dependent-samples t-test, just with more data sets.
This is most often used when you are doing a longitudinal study that tracks the same
people across time.

- Example: Suppose you want to track the development of leadership skills over
time. You would administer your instrument to a group of students during their
Freshman year, during their Sophomore year, during their Junior year, and again during
their Senior year. The same group of people would be taking the survey each year. You
would then compare the means of this group as Freshmen, Sophomores, Juniors, and
Seniors.

- Thus, use a within-groups ANOVA when:

  • You have separate data sets that are provided by the same people over
    time

Factorial ANOVA
A factorial ANOVA is used when you have two or more variables/factors/dimensions,
and you want to explore whether there are interactions between these factors.
Essentially, you are comparing the means of the various combinations of factors.

- Example: You want to know whether there is a difference between males vs.
females and underclassmen vs. upperclassmen on appreciating diversity. While you
could do two separate t-tests, you are also interested in knowing whether
the combination of factors makes a difference. You administer your instrument and
compare Males to Females, Freshmen to Seniors, and then subdivide the data to
compare Freshman Males, Freshman Females, Senior Males, and Senior Females.

- Example: You want to know whether students improve their communication skills
over time, but you are also interested in knowing whether this differs by major. You
administer the instrument to the same group of students during their Freshman year
and again during their Junior year. You compare the means of Freshmen to Juniors,
Biology to Art to Education majors, and then subdivide the data to compare the means
of Freshman Biology, Freshman Art, Freshman Education, Junior Biology, Junior Art, and
Junior Education majors.

- Thus, use a factorial ANOVA when:

  • You are interested in the interaction between two or more
    variables/factors/dimensions
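
To make the idea of an interaction concrete, the sketch below computes the cell means for a 2x2 design and the "difference of differences" between them. This is only the descriptive quantity that a factorial ANOVA would go on to test, not the ANOVA itself. The cell labels, function name, and scores are hypothetical.

```python
import statistics
from typing import Dict, Sequence, Tuple

Cell = Tuple[str, str]  # (class year, sex), hypothetical labels


def interaction_contrast(cells: Dict[Cell, Sequence[float]]) -> float:
    """Difference between the Freshman-to-Senior change for females and for males."""
    means = {cell: statistics.mean(values) for cell, values in cells.items()}
    change_for_males = means[("Senior", "Male")] - means[("Freshman", "Male")]
    change_for_females = means[("Senior", "Female")] - means[("Freshman", "Female")]
    # A value near zero suggests no interaction: both groups change by a similar amount.
    return change_for_females - change_for_males


# Hypothetical diversity-appreciation scores for each combination of factors.
data = {
    ("Freshman", "Male"): [2.0, 3.0, 4.0],
    ("Senior", "Male"): [3.0, 4.0, 5.0],
    ("Freshman", "Female"): [3.0, 4.0, 5.0],
    ("Senior", "Female"): [6.0, 7.0, 8.0],
}
print(f"interaction contrast = {interaction_contrast(data):.2f}")
```

Whether a contrast of this size is statistically significant is exactly what the factorial ANOVA's interaction term assesses.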

One thing that is important to note about ANOVAs is that because there are more than
two groups that are being compared, follow-up (or post-hoc) tests are often required to
further interpret the data. For instance, if you compare Freshmen, Sophomores, Juniors,
and Seniors on a measure of leadership skills and find a statistically significant
difference, you will have to conduct follow-up tests to determine which groups are
significantly different from each other. These follow-up tests may show that Freshmen
and Sophomores are no different from each other, nor are Juniors and Seniors, but
Juniors and Seniors both have better leadership skills than either Freshmen or
Sophomores.
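
The follow-up step can be sketched by listing every pair of groups that would need a comparison. The example also applies a Bonferroni adjustment to the significance level. That adjustment is an assumption of this sketch, being one common and conservative choice among several post-hoc procedures, and is not something the text above prescribes. The function name is hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def pairwise_comparisons(
    group_names: Sequence[str], alpha: float = 0.05
) -> Tuple[List[Tuple[str, str]], float]:
    """List all pairs of groups and a Bonferroni-adjusted significance level."""
    pairs = list(combinations(group_names, 2))
    # Bonferroni: divide the overall alpha by the number of comparisons.
    adjusted_alpha = alpha / len(pairs)
    return pairs, adjusted_alpha


pairs, adjusted_alpha = pairwise_comparisons(
    ["Freshmen", "Sophomores", "Juniors", "Seniors"]
)
for first, second in pairs:
    print(f"compare {first} vs. {second}")
print(f"per-comparison alpha = {adjusted_alpha:.4f}")
```

Each listed pair would then be compared with its own test, judged against the adjusted significance level rather than the original one.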

For more information about ANOVAs, visit:        

http://onlinestatbook.com/2/analysis_of_variance/intro.html
 

Regression
A regression analysis is a statistical procedure that allows you to make a prediction
about an outcome (or criterion) variable based on knowledge of some predictor
variable. To create a regression model, you first need to collect (a lot of) data on both
variables, similar to what you would do if you were conducting a correlation. Then you
would determine the contribution of the predictor variable to the outcome variable. Once
you have the regression model, you would be able to input an individual’s score on the
predictor variable to get a prediction of their score on the outcome variable.

- Example: You want to try to predict whether a student will come back for a
second year based on how many on-campus activities they attended. You would have
to collect data on how many activities students attended and then whether or not those
students returned for a second year. If activity attendance and retention are significantly
related to each other, then you can generate a regression model where you could
identify at-risk students (in terms of retention) based on how many activities they have
attended.

- Example: You want to try to identify students who are at risk of failing College
Algebra based on their scores on a math assessment so you can direct them to special
services on campus. You would administer the math assessment at the start of the
semester and then match each student’s score on the math assessment to their final
grade in the course. Eventually, your data may show that the math assessment is
significantly correlated with their final grade, and you can create a regression model to
identify those at-risk students so you can direct them to tutors and other resources on
campus.

- Thus, use regression when:

  • You want to be able to make a prediction about an outcome given what
    you already know about some related factor.
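
A minimal sketch of a single-predictor regression follows, assuming a straight-line relationship fitted by ordinary least squares. It estimates the slope and intercept from paired data and then uses them to predict an outcome for a new predictor value. The function names and the assessment and grade values are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def fit_simple_regression(
    predictor: Sequence[float], outcome: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares slope and intercept for one predictor and one outcome."""
    mean_x = statistics.mean(predictor)
    mean_y = statistics.mean(outcome)
    # How the two variables vary together, and how the predictor varies alone.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    sum_xx = sum((x - mean_x) ** 2 for x in predictor)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, predictor_value: float) -> float:
    """Predicted outcome for one individual's predictor score."""
    return intercept + slope * predictor_value


# Hypothetical math assessment scores matched to final course grades.
assessment = [40.0, 55.0, 70.0, 85.0]
final_grade = [60.0, 69.0, 78.0, 87.0]
slope, intercept = fit_simple_regression(assessment, final_grade)
print(f"predicted grade for a score of 50: {predict(slope, intercept, 50.0):.1f}")
```

In practice the fitted model would be built from far more data than this, and the predicted grade would be compared against a cutoff to flag at-risk students.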

Another option with regression is to do a multiple regression, which allows you to make
a prediction about an outcome based on more than just one predictor variable. Many
retention models are essentially multiple regressions that consider factors such as
GPA, level of involvement, and attitude towards academics and learning.
