Statistics Glossary

Average: Same as the mean. Add up all the values and divide by the
number of values.
Categorical variable: A variable whose values can be sorted into
categories. Ex: gender, fur color
Chi-square test: A statistical test used with categorical or nominal
variables to tell whether their observed distribution differs from the
expected one by more than chance alone would explain.
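
As an illustration of the chi-square idea, here is a minimal sketch of the goodness-of-fit statistic, the sum over categories of (observed - expected)^2 / expected. The function name and the kitten counts are hypothetical and are not part of the original glossary.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: fur colors of 60 kittens, expected to be evenly split.
observed = [30, 20, 10]  # black, tabby, white
expected = [20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
print(chi2)
```

A larger statistic means the observed counts sit further from the expected counts. To decide whether the difference is more than chance, the statistic is compared with a critical value from a chi-square table for the appropriate degrees of freedom (here, number of categories minus one).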
Continuous data: Data that can be measured along a continuum of
many values. This is distinct from discrete data, which takes only
separate, countable values. Ex: weight, height, age
Correlation: A statistical measure that describes the strength of a
linear relationship between two variables. Correlation does not
necessarily mean there is a causal relationship.
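
One common way to put a number on correlation is the Pearson coefficient r, which ranges from -1 to 1. The sketch below computes it by hand. The glossary does not name a specific coefficient, so the choice of Pearson's r, the function name, and the height and weight values are all assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: if either variable is constant, sxx or syy is 0 and r is undefined.
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements (heights in cm, weights in kg).
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 69, 77, 90]

r = pearson_r(heights, weights)
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one. Neither says anything about which variable, if either, causes the other.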
Dependent variable: A variable which is dependent on the levels of
other variables for its value. Dependent variables are measured but
not manipulated.
Descriptive statistics: Statistics that describe the data. This includes
the mean, median, mode, standard deviation, and variability, among
others.
Discrete variable: This is a variable where the values are distinct,
separate (not on a continuum), and can be counted. Ex: number of
kittens in a litter, number of puppy breeds living in a particular
subdivision
Experimental hypothesis: H1 speculates that there is a relationship
between your variables. This is the hypothesis you are trying to
support with your statistical test.
Independent variable: A variable that can be manipulated.
Inferential statistics: Statistics that analyze the data and allow you
to infer something about the population from the sample.
UT Southwestern Medical Center Library, October 2007
Interval variable: An interval variable can be ranked and the
intervals between the values can be compared. For statistical
purposes, interval and ratio variables are generally treated the same.
Mean: Same as the average. Add up all the values and divide by the
number of values.
Median: The median is the middle number in a set of data values
sorted in order. The median is important because, unlike the mean, it
is largely unaffected by outliers.
Mode: The data value that appears most frequently in a set.
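
The three measures above can be sketched with Python's standard `statistics` module. The litter sizes are hypothetical values chosen only to show how a single outlier pulls the mean much further than the median.

```python
import statistics

# Hypothetical litter sizes; the second list adds one extreme value.
litter_sizes = [3, 4, 4, 5, 6]
with_outlier = litter_sizes + [20]

mean_before = statistics.mean(litter_sizes)
median_before = statistics.median(litter_sizes)
mode_before = statistics.mode(litter_sizes)

mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before, mode_before)
print(mean_after, median_after)
```

With these numbers the mean rises from 4.4 to 7 once the outlier is added, while the median only moves from 4 to 4.5.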
Nominal variable: This is essentially the same thing as categorical
data. Nominal data values cannot be ranked in any order; they can
only be placed in categories.
Null hypothesis: H0 speculates that there is no relationship between
your variables. You should always have both a null and an
experimental hypothesis when you are performing statistical tests.
Ordinal variable: Ordinal data values can be ranked, but the intervals
between them are essentially meaningless. Ex: Runners in a race.
Someone comes in first, but we don't know how much faster the
first-place winner was than the second-place finisher.

Outlier: A data value that is significantly larger or smaller than the
other values measured for that variable.
Population: The set of all objects you are interested in studying. You
define what your population is.
Probability: How likely a value is to occur. Probability is important
because it helps us distinguish whether we are getting a value by
chance or because of the levels of other variables.
Ratio variable: A ratio variable has a true zero point where you can
have zero of whatever you are measuring. Ex: weight, height. For the
purposes of statistics, interval and ratio variables are treated the
same.
Regression: A statistical technique that predicts the level of the
dependent variable based on the level(s) of one or more independent
variable(s).
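
For the simplest case, one independent variable and a straight-line fit, regression can be sketched with ordinary least squares. The glossary does not specify a fitting method, so least squares, the function name, and the dose and response values are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # undefined if all x values are identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: dose (independent) and response (dependent).
dose = [1, 2, 3, 4, 5]
response = [52, 54, 56, 58, 60]

slope, intercept = fit_line(dose, response)
predicted = intercept + slope * 6  # predicted response at a dose of 6
print(slope, intercept, predicted)
```

Because the hypothetical points lie exactly on a line, the fit recovers a slope of 2 and an intercept of 50, and predicts 62 at a dose of 6. Real data will scatter around the fitted line.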

