# Statistics Glossary

03/18/2014

Average: Same as the mean. Add up all the values and divide by the
number of values.
Categorical variable: A variable whose values can be sorted into
categories. Ex: gender, fur color
Chi-square test: A statistical test used with categorical or nominal
variables to tell whether their distribution differs from what would be
expected, i.e., whether something more than chance is going on.
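
As a rough illustration of the chi-square idea, the sketch below computes the test statistic by hand for one categorical variable. The function name and the kitten counts are made up for this example, and it assumes equal counts are expected by chance.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: fur colors of 60 kittens in three categories.
observed = [30, 20, 10]
# Assumption: if only chance were at work, each color would be equally common.
expected = [20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
print(chi2)
```

The larger the statistic, the further the observed counts are from the expected ones. With three categories (two degrees of freedom), a value above roughly 5.99 is usually taken as significant at the 0.05 level.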
Continuous data: Data that can be measured along a continuum of
many values. This is distinct from discrete data, which takes separate,
countable values. Ex: weight, height, age
Correlation: A statistical measure that describes the strength and
direction of a linear relationship between two variables. Correlation
does not necessarily mean there is a causal relationship.
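
One common way to put a number on correlation is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand. The function name and the height and weight values are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data chosen to lie exactly on a straight line.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 66, 74, 82]

r = pearson_r(heights, weights)
print(r)  # close to 1: a strong positive linear relationship
```

A value near 0 would suggest little or no linear relationship, and a value near -1 a strong negative one. None of these values says anything about cause.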
Dependent variable: A variable which is dependent on the levels of
other variables for its value. Dependent variables are measured but
not manipulated.
Descriptive statistics: Statistics that describe the data. This includes
the mean, median, mode, standard deviation, and variability, among
others.
Discrete variable: This is a variable where the values are distinct,
separate (not on a continuum), and can be counted. Ex: number of kittens
in a litter, the breeds of puppies living in a particular subdivision
Experimental hypothesis: H1 speculates that there is a relationship
between your variables. This is the hypothesis you are trying to
Independent variable: A variable that can be manipulated.
Inferential statistics: Statistics that analyze the data and allow you
to infer something about the population from the sample.
UT Southwestern Medical Center Library, October 2007
Interval variable: An interval variable can be ranked and the
intervals between the values can be compared. For statistical
purposes, interval and ratio variables are generally treated the same.
Mean: Same as the average. Add up all the values and divide by the
number of values.
Median: The median is the middle number in a set of data values
sorted in order. The median is important because, unlike the mean, it
is barely affected by outliers.
Mode: The data value that appears most frequently in a set.
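
The mean, median, and mode can be compared side by side with Python's standard `statistics` module. This is only a small sketch. The values are made up, and the second list adds one large outlier to show how differently the mean and the median react.

```python
import statistics

# Hypothetical data set, then the same set with one extreme value added.
values = [2, 3, 3, 4, 5]
with_outlier = values + [100]

print(statistics.mean(values))    # 3.4
print(statistics.median(values))  # 3
print(statistics.mode(values))    # 3

print(statistics.mean(with_outlier))    # 19.5
print(statistics.median(with_outlier))  # 3.5
```

Adding the outlier pulls the mean from 3.4 up to 19.5, while the median only moves from 3 to 3.5.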
Nominal variable: This is essentially the same thing as categorical
data. Nominal data values cannot be ranked in any order; they can
only be placed in categories.
Null hypothesis: H0 speculates that there is no relationship between
your variables. You always should have both a null and an
experimental hypothesis when you are performing statistical tests.
Ordinal variable: Ordinal data values can be ranked, but the intervals
between them are essentially meaningless. Ex: Runners in a race.
Someone comes in first, but we don't know how much faster the
first-place winner was than the second-place finisher.

Outlier: A data value that is significantly larger or smaller than the
other values measured for that variable.
Population: The set of all objects you are interested in studying. You
Probability: How likely a value is to occur. Probability is important
because it helps us distinguish whether we are getting a value by
chance or because of the levels of other variables.
Ratio variable: A ratio variable has a true zero point where you can
have zero of whatever you are measuring. These are very rare. For the
purposes of statistics, interval and ratio variables are treated the
same.

Regression: A statistical test that predicts the level of the dependent
variable based on the level(s) of one or more independent variable(s).
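
For the simplest case, one independent variable and a straight-line relationship, regression can be sketched as an ordinary least-squares fit. The function name and the study-hours data below are hypothetical, and this is a minimal illustration rather than a full regression analysis.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the independent variable must vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied (independent) and test score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 59, 61, 64]

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6  # predicted score for 6 hours of study
print(slope, intercept, predicted)
```

Here the fitted line has a slope of about 3.1 and an intercept of about 48.7, so the predicted score for 6 hours is about 67.3.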