
Glossary of Statistical Terms:

1. Frequency distribution:

When data are grouped according to magnitude, the resulting series is called frequency
distribution. For example, in the following list of numbers, the frequency of the number 9 is 5
(because it occurs five times):

1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9
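
As a quick illustration, the tally above can be reproduced in a few lines of Python; the list is the example data from this entry:

from collections import Counter

data = [1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9]

# Counter tallies how many times each value occurs.
freq = Counter(data)
print(freq[9])             # 5 -- the number 9 occurs five times
print(freq.most_common())  # the full frequency distribution, highest first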

2. Descriptive Statistics:

It aims to describe the main features of the data involved in a study. Its main purpose is to
provide a description of the samples and the measurements made in a particular study through
numerical calculations, graphs, or tables.

3. Variable View:

It contains descriptions of the attributes of each variable in the data file. In Variable View, rows
represent variables while columns represent variable attributes.

4. Data View

It displays the contents of a currently open (active) dataset (Data Matrix).

5. Standard Deviation:

It is a number used to tell how measurements for a group spread out from the average (mean), or
expected value.
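
A minimal sketch using Python's built-in statistics module (the scores are made up for illustration):

import statistics

scores = [4, 8, 6, 5, 3, 7]
print(statistics.mean(scores))   # 5.5, the average
print(statistics.stdev(scores))  # about 1.87, the sample standard deviation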

6. Measures of Central Tendency

• Mean: It is the average of the numbers, a calculated "central" value of a set of numbers.

• Median: It is the middle value in the list of numbers. Numbers have to be listed in
numerical order from smallest to largest.
• Mode: It is the value that occurs most often. If no number is repeated in the list, there is
no mode for the list.
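
All three measures are one-liners with Python's statistics module (example values invented):

import statistics

values = [1, 2, 2, 3, 4, 7, 9]
print(statistics.mean(values))    # 4.0
print(statistics.median(values))  # 3, the middle value of the sorted list
print(statistics.mode(values))    # 2, the value that occurs most often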

7. Correlation:

Correlation is a statistical measure that indicates the extent to which two or more variables
fluctuate together. A positive correlation indicates the extent to which those variables increase or
decrease in parallel; a negative correlation indicates the extent to which one variable increases as
the other decreases.

8. Pearson Correlation:

A Pearson correlation, also known as the “product moment correlation coefficient” (PMCC) or
simply “correlation”, is a number between -1 and 1 that indicates the extent to which two
variables are linearly related. It is suitable only for metric variables (which include
dichotomous variables).
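
A minimal sketch with SciPy (assuming it is installed; x and y are invented):

from scipy import stats

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, p = stats.pearsonr(x, y)  # coefficient and two-sided p-value
print(r)  # about 0.77, a fairly strong positive linear relationship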

9. Graph:

Presentation of statistical data by geometrical curves is called Graphical Representation of the
data. A graph is drawn between the X-axis and the Y-axis, where the X-axis is a horizontal line
and the Y-axis is a vertical line.
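
As a sketch, a basic graph with labelled axes can be drawn with matplotlib (assuming it is installed; the points are invented):

import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

plt.plot(x, y)         # the curve is drawn between the two axes
plt.xlabel("X-axis")   # horizontal line
plt.ylabel("Y-axis")   # vertical line
plt.show()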

10. Regression:

A measure of the relation between the mean value of one variable and corresponding values of
other variables.

11. Simple and multiple linear regression:

In simple linear regression, a single independent variable is used to predict the value of a
dependent variable. In multiple linear regression, two or more independent variables are used to
predict the value of a dependent variable. The difference between the two is the number of
independent variables.
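
A minimal sketch of simple linear regression with NumPy (assuming it is installed; the data are invented). For multiple linear regression, np.linalg.lstsq solves the same least-squares problem with a design matrix of several predictors.

import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit y = slope * x + intercept by least squares.
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)  # about 1.96 and 0.14
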
12. T-test

A t-test’s statistical significance indicates whether the difference between two groups’ averages
most likely reflects a “real” difference in the population from which the groups were sampled.

13. Types of T-test:

• Independent Sample T-test: It compares the means of two groups.

• Paired Sample T-test: It compares means from the same group at different times.

• One Sample T-test: It tests the mean of a single group against a known mean.
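
All three variants are available in SciPy, as sketched below (assuming SciPy is installed; group values are invented):

from scipy import stats

group_a = [5.1, 4.9, 6.0, 5.5, 5.8]
group_b = [4.2, 4.8, 4.5, 4.9, 4.1]

# Independent samples: two separate groups.
print(stats.ttest_ind(group_a, group_b))

# Paired samples: the same subjects measured twice (e.g. before/after).
print(stats.ttest_rel(group_a, group_b))

# One sample: a single group's mean against a known value, here 5.0.
print(stats.ttest_1samp(group_a, popmean=5.0))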

14. Z-test

A Z-test is a type of hypothesis test. Hypothesis testing is a way of determining whether the
results observed in a sample reflect a real effect in the population or are likely due to chance.

15. One Sample Z-test

It is used to test whether a population parameter is significantly different from some
hypothesized value.
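
SciPy has no one-sample z-test helper, so here is a minimal hand-rolled sketch (sample values, the hypothesized mean, and the known population SD are all invented):

import math
from scipy.stats import norm

sample = [102, 99, 105, 103, 101, 98, 104, 100]
mu0 = 100    # hypothesized population mean
sigma = 3    # population standard deviation, assumed known

n = len(sample)
z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
p = 2 * (1 - norm.cdf(abs(z)))  # two-sided p-value
print(z, p)  # z is about 1.41; p is about 0.16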

16. ANOVA

ANOVA is an analysis of the variation present in an experiment. It is used for examining the
differences in the mean values of the dependent variable associated with the effect of
independent variables. Essentially, ANOVA is used as a test of means for two or more
populations. The tests in an ANOVA are based on the F-ratio: the variation due to an
experimental treatment or effect divided by the variation due to experimental error.

17. One-way ANOVA

The One-Way ANOVA ("analysis of variance") compares the means of two or more independent
groups in order to determine whether there is statistical evidence that the associated population
means are significantly different. One-Way ANOVA is a parametric test. This test is also known
as:

• One-Factor ANOVA

• One-Way Analysis of Variance

• Between Subjects ANOVA
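
A minimal sketch with SciPy (three invented groups):

from scipy import stats

g1 = [85, 86, 88, 75, 78]
g2 = [82, 92, 94, 89, 88]
g3 = [67, 72, 70, 75, 68]

# F-ratio and p-value for the null hypothesis that all group means are equal.
print(stats.f_oneway(g1, g2, g3))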

18. Two-way ANOVA

It compares the mean difference between groups that have been split on two independent
variables (called factors), and its main purpose is to understand whether there is an interaction
between the two independent variables on the dependent variable.
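
A sketch using statsmodels, assuming pandas and statsmodels are installed; the factor names and scores are invented:

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "score":  [5, 6, 7, 8, 4, 5, 9, 10, 6, 7, 8, 9],
    "method": ["A", "A", "B", "B", "A", "A", "B", "B", "A", "A", "B", "B"],
    "gender": ["m", "f", "m", "f", "m", "f", "m", "f", "m", "f", "m", "f"],
})

# The C(method):C(gender) interaction term tests whether the factors interact.
model = ols("score ~ C(method) * C(gender)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))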

19. Kruskal-Wallis test (non-parametric equivalent to the one-way ANOVA)

Kruskal-Wallis compares the medians of two or more samples to determine if the samples have
come from different populations. It is an extension of the Mann–Whitney U test to 3 or more
groups. The distributions do not have to be normal and the variances do not have to be equal.
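
A minimal sketch with SciPy (three invented samples):

from scipy import stats

# H statistic and p-value; a small p suggests the samples come from
# populations with different distributions.
print(stats.kruskal([7, 9, 8, 10], [5, 6, 7, 5], [3, 4, 2, 4]))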

20. Welch’s F-test

Welch’s F-test (Field 2009) is designed to test the equality of group means when we have more
than two groups to compare, especially in cases where the homogeneity of variance assumption
is not met and sample sizes are small.

21. Friedman test (non-parametric equivalent to the one-way repeated measures ANOVA)

The Friedman test is used to detect differences in scores across multiple occasions or conditions.
The scores for each subject are ranked and then the sums of the ranks for each condition are used
to calculate a test statistic. The Friedman test can also be used when subjects have ranked a list
e.g. rank these pictures in order of preference.
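
A minimal sketch with SciPy (scores for five subjects across three invented conditions, aligned row-wise):

from scipy import stats

before = [7, 5, 8, 6, 7]
during = [5, 4, 6, 5, 5]
after  = [6, 5, 7, 5, 6]
print(stats.friedmanchisquare(before, during, after))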

22. Chi Square Test

Test to determine whether the proportions of two or more categories differ from the expected
proportions.
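
A minimal sketch of a goodness-of-fit chi-square with SciPy (counts are invented; observed and expected totals must match):

from scipy import stats

observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]
print(stats.chisquare(f_obs=observed, f_exp=expected))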

23. Multivariate Analysis of Variance (MANOVA)

Multivariate analysis of variance (MANOVA) is simply an ANOVA with several dependent
variables. That is to say, ANOVA tests for the difference in means between two or more groups,
while MANOVA tests for the difference in two or more vectors of means.

24. The Mann-Whitney U Test

The Mann-Whitney U test is a non-parametric test that can be used in place of an unpaired t-test.
It is used to test the null hypothesis that two samples come from the same population (i.e. have
the same median) or, alternatively, whether observations in one sample tend to be larger than
observations in the other. Although it is a non-parametric test, it does assume that the two
distributions are similar in shape.
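
A minimal sketch with SciPy (two small invented samples):

from scipy import stats

# U statistic and two-sided p-value.
print(stats.mannwhitneyu([12, 15, 14, 11, 13],
                         [18, 17, 19, 16, 20],
                         alternative="two-sided"))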

25. Phi-Coefficient / phi correlation:

A Phi coefficient is a non-parametric test of relationships that operates on two dichotomous (or
dichotomized) variables. The phi (rhymes with fee) correlation gives an estimate of the degree of
relationship between two dichotomous variables. The value of the phi (φ) correlation coefficient
is interpreted just like the Pearson r; that is, it can vary from -1.00 to +1.00.
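
A minimal sketch computing phi straight from a 2x2 contingency table (the cell counts are invented):

import math

# 2x2 table of counts:
#             B = 0   B = 1
a, b = 20, 10        # A = 0
c, d = 5, 25         # A = 1

phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
print(phi)  # about 0.51, interpreted like a Pearson r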

26. Skewness

It is the degree of distortion from the symmetrical bell curve, or the normal distribution. It
measures the lack of symmetry in a data distribution. It differentiates extreme values in one tail
versus the other. A symmetrical distribution will have a skewness of 0.

27. Kurtosis

Kurtosis is all about the tails of the distribution, not the peakedness or flatness. It is used to
describe the extreme values in one tail versus the other. It is actually a measure of the
outliers present in the distribution.
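
SciPy reports both measures directly, as in this sketch (data invented; note the single large value in the right tail):

from scipy import stats

data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 12]
print(stats.skew(data))      # positive: a long right tail (the 12)
print(stats.kurtosis(data))  # excess kurtosis; 0 for a normal distribution
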
28. Inferential Statistics:

Inferential statistics use a random sample to draw conclusions about the population. Typically, it
is not practical to obtain data from every member of a population. Instead, we collect a random
sample from a small proportion of the population.

29. Outliers:

An outlier is an unusually large or small observation. Outliers can have a disproportionate effect
on statistical results, such as the mean, which can result in misleading interpretation.
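
A tiny illustration of an outlier's pull on the mean (made-up numbers):

import statistics

clean = [10, 11, 9, 10, 12]
with_outlier = clean + [85]

print(statistics.mean(clean))           # 10.4
print(statistics.mean(with_outlier))    # about 22.8, dragged up by the 85
print(statistics.median(with_outlier))  # 10.5, barely affected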

30. Spearman rank-order correlation:

Also called Spearman’s rho, the Spearman correlation evaluates the monotonic relationship
between two continuous or ordinal variables. In a monotonic relationship, the variables tend to
change together, but not necessarily at a constant rate. The Spearman correlation coefficient is
based on the ranked values for each variable rather than the raw data.
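
A minimal sketch with SciPy; y grows monotonically with x but not linearly, so Spearman's rho is a perfect 1.0:

from scipy import stats

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # monotonic but not linear
rho, p = stats.spearmanr(x, y)
print(rho)  # 1.0, a perfect monotonic relationship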

31. Range (Statistics)

In statistics, the range is defined simply as the difference between the maximum and minimum
observations.

32. Median test:

Tests the difference between the medians of two independent samples.
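
SciPy implements this as Mood's median test; a minimal sketch with two invented samples:

from scipy import stats

# Returns the test statistic, p-value, grand median, and contingency table.
print(stats.median_test([10, 12, 14, 16, 9], [8, 9, 11, 7, 13]))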

33. Normal Distribution:

Normal Distribution is uniquely defined by its mean and standard deviation. It is symmetrical
about the mean and may be represented graphically as a bell-shaped curve, known as the Normal
curve. The area under the curve = 1. About 68% of the area under the curve is within ± 1 SD of
the mean, the large majority (95%) is within ± 1.96 SD (often written as 2 SD for short) of the
mean, and almost all (99.7%) is within ± 3 SD of the mean.
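
The three areas quoted above can be checked with SciPy's standard normal CDF:

from scipy.stats import norm

# Area under the standard normal curve within +/- k SD of the mean.
for k in (1, 1.96, 3):
    print(k, norm.cdf(k) - norm.cdf(-k))  # about 0.68, 0.95, 0.997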

34. Parametric Test:

In the literal meaning of the terms, a parametric statistical test is one that makes assumptions
about the parameters (defining properties) of the population distribution(s) from which one's data
are drawn.

35. Non-Parametric Test:

A non-parametric test is one that makes no such assumptions. In this strict sense,
"non-parametric" is essentially a null category, since virtually all statistical tests assume one
thing or another about the properties of the source population.
