
PARAMETRIC TESTS

ASSUMPTIONS FOR PARAMETRIC TESTS

• Parametric tests make specific assumptions about the population parameters:

a) Normal Distribution
b) Interval scale
c) Population parameters are known
d) Homogeneity of variance – data from various groups have the same variance
e) Linearity – the variables have a linear relationship
f) Independence – data obtained is independent of each other
g) Randomness of sample
ASSUMPTIONS FOR NON-PARAMETRIC TESTS

a) Non-normal distribution – the data are not symmetric


b) Population parameters are unknown
c) Nominal or ordinal data
d) Non-linearity – refers to an inconsistent or curved slope of change representing the
relationship between variables
e) Independence – Data obtained must be independent of each other
f) Randomness of sample
HYPOTHESIS TESTS - PARAMETRIC

One sample
• t-test (n < 30)
• Z-test (n > 30)

Two samples
• Independent samples – two-group t-test, two-group Z-test
• Paired samples – paired t-test

More than two samples
• ANOVA (F-test)
HYPOTHESIS TESTS – NON-PARAMETRIC

One sample
• Chi-square test
• Run test
• Sign test

Two samples
• Independent samples – Chi-square, Mann-Whitney U
• Paired samples – Sign, Wilcoxon, Chi-square

More than two samples
• Kruskal-Wallis
• Friedman
T-TEST

The t-test compares the difference between the means of two groups to determine whether the
difference is statistically significant or simply due to chance. For a one-sample test the
statistic is

t = (x̄ − μ) / (s / √n)

where
x̄ is the sample mean
s is the sample standard deviation
n is the sample size
μ is the population mean
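As a minimal sketch, the statistic above can be computed directly; the sample figures here are hypothetical:

```python
import math

# Hypothetical sample: n = 16 scores with mean 52 and s = 8,
# tested against a population mean of 50.
x_bar, s, n, mu = 52.0, 8.0, 16, 50.0

# t = (x̄ − μ) / (s / √n)
t = (x_bar - mu) / (s / math.sqrt(n))
print(round(t, 2))  # 1.0
```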

The t-test is commonly used when the sample size is small. The test is based on Student's
t-statistic.
Assumptions of T-test:
a) Variable is normally distributed
b) All data points are independent.
c) The sample size is small. Generally, a sample exceeding 30 units is regarded as large,
otherwise small; to apply the t-test it should not be less than 5 units.
d) The mean is known (or assumed to be known)
e) The population variance is estimated from the sample.
ONE SAMPLE T-TEST

• One sample t-test compares the mean of a single group of observations with a specified value.
• We draw a random sample and then compare the sample mean with population mean.
• Then, using the formula, we make a statistical decision about whether the sample mean
differs from the population mean.
• We compare the calculated value with the table value at a chosen significance level
(usually 5% or 1%).
• If the absolute value of the calculated 't' is greater than the table value, we reject the
null hypothesis; otherwise the null hypothesis is accepted.
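The procedure above can be sketched with scipy; the data values are hypothetical:

```python
from scipy import stats

# Hypothetical data: weekly study hours for 8 students,
# tested against a hypothesized population mean of 10 hours.
sample = [9.5, 11.2, 10.1, 8.7, 12.3, 9.9, 10.8, 9.4]
t_stat, p_value = stats.ttest_1samp(sample, popmean=10)

# Reject H0 at the 5% level if p < 0.05
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

Comparing the p-value with the significance level is equivalent to comparing the calculated t with the table value.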
TWO SAMPLE T-TEST

• It is used when two independent random samples come from normally distributed
populations having the same variance.
• We test the null hypothesis that the two population means are the same, against an
appropriate one-sided or two-sided alternative hypothesis.
• Assumptions are:
• Samples are random and independent of each other
• Variances in both groups are equal.
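A minimal sketch with hypothetical group scores; `equal_var=True` matches the equal-variance assumption listed above:

```python
from scipy import stats

# Hypothetical scores from two independent groups
group_a = [23, 25, 28, 30, 27, 26]
group_b = [20, 22, 24, 25, 21, 23]

# equal_var=True assumes equal variances; set equal_var=False
# for Welch's t-test when the variances differ.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```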
PAIRED TWO SAMPLE T-TEST

• It is used when we have paired observations from one sample only, i.e. when each
individual in the sample gives a pair of observations (before and after).
• Assumptions are:
• Variable should be continuous
• Difference between pre and post observations should be normally distributed
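A sketch of the paired test on hypothetical before/after measurements for the same subjects:

```python
from scipy import stats

# Hypothetical before/after measurements on the same 6 subjects
before = [72, 68, 75, 80, 64, 70]
after  = [70, 65, 73, 77, 64, 67]

# ttest_rel works on the pairwise differences (before - after)
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```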
Z-TEST

• Z-test is used when the sample size is more than 30.


• It is computed in the same manner as the t-test.
• The table of Z statistics is used to find the critical value.
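A minimal sketch of the z computation, using the standard normal distribution in place of a printed Z table; the figures are hypothetical and σ is assumed known:

```python
import math
from scipy.stats import norm

# Hypothetical large sample: n = 50, sample mean 102, known σ = 10,
# tested against a population mean of 100.
x_bar, sigma, n, mu = 102.0, 10.0, 50, 100.0

z = (x_bar - mu) / (sigma / math.sqrt(n))
p_value = 2 * (1 - norm.cdf(abs(z)))  # two-tailed p-value
print(f"z = {z:.3f}, p = {p_value:.4f}")
```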
TEST FOR MORE THAN TWO GROUPS - ANOVA

• ANOVA is used to examine the differences in the mean values of the dependent variable associated with
the effect of independent variable.
• It is used when the means of more than two populations need to be compared.
• ANOVA helps to compare the differences among means of all the populations simultaneously.
• Major assumptions of parametric tests are applicable – Normality of data, homogeneity of variance,
independence, randomness etc.
• Here, the dependent variable is metric data (interval or ratio scale) while independent variable is
categorical in nature. Categorical variables are called factors.
• If there is one factor, divided into various categories, then one-way ANOVA is used.
• If two or more factors are involved, then it is n-way ANOVA.
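A one-way ANOVA can be sketched with scipy's F-test; the three groups below are one hypothetical factor with three categories:

```python
from scipy import stats

# Hypothetical scores from three categories of one factor
group1 = [85, 86, 88, 75, 78]
group2 = [82, 80, 85, 79, 83]
group3 = [91, 92, 89, 95, 90]

# f_oneway compares the means of all groups simultaneously
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```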
NON PARAMETRIC TESTS

• These tests do not rely on specific assumptions about the population parameters.
• They are mainly used when population parameters are not known.
• Usually the sample size is small.
• They are also known as distribution-free tests, as they do not require the population to
follow any particular shape of distribution.
• These tests mainly use non-metric data, i.e. ordinal and nominal, and computations mainly
depend on the median or ranks of the sample data.
ADVANTAGES OF NON-PARAMETRIC TESTS

• These tests can be applied in many situations, especially when rigid assumptions such as
normal distribution of data or metric data are not fulfilled.
• They apply to ordinal or nominal data, which lack numeric values.
• They involve very simple calculations compared to parametric tests.
• They are useful for drawing inferences from qualitative data.
• They are useful for analysing and comparing preferences, traits, attitudes, opinions, etc.
DISADVANTAGES OF NON-PARAMETRIC TESTS

• When data are converted from numeric to non-numeric form, much of the information is not
utilised. For example, in the sign test only the plus and minus signs are taken into
account, while the values behind those signs are not used.
• When the basic assumptions, such as normality of data, are valid, non-parametric tests are
less powerful than parametric tests. This also increases the risk of Type II error.
• Non-parametric tests are less precise and yield less detailed conclusions. They do not
characterise the population distributions in terms of mean, median or variance; they only
state whether the groups have identical population distributions or not.
CHI-SQUARE TEST

• The Chi-square test is a non-parametric test used for mainly two purposes.
• First, it determines whether the sample data matches the population or not. This is test of
Goodness of Fit.
• Secondly, it determines the association between two samples. Two categorical variables
are compared to find whether they are related or not. This is a test of independence of
variables.

• The Chi-square test has the following properties:


• The data need not be normally distributed or symmetric.
• The value of chi-square is greater than or equal to zero.
• The shape of the chi-square distribution depends on the degrees of freedom. As the degrees
of freedom increase, the distribution tends towards normal.
• The data are categorical and required in the form of frequencies. If the data are in
proportions or percentages, they should be converted into frequencies before applying the test.
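Both uses of the test can be sketched with scipy; the observed counts and the contingency table below are hypothetical:

```python
from scipy import stats

# Goodness of fit: hypothetical counts of 100 die rolls,
# compared with the uniform expectation of a fair die.
observed = [18, 22, 16, 14, 12, 18]
chi2, p = stats.chisquare(observed)  # expected defaults to uniform
print(f"goodness of fit: chi2 = {chi2:.2f}, p = {p:.3f}")

# Test of independence: hypothetical 2x2 contingency table
# of two categorical variables (frequencies, not percentages).
table = [[30, 10],
         [20, 40]]
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"independence: chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```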
RUN TEST

• A run is a sequence of events of a certain type preceded and followed by occurrences of
the alternate type, or by no event at all.
• A sample with too many or too few runs suggests that the sample may not be random.
• The run test is a hypothesis-testing procedure that determines whether a sequence of data
has been derived from a random process or not.
• It may be applied to test the randomness of data in a sample.
• For the one-sample test the variable must be dichotomous; the data may be numeric or
non-numeric.
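A minimal sketch of the Wald-Wolfowitz runs test on a hypothetical dichotomous sequence, using the normal approximation for the number of runs:

```python
import math

# Hypothetical dichotomous sequence (H/T)
seq = "HHTTHTHHTTTHTHHT"

n1 = seq.count("H")
n2 = seq.count("T")
# Count runs: a new run starts whenever the symbol changes
runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

# Mean and variance of the number of runs under randomness
mean_r = 2 * n1 * n2 / (n1 + n2) + 1
var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
    (n1 + n2) ** 2 * (n1 + n2 - 1)
)

# Too many or too few runs gives a large |z|, suggesting non-randomness
z = (runs - mean_r) / math.sqrt(var_r)
print(f"runs = {runs}, z = {z:.3f}")
```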
SIGN TEST

• The sign test is used to test the null hypothesis that the median of distribution is equal to some
value.
• It can be used in place of a) one sample t-test b) paired t-test or c) for ordered categorical data
where a numerical scale is inappropriate but where it is possible to rank the observations.
• In one sample sign test, the division of data is done based on the median value, i.e. higher or lower
than the median value.
• In the paired sign test, only the sign of the difference between two values, say x and y,
is used. The test observes only whether x > y, x = y or x < y; the magnitudes are not used
in the computation.
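The paired sign test can be sketched with an exact binomial test (this assumes scipy ≥ 1.7 for `binomtest`; the measurements are hypothetical):

```python
from scipy.stats import binomtest

# Paired sign test on hypothetical before/after data:
# only the sign of each difference is used, not its size.
before = [140, 135, 150, 145, 160, 138, 142, 155]
after  = [138, 136, 144, 140, 152, 135, 141, 150]

diffs = [b - a for b, a in zip(before, after)]
plus  = sum(1 for d in diffs if d > 0)
minus = sum(1 for d in diffs if d < 0)

# Under H0 the median difference is 0, so the plus signs follow
# a Binomial(n, 0.5) distribution (ties are dropped).
result = binomtest(plus, plus + minus, p=0.5)
print(f"+: {plus}, -: {minus}, p = {result.pvalue:.4f}")
```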
MANN-WHITNEY U TEST

• The Mann-Whitney U test is used to compare the differences between two independent
groups when the dependent variable is either ordinal or continuous, but not normally
distributed.
• The test is used to examine whether two samples have been drawn from populations with the
same distribution.
• This test is a non-parametric alternative to the independent-samples t-test.
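A sketch with hypothetical ordinal ratings from two independent groups:

```python
from scipy import stats

# Hypothetical ordinal ratings (1-5) from two independent groups
group_a = [3, 4, 2, 5, 4, 3, 5]
group_b = [1, 2, 2, 3, 1, 2, 3]

# Rank-based comparison of the two independent samples
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
```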
WILCOXON SIGNED RANK TEST

• It is used to test the means of paired or related samples, examining whether the
population mean ranks of two related samples differ or not.
• It can be used as an alternative to paired t-test.
• The assumptions of normal distribution are not required.
• It is a non-parametric test that can be used to determine whether two related samples
were selected from populations having the same distribution or not.
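A sketch on hypothetical paired scores, as a rank-based alternative to the paired t-test:

```python
from scipy import stats

# Hypothetical paired scores (before vs after) for 8 subjects
before = [22, 25, 17, 24, 16, 29, 20, 23]
after  = [18, 21, 16, 22, 19, 24, 17, 21]

# wilcoxon ranks the absolute pairwise differences and
# compares the sums of positive and negative ranks
w_stat, p_value = stats.wilcoxon(before, after)
print(f"W = {w_stat}, p = {p_value:.4f}")
```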
KRUSKAL-WALLIS TEST

• It is used to compare differences among more than two independent groups when the
dependent variable is either ordinal or continuous but not normally distributed. It is an
alternative to the one-way ANOVA.
• Can be used for a very small sample size.
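A sketch with three small hypothetical groups, as a rank-based alternative to one-way ANOVA:

```python
from scipy import stats

# Hypothetical ordinal scores from three independent groups
group1 = [7, 8, 5, 9, 6]
group2 = [4, 3, 6, 5, 4]
group3 = [9, 8, 9, 7, 10]

# kruskal compares the groups using ranks rather than raw means
h_stat, p_value = stats.kruskal(group1, group2, group3)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```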
PARAMETRIC AND NON-PARAMETRIC TESTS

Test whether the sample mean is the same as the hypothesized mean
  Parametric: one-sample t-test, one-sample z-test
  Non-parametric: one-sample sign test, run test, chi-square test

Test for the difference between two variables from the same population
  Parametric: paired t-test, paired z-test
  Non-parametric: Wilcoxon signed-rank test, sign test, chi-square test

Test for the difference between the same variable from different populations
  Parametric: independent t-test
  Non-parametric: Mann-Whitney U test, chi-square test

Test for comparing the means of more than two populations
  Parametric: ANOVA
  Non-parametric: Kruskal-Wallis test
CORRELATION AND REGRESSION ANALYSIS

• Correlation measures the degree of association between two or more sets of variables.
• Regression is used to explain the variation in one variable (dependent variable) by a set of
one or more independent variables.
• In the case of one independent variable, it is called simple regression, whereas if there
is more than one independent variable, it is called multiple regression analysis.
CORRELATION

• Positive correlation – When X & Y variables move in the same direction.


• Negative correlation – When X & Y variables move in opposite direction.
• Zero correlation – When X & Y variables have no connection with each other.
• The linear correlation coefficient takes values between -1 and +1.
• The closer the scatter of points to the line, the higher the degree of correlation between the variables.
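The linear correlation coefficient can be sketched with scipy on hypothetical paired observations:

```python
from scipy import stats

# Hypothetical paired observations of X and Y
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 5, 4, 6, 7]

# r lies between -1 and +1; values near +1 indicate
# a strong positive linear association
r, p_value = stats.pearsonr(x, y)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```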

Y = α + βX + U
Y = dependent variable, X = independent variable
U = stochastic error term
α, β = parameters to be estimated
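The simple regression model above can be estimated by least squares; the data here are hypothetical, and `np.polyfit` with degree 1 is one minimal way to fit it:

```python
import numpy as np

# Hypothetical data for Y = α + βX + U, estimated by least squares
x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

# polyfit returns coefficients highest degree first: [slope, intercept]
beta, alpha = np.polyfit(x, y, deg=1)
residuals = y - (alpha + beta * x)  # estimates of the error term U
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```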
