
Test of differences

Non-parametric tests

Nominal Variables (categorical variable)

1. Binomial Test

The binomial test is useful for determining if the proportion of people in one of two categories
is different from a specified amount.

Condition: 1 variable with 2 categories (type: numeric)

E.g.: gender (male / female)

H0 (null hypothesis): the respondents are in equal proportion across the two categories

Interpretation:

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
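The notes describe the SPSS procedure; as a minimal sketch, the same binomial test can be run in Python with SciPy (the counts below are made-up illustrative data):

```python
from scipy import stats

# Hypothetical sample: 34 males out of 60 respondents.
# H0: the two categories occur in equal proportion (p = 0.5).
result = stats.binomtest(k=34, n=60, p=0.5)
print(result.pvalue)
```

Here the p-value comes out above 0.05, so we fail to reject H0: the observed split is consistent with equal proportions.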

2. Chi-square one-variable test:

One variable with more than 2 categories (type: numeric)

E.g.: zone (south, north, east, and west)

Note: every value in the expected column should be >= 5 (if any expected value is < 5, we should not perform this test).

If any expected value is < 5, collapse the variable into 2 categories and perform the binomial test instead of the chi-square test.

H0: there is no significant difference between the observed (O) and expected (E) frequencies.

Interpretation:

In this case none of the categories has an expected frequency less than 5, so this is a valid test.

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
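As a sketch with made-up zone counts, the same goodness-of-fit test in SciPy (the expected counts are all >= 5 here, so the test is valid):

```python
import numpy as np
from scipy import stats

# Hypothetical zone counts (south, north, east, west); H0: equal frequencies.
observed = np.array([30, 25, 20, 25])
expected = np.full(4, observed.sum() / 4)  # 25 each; every expected count >= 5

chi2, p = stats.chisquare(observed, expected)
print(chi2, p)
```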
3. Cross tabulation (Two variables: one nominal variable with 2 categories, another variable
with more than 2 categories)

E.g.: gender and zone

Perform cross tab, row = gender (dichotomous variable), column= zone

Statistics -> click on chi-square

Cells -> click on expected

Interpretation:

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
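The cross-tab chi-square (with the expected counts SPSS shows in the cells) can be sketched in SciPy; the table below is hypothetical gender-by-zone data:

```python
import numpy as np
from scipy import stats

# Hypothetical gender x zone cross-tab (rows: male/female; columns: S, N, E, W).
table = np.array([[12, 15, 10, 13],
                  [14, 10, 16, 10]])

chi2, p, dof, expected = stats.chi2_contingency(table)
print(p, expected.min())  # check that every expected count is >= 5
```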

4. McNemar test (2 dichotomous variables of the same nature [e.g. satisfied/dissatisfied], data collected at 2 different time intervals)

E.g.: one variable is attendance in Jan and another variable is attendance in Feb

Nonparametric tests (NPT) -> 2 related samples -> click on McNemar

Interpretation:

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
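SPSS runs McNemar from the menu; a minimal sketch of the exact version of the test (an equivalent two-sided binomial test on the discordant pairs, with hypothetical counts) is:

```python
from scipy import stats

# Hypothetical paired attendance (Jan vs Feb) for the same respondents:
# b = present in Jan but absent in Feb, c = absent in Jan but present in Feb.
b, c = 4, 11  # discordant pairs (only these enter the test)

# Exact McNemar test = two-sided binomial test on the discordant pairs (p = 0.5).
p = stats.binomtest(b, n=b + c, p=0.5).pvalue
print(p)
```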

5. Cochran's Q test (3 dichotomous variables of the same nature [e.g. satisfied/dissatisfied], data collected at more than 2 different time intervals)

E.g.: one variable is attendance in Jan, another in Feb, and a third in Mar

Nonparametric tests (NPT) -> K related samples -> click on Cochran's Q test

Interpretation:

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
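As a sketch, Cochran's Q can be computed directly from its formula (column totals G, row totals L, k time points, chi-square with k-1 df); the 0/1 attendance matrix below is made up:

```python
import numpy as np
from scipy import stats

# Hypothetical attendance (1 = present, 0 = absent) for Jan, Feb, Mar (columns),
# one row per respondent.
X = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
])

k = X.shape[1]      # number of time points
G = X.sum(axis=0)   # column (time-point) totals
L = X.sum(axis=1)   # row (respondent) totals

# Cochran's Q statistic, compared against chi-square with k-1 df.
Q = (k - 1) * (k * (G**2).sum() - G.sum()**2) / (k * L.sum() - (L**2).sum())
p = stats.chi2.sf(Q, df=k - 1)
print(Q, p)
```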
Ordinal variables:

1. One-sample KS (Kolmogorov-Smirnov) test (1 ordinal variable)

E.g. employee satisfaction

Select test distribution as uniform

Interpretation:

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
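A minimal sketch of the one-sample KS test against a uniform distribution (satisfaction scores below are hypothetical; the uniform range is taken from the observed min/max, as in the SPSS dialog):

```python
import numpy as np
from scipy import stats

# Hypothetical employee-satisfaction scores on a 1-5 scale.
# H0: the scores follow a uniform distribution over the observed range.
x = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 2, 3, 4, 1, 5])

u = stats.uniform(loc=x.min(), scale=x.max() - x.min())
stat, p = stats.kstest(x, u.cdf)
print(stat, p)
```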

2. Two-sample KS test (1 ordinal variable, 1 dichotomous nominal variable)

E.g.: employee satisfaction with respect to gender

Grouping variable = dichotomous variable

Choose test type = Kolmogorov-Smirnov Z

Interpretation:

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
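Sketched in SciPy with made-up satisfaction scores split by the grouping (dichotomous) variable:

```python
import numpy as np
from scipy import stats

# Hypothetical satisfaction scores grouped by gender.
male   = np.array([3, 4, 2, 5, 3, 4, 3, 2, 4, 5])
female = np.array([2, 3, 3, 4, 2, 3, 2, 3, 4, 3])

stat, p = stats.ks_2samp(male, female)
print(stat, p)
```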

3. Friedman test (3 ordinal variables at different time intervals)

E.g.: job satisfaction across 3 different months

K related samples -> Friedman test

Interpretation:

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
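A sketch of the Friedman test with hypothetical ratings for the same employees across three months:

```python
from scipy import stats

# Hypothetical job-satisfaction ratings (same 8 employees, 3 months).
jan = [3, 2, 4, 3, 5, 2, 3, 4]
feb = [4, 3, 4, 2, 5, 3, 4, 4]
mar = [5, 3, 5, 3, 4, 4, 4, 5]

stat, p = stats.friedmanchisquare(jan, feb, mar)
print(stat, p)
```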
Parametric tests are performed on interval/scale variables.

The distribution must be normal.

Check normality before performing any of the parametric t-tests. If the means are the same in the t-test, look at the F value to check for differences in variance.

1. t-test with one sample (one scale variable)

E.g.: assume the average of the previous batch is 26.2; is the average of the current batch different?

(If in 2006 the literature reported a value of 26,000, what is the value now? To answer this we need a one-sample t-test.)

e.g.: age, salary etc.

Analyze -> compare means -> one-sample T test

Interpretation:

If the p-value is less than 0.05, there is a difference; reject the null hypothesis.

If the p-value is greater than 0.05, there is no difference; we fail to reject the null hypothesis.
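A sketch of the one-sample t-test against the previous batch's average of 26.2, using made-up ages for the current batch:

```python
import numpy as np
from scipy import stats

# Hypothetical ages of the current batch; H0: mean equals 26.2 (previous batch).
ages = np.array([25, 27, 26, 28, 24, 27, 26, 25, 29, 26])

t, p = stats.ttest_1samp(ages, popmean=26.2)
print(t, p)
```

Here the sample mean is very close to 26.2, so the p-value is well above 0.05 and we fail to reject H0.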

2. t-test for 2 variables (one interval variable, one nominal dichotomous variable)

We perform this test to check whether, with respect to the dichotomous variable, a difference exists in the interval variable.

E.g. leadership interaction w.r.t gender

Analyze->compare means -> independent Samples T test

Grouping variable = dichotomous variable (gender)

Normality test :

Analyze -> Explore -> dependent scale variable

Statistics -> check descriptives and outliers

Plots -> histogram and normality curve

If the p-value in the test of normality is less than 0.05, that variable is not normally distributed.

If the p-value of the Shapiro-Wilk test is > 0.05, the data are normally distributed.
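The two steps above — a Shapiro-Wilk normality check followed by the independent-samples t-test — can be sketched in SciPy with hypothetical leadership-interaction scores by gender:

```python
import numpy as np
from scipy import stats

# Hypothetical leadership-interaction scores grouped by gender.
male   = np.array([14, 16, 15, 18, 13, 17, 15, 16])
female = np.array([15, 17, 16, 18, 14, 19, 16, 17])

# Normality check first: Shapiro-Wilk p > 0.05 -> treat as normally distributed.
p_norm = stats.shapiro(np.concatenate([male, female])).pvalue

# Independent-samples t-test (equal variances assumed by default).
t, p = stats.ttest_ind(male, female)
print(p_norm, t, p)
```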


3. ANOVA (one variable should be interval and one should be nominal or ordinal)

E.g.: income and educational qualification

zone vs income

dependent list -> scale variable

factor list -> either nominal or ordinal variable

Interpretation:

If the F value is greater than 1 and the sig. value is < 0.05, there is a significant difference between the groups.

If the F value is much greater than 1, the between-group variance is high.

If the F value is close to 1, the between-group variance is low.

Check the F-table with the degrees of freedom (e.g. 4 and 30); if the computed F exceeds the critical value in the table (about 2 in that case), reject the null hypothesis.
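The one-way ANOVA above can be sketched in SciPy; the income-by-zone figures below are made up to show a case where F is well above 1 and the p-value is below 0.05:

```python
from scipy import stats

# Hypothetical income (in thousands) by zone.
south = [32, 35, 30, 38, 33]
north = [40, 42, 39, 45, 41]
east  = [31, 33, 35, 30, 32]

F, p = stats.f_oneway(south, north, east)
print(F, p)  # F well above 1 and p < 0.05 -> reject H0
```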

Test of relations

Correlation:

Measures the strength and direction of the relationship between variables.

1. Linear correlation (interval variables) (Pearson correlation) (parametric) (normal distribution)

2. Rank correlation (ordinal variables)

a. Spearman (non-parametric) (use with >= 30 data points)

b. Kendall's tau-b (non-parametric) (use only with about 10-15 data points)

Note: two asterisks mark a strong relation at the 99% confidence level, one asterisk a relation at the 95% confidence level.

Note: if the distribution is not normal, choose Spearman even for an interval variable.

Analyze -> Correlate -> Bivariate

1. Check the normality.

2. Plot a scatter graph.

3. Check the direction from the correlation coefficient.

If the p-value is < 0.05, a relation exists; look at the correlation coefficient (r) for its strength.
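The three coefficients mentioned above (Pearson, Spearman, Kendall's tau-b) can all be sketched in SciPy on the same made-up data:

```python
import numpy as np
from scipy import stats

# Hypothetical interval data (e.g. years of experience vs. salary score).
x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([10, 12, 11, 15, 16, 18, 17, 20])

r, p_pearson = stats.pearsonr(x, y)     # parametric, normal data
rho, p_spear = stats.spearmanr(x, y)    # rank correlation
tau, p_tau   = stats.kendalltau(x, y)   # better suited to small samples
print(r, rho, tau)
```

All three come out strongly positive here, showing a relation with the same direction regardless of the coefficient chosen.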

Regression

1. Check for normality.

2. Check that the correlation between the variables is better than moderate.

3. Run the scatter plot.

y is always the dependent variable: y = a + bx

In the residual plot, ZPRED (standardized predicted values) goes on the x-axis and ZRESID (standardized residuals) goes on the y-axis; the residual belongs to the dependent variable.

In regression, look at the R² value.
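A minimal sketch of simple linear regression (y = a + bx) with the R² value to look at, on made-up data:

```python
import numpy as np
from scipy import stats

# Hypothetical data: x is the independent variable, y the dependent one.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([10, 12, 11, 15, 16, 18, 17, 20])

res = stats.linregress(x, y)      # fits y = a + b*x
r_squared = res.rvalue ** 2       # the R^2 value: variance in y explained by x
print(res.intercept, res.slope, r_squared)
```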
