
Determining significance

Significance is determined when:

A statistical value calculated for your data > a critical reference value that relates to the
accuracy of your data

Test Sample Value > Test Critical Value

or when

p < 0.05
Step 1 – Type of Data

What type of data do you have?

Nominal   Ordinal   Scale (Interval, Ratio)

Step 2 – Descriptive Statistics

Are the scale data normally distributed?

No   Yes

Step 3 – Inferential Statistics

Are you looking for an association?
Are you looking for a difference?
Are you looking for a relationship?

Categorical   Nonparametric   Parametric
Categorical Data – Ho: equal proportions

One Sample: Chi-squared goodness of fit test
Two Samples, Paired: McNemar’s test
Two Samples, Unpaired: Chi-squared test of association (E > or = 5 in 75% of cases); Fisher’s Exact test (E < 5)
More than two Samples: Chi-squared test
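
The expected-count rule above can be sketched in code. This is a minimal illustration, not part of the original slides: the function names and both example tables are hypothetical, and the 75% threshold is simply the figure quoted on the slide.

```python
def expected_counts(table):
    """Expected cell counts for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def suggest_categorical_test(table, min_expected=5, min_fraction=0.75):
    """Apply the slide's rule: E >= 5 in at least 75% of cells."""
    cells = [e for row in expected_counts(table) for e in row]
    fraction_ok = sum(e >= min_expected for e in cells) / len(cells)
    if fraction_ok >= min_fraction:
        return "Chi-squared test of association"
    return "Fisher's Exact test"


# Hypothetical 2x2 tables, for illustration only.
large_table = [[20, 30], [30, 20]]
small_table = [[1, 2], [3, 4]]

chi2_large = chi_squared_statistic(large_table)
choice_large = suggest_categorical_test(large_table)
choice_small = suggest_categorical_test(small_table)
```

The statistic still has to be compared with a tabulated chi-squared critical value (or converted to a p value) before deciding on significance.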
Parametric Data: Difference – Ho: μ1 = μ0

One Sample: z-test (if σ known); One sample t-test (if σ unknown)
Two Samples, Paired: Paired t-test
Two Samples, Unpaired: t-test equal variance; t-test unequal variance (use an F test to determine equal or unequal variance)
More than two Samples: ANOVA
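
As a rough sketch of the F-test step for unpaired samples, the ratio of the two sample variances (larger over smaller) can be compared with a tabulated critical value. The function names and data below are hypothetical, and the critical value must come from an F table for the relevant degrees of freedom; 6.39 is the usual table entry for 4 and 4 degrees of freedom at the 0.05 level (one tailed).

```python
import statistics


def f_ratio(sample_a, sample_b):
    """Return (F, dof_numerator, dof_denominator), larger variance on top."""
    var_a = statistics.variance(sample_a)  # sample variance, n - 1 divisor
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


def choose_unpaired_t_test(sample_a, sample_b, f_critical):
    """Pick the t-test variant from the F ratio and a table value."""
    f_value, _, _ = f_ratio(sample_a, sample_b)
    if f_value > f_critical:
        return "t-test unequal variance"
    return "t-test equal variance"


# Hypothetical data, for illustration only.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]

f_value, dof_num, dof_den = f_ratio(group_1, group_2)
choice = choose_unpaired_t_test(group_1, group_2, f_critical=6.39)
```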
Non Parametric Data: Difference – Ho: median1 = median0

One Sample: One sample median test
Two Samples, Paired: Wilcoxon signed rank test
Two Samples, Unpaired: Mann Whitney U test
More than two Samples: Kruskal Wallis test (independent observations); Friedman test
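
The three "difference" charts can be read as a lookup from data type and study design to a test. The sketch below encodes them as a dictionary; the key names and the function are hypothetical conveniences, and the entries are copied from the charts rather than being a complete guide to test selection.

```python
DIFFERENCE_TESTS = {
    "categorical": {
        "one sample": "Chi-squared goodness of fit test",
        "two samples, paired": "McNemar's test",
        "two samples, unpaired": "Chi-squared test of association or Fisher's Exact test",
        "more than two samples": "Chi-squared test",
    },
    "parametric": {
        "one sample": "z-test (sigma known) or one sample t-test (sigma unknown)",
        "two samples, paired": "Paired t-test",
        "two samples, unpaired": "t-test, equal or unequal variance (decide with an F test)",
        "more than two samples": "ANOVA",
    },
    "nonparametric": {
        "one sample": "One sample median test",
        "two samples, paired": "Wilcoxon signed rank test",
        "two samples, unpaired": "Mann Whitney U test",
        "more than two samples": "Kruskal Wallis test or Friedman test",
    },
}


def suggest_difference_test(data_kind, design):
    """Look up the chart entry; raises KeyError for unknown inputs."""
    return DIFFERENCE_TESTS[data_kind][design]


example = suggest_difference_test("nonparametric", "two samples, paired")
```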
Relationship

Parametric data: Pearson’s correlation (r); Linear regression
Non parametric data: Spearman’s rank; Linear regression
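
To illustrate the difference between the two correlation measures, here is a small sketch using only the standard library. The data are hypothetical, the function names are not from the slides, and the ranking helper assumes there are no tied values.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient r for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
```

With these values rho is 1 because the ordering of y matches the ordering of x exactly, while r is slightly below 1 because the relationship is curved.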
t-tests - t values

ttest (aka tstat) and tcritical (aka tcrit)

What is a t value?

ttest (aka tstat) values can be calculated in different ways; it depends on the type of t-test
you are using.

However, the decision making step is the same for all t-tests:

p < 0.05

One sample: dof = n-1

Bias statement

Precision statement

Two samples: dof = (n1+n2)-2

or if n1 = n2

In this case t-test will have a negative sign

In this case t-test will have a positive sign

The sign does not matter
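
The formulas from the original slides are not preserved in this text. As an assumption, the standard textbook forms consistent with the degrees of freedom quoted above are given below, with sample mean \bar{x}, sample standard deviation s, sample size n and reference value \mu_0. In the one sample form the numerator measures how far the mean is from the reference value and the denominator is the standard error of the mean, which is plausibly what the "bias" and "precision" labels refer to.

```latex
% One sample t-test, dof = n - 1
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

% Two sample t-test with equal (pooled) variance, dof = (n_1 + n_2) - 2
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}

% Special case n_1 = n_2 = n
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2 + s_2^2}{n}}}
```

In these forms t is negative when the first mean is the smaller one and positive when it is the larger one, so for a two tailed question it is |t| that is compared with tcritical.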


To determine tcritical you need to answer the following three questions:

Is the question one tailed or two tailed?

What is your confidence level? Hint: normally it is 95% with α = 0.05

How many degrees of freedom do we need to consider for the t-test in question?
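
The three questions can be turned into a small worked sketch for a one sample t-test. Everything named here is an assumption for illustration: the data and reference value are hypothetical, and the lookup table holds the usual two tailed tcritical values at the 95% confidence level for 1 to 10 degrees of freedom.

```python
import math
import statistics

# Two tailed critical t values at 95% confidence, indexed by dof.
T_CRITICAL_95_TWO_TAILED = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def one_sample_t(data, reference_value):
    """Return (t, dof) for a one sample t-test against reference_value."""
    n = len(data)
    sem = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean
    t_value = (statistics.mean(data) - reference_value) / sem
    return t_value, n - 1


# Hypothetical measurements and reference value, for illustration only.
measurements = [5.1, 5.3, 5.2, 5.4, 5.5]
t_stat, dof = one_sample_t(measurements, reference_value=5.0)

t_crit = T_CRITICAL_95_TWO_TAILED[dof]
# Two tailed question, so the sign of t_stat does not matter.
significant = abs(t_stat) > t_crit
```

A one tailed question would need a different table column, and larger data sets would need more rows than the ten included here.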
Confidence Level / One tailed or two tailed

tcritical depends on:

• the number of ‘tails’;
• the number of degrees of freedom;
• the level of confidence

One tailed: greater than / less than
Two tailed: is there a difference ‘between’


F Test
ANOVA
Dealing with Outliers

Consider the data set below

12.53, 12.56, 12.47, 12.67, 12.48

Is the datum 12.67 an outlier or a bad fit?

To answer this we can apply a Q test, aka Dixon’s Q test.

To apply this test we need the following:

A value of Qcalculated, which can be obtained from:

• a gap value – the difference between the point under suspicion and the nearest point to it
• the range of the data – i.e. the total spread of the data

A value for Qcritical, which can be obtained from a reference table for N observations and a
particular Confidence Level

Sorted data: 12.47 12.48 12.53 12.56 12.67

gap value = 12.67 - 12.56 = 0.11

Range = 12.67 - 12.47 = 0.20
Qcalculated = gap/range = 0.11/0.20 = 0.55

Qcritical = 0.71

If Qcalculated > Qcritical then discard the datum

Here Qcalculated = 0.55 < Qcritical = 0.71, so the datum 12.67 is retained
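
The Q test calculation above can be reproduced with a few lines of code. This is a minimal sketch: the function name is hypothetical, it tests only the more isolated of the two extreme values, and Qcritical is passed in from a reference table (0.71 is the value quoted above).

```python
def dixon_q(data):
    """Return (suspect, Q) for the more isolated extreme value."""
    values = sorted(data)
    if len(values) < 3:
        raise ValueError("the Q test needs at least three observations")
    spread = values[-1] - values[0]
    if spread == 0:
        raise ValueError("all observations are identical")
    gap_low = values[1] - values[0]      # gap below the smallest value's neighbour
    gap_high = values[-1] - values[-2]   # gap above the largest value's neighbour
    if gap_high >= gap_low:
        return values[-1], gap_high / spread
    return values[0], gap_low / spread


data = [12.53, 12.56, 12.47, 12.67, 12.48]
suspect, q_calculated = dixon_q(data)

q_critical = 0.71  # table value quoted above
discard = q_calculated > q_critical
```

For this data set the suspect value is 12.67 and Qcalculated is about 0.55, so the datum is kept.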

However the argument could be made to never discard a datum unless you are confident of
an error having been made in obtaining it.

Others would repeat the process several times before having the confidence to do so

Very subjective subject


D.B. Rorabacher, Anal. Chem. 63 (1991) 139
Remember this example from the first lecture? The impact of the decision is shown here.

Average value for the data set is 20 with group 1 data left in

Standard deviation is 0.3 and SEM is 0.08

[Figure: mean 20.00; 19.40 to 20.60 (mean ± 2 × standard deviation); 19.84 to 20.16 (mean ± 2 × SEM)]

Average value for the data set is 19.47 with group 1 data taken out

Standard deviation is 0.18 and SEM is 0.05

[Figure: mean 19.47; 19.11 to 19.83 (mean ± 2 × standard deviation); 19.37 to 19.57 (mean ± 2 × SEM)]
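
The lecture data behind this example are not reproduced here, so as an illustration only, the sketch below applies the same comparison to the five-point Q test data set from earlier in these notes, with and without the suspect value 12.67. The function name is hypothetical; SEM is taken as the sample standard deviation divided by the square root of n.

```python
import math
import statistics


def summarise(data):
    """Mean, sample standard deviation and standard error of the mean."""
    n = len(data)
    sd = statistics.stdev(data)
    return {
        "n": n,
        "mean": statistics.mean(data),
        "sd": sd,
        "sem": sd / math.sqrt(n),
    }


all_points = [12.53, 12.56, 12.47, 12.67, 12.48]
without_suspect = [x for x in all_points if x != 12.67]

left_in = summarise(all_points)
taken_out = summarise(without_suspect)

# How much the headline numbers move when the datum is discarded.
mean_shift = left_in["mean"] - taken_out["mean"]
sd_ratio = taken_out["sd"] / left_in["sd"]
```

Discarding a single point lowers the mean and roughly halves the standard deviation here, which is why the decision to discard needs care.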

Now let us look at this
Conclusion – Be Careful with Stats

Be very careful with this test:

The argument could be made to never discard a datum unless you are confident of an error having been
made in obtaining it.

Others would rather repeat the process of measurement several times before having the confidence to
discard the datum

Very subjective subject

Only use the test once on any given data set

Do not discard more than one datum from a data set
