Uploaded by Liad Elmalem


INSTITUTE OF NEUROLOGY

Saiful Islam

Chris Hardy

Feedback from you about past 2 lectures

Error (SE) and Confidence Interval (CI)

•Useful examples

•Bad timing, as it was in the afternoon and everyone was tired / sleepy.

•Class quizzes are useful

Standard Deviation (SD) vs Standard Error (SE)

• The SE quantifies the typical error or difference between the mean measured in a sample

and the theoretical mean in the population from which the sample was drawn

The SE indicates how accurately the sample mean estimates the population mean

• Standard deviation (SD) of a sample of observations measures how a typical observation in

the sample differs (deviates) from the sample mean

Confidence interval – an example

• There is 95% probability that this interval contains the unknown

but true value of the population mean

• Assume we have a large sample (size > 30), say of patients with low blood pressure, and that we found these data are normally distributed; then,

• We need to obtain an upper and lower limit of the interval and

say

95% CI for mean is ( x̄ − 2 SE ) to ( x̄ + 2 SE )

• The multiplier 1.96 is approximated by 2
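The SD, SE and approximate 95% CI described above can be sketched in a few lines of Python. This is a minimal illustration only: the readings are hypothetical values made up for the example (not lecture data), and it uses the rounded multiplier 2 from the slide rather than 1.96.

```python
import math
import statistics

# Hypothetical systolic readings for 32 patients (illustrative values only,
# not taken from the lecture data).
readings = [96, 101, 88, 94, 105, 99, 92, 97,
            103, 90, 95, 100, 98, 93, 89, 102,
            96, 94, 91, 104, 97, 99, 95, 92,
            100, 98, 93, 96, 101, 94, 97, 95]

n = len(readings)
mean = statistics.mean(readings)
sd = statistics.stdev(readings)   # SD: how a typical observation deviates from the mean
se = sd / math.sqrt(n)            # SE: how accurately the sample mean estimates the population mean

# Approximate 95% CI: mean minus/plus 2 SE (2 stands in for 1.96).
ci_lower = mean - 2 * se
ci_upper = mean + 2 * se

print(f"n={n}, mean={mean:.2f}, SD={sd:.2f}, SE={se:.2f}")
print(f"approx. 95% CI for the mean: {ci_lower:.2f} to {ci_upper:.2f}")
```

Note that the SE shrinks as n grows while the SD does not, which is why the two should not be used interchangeably.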

Next meeting date 23rd October at 4:15pm

Feedback will be collected for today’s and 23rd lectures

Free coffee, snacks available!!

Please contact Max & Nikita to join in this survey.

Max : maximillian.crane.18@ucl.ac.uk

Nikita : skgtnp0@ucl.ac.uk

Learning Outcomes for today’s lecture

•When do you need these tests?

•Difference between paired and unpaired

data

•How to compare data using:

•Mann-Whitney test (two independent samples)

•Kruskal-Wallis test for > 2 independent samples.

•Wilcoxon-Signed Rank test (paired data)

•How to interpret results from STATA output

Current research at UCL

primary progressive aphasia (lvPPA)

struggle to understand speech?

Logopenic Variant PPA

•Rare variant of Alzheimer’s disease

•Young onset (<65)

•Impaired repetition of phrases and

sentences

•Word finding difficulties

understanding speech in noisy places

[video]

Speech perception in lvPPA

Current research at UCL


17 healthy controls

7 patients with lvPPA

(9 patients with nfvPPA)

and Neurosurgery

Centre

Paradigm

[Figure: freq against time (msec), 0 to 1800]

Research question

understand degraded speech?

List of variables

Diagnosis (Categorical)

SinewaveScore (SWS) (Continuous)

Bin1 (Trials 1-10 SWS; Continuous)

Bin 2 (Trials 11-20 SWS; Continuous)

Bin 3 (Trials 21-30 SWS; Continuous)

Bin 4 (Trials 31-40 SWS; Continuous)

Study objectives

understand degraded speech

2. Compare controls with nfvPPA

3. Compare controls, lvPPA, and nfvPPA

4. Compare SWS score between two time

points (Bin 1 and Bin 4) in the lvPPA

group

What are non-parametric tests?

and assume that distribution of sample means are ‘normally’

distributed – (planned to cover lecture-4 on 23 Oct 2018).

•Often data do not follow a Normal distribution, e.g. number of cigarettes smoked, cost to NHS etc.

•Positively skewed distributions

[Figure: histogram, Frequency (0 to 20) against values from 0 to 50; Mean = 8.03, Std. Dev. = 12.952, N = 30]

What are non-parametric tests?

situations where fewer assumptions have to be made

• Sometimes called Distribution-free tests

• NP tests STILL have assumptions but are less

stringent

• NP tests can be applied to Normal data but

parametric tests have greater power IF assumptions

met

Ranks

•Practical differences between parametric and

NP are that NP methods use the ranks of

values rather than the actual values

•E.g.

1,2,3,4,5,7,13,22,38,45 - actual

1,2,3,4,5,6, 7, 8, 9,10 - rank
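As a rough sketch of how values are turned into ranks, the Python function below reproduces the example above. The handling of ties (tied values share the average of the ranks they occupy) is the usual convention for rank tests; the function name and the second, tied list are made up for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks in the original order; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1   # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


actual = [1, 2, 3, 4, 5, 7, 13, 22, 38, 45]
print(average_ranks(actual))   # ranks 1 to 10, as on the slide

# Hypothetical list with a three-way tie: the three 2s occupy ranks 3, 4, 5
# and therefore each receive the average rank 4.
tied = [0, 1, 2, 2, 2, 3]
print(average_ranks(tied))
```

Because only the ordering matters, the extreme values 38 and 45 carry no more weight than their ranks 9 and 10.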

Median

• The median is the value above and below which 50% of

the data lie.

• If the data is ranked in order, it is the middle value

• In symmetric distributions the mean and median are the

same

• In skewed distributions, median more appropriate

Class exercise : Find

median 1 min

• Blood Pressure measures of 7 patients:

135, 138, 140, 140, 141, 142, 143

Median= ?

0, 1, 2, 2, 2, 3, 5, 5, 8, 10

Median=
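To check your answers after attempting the exercise, a short Python sketch using the standard library is enough. The variable names are illustrative; the second list is also compared with its mean to show how skew pulls the mean away from the median.

```python
import statistics

blood_pressure = [135, 138, 140, 140, 141, 142, 143]   # odd count: the middle value
second_set = [0, 1, 2, 2, 2, 3, 5, 5, 8, 10]            # even count: mean of the two middle values

median_bp = statistics.median(blood_pressure)
median_second = statistics.median(second_set)
mean_second = statistics.mean(second_set)

print("median blood pressure:", median_bp)
print("median of second set:", median_second)
print("mean of second set:", mean_second)
```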

Paired And Not Paired Comparisons

occasions then this is a paired comparison

• Two independent samples is not a paired comparison

• Different samples which are ‘matched’ by age and gender

are paired

Non parametric tests:

Wilcoxon tests

• Frank Wilcoxon was a chemist in the USA

who developed

test

test

Please note that parametric tests will be discussed in the next lecture (lecture-4) on 23rd October 2018.

•Histogram

[Figure: histogram of numbersSWS, Frequency (0 to 20) against score (0 to 150)]

•Histogram by group

[Figure: Density of numbersSWS1 (0 to 150) with normal curve, one panel per group; Graphs by Group]

•0 vs 1 not similar

•0 vs 2 not similar

•1 vs 2 similar

•Null hypothesis : there is no difference in the distribution of SWS score between the control and lvPPA groups

•Alternative hypothesis : there is some difference in the distribution of SWS score between the control and lvPPA groups.

•Now we check the quantile-quantile (q-q) plot to check normality for group = 1 (control group).

•Data points lie away from the straight line, which suggests the data are not normal

•Now we check the quantile-quantile (q-q) plot to check normality for group=2 (lvPPA).

•Data points lie away from the straight line, which suggests the data are not normal. There are very few data points as well.

•The control group and lvPPA group are independent, the samples are very small and neither is normally distributed, so the assumptions of a non-parametric test are met.

•We should choose the non-parametric version of the two independent sample test, called the Mann-Whitney test, to compare SWS scores

•Stata output

STATA code: . ranksum numbersSWS1 if Group== 0 | Group==1 , by(Group)

STATA output:

Group    obs    rank sum    expected
0        17     264         212.5
1        7      36          87.5

adjustment for ties -1.19

z = 3.279

Prob > |z| = 0.0010

• The output gives us a handy table displaying the two groups, their Obs (number of observations), the observed rank sums and the rank sums that would be expected if the null hypothesis were true (if there were no difference).

• Tied ranks can be an issue, so below the table there is a variance adjustment to account for these ties.

• Then you are reminded of the null hypothesis, and given the z-statistic (3.28) and p-value (0.001), which suggest that there is a significant difference in the distribution (medians) of SWS between the control and experimental groups.
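The numbers in this output can be checked by hand. The sketch below is a Python illustration of the normal approximation for the rank-sum test, not Stata's own code: it takes the group sizes, the observed rank sum for the control group and the reported ties adjustment, and applies the standard formulas (expected rank sum n1(N+1)/2, variance n1·n2·(N+1)/12). The function name is made up for the example.

```python
import math


def rank_sum_z(n1, n2, rank_sum_1, ties_adjustment=0.0):
    """Normal approximation for the two-sample rank-sum (Mann-Whitney) test."""
    total = n1 + n2
    expected_1 = n1 * (total + 1) / 2                # expected rank sum of group 1 under H0
    variance = n1 * n2 * (total + 1) / 12 + ties_adjustment
    z = (rank_sum_1 - expected_1) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return expected_1, z, p_two_sided


# Values read from the output above: 17 controls, 7 lvPPA patients,
# control rank sum 264, adjustment for ties -1.19.
expected, z, p = rank_sum_z(n1=17, n2=7, rank_sum_1=264, ties_adjustment=-1.19)
print(f"expected rank sum = {expected}, z = {z:.3f}, p = {p:.4f}")
```

The result should land close to the reported z = 3.279 and Prob > |z| = 0.0010; small differences can arise because the ties adjustment is only shown to two decimals.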

Class Quiz 1 min in

pairs

and experimental lvPPA are not

independent?

•Null hypothesis : there is no difference in the distribution of SWS score between the control and nfvPPA groups

•Alternative hypothesis : there is some difference in the distribution of SWS score between the control and nfvPPA groups.

•Now we check the quantile-quantile (q-q) plot to check normality for nfvPPA.

•Data points lie away from the straight line, which suggests the data are not normal. There are very few data points as well.

•The control group and nfvPPA group are independent, the samples are very small and neither is normally distributed, so the assumptions of a non-parametric test are met.

•We should choose the non-parametric version of the two independent sample test, called the Mann-Whitney test, to compare SWS scores.

•Stata output

STATA code: . ranksum numbersSWS1 if Group== 0 | Group==2 , by(Group)

STATA output:

Group    obs    rank sum    expected
0        17     286         221
2        8      39          104

adjustment for ties -1.25

z = 3.795

Prob > |z| = 0.0001

• The output gives us a handy table displaying the two groups, their Obs (number of observations), the observed rank sums and the rank sums that would be expected if the null hypothesis were true (if there were no difference).

• Tied ranks can be an issue, so below the table there is a variance adjustment to account for these ties.

• Then you are reminded of the null hypothesis, and given the z-statistic (3.80) and p-value (0.0001), which suggest that there is a significant difference in the distribution (medians) of SWS between the control and experimental groups.

•Null hypothesis : there is no difference in the distribution of SWS scores among the groups (control, lvPPA and nfvPPA)

•Alternative hypothesis : there is a difference in the distribution of SWS scores between at least one pair of groups.

•Quantitative measure for each outcome

•Overall outcome not normally distributed

•The shapes of the distributions are not similar for at least one pair of groups

•The groups are independent of each other

1. We have a quantitative measure for each outcome

2. The overall outcome is not normally distributed

3. The shapes of the distributions are not similar for at least one pair of groups (e.g. control vs lvPPA)

4. The groups are independent of each other

•As we have more than two groups and the overall outcome is not normally distributed, a non-parametric test is preferred

•We have more than two independent groups.

•We will consider a non-parametric test called the Kruskal-Wallis equality-of-populations rank test

•STATA output

STATA command: kwallis numbersSWS1, by( diagnosis1)

STATA output:

Kruskal-Wallis equality-of-populations rank test

diagnosis1    Obs    Rank Sum
Control       17     397.00
lvPPA         7      63.50
nfvPPA        8      67.50

probability = 0.0001

probability = 0.0001

• We had ties in our data, so we want to consult the Kruskal-Wallis H test results

highlighted in the red rectangle above.

• The top line (i.e., "chi-squared with ties = 19.37 with 2 d.f.") reports the chi-squared

value and the degrees of freedom of the test.

• The line below this one (i.e., "probability = 0.0001") indicates the statistical

significance of the Kruskal-Wallis H test (i.e., the p-value).

• We can see that the p-value is 0.0001, which is below the 0.05 significance level, and, therefore, there is a statistically significant difference in the median SWS score between the three groups of the independent variable, diagnosis (i.e., control vs lvPPA vs nfvPPA)
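The Kruskal-Wallis statistic can be recomputed from the rank sums in the output table. The Python sketch below uses the standard formula H = 12 / (N(N+1)) × Σ(R²/n) − 3(N+1) without a correction for ties, so it is an approximation to what Stata reports rather than a reproduction of its code; the function name is illustrative.

```python
import math


def kruskal_wallis_h(group_sizes, rank_sums):
    """Kruskal-Wallis H from group sizes and rank sums (no correction for ties)."""
    total = sum(group_sizes)
    weighted = sum(r * r / n for r, n in zip(rank_sums, group_sizes))
    return 12 / (total * (total + 1)) * weighted - 3 * (total + 1)


# Values read from the output above: Control, lvPPA, nfvPPA.
sizes = [17, 7, 8]
rank_sums = [397.0, 63.5, 67.5]

h = kruskal_wallis_h(sizes, rank_sums)
df = len(sizes) - 1

# With exactly 2 degrees of freedom the chi-squared upper tail is exp(-h / 2).
p = math.exp(-h / 2)

print(f"H = {h:.2f} with {df} d.f., p = {p:.4f}")
```

The uncorrected H comes out close to the chi-squared value of 19.37 quoted above, and the p-value rounds to 0.0001. A tie-corrected statistic would be slightly larger, which does not change the conclusion.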

• There are only 7 patients in this group, so a non-parametric test is appropriate

time1 and time4

• First take the differences of SWS between two time points:

Table : SWS score at two time points for the patients with lvPPA

Time1    Time4    diff = Time4 - Time1
2        8        6
12       17       5
2        6        4
20       27       7
30       28       -2
1        0        -1
23       30       7

• Almost all the data points in the q-q plots are away from the straight line, so we apply a non-parametric Wilcoxon signed-rank test (an alternative to the paired t-test) to test the hypothesis that there is no difference in SWS score between the two time points.

• Use STATA command: gen diff=Time4-Time1 if Diagnosis==2

• Stata output

STATA command: signrank diff=0

STATA output:

Wilcoxon signed-rank test

sign        obs    sum ranks    expected
positive    5      25           14
negative    2      3            14
zero        0      0            0
all         7      28           28

unadjusted variance 35.00

adjustment for ties -0.13

adjustment for zeros 0.00

Ho: diff1 = 0

z = 1.863

Prob > |z| = 0.0625

• Stata output

Binom. Interp.

Variable Obs Percentile Centile [95% Conf. Interval]

diff 7 50 5 -1.685714 7

•The test gives a p-value of 0.0625, suggesting that there is not enough evidence of a difference in median SWS score between the two time points for the patients with lvPPA.

• The 95% confidence interval for the median difference runs from -1.69 to 7. This interval includes 0, which is consistent with there being no clear change in sinewave score for lvPPA patients between the two time points.
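The signed-rank calculation for the seven lvPPA patients is small enough to reproduce from the table of Time1 and Time4 scores. The Python sketch below follows the usual normal approximation (expected sum n(n+1)/4, variance n(n+1)(2n+1)/24, reduced by Σ(t³ − t)/48 for tied absolute differences). It is an illustration under those textbook formulas, not Stata's implementation, and it assumes there are no zero differences, which holds for these data.

```python
import math
from collections import Counter

# Scores from the table: SWS at Time1 and Time4 for the 7 lvPPA patients.
time1 = [2, 12, 2, 20, 30, 1, 23]
time4 = [8, 17, 6, 27, 28, 0, 30]
diffs = [after - before for before, after in zip(time1, time4)]

# Rank the absolute differences; tied values share the average rank.
abs_d = [abs(d) for d in diffs]
ranks = [sum(x < v for x in abs_d) + (sum(x == v for x in abs_d) + 1) / 2
         for v in abs_d]

w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)   # sum of ranks, positive differences
w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)   # sum of ranks, negative differences

n = len(diffs)
expected = n * (n + 1) / 4
unadjusted_variance = n * (n + 1) * (2 * n + 1) / 24
ties_adjustment = -sum(t ** 3 - t for t in Counter(abs_d).values()) / 48
variance = unadjusted_variance + ties_adjustment

z = (w_pos - expected) / math.sqrt(variance)
p = math.erfc(abs(z) / math.sqrt(2))                    # two-sided p-value

print(f"positive = {w_pos}, negative = {w_neg}, expected = {expected}")
print(f"variance = {unadjusted_variance} {ties_adjustment:+.3f}, z = {z:.3f}, p = {p:.4f}")
```

With only 7 pairs the normal approximation is rough, which is one more reason to read a p-value this close to 0.05 with caution.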

Take home : What statistical methods should I use to

analyze my data?

• Choose appropriate statistical methods/tests

Will cover in next lecture 23rd

October 2018

Suggested Reading

Martin Bland (4th edition): page 117-191

Sterne : page 344-350

•Any questions?
