
By

&

Dr Caroline Selai , Senior Lecturer , IoN

Date : 02.10.2018

Module

Cohort code : CLNE0007

study-practical

Module name: Research Methods and Introduction to Statistics

Introduction to statistics will cover:

• How to conduct research / research process?
• Role of statistics in research.
• What does my research data look like?
• Data presentation/display.
• Data analysis using appropriate statistical tests/methods
• How to interpret / present statistical output?

Lecture : 8 one-hour lectures
Workshops : 2 (Repeated)
Revision Lectures : 2
Assessments: 1 hours unseen written exam (proposed).

Critical appraisal will cover:

• Evidence-based medicine (EBM)?
• Why EBM?
• Hierarchy of evidence.
• How to extract the evidence you need?
• What is critical appraisal?
• Methods of critical appraisal?
• Consider a variety of published research papers to make it clear how you could appraise them critically.

Lecture : 8 one-hour lectures

Exam date : 6th February 2019 at 11.30am

Total credits = 15 . Half in introduction to statistics & the other half for research

Overall module aim

independently by

• Understanding research process

• Critically appraising any research paper

• Understanding current research methods by critically appraising

some recent research

• Clearly knowing different statistical methods needed in common

research

• Presenting/Displaying your own research data

• Learning Statistical tests/methods needed for neuroscience

research

• Learn clearly at least one statistical software (we will use

STATA) aiming to analyze and interpret your own data.

Lesson plan

• Research process

• Role of statistics in data analysis

• Summary measure of the data

• Identifying outliers in your data

• Types of data

• Data management

Introduction to data analysis

Learning outcome

familiar with

• Importance of learning statistics

• How statistics is related to Neurology/Neuroscience

Different ways of displaying and summarising data

Identifying outliers in my data

How I can manage my own research data

Using STATA to carry out exploratory analysis and presentation of a

dataset (histogram, box plot, cumulative frequency)

Introduction to data analysis

The reason you are here is because you have an inquiring mind!

• Does using a mobile phone increase risk of brain cancer?

• Is drinking the occasional glass of wine during pregnancy harmful to the baby?

• Why do women live longer than men?

• What are the potential health risks of climate change, and who will be most

affected?

• Is there a gene for Alzheimer’s?

• Will banning cigarette sales from vending machines reduce smoking rates in

children?

• Should all children be routinely offered the swine flu vaccination?

explanation of those data

Other Reasons

Analysing MRI data

Analysing dementia / Alzheimer disease data

Preparing poster

Reading scientific paper

Conducting MSc project

Post-qualification:

Interview for PhD/Job

Doing PhD

Publishing paper

Leading judge who hanged himself after dementia diagnosis left wife a note

saying she had 'a life to live', inquest hears :Telegraph Reporters

7 June 2017 • 11:39am

Sir Nicholas, who has died aged 71, was England’s senior divorce court judge. He had a rare neurological disease called fronto-temporal lobe dementia that had only recently been diagnosed.

of dementia and is sometimes called Pick's disease or frontal lobe dementia, according to the Alzheimer's Society.

It affects part of the brain connected to control behaviour and emotions plus the understanding of words. Fronto-temporal dementia is caused when nerve cells in the frontal and/or temporal lobes of the brain die and the pathways that connect them change.

We might save the lives of people like this through early detection of this rare neurological disease, by doing more in-depth research in this area.

Undiagnosed: mother-of-four Marina Fagan had a family history of aneurism. Her brain

disease went unrecognised for 13 days :

Evening Standard : Wednesday 15 June 2016 08:53

Marina Fagan, a 51-year-old mother of four, was discharged following a two-day stay at

Whipps Cross hospital, in Leytonstone, after investigations ruled out a brain haemorrhage.

She returned to A&E the same day as her headache persisted but was advised to get her

GP to refer her to an outpatient clinic. Her condition was finally diagnosed 11 days after she

was first admitted to hospital. She died six days later, on October 6, 2015.

So we need more research & more neurologists to understand the underlying/persisting disease process.

Introduction to data analysis

Statistical thinking is involved in all these phases, along with substantive scientific knowledge.

Introduction to data analysis

• Scientists rely on data to provide empirical evidence to support and refine their

theories

• Governments, businesses, communities, hospitals, GPs and individuals need data to help inform decision-making and risk assessment

• Learning statistics will provide you with basic skills to read and

understand data

• Broadly speaking, statistics provides us with techniques for

– Summarising and presenting the information contained in a data set

– Handling and quantifying variation and uncertainty in the data, to help us

infer what they tell us about the underlying theory of interest

Summary measure of any numerical data:

mean, median, mode and inter-quartile range (IQR)

Mean, Median, Mode , range and IQR

Example: Patient ages (ordered)

24 32 37 39 40 41 41 43 44

25th value 75th value

Inter-quartile range = 75th value – 25th value =

Range = largest value – smallest value = ?

Mode: the value that occurs most often, which is ……..
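
A minimal Python sketch of these summary measures for the age example, offered as a cross-check rather than as part of the STATA workflow. It assumes the data set is the eight ages that the STATA output in this handout reports (Obs = 8), i.e. that the 40 printed in the ordered row marks the median rather than a ninth patient. The helper `quartiles_by_halves` is a hypothetical name; it uses the median-of-halves rule, which reproduces the quartiles 34.5 and 42 for these data but is only one of several quartile conventions.

```python
import statistics

# Assumption: the eight patient ages (the 40 on the slide is read as the median marker).
ages = [24, 32, 37, 39, 41, 41, 43, 44]


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves of the data."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]  # leaves out the middle value when the count is odd
    return statistics.median(lower), statistics.median(upper)


mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)       # the value that occurs most often
age_range = max(ages) - min(ages)      # largest value minus smallest value
q1, q3 = quartiles_by_halves(ages)
iqr = q3 - q1                          # 75th value minus 25th value

print("mean", mean_age, "median", median_age, "mode", mode_age)
print("range", age_range, "Q1", q1, "Q3", q3, "IQR", iqr)
```

Other packages may interpolate quartiles differently, so small differences in Q1 and Q3 between tools are expected on small samples.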

Variability within data – Variance and standard deviations

[Figure: illustration of variability within data; values shown: 11, 9, 10, 12, 8, 10]

Summary measure of any numerical data:

Use statistical software STATA

We are in the age of technology, so use the statistical software STATA. Type the data into STATA and give the variable the name ‘Age’.

Type the following command on the STATA command line:

summarize Age

Output is

But you should know what the Mean, Std. Dev. and all the other quantities are.

Summary measure of any numerical data:

Use statistical software STATA

Type the following command on the STATA command line to get more information (quartiles, median etc.):

summarize Age, detail

Output is:

                    Age of patients
      Percentiles    Smallest
 1%       24            24
 5%       24            32
10%       24            37        Obs                 8
25%       34.5          39        Sum of Wgt.         8

50%       40                      Mean           37.625
                     Largest      Std. Dev.    6.674846
75%       42            41
90%       44            41        Variance     44.55357
95%       44            43        Skewness    -1.136833
99%       44            44        Kurtosis     3.142978

• Mean < median: there is no symmetry in the data and they look negatively skewed, so the mean and standard deviation are not appropriate measures; report the median and inter-quartile range.

• Mean > median: there is no symmetry in the data and they look positively skewed, so the mean and standard deviation are not appropriate measures; report the median and inter-quartile range.

• Data spread equally over both sides: the mean and standard deviation are appropriate measures.
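
The figures in the detailed STATA output can be reproduced by hand, which helps show what each one measures. The sketch below is a Python cross-check under stated assumptions: the data are the eight ages listed under "Smallest" and "Largest" in that output, and skewness and kurtosis are computed from moments with divisor n, a choice that matches the printed values here. The mean-versus-median rule at the end is the simple screening rule from this slide, not a formal test.

```python
import statistics

# Assumption: the eight ages listed under "Smallest" and "Largest" in the STATA output.
ages = [24, 32, 37, 39, 41, 41, 43, 44]

n = len(ages)
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
variance = statistics.variance(ages)  # sample variance, divisor n - 1
std_dev = statistics.stdev(ages)      # square root of the sample variance

# Central moments with divisor n, used for skewness and kurtosis.
m2 = sum((x - mean_age) ** 2 for x in ages) / n
m3 = sum((x - mean_age) ** 3 for x in ages) / n
m4 = sum((x - mean_age) ** 4 for x in ages) / n
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2

# Screening rule from the slide: compare the mean with the median.
if mean_age < median_age:
    shape = "negatively skewed"
elif mean_age > median_age:
    shape = "positively skewed"
else:
    shape = "roughly symmetric"

print(round(variance, 5), round(std_dev, 6), round(skewness, 6), round(kurtosis, 6))
print(shape)
```

For these ages the mean lies below the median and the skewness is negative, so the median and inter-quartile range are the measures to report.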

Introduction to data analysis

a feel for:

– typical (central) values and range of values

– shape and spread of the distribution of values

– interesting patterns and relationships in the data

– ……..

data quality, e.g.:

– outlying / erroneous observations

– digit preference

– ……..

Introduction to data analysis

Displaying Data

• Tables

– Frequency Tables

– Cross tabulations (contingency tables)

– …...

• Graphs

– Bar Charts

– Histograms

– Line Graphs

– ……

Introduction to data analysis

Displaying Data

dataset, it is essential to carry out some simple

exploratory analyses to get a feel for the data

Example: Normal and day case hospital admissions in England

with a neurological condition.

Introduction to data analysis

[Figure: Histogram of the 2012/13 ordinary hospital admissions with a neurological condition among England CCGs. Horizontal axis: Ordinary hospital admissions. Vertical axis: Frequency (0 to 40).]

Introduction to data analysis

patterns.

• Too many classes and you will end up with only one

observation per class.

• Aim is to ensure that the number of classes does not mask

interesting patterns

– Rule of thumb: optimal number of classes is approximately log

(base 2) of the number of observations

Number of obs    Approx. number of classes
50               5-6
100              6-7
1000             10
10000            13
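
The rule of thumb in this table is easy to check numerically. The short Python sketch below only evaluates log base 2 of the number of observations; the function name `approx_classes` is a hypothetical helper, and the result is a starting point to adjust by eye rather than a fixed answer.

```python
import math


def approx_classes(n_obs):
    """Rule of thumb from the slide: about log2(number of observations) classes."""
    return math.log2(n_obs)


for n_obs in (50, 100, 1000, 10000):
    print(n_obs, round(approx_classes(n_obs), 2))
```

The values fall between 5 and 6 for 50 observations, between 6 and 7 for 100, and round to 10 and 13 for 1000 and 10000, in line with the table.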

Introduction to data analysis

[Figure: box plot of ordinary hospital admissions, vertical axis marked at 5,000, 10,000 and 15,000]

The box indicates the median and the two quartiles (1st quartile = 2269, median = 2895 and 3rd quartile = 4013). The vertical lines above and below the box indicate the range of values, with outliers shown as separate points.

Introduction to data analysis

[Figure: Ordinary hospital admissions, axis marked from 200000 to 800000]

Identifying outliers in your data

• Outliers are identified by assessing whether or not they fall within a set

of numerical boundaries called "inner fences" and "outer fences".

• A point that falls outside the data set's inner fences is classified as a

minor outlier, while one that falls outside the outer fences is classified

as a major outlier.

• Multiply the inter-quartile range (Q3-Q1) by 1.5, then add this number to Q3 and subtract it from Q1 to find the upper and lower boundaries of the inner fences.

• Multiply the inter-quartile range (Q3-Q1) by 3 (instead of 1.5), then add this number to Q3 and subtract it from Q1 to find the upper and lower boundaries of the outer fences.

Identifying outliers in your data-example hospital admissions

• Use summ Ordinary1213, det to find the 1st quartile & 3rd quartile

• IQR = Q3-Q1 = 4013-2269 = 1744

• 1744 × 1.5 = 2616, 1744 × 3 = 5232

• Boundaries for inner fence : (Q3 + 2616, Q1 - 2616) = (6629, -347)

• Boundaries for outer fence : (Q3 + 5232, Q1 - 5232) = (9245, -2963)

• As hospital admissions can never be negative, we now check how many data points are above the inner fence & how many are above the outer fence using STATA :

• count if Ordinary1213 > 6629

15

• count if Ordinary1213 > 9245

4

As the data are positively skewed, report the median and inter-quartile range.
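
The fence arithmetic can be written as a small function. This is a minimal Python sketch using the quartiles quoted for the hospital admissions data (Q1 = 2269, Q3 = 4013); the names `fences` and `classify` and the three test values are hypothetical, chosen only for illustration.

```python
def fences(q1, q3):
    """Return ((inner_low, inner_high), (outer_low, outer_high)) from the quartiles."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    return inner, outer


def classify(value, q1, q3):
    """Label a value as a major outlier, a minor outlier, or not an outlier."""
    inner, outer = fences(q1, q3)
    if value < outer[0] or value > outer[1]:
        return "major outlier"
    if value < inner[0] or value > inner[1]:
        return "minor outlier"
    return "not an outlier"


inner, outer = fences(2269, 4013)
print("inner fences:", inner)
print("outer fences:", outer)

# Hypothetical admission counts, for illustration only.
for value in (3000, 7000, 10000):
    print(value, classify(value, 2269, 4013))
```

Because admissions cannot be negative, only the upper fences matter for these data.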

Types of data

Quantitative

Continuous                      Discrete
Blood pressure                  Number of children (parity)
Age                             Number of cigarettes per day
Concentration of a pollutant    Counts of deaths in small areas

Categorical

Ordinal (ordered categories)               Nominal (unordered categories)
Grade of breast cancer                     Sex (male/female)
Disease severity (mild/moderate/severe)    Exposed/unexposed
Social class (I, II, III, IV, V)           Ethnicity (white/asian/black/other)

Comments

• Categorical data that take on only two distinct

values are said to be dichotomous or binary

• Categorical data are often coded using numerical

values (e.g. 0 = NO, 1 = YES)

– statistical packages usually treat numeric data as quantitative

unless you explicitly declare it to be categorical

the accuracy of the measurement instrument

Quantitative versus Categorical

provided by continuous data, in which case we can

transform into categorical (ordinal) data.
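
Recoding a continuous measurement into categories amounts to applying cut-points. The Python sketch below uses the cut-points given for the two examples in this section (birthweight at 2.5 kg, ambient NO2 at 30 and 60); the function names and the sample values are hypothetical, and treating exactly 30 and exactly 60 as MEDIUM is an assumption about how the "30-60" band is meant.

```python
def low_birthweight(weight_kg):
    """Return 1 for low birthweight (< 2.5 kg), 0 for normal (>= 2.5 kg)."""
    return 1 if weight_kg < 2.5 else 0


def no2_band(concentration):
    """Band an ambient NO2 concentration as LOW (< 30), MEDIUM (30-60) or HIGH (> 60)."""
    if concentration < 30:
        return "LOW"
    if concentration <= 60:
        return "MEDIUM"
    return "HIGH"


# Hypothetical measurements, for illustration only.
birthweights = [3.4, 2.1, 2.5, 1.9]
no2_levels = [12.0, 30.0, 45.5, 61.0]

coded_bwt = [low_birthweight(w) for w in birthweights]
coded_no2 = [no2_band(c) for c in no2_levels]

print(coded_bwt)
print(coded_no2)
```

Keeping the original continuous column alongside the recoded one lets you change the cut-points later without re-entering data.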

• For example, in a study of the effect of maternal smoking

on birthweight, we can recode birthweight as:

≥2.5kg 0 (normal bwt)

<2.5kg 1 (low bwt)

prevalence, we can recode ambient NO2 concentration as:

<30 µg m⁻³       LOW
30-60 µg m⁻³     MEDIUM
>60 µg m⁻³       HIGH

Introduction to data analysis

Transformations

scale, to aid interpretation and/or statistical analysis

• Reasons for transforming data include:

– improved approximation to normality

– reducing skewness

– linearising the relationship between 2 variables

– making multiplicative relationships additive

– Natural logarithm (y = logₑ(x), so x = eʸ or exp(y), where e = 2.718…)

– Power transformations (y = √x, y = x², y = x³, etc.)

Introduction to data analysis

Log transformation

• Log transform stretches scale at lower end and compresses it at upper end

[Figure: curve y = log(x) for x from 0 to 10, marking log(1) = 0 and log(e) = 1]

[Figure: two histograms of Number of patients, one against CD4 count (per cubic mm) and one against Log CD4 count (per cubic mm)]
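
The properties of the log transformation listed on these slides can be checked numerically. The following Python sketch uses the natural logarithm from the standard `math` module; the specific numbers (350, 4 and 25) are arbitrary illustrative values, not study data.

```python
import math

# Landmarks on the curve y = log(x): log(1) = 0 and log(e) = 1.
log_of_one = math.log(1)
log_of_e = math.log(math.e)

# Equal steps in x are stretched at the lower end and compressed at the upper end.
gap_low = math.log(2) - math.log(1)    # step from 1 to 2
gap_high = math.log(10) - math.log(9)  # step from 9 to 10

# Back-transformation: x = exp(y) recovers the original value.
x = 350.0  # arbitrary illustrative value
y = math.log(x)
x_back = math.exp(y)

# A multiplicative relationship becomes additive on the log scale.
a, b = 4.0, 25.0
log_product = math.log(a * b)
sum_of_logs = math.log(a) + math.log(b)

print(log_of_one, log_of_e)
print(round(gap_low, 3), round(gap_high, 3))
print(round(x_back, 6), round(log_product, 6), round(sum_of_logs, 6))
```

The step from 1 to 2 is much larger on the log scale than the step from 9 to 10, which is why a long right tail is pulled in after transformation.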

Class Exercise

Classify the following data as categorical

(Binary/nominal/ordinal) or numerical (discrete/continuous)

Age at diagnosis Age of patients at diagnosis of

cancer

3=Tertiary

Smoking status 0= Non-smoker, 1=Smoker

Derived variable

Percentage, Ratios, Can be treated as numerical in most analyses

Rates & Scores

Data display in a spreadsheet / Data management

Suppose you are running a study at UCLH aiming to lower low-density lipoprotein (LDL) cholesterol levels in patients with cardiovascular disease. Your study is a double-blind, placebo-controlled RCT. Patients were randomly assigned to receive evolocumab (either 140 mg every 2 weeks or 420 mg monthly) or matching placebo as subcutaneous injections. Out of the first 20 patients:

Group: 11 patients received evolocumab and 9 patients received placebo.

Gender: 12 female and 8 male.

Statin use: High intensity – 12 patients

Medium intensity – 6 patients

Low intensity – 2 patients

Using patient IDs 1 to 20 and appropriate codes, display the above information in a spreadsheet. Ignore relationships between variables for now.

Data display in a spreadsheet - coding

Group: 1 if patient received evolocumab

0 if patient received placebo

Gender: 1 if patient is female

0 if patient is male

Statin use: 2 for High intensity

1 for Medium intensity

0 for Low intensity

Data display in a spreadsheet – looks like:

ID Group Gender Statin use

1 1 0 0

2 1 1 2

3 1 1 2

4 1 1 1

5 1 0 2

6 1 1 2

7 1 1 2

8 1 0 2

9 1 1 1

10 1 0 2

11 1 1 2

12 0 0 0

13 0 1 1

14 0 0 2

15 0 1 2

16 0 1 2

17 0 0 1

18 0 1 1

19 0 0 1

20 0 1 2

Data display in a spreadsheet - coding

Suppose the patients are aged between 50 and 70, with a mean age of 60 years.

Can you now add an extra column for the age of the patients?

In your study you might have different variables, but you need to present them in a similar way!

Data display in a spreadsheet – type in extra column Age

ID Group Gender Statin use Age

1 1 0 0 56

2 1 1 2 52

3 1 1 2 59

4 1 1 1 60

5 1 0 2 63

6 1 1 2 70

7 1 1 2 63

8 1 0 2 58

9 1 1 1 55

10 1 0 2 59

11 1 1 2 68

12 0 0 0 59

13 0 1 1 67

14 0 0 2 69

15 0 1 2 52

16 0 1 2 53

17 0 0 1 61

18 0 1 1 63

19 0 0 1 62

20 0 1 2 51
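
A coded spreadsheet like this can be checked mechanically against the study description. The Python sketch below copies the 20 rows above and tabulates each coded column; the column order (ID, group, gender, statin use, age) is inferred from the stated totals and coding scheme, so treat it as an assumption.

```python
from collections import Counter
from statistics import mean

# Assumed column order: (patient ID, group, gender, statin use, age).
rows = [
    (1, 1, 0, 0, 56), (2, 1, 1, 2, 52), (3, 1, 1, 2, 59), (4, 1, 1, 1, 60),
    (5, 1, 0, 2, 63), (6, 1, 1, 2, 70), (7, 1, 1, 2, 63), (8, 1, 0, 2, 58),
    (9, 1, 1, 1, 55), (10, 1, 0, 2, 59), (11, 1, 1, 2, 68), (12, 0, 0, 0, 59),
    (13, 0, 1, 1, 67), (14, 0, 0, 2, 69), (15, 0, 1, 2, 52), (16, 0, 1, 2, 53),
    (17, 0, 0, 1, 61), (18, 0, 1, 1, 63), (19, 0, 0, 1, 62), (20, 0, 1, 2, 51),
]

# Frequency of each code in the three categorical columns.
group_counts = dict(Counter(row[1] for row in rows))   # 1 = evolocumab, 0 = placebo
gender_counts = dict(Counter(row[2] for row in rows))  # 1 = female, 0 = male
statin_counts = dict(Counter(row[3] for row in rows))  # 2 = high, 1 = medium, 0 = low

ages = [row[4] for row in rows]
mean_age = mean(ages)
youngest, oldest = min(ages), max(ages)

print(group_counts, gender_counts, statin_counts)
print(mean_age, youngest, oldest)
```

The tabulated counts should match the description (11 and 9, 12 and 8, then 12, 6 and 2), and the ages should average 60 within the range 50 to 70; any mismatch points to a typing or coding error.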

Data display in a spreadsheet

Check twice that your coding is correct and make sure you didn’t enter any wrong information or mistype any number

Check relevant research data matched your findings

and have lowered LDL. Is it consistent with yours?

Identify and develop methods for handling missing values

Introduction to data analysis

Recap

(continuous, discrete, categorical)

• Most appropriate way of presenting data depends on data type

• Frequency tables are appropriate for all types of data

– For quantitative data, need to think carefully about appropriate choice of

classes/intervals to group data before display

– Keep information in tables to the minimum necessary to convey the

message (story) you want to present (significant figures, number of

variables/categories)

• Histograms and box plots are appropriate for quantitative data

Reference :

1. Introduction to medical statistics by Martin Bland : Chapter – 4

2. Medical Statistics by B. Kirkwood & J. Sterne : Chapter-4

3. Practical Statistics for medical research by Douglas Altman : Chapter 6
