
Critical Appraisal

Mark Kerr
Clinical Librarian, EKHUFT
Learning objectives
By the end of this session you will be able to:
• critically appraise research papers
• identify internal and external validity and consider usefulness
• identify different study designs and know which type of
research question each relates to
• understand and identify sources of bias, confounding and other
weaknesses which may affect the validity or applicability of a
study's results
• interpret the key statistics in a research paper
• consider teaching critical appraisal with greater confidence
Critical appraisal definitions

“Critical appraisal is the process of carefully and systematically
examining research to judge its trustworthiness, and its value
and relevance in a particular context” Burls, Amanda (2009)

“EBP requires that decisions about health and social care are
based on the best available, current, valid and relevant
evidence. These decisions should be made by those receiving
care, informed by the tacit and explicit knowledge of those
providing care, within the context of the available resources”
Dawes et al. 2005 p.7
What is critical appraisal?
Critical > Not “criticise” but “critique”
CA is not just:
 Negative dismissal
 Assessment of results
 Statistical analysis
 Task for experts
- but it may include all of these as part of the overall process

CA is:
 Balanced assessment
 Test of processes
 Look at all aspects
 Task for all ‘users’
- a considered look at the evidence, using judgment as well as set criteria
Critical appraisal as part of
Evidence-Based Practice
5 steps of EBP

1. Formulate an answerable question (in response to an identified
information need)
2. Find evidence from research
3. Appraise for validity and usefulness (critical appraisal)
4. Implement change
5. Evaluate performance
Essential questions for a paper –
but is it appraisal?
• Who was the research carried out by
(organisation/individuals)? Seniority, reputation, experience?
• Which journal was the research published by? Impact factor?
Reputation?
• When was the research carried out and when was it
published?
• Where is the research based? (setting)
In a nutshell
1. PICO – if relevant, then read it
2. Difference due to chance? Due to bias?
• If not then likely to be due to intervention
3. Is difference big enough to be clinically useful?
• And is it ‘real’ (patient-oriented) or surrogate
outcome?
4. Does it change/inform knowledge or practice?
• Can I apply this to my patient?

1. SCREEN   2. APPRAISE   3. ASSESS   4. APPLY
IRRELEVANCE: PICO
BIAS: RAMMBO
CHANCE: Sample size, P value, 95% CI
FUTILITY: Effect size, clinical significance
REPRODUCIBILITY: Similar studies, meta-analysis
ACCEPTABILITY: Social, regulatory & clinical approval
AFFORDABILITY: Cost-effectiveness studies
GENERALISABILITY: Are they sufficiently like us?
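The CHANCE check above rests on sample size, P value and 95% confidence interval. A minimal Python sketch (hypothetical data: the same 30% event rate observed at three sample sizes) of how the 95% CI for a proportion narrows as the sample grows:

```python
from math import sqrt

def proportion_ci(events, n, z=1.96):
    """Approximate (Wald) 95% confidence interval for a proportion."""
    p = events / n
    se = sqrt(p * (1 - p) / n)          # standard error of the proportion
    return p - z * se, p + z * se

# Hypothetical: the same 30% event rate observed at three sample sizes
for n in (50, 500, 5000):
    lo, hi = proportion_ci(int(n * 0.3), n)
    print(f"n={n:5d}  95% CI: {lo:.3f} to {hi:.3f}")
```

The interval width shrinks roughly with 1/√n: the same observed rate is far more precisely estimated in the larger trial, which is why sample size sits alongside the P value in the CHANCE check.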
What is critical appraisal?

• Was it the right question to ask?


• Did they use the right method?
• Did they measure the right outcome(s)?
• Did they achieve validity/significance?
• Was there bias, conflict of interest, fraud?!?
• Is the result clinically significant?
• Is it relevant to my clinical practice?
• Did I learn something new / useful?
• Does it change my clinical practice?
Quick Appraisal – GATE tool
GATE = Graphic Appraisal Tool for Epidemiological studies

Study Design


The Three Questions

1. Are the results valid? Look for sources of bias:
• Randomisation
• Blinding
• Allocation concealment
• Intention to treat
• Follow-up
• Conflicts of interest

2. What are the results?
• Size of the effect
• Relative vs absolute numbers
• Confidence intervals around the effect seen

3. Will they help me care for my patient?
• Similar population
• Outcomes important and all considered
• Patient preferences and values
Are the chosen outcomes valid?
• A biomarker is a characteristic that is objectively
measured and evaluated as an indicator of normal biological
processes, pathogenic processes, or pharmacologic
responses to a therapeutic intervention.
• A clinical endpoint is a characteristic or variable that
reflects how a patient feels, functions, or survives.
• A surrogate endpoint is a biomarker that is intended to
substitute for a clinical endpoint:
• Cholesterol down = risk down = mortality down?
• Skin thickness = skin condition?
• Exercise tolerance = risk of MI?
• Patient-oriented vs research-oriented outcomes
Hierarchy of Evidence
Randomised Controlled Trial
Experimental; demonstrates causation; statistics: RR, OR, ARR, NNT

[Diagram: population → selection & screening → sample → random allocation to treatment arm and control arm → outcomes → analysis. Prospective: recruitment precedes data collection, which runs forward in time.]

Key biases to identify/avoid: recruitment, sampling, allocation, treatment, attrition


Randomised Controlled Trial

• Explanatory / Efficacy (see also Placebo)


• Pragmatic / Effectiveness (‘comparative’)
• Equivalence / Non-inferiority
• Parallel / Crossover
• Quasi-Experimental Design
• Studying the effects of someone else’s intervention
Cohort Study
Observational; longitudinal; demonstrates association + time; statistic: RR

[Diagram: identify study subjects → classify treatment/exposure status → treatment/exposure group and control group each followed to good or bad outcomes. Prospective cohort: recruitment, then data collection over time. Retrospective cohort: data collection precedes recruitment.]

Key biases to identify/avoid: selection (of controls), recall (if retrospective), information
Case Control Study
Observational; demonstrates association + time; statistic: OR

[Diagram: from the study population, cases and controls are recruited, then each group’s prior treatment or exposure (A or B) is established. Retrospective: data collection looks back in time from recruitment.]

Key biases to identify/avoid: ascertainment, recall, information, selection
Case Control Study

• Matched
• Each case is matched for some relevant characteristics (eg age, gender,
ethnicity, comorbidities), to minimise confounding
• Nested
• Where the study population is drawn from within an arm of a cohort study
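The odds ratio a case-control study reports comes straight from the 2x2 table of exposure among cases and controls. A minimal sketch, with invented counts (all numbers hypothetical):

```python
from math import exp, log, sqrt

def odds_ratio(a, b, c, d):
    """OR = ad/bc, with a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI for the OR via the log-odds standard error."""
    se = sqrt(1/a + 1/b + 1/c + 1/d)
    log_or = log(odds_ratio(a, b, c, d))
    return exp(log_or - z * se), exp(log_or + z * se)

# Invented counts: 40 exposed cases, 20 exposed controls,
# 60 unexposed cases, 80 unexposed controls
print(odds_ratio(40, 20, 60, 80))      # cases were more often exposed (OR > 1)
print(odds_ratio_ci(40, 20, 60, 80))   # a CI excluding 1 suggests a real association
```

As elsewhere in appraisal, the point estimate matters less than whether the confidence interval around it crosses 1 (no association).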
Cross-Sectional Study
Observational; demonstrates prevalence; statistic: OR

[Diagram: a survey sample is drawn from the study population; each subject is classified, at a single point in time, by treatment/exposure (A or B) and by good or bad outcome.]

See also: Ecologic, Migrant, Time, Environmental Studies

Key biases to identify/avoid: non-response, self-selection, location/time/seasonal


Case Report / Case Series
Observational; no comparative statistic

[Diagram: patient/case data and “what is known” → analysis → what this adds/confirms. Retrospective: data collection precedes identification of the case.]
Systematic Reviews
Desk research; statistics as per the included studies

[Diagram: preliminary search and literature review → protocol/proposal → introduction & background → full literature search → screening & appraisal → data extraction & analysis → meta-analysis → discussion & conclusion.]

Key biases to identify/avoid: author, publication, outcome selection, data extraction; also check for heterogeneity
Choose/Identify Design
• A study that aims to establish the normal height of 4yr old children by
measuring height at school entry → Cross-sectional

• A study that compares a group of children whose heights are below the
tenth centile with a group of matched controls of normal height, aiming
to identify possible causative factors → Case-control

• A study comparing 2 groups of 4yr olds with similar characteristics: one
group is given a drug and the other a placebo, and the growth of each
group is measured → Controlled Trial

• A study that compares the height of a group of 4yr olds living near a
nuclear plant with the height of a group of 4yr olds who live elsewhere
→ Cohort

• A study that looks at all children born at one hospital in one year and
measures their height at intervals up to four years of age → Longitudinal
Bias, Confounding, Limitations
Assessing the risk of bias in studies
What makes a study questionable?
• Confounding Factors
– Underlying factors that can affect the results, which
are outside the control of the research team. For
example, age, comorbidities, etc.

• Systematic Bias
– Mistakes made by the investigators (not necessarily
intentional) that result in a false conclusion
Selection bias at Recruitment - is it a representative sample?

Occurs when the method of recruiting participants creates a
non-representative group for study.
• Do the participants selected represent the population of interest?
• Have they been selected randomly or sequentially?
• Are the exclusion criteria relevant to the study?
Selection bias at Allocation - are the groups well-balanced?

• In general you can tell from the description of the randomisation
process whether or not it was done correctly. Successful
randomisation results in well-balanced groups.
• It can happen that groups are not balanced in spite of
randomisation.
• The issue is not whether these differences are statistically
significant, but whether you feel they are large enough to affect
the conclusion being drawn.
Concealed allocation

A process of masking, or hiding, which participant has been
allocated to which arm of the trial.

Single blinding - the participant doesn’t know, at the point when
the decision was made, which arm of the trial s/he has been
allocated to.

Double blinding - neither the trial administrator (i.e. the GP or
clinician administering the treatment) nor the participant knows
which arm of the trial the participant is being allocated to.
Performance bias during
Maintenance
Occurs when what was measured happened
because of the study itself rather than the
intervention.
– Ask whether participants have been treated the
same way throughout the trial (apart from the
intervention)
– Have they been exposed to the same influences?
Observer bias during
Measurement – blinding
Occurs when those involved with the study* allow
their knowledge of the study to affect the way
observations are scored or recorded.

*This can mean: researchers, healthcare staff, or participants


Attrition bias during Measurement

Occurs when there are important differences between the
numbers of participants lost to follow-up in the comparison
groups (loss of more than 15-20% is often taken to flaw a study).

Intention to treat analysis: data on all participants are analysed
with respect to the groups to which they were initially randomised.

Per protocol analysis: data on all participants are analysed with
respect to the treatment they actually received.
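The two analyses can give different answers in practice. A minimal sketch on invented patient records (all data hypothetical), showing how crossover between arms makes an intention-to-treat result differ from a per-protocol one:

```python
# Hypothetical trial records: the arm each patient was randomised to,
# the treatment actually received, and the outcome (1 = event occurred)
patients = [
    {"randomised": "drug",    "received": "drug",    "event": 0},
    {"randomised": "drug",    "received": "placebo", "event": 1},  # crossed over
    {"randomised": "drug",    "received": "drug",    "event": 0},
    {"randomised": "placebo", "received": "placebo", "event": 1},
    {"randomised": "placebo", "received": "drug",    "event": 0},  # crossed over
    {"randomised": "placebo", "received": "placebo", "event": 1},
]

def event_rate(records, group_key, group):
    """Event rate among records whose group_key field equals group."""
    grp = [p for p in records if p[group_key] == group]
    return sum(p["event"] for p in grp) / len(grp)

# Intention to treat: analyse by the arm patients were randomised to
itt = event_rate(patients, "randomised", "drug")
# Per protocol: analyse by the treatment patients actually received
pp = event_rate(patients, "received", "drug")
print(f"ITT drug event rate: {itt:.2f}, per-protocol: {pp:.2f}")
```

Here the per-protocol figure flatters the drug, because the crossovers are not analysed in their randomised groups; intention to treat preserves the balance that randomisation created.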
Blinding during Measurement
Similar to concealed allocation but during the treatment and
analysis elements of the trial:

– Single blinding - the participant doesn’t know which arm of
the trial s/he has been allocated to.

– Double blinding - the trial clinician AND the participant are
unaware of which arm of the trial the patient is in.

– Triple blinding - the statistician analysing the results does
not know which arm is being analysed.
BIAS
• Bias, in epidemiology, is an error in design or execution of a study, which
produces results that are consistently distorted in one direction because of
non-random factors.
• Bias can occur in randomized controlled trials but tends to be a much
greater problem in observational studies.
• Not ‘prejudice’, more ‘skewing’
Selection/Sampling Bias
The survey sample does not accurately represent the population

• Undercoverage
• convenience sampling (clinic, registered lists) misses the inconvenient!
• Non-response
• respondents differ from non-respondents, higher health literacy, social visibility
• Hospital Admission rate
• patients more likely to have other conditions
Selection/Sampling Bias
The survey sample does not accurately represent the population

• Exclusion
• criteria used to reduce confounding not applied equally to control group
• Publicity / Awareness
• The Jade Goody, Michael J Fox, Terry Pratchett effect – rise in ‘incidence’ or
reporting due to greater visibility of condition
• Voluntary Response /Self-selection
• favours those with strong opinions, more informed
Information Biases
Accuracy of information about the exposure differs between cases and controls

• Interviewing
• Interviewers more thorough with cases than controls
• Abstracting
• Same as interviewing, but done when retrieving case data from records – limited
data for controls
• Ascertainment/Surveillance
• Caused by patient or clinician knowing which group they are in, by more intense
investigation in cases compared to controls
Information Biases
Accuracy of information about the exposure differs between cases and controls
• Recall
• past exposures and events better recalled than non-events; memory fallibility,
especially in consumption of alcohol, caffeine, tobacco...
• Reporting
• When a case emphasises exposure/symptoms that they believe to be important
• Treatment bias
• Difference in any element of treatment between cases & controls – sham surgery,
transport etc
Fixing Biases
• Bias is especially important in observational studies, as they lack the
random sampling and random allocation of RCTs, leading to greater
potential for error.
• Increasing sample size cannot compensate for selection/survey biases
• Blinding helps with interviewing, reporting and treatment biases,
cannot ‘fix’ sampling biases
Designing-out confounders
• Restriction
• Inclusion/exclusion criteria to prevent confounders entering
sample population
• Matching
• Allocating subjects with confounding factors equally across
both/all arms of the study
• Randomisation
• Appropriate randomisation should distribute those with
confounding factors (known AND unknown) among the study
groups (cluster randomisation might concentrate them!)
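The claim that randomisation distributes confounders across the arms can be simulated. A minimal sketch with invented subjects (a hypothetical confounding factor carried by about 40% of the population):

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Invented subjects: about 40% carry a confounding factor (e.g. a comorbidity)
subjects = [{"confounder": random.random() < 0.4} for _ in range(1000)]

# Simple randomisation: each subject allocated to an arm by coin flip
arms = {"treatment": [], "control": []}
for s in subjects:
    arms[random.choice(["treatment", "control"])].append(s)

for name, grp in arms.items():
    share = sum(s["confounder"] for s in grp) / len(grp)
    print(f"{name}: n={len(grp)}, confounder share = {share:.2f}")
```

Both arms end up with roughly the 40% background rate, and the same happens for confounders nobody has measured; cluster randomisation weakens this guarantee because whole clusters carry their confounders with them.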
It’s all about evidence

• Not about you recalculating statistics


• Not about you accessing raw research data
• Find the evidence that potential errors have been considered and
managed
• Achieving the planned sample size, adequate power, and sound P & CI
values indicates that random error has been managed.
• BUT statistical validity cannot compensate for systematic error (bias)
in a trial. If the bias is large, the p-value is irrelevant, and increasing
the sample size cannot correct for bias.
SAFE Study - albumin/saline
2x2 outcome table (treatment row: cells a, b; control row: cells c, d;
columns: outcome YES / NO):

                Dead    Not Dead   Total
Albumin          726      2747      3473
Saline           729      2731      3460

SAFE Study: http://www.nejm.org/doi/pdf/10.1056/NEJMoa040232


NEJM Correspondence:
http://www.nejm.org/doi/full/10.1056/NEJM200410283511818
Journal Club Commentaries
http://www.biomedcentral.com/content/pdf/cc3006.pdf
http://www.biomedcentral.com/content/pdf/cc8940.pdf
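The SAFE figures above (deaths 726/3473 with albumin vs 729/3460 with saline) can be turned into the risk statistics this session keeps referring to (RR, ARR, NNT, 95% CI). A minimal sketch using the standard formulas:

```python
from math import sqrt

def risk_stats(events_tx, n_tx, events_ctl, n_ctl, z=1.96):
    """RR, ARR, NNT and an approximate 95% CI for the ARR from a 2x2 table."""
    eer = events_tx / n_tx       # experimental event rate
    cer = events_ctl / n_ctl     # control event rate
    arr = cer - eer              # absolute risk reduction
    rr = eer / cer               # relative risk
    nnt = 1 / arr                # number needed to treat (huge when ARR is tiny)
    se = sqrt(eer * (1 - eer) / n_tx + cer * (1 - cer) / n_ctl)
    return rr, arr, nnt, (arr - z * se, arr + z * se)

# SAFE study deaths: 726/3473 (albumin) vs 729/3460 (saline)
rr, arr, nnt, (lo, hi) = risk_stats(726, 3473, 729, 3460)
print(f"RR = {rr:.3f}, ARR = {arr:.4f}, NNT = {nnt:.0f}")
print(f"95% CI for ARR: {lo:.4f} to {hi:.4f}")  # spans zero
```

The relative risk sits almost exactly at 1 and the CI for the absolute difference spans zero, so the mortality difference is entirely compatible with chance, which matches the trial’s published conclusion of no difference between the fluids.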
Appraising systematic reviews
Useful checklists

There are several checklists to support appraisal of systematic reviews:


• CASP www.casp-uk.net/
• CEBM www.cebm.net/
• SIGN www.sign.ac.uk/
