
Critical Appraisal

Mark Kerr
Clinical Librarian, EKHUFT
Learning objectives
By the end of this session you will be able to:
• critically appraise research papers
• identify internal and external validity and consider usefulness
• identify different study designs and know which type of
research question each relates to
• understand and identify sources of bias, confounding and other
weaknesses which may affect the validity or applicability of a
study's results
• interpret the key statistics in a research paper
• consider teaching critical appraisal with greater confidence
Critical appraisal definitions

“Critical appraisal is the process of carefully and systematically
examining research to judge its trustworthiness, and its value
and relevance in a particular context” Burls, Amanda (2009)

“EBP requires that decisions about health and social care are
based on the best available, current, valid and relevant
evidence. These decisions should be made by those receiving
care, informed by the tacit and explicit knowledge of those
providing care, within the context of the available resources”
Dawes et al. 2005 p.7
What is critical appraisal?
Critical > Not “criticise” but “critique”
CA is not just:
 Negative dismissal
 Assessment of results
 Statistical analysis
 Task for experts
- but it may include all of these as part of the overall process

CA is:
 Balanced assessment
 Test of processes
 Look at all aspects
 Task for all ‘users’
- a considered look at the evidence, using judgment as well as set criteria
Critical appraisal as part of
Evidence-Based Practice
5 steps of EBP

1. Formulate an answerable question (in response to an identified
information need)
2. Find evidence from research
3. Appraise for validity and usefulness (critical appraisal)
4. Implement change
5. Evaluate performance
Essential questions for a paper –
but is it appraisal?
• Who was the research carried out by
(organisation/individuals)? Seniority, reputation, experience?
• Which journal was the research published by? Impact factor?
Reputation?
• When was the research carried out and when was it
published?
• Where is the research based? (setting)
In a nutshell
1. PICO – if relevant, then read it
2. Difference due to chance? Due to bias?
• If not then likely to be due to intervention
3. Is difference big enough to be clinically useful?
• And is it ‘real’ (patient-oriented) or surrogate
outcome?
4. Does it change/inform knowledge or practice?
• Can I apply this to my patient?

1. SCREEN   2. APPRAISE   3. ASSESS   4. APPLY
IRRELEVANCE: PICO
BIAS: RAMMBO
CHANCE: Sample size, P value, 95% CI
FUTILITY: Effect size, clinical significance
REPRODUCIBILITY: Similar studies, meta-analysis
ACCEPTABILITY: Social, regulatory & clinical approval
AFFORDABILITY: Cost-effectiveness studies
GENERALISABILITY: Are they sufficiently like us?
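The CHANCE check above rests on sample size, P value and 95% confidence interval. A minimal Python sketch (hypothetical data: the same 30% event rate observed at three sample sizes) of how the 95% CI for a proportion narrows as the sample grows:

```python
from math import sqrt

def proportion_ci(events, n, z=1.96):
    """Approximate (Wald) 95% confidence interval for a proportion."""
    p = events / n
    se = sqrt(p * (1 - p) / n)          # standard error of the proportion
    return p - z * se, p + z * se

# Hypothetical: the same 30% event rate observed at three sample sizes
for n in (50, 500, 5000):
    lo, hi = proportion_ci(int(n * 0.3), n)
    print(f"n={n:5d}  95% CI: {lo:.3f} to {hi:.3f}")
```

The interval width shrinks roughly with 1/√n: the same observed rate is far more precisely estimated in the larger trial, which is why sample size sits alongside the P value in the CHANCE check.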
What is critical appraisal?

• Was it the right question to ask?


• Did they use the right method?
• Did they measure the right outcome(s)?
• Did they achieve validity/significance?
• Was there bias, conflict of interest, fraud?!?
• Is the result clinically significant?
• Is it relevant to my clinical practice?
• Did I learn something new / useful?
• Does it change my clinical practice?
Quick Appraisal – GATE tool
GATE = Graphic Appraisal Tool for Epidemiological studies

Study Design


The Three Questions

1. Are the results valid? Look for sources of bias:
• Randomisation
• Blinding
• Allocation concealment
• Intention to treat
• Follow-up
• Conflicts of interest

2. What are the results?
• Size of the effect
• Relative vs absolute numbers
• Confidence intervals around the effect seen

3. Will they help me care for my patient?
• Similar population
• Outcomes important and all considered
• Patient preferences and values
Are the chosen outcomes valid?
• A biomarker is a characteristic that is objectively
measured and evaluated as an indicator of normal biological
processes, pathogenic processes, or pharmacologic
responses to a therapeutic intervention.
• A clinical endpoint is a characteristic or variable that
reflects how a patient feels, functions, or survives.
• A surrogate endpoint is a biomarker that is intended to
substitute for a clinical endpoint:
• Cholesterol down = risk down = mortality down?
• Skin thickness = skin condition?
• Exercise tolerance = risk of MI?
• Patient-oriented vs research-oriented outcomes
Hierarchy of Evidence
Randomised Controlled Trial
Experimental; demonstrates causation; statistics: RR, OR, ARR, NNT

[Diagram: population → selection & screening → sample → random allocation to treatment arm and control arm → outcomes → analysis. Prospective: recruitment precedes data collection, which runs forward in time.]

Key biases to identify/avoid: recruitment, sampling, allocation, treatment, attrition


Randomised Controlled Trial

• Explanatory / Efficacy (see also Placebo)


• Pragmatic / Effectiveness (‘comparative’)
• Equivalence / Non-inferiority
• Parallel / Crossover
• Quasi-Experimental Design
• Studying the effects of someone else’s intervention
Cohort Study
Observational; longitudinal; demonstrates association + time; statistic: RR

[Diagram: identify study subjects → classify treatment/exposure status → treatment/exposure group and control group each followed to good or bad outcomes. Prospective cohort: recruitment, then data collection over time. Retrospective cohort: data collection precedes recruitment.]

Key biases to identify/avoid: selection (of controls), recall (if retrospective), information
Case Control Study
Observational; demonstrates association + time; statistic: OR

[Diagram: from the study population, cases and controls are recruited, then each group’s prior treatment or exposure (A or B) is established. Retrospective: data collection looks back in time from recruitment.]

Key biases to identify/avoid: ascertainment, recall, information, selection
Case Control Study

• Matched
• Each case is matched for some relevant characteristics (eg age, gender,
ethnicity, comorbidities), to minimise confounding
• Nested
• Where the study population is drawn from within an arm of a cohort study
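The odds ratio a case-control study reports comes straight from the 2x2 table of exposure among cases and controls. A minimal sketch, with invented counts (all numbers hypothetical):

```python
from math import exp, log, sqrt

def odds_ratio(a, b, c, d):
    """OR = ad/bc, with a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI for the OR via the log-odds standard error."""
    se = sqrt(1/a + 1/b + 1/c + 1/d)
    log_or = log(odds_ratio(a, b, c, d))
    return exp(log_or - z * se), exp(log_or + z * se)

# Invented counts: 40 exposed cases, 20 exposed controls,
# 60 unexposed cases, 80 unexposed controls
print(odds_ratio(40, 20, 60, 80))      # cases were more often exposed (OR > 1)
print(odds_ratio_ci(40, 20, 60, 80))   # a CI excluding 1 suggests a real association
```

As elsewhere in appraisal, the point estimate matters less than whether the confidence interval around it crosses 1 (no association).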
Cross-Sectional Study
Observational; demonstrates prevalence; statistic: OR

[Diagram: a survey sample is drawn from the study population; each subject is classified, at a single point in time, by treatment/exposure (A or B) and by good or bad outcome.]

See also: Ecologic, Migrant, Time, Environmental Studies

Key biases to identify/avoid: non-response, self-selection, location/time/seasonal


Case Report / Case Series
Observational; no comparative statistic

[Diagram: patient/case data and “what is known” → analysis → what this adds/confirms. Retrospective: data collection precedes identification of the case.]
Systematic Reviews
Desk research; statistics as per the included studies

[Diagram: preliminary search and literature review → protocol/proposal → introduction & background → full literature search → screening & appraisal → data extraction & analysis → meta-analysis → discussion & conclusion.]

Key biases to identify/avoid: author, publication, outcome selection, data extraction; also check for heterogeneity
Choose/Identify Design
• A study that aims to establish the normal height of 4yr old children by
measuring height at school entry → Cross-sectional

• A study that compares a group of children whose heights are below the
tenth centile with a group of matched controls of normal height, aiming
to identify possible causative factors → Case-control

• A study comparing 2 groups of 4yr olds with similar characteristics: one
group is given a drug and the other a placebo, and the growth of each
group is measured → Controlled Trial

• A study that compares the height of a group of 4yr olds living near a
nuclear plant with the height of a group of 4yr olds who live elsewhere
→ Cohort

• A study that looks at all children born at one hospital in one year and
measures their height at intervals up to four years of age → Longitudinal
Bias, Confounding, Limitations
Assessing the risk of bias in studies
What makes a study questionable?
• Confounding Factors
– Underlying factors that can affect the results, which
are outside the control of the research team. For
example, age, comorbidities, etc.

• Systematic Bias
– Mistakes made by the investigators (not necessarily
intentional) that result in a false conclusion
Selection bias at Recruitment - is it a representative sample?

Occurs when the method of recruiting participants creates a
non-representative group for study.
• Do the participants selected represent the population of interest?
• Have they been selected randomly or sequentially?
• Are the exclusion criteria relevant to the study?
Selection bias at Allocation - are the groups well-balanced?

• In general you can tell from the description of the randomisation
process whether or not it was done correctly. Successful
randomisation results in well-balanced groups.
• It can happen that groups are not balanced in spite of
randomisation.
• The issue is not whether these differences are statistically
significant, but whether you feel they are large enough to affect
the conclusion being drawn.
Concealed allocation

A process of masking, or hiding, which participant has been
allocated to which arm of the trial.

Single blinding - the participant doesn’t know, at the point when
the decision was made, which arm of the trial s/he has been
allocated to.

Double blinding - neither the trial administrator (i.e. the GP or
clinician administering the treatment) nor the participant knows
which arm of the trial the participant is being allocated to.
Performance bias during
Maintenance
Occurs when what was measured happened
because of the study itself rather than the
intervention.
– Ask whether participants have been treated the
same way throughout the trial (apart from the
intervention)
– Have they been exposed to the same influences?
Observer bias during
Measurement – blinding
Occurs when those involved with the study* allow
their knowledge of the study to affect the way
observations are scored or recorded.

*This can mean: researchers, healthcare staff, or participants


Attrition bias during Measurement

Occurs when there are important differences between the
numbers of participants lost to follow-up in the comparison
groups (loss of more than 15-20% is often taken to flaw a study).

Intention to treat analysis: data on all participants are analysed
with respect to the groups to which they were initially randomised.

Per protocol analysis: data on all participants are analysed with
respect to the treatment they actually received.
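The two analyses can give different answers in practice. A minimal sketch on invented patient records (all data hypothetical), showing how crossover between arms makes an intention-to-treat result differ from a per-protocol one:

```python
# Hypothetical trial records: the arm each patient was randomised to,
# the treatment actually received, and the outcome (1 = event occurred)
patients = [
    {"randomised": "drug",    "received": "drug",    "event": 0},
    {"randomised": "drug",    "received": "placebo", "event": 1},  # crossed over
    {"randomised": "drug",    "received": "drug",    "event": 0},
    {"randomised": "placebo", "received": "placebo", "event": 1},
    {"randomised": "placebo", "received": "drug",    "event": 0},  # crossed over
    {"randomised": "placebo", "received": "placebo", "event": 1},
]

def event_rate(records, group_key, group):
    """Event rate among records whose group_key field equals group."""
    grp = [p for p in records if p[group_key] == group]
    return sum(p["event"] for p in grp) / len(grp)

# Intention to treat: analyse by the arm patients were randomised to
itt = event_rate(patients, "randomised", "drug")
# Per protocol: analyse by the treatment patients actually received
pp = event_rate(patients, "received", "drug")
print(f"ITT drug event rate: {itt:.2f}, per-protocol: {pp:.2f}")
```

Here the per-protocol figure flatters the drug, because the crossovers are not analysed in their randomised groups; intention to treat preserves the balance that randomisation created.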
Blinding during Measurement
Similar to concealed allocation but during the treatment and
analysis elements of the trial:

– Single blinding - the participant doesn’t know which arm of
the trial s/he has been allocated to.

– Double blinding - the trial clinician AND the participant are
unaware of which arm of the trial the patient is in.

– Triple blinding - the statistician analysing the results does
not know which arm is being analysed.
BIAS
• Bias, in epidemiology, is an error in design or execution of a study, which
produces results that are consistently distorted in one direction because of
non-random factors.
• Bias can occur in randomized controlled trials but tends to be a much
greater problem in observational studies.
• Not ‘prejudice’, more ‘skewing’
Selection/Sampling Bias
The survey sample does not accurately represent the population

• Undercoverage
• convenience sampling (clinic, registered lists) misses the inconvenient!
• Non-response
• respondents differ from non-respondents, higher health literacy, social visibility
• Hospital Admission rate
• patients more likely to have other conditions
Selection/Sampling Bias
The survey sample does not accurately represent the population

• Exclusion
• criteria used to reduce confounding not applied equally to control group
• Publicity / Awareness
• The Jade Goody, Michael J Fox, Terry Pratchett effect – rise in ‘incidence’ or
reporting due to greater visibility of condition
• Voluntary Response /Self-selection
• favours those with strong opinions, more informed
Information Biases
Accuracy of information about the exposure differs between cases and controls

• Interviewing
• Interviewers more thorough with cases than controls
• Abstracting
• Same as interviewing, but done when retrieving case data from records – limited
data for controls
• Ascertainment/Surveillance
• Caused by patient or clinician knowing which group they are in, by more intense
investigation in cases compared to controls
Information Biases
Accuracy of information about the exposure differs between cases and controls
• Recall
• past exposures and events better recalled than non-events; memory fallibility,
especially in consumption of alcohol, caffeine, tobacco...
• Reporting
• When a case emphasises exposure/symptoms that they believe to be important
• Treatment bias
• Difference in any element of treatment between cases & controls – sham surgery,
transport etc
Fixing Biases
• Bias is especially important in observational studies, as they lack the
random sampling and random allocation of RCTs, leading to greater
potential for error.
• Increasing sample size cannot compensate for selection/survey biases
• Blinding helps with interviewing, reporting and treatment biases,
cannot ‘fix’ sampling biases
Designing-out confounders
• Restriction
• Inclusion/exclusion criteria to prevent confounders entering
sample population
• Matching
• Allocating subjects with confounding factors equally across
both/all arms of the study
• Randomisation
• Appropriate randomisation should distribute those with
confounding factors (known AND unknown) among the study
groups (cluster randomisation might concentrate them!)
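The claim that randomisation distributes confounders across the arms can be simulated. A minimal sketch with invented subjects (a hypothetical confounding factor carried by about 40% of the population):

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Invented subjects: about 40% carry a confounding factor (e.g. a comorbidity)
subjects = [{"confounder": random.random() < 0.4} for _ in range(1000)]

# Simple randomisation: each subject allocated to an arm by coin flip
arms = {"treatment": [], "control": []}
for s in subjects:
    arms[random.choice(["treatment", "control"])].append(s)

for name, grp in arms.items():
    share = sum(s["confounder"] for s in grp) / len(grp)
    print(f"{name}: n={len(grp)}, confounder share = {share:.2f}")
```

Both arms end up with roughly the 40% background rate, and the same happens for confounders nobody has measured; cluster randomisation weakens this guarantee because whole clusters carry their confounders with them.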
It’s all about evidence

• Not about you recalculating statistics


• Not about you accessing raw research data
• Find the evidence that potential errors have been considered and
managed
• Achieving the planned sample size, adequate power, and sound P & CI
values indicates that random error has been managed.
• BUT statistical validity cannot compensate for systematic error (bias)
in a trial. If the bias is large, the p-value is irrelevant, and increasing
the sample size cannot correct for bias.
SAFE Study - albumin/saline
2x2 outcome table (treatment row: cells a, b; control row: cells c, d;
columns: outcome YES / NO):

                Dead    Not Dead   Total
Albumin          726      2747      3473
Saline           729      2731      3460

SAFE Study: http://www.nejm.org/doi/pdf/10.1056/NEJMoa040232


NEJM Correspondence:
http://www.nejm.org/doi/full/10.1056/NEJM200410283511818
Journal Club Commentaries
http://www.biomedcentral.com/content/pdf/cc3006.pdf
http://www.biomedcentral.com/content/pdf/cc8940.pdf
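The SAFE figures above (deaths 726/3473 with albumin vs 729/3460 with saline) can be turned into the risk statistics this session keeps referring to (RR, ARR, NNT, 95% CI). A minimal sketch using the standard formulas:

```python
from math import sqrt

def risk_stats(events_tx, n_tx, events_ctl, n_ctl, z=1.96):
    """RR, ARR, NNT and an approximate 95% CI for the ARR from a 2x2 table."""
    eer = events_tx / n_tx       # experimental event rate
    cer = events_ctl / n_ctl     # control event rate
    arr = cer - eer              # absolute risk reduction
    rr = eer / cer               # relative risk
    nnt = 1 / arr                # number needed to treat (huge when ARR is tiny)
    se = sqrt(eer * (1 - eer) / n_tx + cer * (1 - cer) / n_ctl)
    return rr, arr, nnt, (arr - z * se, arr + z * se)

# SAFE study deaths: 726/3473 (albumin) vs 729/3460 (saline)
rr, arr, nnt, (lo, hi) = risk_stats(726, 3473, 729, 3460)
print(f"RR = {rr:.3f}, ARR = {arr:.4f}, NNT = {nnt:.0f}")
print(f"95% CI for ARR: {lo:.4f} to {hi:.4f}")  # spans zero
```

The relative risk sits almost exactly at 1 and the CI for the absolute difference spans zero, so the mortality difference is entirely compatible with chance, which matches the trial’s published conclusion of no difference between the fluids.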
Appraising systematic reviews
Useful checklists

There are several checklists to support appraisal of systematic reviews:


• CASP www.casp-uk.net/
• CEBM www.cebm.net/
• SIGN www.sign.ac.uk/
