
05.10.2011

Antivirals for Influenza:
Review of the Evidence from Observational Studies

Holger Schünemann, MD, PhD


Chair and Professor, Department of Clinical Epidemiology & Biostatistics
Professor of Medicine
Michael Gent Chair in Healthcare Research
McMaster University, Hamilton, Canada

Seoul, August 30, 2011


10:30 am – 12:00 pm


Collaborators
Jonathan Hsu*, Nancy Santesso*, Reem
Mustafa, Jan Brozek, Yao Long Chen, Jessica
Hopkins, Adrienne Cheung, Gayane
Hovhannisyan, Liudmila Ivanova, Signe
Agnes Flottorp, Ingvil von Mehren Sæterdal,
Arthur D. Wong, Jinhui Tian, Tim Uyeki, Elie
Akl, Pablo Alonso, Fiona Smaill

Disclosure
• Co-chair GRADE Working Group
• Work with various guideline groups using GRADE
• American College of Physicians (ACP) Clinical Practice
Guidelines Committee
• WHO: Expert Advisory Panel on Clinical Practice
Guidelines and Clinical Research Methods and Ethics &
chair of various guideline panels
• WHO funding
• No direct/personal for profit payments


Outline
• Observational studies and systematic reviews
• Quality of evidence and GRADE in
observational studies
• Feasibility of synthesizing information from
observational studies
• Reconciling inconsistent evidence from
randomized trials and observational studies

Background
• Little is known about the proper synthesis of
observational study evidence in systematic
reviews
• Special challenges:
– Assessing quality of a body of evidence from
observational studies
– Developing recommendations based on observational
research
• But necessary:
– Safety, public health, surgical specialties, policy
making


Determinants of quality
• RCTs start as high quality evidence
• Observational studies start as low quality evidence
• 5 factors that can lower quality
1. Limitations in detailed design and execution (risk of
bias criteria)
2. Inconsistency (or heterogeneity)
3. Indirectness (PICO and applicability)
4. Imprecision (number of events and confidence
intervals)
5. Publication bias
• 3 factors can increase quality
1. Large magnitude of effect
2. Plausible residual bias or confounding that would reduce a
demonstrated effect
3. Dose-response gradient

• Valid – yes!
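
The rating logic on this slide can be sketched in a few lines of Python. This is a simplified illustration, not part of GRADE itself: GRADE ratings are structured judgments, and the function name `grade_quality` and the idea of adding and subtracting whole levels are assumptions made here for clarity.

```python
# Illustrative sketch only: real GRADE ratings are judgments, not arithmetic.
LEVELS = ["Very low", "Low", "Moderate", "High"]


def grade_quality(randomized, downgrades=0, upgrades=0):
    """Return a quality label for one outcome.

    randomized: True for RCT evidence (starts High),
                False for observational evidence (starts Low).
    downgrades: levels removed across the five lowering factors.
    upgrades:   levels added across the three raising factors.
    """
    start = 3 if randomized else 1               # index into LEVELS
    index = start - downgrades + upgrades
    index = max(0, min(index, len(LEVELS) - 1))  # clamp to Very low..High
    return LEVELS[index]


print(grade_quality(randomized=False))               # Low
print(grade_quality(randomized=False, upgrades=1))   # Moderate
print(grade_quality(randomized=True, downgrades=2))  # Low
```

The second call shows how observational evidence with, for example, a large magnitude of effect can reach moderate quality.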

GRADE and observational studies


• Users of GRADE have expressed concern that
GRADE places greater confidence on the results
of randomized studies (RCTs)
– The concern centres on population or public health interventions,
environmental health, health policy making and often
surgery, where conducting RCTs is either challenging
or unethical
– Consequently, the best quality of evidence for these
questions will come from observational studies
• Some argue that it would be unreasonable to
grade such “best quality” evidence, typical of
most public health questions, as low


GRADE and observational studies


• Argument is not valid for several reasons:
– inability to obtain RCT data does not eliminate or
minimize the bias associated with observational data
– evidence from observational studies can be rated as
moderate and even high quality
within the GRADE framework – why is this forgotten?
– need to be able to compare confidence in estimates
of effect across healthcare questions

GRADE and WHO guidelines


• GRADE is the approach used for WHO guidelines
• New guideline on pharmacological
management of influenza
– Previously few randomized trials
• Criticized
• Low quality evidence for many outcomes (imprecision)
• Industry sponsored – publication bias
• Not all outcomes
• Review of observational studies
– To inform guidelines


Methods
• Standard systematic review
– MEDLINE, EMBASE, CENTRAL, CINAHL, SIGLE, the Chinese
Biomedical Literature Database, Panteleimon and LILACS
for relevant studies up to November 2010
– contacted pharmaceutical companies and international
agencies
– RevMan 5.1
• 10 PICO → recommendations approach
– Outcomes determined through Delphi process previously
• QoE according to GRADE approach
– GRADEpro (www.gradeworkinggroup.org)
– Risk of bias using the Newcastle-Ottawa Scale


Methods
Types of participants
• We included studies in all populations with influenza or influenza-like illness.
Types of intervention
• Oseltamivir, zanamivir, amantadine or rimantadine in any dose or by any route.
Type of outcome measures
• We determined a priori to report on the following outcomes because they were
judged to be important or critical for decision making:
• Mortality, Hospitalisation, ICU Admission, mechanical ventilation and respiratory
failure, Duration of hospitalization, Time to alleviation of symptoms, Time to
return to normal activity, Complications
• Critical adverse events (e.g. major psychotic disorders, encephalitis, stroke and
seizure)
• Important adverse events (e.g. pain in extremities, clonic twitching, body
weakness, dermatological changes such as urticaria and rash)
• Viral shedding and Resistance

Results - PRISMA
• Records identified through database searching (all study designs): total n = 12176
– EMBASE, MEDLINE = 9873
– SIGLE = 7
– CINAHL = 1062
– LILACS = 19
– COCHRANE = 301
– Chinese Biomedical Literature Database = 914
– Panteleimon = 12
• Additional records identified through other sources
– Pharmaceutical companies (n = 12)
– Reference lists of relevant papers (n = 15)
• Records after duplicates removed (n = 7456)
• Records screened (n = 7483)
• Records excluded (n = 6563)
• Full-text articles assessed for eligibility (n = 920)
• Full-text articles excluded (n = 825). Excluded for:
– Not influenza or influenza-like illness
– Fewer than 25 people
– Randomised controlled trial, or not an observational study
– Not antiviral agent
– Antiviral agents analysed together
– Prophylaxis
– No outcomes reported
• Studies awaiting assessment (n = 6)
– Studies awaiting translation (1)
– Papers that could not be obtained in full (5)
• Studies included (N = 89)
• Studies per question, in the order shown on the slide: 51 + 5 studies; 7 studies; 6 studies; 0 studies; 8; 0 studies; 16; 0 studies; 1 study; 2 studies
• Note: one study may be relevant to multiple questions
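
As a quick consistency check, the counts in the flow diagram can be chained together. The sketch below assumes one reading of the diagram: that the records screened are the de-duplicated database records plus the records from other sources. The variable names are made up for illustration.

```python
# Counts copied from the flow diagram; how they combine is an assumption.
after_duplicates = 7456       # database records after duplicates removed
from_companies = 12           # pharmaceutical companies
from_reference_lists = 15     # reference lists of relevant papers

screened = after_duplicates + from_companies + from_reference_lists
excluded_at_screening = 6563
full_text_assessed = screened - excluded_at_screening

full_text_excluded = 825
awaiting_assessment = 6
included = full_text_assessed - full_text_excluded - awaiting_assessment

print(screened, full_text_assessed, included)  # 7483 920 89
```

Under that reading the screened, full-text and included counts all match the figures reported on the slide.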


Should oseltamivir versus no treatment be used to treat influenza?

Results
Mortality (adjusted)

Study or Subgroup     log[Odds Ratio]   SE           Oseltamivir Total   No treatment Total   Weight   Odds Ratio IV, Random, 95% CI
Hanshaoworakul 2009   -2.040221         0.58739416   315                 130                  28.5%    0.13 [0.04, 0.41]
Liem 2009 (1)         -0.941609         0.75113239   55                  12                   17.5%    0.39 [0.09, 1.70]
McGeer 2009 (2)       -1.309333         0.4270348    69                  100                  54.0%    0.27 [0.12, 0.62]
Total (95% CI)                                       439                 242                  100.0%   0.23 [0.13, 0.43]

Heterogeneity: Tau² = 0.00; Chi² = 1.58, df = 2 (P = 0.45); I² = 0%
Test for overall effect: Z = 4.63 (P < 0.00001)
Forest plot axis: odds ratio from 0.1 to 10; values below 1 favour oseltamivir, values above 1 favour no treatment

(1) Adjusted for neutropenia and hospital admission
(2) Does not specify what was adjusted for
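
The pooled adjusted estimate can be approximated from the log odds ratios and standard errors on the slide. The sketch below uses generic inverse-variance pooling with a DerSimonian-Laird between-study variance. It is an assumption that this matches what the review software did, and the function name `pool_random_effects` is made up here.

```python
import math

# (study, log odds ratio, standard error) as listed on the slide
studies = [
    ("Hanshaoworakul 2009", -2.040221, 0.58739416),
    ("Liem 2009", -0.941609, 0.75113239),
    ("McGeer 2009", -1.309333, 0.4270348),
]


def pool_random_effects(data):
    """Inverse-variance pooling with a DerSimonian-Laird tau^2 estimate."""
    y = [log_or for _, log_or, _ in data]
    v = [se ** 2 for _, _, se in data]
    w = [1 / vi for vi in v]                       # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                  # between-study variance
    w_re = [1 / (vi + tau2) for vi in v]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se, tau2, q


pooled, se, tau2, q = pool_random_effects(studies)
odds_ratio = math.exp(pooled)
ci_low = math.exp(pooled - 1.96 * se)
ci_high = math.exp(pooled + 1.96 * se)
print(f"OR {odds_ratio:.2f} [{ci_low:.2f}, {ci_high:.2f}]")  # OR 0.23 [0.13, 0.43]
```

Because Q is smaller than its degrees of freedom, the estimated tau² is zero and the random-effects result coincides with the fixed-effect one, consistent with the I² = 0% reported above.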

Mortality (unadjusted)

Study or Subgroup   Oseltamivir Events/Total   No treatment Events/Total   Weight   Odds Ratio M-H, Random, 95% CI
Chemaly 2007        0/25                       3/8                         5.2%     0.03 [0.00, 0.69]
Estenssoro 2010     150/328                    5/8                         13.4%    0.51 [0.12, 2.15]
Hien 2009           5/25                       2/4                         8.6%     0.25 [0.03, 2.24]
Huang 2009          2/17                       1/57                        7.3%     7.47 [0.63, 88.02]
Li 2010             0/118                      0/27                                 Not estimable
Liem 2009           18/55                      8/12                        14.4%    0.24 [0.06, 0.92]
McGeer 2009         8/68                       34/100                      18.9%    0.26 [0.11, 0.60]
Siston 2010 (1)     21/476                     5/74                        17.3%    0.64 [0.23, 1.74]
Xi 2009             24/125                     3/30                        14.9%    2.14 [0.60, 7.64]
Total (95% CI)      228/1237                   61/320                      100.0%   0.51 [0.23, 1.14]

Heterogeneity: Tau² = 0.70; Chi² = 16.76, df = 7 (P = 0.02); I² = 58%
Test for overall effect: Z = 1.64 (P = 0.10)
Forest plot axis: odds ratio from 0.001 to 1000; values below 1 favour oseltamivir, values above 1 favour no treatment

(1) Pregnant women
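
Each unadjusted study row can be reproduced from its 2×2 counts. The sketch below computes a crude odds ratio with a Woolf (log-scale) confidence interval. Adding 0.5 to every cell when one cell is zero is a common convention and is assumed here, not confirmed by the slide. The function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(events_trt, total_trt, events_ctl, total_ctl, z=1.96):
    """Crude odds ratio and Woolf 95% CI from event counts and group totals."""
    if events_trt == 0 and events_ctl == 0:
        raise ValueError("no events in either group: odds ratio not estimable")
    a, b = events_trt, total_trt - events_trt    # treated: events, non-events
    c, d = events_ctl, total_ctl - events_ctl    # control: events, non-events
    if 0 in (a, b, c, d):
        # Assumed continuity correction for a single empty cell.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


# McGeer 2009: 8/68 deaths with oseltamivir, 34/100 with no treatment
estimate, low, high = odds_ratio_ci(8, 68, 34, 100)
print(f"{estimate:.2f} [{low:.2f}, {high:.2f}]")  # 0.26 [0.11, 0.60]
```

A study with no events in either group, such as Li 2010, raises an error, which mirrors the "Not estimable" entry in the table.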

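Results
Question: Should oseltamivir vs. no antiviral treatment be used for influenza (follow-up: 30 days)?
GRADE evidence profile: quality assessment and summary of findings, one entry per body of evidence

Mortality
• 681 participants (3 studies)
– Risk of bias: no serious; Inconsistency: no serious; Indirectness: no serious; Imprecision: no serious; Publication bias: undetected [1]
– Overall quality of evidence: ⊕⊕⊝⊝ LOW [1]
– Study event rates: 59/242 (24.4%) with no antiviral treatment; 31/439 (7.1%) with oseltamivir
– Relative effect: adj OR 0.23 (95% CI 0.13 to 0.43)
– Anticipated absolute effects: risk with no antiviral treatment 240 deaths per 1000; with oseltamivir 172 fewer deaths per 1000 (from 120 to 201 fewer)
• 1557 participants (9 studies)
– Risk of bias: serious [2]; Inconsistency: no serious; Indirectness: no serious; Imprecision: no serious; Publication bias: undetected [1]
– Overall quality of evidence: ⊕⊝⊝⊝ VERY LOW [1,2] due to risk of bias
– Study event rates: 61/320 (19.1%) with no antiviral treatment; 228/1237 (18.4%) with oseltamivir
– Relative effect: OR 0.51 (95% CI 0.23 to 1.14) [3]
– Anticipated absolute effects: risk with no antiviral treatment 240 deaths per 1000; with oseltamivir 101 fewer deaths per 1000 (from 172 fewer to 25 more)

Hospitalisation
• 150710 participants (5 studies)
– Risk of bias: no serious; Inconsistency: no serious; Indirectness: no serious; Imprecision: no serious; Publication bias: undetected [4]
– Overall quality of evidence: ⊕⊕⊝⊝ LOW [4]
– Study event rates: 1238/100585 (1.2%) with no antiviral treatment; 431/50125 (0.86%) with oseltamivir
– Relative effect: adj OR 0.75 (95% CI 0.66 to 0.89)
– Anticipated absolute effects: risk with no antiviral treatment 12 hospitalisations per 1000; with oseltamivir 3 fewer hospitalisations per 1000 (from 1 to 4 fewer)
• 242762 participants (6 studies)
– Risk of bias: serious [2]; Inconsistency: no serious; Indirectness: no serious; Imprecision: no serious; Publication bias: undetected [4]
– Overall quality of evidence: ⊕⊝⊝⊝ VERY LOW [2,4] due to risk of bias
– Study event rates: 1738/146410 (1.2%) with no antiviral treatment; 1086/96352 (1.1%) with oseltamivir
– Relative effect: OR 0.75 (95% CI 0.66 to 0.86)
– Anticipated absolute effects: risk with no antiviral treatment 12 hospitalisations per 1000; with oseltamivir 3 fewer hospitalisations per 1000 (from 2 to 4 fewer)

ICU admissions/mechanical ventilation/respiratory failure
• 1032 participants (6 studies [5])
– Risk of bias: serious [5]; Inconsistency: serious [6]; Indirectness: no serious; Imprecision: no serious; Publication bias: undetected [1]
– Overall quality of evidence: ⊕⊝⊝⊝ VERY LOW [1,6] due to risk of bias, inconsistency
– Study event rates: - with no antiviral treatment; 200/1032 (19.4%) with oseltamivir
– Relative effect: -
– Anticipated absolute effects: -

Complications - Pneumonia
• 150466 participants (3 studies)
– Risk of bias: no serious; Inconsistency: serious [6]; Indirectness: no serious; Imprecision: no serious; Publication bias: undetected [4]
– Overall quality of evidence: ⊕⊝⊝⊝ VERY LOW [4,6] due to inconsistency
– Study event rates: 2111/100449 (2.1%) with no antiviral treatment; 647/50017 (1.3%) with oseltamivir
– Relative effect: adj OR 0.83 (95% CI 0.59 to 1.16)
– Anticipated absolute effects: risk with no antiviral treatment 21 pneumonias per 1000; with oseltamivir 4 fewer pneumonias per 1000 (from 9 fewer to 3 more)
• 265276 participants (6 studies)
– Risk of bias: serious [2]; Inconsistency: serious [6]; Indirectness: no serious; Imprecision: no serious; Publication bias: undetected [4]
– Overall quality of evidence: ⊕⊝⊝⊝ VERY LOW [2,4,6] due to risk of bias, inconsistency
– Study event rates: 3244/166256 (2%) with no antiviral treatment; 1273/99020 (1.3%) with oseltamivir
– Relative effect: OR 0.64 (95% CI 0.46 to 0.88)
– Anticipated absolute effects: risk with no antiviral treatment 20 pneumonias per 1000; with oseltamivir 7 fewer pneumonias per 1000 (from 2 to 10 fewer)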
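
The anticipated absolute effects in the profile follow from applying the odds ratio to the baseline risk. A minimal sketch of that conversion is shown below. The function name `absolute_effect_per_1000` is made up here, and the standard odds-to-risk conversion is assumed to be how the slide's figures were derived.

```python
def absolute_effect_per_1000(baseline_risk, odds_ratio):
    """Risk difference per 1000 implied by an odds ratio (negative = fewer events)."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    treated_odds = baseline_odds * odds_ratio
    treated_risk = treated_odds / (1 + treated_odds)
    return 1000 * (treated_risk - baseline_risk)


baseline = 0.240  # 240 deaths per 1000 with no antiviral treatment
for label, or_value in [("estimate", 0.23), ("lower CI", 0.13), ("upper CI", 0.43)]:
    print(label, round(absolute_effect_per_1000(baseline, or_value)))
# estimate -172
# lower CI -201
# upper CI -120
```

The output matches the adjusted mortality entry: 172 fewer deaths per 1000 (from 120 to 201 fewer).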


Results
• We successfully used the GRADE approach to assess the
quality of evidence for observational studies
• GRADE evidence profiles for most PICO questions to inform
the WHO essential medicine list and the WHO committee
that prepares guidelines for the pharmacological
management of influenza
• WHO guideline panel to use the evidence profiles for
decision making
• Very low to low quality evidence for four major
pharmacological interventions
• However, this evidence is of equal or higher quality compared
to that of RCTs for some of the interventions and outcomes
• Evidence for harms!

Results/Discussion
• Large Team – work completed in 6 months
– 10 PICOs
– Complete evidence profiles
• Challenges with using existing risk of bias tool
• Indirect comparison and internally controlled studies
– Should we use these studies?
• Publication bias?
• Upgrade mortality: adj OR 0.23 (0.13 to 0.43)
– 3 studies, fairly narrow CI
• Easy to use information for guideline panels


Discussion
• The quality of evidence from observational studies may
be equivalent to that of RCTs
• Do we need rating of quality within categories?
– Within low quality
– Probably under certain circumstances
• What if the body of evidence from RCTs and observational studies is
inconsistent but of perceived similar quality?
– Further downgrade for inconsistency? E.g. adverse events
• RCT odds ratio: 1.79 (1.10 to 2.93) – nausea
• Obs. adj rate ratio: 0.76 (0.7 to 0.81) – low quality
– Further upgrade for consistency? E.g. Rx complications:
• RCT odds ratio: 0.55 (0.22 to 1.35)
• Obs. adj OR 0.58 (0.31 to 1.1); OR 0.45 (0.25 to 0.81)


Agenda
• Introduction to GRADE
– Presentation (20 min)
• Small group work
– Evidence profile (5 min)
– PICO (5 min)
– Complete domain and subdomain for each
question (20 min)
– Formulate recommendation (5 min)
– Feedback (10 min)

GRADE Working Group meeting


Date & Time: August 30, 2011, 13:00 (1:00 pm) to 17:00 (5:00 pm)
Location: Room 6

www.gradeworkinggroup.org


Content
GRADE – 20 minute overview
• Quality of evidence
• Going from evidence to recommendations

GRADE Uptake
• World Health Organization
• Allergic Rhinitis and its Impact on Asthma (ARIA) guidelines
• American Thoracic Society
• American College of Physicians (ACP)
• Canadian Task Force on Preventive Health Care
• European Respiratory Society
• European Society of Thoracic Surgeons
• British Medical Journal
• Infectious Diseases Society of America
• UpToDate®
• National Institute for Health and Clinical Excellence (NICE)
• Scottish Intercollegiate Guidelines Network (SIGN)
• Cochrane Collaboration
• Clinical Evidence
• Agency for Healthcare Research and Quality (AHRQ)
• Partner of GIN
• Over 60 major organizations (over 250 members)


Guideline development process (for WHO)
[Figure: from healthcare problem to recommendation, annotated with “Healthy people”, “Herd immunity”, “Long term perspective”, “Few RCTs” and “Lots of other things”]


Hierarchy of evidence based on quality
STUDY DESIGN / BIAS
• Randomized Controlled Trials
• Cohort Studies and Case Control Studies
• Case Reports and Case Series, Non-systematic observations
• Expert Opinion

Relative risk reduction: > 99.9% (1/100,000)
U.S. Parachute Association reported 821 injuries and 18 deaths out of 2.2 million jumps in 2007
BMJ, 2003
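
The figures on this slide can be checked with simple arithmetic. The sketch below computes the death risk per jump from the reported counts, then asks how high the risk without a parachute would need to be for the relative risk reduction to exceed 99.9%. The slide gives no comparator risk, so that threshold is derived here, not reported.

```python
deaths, jumps = 18, 2_200_000

risk_with_parachute = deaths / jumps
per_100k = risk_with_parachute * 100_000
print(round(per_100k, 2))  # 0.82, i.e. roughly 1 death per 100,000 jumps

# RRR = 1 - risk_with / risk_without, so RRR > 0.999 requires
# risk_without > risk_with / (1 - 0.999).
target_rrr = 0.999
min_risk_without = risk_with_parachute / (1 - target_rrr)
print(f"{min_risk_without:.4f}")  # 0.0082
```

Any plausible risk of death when jumping without a parachute is far above that threshold, which is the point of the slide.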


Simple hierarchies are (too) simplistic
STUDY DESIGN / BIAS
• Randomized Controlled Trials
• Cohort Studies and Case Control Studies
• Case Reports and Case Series, Non-systematic observations
[Figure: “Expert Opinion” appears as separate labels next to the list, not as a level within it]
Schünemann & Bone, 2003

GRADE: recommendations & quality of (a body of) evidence
Clear separation, but judgments required:

1) Recommendation: 2 grades – conditional (aka weak) or strong (for or against an action)?
– Balance of benefits and downsides, values and
preferences, resource use and quality of evidence

2) 4 categories of quality of evidence: ⊕⊕⊕⊕ (High), ⊕⊕⊕⊝ (Moderate), ⊕⊕⊝⊝ (Low), ⊕⊝⊝⊝ (Very low)?
– methodological quality of evidence
– likelihood of bias related to recommendation
– by outcome and across outcomes
*www.GradeWorking-Group.org


Meta-analyses of several critical and important outcomes (one PICO)
[Figure: forest plot of four outcomes on a relative risk axis from 0.5 to 1.5; left of 1 = better, right of 1 = worse]
• Outcome 1 (critical): High
• Myo. Infarct. (critical): Moderate, due to imprecision
• Outcome 3 (important): Low, due to imprecision and risk of bias
• Outcome 4 (critical): High
Overall quality: Low or Moderate? Moderate, based on critical outcomes
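
GRADE bases the overall quality across outcomes on the lowest quality among the critical outcomes. The sketch below applies that rule to the four ratings on this slide. The function name `overall_quality` and the tuple layout are assumptions made for illustration.

```python
ORDER = ["Very low", "Low", "Moderate", "High"]


def overall_quality(outcomes):
    """Lowest quality rating among outcomes marked 'critical'.

    outcomes: list of (importance, quality) tuples.
    """
    critical = [quality for importance, quality in outcomes
                if importance == "critical"]
    if not critical:
        raise ValueError("no critical outcomes to base the overall rating on")
    return min(critical, key=ORDER.index)


slide_outcomes = [
    ("critical", "High"),
    ("critical", "Moderate"),   # Myo. Infarct., rated down for imprecision
    ("important", "Low"),       # important only, so it does not drive the rating
    ("critical", "High"),
]
print(overall_quality(slide_outcomes))  # Moderate
```

The low rating of the outcome that is only important does not pull the overall rating down, which is why the slide arrives at Moderate.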

Meta-analyses of several critical outcomes (one PICO)
[Figure: forest plot of four outcomes on a relative risk axis from 0.5 to 1.5; left of 1 = better, right of 1 = worse. A marked threshold shows the acceptable harm for a strong recommendation based on sure benefit in mortality and stroke]
• Outcome 1: High
• Dis. Specific QoL: Moderate, due to imprecision
• Outcome 3: High
• Outcome 4: High
Overall quality: High


Meta-analyses of several critical outcomes (one PICO)
[Figure: forest plot of four outcomes on a relative risk axis from 0.5 to 1.5; left of 1 = better, right of 1 = worse. A marked threshold shows the acceptable harm for a strong recommendation based on sure benefit in mortality and stroke]
• Outcome 1: High
• Dis. Specific QoL: High
• Outcome 3: Moderate, due to risk of bias
• Outcome 4: High
Overall quality: Moderate

GRADE evidence profile


Content

• Quality of evidence
• Going from evidence to recommendations

Strength of
recommendation
“The strength of a recommendation reflects
the extent to which we can, across the range
of patients for whom the recommendations
are intended, be confident that desirable
effects of a management strategy outweigh
undesirable effects.”
• Strong or conditional/weak


Determinants of the strength of recommendation
Factors that can strengthen a recommendation, each with a comment:
• Quality of the evidence: The higher the quality of evidence, the more likely is a strong recommendation.
• Balance between desirable and undesirable effects: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
• Values and preferences: The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
• Costs (resource allocation): The higher the costs of an intervention – that is, the more resources consumed – the less likely a strong recommendation is warranted.

Determinants of the strength of recommendation
Table. Decisions about the strength of a recommendation
Factors that can weaken the strength of a recommendation (columns: Decision, Explanation). Example:
• Lower quality evidence: □ Yes □ No
• Uncertainty about the balance of benefits versus harms and burdens: □ Yes □ No
• Uncertainty or differences in values: □ Yes □ No
• Uncertainty about whether the net benefits are worth the costs: □ Yes □ No
Frequent “yes” answers will increase the likelihood of a weak recommendation


Recommendation: In patients with HIV and drug resistant TB requiring second line drugs, the expert panel recommends/suggests to (not)
administer ART (? recommendation, ? quality evidence).
Population: HIV positive individuals with drug resistant TB requiring second line drugs
Intervention: ART use during TB treatment vs ART non-use

Factor: High or moderate quality evidence (is there high quality evidence?)
– Rule: The higher the quality of evidence, the more likely is a strong recommendation.
– Decision: □ Yes □ No
– Explanation: There is limited evidence from published studies to evaluate ART use in HIV-TB coinfected patients receiving second line drugs for XDR-TB and MDR-TB. However, using IPD from longitudinal cohort studies, we found moderate quality evidence from observational studies that there …

Factor: Certainty about the balance of benefits versus harms and burdens (is there certainty?)
– Rule: The larger the difference between the desirable and undesirable consequences and the certainty around that difference, the more likely a strong recommendation. The smaller the net benefit and the lower the certainty for that benefit, the more likely is a conditional/weak recommendation.
– Decision: □ Yes □ No. Although there is some uncertainty about cure, there is a significant decrease in hazard ratio for death even after controlling for initial CD4 count.
– Explanation: Cure and survival appear to be more likely in drug resistant TB requiring second line drugs if ART is used during TB treatment.
o HR of 3.17 (1.46, 6.9) for cure and HR of 0.41 (0.26, 0.63) for death in ART vs. non ART group.
o No significant change in HR for cure [HR 2.93 (0.98, 8.69)], and decreased HR for death [HR 0.23 (0.12, 0.46)] if controlling for initial CD4 count.

Factor: Certainty or similarity in values (is there certainty?)
– Rule: The smaller the variability or uncertainty around values and preferences, the more likely is a strong recommendation.
– Decision: □ Yes □ No
– Explanation: Little uncertainty regarding the outcomes of cure and survival. Significant uncertainty regarding effects of ART on other outcomes, including adverse events, default, time to smear and culture conversion and timing of ART initiation.

Factor: Resource implications (are the resources consumed worth the expected benefit?)
– Rule: The higher the costs of an intervention compared to the alternative that is considered and other costs related to the decision – that is, the more resources consumed – the more likely is a conditional/weak recommendation.
– Decision: □ Yes □ No. More resources required for concomitant ART use.
– Explanation: Need for more skilled providers trained in HIV and drug resistant TB care and drug-drug interactions.

Overall strength of recommendation: Strong or conditional

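Balancing desirable and undesirable consequences
[Figure: desirable consequences (↑ herd immunity, ↓ morbidity, ↑ QoL, ↓ death) weighed against undesirable consequences (↑ resources, ↑ nausea, ↑ allergic reactions, ↑ local skin reactions); the result is a strong or conditional recommendation, for or against]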


GRADE process overview (from the slide's flow diagram)
Systematic review
• PICO question with outcomes rated as Critical, Critical, Important, Not important
• Summary of findings & estimate of effect for each outcome
Rating the quality of evidence for each outcome: High, Moderate, Low, Very low
• Randomization increases initial quality
• Grade down:
1. Risk of bias
2. Inconsistency
3. Indirectness
4. Imprecision
5. Publication bias
• Grade up:
1. Large effect
2. Dose response
3. Confounders
Guideline development
• Grade overall quality of evidence across outcomes based on lowest quality of critical outcomes
• Formulate recommendations:
– For or against (direction)
– Strong or weak/conditional (strength)
• By considering: quality of evidence, balance of benefits/harms, values and preferences
• Revise if necessary by considering: resource use (cost)
• Wording:
– “We recommend using…”
– “We suggest using…”
– “We recommend against using…”
– “We suggest against using…”

GRADE Grid


Implications of
a strong recommendation
• Patients: Most people in this situation
would want the recommended course of
action and only a small proportion would
not
• Clinicians: Most patients should receive
the recommended course of action
• Policy makers: The recommendation can
be adopted as a policy in most situations

Implications of
a weak/conditional recommendation
• Patients: The majority of people in this
situation would want the recommended
course of action, but many would not
• Clinicians: Be more prepared to help
patients to make a decision that is
consistent with their own values, using decision
aids and shared decision making
• Policy makers: There is a need for
substantial debate and involvement of
stakeholders


Conclusions
• Practice guidelines should be based on the best
available evidence to be evidence based
• GRADE combines what is known in health
research methodology and provides a structured
approach to improve communication
• Criteria for evidence assessment across
questions and outcomes
• Criteria for moving from evidence to
recommendations
• Systematic
– four categories of quality of evidence
– two grades for strength of recommendations
• Transparency in decision making and judgments
is key
