
medical education in review

Conducting systematic reviews in medical education: a stepwise approach

David A Cook1,2 & Colin P West2,3

OBJECTIVES As medical education research continues to proliferate, evidence syntheses will become increasingly important. The purpose of this article is to provide a concise and practical guide to the conduct and reporting of systematic reviews.

RESULTS (i) Define a focused question addressing the population, intervention, comparison (if any) and outcomes. (ii) Evaluate whether a systematic review is appropriate to answer the question. Systematic and non-systematic approaches are complementary; the former summarise research on focused topics and highlight strengths and weaknesses in existing bodies of evidence, whereas the latter integrate research from diverse fields and identify new insights. (iii) Assemble a team and write a study protocol. (iv) Search for eligible studies using multiple databases (MEDLINE alone is insufficient) and other resources (article reference lists, author files, content experts). Expert assistance is helpful. (v) Decide on the inclusion or exclusion of each identified study, ideally in duplicate, using explicitly defined criteria. (vi) Abstract key information (including on study design, participants, intervention and comparison features, and outcomes) for each included article, ideally in duplicate. (vii) Analyse and synthesise the results by narrative or quantitative pooling, investigating heterogeneity, and exploring the validity and assumptions of the review itself. In addition to the seven key steps, the authors provide information on electronic tools to facilitate the review process, practical tips to facilitate the reporting process and an annotated bibliography.

Medical Education 2012: 46: 943–952

doi:10.1111/j.1365-2923.2012.04328.x

Discuss ideas arising from this article at www.mededuc.com 'discuss'

1 Office of Education Research, Mayo Medical School, Rochester, Minnesota, USA
2 Division of General Internal Medicine, Mayo Clinic, Rochester, Minnesota, USA
3 Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA

Correspondence: David A Cook MD, MHPE, Division of General Internal Medicine, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, Minnesota 55905, USA. Tel: 00 1 507 266 4156; Fax: 00 1 507 284 5370; E-mail: cook.david33@mayo.edu


INTRODUCTION

As medical education research continues to proliferate, syntheses of this evidence will become increasingly important. Systematic reviews play a critical role in this process of synthesis by identifying and summarising published research on focused topics and highlighting strengths and weaknesses in that field. Although some have criticised systematic reviews as engendering false confidence in their objectivity and freedom from bias,1 others have argued for a more balanced role.2 Of the 108 applications for funding to conduct a literature review received in the past 2 years by the Society of Directors of Research in Medical Education, over half have proposed systematic reviews. As chair of the review committee, author DAC has observed that many applicants fail to anticipate the key actions required in a rigorous review. It appears that guidance on how to conduct a high-quality systematic review in medical education is needed. Although a number of books have been published on systematic reviews, and previous articles in the medical education literature have highlighted challenges and provided brief tips,3–5 we are not aware of any concise guidelines offering a structured approach to planning, conducting and reporting a systematic review in medical education.

The purpose of this article is to provide a concise and practical guide to the conduct and reporting of systematic reviews, with particular attention to issues affecting medical education research. Table 1 summarises the key steps. An annotated bibliography (Appendix S1, online) lists resources that elaborate on each of the methods that will be described. At each step, we will illustrate the principles involved using text drawn from a recent review of simulation-based education published by the first author.6

THE PROCESS

Define a focused research question

The first step in conducting a systematic review is to identify a focused question. This can be more challenging than it would first appear. A good question usually evolves from discussions with collaborators and undergoes multiple iterations before reaching its final form.
Table 1 Steps in the review process

1 Define a focused question
  Consider Population, Intervention, Comparison, Outcomes
2 Evaluate whether a systematic review is appropriate to answer the question
3 Assemble a team and write a protocol
4 Search for eligible studies
  Identify information sources: indexing databases; previous reviews; reference lists; author files, and experts in the field
  Define search terms
5 Decide on the inclusion or exclusion of each identified study
  Define inclusion and exclusion criteria; pilot-test and refine operational definitions
  Define restrictions
  Stage 1: review titles and abstracts in duplicate; err on the side of inclusion
  Stage 2: review full text in duplicate; resolve disagreements by consensus
6 Abstract data
  Define data abstraction elements; pilot-test and refine operational definitions
  Abstract data in duplicate; resolve disagreements by consensus
7 Analyse and synthesise
  Focus on synthesis: organise and interpret the evidence while providing transparency
  Pool results through narrative or meta-analysis
  Explore strengths, weaknesses, heterogeneity and gaps
  Explore the validity and assumptions of the review itself

The PICO mnemonic often used in clinical evidence-based medicine can also be helpful for systematic review questions. This mnemonic requires specification of the population, intervention (or other review topic, such as assessment tool), comparison interventions (if any), and outcomes of interest. For example, a focused question might ask:

P: in health professionals, is
I: training using simulation technologies,
C: in comparison with no intervention,
O: associated with improved knowledge, skills, behaviours or patient effects?
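For teams that manage review materials electronically, the four PICO elements can also be captured as a simple structured record, so that the same definitions later drive the protocol, search strategy and inclusion criteria. The following is a minimal sketch; the field names are our own illustration, not part of the published review:

```python
# Sketch: the PICO elements of the focused question as a structured record.
# Values follow the example question above; field names are illustrative.
pico_question = {
    "population":   "health professionals",
    "intervention": "training using simulation technologies",
    "comparison":   "no intervention",
    "outcomes":     ["knowledge", "skills", "behaviours", "patient effects"],
}
```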
The terms used in the question may require further definition. For example, what are 'simulation technologies'? Do veterinarians count as 'health professionals'? What is the difference between a behaviour and a patient effect? Thus, although the focused question should be stated clearly and concisely – usually as a single sentence within the PICO framework – it will often require further elaboration to be truly useful.

Researchers often have some idea of what the results of a systematic review may show, based on relevant theories, previous reviews addressing related questions, and their (non-systematic) familiarity with the literature. It is often appropriate to make such predictions explicit in the form of 'anticipated results' or formal hypotheses.

Although deceptively brief, the importance of a clear question cannot be overstated. It will establish the framework for every step that follows.

The authors of the review of simulation-based education 'sought to answer two questions: (i) To what extent are simulation technologies for training health care professionals associated with improved outcomes in comparison with no intervention? and (ii) How do outcomes vary for different simulation instructional designs?'6 Relevant outcomes were defined in the subsequent paragraph as learning (knowledge or skills in a test setting), behaviours (in practice), or effects on patients. The first author framed a preliminary version of this question prior to assembling the study team. The scope and wording of the question evolved through iterative discussions amongst team members before its final form was established.

Evaluate whether a systematic review is an appropriate way of answering the question

There are both advantages and disadvantages to the systematic review process in comparison with other review approaches.2 The issue is, in many respects, analogous to debates regarding quantitative and qualitative research in general. Non-systematic reviews integrate research from diverse fields and identify new insights,1 whereas systematic reviews summarise research on focused topics and highlight strengths and weaknesses in existing bodies of evidence.7 Much like a lighthouse shining over the ocean, systematic reviews illuminate our understanding of a focused question, but leave other issues in the dark: 'The very rules that enhance the systematic review's rigor blind the researcher to ideas outside the scope of the focused question and resultant search strategy.'2 Thus, systematic and non-systematic approaches are perhaps best viewed as complementary, and the decision to conduct a systematic review should be based on the overall study objective.

Before embarking on a review – systematic or otherwise – the reviewers should carefully consider the strengths and weaknesses of existing reviews on that topic. In order to contribute to the literature, a new review must fill a meaningful gap in published reviews and add significantly to current knowledge, in terms of either quality or data. For example, a new systematic review that adds one small study to an existing high-quality systematic review of a dozen well-executed studies is unlikely to be useful.

Finally, prospective authors should realise that a well-conducted systematic review requires much time and effort. Systematic reviews do not represent 'quick and easy' research.

The first author of the review of simulation-based education6 conducted a thorough search to identify previous reviews, and found only one comprehensive review (8 years old) in the field and only one meta-analysis (on a narrow subtopic). The investigating team anticipated that there would be substantial heterogeneity among different studies, but felt that a meta-analytic summary would be useful provided they paid close attention to the comparison group (e.g. active intervention versus no intervention), and performed subgroup analyses to explore heterogeneity when found. The investigators conducted a preliminary search to estimate the number of eligible articles and thus anticipate the time commitment required.


Assemble a team and write a protocol

Assemble a team

Systematic reviews are a team activity. Choosing the right team may be one of the most important decisions in the entire review process. Although consensus on definitions and coding decisions is necessary for a review to move forward, a diversity of perspectives helps to enrich discussions and ultimately enhances the quality and generalisability of the review. At least one member of the team should have experience with systematic reviews and ideally one member (often a medical librarian) should have experience in conducting literature searches.

Because systematic reviews are hard work, it is helpful to provide an idea of the scope of the project and expected workload early in the process. Team members who lack the necessary time, commitment or expertise may contribute to frustrating delays in the research plan. Thus, effectively matching project needs with available resources is important.

Write a protocol

As with any research activity, a project protocol is a crucial element that provides both rigor and guidance during the process. The protocol should be written during or immediately after the writing of the focused question. The protocol incorporates specific plans for each of the elements of a successful systematic review, listed in Table 1 and described in greater detail herein. The protocol may be revised as the project progresses and more is learned about the study question, but the ability to reference a core protocol document allows modifications to be tracked and applied reproducibly to all steps of the review.

The first author of the review of simulation-based education6 deliberately assembled a multidisciplinary team, including internal medicine doctors, surgeons, PhD education researchers, and an experienced research librarian, to ensure that diverse opinions and skills were reflected in the review process. The anticipated scope, objectives, workload, timeline and products were discussed and agreed upon prior to beginning. Some of those invited to join the team declined because they could not commit to the deadline. A written protocol including the focused question, initial search terms, plans for further defining the search strategy, inclusion and exclusion criteria, key data abstraction items, and the analysis plan was finalised prior to beginning the review.

Search for eligible studies

The comprehensive identification of relevant studies is a hallmark of a systematic review. Much has been written on the conduct of literature searches8–11 and Maggio et al.12 have proposed 10 criteria that collectively define an ideal literature search. However, in broad terms there are two key questions to consider.

Firstly, what sources of information will be used? A comprehensive systematic search will interrogate multiple information sources in an attempt to uncover all eligible studies. MEDLINE is typically used, but alone will usually be insufficient because the overlap between MEDLINE and other databases is incomplete.8 Other indexing databases include EMBASE, Scopus, PsycINFO, Web of Science, CINAHL (Cumulative Index to Nursing and Allied Health Literature [for nursing]) and ERIC (Educational Resources Information Centre [for education studies]). In addition to these indexing databases, it is useful for reviewers to look for relevant articles in their own files and to contact experts in the field for further relevant publications. The references cited in previous reviews on the topic can be used both as a verification step (see below) and to supplement gaps. Finally, a hand search of the references cited in included articles may reveal studies that were missed in the search but are known to other authors in that field.

Secondly, what search terms will be used to query these information sources? The development of a thorough search strategy requires knowledge of appropriate indexing terms [e.g. medical subject headings (MeSH)], qualifiers and logical operators, all of which vary from one indexing database to another. For this reason, input from an expert in literature searches, such as a research librarian, can be invaluable. The sensitivity (ability to identify relevant articles) of a preliminary search strategy should be verified by ensuring that known relevant articles (e.g. articles known to the reviewers or cited in a previous review or seminal work) are identified using the planned keywords. Reviewers should seek new keywords in any omitted articles to improve the search strategy in subsequent iterations.

A more sensitive search usually identifies more ineligible articles (i.e. it is less specific). Finding the right balance usually requires several iterations, and benefits from expert assistance.
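To illustrate the verification loop just described, the sketch below runs a draft Boolean query against MEDLINE via PubMed and checks whether a set of known 'sentinel' articles is retrieved. It assumes the Biopython package is available; the query string and PMIDs are invented placeholders, not the strategy used in the simulation review:

```python
# Sketch of the sensitivity check described above: run a draft MEDLINE
# (PubMed) query and confirm that known relevant articles are retrieved.
# Assumes Biopython is installed; the query and PMIDs are placeholders.
from Bio import Entrez

Entrez.email = "reviewer@example.org"  # NCBI requests a contact address

query = (
    '("Patient Simulation"[MeSH Terms] OR simulat*[Title/Abstract]) '
    'AND (education[MeSH Terms] OR train*[Title/Abstract])'
)

handle = Entrez.esearch(db="pubmed", term=query, retmax=10000)
record = Entrez.read(handle)
handle.close()
retrieved = set(record["IdList"])
print(f"Records retrieved: {record['Count']}")

# Sentinel articles: PMIDs the team already knows to be relevant
# (hypothetical values). Any sentinel the query misses should be read
# for new keywords to add in the next iteration of the strategy.
sentinels = {"12345678", "23456789"}
missed = sentinels - retrieved
print("Missed sentinels:", sorted(missed) if missed else "none")
```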

All articles identified in the search, including those excluded at later stages, should be assigned a unique identification number. The complete strategy, including specific search terms for each indexing database, the databases and other sources searched, search dates and all search results should be carefully archived for subsequent reporting.
Using the initial search terms as a starting point, the research librarian and the first author of the review of simulation-based education6 worked in collaboration to identify a comprehensive search strategy. The search sensitivity was evaluated by comparing the articles identified against those already known to the authors and those cited in previous seminal reviews. If an article was missed, the title and abstract were carefully reviewed to identify terms that would improve the sensitivity. The search was repeated, with adaptations as needed, in MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science and Scopus. In addition, all references cited in several seminal reviews, and the entire table of contents of two key journals, were added to the list of articles. Finally, the reference lists of randomly selected articles were hand-searched to identify additional articles; this continued until no additional articles were identified. The unabridged search strategy was published as an online appendix.

Decide on the inclusion or exclusion of each identified study

After identifying a pool of articles, reviewers include or exclude articles based on predefined inclusion and exclusion criteria. These criteria typically emerge naturally from the focused question and, again, the PICO framework can often be used to help to define the population (e.g. 'medical students'), intervention (e.g. 'problem-based learning'), comparison (e.g. 'no intervention') and outcome (e.g. 'any learning outcome'). Study design (e.g. 'any comparative design' or 'only randomised trials') may also be considered, although a more inclusive approach can allow subsequent evaluation of whether results differ depending on study design.

Additional restrictions may be placed on included work. Reviewers will occasionally exclude articles based on language (e.g. by excluding non-English publications), publication date (e.g. by excluding articles older than 20 years), length (e.g. by excluding abstracts) and rigor of peer review (e.g. by excluding graduate theses, papers presented at meetings, and other unpublished works which are collectively termed 'grey literature'). Decisions about restrictions are best made on conceptual grounds rather than convenience. For example, in a review of Web-based learning it made conceptual sense to begin the search at a date subsequent to the development of the World Wide Web.13 By contrast, there is rarely a good conceptual reason to limit the search to English-language publications only, because excellent research is often published in other languages.6 The inclusion of grey literature is more controversial; some correctly argue that non-peer-reviewed research may be of inferior quality, but others correctly argue that such studies can still, when properly analysed, contribute importantly to evidence-based decisions.

Defining inclusion and exclusion criteria

Regardless of the actual criteria selected, it is important to clearly define these criteria both conceptually (often by using a formal definition from a dictionary, theory or previous review) and operationally (by using detailed explanations and elaborations that help reviewers recognise the key concepts as reported by authors in published articles). Although some operational definitions will be defined from the outset, many of these may actually emerge during the process of the review as reviewers come across articles of uncertain inclusion status. Such cases should be discussed by the group with the goal not only of deciding on the inclusion or exclusion of that article, but also of defining a rule that will determine the triage of similar articles in the future. Such decisions, along with brief examples of what should and should not be included, can be catalogued in an explanatory document. Although the conceptual definitions should remain unchanged, the explanatory document and the operational definitions it contains often continue to evolve throughout the review process.

Involving the entire reviewer group in the development of the conceptual and operational definitions not only improves the likelihood that others will agree with the decisions made, but ensures that everyone will apply the criteria using this shared understanding. Yet even after the group development process, it remains essential to pilot-test the inclusion/exclusion form and process on a small subset of articles. After each round of pilot-testing, all reviewers compare their decisions and use points of discrepancy to refine the operational definitions and to recalibrate their own standards.

The inclusion and exclusion process

As with nearly all phases of the review process, inclusion and exclusion should involve at least two reviewers. Duplicate, independent review minimises random error and helps to avoid idiosyncrasies that would bias the review.


The inclusion/exclusion process typically has two stages. In stage 1, reviewers look only at the title, abstract and – if available – the keywords. During this stage, if both reviewers are confident based on the title and abstract that the article is ineligible, it is excluded. If there is any doubt, such as in a case in which the abstract contains insufficient information, the article advances to stage 2. Reviewers typically do not reconcile disagreements at this stage. If either reviewer feels the paper should be included, it is duly advanced based on the rationale that resolving uncertainties is best done using the full text rather than the abstract alone.

During stage 2, reviewers read the full text of each article to make a final inclusion/exclusion decision. Here, two independent reviews are required in all cases. Reviewers initially attempt to resolve the inevitable coding disagreements through discussion and consensus, and appeal to another member of the review team if needed.

In the review of simulation-based education,6 inclusion and exclusion criteria had been defined with the writing of the study protocol. Included studies were required to have a comparison group, but no other design restrictions were imposed (i.e. both randomised and non-randomised studies were eligible). The investigators applied these criteria to each article identified in step 4. In stage 1, one or two authors reviewed each title and abstract; two negative votes were required to exclude an article, whereas one positive vote would advance the article to stage 2 (i.e. err on the side of inclusion). In stage 2, two investigators independently reviewed the full text of each article and resolved all disagreements by consensus. Whereas the original wording of the inclusion and exclusion criteria remained unchanged, the operational definitions of these criteria evolved over time. Over 30 articles were translated from non-English languages including Chinese, Japanese, Korean, Spanish, French, German, Swedish and Finnish. The authors kept a careful accounting of the reason for each inclusion and exclusion, and summarised this in a trial flow figure. All included articles were listed in an online appendix.
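The stage 1 and stage 2 decision rules just described reduce to a few lines of logic. The following sketch, with hypothetical vote data, encodes 'err on the side of inclusion' at stage 1 and consensus resolution at stage 2:

```python
# Sketch of the two-stage screening rules described above. An article is
# excluded at stage 1 only on unanimous negative votes; a single positive
# vote advances it to stage 2 full-text review (erring towards inclusion).
# Vote data are hypothetical.

def stage1_decision(votes: list[bool]) -> str:
    """votes: one boolean per reviewer (True = include, False = exclude)."""
    if any(votes):
        return "advance to stage 2"   # one positive vote is enough
    return "exclude"                  # unanimous negative votes required

def stage2_decision(reviewer_a: bool, reviewer_b: bool) -> str:
    """Two independent full-text reviews; disagreements go to consensus."""
    if reviewer_a == reviewer_b:
        return "include" if reviewer_a else "exclude"
    return "resolve by consensus (appeal to a third reviewer if needed)"

print(stage1_decision([False, True]))   # -> advance to stage 2
print(stage2_decision(True, False))     # -> resolve by consensus ...
```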

Abstract data

After studies have been selected for inclusion, the next step is to methodically abstract key information from each included article.

Defining the data abstraction elements

What information should be collected? The PICO framework can again provide guidance in planning which data to collect, including the key features of participants (number and key demographics), interventions (key elements of design, intensity, timing, duration and implementation), comparisons (similar to interventions) and outcomes. Information on outcomes should include details of both the measurement method (e.g. outcome classification, assessor blinding, timing in relation to intervention, score validity) and the actual results (mean and standard deviation, event rate, correlation coefficient, effect size, etc.).

Reviewers should also code information on study design, which might include the number of groups, method of group assignment (randomised versus non-randomised), timing of assessments (e.g. post-intervention versus pre- and post-intervention), enrolment and follow-up rates, and other features of study quality. These elements may vary for different study designs, but a focus on threats to study validity14 is common among them. Many instruments for assessing study quality have been described, including the Medical Education Research Study Quality Instrument (MERSQI)15 for education research, the Jadad scale16 for randomised trials, the Newcastle–Ottawa Scale13,17 for non-randomised studies, and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2)18 for studies of assessment tools. All have strengths and weaknesses, and none has been universally accepted as a reference standard. More important than a score on any particular instrument is the assessment of possible bias and validity threats in each study in the systematic review.
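One convenient way to operationalise these elements is a structured abstraction record with one field per item, completed independently by each reviewer so that discrepancies can be listed automatically for consensus discussion. The field names below are illustrative assumptions, not the instrument used in the simulation review:

```python
# Sketch of a structured data abstraction record built around the PICO
# elements and study design features discussed above. Field names are
# illustrative, not the form used in the published review.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AbstractionRecord:
    article_id: int                  # unique ID assigned during the search
    population: str                  # e.g. 'medical students'
    n_participants: int              # number enrolled
    intervention: str                # key design, intensity, duration
    comparison: str                  # e.g. 'no intervention'
    outcome_measure: str             # e.g. 'skills in a test setting'
    assessor_blinded: bool           # one element of study quality
    randomised: bool                 # method of group assignment
    mean: Optional[float] = None     # results needed for quantitative pooling
    sd: Optional[float] = None
    notes: list[str] = field(default_factory=list)

def discrepant_fields(a: AbstractionRecord, b: AbstractionRecord) -> list[str]:
    """Fields on which two independent reviewers disagree (for consensus)."""
    return [f for f in a.__dataclass_fields__ if getattr(a, f) != getattr(b, f)]
```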


The data abstraction process

A data abstraction form should be developed and iteratively refined. As with the inclusion/exclusion criteria, the elements of data to be abstracted must be defined both conceptually and operationally, and the development of an explanatory document with detailed definitions and examples is essential. In addition to the questions defined at the study outset, new questions often emerge as the review team reads articles during the inclusion process. These questions dictate the data to be abstracted.

Pilot-testing is necessary to identify ambiguous definitions and other areas for which additional consistency or clarification is required. The entire review team may examine the same article or small set of articles prior to a group discussion, and additional cycles of this process may be carried out thereafter as needed until a high degree of consistency is achieved. Data abstraction should ideally be carried out by two independent reviewers. Coding disagreements must be resolved, ideally by consensus and by appeal to a third party if necessary.

Many reviews will encounter articles containing incomplete information. The review team must decide how to handle such articles; solutions might involve excluding such articles, imputing missing information from other articles, or attempting to obtain missing information from the original authors. The impact of these decisions on the overall review results should be considered regardless of the choices made.

The authors of the review of simulation-based education6 considered the findings of previous reviews and their own impressions of the field as they identified data elements to collect. Study quality was evaluated using two complementary criteria. Investigators reviewed each article in duplicate to abstract this information using an electronic tool designed for this purpose (DistillerSR.com) and resolved discrepant codes by consensus.

Analyse and synthesise

During synthesis the hard work of systematic inclusion and data abstraction pays off as the collected evidence is analysed to answer the focused question and to develop new insights. This synthesis can take the form of a quantitative summary (i.e. meta-analysis)19 or a more qualitative narrative synthesis20 or realist review.21 However, the most important part of the analysis or synthesis is that it should actually synthesise the evidence rather than simply catalogue it. Synthesis requires more than merely reporting the results of each study one at a time (the 'litany of the literature') or counting the number of studies with statistically significant results (vote counting).22

Table 2 Tools to facilitate the review

E-mail, instant messaging, teleconferencing (various options): team communication, particularly to accommodate physical distances and scheduling difficulties
Videoconferencing (Skype): one-to-one communication, particularly if visual cues are important
E-mail listservs and groups (Google groups, Yahoo groups): group discussions; particularly valuable when discussing coding criteria and other key decisions as the tool will keep a permanent archive of the discussion that can later be used to both recall the decision and understand the arguments that led to that decision
Scheduling software (Doodle, MeetingWizard, Tungle, TimeBridge): scheduling group meetings
Wikis (Google Docs, Wikispaces.com): group collaboration on documents such as the coding explanatory document, deadlines, assigned tasks, thematic analysis
Spreadsheet (Microsoft Excel, OpenOffice Calc, Google Docs): article inclusion/exclusion, data abstraction and rudimentary data analysis
Bibliographic software23 (EndNote, Reference Manager): article inclusion/exclusion, data abstraction, preparation of reference lists
Translation software (Google Translate): translating articles from other languages
Purpose-built review software (DistillerSR, EPPI-Reviewer 4): article inclusion/exclusion and data abstraction
Meta-analysis software (RevMan, MetaAnalyst, Comprehensive Meta Analysis, MetaWin, SAS, SPSS, STATA): meta-analysis; SAS and SPSS do not directly support meta-analysis, but macros are available to do this (http://mason.gmu.edu/~dwilsonb/ma.html)


Table 3 Tips on reporting a systematic review*

Introduction: Keep the Introduction brief (two or three paragraphs). Focus the literature review on previous review articles, emphasising their strengths and limitations, and highlighting the added contribution of (i.e. need for) a new review on the topic

Methods: Divide the Methods into five sections, with headings pertaining to: a focused question; search strategy; inclusion criteria; data abstraction, and analysis. Most of this information can be pulled from your written protocol

Results: Divide the Results section into at least four sections, with headings pertaining to: trial flow; study characteristics; study quality, and synthesis
Trial flow: describe the number of studies identified in the search, the number included and excluded, the number remaining for full review, and any special issues that arose during this process
Study characteristics: describe a few (three to five) of the most salient study features
Study quality: describe the most important aspects of study quality
Synthesis: present the results of the narrative or quantitative data analysis. The synthesis narrative should distil the evidence into a clear message for readers. Articles that support a similar point (either favourable or unfavourable to the overall conclusion) should be grouped together rather than listed individually. For example, a reviewer might write: 'It appears that intervention X improves outcomes. Five randomised trials addressed this issue; four found favourable results and one found no significant difference (see Table 2 for details).' The reviewer might then proceed to discuss salient between-study differences, such as in design, participants, interventions or instruments, that might have influenced results
Include at least one specific figure and two specific tables: (i) a trial flow diagram as specified in the QUOROM and PRISMA guidelines; (ii) a table that contains information on the key features of each study, and (iii) a table with details of each study's quality features

Discussion: Divide the Discussion into at least four sections, with headings for the last three, pertaining to: summary; limitations; integration with previous reviews, and implications for practice and future research
Summary (no separate heading): recap (but do not restate) the most important results, including key uncertainties if any
Limitations: acknowledge the review's limitations and unique strengths
Integration: discuss how the present review supports, contradicts or extends the findings of previous relevant reviews. In addition to considering reviews in the present field of study, it is often helpful to draw parallels with findings in other fields (e.g. clinical medicine, other education topics, or non-medical education)
Implications: outline two to four main points that can be immediately applied in practice or will provide a starting point for future research. Note that there is no need for a separate conclusions section; the implications are, in reality, your conclusions

Abstract: Write the abstract last. Follow the structure prescribed by the PRISMA guidelines unless the journal requires a different format. Even if the journal requires a different format, retain the content requested by PRISMA

* These complement the more complete recommendations of the PRISMA statement26

Rather, synthesis involves pooling and exploring the results to provide a 'bottom-line' statement regarding what the evidence supports and what gaps remain in our current understanding. This requires reviewers to organise and interpret the evidence, anticipating and answering readers' questions about this topic, while simultaneously providing transparency that allows readers to verify the interpretations and arrive at their own conclusions.


Reviewers must make a number of key decisions regarding the analysis. Firstly, should they attempt a statistical pooling of quantitative results (i.e. meta-analysis)? If so, further decisions include which statistical model to apply (e.g. a fixed-effects or random-effects model) and how to standardise outcomes across studies. Details on meta-analysis are beyond the scope of this article; Appendix S1 lists several helpful resources.
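For readers who want to see the quantitative machinery, the following is a minimal sketch, with invented effect sizes, of inverse-variance pooling under fixed- and random-effects models, together with Cochran's Q and the I² statistic used to quantify the heterogeneity discussed next. It illustrates standard formulas only and is not the analysis from the simulation review:

```python
# Minimal sketch (invented data) of the analytic decisions described above:
# inverse-variance pooling of standardised mean differences under fixed-
# and random-effects models, with Cochran's Q and I² for heterogeneity.
import math

# Each study: standardised mean difference (e.g. Hedges' g) and its variance
effects =   [0.45, 0.80, 0.30, 1.10]
variances = [0.04, 0.09, 0.02, 0.12]

w_fixed = [1 / v for v in variances]                   # fixed-effect weights
pooled_fixed = sum(w * y for w, y in zip(w_fixed, effects)) / sum(w_fixed)

# Heterogeneity: Cochran's Q and the I² statistic
q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(w_fixed, effects))
df = len(effects) - 1
i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# DerSimonian-Laird estimate of between-study variance (tau²)
c = sum(w_fixed) - sum(w ** 2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (q - df) / c)
w_random = [1 / (v + tau2) for v in variances]         # random-effects weights
pooled_random = sum(w * y for w, y in zip(w_random, effects)) / sum(w_random)
se_random = math.sqrt(1 / sum(w_random))

print(f"Fixed-effect SMD: {pooled_fixed:.2f}")
print(f"Random-effects SMD: {pooled_random:.2f} "
      f"(95% CI {pooled_random - 1.96 * se_random:.2f} "
      f"to {pooled_random + 1.96 * se_random:.2f})")
print(f"Q = {q:.2f} on {df} df; I² = {i_squared:.0f}%")
```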
Secondly, how will reviewers explore heterogeneity (inconsistency) across studies? The most informative aspect of many reviews is not the average result across studies, but, rather, the exploration of why results differ from study to study. An explanation of between-study inconsistency should be part of all systematic reviews.

Finally, the authors should consider threats to the validity of their own review. By transparently reporting their methods, acknowledging key assumptions, exploring potential sources of bias and providing tables containing detailed information on each study, the reviewers encourage readers to verify and potentially reinterpret the information for themselves. Indeed, the degree to which reviewers explore the strengths, weaknesses, heterogeneity and gaps in the evidence determines in large part the value of the review.

The authors of the review of simulation-based education6 used meta-analysis to synthesise the results. They used I2 statistics to quantify heterogeneity and subgroup analyses to explore this heterogeneity. They also performed a narrative synopsis of key study characteristics including trainees, clinical topics and study quality. In subsequent manuscripts (in press) on focused topics they have used narrative synthesis methods to identify and summarise salient themes.

REVIEWING TOOLS

Paper tracking and generic statistical software may have been sufficient to perform a systematic review in the past, but as the volume of evidence and the sophistication of techniques expand, it is increasingly necessary to rely on electronic resources and tools to support high-quality reviews. As detailed in Table 2, such tools can facilitate and archive team communications, streamline the process of inclusion/exclusion and data abstraction,23 assist in thematic analysis, format bibliographies and even translate excerpts from articles in other languages.

The authors of the review of simulation-based education6 used DistillerSR.com to track inclusion/exclusion and for data abstraction. They used a Google Group to archive all e-mail communications, Google Docs to keep an ongoing list of articles in need of translation, Doodle.com to schedule teleconferences, Google Translate for some simple translation needs, and EndNote to manage references. They used SAS macros to perform meta-analysis.

REPORTING

The key elements in reporting systematic reviews and meta-analyses have been codified in guidelines such as the QUOROM (quality of reporting of meta-analyses),24 MOOSE (meta-analysis of observational studies in epidemiology)25 and, most recently, PRISMA (preferred reporting items for systematic reviews and meta-analyses)26 statements. We encourage reviewers to adhere to these guidelines (http://www.prisma-statement.org), but we will not repeat these in detail. We provide some practical advice for writing the manuscript itself in Table 3.

The review6 team cited the PRISMA guidelines and adhered to these during the planning, conduct and reporting of the review.

CONCLUSIONS

As the volume and quality of evidence in medical education continue to expand, the need for evidence synthesis will grow. By following the seven key steps outlined in this paper to complete a high-quality systematic review, authors will more meaningfully contribute to this knowledge base.

Contributors: DAC drafted the initial manuscript. Both authors jointly revised subsequent drafts and both approved the final manuscript for publication.
Acknowledgements: none.
Funding: none.
Conflicts of interest: none.
Ethical approval: not applicable.

REFERENCES

1 Eva KW. On the limits of systematicity. Med Educ 2008;42:852–3.


2 Cook DA. Narrowing the focus and broadening horizons: complementary roles for non-systematic and systematic reviews. Adv Health Sci Educ Theory Pract 2008;13:391–5.
3 Reed D, Price EG, Windish DM, Wright SM, Gozu A, Hsu EB, Beach MC, Kern D, Bass EB. Challenges in systematic reviews of educational intervention studies. Ann Intern Med 2005;142:1080–9.
4 Reeves S, Koppel I, Barr H, Freeth D, Hammick M. Twelve tips for undertaking a systematic review. Med Teach 2002;24:358–63.
5 Hammick M, Dornan T, Steinert Y. Conducting a best evidence systematic review. Part 1: from idea to data coding. BEME Guide No. 13. Med Teach 2010;32:3–15.
6 Cook DA, Hatala R, Brydges R, Zendejas B, Szostek JH, Wang AT, Erwin P, Hamstra S. Technology-enhanced simulation for health professions education: a systematic review and meta-analysis. JAMA 2011;306:978–88.
7 Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: synthesis of best evidence for clinical decisions. Ann Intern Med 1997;126:376–80.
8 Lefebvre C, Manheimer E, Glanville J. Searching for studies. In: Higgins JPT, Green S, eds. Cochrane Handbook for Systematic Reviews of Interventions. Chichester: Wiley-Blackwell 2008;95–150.
9 White HD. Scientific communication and literature retrieval. In: Cooper H, Hedges LV, Valentine JC, eds. The Handbook of Research Synthesis, 2nd edn. New York, NY: Russell Sage Foundation 2009;51–71.
10 Haig A, Dozier M. BEME Guide No. 3: systematic searching for evidence in medical education. Part 1: sources of information. Med Teach 2003;25:352–63.
11 Haig A, Dozier M. BEME Guide No. 3: systematic searching for evidence in medical education. Part 2: constructing searches. Med Teach 2003;25:463–84.
12 Maggio LA, Tannery NH, Kanter SL. Reproducibility of literature search reporting in medical education reviews. Acad Med 2011;86:1049–54.
13 Cook DA, Levinson AJ, Garside S, Dupras DM, Erwin PJ, Montori VM. Internet-based learning in the health professions: a meta-analysis. JAMA 2008;300:1181–96.
14 Cook DA, Beckman TJ. Reflections on experimental research in medical education. Adv Health Sci Educ Theory Pract 2010;15:455–64.
15 Reed DA, Cook DA, Beckman TJ, Levine RB, Kern DE, Wright SM. Association between funding and quality of published medical education research. JAMA 2007;298:1002–9.
16 Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJM, Gavaghan DJ, McQuay HJ. Assessing the quality of reports of randomised clinical trials: is blinding necessary? Control Clin Trials 1996;17:1–12.
17 Wells GA, Shea B, O'Connell D, Peterson J, Welch V, Losos M, Tugwell P. The Newcastle–Ottawa Scale (NOS) for assessing the quality of non-randomised studies in meta-analyses. http://www.ohri.ca/programs/clinical_epidemiology/oxford.htm. [Accessed 29 February 2012.]
18 Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MMG, Sterne JAC, Bossuyt PMM, the QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529–36.
19 Cook DA. Randomised controlled trials and meta-analysis in medical education: what role do they play? Med Teach 2012;34:468–73.
20 Bland CJ, Meurer LN, Maldonado G. A systematic approach to conducting a non-statistical meta-analysis of research literature. Acad Med 1995;70:642–53.
21 Pawson R, Greenhalgh T, Harvey G, Walshe K. Realist review – a new method of systematic review designed for complex policy interventions. J Health Serv Res Policy 2005;10 (Suppl 1):21–34.
22 Bushman BJ, Wang MC. Vote-counting procedures in meta-analysis. In: Cooper H, Hedges LV, Valentine JC, eds. The Handbook of Research Synthesis, 2nd edn. New York, NY: Russell Sage Foundation 2009;207–20.
23 King R, Hooper B, Wood W. Using bibliographic software to appraise and code data in educational systematic review research. Med Teach 2011;33:719–23.
24 Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF. Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Quality of reporting of meta-analyses. Lancet 1999;354:1896–900.
25 Stroup DF, Berlin JA, Morton SC et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. JAMA 2000;283:2008–12.
26 Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med 2009;151:264–9.

Received 2 March 2012; editorial comments to authors 3 May 2012; accepted for publication 24 May 2012

SUPPORTING INFORMATION

Additional supporting information may be found in the online version of this article. Available at: http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2923.2012.04328.x/suppinfo

Appendix S1. Annotated bibliography of additional resources.

Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than for missing material) should be directed to the corresponding author for the article.
