
Current Commentary

The Numbers Game


Evaluation of Statistics by Obstetrics & Gynecology
Roy M. Pitkin, MD, James R. Scott, MD, and Leon F. Burmeister, PhD

Statistical analysis has become integral to the planning, conduct, and reporting of modern medical research. Attention to the statistical aspects of manuscripts submitted to Obstetrics & Gynecology goes back approximately 40 years, and the process used in their evaluation has evolved over that time. For the past 20 years, submissions with any type of statistics and being seriously considered for acceptance have routinely been reviewed by a Statistical Editor who judges the work on a number of statistical and design characteristics. Findings of the statistical design review (which has been done by one Statistical Editor over the entire 20-year period) are integrated into the editorial decision about acceptance. The statistical review generally leads to rejection of approximately 16–25% of manuscripts, and in a larger proportion it identifies less serious problems, the correction of which improves the final product.
(Obstet Gynecol 2014;123:353–5)
DOI: 10.1097/AOG.0000000000000079
VOL. 123, NO. 2, PART 1, FEBRUARY 2014 OBSTETRICS & GYNECOLOGY 353

From the Departments of Obstetrics and Gynecology, David Geffen School of Medicine at UCLA, Los Angeles, California, and University of Utah School of Medicine, Salt Lake City, Utah; and the College of Public Health, University of Iowa, Iowa City, Iowa.
Corresponding author: Roy M. Pitkin, MD, 78900 Rancho La Quinta Drive, La Quinta, CA 92253; e-mail: r.pitkin@earthlink.net.
Financial Disclosure
The authors did not report any potential conflicts of interest.
© 2014 by The American College of Obstetricians and Gynecologists. Published by Lippincott Williams & Wilkins.
ISSN: 0029-7844/14

Over the past 50 years or so, the use of statistics has assumed ever-increasing importance in planning, conduct, interpretation, and reporting of medical research. Clearly, the point has been reached where statistical analysis is integral and essential in all but the simplest of observational reports (eg, case reports). Thus, peer review journals such as Obstetrics & Gynecology have needed to include evaluation of statistics as an integral part of their assessment of manuscripts submitted for consideration of publication. The purpose of this article is to trace the evolution of the process of evaluating statistics in this Journal.

A detailed review of the 50-year history of Obstetrics & Gynecology1 identified recurring concern with statistics in meetings of the Editorial Board as far back as 1973, when the Editor was authorized to obtain statistical consultation of manuscripts when it was felt to be necessary. How often, or indeed whether, this system of ad hoc consultation with a statistician was used is not known.

Statistical consultation was formalized in 1986 with the appointment of a designated Statistical Consultant to review statistical and design characteristics in manuscripts referred by the Editor. The consultant's second duty was to present an educational program to the Editorial Board at its annual meeting. The latter was designed as a 4-year cycle so that each member would receive the full program during his or her time on the Board. The person appointed to the position (Leon F. Burmeister) would serve the Journal for 27 years under a number of titles: Statistical Consultant 1986–1999, Assistant Editor (Statistics) 1999–2001, and Associate Editor (Statistics) 2001–2013.

SELECTIVE EVALUATION
The system initiated in 1986 involved formal evaluation of statistics in manuscripts identified during the standard review process as needing it. Extramural peer reviewers were advised that the Journal had a Statistical Consultant and if they felt anything about the manuscript's design, methodology, results, or interpretation warranted this kind of specialized evaluation, they should so indicate in their comments to the Editor. Additionally, the consultant became involved if the Editorial Board member or the Editor primarily responsible for the manuscript regarded formal statistical review as indicated. By this process of selective evaluation, some 5–6% of submitted manuscripts were examined by the consultant, whose findings were incorporated into the Editor's disposition
letter. A few submissions were judged to be so flawed with respect to design, statistical analysis, or both that they were declined for publication on this basis alone and more had needed revisions identified to improve the work.

The system seemed sound and there was a general sense that it addressed the issue appropriately. However, a suggestion arose that it might be overly optimistic. Editorial Board members traditionally gave short talks at their last Board meeting and in 1991, Susan Johnson gave as her valedictory an analysis of statistics and design of articles published over 1 year. She found that only 72% of articles using statistics identified the analytical techniques used and that problems such as multiple comparisons and failure to emphasize disease prevalence in relation to predictive values were encountered occasionally. These observations and other concerns suggested that the selective system might not be functioning optimally. Therefore, in 1993, it was decided to conduct an internal study of routine statistical screening.

For the study, 100 consecutive manuscripts reporting any type of statistical data or analysis were sent to the Statistical Consultant simultaneously with being sent for review by external referees and Editorial Board members. The consultant screened the papers using a checklist including such items as definition of the study population, justification of sample size, assignment to interventions, and plan of statistical analysis. The results were surprising: fully one-fourth of submissions were judged to have serious enough statistical or design flaws that the consultant concluded they should be rejected on that basis alone. More importantly, this was unrecognized by the standard review process in 15 of the 25 cases. Moreover, more than half of those manuscripts returned to the author with an invitation to revise went with important but nonfatal flaws identified by the statistical screen, whose correction improved the final product.

These results indicated clearly that formal screening of manuscripts for statistical and design aspects makes unique contributions by 1) identifying fatal flaws that would otherwise go undetected and 2) pointing out important improvement needed in papers ultimately accepted and published. The Editorial Board considered the matter at its 1994 meeting and decided to establish a policy of routine screening of all manuscripts before acceptance.

ROUTINE STATISTICAL SCREENING
On July 1, 1994, the Journal initiated its new policy of routine screening.2 At the time of initial disposition (ie, when the Editor analyzes the manuscript and the reviews by expert referees and the Editorial Board), if it appears that the work is potentially acceptable, it is then sent to the Statistical Editor. The results of the screen are integrated into the editorial decision on initial manuscript disposition. Sometimes the screening finds one or more fatal flaws, leading the editors to decide to decline the paper for publication on that basis alone. More often, the screening identifies important ways of improving the manuscript, and these are incorporated into the Editor's letter inviting revision.

Experience with the first 8 months of this program clearly supported the value of the approach.3 Some 16% of manuscripts screened were found to have such serious statistical or design flaws as to prompt rejection, an astonishing datum in view of the fact that all had "passed" the standard peer review process. In 65% of manuscripts screened, a need for important improvements was identified; common deficiencies found were inadequate description of the study population, failure to justify sample size, use of statistical tests without presenting evidence their underlying assumptions are satisfied, and inappropriate use of the term "randomized." Gratifyingly, the process did not prolong the review time appreciably.

REFINEMENTS (2001–2013)
A new Editor-in-Chief and editorial team, which took office in 2001, continued with a similar approach of all manuscripts containing any statistical data and being seriously considered for publication undergoing formal statistical review. This review followed a format covering 10 essential points of design and analysis, with additional comments as appropriate, and the Statistical Editor's report was reviewed by each of the editors at a weekly telephonic conference at which manuscript disposition was decided. If a study was judged unacceptable by the Statistical Editor because of poor design or other statistically related issues that could not be corrected, it was usually rejected. With less serious and potentially remediable statistical design issues, the manuscript was returned for possible revision; if the editorial team felt the author(s) had been able to address less serious statistical issues by satisfactory revision, the paper usually was accepted.

During the 12 years from 2001 to 2013, 5,305 manuscripts underwent statistical review. Recent tabulation of 18 months' experience (July 2011 to January 2013), involving 719 consecutive manuscripts, reveals the relation between the Statistical Editor's recommendation and ultimate manuscript disposition was quite close (Table 1). Most importantly, 85% of papers the Statistical Editor recommended be rejected were ultimately declined for publication.
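
The percentages quoted from Table 1 can be rechecked from its raw counts. The following is a minimal sketch, not part of the Journal's process: the counts are transcribed from Table 1, and the names `TABLE_1` and `acceptance_summary` are hypothetical, introduced here only for illustration.

```python
# Counts transcribed from Table 1:
# recommendation -> (manuscripts reviewed, manuscripts accepted and published)
TABLE_1 = {
    "Accept": (63, 48),
    "Minor revision": (313, 233),
    "Major revision": (138, 87),
    "Reject": (205, 31),
}


def acceptance_summary(table):
    """Return {recommendation: (accepted %, declined %)} as whole percents."""
    summary = {}
    for recommendation, (reviewed, accepted) in table.items():
        accepted_pct = round(100 * accepted / reviewed)
        declined_pct = round(100 * (reviewed - accepted) / reviewed)
        summary[recommendation] = (accepted_pct, declined_pct)
    return summary


# Total should match the 719 consecutive manuscripts stated in the text.
total_reviewed = sum(reviewed for reviewed, _ in TABLE_1.values())
summary = acceptance_summary(TABLE_1)

if __name__ == "__main__":
    print(f"Total manuscripts reviewed: {total_reviewed}")
    for recommendation, (accepted_pct, declined_pct) in summary.items():
        print(f"{recommendation}: {accepted_pct}% accepted, {declined_pct}% declined")
```

Under these counts, the four categories sum to the 719 manuscripts stated in the text, and the Reject row (31 of 205 accepted) corresponds to the 85% of rejection recommendations that ended in a declined manuscript.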



Table 1. Statistical Editor Recommendation and Manuscript Acceptance*

Recommendation            Accepted and Published
Accept (n=63)             48 (76)
Minor revision (n=313)    233 (74)
Major revision (n=138)    87 (63)
Reject (n=205)            31 (15)

Data are n (%).
* Seven hundred nineteen consecutive manuscripts, July 2011 to January 2013.

Additionally, the Statistical Editor's educational lecture at the annual Editorial Board meeting clarified controversial topics and questions that had arisen during the past year. Among the subjects covered were meta-analysis, decision analysis, and cost-effectiveness analysis; data "torturing" or overanalysis; limitations and consequences of various sample designs; interpretation of relative risks and odds ratios; and failed assumptions on multivariable analyses.

In addition to emphasizing formal statistical evaluation of manuscripts, Obstetrics & Gynecology has been among the first medical journals to adopt newly promulgated guidelines for the proper reporting of specific types of studies.4 Each of these sets of guidelines has its own requirements for reporting statistics and includes a checklist to be filled out by the author and submitted with the manuscript. These international standards and the year of their adoption by Obstetrics & Gynecology are: Consolidated Standards of Reporting Trials (CONSORT) for reporting randomized trials adopted in 1996; Quality of Reporting of Meta-analyses (QUOROM) for reporting systematic reviews and meta-analyses of randomized trials adopted in 2000, replaced by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines in 2009; Meta-analysis of Observational Studies in Epidemiology (MOOSE) for reporting meta-analyses of observational studies in epidemiology adopted in 2001; STAndards for the Reporting of Diagnostic accuracy studies (STARD) for reporting studies of diagnostic accuracy adopted in 2004; and STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) guidelines for reporting the results of observational studies in epidemiology adopted in 2007.

In 2002, a Consultant Editor for Epidemiology (David A. Grimes, MD) joined the editorial team. An editorial introducing this new appointment emphasized the importance of evidence-based medicine, which "seeks to replace unproved clinical practices with a more effective and scientific approach to patient care."5 During his tenure, from 2002 to 2012, the Epidemiology Editor participated in the assessment of certain complex manuscripts and in addition contributed educational lectures on epidemiologic topics during annual Editorial Board meetings.

CONCLUSION
Statistical aspects of manuscripts submitted to Obstetrics & Gynecology have been a subject of attention going back at least 40 years. Over this period, the methodology has evolved and for approximately 20 years, some form of routine screening by a Statistical Editor has been part of the system. In addition, the closely related field of epidemiology has been incorporated in the evaluation process, whose ultimate goal is to provide the readers with the most reliable information possible.

REFERENCES
1. Pitkin RM. The Green Journal: fifty years on. Washington (DC): American College of Obstetricians and Gynecologists; 2003.
2. Pitkin RM. Statistical evaluation of manuscripts: it's all in the numbers. Obstet Gynecol 1994;83:1043–4.
3. Pitkin RM, Burmeister LF. Routine statistical screening revisited. Obstet Gynecol 1995;86:124–5.
4. Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I, et al. Improving the quality of reporting of randomized controlled trials: the CONSORT statement. JAMA 1996;276:637–9.
5. Scott JR. Show me the evidence. Obstet Gynecol 2002;100:403–4.

