review articles

DOI:10.1145/2979672

Computational Support for Academic Peer Review: A Perspective from Artificial Intelligence

New tools tackle an age-old practice.

BY SIMON PRICE AND PETER A. FLACH

key insights

• State-of-the-art tools from machine learning and artificial intelligence are making inroads to automate parts of the peer-review process; however, many opportunities for further improvement remain.

• Profiling, matching, and open-world expert finding are key tasks that can be addressed using feature-based representations commonly used in machine learning.

• Such streamlining tools also offer perspectives on how the peer-review process might be improved: in particular, the idea of profiling naturally leads to a view of peer review being aimed at finding the best publication venue (if any) for a submitted paper.

• Creating a more global embedding for the peer-review process that transcends individual conferences or conference series by means of persistent reviewer and author profiles is key, in our opinion, to a more robust and less arbitrary peer-review process.

Peer review is the process by which experts in some discipline comment on the quality of the works of others in that discipline. Peer review of written works is firmly embedded in current academic research practice, where it is positioned as the gateway process and quality control mechanism for submissions to conferences, journals, and funding bodies across a wide range of disciplines. It is probably safe to assume that peer review in some form will remain a cornerstone of academic practice for years to come, evidence-based criticisms of this process in computer science22,32,45 and other disciplines23,28 notwithstanding. While parts of the academic peer review process have been streamlined in the last few decades to take technological advances into account, there are many more opportunities for computational support that are not currently being exploited. The aim of this article is to identify such opportunities and describe a few early solutions for automating key stages in the established academic peer review process. When developing these solutions we have found it useful to build on our background in machine learning and artificial intelligence: in particular, we utilize a feature-based perspective in which the handcrafted features on which conventional peer review usually depends (for example, keywords) can be improved by feature weighting, selection, and construction (see Flach17 for a broader perspective on the role and importance of features in machine learning).
Twenty-five years ago, at the start of our academic careers, submitting a paper to a conference was a fairly involved and time-consuming process that roughly went as follows. Once an author had produced the manuscript (in the original sense, that is, manually produced on a typewriter, possibly by someone from the university's pool of typists), he or she would make up to seven photocopies, stick all of them in a large envelope, and send them to the program chair of the conference, taking into account that international mail would take 3–5 days to arrive. On their end, the program chair would receive all those envelopes, allocate the papers to the various members of the program committee, and send them out for review by mail in another batch of big envelopes. Reviews would be completed by hand on paper and mailed back or brought to the program committee meeting. Finally, notifications and reviews would be sent back by the program chair to the authors by mail. Submissions to journals would follow a very similar process.

It is clear that we have moved on quite substantially from this paper-based process—indeed, many of the steps we describe here would seem arcane to our younger readers. These days, papers and reviews are submitted online in some conference management system (CMS), and all communication is done via email or via message boards on the CMS, with all metadata concerning people and papers stored in a database backend. One could argue this has made the process much more efficient, to the extent that we now specify the submission deadline up to the second in a particular time zone (rather than approximately as the last post round at the program chair's institution), and can send out hundreds if not thousands of notifications at the touch of a button.

Computer scientists have been studying automated computational support for conference paper assignment since pioneering work in the 1990s.14 A range of methods has been used to reduce the human effort involved in paper allocation, typically with the aim of producing assignments that are similar to the 'gold standard' manual process.9,13,16,18,30,34,37 Yet, despite many publications on this topic over the intervening years, research results in paper assignment have made relatively few inroads into mainstream CMS tools and everyday peer review practice. Hence, what we have achieved over the last 25 years or so appears to be a streamlined process rather than a fundamentally improved one: we believe it would be difficult to argue that the decisions taken by program committees today are significantly better in comparison with the paper-based process. But this doesn't mean that opportunities for improving the process don't exist—on the contrary, there is, as we demonstrate in this article, considerable scope for employing the very techniques that researchers in machine learning and artificial intelligence have been developing over the years.

The accompanying table recalls the main steps in the peer review process and highlights current and future opportunities for improving it through advanced computational support. In discussing these topics, it will be helpful to draw a distinction between closed-world and open-world settings. In a closed-world setting there is a fixed or predetermined pool of people or resources. For example, assigning papers for review in a closed-world setting assumes a program committee or editorial board has already been assembled, and hence the main task is one of matching papers to potential reviewers. In contrast, in an open-world setting the task becomes one of finding suitable experts. Similarly, in a closed-world setting an author has already decided which conference or journal to send their paper to, whereas in an open-world setting one could imagine a recommender system that suggests possible publication venues. The distinction between closed and open worlds is gradual rather than absolute: indeed, the availability of a global database of potential publication venues or reviewers with associated metadata would render the distinction one of scale rather than substance. Nevertheless, it is probably fair to say that, in the absence of such global resources, current opportunities tend to focus on closed-world settings. Here, we review research on steps II, III and V, starting with the latter two, which are more of a closed-world nature.

A chronological summary of the main activities in peer review, with opportunities for improving the process through computational support:

I. Author — Paper submission. What can be done now: —. What might be done in future: recommender systems for publication venue; papers carry full previous reviewing history.
II. Program chair — Assembling program committee. Now: expert finding. Future: PCs for an area rather than a single conference; workload balancing.
III. Program chair — Assigning papers for review. Now: bidding and assignment support. Future: extending PCs based on submitted papers.
IV. Reviewer — Reviewing papers. Now: —. Future: advanced reviewing tools that find related work and map the paper under review relative to it.
V. Program chair — Discussion and decisions. Now: reviewer score calibration. Future: more outcome categories; recommender systems for outcomes; more decision time points.

Assigning Papers for Review
In the currently established academic process, peer review of written works depends on appropriate assignment to several expert peers for their review. Identifying the most appropriate set of reviewers for a given submitted paper is a time-consuming and non-trivial task for conference chairs and journal editors—not to mention funding program managers, who rely on peer review for funding decisions. Here, we break the review assignment problem down into its matching and constraint satisfaction constituents, and discuss possibilities for computational support.

Formally, given a set P of papers with |P| = p and a set R of reviewers with |R| = r, the goal of paper assignment is to find a binary matrix Ar×p such that Aij = 1 indicates the i-th reviewer has been assigned the j-th paper, and Aij = 0 otherwise. The assignment matrix should satisfy various constraints, the most typical of which are: each paper is reviewed by at least c reviewers (typically, c = 3); each reviewer is assigned no more than m papers, where m = O(pc/r); and reviewers should not be assigned papers for which they have a conflict of interest (this can be represented by a separate binary conflict matrix Cr×p). As this problem is underspecified, we will assume that further information is available in the form of a score matrix Mr×p expressing for each paper-reviewer pair how well they are matched by means of a non-negative number (higher means a better match). The best allocation is then the one that maximizes the element-wise matrix product Σij Aij Mij while satisfying all constraints.44
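As an illustration, this allocation problem can be posed directly as an integer linear program. The following sketch (not from the original article) uses the open source PuLP modeling library; the score matrix M, the conflict matrix C, and the constants c and m are assumed given, and only the simple constraints just listed are modeled:

```python
import pulp

def allocate(M, C, c=3, m=None):
    """Maximize sum_ij A_ij * M_ij subject to coverage, load, and
    conflict-of-interest constraints. A sketch, not a production tool."""
    r, p = len(M), len(M[0])
    m = m if m is not None else -(-p * c // r)  # default load, m = O(pc/r)
    prob = pulp.LpProblem("paper_assignment", pulp.LpMaximize)
    A = pulp.LpVariable.dicts("A", (range(r), range(p)), cat="Binary")
    # Objective: total matching score of the assignment.
    prob += pulp.lpSum(M[i][j] * A[i][j] for i in range(r) for j in range(p))
    for j in range(p):  # each paper reviewed by at least c reviewers
        prob += pulp.lpSum(A[i][j] for i in range(r)) >= c
    for i in range(r):  # each reviewer assigned at most m papers
        prob += pulp.lpSum(A[i][j] for j in range(p)) <= m
    for i in range(r):  # respect the conflict-of-interest matrix C
        for j in range(p):
            if C[i][j]:
                prob += A[i][j] == 0
    prob.solve()
    return [[int(A[i][j].value()) for j in range(p)] for i in range(r)]
```

Taylor44 discusses optimal assignment formulations of this kind; real CMS implementations add further constraints and use dedicated solvers.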
This one-dimensional definition of 'best' does not guarantee the best set of reviewers if a paper covers multiple topics: for example, a paper on machine learning and optimization could be assigned three reviewers who are machine learning experts but none who are optimization experts. This shortcoming can be addressed by replacing R with the set Rc such that each c-tuple ∈ Rc represents a possible assignment of c reviewers.24,25,42 Recent works add explicit constraints on topic coverage to incorporate multiple dimensions into the definition of best allocation.26,31,40 Other types of constraints have also been considered, including geographical distribution and fairness of assignments, as have alternative constraint solver algorithms.3,19,20,43 The score matrix can come from different sources, possibly a combination. Here, we review three possible sources: feature-based matching, profile-based matching, and bidding.

Feature-based matching. To aid assigning submitted papers to reviewers, a short list of subject keywords is often required by mainstream CMS tools as part of the submission process, either from a controlled vocabulary, such as the ACM Computing Classification System (CCS),a or as a free-text "folksonomy." As well as collecting keywords for the submitted papers, taking the further step of also requesting subject keywords from the body of potential reviewers enables CMS tools to make a straightforward match between the papers and the reviewers based on a count of the number of keywords they have in common. For each paper the reviewers can then be ranked in order of the number of matching keywords.

a http://www.acm.org/about/class/ (The examples in this article refer to ACM's 1998 CCS, which was recently updated.)
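A minimal sketch of this keyword-overlap ranking (the keyword sets below are invented for illustration; CMS tools implement variations of this scheme):

```python
def keyword_score(paper_kw, reviewer_kw):
    """Count the keywords shared by a paper and a reviewer."""
    return len(set(paper_kw) & set(reviewer_kw))

paper = {"machine learning", "optimization"}
reviewers = {
    "r1": {"machine learning", "data mining"},
    "r2": {"optimization", "machine learning", "constraints"},
    "r3": {"databases"},
}
ranked = sorted(reviewers, key=lambda r: keyword_score(paper, reviewers[r]),
                reverse=True)
print(ranked)  # ['r2', 'r1', 'r3']
```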
If the number of keywords associated with each paper and each reviewer is not fixed, then the comparison may be normalized by the CMS to avoid overly favoring longer lists of keywords. If the overall vocabulary from which keywords are chosen is small, then the concepts they represent will necessarily be broad and likely to result in more matches. Conversely, if the vocabulary is large, as in the case of free text or the ACM CCS, then the concepts represented will be finer grained but the number of matches is more likely to be small or even non-existent. Also, manually assigning keywords to define the subject of written material is inherently subjective. In the medical domain, where taxonomic classification schemes are commonplace, it has been demonstrated that different experts, or even the same expert over time, may be inconsistent in their choice of keywords.6,7

When a pair of keywords does not literally match, despite having been chosen to refer to the same underlying concept, one technique often used to improve matching is to also match their synonyms or syntactic variants, as defined in a thesaurus or dictionary of abbreviations—for example, treating 'code inspection' and 'walkthrough' as equivalent; likewise for 'SVM' and 'support vector machine' or 'λ-calculus' and 'lambda calculus.' However, if such simple equivalence classes are not sufficient to capture important differences between subjects—for example, if the difference between 'code inspection' and 'walkthrough' is significant—then an alternative technique is to exploit the hierarchical structure of a concept taxonomy in order to represent the distance between concepts. In this setting, a match can be based on the common ancestors of concepts, either counting the number of shared ancestors or computing some edge traversal distance between a pair of concepts. For example, the former ACM CCS concept 'D.1.6 Logic Programming' has ancestors 'D.1 Programming Techniques' and 'D. Software,' both of which are shared by the concept 'D.1.5 Object-oriented Programming,' meaning that D.1.5 and D.1.6 have a non-zero similarity because they have common ancestors.
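A sketch of the shared-ancestor measure on the CCS fragment just mentioned; the taxonomy is encoded as a child-to-parents mapping, and because taxonomies may be graphs rather than trees, every parent path is followed:

```python
# Toy fragment of the 1998 ACM CCS, as a child -> parents mapping (a DAG).
PARENTS = {
    "D.1.6 Logic Programming": ["D.1 Programming Techniques"],
    "D.1.5 Object-oriented Programming": ["D.1 Programming Techniques"],
    "D.1 Programming Techniques": ["D. Software"],
    "D. Software": [],
}

def ancestors(concept):
    """Return the set of all ancestors of a concept, following every path."""
    result, stack = set(), list(PARENTS.get(concept, []))
    while stack:
        node = stack.pop()
        if node not in result:
            result.add(node)
            stack.extend(PARENTS.get(node, []))
    return result

def shared_ancestor_similarity(c1, c2):
    return len(ancestors(c1) & ancestors(c2))

print(shared_ancestor_similarity("D.1.5 Object-oriented Programming",
                                 "D.1.6 Logic Programming"))
# -> 2 ('D.1 Programming Techniques' and 'D. Software')
```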

Obtaining a useful representation of concept similarity from a taxonomy is challenging because the measures tend to assume uniform coverage of the concept space, such that the hierarchy is a balanced tree. The approach is further complicated as it is common for certain concepts to appear at multiple places in a hierarchy—that is, taxonomies may be graphs rather than just trees—and consequently there may be multiple paths between a pair of concepts. The situation grows worse still if different taxonomies are used to describe the subject of written works from different sources, because a mapping between the taxonomies is required. Thus, it is not surprising that one of the most common findings in the literature on ontology engineering is that ontologies, including taxonomies, thesauri, and dictionaries, are difficult to develop, maintain, and use.12

So, even with good CMS support, keyword-based matching still requires manual effort and subjective decisions from authors, reviewers and, sometimes, ontology engineers. One useful aspect of feature-based matching using keywords is that it allows us to turn a heterogeneous matching problem (papers against reviewers) into a homogeneous one (paper keywords against reviewer keywords). Such keywords are thus a simple example of profiles that are used to describe relevant entities (papers and reviewers). Next, we take the idea of profile-based matching a step further by employing a more general notion of profile that incorporates nonfeature-based representations such as bags of words.

Automatic feature construction with profile-based matching. The main idea of profile-based matching is to automatically build representations of semantically relevant aspects of both papers and reviewers in order to facilitate construction of a score matrix. An obvious choice of such a representation for papers is as a weighted bag-of-words (see "The Vector Space Model" sidebar). We then need to build similar profiles of reviewers. For this purpose we can represent a reviewer by the collection of all their authored or co-authored papers, as indexed by some online repository such as DBLP29 or Google Scholar. This collection can be turned into a profile in several ways, including: build the profile from a single document or Web page containing the bibliographic details of the reviewer's publications (see the "SubSift and MLj-Matcher" sidebar); or retrieve, or let the reviewer upload, full text of (selected) papers, which are then individually converted into the required representation and collectively averaged to form the profile (see the "Toronto Paper Matching System" (TPMS) sidebar). Once both the papers and the reviewers have been profiled, the score matrix M can be populated with the cosine similarity between the term weight vectors of each paper-reviewer pair.

The Vector Space Model

The canonical task in information retrieval is, given a query in the form of a list of words (terms), to rank a set of text documents D in order of their similarity to the query. In the vector space model, each document d ∈ D is represented as the multiset of terms (bag-of-words) occurring in that document. The set of distinct terms in D, the vocabulary V, defines a vector space with dimensionality |V|, and thus each document d is represented as a vector d in this space. The query q can also be represented as a vector q in this space, assuming it shares vocabulary V. The query and a document are considered similar if the angle θ between their vectors is small. The angle can be conveniently captured by its cosine, q · d / (||q|| · ||d||), giving rise to the cosine similarity.

However, if raw term counts are used in the vectors q and d then similarity will: (i) be biased in favor of long documents; and (ii) treat all terms as equally important, irrespective of how commonly they occur across all documents. The term frequency–inverse document frequency (tf-idf) weighting scheme compensates for (i) by normalizing term counts within a document by the total number of terms in that document, and for (ii) by penalizing terms that occur in many documents, as follows. The term frequency of term ti in document dj is tfij = nij / Σk nkj, where term count nij is the number of times term ti occurs in document dj. The inverse document frequency of term ti is idfi = log(|D| / dfi), where document frequency dfi is the number of documents in D in which term ti occurs. A term that occurs often in a document has high term frequency; if it occurs rarely in other documents it has high inverse document frequency. The product of the two, tf-idf, thus expresses the extent to which a term characterizes a document relative to other documents in D.
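As a concrete sketch of how such a score matrix M might be populated (the two document collections are hypothetical; a library such as scikit-learn supplies the tf-idf weighting and cosine similarity described in the sidebar above):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical inputs: one text per reviewer (for example, concatenated
# publication titles) and one text per submitted paper (for example, abstract).
reviewer_docs = ["kernel methods support vector machines classification",
                 "logic programming inductive reasoning prolog"]
paper_docs = ["we present a new support vector machine training method",
              "a survey of answer set programming and logic semantics"]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(reviewer_docs + paper_docs)  # shared vocabulary V
R_vecs = X[:len(reviewer_docs)]
P_vecs = X[len(reviewer_docs):]
M = cosine_similarity(R_vecs, P_vecs)  # r-by-p score matrix
```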
Profile-based methods for matching papers with reviewers exploit the intuitive idea that the published works of reviewers, in some sense, describe their specific research interests and expertise. By analyzing these published works in relation to the body as a whole, discriminating profiles may be produced that effectively characterize reviewer expertise from the content of existing heterogeneous documents, ranging from traditional academic papers to websites, blog posts, and social media. Such profiles have applications in their own right but can also be used to compare one body of documents to another, ranking arbitrary combinations of documents and, by proxy, individuals by their similarity to each other.

From a machine learning point of view, profile-based matching differs from feature-based matching in that the profiles are constructed in a data-driven way, without the need to come up with a set of keywords. However, the number of possible terms in a profile can be huge, and so systems like TPMS use automatic topic extraction as a form of dimensionality reduction, resulting in profiles with terms chosen from a limited number of keywords (topics). As a useful by-product of profiling, each paper and each reviewer is characterized by a ranked list of terms, which can be seen as automatically constructed features that could be further exploited, for instance to allocate accepted papers to sessions or to make clear the relative contribution of individual terms to a similarity score (see the "SubSift and MLj-Matcher" sidebar).
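Topic extraction of this general kind can be sketched with an off-the-shelf LDA implementation—a simplified stand-in for TPMS's own Bayesian scoring model, which is described in the sidebar that follows:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical corpus: the reviewer profile documents and paper abstracts
# from the previous sketch.
docs = reviewer_docs + paper_docs
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=10, random_state=0)
topics = lda.fit_transform(counts)  # one topic distribution per document
# Profiles are now low-dimensional topic vectors rather than raw term
# vectors, and matching can proceed with cosine similarity as before.
```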
Toronto Paper Matching System

The Toronto Paper Matching System (TPMS, papermatching.cs.toronto.edu) originated as a standalone paper assignment recommender for the NIPS 2010 conference and was subsequently loosely integrated with Microsoft's Conference Management Toolkit (CMT) to streamline access to paper submissions for ICML 2012. TPMS requires reviewers to upload a selection of their own papers, reports, and other self-selected textual documents, which are then analyzed to produce their reviewer profile. This places control over the scope of the profile in the hands of the reviewers themselves, so that they need only include publications about topics they are prepared to review. Once uploaded, TPMS persists the documents and resultant profile beyond the scope of a single conference, allowing reviewers to reuse the same profile for future conferences, curating their own set of characteristic documents as they see fit.

The scoring model used is similar to the vector space model but takes a Bayesian approach. In addition, profiles in TPMS can be expressed over a set of hypothesized topics rather than raw terms. Topics are modeled as hidden variables that can be estimated using techniques such as Latent Dirichlet Allocation.4,8 This increased expressivity comes at the cost of requiring more training data to stave off the danger of overfitting.

Bidding. A relatively recent trend is to transfer some of the paper allocation task downstream to the reviewers themselves, giving them access to the full range of submitted papers and asking them to bid on papers they would like to review. Existing CMS tools offer support for various bidding schemes, including: allocation of a fixed number of 'points' across an arbitrary number of papers, selection of the top k papers, and rating willingness to review papers according to strength of bid, as well as combinations of these. Hence, bidding can be seen as an alternative way to come up with the score matrix that is required for the paper allocation process. There is also the opportunity to register conflicts of interest, if a reviewer's relations with the authors of a particular paper are such that the reviewer is not a suitable reviewer for that paper.

While it is in a reviewer's self-interest to bid, invariably not all reviewers will do so, in which case the papers they are allocated for review may well not be a good match for their expertise and interests. This can be irritating for the reviewer but is particularly frustrating for the authors of the papers concerned. The absence of bids from some reviewers can also reduce the fairness of allocation algorithms in CMS tools.19 Default options in the bidding process are unable to alleviate this: if the default is "I cannot review this," the reviewer is effectively excluded from the allocation process, while if the default is to indicate some minimal willingness to review a paper, the reviewer is effectively used as a wildcard and will receive those papers that are most difficult to allocate.

A hybrid of profile-based matching and manual bidding was explored for the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining in 2009. At bidding time the reviewers were presented with initial bids, obtained by matching reviewer publication records on DBLP with paper abstracts (see the "Experience from SIGKDD'09" sidebar for details), as a starting point. Several PC members reported they considered these bids good enough to relieve them from the temptation to change them, although we feel there is considerable scope to improve both the quality of recommendations and the user interface in future work. ICML 2012 further explored the use of a hybrid model and a pre-ranked list of suggested bids.b The TPMS software used at ICML 2012 offers other scoring models for combining bids with profile-based expertise assessment.8,9 Effective automatic bid initialization would address the aforementioned problem caused by non-bidding reviewers.

b ICML 2012 reviewing; http://hunch.net/?p=2407

SubSift and MLj-Matcher

SubSift, short for 'submission sifting,' was originally developed to support paper assignment at SIGKDD'09 and subsequently generalized into a family of Web services and re-usable Web tools (www.subsift.com). The submission sifting tool composes several SubSift Web services into a workflow driven by a wizard-like user interface that takes the program chair through a series of Web forms of a paper-reviewer profiling and matching process.

On the first form, a list of PC member names is entered. SubSift looks up these names on DBLP and suggests author pages which, after any required disambiguation, are used as documents to profile the PC members. Behind the scenes, beginning from a list of bookmarks (URLs), SubSift's harvester robot fetches one or more DBLP pages per author, extracts all publication titles from each page, and aggregates them into a single document per author. In the next form, the conference paper abstracts are uploaded as a CSV file and their text is used to profile the papers. After matching PC member profiles against paper profiles, SubSift produces reports with ranked lists of papers per reviewer, and ranked lists of reviewers per paper. Optionally, by manually specifying threshold similarity scores or by specifying absolute quantities, a CSV file can be downloaded with initial bid assignments for upload into a CMS.

For the editor-in-chief of a journal, the task of assigning a paper to a member of the editorial board for review can be viewed as a special case of the conference paper assignment problem (without bidding), where the emphasis is on finding the best match for one or a few papers. We built an alternative user interface to SubSift that supports paper assignment for journals. Known as MLj-Matcher in its original incarnation, this tool has been used since 2010 to support paper assignment for the Machine Learning journal as well as other journals.

Experience from SIGKDD'09

Our own experience with bespoke tools to support the research paper review process started when Flach was appointed, with Mohammed Zaki from Rensselaer Polytechnic Institute, program co-chair of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2009 (SIGKDD'09). The initial SubSift tools were written by members of the Bristol Intelligent Systems Laboratory with external collaborators at Microsoft Research Cambridge. As reported in Flach et al.,18 the SubSift tools assisted in the allocation of 537 submitted research papers to 199 reviewers.

Using these tools, each reviewer's bids were initialized using a weighted sum of the cosine similarity between the paper's abstract and the reviewer's publication titles, as listed in the DBLP computer science online bibliography,29 and the number of shared subject areas (keywords). The combined similarity scores were discretized into four bins using manually chosen thresholds, with the first bin being a 0 (no-bid) and the other three being bids of increasing strength: 1 (at a pinch), 2 (willing), and 3 (eager). These initial bids were exported from SubSift and imported into the conference management tool (Microsoft CMT, cmt.research.microsoft.com).

Based on the same similarity information, each reviewer was sent an email containing a link to a personalized SubSift-generated Web page listing details of all 537 papers, ordered by initial bid allocation or by either of its two components: keyword matches or similarity to their own published works. The page also listed the keywords extracted from the reviewer's own publications and those from each of the submitted papers. Guided by this personalized perspective, plus the usual titles and abstracts, reviewers affirmed or revised their bids recorded in the conference management tool.

To quantitatively evaluate the performance of the SubSift tools, the bids made by reviewers were considered to be the 'correct assignments' against which SubSift's automated assignments were compared. Disregarding the level of bid, a median of 88.2% of the papers recommended by SubSift were subsequently included in the reviewers' own bids (precision). Furthermore, a median of 80.0% of the papers on which reviewers bid were ones initially recommended to them by SubSift (recall). These results suggest that the papers eventually bid on by reviewers were largely drawn from those that were assigned non-zero bids by SubSift. These results on real-world data in a practical setting are comparable with other published results using language models.
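The discretization step lends itself to a one-line sketch (the threshold values here are invented; SIGKDD'09's were chosen manually by inspection):

```python
import numpy as np

scores = np.array([0.02, 0.15, 0.33, 0.75])  # combined similarity scores
thresholds = [0.1, 0.3, 0.6]                 # placeholder bin boundaries
bids = np.digitize(scores, thresholds)
# 0 = no-bid, 1 = at a pinch, 2 = willing, 3 = eager
print(bids)  # [0 1 2 3]
```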
Reviewer Score Calibration
Assuming a high-quality paper assignment has been achieved by means of one of the methods described earlier, reviewers are now asked to honestly assess the quality and novelty of a paper and its suitability for the chosen venue (conference or journal). There are different ways in which this assessment can be expressed: from a simple yes/no answer to the question "If it was entirely up to you, would you accept this paper?," via a graded answer on a more common five- or seven-point scale (for example, Strong Accept (3); Accept (2); Weak Accept (1); Neutral (0); Weak Reject (−1); Reject (−2); Strong Reject (−3)), to graded answers to a set of questions aiming to characterize different aspects of the paper, such as novelty, impact, technical quality, and so on.

Such answers require careful interpretation for at least two reasons. The first is that reviewers, and even area chairs, do not have complete information about the full set of submitted papers. This matters in a situation where the total number of papers that can be accepted is limited, as in most conferences (it is less of an issue for journals). The main reason why raw reviewer scores are problematic is that different reviewers tend to use the scale(s) involved in different ways. For example, some reviewers tend to stay to the center of the scale while others tend to go more for the extremes. In this case it would be advisable to normalize the scores, for example by replacing them with z-scores. This corrects for differences in both mean scores and standard deviations among reviewers and is a simple example of reviewer score calibration.
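A sketch of this z-score calibration, with invented raw scores on the seven-point scale above:

```python
import numpy as np

raw = {
    "reviewer_a": [1, 0, 1, -1, 0],   # stays near the center of the scale
    "reviewer_b": [3, -3, 2, -2, 3],  # goes for the extremes
}

def z_scores(scores):
    """Normalize one reviewer's scores to zero mean and unit variance."""
    s = np.asarray(scores, dtype=float)
    sd = s.std()
    return (s - s.mean()) / sd if sd > 0 else s - s.mean()

calibrated = {rev: z_scores(s) for rev, s in raw.items()}
```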

In order to estimate a reviewer's score bias (do they tend to err on the accepting side or rather on the rejecting side?) and spread (do they tend to score more or less confidently?), we need a representative sample of papers with a reasonable distribution in quality. This is often problematic for a single conference, as the number of papers m reviewed by a single reviewer is too small to be representative, and there can be considerable variation in the quality of papers among different batches that should not be attributed to reviewers. It is, however, possible to get more information about reviewer bias and confidence by leveraging the fact that papers are reviewed by several reviewers. For SIGKDD'09 we used a generative probabilistic model, proposed by colleagues at Microsoft Research Cambridge, with latent (unobserved) variables that can be inferred by message-passing techniques such as Expectation Propagation.35 The latent variables include the true paper quality, the numerical score assigned by the reviewer, and the thresholds this particular reviewer uses to convert the numerical score to the observed recommendation on the seven-point scale. The calibration process is described in more detail in Flach et al.18

An interesting manifestation of reviewer variance came to light through an experiment with NIPS reviewing in 2014.27 The PC chairs decided to have one-tenth (166) of the submitted papers reviewed twice, each time by three reviewers and one area chair. It turned out the accept/reject recommendations of the two area chairs differed in about one quarter of the cases (43). Given an overall acceptance rate of 22.5%, roughly 38 of the 166 double-reviewed papers were accepted following the recommendation of one of the area chairs; about 22 of these would have been rejected if the recommendation of the other area chair had been followed instead (assuming the disagreements were uniformly distributed over the two possibilities), which suggests that more than half (57%) of the accepted papers would not have made it to the conference if reviewed a second time.
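The arithmetic behind these estimates, spelled out:

```python
n = 166            # papers reviewed by two independent committees
disagree = 43      # papers with conflicting accept/reject recommendations
accepted = 0.225 * n   # ~37.4, i.e. roughly 38 accepted papers
flipped = disagree / 2  # ~21.5 (~22) accepted by one committee but rejected
                        # by the other, assuming disagreements split evenly
print(int(100 * flipped / accepted))  # -> 57, i.e. 57%
```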
What can be concluded from what came to be known as the "NIPS experiment," beyond these basic numbers, is up for debate. It is worth pointing out that, while the peer review process eventually leads to a binary accept/reject decision, paper quality most certainly is not binary: while a certain fraction of papers clearly deserves to be accepted, and another fraction clearly deserves to be rejected, the remaining papers have pros and cons that can be weighed up in different ways. So if two reviewers assign different scores to a paper, this doesn't mean that one of them is wrong, but rather that they picked up on different aspects of the paper in different ways.

We suggest a good way forward is to think of the reviewer's job as being to "profile" the paper in terms of its strong and weak points, and to separate the reviewing job proper from the eventual accept/reject decision. One could imagine a situation where a submitted paper could go to a number of venues (including the 'null' venue), and the reviewing task is to help decide which of these venues is the most appropriate one. This would turn the peer review process into a matching process, where publication venues have a distinct profile (whether a venue accepts theoretical or applied papers, whether it puts more value on novelty or on technical depth, among others) to be matched by the submission's profile as decided by the peer review process. Indeed, some conferences already have a separate journal track that implies some form of reviewing process to decide which venue is the most suitable one.c

c For example, the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) has a journal track where accepted papers are presented at the conference but published either in the Machine Learning journal or in Data Mining and Knowledge Discovery.

Assembling Peer Review Panels
The formation of a pool of reviewers, whether for conferences, journals, or funding competitions, is a non-trivial process that seeks to balance a range of objective and subjective factors. In practice, the actual process by which a program chair assembles a program committee varies from, at one extreme, inviting friends and co-authors plus their friends and co-authors, through to the other extreme of a formalized election and representation mechanism. The current generation of CMSs do not offer computational support for the formation of a balanced program committee; they assume prior existence of the list of potential reviewers and instead concentrate on supporting the administrative workflow of issuing and accepting invitations.

Expert finding. This lack of tool support is surprising considering the body of relevant work in the long-established field of expert finding.2,11,15,34,47 Over the years since the first Text Retrieval Conference (TREC) in 1992, the task of finding experts on a particular topic has featured regularly in this long-running conference series and is now an active subfield of the broader text information retrieval discipline. Expert finding has a degree of overlap with the fields of bibliometrics, the quantitative analysis of academic publications and other research-related literature,21,38 and scientometrics, which extends the scope to include grants, patents, discoveries, data outputs and, in the U.K., more abstract concepts such as 'impact.'5 Expert finding tends to be more profile-based (for example, based on the text of documents) than link-based (for example, based on cross-references between documents), although content analysis is an active area of bibliometrics in particular and has been used in combination with citation properties to link research topics to specific authors.11 Even though, by comparison with bibliometrics, scientometrics encompasses additional measures, in practice the dominant approach in both domains is citation analysis of the academic literature. Citation analysis measures the properties of networks of citation among publications and has much in common with hyperlink analysis on the Web, where these measures employ similar graph-theoretic methods designed to model reputation, with notable examples including Hubs and Authorities, and PageRank. Citation graph analysis, using a particle-swarm algorithm, has been used to suggest potential reviewers for a paper on the premise that the subject of a paper is characterized by the authors it cites.39
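A toy illustration of these graph-theoretic reputation measures on an invented citation graph, using the networkx library (the particle-swarm algorithm of Rodriguez and Bollen39 is considerably more elaborate):

```python
import networkx as nx

# Hypothetical directed citation graph: an edge u -> v means u cites v.
G = nx.DiGraph([("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p4", "p3")])
rank = nx.pagerank(G)            # reputation score per paper
hubs, authorities = nx.hits(G)   # Hubs and Authorities scores
print(max(rank, key=rank.get))   # -> 'p3', the most-cited paper
```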

Harvard's Profiles Research Network Software (RNS)d exploits both graph-based and text-based methods. By mining high-quality bibliographic metadata from sources like PubMed, Profiles RNS infers implicit networks based on keywords, co-authors, department, location, and similar research. Researchers can also define their own explicit networks and curate their list of keywords and publications. Profiles RNS supports expert finding via a rich set of searching and browsing functions for traversing these networks. Profiles RNS is a noteworthy open source example of a growing body of research intelligence tools that compete to provide definitive databases of academics and that, while varying in scope, scale, and features, collectively constitute a valuable resource for a program chair seeking new reviewers. Well-known examples include free sites like academia.edu, Google Scholar, Mendeley, Microsoft Academic Search, ResearchGate, and numerous others that mine public data or solicit data directly from researchers themselves, as well as pay-to-use offerings like Elsevier's Reviewer Finder.

d http://profiles.catalyst.harvard.edu

Data issues. There is a wealth of publicly available data about the expertise of researchers that could, in principle, be used to profile program committee members (without requiring them to choose keywords or upload papers) or to suggest a ranked list of candidate invitees for any given set of topics. Obvious data sources include academic home pages, online bibliographies, grant awards, job titles, research group membership, and events attended, as well as membership of professional bodies and other reviewer pools. Despite the availability of such data, there are a number of problems in using it for the purpose of finding an expert on a particular topic.

If the data is to be located and used automatically then it is necessary to identify the individual or individuals described by the data. Unfortunately, a person's name is not guaranteed to be a unique identifier (UID): names are often not globally unique in the first place, and they can also be changed through title, choice, marriage, and so on. Matters are made worse because many academic reference styles use abbreviated forms of a name using initials. International variations in word ordering, character sets, and alternative spellings make name resolution even more challenging for a peer review tool. Indeed, the problem of author disambiguation is sufficiently challenging to have merited the investment of considerable research effort over the years, which has in turn led to practical tool development in areas with similar requirements to finding potential peer reviewers. For instance, Profiles RNS supports finding researchers with specific expertise and includes an Author Disambiguation Engine using factors such as name permutations, email address, institution affiliations, known co-authors, journal titles, subject areas, and keywords.

To address these problems in their own record systems, publishers and free bibliographic databases like DBLP and Google Scholar have developed their own proprietary UID schemes for identifying contributors to published works. However, there is now considerable momentum behind the non-proprietary Open Researcher and Contributor ID (ORCID)e and publishers are increasingly mapping their own UIDs onto ORCID UIDs. A subtle problem remains for peer review tools when associating data, particularly academic publications, with an individual researcher, because a great deal of academic work is attributed to multiple contributors. Hope for resolving individual contributions comes from a concerted effort to better document all outputs of research, including not only papers but also websites, datasets, and software, through richer metadata descriptions of Research Objects.10

e http://orcid.org

Balance and coverage. Finding candidate reviewers is only part of a program chair's task in forming a committee—attention must also be paid to coverage and balance. It is important to ensure more popular areas get proportionately more coverage than less popular ones, while also not excluding less well known but potentially important new areas. Thus, there is a subjective element to balance and coverage that is not entirely captured by the score matrix. Recent work seeks to address this for conferences by refining clusters, computed from a score matrix, using a form of crowdsourcing from the program committee and from the authors of accepted papers.1 Another example of computational support for assembling a balanced set of reviewers comes not from conferences but from a U.S. funding agency, the National Science Foundation (NSF).

The NSF presides over a budget of over $7.7 billion (FY 2016) and receives 40,000 proposals per year, with large competitions attracting 500–1,500 proposals; peer review is part of the NSF's core business. Approximately a decade ago, the NSF developed Revaide, a data-mining tool to help them find proposal reviewers and to build panels with expertise appropriate to the subjects of received proposals.22 In constructing profiles of potential reviewers the NSF decided against using bibliographic databases like CiteSeer or Google Scholar, for the same reasons we discussed earlier. Instead they took a closed-world approach by restricting the set of potential reviewers to authors of past (single-author) proposals that had been judged 'fundable' by the review process. This ensured the availability of a UID for each author and reliable metadata, including the author's name and institution, which facilitated conflict of interest detection. Reviewer profiles were constructed from the text of their past proposal documents (including references and résumés) as a vector of the top 20 terms with the highest tf-idf scores. Such documents were known to be all of similar length and style, which improved the relevance of the resultant tf-idf scores. The same is also true of the proposals to be reviewed, and so profiles of the same type were constructed for these.
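Profile construction of this kind is a small variation on the earlier tf-idf sketch: keep only each document's 20 highest-weighted terms (the corpus loader below is hypothetical):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

proposal_texts = load_past_proposals()  # hypothetical loader, one text per author
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(proposal_texts)
terms = vec.get_feature_names_out()
profiles = [
    [terms[t] for t in np.argsort(row.toarray().ravel())[::-1][:20]]
    for row in X  # top 20 tf-idf terms per author
]
```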
For a machine learning researcher, an obvious next step toward forming panels with appropriate coverage for the topics of the submissions would be to cluster the profiles of received proposals and use the resultant clusters as the basis for panels, for example, matching potential reviewers against a prototypical member of the cluster. Indeed, prior to Revaide the NSF had experimented with the use of automated clustering for panel formation, but those attempts had proved unsuccessful for a number of reasons: the sizes of clusters tended to be uneven; clusters exhibited poor stability as new proposals arrived incrementally; there was a lack of alignment of panels with the NSF organizational structure; and, similarly, no alignment with specific competition goals, such as increasing participation of underrepresented groups or creating results of interest to industry. So, eschewing clustering, Revaide instead supported the established manual process by annotating each proposal with its top 20 terms as a practical alternative to manually supplied keywords.

Other ideas for tool support in panel formation were considered. Inspired by conference peer review, NSF experimented with bidding but found that reviewers had strong preferences toward well-known researchers, and this approach failed to ensure there were reviewers from all contributing disciplines of a multidisciplinary proposal—a particular concern for NSF. Again, manual processes won out. However, Revaide did find a valuable role for clustering techniques as a way of checking manual assignments of proposals to panels. To do this, Revaide calculated an "average" vector for each panel, by taking the central point of the vectors of its panel members, and then compared each proposal's vector against every panel. If a proposal's assigned panel is not its closest panel then the program director is warned. Using this method, Revaide proposed better assignments for 5% of all proposals. Using the same representation, Revaide was also used to classify orphaned proposals, suggesting a suitable panel. Although the classifier was only 80% accurate, which is clearly not good enough for a fully automated assignment, it played a valuable role within the NSF workflow: instead of each program director having to sift through, say, 1,000 orphaned proposals, they received an initial assignment of, say, 100, of which they would need to reassign around 20 to other panels.
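Under the stated assumptions (panel centroids as the mean of member profile vectors; nearest panel by cosine similarity), Revaide's consistency check might look like this sketch:

```python
import numpy as np

def flag_misassigned(member_vecs, proposal_vecs, assigned):
    """member_vecs: one array of member profile vectors per panel;
    proposal_vecs: (n_proposals, n_terms) array; assigned: panel index
    per proposal. Returns indices of proposals whose assigned panel is
    not their closest panel, so the program director can be warned."""
    centroids = np.vstack([np.mean(m, axis=0) for m in member_vecs])
    def normalize(X):
        return X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = normalize(proposal_vecs) @ normalize(centroids).T
    return np.flatnonzero(sims.argmax(axis=1) != np.asarray(assigned))
```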


Conclusion and Outlook
We have demonstrated that state-of-the-art tools from machine learning and artificial intelligence are making inroads to automate and improve parts of the peer review process. Allocating papers (or grant proposals) to reviewers is an area where much progress has been made. The combinatorial allocation problem can easily be solved once we have a score matrix assessing for each paper-reviewer pair how well they are matched.f We have described a range of techniques from information retrieval and machine learning that can produce such a score matrix. The notion of profiles (of reviewers as well as papers) is useful here as it turns a heterogeneous matching problem into a homogeneous one. Such profiles can be formulated against a fixed vocabulary (bag-of-words) or against a small set of topics. Although it is fashionable in machine learning to treat such topics as latent variables that can be learned from data, we have found stability issues with latent topic models (that is, adding a few documents to a collection can completely change the learned topics) and have started to experiment with handcrafted topics (for example, encyclopedia or Wikipedia entries) that extend keywords by allowing their own bag-of-words representations.

f This holds for the simple version stated earlier, but further constraints might complicate the allocation problem.

A perhaps less commonly studied area where nevertheless progress has been achieved concerns interpretation and calibration of the intermediate output of the peer reviewing process: the aspects of the reviews that feed into the decision-making process. In their simplest form these are scores on an ordinal scale that are often simply averaged. However, averaging assessments from different assessors—which is common in other areas as well, for example, grading coursework—is fraught with difficulties, as it makes the unrealistic assumption that each assessor scores on the same scale. It is possible to adjust for differences between individual reviewers, particularly when a reviewing history is available that spans multiple conferences. Such a global reviewing system that builds up persistent reviewer (and author) profiles is something that we support in principle, although many details need to be worked out before this is viable.

We also believe it would be beneficial if the role of individual reviewers shifted away from being an ersatz judge attempting to answer the question "Would you accept this paper if it was entirely up to you?" toward a more constructive role of characterizing—and indeed, profiling—the paper under submission. Put differently, besides suggestions for improvement to the authors, the reviewers attempt to collect metadata about the paper that is used further down the pipeline to decide the most suitable publication venue. In principle, this would make it feasible to decouple the reviewing process from individual venues, something that would also enable better load balancing and scaling.46 In such a system, authors and reviewers would be members of some central organization, which has the authority to assign papers to multiple publication venues—a futuristic scenario, perhaps, but it is worth thinking about the peculiar constraints that our current conference- and journal-driven system imposes, and which clearly leads to a sub-optimal situation in many respects.

The computational methods we described in this article have been used to support other academic processes outside of peer review, including a personalized conference planner app for delegates,g an organizational profiler,36 and a personalized course recommender for students based on their academic profile.41 The accompanying table presented a few other possible future directions for computational support of academic peer review itself. We hope that they, along with this article, stimulate our readers to think about ways in which the academic peer review process—this strange dance in which we all participate in one way or another—can be future-proofed in a sustainable and scalable way.

g http://www.subsift.com/ecmlpkdd2012/attending/apps/

References
1. André, P., Zhang, H., Kim, J., Chilton, L.B., Dow, S.P. and Miller, R.C. Community clustering: Leveraging an academic crowd to form coherent conference sessions. In Proceedings of the First AAAI Conference on Human Computation and Crowdsourcing (Palm Springs, CA, Nov. 7–9, 2013). B. Hartman and E. Horvitz, eds. AAAI, Palo Alto, CA.
2. Balog, K., Azzopardi, L. and de Rijke, M. Formal models for expert finding in enterprise corpora. In Proceedings of the 29th Annual International ACM Conference on Research and Development in Information Retrieval (2006). ACM, New York, NY, 43–50.
3. Benferhat, S. and Lang, J. Conference paper assignment. International Journal of Intelligent Systems 16, 10 (2001), 1183–1192.
4. Blei, D.M., Ng, A.Y. and Jordan, M.I. Latent Dirichlet allocation. J. Mach. Learn. Res. 3 (Mar. 2003), 993–1022.
5. Bornmann, L., Bowman, B., Bauer, J., Marx, W., Schier, H. and Palzenberger, M. Standards for using bibliometrics in the evaluation of research institutes. Next Generation Metrics, 2013.
6. Boxwala, A.A., Dierks, M., Keenan, M., Jackson, S., Hanscom, R., Bates, D.W. and Sato, L. Review paper: Organization and representation of patient safety data: Current status and issues around generalizability and scalability. J. American Medical Informatics Association 11, 6 (2004), 468–478.
7. Brixey, J., Johnson, T. and Zhang, J. Evaluating a medical error taxonomy. In Proceedings of the American Medical Informatics Association Symposium, 2002.
8. Charlin, L. and Zemel, R. The Toronto paper matching system: An automated paper-reviewer assignment system. In Proceedings of the ICML Workshop on Peer Reviewing and Publishing Models, 2013.
9. Charlin, L., Zemel, R. and Boutilier, C. A framework for optimizing paper matching. In Proceedings of the 27th Annual Conference on Uncertainty in Artificial Intelligence (Corvallis, OR, 2011). AUAI Press, 86–95.
10. De Roure, D. Towards computational research objects. In Proceedings of the 1st International Workshop on Digital Preservation of Research Methods and Artefacts (2013). ACM, New York, NY, 16–19.
11. Deng, H., King, I. and Lyu, M.R. Formal models for expert finding on DBLP bibliography data. In Proceedings of the 8th IEEE International Conference on Data Mining (2008). IEEE Computer Society, Washington, D.C., 163–172.
12. Devedzić, V. Understanding ontological engineering. Commun. ACM 45, 4 (Apr. 2002), 136–144.
13. Di Mauro, N., Basile, T. and Ferilli, S. GRAPE: An expert review assignment component for scientific conference management systems. Innovations in Applied Artificial Intelligence, LNCS 3533 (2005). M. Ali and F. Esposito, eds. Springer, Berlin Heidelberg, 789–798.
14. Dumais, S.T. and Nielsen, J. Automating the assignment of submitted manuscripts to reviewers. In Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1992). ACM, New York, NY, 233–244.
15. Fang, H. and Zhai, C. Probabilistic models for expert finding. In Proceedings of the 29th European Conference on IR Research (2007). Springer-Verlag, Berlin, Heidelberg, 418–430.
16. Ferilli, S., Di Mauro, N., Basile, T., Esposito, F. and Biba, M. Automatic topics identification for reviewer assignment. Advances in Applied Artificial Intelligence, LNCS 4031 (2006). M. Ali and R. Dapoigny, eds. Springer, Berlin Heidelberg, 721–730.
17. Flach, P. Machine Learning: The Art and Science of Algorithms That Make Sense of Data. Cambridge University Press, 2012.
18. Flach, P.A., Spiegler, S., Golénia, B., Price, S., Guiver, J., Herbrich, R., Graepel, T. and Zaki, M.J. Novel tools to streamline the conference review process: Experiences from SIGKDD'09. SIGKDD Explorations 11, 2 (Dec. 2009), 63–67.
19. Garg, N., Kavitha, T., Kumar, A., Mehlhorn, K. and Mestre, J. Assigning papers to referees. Algorithmica 58, 1 (Sept. 2010), 119–136.
20. Goldsmith, J. and Sloan, R.H. The AI conference paper assignment problem. In Proceedings of the 22nd AAAI Conference on Artificial Intelligence (2007).
21. Harnad, S. Open access scientometrics and the U.K. research assessment exercise. Scientometrics 79, 1 (Apr. 2009), 147–156.
22. Hettich, S. and Pazzani, M.J. Mining for proposal reviewers: Lessons learned at the National Science Foundation. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2006). ACM, New York, NY, 862–871.
23. Jennings, C. Quality and value: The true purpose of peer review. Nature, 2006.
24. Karimzadehgan, M. and Zhai, C. Integer linear programming for constrained multi-aspect committee review assignment. Inf. Process. Manage. 48, 4 (July 2012), 725–740.
25. Karimzadehgan, M., Zhai, C. and Belford, G. Multi-aspect expertise matching for review assignment. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (2008). ACM, New York, NY, 1113–1122.
26. Kou, N.M., U, L.H., Mamoulis, N. and Gong, Z. Weighted coverage based reviewer assignment. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM, New York, NY, 2031–2046.
27. Langford, J. and Guzdial, M. The arbitrariness of reviews, and advice for school administrators. Commun. ACM 58, 4 (Apr. 2015), 12–13.
28. Lawrence, P.A. The politics of publication. Nature 422 (Mar. 2003), 259–261.
29. Ley, M. The DBLP computer science bibliography: Evolution, research issues, perspectives. In Proceedings of the 9th International Symposium on String Processing and Information Retrieval (London, U.K., 2002). Springer-Verlag, 1–10.
30. Liu, X., Suel, T. and Memon, N. A robust model for paper reviewer assignment. In Proceedings of the 8th ACM Conference on Recommender Systems (2014). ACM, New York, NY, 25–32.
31. Long, C., Wong, R.C., Peng, Y. and Ye, L. On good and fair paper-reviewer assignment. In Proceedings of the 2013 IEEE 13th International Conference on Data Mining (Dallas, TX, Dec. 7–10, 2013), 1145–1150.
32. Mehlhorn, K., Vardi, M.Y. and Herbstritt, M. Publication culture in computing research (Dagstuhl Perspectives Workshop 12452). Dagstuhl Reports 2, 11 (2013).
33. Meyer, B., Choppy, C., Staunstrup, J. and van Leeuwen, J. Viewpoint: Research evaluation for computer science. Commun. ACM 52, 4 (Apr. 2009), 31–34.
34. Mimno, D. and McCallum, A. Expertise modeling for matching papers with reviewers. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, NY, 2007, 500–509.
35. Minka, T. Expectation propagation for approximate Bayesian inference. In Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence. J.S. Breese and D. Koller, eds. Morgan Kaufmann, 2001, 362–369.
36. Price, S. and Flach, P.A. Mining and mapping the research landscape. In Proceedings of the Digital Research Conference. University of Oxford, Sept. 2013.
37. Price, S., Flach, P.A., Spiegler, S., Bailey, C. and Rogers, N. SubSift Web services and workflows for profiling and comparing scientists and their published works. Future Generation Comp. Syst. 29, 2 (2013), 569–581.
38. Pritchard, A. et al. Statistical bibliography or bibliometrics? J. Documentation 25, 4 (1969), 348–349.
39. Rodriguez, M.A. and Bollen, J. An algorithm to determine peer-reviewers. In Proceedings of the 17th ACM Conference on Information and Knowledge Management. ACM, New York, NY, 319–328.
40. Sidiropoulos, N.D. and Tsakonas, E. Signal processing and optimization tools for conference review and session assignment. IEEE Signal Process. Mag. 32, 3 (2015), 141–155.
41. Surpatean, A., Smirnov, E.N. and Manie, N. Master orientation tool. ECAI 2012, Frontiers in Artificial Intelligence and Applications 242. L. De Raedt, C. Bessière, D. Dubois, P. Doherty, P. Frasconi, F. Heintz, and P.J.F. Lucas, eds. IOS Press, 2012, 995–996.
42. Tang, W., Tang, J., Lei, T., Tan, C., Gao, B. and Li, T. On optimization of expertise matching with various constraints. Neurocomputing 76, 1 (Jan. 2012), 71–83.
43. Tang, W., Tang, J. and Tan, C. Expertise matching via constraint-based optimization. In Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (Vol. 1). IEEE Computer Society, Washington, DC, 2010, 34–41.
44. Taylor, C.J. On the optimal assignment of conference papers to reviewers. Technical Report MS-CIS-08-30, Computer and Information Science Department, University of Pennsylvania, 2008.
45. Terry, D. Publish now, judge later. Commun. ACM 57, 1 (Jan. 2014), 44–46.
46. Vardi, M.Y. Scalable conferences. Commun. ACM 57, 1 (Jan. 2014), 5.
47. Yimam-Seid, D. and Kobsa, A. Expert finding systems for organizations: Problem and domain analysis and the DEMOIR approach. J. Organizational Computing and Electronic Commerce 13 (2003).

Simon Price (simon.price@bristol.ac.uk) is a Visiting Fellow in the Department of Computer Science at the University of Bristol, U.K., and a data scientist at Capgemini.

Peter A. Flach (peter.flach@bristol.ac.uk) is a professor of artificial intelligence in the Department of Computer Science at the University of Bristol, U.K., and editor-in-chief of the Machine Learning journal.

Copyright held by owners/authors.

Watch the authors discuss their work in this exclusive Communications video: http://cacm.acm.org/videos/computational-support-for-academic-peer-review