
BRIEF COMMUNICATION

What Is Behind the Curtain of the Leiden Ranking?

Rüdiger Mutz
Professorship for Social Psychology and Research on Higher Education, D-GESS, ETH Zurich, Muehlegasse
21, 8001 Zurich, Switzerland. E-mail: mutz@gess.ethz.ch

Hans-Dieter Daniel
Professorship for Social Psychology and Research on Higher Education, D-GESS, ETH Zurich, Muehlegasse
21, 8001 Zurich, Switzerland; Evaluation Office, University of Zurich, Muehlegasse 21, 8001 Zurich,
Switzerland. E-mail: daniel@gess.ethz.ch

Received May 5, 2014; revised May 28, 2014; accepted June 5, 2014

© 2015 The Authors. Journal of the Association for Information Science and Technology, 66(9):1950–1953, published by Wiley Periodicals, Inc. on behalf of ASIS&T. Published online 29 April 2015 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asi.23360. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial, and no modifications or adaptations are made.

Even with very well-documented rankings of universities, it is difficult for an individual university to reconstruct its position in the ranking. What determines whether a university places higher or lower in the ranking? Taking the example of ETH Zurich, the aim of this communication is to reconstruct how the high position of ETHZ (rank no. 1 in Europe in PP[top 10%]) in the Centre for Science and Technology Studies (CWTS) Leiden Ranking 2013 in the field "social sciences, arts and humanities" came about. According to our analyses, the bibliometric indicator values of a university depend very strongly on weights that result in differing estimates of both the total number of a university's publications and the number of publications with a citation impact in the 90th percentile, or PP(top 10%). In addition, we examine the effect of weights at the level of individual publications. Based on the results, we offer recommendations for improving the Leiden Ranking (for example, publication of sample calculations to increase transparency).

Introduction

Even when rankings of universities are very well documented, some uncertainty always remains with respect to how the rankings actually came about, and this applies also to the Leiden Ranking produced by the Centre for Science and Technology Studies (CWTS) at Leiden University (Waltman et al., 2012). According to the CWTS Leiden Ranking website:

The CWTS Leiden Ranking 2013 measures the scientific performance of 500 major universities worldwide. Using a sophisticated set of bibliometric indicators, the ranking aims to provide highly accurate measurements of the scientific impact of universities . . . The CWTS Leiden Ranking 2013 is based on Web of Science indexed publications from the period 2008–2011 (http://www.leidenranking.com/ranking/2013).

To be able to draw valid conclusions based on rankings, information on how the raw data have been weighted and aggregated is extremely important. Our look behind the curtain of the Leiden Ranking, which is at present the best research ranking of universities worldwide based on a rigorous data analytic framework, can reveal central assumptions and decisions on the part of the CWTS that affect results and of which the user is not at all aware when quickly interpreting ranking results. But knowledge of those assumptions and decisions is indispensable for correct interpretation and use of the results; criticisms of the ranking are set aside here (but see the next section).

Taking the example of ETH Zurich (ETHZ) in Switzerland and the Leiden Ranking main field of "social sciences, arts, and humanities" (SOC_HU), we reconstruct how a rank came about (citation impact, PP[top 10%]), also using the raw data made available to us by the CWTS. ETHZ is "one of the leading international universities for technology and the natural sciences" (www.ethz.ch/en/the-eth-zurich.html), which also has departments and research centers in SOC_HU (e.g., Department of Humanities, Social and Political Sciences; Competence Centre for the History of Knowledge). In the field of SOC_HU, ETHZ ranks number one in Europe in the Leiden Ranking 2013, with the highest percentage of PP(top 10%) publications (journal articles, reviews). Of the total 358 publications of the ETHZ, 16.6% are in the 10% most-cited publications in the field of SOC_HU.

FIG. 1. Screenshot of the EXCEL table (extract) of the 31 most-cited publications of the ETHZ (raw data) in the field of SOC_HU. The documents were
sorted in descending order by the total number of citations (CS). We did the fractional counting (fraction) shown here in column 1. [Color figure can be
viewed in the online issue, which is available at wileyonlinelibrary.com.]
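Since the figure itself cannot be reproduced here, a small, purely hypothetical Python sketch of the kind of raw-data table the caption describes may help: each record carries a total-citation count (CS), a journal-based field weight, and an address-based fraction, and the records are sorted in descending order by CS. All identifiers and values below are invented for illustration and are not the actual CWTS export.

```python
# Hypothetical sketch of the raw-data export described in the caption above.
# Column roles (CS, weight, fraction) follow the text; the records are invented.

records = [
    # (publication id, total citations CS, journal field weight, address fraction)
    ("pub_A", 120, 0.50, 0.33),
    ("pub_B",  45, 1.00, 1.00),
    ("pub_C",  30, 0.33, 0.50),
    ("pub_D",   8, 0.25, 0.20),
]

# Sort in descending order by the total number of citations (CS), as in Figure 1.
records.sort(key=lambda rec: rec[1], reverse=True)

for pub_id, cs, weight, fraction in records:
    print(f"{pub_id}: CS={cs:3d}  weight={weight:.2f}  fraction={fraction:.2f}")
```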

This basically favorable result immediately raises the question as to what disciplines and organizational units in the field "social sciences, arts, and humanities" contributed to ETHZ's top position in the ranking.

Organizational Units Versus Journal Sets

The main fields in the Leiden Ranking are defined based on journal subject categories, and they should on no account be equated with organizational units. Journal subject categories are journal sets(1) of selected subfields. For the ETH publications, a total of 54 subfields (that is, journal sets) are assigned to the main field of SOC_HU. The list of subfields includes, among others, Agricultural Economics & Policy, Economics, Education & Educational Research, History & Philosophy of Science, Information Science & Library Science, Law, Management, Political Science, Sociology, Transportation, Urban Studies, and Women's Studies. With this quite heterogeneous list of subfields, unambiguous assignment of the publications, and of the resulting rank position in the Leiden Ranking, to a particular organizational unit such as the Department of Humanities, Social and Political Sciences is not possible. For example, the most-cited article of ETHZ up to the end of 2012 in SOC_HU published in the analysis period from 2008 to 2011 (Figure 1, row 2) is Engel, Pagiola, and Wunder (2008). The first author's affiliation, however, is the Department of Environmental Systems Science, and not the Department of Humanities, Social and Political Sciences. The authors of the most-cited publications of the ETHZ (raw data) in the field of SOC_HU in Figure 1 are affiliated with the Department of Environmental Systems Science, just mentioned, the Department of Humanities, Social and Political Sciences, the Department of Management, Technology and Economics, the Department of Chemistry and Applied Biosciences, and the Department of Health Sciences and Technology, among others.

(1) The CWTS Leiden Ranking 2014, published on April 30th, 2014, differs from the CWTS Leiden Ranking 2013 in one respect: "The fields have been defined algorithmically in a unique bottom-up fashion based on millions of citation relations between publications in the Web of Science data base" (CWTS press release, Leiden Ranking 2014). "828 fields are distinguished. These fields are defined at the level of individual publications" (www.leidenranking.com/methodology/indicators).

Raw Data Versus "Advanced Parameters" Analysis

Here we take a closer look at the rank position of ETHZ in SOC_HU for the indicator PP(top 10%):

• What publications and citations are considered in SOC_HU?
• How is the total number of ETHZ publications in SOC_HU determined?
• How is the number of ETHZ publications in the top 10% calculated?
• What are the effects of weightings at the level of individual publications?
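Before turning to these questions, the journal-based field assignment described in the previous section can be illustrated with a short Python sketch. The journal-to-category mapping below is invented for illustration (it does not reproduce the actual Web of Science assignments); the point is only that publications reach SOC_HU via their journal's subject categories, never via the authors' departments.

```python
# Purely illustrative: publications are assigned to the SOC_HU main field via
# their journal's subject categories; the authors' departments play no role.
# The journal-to-category mapping here is invented, not actual Web of Science data.

journal_categories = {
    "Ecological Economics": ["Economics", "Environmental Sciences & Ecology"],
    "Energy Policy": ["Economics", "Energy & Fuels", "Environmental Sciences & Ecology"],
    "Scientometrics": ["Information Science & Library Science"],
}

# A small subset of the 54 subfields that the text says make up SOC_HU.
soc_hu_subfields = {"Economics", "Information Science & Library Science", "Sociology"}

def assigned_to_soc_hu(journal: str) -> bool:
    """A publication counts toward SOC_HU if at least one of its journal's
    subject categories is an SOC_HU subfield."""
    return any(cat in soc_hu_subfields for cat in journal_categories.get(journal, []))

for journal in journal_categories:
    print(journal, "->", "SOC_HU" if assigned_to_soc_hu(journal) else "not SOC_HU")
```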

What Publications and Citations Are Considered in SOC_HU?

All publications (articles, reviews) classified as SOC_HU in the time period 2008–2011/2012 were included in the Leiden Ranking, that is, publications up to 2011 and citations up to 2012 (author self-citations are not included).

How Is the Total Number of ETHZ Publications in SOC_HU Determined?

According to Figure 1, ETHZ published a total of 358 articles in the field of SOC_HU, of which 16.6% belong to the 10% most-cited publications in their journal set. However, when interpreting this finding it is important to note any "advanced parameters" selected for the ranking calculation. For instance, publications in special types of journals (journals not published in English, journals without a strong international focus, trade journals, and popular journals) are excluded. When articles published in special types of journals are included, the total number of articles published by ETHZ in SOC_HU increases to 441, if all other advanced parameters are selected.

But the advanced parameter with the greatest impact on the calculated total number of articles is "calculate impact indicators using fractional counting":

The full counting method gives equal weight to all publications of a university. The fractional counting method gives less weight to collaborative publications than to non-collaborative ones. For instance, if the address list of a publication contains five addresses and two of these addresses belong to a particular university, then the publication has a weight of 2 / 5 = 0.4 (www.leidenranking.com/methodology/indicators).

If this advanced parameter is not selected, but all other advanced parameters are, the result for ETHZ is 614 articles. And if the advanced parameter "exclude publications in special types of journals" is additionally deselected, ETHZ has 730 publications. But that number does not equal the actual number of ETHZ articles published in the main field of SOC_HU, which is, according to the raw data, 854 publications. The difference between the two numbers (730 vs. 854) is explained by the fact that some journals in the Web of Science belong to multiple fields (i.e., journal subject categories), and it may be that a journal belongs to one or more SOC_HU subfields and to one or more science, engineering, or life science fields.

Therefore, each publication has a weight that indicates how strongly the journal in which the article was published belongs to SOC_HU (CWTS, personal communication, 2013). If, for example, a journal belongs to an SOC_HU subfield and to three other, non-SOC_HU fields, the journal has a weight of 0.25 instead of the 1.0 that the journal would be given if it were assigned to the field of SOC_HU only. The EXCEL table provided by the CWTS (see excerpt in Figure 1) shows this in column H, "weight." The sum of the weights is 729.93 (∼730) publications. Thus, depending on the advanced parameters selected, the number of publications varies from 358 (with the advanced parameters chosen in the Leiden Ranking) to 854 (raw data set). Under different combinations of selection and deselection of advanced parameters, except for the calculation of size-independent indicators, the position of ETH Zurich does not change (first position). This is, however, not the case for all universities in the ranking (e.g., Imperial College London).

How Is the Number of ETHZ Publications in the Top 10% Calculated?

Weights also affect how citation impact is determined. The value of the indicator PP(top 10%) for ETHZ varies in the Leiden Ranking from 16.6% to 18.2%. The reason for this is the following:

Most publications either do or do not belong to the top 10% most highly cited of their publication year and field, but some publications are partially considered to be top 10% publications. This is the case if a publication belongs to multiple fields and is in the top 10% in some of these fields but not in others. It is also the case if the number of citations of a publication equals exactly the top 10% threshold (CWTS, personal communication, 2013; Waltman & Schreiber, 2013).

To take this into account, "partly top 10%" publications are given a weight below 1.0. PP(top 10%) is thus based on what is actually a binary variable at the level of the individual publication, with 1 for publications that belong to the top 10% and 0 for publications that do not.

Without any kind of weighting (PP(top 10%) > 0), based on the raw data, a total of 223 of 854 publications of the ETHZ are in the top 10% with regard to some reference distribution (multiple fields); that is, 26.1% versus 16.6% in the case of all selected "advanced parameters." With the advanced parameter "partly top 10%," the result is 133 publications in the top 10% highly cited papers. Finally, if all advanced parameters are selected (fractional counting, excluding publications in special types of journals, size-independent indicators), the number of publications in the top 10% is only 60. The number of ETHZ publications in the top 10% thus varies absolutely between 60 (with advanced parameters) and 223 (raw data) and in percent between 16.6% (with advanced parameters) and 26.1% (raw data). Similar calculations can also be performed for the other citation impact measures (total citations/CS, field-normalized citations/NCS). It should be noted that fractional counting was not only done for a single publication, but also for all publications in the reference sets for each field. Due to fractional counting, for instance, the PP(top 10%) value for each field is exactly 10%.
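To make the interplay of these weights concrete, the following Python sketch computes a PP(top 10%)-style value for a few invented publication records, reading the indicator as a weighted share: each publication contributes its (possibly fractional) weight to the denominator and its weight multiplied by its (possibly partial) top 10% score to the numerator. The records, the parameter switches, and the function itself are our own simplified illustration of the procedure described above, not CWTS code.

```python
# Simplified, hypothetical reading of the PP(top 10%) computation described in the
# text: a weighted share of (partly) top 10% publications. All values are invented.

from dataclasses import dataclass

@dataclass
class Publication:
    pub_id: str
    journal_weight: float    # share of the journal's fields belonging to SOC_HU
    address_fraction: float  # university's share of the address list (fractional counting)
    top10_score: float       # 1.0, 0.0, or a partial value for "partly top 10%" papers

# Invented records that mirror the kinds of cases discussed in the text.
pubs = [
    Publication("engel_like",     0.50, 0.33, 1.00),  # journal in two fields, top 10% in both
    Publication("schneider_like", 0.33, 0.50, 0.67),  # journal in three fields, top 10% in two
    Publication("ordinary",       1.00, 1.00, 0.00),  # single-field journal, not top 10%
]

def pp_top10(pubs, use_journal_weight=True, use_fractional_counting=True,
             use_partial_top10=True):
    """Weighted share of (partly) top 10% publications under the chosen settings."""
    numerator = denominator = 0.0
    for p in pubs:
        w = 1.0
        if use_journal_weight:
            w *= p.journal_weight
        if use_fractional_counting:
            w *= p.address_fraction
        score = p.top10_score if use_partial_top10 else (1.0 if p.top10_score > 0 else 0.0)
        numerator += w * score
        denominator += w
    return numerator / denominator if denominator else 0.0

# Contribution of each record to the weighted top 10% count (field weight x top 10% score),
# e.g. 0.67 * 0.33 for the Schneider-like case, before address fractions are applied.
for p in pubs:
    print(p.pub_id, "->", round(p.journal_weight * p.top10_score, 2))

print("all weights applied:", round(pp_top10(pubs), 3))
print("raw, unweighted    :", round(pp_top10(pubs, False, False, False), 3))
```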

What Are the Effects of Weightings at the Level of Individual Publications?

The effects of an advanced parameter analysis are seen most tellingly at the level of the individual publication (EXCEL table of raw data; see Figure 1). The most-cited ETHZ article in the field of SOC_HU, Engel et al. (2008), has a weight of 0.5. This weight results from the fact that the journal Ecological Economics is evidently not only assigned to the field of SOC_HU but also to a science subfield (e.g., Environmental Sciences & Ecology). In other words, this highly cited ETHZ article in the field of SOC_HU is only half credited. With the fractional counting value of 0.33 (Figure 1, column 1), this same article becomes less important still for ETHZ's ranking in the field of SOC_HU. The effect of the selection of advanced parameters is even more drastic in the case of Schneider, Holzer, and Hoffmann (2008), published in the journal Energy Policy, which appears at the bottom (row 32) of the table in Figure 1. This article was evidently assigned to both the field of SOC_HU and two other fields, so that it was given a weight of 0.33. In addition, it belongs to the top 10% most-cited publications in only two of the three fields. This explains the PP(top 10%) value of 0.67, or 2/3. In the total number of publications of ETHZ in the top 10% (N = 133), the Schneider et al. article is thus entered with a value of only 0.67 * 0.33 = 0.22. If the fractional counting value of 0.50 (see Figure 1, column 1) were additionally applied, the resulting value would be even smaller.

Although the Leiden Ranking is one of the most prestigious university rankings worldwide, several things remain unsatisfactory:

A subject/discipline comparison of universities, for example in the field of SOC_HU, cannot be achieved using the Leiden Ranking. The journal sets that are assigned to the field of SOC_HU do not allow unambiguous assignment of subjects and organizational units (cf. Robinson-García & Calero-Medina, 2014).

The user of the Leiden Ranking is not given sufficient information on the nature of the raw data. The rounding of weighted values wrongly conveys the impression that the values represent countable integers.

No sensitivity analyses are conducted that would deliver information on what effects the different weightings have on the ranking results. It would be helpful if the Leiden Ranking website provided simple calculation examples showing how a specific ranking is calculated based on the raw data. This would increase the transparency of the ranking.

Also not very satisfactory is the methodological-statistical approach. On the one hand, basic statistical concepts such as percentiles or means are used; on the other hand, purely mathematically based weightings are applied. These weights guarantee that the numbers, for instance the top 10% values, remain constant across fields, but such weights can be difficult to handle in a statistical analysis (e.g., van den Boogaart & Tolosana-Delgado, 2013). What is lacking is an overarching statistical model of the data of the sort that mathematicians and statisticians have called for (Adler, Ewing, & Taylor, 2009; Goldstein, 2014).

Last but not least, we think that as a matter of urgency more research should be conducted on the various weaknesses not only of the Leiden Ranking but also of international university rankings in general (e.g., sensitivity analysis, weighting, fractional counting, statistical model) (Bornmann, Mutz, & Daniel, 2013).

References

Adler, R., Ewing, J., & Taylor, P. (2009). Citation statistics. A report from the International Mathematical Union (IMU) in cooperation with the International Council of Industrial and Applied Mathematics (ICIAM) and the Institute of Mathematical Statistics (IMS). Statistical Science, 24, 1–14.

Bornmann, L., Mutz, R., & Daniel, H.-D. (2013). A multilevel-statistical reformulation of citation-based university rankings: The Leiden Ranking 2011/2012. Journal of the American Society for Information Science and Technology, 64, 1649–1658.

CWTS. (2013). Personal correspondence (emails 18.12./19.12.2013, M. Neijssel, Leiden Ranking data).

Engel, S., Pagiola, S., & Wunder, S. (2008). Designing payments for environmental services in theory and practice: An overview of the issues. Ecological Economics, 65, 663–674.

Goldstein, H. (2014). Using league table rankings in public policy formation: Statistical issues. Annual Review of Statistics and Its Application, 1, 385–399.

Robinson-García, N., & Calero-Medina, C. (2014). What do university rankings by fields rank? Exploring discrepancies between the organizational structure of universities and bibliometric classifications. Scientometrics, 98, 1955–1970.

Schneider, M., Holzer, A., & Hoffmann, V.H. (2008). Understanding the CDM's contribution to technology transfer. Energy Policy, 36, 2930–2938.

van den Boogaart, K.G., & Tolosana-Delgado, R. (2013). Analyzing compositional data with R. New York: Springer.

Waltman, L., Calero-Medina, C., Kosten, J., Noyons, E.C.M., Tijssen, R.J.W., van Eck, N.J., . . . (2012). The Leiden Ranking 2011/2012: Data collection, indicators, and interpretation. Journal of the American Society for Information Science and Technology, 63, 2419–2432.

Waltman, L., & Schreiber, M. (2013). On the calculation of percentile-based bibliometric indicators. Journal of the American Society for Information Science and Technology, 64, 372–379.

