
Research Data Alliance

Open Science Monitor Case Study

Tung Tung Chan


July – 2019 EN

European Commission
Directorate-General for Research and Innovation
Directorate G — Research and Innovation Outreach
Unit G.4 — Open Science
E-mail Rene.VonSchomberg@ec.europa.eu
RTD-PUBLICATIONS@ec.europa.eu
European Commission
B-1049 Brussels
Manuscript completed in July 2019.
This document has been prepared for the European Commission; however, it reflects the views only of the authors, and the
Commission cannot be held responsible for any use which may be made of the information contained therein.
More information on the European Union is available on the internet (http://europa.eu).
Luxembourg: Publications Office of the European Union, 2019

EN PDF ISBN 978-92-76-12112-1 doi: 10.2777/261887 KI-04-19-652-EN-N

© European Union, 2019.


Reuse is authorised provided the source is acknowledged. The reuse policy of European Commission documents is regulated by
Decision 2011/833/EU (OJ L 330, 14.12.2011, p. 39).

For any use or reproduction of photos or other material that is not under the EU copyright, permission must be sought directly
from the copyright holders.

Table of Contents
Acknowledgements
1 Background
2 Drivers
3 Barriers
4 Impact
5 Lessons Learnt
6 Policy conclusions
References

Acknowledgements

Disclaimer: The information and views set out in this study report are those of the
author(s) and do not necessarily reflect the official opinion of the Commission. The
Commission does not guarantee the accuracy of the data included in this case study.
Neither the Commission nor any person acting on the Commission’s behalf may be held
responsible for the use which may be made of the information contained therein.

The case study is part of the Open Science Monitor, led by the Lisbon Council together with CWTS,
ESADE and Elsevier.

Author

Tung Tung Chan – CWTS

Acknowledgements

The study team would like to thank Hilary Hanahoe, Secretary General of the RDA for sharing her
valuable time and experiences.

STUDY ON OPEN SCIENCE: MONITORING TRENDS AND DRIVERS (Reference: PP-05622-2017)

1 Background

Sharing research data is central to advancing the aims of Open Science, as sharing creates
opportunities for reuse, increases collaboration, and promotes transparent research practices.
Open sharing of research data on a global scale requires not only the cooperation of
researchers, but also an interoperable data infrastructure with international frameworks and
standards (Wittenburg et al., 2010). However, the global research data infrastructure
needed to enable research data sharing and exchange is still far from realised. To address
this gap, the Research Data Alliance (RDA) was established in 2013 with support and
funding from the European Commission, the Australian Government Department of
Education and Training, and the United States National Science Foundation (NSF) and
National Institute of Standards and Technology (NIST) (Berman, Wilkinson & Wood, 2014).

An international and community-driven organisation, RDA builds the social and technical
bridges which enable open research data sharing on a global scale through a neutral forum
(Parsons, 2013). It facilitates discussion and exchange by identifying best practices and
standards for research data, tools and infrastructure across scientific disciplines (Parsons,
2013; Treloar, 2014). This is achieved through concrete outputs and recommendations
developed by Working Groups and Interest Groups, formed by expert members from
academia, private sector, and government (Berman, Wilkinson & Wood, 2014).

As of May 2019, the RDA includes more than 8,400 members representing 137 countries,
more than six times the 1,300 members from 53 countries it had at its inception six years
ago. The following actors and programs are instrumental in realising the goals of RDA,
which are to create, develop and adopt the social, organisational, and technical
infrastructure solutions needed to reduce barriers to research data sharing and exchange.

• Individuals: As primary research data producers and users, researchers currently
make up the largest group of RDA stakeholders, with 1,898 members (May, 2019).

• Research Performing Organisations (RPOs): Organisations can become
sponsors, supporters or organisational members through various forms of financial
contribution.

• Libraries: Library and information service professionals represent the second
largest group of stakeholders, with more than 1,000 members. They are one of the
key providers within universities to raise awareness, provide training and support
to researchers in managing data in all steps of the research lifecycle.

• The European Open Science Cloud (EOSC): Launched in 2015 by the European
Commission, the EOSC will operate as a federation of research data infrastructures,
to create a trusted environment for hosting and processing research data to support
open science (Giannoutakis & Tzovaras, 2016). The EOSC implementation roadmap
outlines six action lines: (1) architecture, (2) data, (3) services, (4) access and
interfaces, (5) rules and (6) governance, where the first four action lines fall within
the scope of RDA.

• Regions: RDA US and RDA Europe are official regional groups within RDA global
which help their members coordinate activities, events and exchanges in their
research and data management communities on a national level. Members may
volunteer to be a national contact point and form a national group. As of May 2019,
there are 13 RDA Europe national nodes (Austria, Denmark, Ireland, Netherlands,
United Kingdom, Germany, Greece, Spain, Italy, France, Finland, Slovenia,
Portugal) and six national groups: Norway, Brazil, North America, United States,
Asia and South Eastern Europe.

• Students and Early Career Professionals: The RDA/US Data Share Program and the
RDA EU Early Career Support Programme introduce their research fellows to RDA’s
work through attending the bi-annual plenaries and participating in RDA’s Working
Groups and Interest Groups, under the mentorship of RDA advisory committee
members.

2 Drivers

In 2010, the final report of the High-Level Expert Group on Scientific Data highlighted the
importance of international partnerships and global governance in building open
e-infrastructures to collect, curate, preserve and share the ever-increasing amounts of
scientific data (Wittenburg et al., 2010). Inspired by the success of the Internet Engineering
Task Force (IETF), the Data Access Interoperability Task Force (DAITF) from Europe and
the Data Web Forum proposed by NSF and NIST from the United States led to the
bottom-up establishment of RDA (Treloar, 2014). The emergence of RDA was a result of the
following scientific, industrial and societal drivers.

• For science, open research data infrastructures that support seamless access, use,
and re-use, would enable anyone to find, access and process the data they need.
This will encourage all researchers to deposit their data, foster collaboration among
scientists, generate new insights and new forms of scientific inquiry to address the
grand challenges of society.

• For industry, public research data can be used for commercial purposes. Ideally,
cross-fertilisation and knowledge exchange between the public and private sectors will
remove adversarial attitudes between the two, which will generate new discoveries,
new companies and new jobs. Open research data infrastructures would help
academics and industrialists engage in a virtuous cycle to amplify the impact of
innovation, and advance the national economy.

• For society, open research data infrastructures that protect data ownership and
integrity will create trust, increase confidence in our abilities to use and understand
data, and evaluate the degree to which that data was collected in a responsible
manner. This will empower citizens to contribute more easily and creatively to the
scientific process.

3 Barriers

The challenges encountered by RDA members seeking to engage in research data
management issues are found at the individual level and at the organisational level. On
the individual level, a large percentage of RDA members come from the US and Western
Europe, and they are overrepresented in Working and Interest Groups (Treloar, 2014).
Members from other regions may not have the financial means to be present, or may lack
the skills and confidence to contribute to the events and meetings. Further, the institutions
which RDA members represent may not have the resources or capacities to create a
supportive environment for research data management. It would be difficult for all RDA
members to adopt the outputs and recommendations and bring about a wider change in
behaviour among researchers within their organisations. Therefore, the success of RDA in
removing data sharing barriers depends on both the upskilling of members across
disciplines and regions, and the willingness of researchers from all scientific communities
to engage in the process. It is thus essential that RDA keeps growing in membership
through attractive institutional subscription models to maintain its inclusive and
international position.

On the organisational level, community support for and interest in research data
infrastructures and management require consensus and regular discussions. The rise in
membership will also bring new topics and shared interests, which will increase the number
of Working Groups and Interest Groups. The RDA forum might turn into a complex
environment that will be difficult for new members to navigate. In addition, the RDA
organising committee will have to be more creative in developing processes and activities
that would engage all members to take the RDA outputs and recommendations up to their
organisations. This is the most challenging of all, as policies come only after practices have
stabilised and become accepted, yet this is not the case with research data management
(Asher et al., 2013).


Finally, RDA’s organisational visions and missions may not be well aligned with those of
universities. This makes it difficult for RDA to translate or demonstrate its values to the
academic institutions which researchers are a part of. Research culture rewards
publications, and puts research data sharing and reuse far down the list of institutional
priorities (McNutt et al., 2016). The contributions of researchers, data stewards and data
professionals may be invisible if universities do not recognise research data stewardship
and data management as scientific excellence, even though they build trust and promote
transparency and reproducibility in science.

4 Impact

The Research Data Alliance (RDA) plays a significant role in developing a consolidated
global research data infrastructure. There is an urgent need to identify the necessary
technical aspects, governance issues, and best practices required to support more
coordinated approaches to make research data sharing a reality. Specifically, RDA creates
concrete pieces of social, organisational, and technical infrastructure that accelerate data
sharing and exchange for a target community, uses and adopts the infrastructure within the
target community, and recommends the infrastructure to other communities. These efforts
are being developed by Working Groups (WGs) and Interest Groups (IGs).

WGs generate outputs and recommendations which develop and implement data
infrastructure, including tools, policy, practices and products, within a timespan of 12 to 18
months. WG members are RDA individuals who will endorse, adopt, and use these outputs
in their projects, organisations and communities. IGs operate without a time limit as they
provide discussion forums in topical areas that address a specific data sharing problem,
and identify the kind of research data infrastructure solutions that need to be developed
in WGs.

Table 1 below contains 12 endorsed outputs officially recognised by the RDA (May, 2019).
In this table, the scientific, industrial and societal impact of the outputs will be examined
through solution (impact statements provided by RDA), identification of target community
(researchers, libraries, funders, policymakers etc), as well as the disciplinary domains they
address.

Table 1. RDA Endorsed Recommendations (RDA, May, 2019).


| Recommendation title | Target community | Disciplinary domain | Solution | Impact |
| --- | --- | --- | --- | --- |
| Scalable Dynamic Data Citation Methodology | Researchers, developers, data centres | Data science and research data management (RDM) | Supporting accurate citation of data subjected to change, for the efficient processing of data and linking from publications | Scientific: technological aspects of data infrastructure. |
| Data Description Registry Interoperability Model | Researchers | RDM for cross-disciplinary and cross-platform research data discovery | Provides researchers a mechanism to connect datasets in various data repositories based on various models such as co-authorship, joint funding, grants, etc. | Scientific: interoperability of data infrastructure. |
| Basic Vocabulary of Foundational Terminology Query Tool | Researchers | All disciplines | Ensures researchers apply a common core data model when organising their data and thus making data accessible and re-usable. | Scientific: ICT technical specifications for common standards and harmonisation of terminologies. |
| Data Type Model and Registry | Researchers, developers, RPOs, governmental agencies | All disciplines | Ensures data producers classify their data sets in standard data types, allowing data users to automatically identify instruments to process and visualise the data | Scientific: ICT technical specifications for interoperability of data infrastructure. Societal: Remove technical hurdles on data sharing. |
| FAIRsharing: standards, databases, repositories and policies – Final Recommendation | Researchers, RPOs, developers, funders, policy makers, librarians, journal publishers and learned societies | All disciplines | Guide producers of standards, databases, repositories and create a registry to make their resources discoverable to prospective users. This registry tracks the development and evolution of standards, implementation and adoption in data policies from funders, journals and RPOs. | Scientific: Reduce knowledge gap across stakeholders and encourage FAIR practices through information curation and implementation of common standards. Societal: Remove technical and social hurdles on data sharing, provide education and training. |
| Persistent Identifier Type Registry | Developers | Data science and RDM | Defines standard core PID information types to enable simplified verification of data identity and integrity | Scientific: ICT technical specification for semantic interoperability of data. |
| Machine Actionable Policy Templates | Policy makers, RPOs, governmental agencies | All disciplines | A standardised template which can be used to enforce management, automate administrative tasks, validate assessment criteria, and automate scientific analyses | Scientific: ICT technical specification for common standards in automate process management. |
| Repository Audit and Certification Catalogues | Data centres, data communities and services | Data science and RDM | Creates harmonized common procedures for certification of repositories at the basic level, drawing from the procedures already put in place by the Data Seal of Approval (DSA) and the ICSU World Data System (ICSU-WDS) | Scientific: certification of data repositories. |
| Recommendation on Research Data Collections | Libraries, data centres, data communities, repositories and services | RDM | Provides a comprehensive model for actionable collections and a technical interface specification to enable client-server interaction for research data collections. | Scientific: technological aspects of data collections. |
| Workflows for Research Data Publishing: Models and Key Components | Researchers, publishers, libraries, data centres | All disciplines | Assists research communities in understanding options for data publishing workflows and increases awareness of emerging standards and best practices. | Societal: remove technical and social hurdles on data publication to enable data sharing. |
| Research Data Repository Interoperability WG Final Recommendations | Libraries, data centres, data communities, repositories and services | Research data management | Provides interoperable packaging and exchange format for digital content in data repository. Once implemented, compliant packages can be used to migrate or replicate digital content between research data repository platforms or for preservation purposes. | Scientific: Common standards and exchange format which enable the interoperability of research data repository |
| Wheat Data Interoperability Guidelines, Ontologies and User Cases | Researchers | Agriculture | An interactive cookbook that helps researchers create, manage and exchange wheat data. | Scientific: the standardization and harmonization of wheat data. |

Of the 12 official RDA recommendations, four were ICT technical specifications endorsed
by the European Commission. At a glance, the generated outputs are mostly not
domain-specific and primarily focus on research data management through common
standards and best practices. RDA delivered predominantly scientific impact, followed by
societal impact that enables research data sharing, exchange and interoperability.
Considering its industrial impact, publishers such as Elsevier and Wiley do adopt and
endorse the above outputs. However, there is hardly any evidence of collaboration or
engagement with global corporations, SMEs, commercial software and data service
providers. This is logical given RDA’s focus on reaching a critical mass of individual and
institutional members from the public sector during its start-up phase. Without the
financial contribution of funders and the bottom-up support of academic researchers and
librarians, RDA would not have achieved such prolific outputs.

While there have been one or two corporate representatives on the RDA advisory board,
such appointments are insufficient to build ‘bridges’ to industry. Attracting industrial RDA
members may be an important next step for RDA, to enable information and knowledge
exchange between the public and private sectors. With their participation, innovative
education and training programmes as well as unforeseen opportunities may arise between
academics and industrialists, which will help accelerate the development of research data
infrastructure in the public domain.

5 Lessons Learnt

RDA is a crucial platform in advancing the field of research data management and
accelerating the development of open data research infrastructure. While there are many
technical and social hurdles that await its members, the structure of WGs and IGs is
useful in generating concrete outputs and recommendations. These have provided the
research communities with abundant opportunities for reflection, identification of best
practice and analysis of beneficial ways forward. RDA is well on its way in developing areas
that advance science gateways, which Barker et al. (2018) refer to as community-driven
digital environments that meet the particular needs of a research community:

• “Technical solutions for the development of science gateways, including
interoperability, standards, software registries, and data management.

• Best practices and policies for the valuation of science gateways, including
incentives for open science, reproducibility, data and software citation.

• Sustainability models for the maintenance, development, and exploitation of
science gateways, including development of skills, training, career paths and
funding.”

Barker et al. (2018, p.17)

Six years after its establishment, RDA is well beyond its start-up phase. It is time to review
its organisational structure and reflect on whether RDA is generating necessary and
sufficient impact on the scientific, industrial and societal level for a diverse set of target
communities. Currently, RDA outputs are mostly technical solutions for research data
infrastructure issues, best practices and policies. They have not yet addressed the
sustainability models outlined by Barker et al. (2018), which are fundamental in fostering
a bottom-up cultural change for research data sharing and reuse. New and creative ways
of engaging with individual members beyond the WGs, IGs and bi-annual plenary meetings
would increase awareness, and generate momentum and engagement in these issues.
Further, RDA should consider giving its website a more user-friendly and forward-looking
design and layout, given its tremendous amount of content and various sub-groups
embedded within larger groups. Finally, the RDA organisational body may consider working
on a local level with other regions, to obtain a diverse and sustainable array of funding
sources, further replicating the success of RDA US and RDA Europe.

6 Policy conclusions

In conclusion, it is important that RDA continues to evolve and to increase international
collaboration and global sharing mechanisms, in order to remove social and technical
barriers to research data sharing and reuse. Public funding of RDA is crucial to its survival
and operational evolution. The ongoing investment in national and international programs,
in tandem with community and disciplinary initiatives, is facilitating the public debate on
research data issues across scientific communities. However, the lack of cooperation with
industry and of development of sustainability models may hinder the ability of research
data to increase industrial and societal impact, and to improve research career paths and
funding on the individual and institutional level. Increased coordination across varied
initiatives in the Working Groups and Interest Groups will continue to improve identification
of best practice and development of policies and standards. New members need to be
trained and mentored by more experienced RDA members across regions, to realise their
full potential in demonstrating the value of research data sharing and reuse.

References

Asher, A., Deards, K., Esteva, M., Halbert, M., Jahnke, L., Jordan, C., ... & Urban, T. (2013).
Research data management: Principles, practices, and prospects. Washington, DC:
Council on Library and Information Resources.

Barker, M., Olabarriaga, S. D., Wilkins-Diehr, N., Gesing, S., Katz, D. S., Shahand, S., ...
& Treloar, A. (2019). The global impact of science gateways, virtual research environments
and virtual laboratories. Future Generation Computer Systems, 95, 240-248.

Berman, F., Wilkinson, R., & Wood, J. (2014). Building Global Infrastructure for Data Sharing
and Exchange through the Research Data Alliance. D-Lib Magazine, 20(1). Retrieved from
www.dlib.org/dlib/january14/01guest_editorial.html

Giannoutakis, K. M., & Tzovaras, D. (2016, October). The European Strategy in Research
Infrastructures and Open Science Cloud. In International Conference on Data Analytics and
Management in Data Intensive Domains (pp. 207-221). Springer, Cham.

McNutt, M., Lehnert, K., Hanson, B., Nosek, B. A., Ellison, A. M., & King, J. L. (2016).
Liberating field science samples and data. Science, 351(6277), 1024-1026.

Parsons, M. A. (2013). The research data alliance: Implementing the technology, practice
and connections of a data infrastructure. Bulletin of the American Society for Information
Science and Technology, 39(6), 33-36.

RDA. (2019, May 20). All Recommendations & Outputs. Retrieved June 11, 2019, from
https://rd-alliance.org/recommendations-and-outputs/all-recommendations-and-outputs

Treloar, A. (2014). The Research Data Alliance: globally co-ordinated action against
barriers to data publishing and sharing. Learned Publishing, 27(5), S9-S13.

Wittenburg, P., Van de Sompel, H., Vigen, J., Bachem, A., Romary, L., Marinucci, M., ... &
Lopez, D. R. (2010). Riding the wave: How Europe can gain from the rising tide of scientific
data.

This case study reports activities performed by the Research Data
Alliance (RDA), an international grassroots organisation that
promotes international collaboration and global sharing mechanisms
to remove social and technical barriers to research data sharing and
reuse. Drivers, impacts and barriers of the RDA were identified and
discussed using secondary data sources, close reading of the RDA
website, an interview, and attendance at the RDA 13th Plenary
Meeting. The report concludes with lessons learnt and policy
conclusions, calling for increased industrial impact, development of
sustainability models for open research data management, and
induction and mentorship programs for RDA members.

Studies and reports
