Professional Documents
Culture Documents
Author(s): Virginia A. de Wolf, Joan E. Sieber, Philip M. Steel and Alvan O. Zarate
Source: IRB: Ethics and Human Research, Vol. 27, No. 6 (Nov. - Dec., 2005), pp. 12-16
Published by: The Hastings Center
Stable URL: http://www.jstor.org/stable/3563537 .
Accessed: 20/04/2014 04:40
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at .
http://www.jstor.org/page/info/about/policies/terms.jsp
.
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of
content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms
of scholarship. For more information about JSTOR, please contact support@jstor.org.
The Hastings Center is collaborating with JSTOR to digitize, preserve and extend access to IRB: Ethics and
Human Research.
http://www.jstor.org
BYVIRGINIA
A. DE WOLF,JOAN E. SIEBER,PHILIPM. STEEL,
AND ALVANO. ZARATE
CurrentData SharingPractices
ata sharinghas beena well-established
practicein
some physical sciences (e.g., astronomy, oceanography) for many decades, and in meteorology for well over
a century. Some agencies that fund human subjects
research also have long required or encouraged data
sharing. For example, the National Science Foundation
(NSF) incorporates as a default element of its research
grants a general requirementthat its grantees arrangeto
make their data available to other scientists.4In addition, the National Institute of Justice (NIJ) incorporates
in its research contracts, grants, and cooperative agreements the stipulation that investigators return resulting
data to the agency, which then prepares and archives
that data in the National Archive of CriminalJustice
Data at the Inter-universityConsortium for Political and
Social Research (ICPSR).5In some of the biosciences, the
sharing of reagents, cell lines, probes, clones, descriptions, quantitative data, and other items of data has
been the basis for rapid scientific development. However,
many bioscience researchershave experienced a tension
between the ideal of data sharing and institutional, pro-
2005
NOVEMBER-DECEMBER
Table 1.
Glossaryof Key Terms
Term
Anonymous
Deidentification
Definition
Strictlyspeaking,the removalof identifiers
only, but used here synonymouslywith
deidentification.At issue in data sharingis
what to regardas an identifierthat should
be removedto prevent reidentification.
Hence, in this context, the term
"anonymous"is used to mean
deidentified.
The removalof direct identifierstogether
with the removalor modificationof other
informationabout study participantsin
order to minimizethe riskof
reidentification.
Identifier
(also referredto as
director explicit
identifier)
Public-use
data products
Reidentification
* Definitionfrom
page 3, U.S.Officeof Managementand Budget.Report
on StatisticalDisclosureLimitation
Methodology.(StatisticalPolicy
WorkingPaper22). Washington,DC:May,1994. http://www.fcsm.gov/
working-papers/spwp22.html.
social attitudes, economics)-whose data have been rendered anonymous ("deidentified")-have long been subject to sharing requirements,and their data reside in
archives readily available to scientists and the public.7 A
major national archive, ICPSR, permits its subscribersto
access its data sets via the Internet.8Students as well as
scientists have routinely used these data for teaching and
research purposes since 1962.
Federal statistical agencies have also released deidentified data from a wide variety of surveys as "public-use"
microdata files.9 In addition, several of these agencies
have developed special facilities known as Research Data
Centers, where approved researcherscan analyze data
that have not been deidentified under carefully controlled conditions.1' A brief description of such centers is
given in our third article.
While these large-scaledata sharing arrangements
have been very successful, the sharing of smaller human
subjects data sets from biomedical studies and socialbehavioral researchhas had a contentious history.
Questions and concerns have been raised about what
constitutes data and sharing;who pays the cost of data
preparation, archiving, and sharing;who owns data
gathered under federal funding;" who is requiredto
share; what the informed consent requirementsare; and
how confidentiality of human subjects data can be protected." In 1989, an article appeared in this journal that
informed IRBs of challenges they would face as investigators began to share more kinds of human subjects
data.13That article outlined the reasons for data sharing,
its ethical foundations, barriersto sharing, rights of the
original researcher,how IRBs may be involved, and how
to solve problems surroundingdata sharing. The article
had a tentative tone. At that time, data sharing was still
new to most small-scale biomedical and social-behavioral research.While NSF and NIJ by then were fostering data sharing, the role of IRBs in this matter was still
ambiguous. This is no longer the case.
With respect to sharing social-behavioralresearch and
biomedical data, two things have changed. First, federal
funding agencies are requiring,rather than urging, data
sharing when it can reasonably occur. In such cases,
grant proposals must contain evidence of a well-planned
sharing arrangementthat provides data in a useful form
to other researcherswithout breachingpromises of confidentiality.Second, there is a clear role for IRBs in
reviewing protocols when research funding is contingent
on an agreement to share data. These changes raise
pressing questions for IRBs and investigators that we
NOVEMBER-DECEMBER
2005
deliverdatato NIJ at the completionof the projectleavesit with the responsibilityto archivedeidentified
datawith ICPSR,and to keep identifiabledatawithinits
own securearchiveswherethey may be usedon a
restrictedbasis.16(Wewill discusssucharchivesin Part
III.)
Otheragencies,such as the NSF,have less specific
Overthe past 25 years,the NSF has
requirements.
stronglyencourageddata sharingand soughtto better
understandhow to maximizethe usefulnessof shared
data.17The agencyhas fundedprojectsdevelopingvarious technologiesof data sharing(e.g.,via the Internet).
are not as specificas
Althoughthe NSF'srequirements
the NIH'snew rules,NSF proposalsthat spellout specificallyhow datawill be sharedand what methodsof
nondisclosurewill be employedhave a distinctadvantage overthose that do not.'8
In 1999, the Officeof Managementand Budget
for grantsand agreerevisedthe federalrequirements
mentswith nonprofitorganizations.'9
Agenciesare
data
to
share
"research
relatingto published
required
researchfindingsunderan awardthat wereusedby the
federalgovernmentin developingan agencyactionthat
has the forceand effectof law."2?This meansthatif
publishedresultsof fundedresearchbecomethe rationale for a governmentagencyto take some action,the
datamust be madeavailableto thosewho want to reexaminethe analysisof the data on whichthe alleged
resultswere based.
Why Share Data?
therearemany
Spart fromfunderrequirements,
otherreasonsto sharedata.Most investigators
embracethe normof openness,and peerpressureis used
to promotethis normwith reluctantcolleagues.
Scientificsocietiesurgedatasharing,and manyjournal
editorsrequireit as a conditionof publication.Shared
datagive some assuranceof the validityof the research
and providea commongrounduponwhichto work out
controversy over the analysis of research findings.
Data sharing provides an efficient and feasible way to
make comparisons across populations; build upon one's
own data with additional related data; reanalyze data
with different designs, methods, or hypotheses; promote
interdisciplinaryanalyses; validate original results;conduct methodological and policy research;and create curricula in research methodology or statistics using real
participant-leveldata. Researcherswho do secondary
analysis of data, and those whose researchdraws on
2005
IRB:
Kindsof Sharing
kindsof data sharingwithininstitutionsare
Certain
normal and expected, as when an investigator
2005
D sdaimer
1994;308:1519-15z0.
22. http://www.murray.harvard.edu/mra/index.jsp.
23. For the websites of the NCHS Research Data Center and the
Census Bureau'sResearch Data Center Program see ref. lo. For the
Agency for Health Care Research and Quality's CFACTData Center
see http://www.meps.ahrq.gov/datacenter.htm.
24. A description of remote access, as well as other types of
restricted access used by federal statistical agencies, is available in the
Confidentiality and Data Access Committee's publication entitled
"RestrictedAccess Procedures,"http://www.fcsm.gov/committees/
cdac/cdacra9.pdf.
2005
NOVEMBER-DECEMBER