ARTICLE

Third-Party Genetic Interpretation Tools: A Mixed-Methods Study of Consumer Motivation and Behavior
Sarah C. Nelson,1,2,* Deborah J. Bowen,3 and Stephanie M. Fullerton3

In an effort to meet ethical obligations and/or participant expectations, researchers may consider offering “raw” or uninterpreted genetic data for result return. It is therefore important to understand the motivations, behaviors, and perspectives of individuals who might choose to access raw data before such return becomes routine. In the direct-to-consumer (DTC) context, where raw data are often made available to customers, the use of third-party interpretation tools has raised concerns about genotype accuracy, data privacy, reliability of interpretation, and consumption of limited health care resources. However, relatively little is known about why individuals access raw data or what they do with the information received from third-party interpretation. Accordingly, we conducted a survey on raw data access and third-party tool usage among 1,137 DTC customers recruited through social media. Most survey respondents (89%) reported downloading their raw data. Among downloaders, 94% used at least one tool, most commonly Promethease (63%) or GEDmatch (84%). More than half (56%) used both health-related and non-health-related tools and differed significantly from those who used only one tool type in terms of demographics, participation in research, DTC tests ordered, and testing motivations. Exploratory interviews were conducted with 10 respondents and illustrated how social networking, initial lack of interesting findings, and general curiosity contributed to use of multiple tool types. These results suggest that even when initially motivated by ancestry and genealogy, consumers frequently also pursue health information in a largely unregulated and expanding suite of third-party tools, raising both challenges and opportunities for the professional genetics community.

Introduction

There is a widening consensus that researchers should, wherever possible, offer individual results to research participants, both to enhance autonomy, agency, and reciprocity of participants and to increase engagement with and contribution to both traditional and citizen science research initiatives.1–3 While the return of results has been the subject of scientific, ethical, and legal debate for some time,4–6 the discourse has focused mostly on interpreted results rather than “raw” or uninterpreted genetic data. Here, raw data refer to genotype calls from either array genotyping or sequencing but may also encompass upstream formats such as the sequence alignment data used to make genotype calls (e.g., BAM files).1 Empirical evidence suggests that members of the general public would want their raw genetic data if they were to participate in a genome sequencing study,7 and some investigators have begun offering such data to research participants.1 Indeed, organizers of the National Institutes of Health “All of Us” research cohort have stated plans to make raw genetic data available to participants,8,9 and the National Academies of Sciences, Engineering, and Medicine (NASEM) have issued recommendations that are supportive of returning “individual research results” (both raw data and interpreted results) where feasible.10 Therefore, understanding how individuals make use of their raw genetic data is important for genetic researchers considering the return of such information to participants.

However, the implications of making raw genetic data available to research participants remain largely unknown. One avenue to explore potential outcomes is to examine current uses of raw genetic data from direct-to-consumer (DTC) genetic testing, which to date has been the most common route of access. Broadly, there are two types of activities DTC customers can undertake with their raw data. The first is to contribute to collective research efforts by sharing genotypic and phenotypic data on platforms such as Open Humans and openSNP.11,12 The potential benefits of such sharing include increasing the size of datasets available to traditional researchers and creating novel opportunities for participant-driven or “citizen science” research initiatives.13,14 The second type of activity is the pursuit of individually tailored information via third-party interpretation (TPI) tools such as Promethease or GEDmatch. As we found in a prior study,15 TPI tools are quite heterogeneous in terms of types of information returned (e.g., health, ancestry, genealogy), methods for generating that information, and cost to users. Further complicating this picture is that some TPI tools also provide opportunities to contribute to collective research efforts (e.g., openSNP and DNA.Land).

Recently, TPI tools have received growing attention in both academic and lay media discourse, where concerns have been raised about genotype accuracy, data privacy and security, reliability of health-related information, potential for false positives or false negatives, and downstream consumption of limited health care resources. For

1 Institute for Public Health Genetics, University of Washington, Seattle, WA 98195, USA; 2 Genetic Analysis Center, Department of Biostatistics, University of Washington, Seattle, WA 98195, USA; 3 Department of Bioethics and Humanities, University of Washington School of Medicine, Seattle, WA 98195, USA
*Correspondence: sarahcn@uw.edu
https://doi.org/10.1016/j.ajhg.2019.05.014.
© 2019 American Society of Human Genetics.

122 The American Journal of Human Genetics 105, 122–131, July 3, 2019
example, media reports have recounted individuals receiving distressing results from TPI, which upon clinical confirmatory testing turned out to be false positives.16–19 These individuals experienced emotional and financial hardship in unnecessary follow-up, often tracing back to errors in the raw DTC genotypes. Accounts in academic literature have illustrated outcomes of bringing TPI results to genetic counselors (GCs),20 genetics specialty clinics,21 or otherwise leading to follow-up clinical sequencing.22 In addition to health-related concerns, novel uses by law enforcement of online genealogy databases such as GEDmatch are raising questions about data privacy and consent in third-party services.23,24

However, few studies have reported on the perspectives of the tool users themselves. One survey of DTC customers found a high volume of raw data download and TPI tool use (67%, or 321/478 respondents), where the majority of tool users (81%) were satisfied with the information received.25 However, that study did not explore the relationships between use of specific tools and reactions, follow-up actions, and DTC testing motivations. Some tool developers have also surveyed their own users,12,26 but this provides limited insight into use of raw data and TPI tools more broadly and does not measure the degree to which users access multiple tools or make distinctions among the tools that they use.

Here we contribute new information about consumers’ use of raw genetic data and TPI tools from the results of a survey of more than 1,100 DTC customers and follow-up interviews with a subset of survey respondents. Our aim was to better understand users’ motivations and behaviors from initial DTC testing through to downstream use of specific TPI tools and resulting follow-up actions, with the broader goal of understanding how raw data return in non-DTC contexts may unfold. This primarily descriptive study contributes novel insights into how individuals leverage their raw genetic data in multiple ways, often concurrently, including to learn about health risks, ancestry, and genealogy.

Subjects and Methods

Participant Recruitment
We recruited survey respondents during October and November 2017 with staggered postings to various social media venues: six genomics-related sub-Reddits, Twitter, and several Facebook groups (four genealogy groups and the DNA.Land page). Early in the 2-month period and once per venue, we posted a brief study description and link to the survey, after seeking permission from group moderators/administrators. Additionally, the survey was sent via newsletter to openSNP27 users and was posted on the first author’s academic website and the Institute for Translational Health Sciences (ITHS) Participant Portal. Recruitment messages stated eligibility criteria as being 18 years or older and having taken at least one DTC genetic test; i.e., raw data download and TPI tool use were neither mentioned nor required, though some of the groups targeted were oriented toward TPI tools (e.g., Promethease sub-Reddit or DNA.Land Facebook page). No incentives were offered for participating. The survey was closed 7 weeks after it first opened and after posting final reminders on each recruitment venue.

Survey Design and Implementation
The survey questionnaire was designed to cover three main topics: DTC testing, raw data download, and TPI tool use. Respondents were asked which DTC test(s) they had ordered, when, what motivated them to order the test(s), and whether they had downloaded their raw data. Non-downloaders were asked about reasons for not downloading while downloaders proceeded to a series of questions about TPI tools. Respondents who had used multiple tools were prompted to select one on which to base their responses. All respondents filled out a demographics section and were invited to provide contact information for voluntary follow-up interviewing.

We developed survey questions primarily based on our prior study of TPI tools and developers.15 However, some items were based on existing instruments: motivations for DTC testing were adapted from the PGen (Impact of Personal Genomics) Study baseline survey,28 and questions on how the respondent learned about tools were adapted from Wang et al.25 A draft instrument was evaluated via cognitive interviewing29 with six individuals recruited from a combination of personal and professional networks. The self-administered, online questionnaire was implemented in REDCap.30 While the majority of survey questions were fixed response, some open text comment boxes were also included.

Follow-up Interviews
We purposively sampled interview participants from among those survey respondents who volunteered for follow-up contact and reported using a specific combination of tools that spanned health, ancestry, and genealogy (Promethease, DNA.Land, and GEDmatch, respectively), or “crossover” use, described further below. Initially, 179 respondents met these criteria. Because we wanted to explore crossover in both directions (i.e., from health to non-health and vice versa), we stratified this sample by “DTC test(s) ordered” as a proxy for initial motivations for tool use among potential interviewees, and, within each bin, randomly sampled participant IDs to contact.

We conducted semi-structured interviews via phone or Skype and had audio recordings transcribed on Rev (see Web Resources). The interview guide comprised six questions focused on gaining a deeper understanding of respondents’ timing and motivations to download data and use multiple types of tools. For example, did respondents start by using GEDmatch with an interest in genealogy and go on to additionally use Promethease and if so, why? Or, conversely, did initial interest in health and use of Promethease eventually lead to use of genealogy and ancestry-focused tools as well and if so, why? We conducted a total of ten interviews, after which point we had observed multiple examples of crossover in each direction (i.e., participants initially using health tools before using non-health tools, and vice versa) and halted recruitment for this exploratory follow-up phase of the study. Interviews averaged 36 min (SD = 14 min); an interview with one deaf respondent was carried out via email.

Data Analysis
Quantitative survey data were analyzed via univariate descriptives, bivariate, and multivariate analyses (i.e., logistic regression) using all available, non-missing data. The total number of potential

responses changes between survey sections due to branching logic; sporadic missing answers also affect the count of available responses for any given survey item. Therefore, throughout we report both percentages and counts. All quantitative analyses were carried out in R statistical and graphing software.31

Throughout, DTC testing motivations were dichotomized into “very important” versus “somewhat important” or “not at all important,” following precedent in analyses of the PGen survey from which those survey items were adapted.32,33 To understand factors influencing use of specific third-party tools, we performed a series of logistic regression analyses using tool use (yes/no) as an outcome and the following covariates: each of the eight dichotomized DTC testing motivations, survey recruitment venue, and DTC test(s) ordered. The survey instrument, results dataset, and R analysis code are available on openICPSR (see Web Resources).

We thematically analyzed qualitative survey data from open text boxes in Atlas.ti v8. Interview analysis was also conducted in Atlas.ti v8 and focused on understanding how and why participants came to use tools across the multiple domains of health, ancestry, and genealogy.

This study was approved by the University of Washington (UW) Institutional Review Board as minimal risk human subjects research, protocol #50238.

Results

A total of 1,137 eligible respondents took the survey (see Table 1), including 268 respondents (24%) who did not progress to the end of the survey. The most common recruitment venue was Facebook (624/1,137 or 55%), followed by Reddit (357/1,137 or 31%), Twitter (71/1,137 or 6%), and the openSNP newsletter (62/1,137 or 5%). Fewer than 20 respondents were recruited from each of the remaining venues. Below we report patterns of DTC testing and TPI tool use, including respondents’ impressions of information received and follow-up actions taken. We then analyze users by categories of TPI tools, focusing on the common phenomenon (56%, or 458/819 tool users) of using both health-related and non health-related tools.

Patterns of DTC Testing
DTC Tests Taken
Respondents reported ordering a range of DTC tests, with 36% (413/1,137) ordering multiple tests: 21% (236/1,137) ordered two tests, 12% (135/1,137) ordered three, and 4% (42/1,137) of respondents ordered four or more. While asked explicitly about 23andMe, AncestryDNA, and FamilyTreeDNA, respondents noted additional tests via an open text box, most commonly MyHeritage (n = 37), Living DNA (n = 32), National Genographic (n = 19), and Genes for Good (n = 18). The majority of DTC tests were ordered after 2015 (see Figure S1), potentially reflecting a rise in DTC test popularity and/or a recency effect in that those who recently ordered tests were more likely to be active in our online recruitment venues.
DTC Testing Motivations
The most common motivations for pursuing DTC testing were general curiosity about genetic make-up (661/972 or 68% rated as very important) and curiosity about ancestry (645/977 or 66% rated as very important; see Table 2). Less common motivations were limited information about family health history (201/972 or 21% rated as very important) and other family members pursuing testing (104/969 or 11% rated as very important). In free text responses, 20% of respondents (225/1,137) noted additional motivations, which included pharmacogenomics; being adopted; breaking through “brick walls,” or dead ends, in genealogy research by matching with new relatives;34 and professional interests (e.g., teaches genetics or is a GC). Notably, 81% of respondents (787/969) rated desire to have their raw genetic data file as either a very important or somewhat important motivation to pursue DTC testing.

The prevalence of raw data access as a motivation was borne out in the rate of data download: 26% (252/974) of respondents reported they had not or were unsure whether they had downloaded their raw data from at least one of the DTC tests they had taken (note respondents were asked about download separately for each DTC test taken). Of these 252, 148 had downloaded data from another one of the DTC tests they had taken, leaving 104 (11% of 974) who had not downloaded any of their available raw data files. Demographically, non-downloaders did not differ from downloaders; however, in pursuing DTC testing, downloaders were more motivated to find relatives (β = 0.839, p = 0.0007) and obtain raw data (β = 1.49, p = 9.62e-08) compared to non-downloaders (see also Figure S2).

Third-Party Tool Use
Impressions and Follow-up Actions
A total of 820 respondents who downloaded raw data also reported using at least one tool and formed the basis for subsequent analyses. Most used multiple tools, with 76% (623/820) using two or more (median number of tools = 3, max number = 11). Thirteen tools were specified in fixed-response survey questions, though respondents could note additional tools via free text. The most commonly used tools were GEDmatch (n = 688 respondents), Promethease (n = 515), and DNA.Land (n = 450; see Table S1). Additional tools most frequently noted in the open text box were WeGene (n = 39), FamilyTreeDNA (n = 35), and MyHeritage (n = 34).

Respondents’ impressions of the information they received varied by tool (see Figure 1). Promethease users overwhelmingly agreed they received information about health (171/186, or 92%) and about risk of specific disease (180/186, or 97%). Notably, some respondents indicated receiving information outside a tool’s scope: 16% of GEDmatch users (83/505) agreed they received some health information while 37% of Promethease users (69/188) agreed they received results related to ancestry or genealogy. (It is possible some respondents who used multiple tools did not limit responses to the selected tool as directed in survey instructions.) Across all tools, respondents largely agreed using the tool increased their

Table 1. Survey Respondent Characteristics
Variable | Overall | All Tool Users | Non-health Only Tools | Crossover | Health Only Tools | Test Statistic (a) | p Value (a)
Number of respondents | 1,137 | 820 (b) | 263 | 458 | 98
Mean age (SD; range) | 46.4 (15; 18–>89) | 46.7 (15; 18–84) | 51.8 (14; 18–83) | 45.5 (15; 18–84) | 39.4 (12; 20–73) | 27.2 | 3.87e-12
Gender
  Women (%) | 67.4 | 67.1 | 69.8 | 68.7 | 53.3 | 8.99 | 0.0111
Race (%)
  Asian | 1.8 | 1.6 | 0.9 | 1.9 | 2.2 | N/A | 0.0650
  Black or African American | 1.7 | 1.6 | 0.9 | 2.1 | 1.1
  Hawaiian or Pacific Islander | 0.1 | 0.1 | 0.0 | 0.2 | 0.0
  White | 81.6 | 80.6 | 76.2 | 81.7 | 86.7
  Other | 3.7 | 4.0 | 6.8 | 2.8 | 2.2
  Prefer no answer | 2.2 | 2.4 | 4.7 | 1.4 | 1.1
  Multiple (c) | 8.9 | 9.8 | 10.6 | 10.0 | 6.7
Hispanic/Latino (%) | 6.7 | 6.6 | 8.1 | 6.0 | 5.6 | N/A | 0.677
Lives in US (%) | 75.9 | 74.9 | 74.5 | 74.2 | 78.9 | 1.45 | 0.485
Max education (%)
  Less than high school | 1.0 | 1.1 | 2.1 | 0.5 | 1.1 | 12.3 | 0.139
  High school graduate or GED | 26.1 | 26.8 | 27.0 | 28.1 | 20.0
  College degree | 41.0 | 41.3 | 39.9 | 42.2 | 41.1
  Master’s degree | 23.1 | 22.8 | 21.5 | 23.2 | 24.4
  Doctorate/terminal degree | 8.9 | 8.1 | 9.4 | 6.0 | 13.3
Occupation (%)
  Business, financial, management, sales | 14.2 | 14.2 | 15.0 | 15.1 | 7.8 | 31.0 | 0.0556
  Computer, engineering, math | 16.9 | 17.1 | 16.7 | 15.5 | 25.6
  Life, physical, and social science | 9.2 | 8.2 | 6.8 | 7.2 | 15.6
  Legal | 2.5 | 2.6 | 0.9 | 3.5 | 3.3
  Education, training, library | 14.3 | 13.6 | 13.2 | 15.3 | 6.7
  Arts, design, entertainment, sports, media | 4.7 | 4.6 | 3.4 | 5.1 | 5.6
  Healthcare practitioner | 8.9 | 8.9 | 10.3 | 7.7 | 11.1
  Office, administrative support | 7.7 | 7.5 | 6.4 | 8.8 | 4.4
  Construction, maintenance, natural resources | 1.9 | 1.9 | 2.6 | 1.6 | 1.1
  Production and transportation | 1.5 | 1.7 | 2.1 | 1.6 | 1.1
  Other | 18.2 | 19.7 | 22.6 | 18.6 | 17.8
Works in genetic research/medicine (%) | 5.1 | 3.8 | 3.0 | 2.3 | 13.3 | 24.8 | 4.11e-06
Participant in genetic research (%) | 14.9 | 16.2 | 11.5 | 19.0 | 15.6 | 7.68 | 0.0215

Survey respondent characteristics, overall and grouped by type(s) of tools used. For categorical variables, values are given as within-group percentage, excluding NA/missing values from the denominator. Statistical tests of difference are reported for comparison between groups of tool users: users of non-health only tools, crossover users (used both health and non-health tools), and users of health-only tools. SD indicates standard deviation.
(a) Comparing three groups of tool users: non-health only tool users, crossover tool users, and health-only tool users. For categorical values where all cell counts >5, the test statistic and p value are from a chi-square test. For categorical values with low cell counts (i.e., race and Hispanic/Latino), the p value is from a Fisher exact test, and the test statistic is N/A. ANOVA was used to compare continuous variables; the test statistic given is the F value.
(b) Of the 820 respondents who used at least one tool, one respondent reported using only “various R packages” and could therefore not be assigned a tool user group (non-health only; crossover; health only).
(c) Respondents who checked more than one box for self-identified race are counted under “Multiple.” Note all American Indian/Alaska Native respondents checked more than one box and are therefore all counted under “Multiple” here.
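
The three tool-user groups compared in Table 1 can be illustrated with a minimal sketch. It is not the authors' R code. The tool-to-category mapping below is a partial, illustrative assumption that lists only Promethease (health) and GEDmatch (non-health); the study's full categorization is in its Table S1, which is not reproduced here. The function and variable names are hypothetical.

```python
# Illustrative, partial categorization (an assumption for this sketch).
# The study's full tool categorization is in its Table S1.
HEALTH_TOOLS = {"Promethease"}
NON_HEALTH_TOOLS = {"GEDmatch"}


def assign_user_group(tools_used):
    """Return the tool-user group for one respondent, or None if unassignable."""
    tools = set(tools_used)
    uses_health = bool(tools & HEALTH_TOOLS)
    uses_non_health = bool(tools & NON_HEALTH_TOOLS)
    if uses_health and uses_non_health:
        return "crossover"
    if uses_health:
        return "health only"
    if uses_non_health:
        return "non-health only"
    # Only uncategorized tools were reported, as for the one respondent
    # who listed "various R packages".
    return None


# Group sizes as reported in Table 1.
group_sizes = {"non-health only": 263, "crossover": 458, "health only": 98}
assigned_total = sum(group_sizes.values())
```

Under these assumptions, the three group sizes sum to 819, one fewer than the 820 tool users, which is consistent with the single unassigned respondent described in the table note.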

Table 2. DTC Tests Ordered and DTC Testing Motivations
Variable | Overall | All Tool Users | Non-health Only Tools | Crossover | Health Only Tools | Test Statistic (a) | p Value (a)
Number of respondents | 1,137 | 820 | 263 | 458 | 98
DTC Tests Ordered (%)
  23andMe | 62.4 | 61.7 | 40.7 | 68.1 | 87.8 | 85.3 | <2.2e-16
  AncestryDNA | 53.9 | 59.8 | 73.0 | 62.0 | 14.3 | 104.5 | <2.2e-16
  FamilyTreeDNA | 26.8 | 30.2 | 37.3 | 32.1 | 3.1 | 41.2 | 1.14e-09
Rating of DTC Testing Motivations (b) (% Very Important)
  General curiosity | 68.0 | 67.6 | 55.5 | 71.4 | 82.7 | 30.7 | 2.1e-07
  Ancestry | 66.0 | 66.3 | 69.6 | 69.8 | 41.2 | 31.1 | 1.8e-07
  Find relatives | 46.9 | 49.9 | 62.4 | 50.4 | 14.3 | 66.1 | 4.4e-15
  Risk for specific diseases | 31.4 | 30.9 | 16.1 | 34.2 | 54.1 | 53.9 | 1.97e-12
  Limited family health history | 20.7 | 20.7 | 21.4 | 21.0 | 17.7 | 0.621 | 0.733
  Other family members are using | 10.7 | 10.7 | 10.8 | 11.0 | 8.3 | 0.656 | 0.720
  Participate in research | 32.3 | 31.6 | 26.3 | 33.8 | 34.7 | 4.79 | 0.0911
  Raw genetic data file | 50.8 | 53.1 | 44.4 | 56.4 | 60.4 | 11.9 | 0.00263

Comparisons of DTC tests ordered and DTC testing motivations, overall and grouped by type(s) of tools used.
(a) Comparing non-health only tool users, crossover tool users, and health-only tool users. Test statistics and p values are from chi-square tests.
(b) Responses to DTC testing motivations were dichotomized into “very important” versus “somewhat important” or “not at all important.”
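
The dichotomization described in the table note can be sketched as follows. This is an illustrative Python sketch rather than the authors' R code. The function names are hypothetical, the example ratings are made up for demonstration, and missing answers are represented as `None` on the assumption that they are dropped from the denominator, as the paper describes for non-missing data.

```python
def dichotomize(rating):
    """Map a three-level importance rating to 1 ("very important") or 0."""
    return 1 if rating == "very important" else 0


def percent_very_important(ratings):
    """Percentage rated "very important", skipping missing (None) answers."""
    answered = [r for r in ratings if r is not None]
    if not answered:
        return None
    return 100.0 * sum(dichotomize(r) for r in answered) / len(answered)


# Made-up responses for illustration only (not study data).
example_ratings = [
    "very important",
    "somewhat important",
    None,
    "not at all important",
    "very important",
]
example_pct = percent_very_important(example_ratings)

# The same calculation applied to counts reported in the text:
# 661 of 972 respondents rated general curiosity as very important.
general_curiosity_pct = round(100 * 661 / 972)
```

With the reported counts, the last line gives 68, matching the percentage quoted for general curiosity.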

understanding of genetics in general (593/778 or 76% agreed) and of how DTC companies interpret genetic data (522/775 or 67% agreed). The majority also felt satisfied with the information received (678/770 or 88%), though some reported feeling confused (266/767 or 35%) or upset (48/758 or 6%). These reactions were relatively consistent across tools (see Figure S3).

The most common follow-up actions were sharing the results with a family member (664/780 or 85% shared) or non-family member friend or loved one (552/778 or 71% shared). Other common follow-up actions included pursuing additional analysis via a different tool (430/776 or 55%), pursuing more genetic testing (253/776 or 33%), or participating in a genetic research study (275/775 or 35%). Few respondents made changes to either health insurance (7/770 or 0.9% changed) or other types of insurance (i.e., life or long-term care; 8/771 or 1% changed). Fifteen percent of respondents (116/775) indicated sharing results with a health care provider (HCP), most commonly a general practitioner (92/116, or 79% of those who shared). As might be expected based on the differing types of information offered by each tool, the reported rate of HCP sharing differed significantly by tool: 8.1% (39/484) of GEDmatch users shared while 31% (56/183) of Promethease users shared (Chi-square test statistic = 53.4, p = 2.69e-13). Notably, across all respondents who shared with a HCP, 13 (11% of 116) reported sharing results with a medical geneticist or GC. A total of 52 respondents (45% of 116) wrote in an additional type of HCP with whom they shared results, including practitioners from non-genetics specialties (e.g., cardiologist, gastroenterologist, ophthalmologist), psychiatry/psychology, and alternative medicine (e.g., naturopath, acupuncturist).

Use of Specific Third-Party Tools
Next, we sought further understanding of what factors influenced an individual to use a given tool or combination of tools, in particular when respondents used multiple tools across the domains of health, ancestry, and genealogy. We performed a series of logistic regression analyses of tool use, separately for the 7 tools with at least 40 users: GEDmatch, Promethease, DNA.Land, openSNP, Genetic Genie, Interpretome, and Livewello (see Table S2). Desire to learn about ancestry was significantly and positively associated with GEDmatch (β = 0.957, p = 3.76e-05) and DNA.Land (β = 0.447, p = 0.00775). Desire to learn about personal risk for specific diseases was positively associated with use of Promethease (β = 0.604, p = 0.00146), Genetic Genie (β = 1.23, p = 0.000286), and Livewello (β = 1.30, p = 0.000766). Desire to obtain raw genetic data was positively associated with use of DNA.Land (β = 0.517, p = 0.00176) and Interpretome (β = 0.854, p = 0.0166). Survey recruitment venue was strongly associated with tool use, as GEDmatch and DNA.Land users were more likely to have been recruited via Facebook while Promethease users were more likely to have been recruited from Reddit or the openSNP newsletter. DTC test(s) ordered were also linked to tool use (see also Figure 2), with the most significant associations between 23andMe and Promethease (β = 0.791, p = 3.e-06) and FamilyTreeDNA and GEDmatch (β = 1.53, p = 0.000199).
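
The comparison of HCP sharing between GEDmatch users (39 of 484) and Promethease users (56 of 183) reported earlier in this section gives enough counts to rework the chi-square test by hand. The sketch below is an illustration in Python, not the authors' R code. It assumes a Pearson chi-square test on the 2×2 table with a Yates continuity correction, which is the default for 2×2 tables in R's `chisq.test`; the paper does not state whether a correction was applied. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Pearson chi-square (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    stat = 0.0
    for obs, row, col in zip(observed, row_totals, col_totals):
        expected = row * col / n
        diff = abs(obs - expected)
        if yates:
            # Continuity correction: shrink each deviation by 0.5, not below 0.
            diff = max(0.0, diff - 0.5)
        stat += diff * diff / expected
    # Upper tail of a chi-square with 1 df: P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts from the text: shared vs. did not share, by tool.
ged_shared, ged_total = 39, 484
prom_shared, prom_total = 56, 183

stat, p = chi_square_2x2(
    ged_shared, ged_total - ged_shared,
    prom_shared, prom_total - prom_shared,
)
stat_uncorrected, _ = chi_square_2x2(
    ged_shared, ged_total - ged_shared,
    prom_shared, prom_total - prom_shared,
    yates=False,
)
```

Under these assumptions the corrected statistic comes out near 53.4 with a p value near 2.7 × 10^-13, in line with the reported result, while the uncorrected statistic is somewhat larger (about 55). This agreement suggests, but does not confirm, that a continuity correction was used in the original analysis.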

Figure 1. Perceived Information Received from Tools
Agreement with statements about information received from TPI tools, separately by tool, over 820 respondents who reported using at least one tool. Counts plotted to the right of x = 0 are for agreement; counts plotted to the left of x = 0 are for disagreement.

Categories of Third-Party Tool Use
The large proportion of respondents using multiple third-party tools, including tools spanning the disparate categories of health, ancestry, and genealogy, led us to next examine combinations of tools used. We grouped respondents into those using only health-related tools (n = 98 respondents); only non-health tools (n = 263); and those using both types (n = 458), which we refer to as the “crossover group” (see Table S1 for tool characterization). We present differences in DTC testing motivations, DTC test(s) ordered, and demographics between the three groups (health-only tool users, non-health only, and crossover group) in Tables 1 and 2. There was a linear trend in age, where non-health only tool users were oldest (mean age = 51.8), health only tool users were youngest (mean age = 39.4), and the crossover group was in between (mean age = 45.5). The health-only tool group had a significantly lower proportion of women (53% versus >68% in the other two groups) and higher proportion of respondents working in genetic research or medicine (13% versus ≤3% in the other two groups). The crossover group had a slightly higher proportion of respondents participating in (non-DTC) genetic research (19%) compared to the health only (15.6%) or non-health only (11.5%) user groups (p = 0.0215). DTC test(s) ordered were significantly different between the three groups, with the proportion of 23andMe customers increasing from the non-health group (40.7%) to crossover group (68.1%) to health-only group (87.8%) and a reverse pattern for both AncestryDNA and FamilyTreeDNA (proportion decreasing from non-health to crossover to health-only; see Table 2). Notable associations between DTC testing motivations and user group included increasing importance of general curiosity about genetics from non-health to crossover to health-only. Interest in having raw data was highest among the health-only tool group and lowest in the non-health only, suggesting that those who exclusively used health-related TPI tools were more highly motivated by raw data access when initially pursuing DTC testing.

Interviews with Crossover Tool Users
In follow-up interviews with respondents in the crossover group, we observed tool crossover in both directions: three initially used health-related tools before trying non-health tools, and four initially used non-health tools before trying health-related tools, illustrated by quotes in Box 1. In contrast to those who began with an interest in one domain then moved to another, for three interviewees who were either adopted or had an adopted parent, interest in health and genealogy were inextricably bound together in their search to learn simultaneously about their biological family and about potential implications for their own health risks.

The primary phenomena responsible for tool crossover were social networking, general curiosity, and initial lack of interesting findings. Facebook and Reddit were the primary social networks in which participants learned about multiple tools, including those seemingly outside the scope of the given Facebook group or sub-Reddit. Once participants learned about additional tools, they tried them often out of curiosity or general hunger to learn more, in particular to go beyond the information provided in DTC company reports. An initial lack of interesting findings in one domain also pushed some participants to seek out tools in another area.

Discussion

In this sample of DTC customers, we found high rates of raw data download and usage of TPI tools. Given some of our recruitment venues, this volume of TPI tool use may be expected; however, the scale and scope were notable. Specifically, respondents reported using on average three different TPI tools, often spanning a range of health, ancestry, and genealogy. Users of different tool categories (health, non-health, and both) differed in their demographics, motivations for DTC testing, and DTC tests taken. In follow-up interviews with a subset of crossover users (i.e., those using both health and non-health tools),

Figure 2. Tools Used Based on DTC
Test(s) Taken
Results are plotted for 820 respondents
who reported using at least one TPI tool.
Darker blue shading on bars indicates tools
that offer health-related information (see
Table S1).

we observed individuals often migrated to using tools outside their original scope of interest due to peer-to-peer sharing on social networks, general curiosity, and/or initial lack of interesting results. Below we discuss implications of these findings for the professional genetics community, including both researchers and providers.

Our findings suggest that in the research context, returning raw data to participants could present both a benefit and a liability to researchers. Raw data return may benefit the research community by increasing rates of participation in genetic research and engagement with and enthusiasm for genetics more broadly.12 In our survey, 35% of tool users reported that using third-party tools led them to participate in a genetic research study, while 76% agreed that it increased their understanding of genetics in general. The potential liability for the research community is that third-party tools vary widely in the quality, scope, and complexity of information returned,15,35 in addition to having variable data security and privacy practices.36,37 Addressing potential legal liability is outside the scope of this paper, but we suggest that researchers may bear some ethical responsibility if participants are harmed by downstream third-party tool usage in the way some DTC customers have been harmed.16–19 Notably, in our results, only 6% of respondents reported feeling upset by information received from third-party tools while 35% felt confused. While not reported by a majority of our respondents, these negative outcomes merit further study and consideration of how to avoid and/or mitigate them.

As with DTC testing, one concern with TPI tool usage is downstream overutilization of scarce health care resources, or “raiding of the medical commons.”38 Indeed, of the few accounts of TPI tool usage to date, several have focused on interactions with the health care system.20–22 These prior studies rightly illustrate the potentially alarming outcomes of patients’ misunderstandings of TPI reports; however, they do not indicate how often these scenarios result from “garden variety” TPI tool usage. In the current study, we observed a relatively low reported rate of sharing TPI results with HCPs (15%). This rate was lower than the 30% (of 321 surveyed) previously reported by Wang et al.,25 which may be due to our participants responding based on a specific tool rather than across all tools used. Indeed, when limiting to respondents who answered based on Promethease, our rate of HCP sharing was comparable (31%). Likely of concern to genetics professionals is that among respondents who shared with HCPs, the majority (79% of 116) did so with general practitioners or non-genetics specialists, rather than with medical geneticists or GCs. This will present challenges for primary care physicians, especially given that TPI reports are typically longer and harder to digest than DTC company reports.15,20,35

Regardless of whether they brought TPI reports to providers, respondents were frequently engaging with health information via TPI (i.e., 68% of tool users used health-related tools; see also Table S1). This was true even for respondents who were initially intent on finding relatives and receiving genetic ancestry percentages. Due to the flow of information on social media platforms such as Facebook and Reddit, respondents who started off using tools in one domain often switched or “crossed over” to using tools in another. This has implications for those whose TPI reports eventually do prompt them to interact with the health care system. For example, those initially interested in genealogy who later use health tools may be more likely to overestimate the reliability and comprehensiveness of the health information. There is relatively little uncertainty in identifying close relatives from genotyping array information; as one interviewee said about genetic genealogy, “DNA doesn’t lie.” This is in stark contrast to disease prediction based on similar data, which is far more probabilistic and uncertain. Furthermore, the raw data file is incomplete given that it is often based on array genotyping rather than genome sequencing, and may even be incorrect.22 The opposite effect is also possible: those involved in both paper trail and genetic genealogy may realize that genetics alone does not provide complete information when researching family history and so may

Box 1. Interview Quotes (Participant ID in Parentheses)

Initial Use of Health-Related Tools:

I was interested in finding out exactly what my DNA meant, not just what 23andMe wanted to tell me, but maybe other information I could glean from it... I think I used Promethease first... from what I could tell, it was the most information... [On additionally using GEDmatch] I may have just Googled “things to do with raw DNA.” I just wanted something different from Promethease. Specifically, I wanted to see if I could find matches other than what 23andMe had found for me. (574)

I got quite frustrated how boring I am in terms of my ancestry [from DTC reports], there’s hardly any variation... I think that’s why I decided to concentrate more on the health side of things rather than the rather dull ancestry part, because I was hoping to find something quite exciting and something I didn’t know about before, and the health side provided that, but the ancestry didn’t... [On additionally using GEDmatch and DNA.Land] On a lot of these forum posts, they tend to list the third party websites along with their cost. GEDmatch and DNA.Land come up frequently in these lists on the posts. (58)

Initial Use of Non-health Related Tools:

I was primarily focusing on genealogy at that point [when first using TPI tools]... I may have found Promethease mentioned online or in an article I read... I just said, “Oh, that’s cool.” So I did that... It’s just general interest, I just wanted to see if there was anything to the health thing, and I don’t carry anything specific... I didn’t think I did, but it’s good to know. But the genealogy thing is what I’ve been focusing on mostly, because the health is all like, “Ok, I know. I could get heart disease.” I don’t have any “bad genes,” as they say, the high correlation ones. I don’t really have any of those. (798)

I wasn’t doing it [using TPI tools] for health at all and just when I saw the Promethease and I thought, “Well for five dollars, you know, we’ll see what it says.” (559)

When I did download [raw data] for the first time, it was specifically to upload to GEDmatch... After that, I explored what else I could do and found out that I could get more matches by doing free uploads to other sites. I uploaded to DNA.Land, Geni, MyHeritage and FamilyTreeDNA. Through membership of a genealogy Facebook group, I found out about Promethease and did that out of curiosity. (664)

Inextricably Linked (e.g., for Adoptees):

I’m adopted, and I know a limited amount about my biological mother. I wanted to see if I could fish around and find out what was going on the other side, find anybody that I was connected to. I was doing all kinds of testing... partly I was trying to take control of whatever health information I could get, but then also fishing around for family connections. I was doing some of each. (577)

Partially it was ancestry, partially it was finding my bio family, partially it was finding out what [family history] boxes can I check now when I go to the doctor’s office. It’s a little bit of everything which all comes together for a big ball of happy... It’s not any one single reason. (226)

be better equipped to understand limitations of health-related genetic information. Indeed, some individuals who primarily pursue ancestry testing over health-related testing may do so because they perceive the latter to lack accuracy and utility. Providers who better understand the course patients have taken to leverage their raw data via TPI may be better equipped to calibrate and manage patient expectations and understanding.

Another important finding is that approximately 40% of tool users agreed they had received health information, including about disease risk. This contrasts with our prior study of tool developers in which they characterized tools’ direct linking to scientific publications or variant annotation databases as merely “bridging to the literature” and hence stopping short of actual interpretation.15 However, our survey data suggest users regard TPI reports as providing personally relevant health information. At the same time, developers’ claims that “bridging” may increase understanding of genetic risk15,35 were supported by our survey results: the majority of respondents agreed that using TPI tools increased their understanding of genetics in general (76%) and in particular how DTC companies interpret genetic data (67%).

To our knowledge, this is the largest study to date of raw data and TPI tool usage. However, our recruitment of a convenience sample via social media limits the ability to generalize findings to DTC customers more broadly, as it is difficult to know how our survey respondents compare to the larger population of DTC customers. It is reasonable to infer, however, that our respondents are likely highly motivated individuals who were active in these online forums, and thus our results may overestimate the extent of data download and tool usage. The relatively small number of non-downloaders may not have provided enough power to detect differences with downloaders. However, we contend that limited generalizability is mitigated in part by our collection of qualitative data through open text survey responses and follow-up interviews, which generated deeper understanding of users’ motivations and experiences. Convenience sampling also allowed a rapid collection of a large number of respondents already engaged in the topics of interest, which seems appropriate for gaining preliminary insight into a relatively understudied area. Furthermore, by recruiting from social media venues where individuals were likely to be engaging with raw data and third-party interpretation, we were able to observe more of the phenomena of interest. The length of the survey may have contributed to the 24% non-completion rate and therefore potential survey item response bias, though compared to prior surveys,25 we collected more extensive and granular information about specific tool usage. Some incongruities between information reportedly received and the actual offerings of tools suggest that respondents may not have limited responses

to the selected tool as directed; however, these incongruities were not widespread.

We have focused on the consumer genomics context as it is currently the most common way to access raw data. However, this work can help explore how broadening routes of access may unfold. We have discussed return of raw data from genetic research above, but another potential route is through clinical sequencing. Since 2014, the HIPAA direct access right has allowed individuals to access the contents of their designated record sets,39 which for clinical sequencing laboratories would likely include uninterpreted sequence data.40 Laboratories are not required to provide additional explanation or interpretation, which may lead recipients to seek out TPI. Indeed, many TPI tools accept the Variant Call Format (VCF) file type common to genome and exome sequencing.15 Future research should evaluate how individuals’ interactions with their raw data potentially differ across the contexts of DTC testing, clinical sequencing, and return of results from research.

In summary, moving forward, individuals will have an increasing number of routes to access their raw genetic data and leverage it in an expanding menu of largely unregulated TPI services. These activities raise a set of concerns related to but distinct from DTC genetic testing and thus merit further investigation to more fully understand potential harms and benefits. Rather than taking sides in a potential ensuing “culture war” about raw data,41 the professional genetics community has an opportunity to proactively engage with users, understand the complexity of their motivations for pursuing third-party analysis, and ultimately educate them about potential limitations.

Supplemental Data

Supplemental Data can be found online at https://doi.org/10.1016/j.ajhg.2019.05.014.

Acknowledgments

We thank our participants as well as the administrators of the Facebook groups and openSNP newsletter who aided in recruitment. Interview transcription was supported by funds from the UW Institute for Public Health Genetics. This work was partially supported by the National Human Genome Research Institute (NHGRI) and the National Cancer Institute (NCI) CSER Consortium, U01 HG006507 and U24 HG007307 (Jarvik, PI). This research used statistical consulting resources provided by the Center for Statistics and the Social Sciences at UW. REDCap and the Participant Portal at ITHS are supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number UL1 TR002319.

Declaration of Interests

The authors declare no competing interests.

Received: March 12, 2019
Accepted: May 20, 2019
Published: June 13, 2019

Web Resources

Service used to transcribe audio recordings from participant interviews, https://www.rev.com
Survey instrument, quantitative survey dataset, and R analysis code, http://doi.org/10.3886/E105721V3

References

1. Thorogood, A., Bobe, J., Prainsack, B., Middleton, A., Scott, E., Nelson, S., Corpas, M., Bonhomme, N., Rodriguez, L.L., Murtagh, M., Kleiderman, E.; and Participant Values Task Team of the Global Alliance for Genomics and Health (2018). APPLaUD: access for patients and participants to individual level uninterpreted genomic data. Hum. Genomics 12, 7.
2. Nelson, S. (2016). Geneticists should offer data to participants. Nature 539, 7–7.
3. Lunshof, J.E., Church, G.M., and Prainsack, B. (2014). Information access. Raw personal data: providing access. Science 343, 373–374.
4. Beskow, L.M., and Burke, W. (2010). Offering individual genetic research results: context matters. Sci. Transl. Med. 2, 38cm20.
5. Bredenoord, A.L., Kroes, H.Y., Cuppen, E., Parker, M., and van Delden, J.J.M. (2011). Disclosure of individual genetic data to research participants: the debate reconsidered. Trends Genet. 27, 41–47.
6. Wright, C.F., Middleton, A., Barrett, J.C., Firth, H.V., FitzPatrick, D.R., Hurles, M.E., and Parker, M. (2017). Returning genome sequences to research participants: Policy and practice. Wellcome Open Res 2, 15.
7. Middleton, A., Wright, C.F., Morley, K.I., Bragin, E., Firth, H.V., Hurles, M.E., Parker, M.; and DDD study (2015). Potential research participants support the return of raw sequence data. J. Med. Genet. 52, 571–574.
8. Karow, J. (2018). All of Us Program Plans to Return Disease Variants, PGx Results, Primary Genomic Data (GenomeWeb).
9. National Institutes of Health (2018). Informational Webinar on the “All of Us Genetic” Counseling Resource Funding Announcement, https://allofus.nih.gov/sites/default/files/genetic_counseling_resource_webinar.pdf.
10. NASEM (2018). Returning Individual Research Results to Participants: Guidance for a New Research Paradigm (Washington, D.C.: National Academies Press).
11. Haeusermann, T., Fadda, M., Blasimme, A., Tzovaras, B.G., and Vayena, E. (2018). Genes wide open: Data sharing and the social gradient of genomic privacy. AJOB Empir. Bioeth. 9, 207–221.
12. Haeusermann, T., Greshake, B., Blasimme, A., Irdam, D., Richards, M., and Vayena, E. (2017). Open sharing of genomic data: Who does it and why? PLoS ONE 12, e0177158.
13. Ball, M.P., Bobe, J.R., Chou, M.F., Clegg, T., Estep, P.W., Lunshof, J.E., Vandewege, W., Zaranek, A., and Church, G.M. (2014). Harvard Personal Genome Project: lessons from participatory public research. Genome Med. 6, 10.
14. Swan, M. (2012). Crowdsourced health research studies: an important emerging complement to clinical trials in the public health research ecosystem. J. Med. Internet Res. 14, e46.
15. Nelson, S.C., and Fullerton, S.M. (2018). “Bridge to the Literature”? Third-Party Genetic Interpretation Tools and the Views of Tool Developers. J. Genet. Couns. 27, 770–781.

16. Kolata, G. (2018). The Online Gene Test Finds a Dangerous Mutation. It May Well Be Wrong. New York Times, July 3, 2018. D1. https://www.nytimes.com/2018/07/02/health/gene-testing-disease-nyt.html.
17. Almendrala, A. (2018). Home Genetic Tests May Be Riddled With Errors, And Companies Aren’t Keeping Track. Huffingt. Post, April 3, 2018. https://www.huffpost.com/entry/home-genetic-test-false-positives_n_5ac27188e4b04646b6451c42.
18. Hercher, L. (2018). 23andMe Said He Would Lose His Mind. Ancestry Said the Opposite. Which Was Right? New York Times, September 16, 2018. SR7. https://www.nytimes.com/2018/09/15/opinion/sunday/23andme-ancestry-alzheimers-genetic-testing.html.
19. Matloff, E. (2019). I Had Lynch Syndrome For 30 Hours. Forbes, February 12, 2019. https://www.forbes.com/sites/ellenmatloff/2019/02/12/i-had-lynch-syndrome-for-30-hours-2/#66ed644a2567.
20. Allen, C.G., Gabriel, J., Flynn, M., Cunningham, T.N., and Wang, C. (2018). The impact of raw DNA availability and corresponding online interpretation services: A mixed-methods study. Transl. Behav. Med. 8, 105–112.
21. Moscarello, T., Murray, B., Reuter, C.M., and Demo, E. (2019). Direct-to-consumer raw genetic data and third-party interpretation services: more burden than bargain? Genet. Med. 21, 539–541.
22. Tandy-Connor, S., Guiltinan, J., Krempely, K., LaDuca, H., Reineke, P., Gutierrez, S., Gray, P., and Tippin Davis, B. (2018). False-positive results released by direct-to-consumer genetic tests highlight the importance of clinical confirmation testing for appropriate patient care. Genet. Med. 20, 1515–1521.
23. Erlich, Y., Shor, T., Pe’er, I., and Carmi, S. (2018). Identity inference of genomic data using long-range familial searches. Science 362, 690–694.
24. Ram, N., Guerrini, C.J., and McGuire, A.L. (2018). Genealogy databases and the future of criminal investigation. Science 360, 1078–1079.
25. Wang, C., Cahill, T.J., Parlato, A., Wertz, B., Zhong, Q., Cunningham, T.N., and Cummings, J.J. (2018). Consumer use and response to online third-party raw DNA interpretation services. Mol. Genet. Genomic Med. 6, 35–43.
26. Yuan, J., Gordon, A., Speyer, D., Aufrichtig, R., Zielinski, D., Pickrell, J., and Erlich, Y. (2018). DNA.Land is a framework to collect genomes and phenomes in the era of abundant genetic information. Nat. Genet. 50, 160–165.
27. Greshake, B., Bayer, P.E., Rausch, H., and Reda, J. (2014). openSNP–a crowdsourced web resource for personal genomics. PLoS ONE 9, e89204.
28. Carere, D.A., Couper, M.P., Crawford, S.D., Kalia, S.S., Duggan, J.R., Moreno, T.A., Mountain, J.L., Roberts, J.S., Green, R.C.; and PGen Study Group (2014). Design, methods, and participant characteristics of the Impact of Personal Genomics (PGen) Study, a prospective cohort study of direct-to-consumer personal genomic testing customers. Genome Med. 6, 96.
29. Krosnick, J.A., and Presser, S. (2010). Question and Questionnaire Design. In Handbook of Survey Research, P.V. Marsden and J.D. Wright, eds. (Emerald), pp. 263–313.
30. Harris, P.A., Taylor, R., Thielke, R., Payne, J., Gonzalez, N., and Conde, J.G. (2009). Research electronic data capture (REDCap)–a metadata-driven methodology and workflow process for providing translational research informatics support. J. Biomed. Inform. 42, 377–381.
31. R Core Team (2013). R: A language and environment for statistical computing (R Foundation for Statistical Computing).
32. Baptista, N.M., Christensen, K.D., Carere, D.A., Broadley, S.A., Roberts, J.S., and Green, R.C. (2016). Adopting genetics: motivations and outcomes of personal genomic testing in adult adoptees. Genet. Med. 18, 924–932.
33. Koeller, D.R., Uhlmann, W.R., Carere, D.A., Green, R.C., Roberts, J.S.; and PGen Study Group (2017). Utilization of Genetic Counseling after Direct-to-Consumer Genetic Testing: Findings from the Impact of Personal Genomics (PGen) Study. J. Genet. Couns. 26, 1270–1279.
34. Kirkpatrick, B.E., and Rashkin, M.D. (2017). Ancestry Testing and the Practice of Genetic Counseling. J. Genet. Couns. 26, 6–20.
35. Badalato, L., Kalokairinou, L., and Borry, P. (2017). Third party interpretation of raw genetic data: an ethical exploration. Eur. J. Hum. Genet. 25, 1189–1194.
36. Ney, P.M., Ceze, L., and Kohno, T. (2018). Computer Security Risks of Distant Relative Matching in Consumer Genetic Databases. ArXiv, 1810.02895.
37. Hazel, J.W., and Slobogin, C. (2018). Who Knows What, and When?: A Survey of the Privacy Policies Proffered by U.S. Direct-to-Consumer Genetic Testing Companies. Cornell J. Law Public Policy 28, 35–66.
38. McGuire, A.L., and Burke, W. (2008). An unwelcome side effect of direct-to-consumer personal genome testing: raiding the medical commons. JAMA 300, 2669–2671.
39. U.S. DHHS (2014). CLIA Program and HIPAA Privacy Rule; Patients’ Access to Test Reports. 42 CFR § 493, 45 CFR § 164.
40. U.S. DHHS (2016). Individuals’ right under HIPAA to access their health information 45 CFR § 164.524.
41. Evans, J.P., and Green, R.C. (2009). Direct to consumer genetic testing: Avoiding a culture war. Genet. Med. 11, 568–569.
