How top health websites are sharing sensitive data with advertisers

Madhumita Murgia and Max Harlow 5 HOURS AGO

Some of the UK’s most popular health websites are sharing people’s sensitive data,
including medical symptoms, diagnoses, drug names and menstrual and fertility
information, with dozens of companies around the world, ranging from ad-targeting
giants such as Google, Amazon, Facebook and Oracle to lesser-known data brokers and
adtech firms like Scorecard and OpenX.

Using open-source tools to analyse 100 health websites, which include WebMD,
Healthline, BabyCentre and Bupa, an FT investigation found that 79 per cent of the sites
dropped “cookies”, small pieces of data that, when stored in your browser, allow
third-party companies to track individuals around the internet. This was done without
the consent that is a legal requirement in the UK.

Google’s advertising arm DoubleClick was by far the most common destination for
data, showing up on 78 per cent of the sites tested, followed by Amazon, which was
present in 48 per cent of cases, Facebook, Microsoft and adtech firm AppNexus.

“These findings are quite remarkable, and very concerning,” said Wolfie Christl, a
technologist and researcher who has been investigating the adtech industry. “From my
perspective, this kind of data is clearly sensitive, has special protections under the
[General Data Protection Regulation] and transmitting this data most likely violates the
law.”

Health for sale


For centuries, physicians have sworn the Hippocratic oath, to keep secret “whatever I
see or hear in the lives of my patients”.

But hundreds of millions of people now turn to the web each day to allay their medical
worries, which range from the mundane to the grave. Despite the illusion of privacy
that exists between users and their computers, the reality is starkly different.

Digging deeper into 10 of the sites, chosen to reflect the different types of health
information they offer to users, the FT looked at the types of data they were sharing.

The investigation excluded data sent to analytics companies to improve the
performance of a website, and consent was given for cookies on all the websites that
requested it. The privacy policies the FT reporters consented to did not adequately
outline that this sensitive data would be shared with third parties, however, or for what
purposes.

The data shared included:

- drug names entered into Drugs.com were sent to Google’s ad unit DoubleClick.
- symptoms entered into WebMD’s symptom checker, and diagnoses received, including “drug overdose”, were shared with Facebook.
- menstrual and ovulation cycle information from BabyCentre ended up with Amazon Marketing, among others.
- keywords such as “heart disease” and “considering abortion” were shared from sites like the British Heart Foundation, Bupa and Healthline with companies including Scorecard Research and BlueKai (owned by software giant Oracle).

In eight cases (with the exception of Healthline and Mind), a specific identifier linked to
the web browser was also transmitted — potentially allowing the information to be tied
to an individual — and tracker cookies were dropped before consent was given.
Healthline confirmed that it also shared unique identifiers with third parties.

‘Data silos of undesirables’


Since the adoption of the Europe-wide General Data Protection Regulation in May
2018, the EU online advertising industry, which makes $200bn of annual sales, has
been subject to tighter rules around the collection and processing of data.

It is now illegal for advertisers to share the most sensitive data, including on health and
sexual orientation, without explicit consent, where the user agrees to the specific
sharing of their “special category” data, and is told how it will be used and by whom.
None of the websites tested asked for this type of explicit and detailed consent.

How health websites pass on your personal data

The ultimate destinations of the personal and sensitive data collected and shared by the
websites were opaque, as they were not visible via an internet browser.

Research into the “data broker” industry shows that dozens of companies profit from
buying and selling data to multiple clients who want to better understand users.

Experts believe that the predictive models built by the plethora of advertising and data-
targeting companies may use ill health to profile and prey on users.

Knowledge of an individual’s medical ailments allows companies to try to sell specific
treatments, services or financial products that desperate users might turn to.

“There is a whole system that will seek to take advantage of you because you’re in a
compromised state. I find that morally repugnant,” said Tim Libert, a computer
scientist at Carnegie Mellon University, who built the open-source WebXray tool used
by the FT, and specialises in the social and legal implications of online ad tracking.

Previous research in which Mr Libert analysed 80,000 unique pages relating to
common diseases found that more than 91 per cent contacted third parties in the US.
The paper explains that holding such sensitive data on a person can result in
discriminatory marketing, even without marketers knowing their identity.

“As medical expenses leave many with less to spend on luxuries, these users may be
segregated into ‘data silos’ of undesirables who are then excluded from favourable offers
and prices,” Mr Libert wrote. “This forms a subtle, but real, form of discrimination
against those perceived to be ill.”

In the UK, the online advertising industry was put on watch in June by the regulator,
the Information Commissioner’s Office. It gave the industry until December to clean up
its data practices, or face further probes.

“This investigation by the Financial Times further highlights the ICO’s concerns about
the processing of special category data in online advertising, as well as the role that site
owners and publishers play in this ecosystem,” said Simon McDougall, the ICO’s
executive director for technology policy and innovation.

“Special category data, such as health information, requires greater protection
because of its sensitivity and the increased risk of harm to or discrimination against
individuals. We will be assessing the information provided by the FT before considering
our next steps,” he added.

The advertisers’ defence


Google, which powers the online advertising industry, said that it “does not build
advertising profiles from sensitive data . . . and has strict policies preventing advertisers
from using such data to target ads”.

It told the FT that the named sites investigated had been marked as “sensitive”
internally, meaning the information that we found being sent to Google was specifically
excluded from the database used for personalised advertising. It said that its technology
might be used to serve “contextual” ads, based on the contents of the page, but not user
information.

The company explained that if a publisher chose to include information like the date of
a visitor’s last period in the URL, it could be sent to Google as part of an ad request
from that page. But Google’s ads systems would not understand what that URL data
represents, nor use it to create profiles of users.

The sensitive data could be used for a variety of other purposes, including protecting
against fraud and abuse and measuring the engagement with an advert, Google said.
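
To illustrate the mechanism Google describes, here is a minimal Python sketch of how a value placed in a page URL can travel inside an ad request. Everything in it is an assumption made for illustration: the `example` domains, the `last_period` and `cycle_days` parameters, the `url` field and both helper functions are hypothetical and do not reflect any real publisher’s pages or Google’s actual request format.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_ad_request(page_url: str, ad_endpoint: str) -> str:
    """Build a hypothetical ad request that carries the full page URL as a parameter."""
    return f"{ad_endpoint}?{urlencode({'url': page_url})}"


def recover_page_params(ad_request_url: str) -> dict[str, list[str]]:
    """Show what the receiving server could read back out of that request."""
    outer = parse_qs(urlsplit(ad_request_url).query)
    page_url = outer["url"][0]  # the page URL, decoded from the ad request
    return parse_qs(urlsplit(page_url).query)


# Hypothetical publisher page that keeps user input in its query string.
page_url = (
    "https://health.example.com/ovulation-calculator"
    "?last_period=2019-08-01&cycle_days=28"
)
ad_request = build_ad_request(page_url, "https://ads.example.net/request")
leaked = recover_page_params(ad_request)
print(leaked)
```

The sketch only shows that the values are readable by whoever receives the request. What the recipient then does with them cannot be observed from the browser.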

Facebook, another frequent tracker across the sites we surveyed, which also received
data on highly sensitive symptoms and diagnoses, was not able to confirm what it does
with this information. “We don’t want websites sharing people’s personal health
information with us — it’s a violation of our rules, and we enforce against sites we find
doing this,” a company spokesperson said. “We’re conducting an investigation and will
take action against those sites in violation of our terms.”

Methodology

We carried out our analysis on August 29 2019, using a list of the top 100 health
sites produced by SimilarWeb, ranked by average UK monthly traffic as of July 2019.
We ran this list through WebXray, an open-source tool that opens each site and
records all the subsequent “requests” made to third parties, and also used HTTP
Toolkit to look more closely at what data were specifically being received by third
parties.

It should be noted that our investigation represents only a limited view, since we
could not see what happened to data beyond the user’s browser, and that it is a
snapshot in time: if the experiment were repeated, even on the same computer in
the same location, it is likely the results would vary.
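
As a rough illustration of the kind of tally described above, the Python sketch below takes a record of the requests each site made, separates third-party domains from the site’s own, and reports the share of sites contacting each one. It is not WebXray’s code: the crawl data is invented, the `.example` domains are placeholders, and the two-label `base_domain` shortcut is a simplifying assumption (a real tool would need something like the Public Suffix List to handle domains such as `.co.uk`).

```python
from collections import Counter
from urllib.parse import urlsplit


def base_domain(url: str) -> str:
    """Naive base domain: the last two labels of the hostname (an assumption)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def third_party_domains(site_url: str, request_urls: list[str]) -> set[str]:
    """Domains contacted by a page that differ from the page's own domain."""
    first_party = base_domain(site_url)
    return {base_domain(u) for u in request_urls} - {first_party, ""}


def prevalence(crawl: dict[str, list[str]]) -> dict[str, float]:
    """Percentage of crawled sites that contacted each third-party domain."""
    counts: Counter[str] = Counter()
    for site, requests in crawl.items():
        counts.update(third_party_domains(site, requests))
    return {domain: 100 * n / len(crawl) for domain, n in counts.items()}


# Invented sample data: four sites and the requests their pages made.
crawl = {
    "https://www.site-a.example/": [
        "https://cdn.site-a.example/app.js",
        "https://ads.tracker-one.example/pixel?id=1",
        "https://sync.tracker-two.example/c",
    ],
    "https://www.site-b.example/": ["https://ads.tracker-one.example/pixel?id=2"],
    "https://www.site-c.example/": ["https://www.site-c.example/style.css"],
    "https://www.site-d.example/": [
        "https://px.tracker-one.example/a",
        "https://ads.tracker-one.example/b",
    ],
}
shares = prevalence(crawl)
print(shares)
```

Each domain is counted once per site, however many requests the site sends to it, so the percentages describe how widespread a third party is rather than how much traffic it receives.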

Amazon said: “We do not use the information from publisher websites to inform
advertising audience segments,” but it did not confirm what it did with the sensitive
data it received, such as user-input fertility information.

It was unclear if either Facebook or Amazon also received personal identifiers, such as
an IP address or a unique ID, alongside health data.

The companies also emphasised that the publishers of the websites were required to
manage user consent and the type of data sent to third parties.

“It’s like a bar saying we don’t like to serve people that are underage, they shouldn’t
come here to drink,” Mr Libert said. “They are being negligent and it’s deeply
disingenuous.”

Meanwhile the website publishers themselves did not provide details of why the data
was being shared or what would be done with it once it left their hands. A WebMD
spokesperson said: “[W]e only use, collect or share user information to the extent
disclosed in our privacy policy.” The policy reviewed by the FT did not appear to
provide clear answers about the fate of the data.

Condensed comments from other companies that responded are published at the end of
this article. Others contacted, including Lotame, ComScore, AppNexus, Drugs.com,
Health.com and Bounty, did not respond to requests for comment.

As the ICO’s deadline for online ad auction firms to audit themselves approaches, it will
be a time of reckoning for many in the industry that was until recently self-regulated.

“The internet has turned into a privacy wasteland. But there’s a suspension of disbelief
in the [ad] industry. Companies say they are GDPR-compliant, there’s a codependency
where everybody pretends everything is OK, but the deep technical architecture is
fundamentally incompatible with the right to privacy,” Mr Libert said.

“Ultimately it’s going to be the ICO that decides, and based on early guidance, I suspect
they may not be a willing participant in this fictional world built by online advertisers.”


Further company responses

Bupa: “Advertising cookies are used on our site but we have set them so that no
personal data about visitors to our website, including our health information pages,
is passed on to third parties.

“Unique IDs are shared with some third parties in order to measure website
performance and engagement. This is anonymised data and is not personally
identifiable. No health information of visitors to our website is shared with third
parties.”

BabyCenter: “Our privacy and consent statements clearly indicate that we may use
data including due date to personalise content and ads. BabyCenter would only pass
personal data to third parties after consent is given. As of August 19, 2019,
BabyCenter is under new ownership and will be rolled under Everyday Health
Group’s GDPR and data privacy consent policies and practices in line with the
digital properties in its portfolio.”

British Heart Foundation: “The data captured by the cookies on our website is
protected (pseudonymised) so it doesn’t directly identify individuals. We don’t sell
data and we don’t share sensitive personal data on areas such as ethnic origin and
health that could directly identify people.

“To reflect recent changes in guidelines we are reviewing how we use cookies and
how we seek consent for their use when people visit our website. In the coming
months, we will be implementing a new version of our cookies model.

“We don’t share or sell sensitive personal information that could directly identify an
individual. We only share information about pages that devices have visited, for
example the URL.”

Healthline: “Of the eight platforms you referenced, five are service providers and are
not used by Healthline.com for advertising. Facebook, Pinterest and Trade Desk are
platforms we may use for re-marketing. However, the data we pass to these
platforms is for our use only and is subject to data protection agreements.”

Mind: “Since the Information Commissioner published updated guidance about
cookies in July, we have been reviewing our practice, including an audit of tracking
across our website. As a result of a report published by Privacy International in
September, we have removed marketing trackers from our site and won’t reinstate
them until our review has finished and we are satisfied we’re using them
appropriately.

“No data is being explicitly shared with Google DoubleClick through the Mind.org.uk
site, however Google sets its own DoubleClick cookies via other Google products to
support development and optimisation.

“We have never sold or shared, and will never sell or share, any of our website users’
personal information with organisations so that they can be contacted for any
marketing activities. Nor do we sell any information about our website users’ web
browsing activity.”

Oracle: “Regarding BlueKai, any site setting a BlueKai cookie is required to collect
data in accordance with applicable legal requirements. In addition, Oracle Data
Cloud has implemented processes designed to prevent the ingest of third-party data
from EU-based users for the sites you reference. Finally, Oracle Data Cloud does not
create or offer any sensitive third-party audience segments on consumers in the
EU.”

These responses have been condensed

Additional research by Edwin Esosa. The UK non-profit Privacy International also
guided us on our methodology.

Copyright The Financial Times Limited 2019. All rights reserved.
