The Data Trade
Instaread Short Cuts bring you up to speed on the latest research, analysis, and
commentary on today’s hottest topics. In this Short Cut, we explain data
brokering, a practice in which private online data is collected, analyzed, and
sold to groups who hope to use the information to manipulate behavior. Do you
want to learn more about how companies gather so much information about
online users? Or are you curious how much companies and other organizations
can learn about you through your online interactions, and what you can do to
protect your privacy? Find out more in this Instaread original.

The Data Trade

The market for data on individual consumers and users is red-hot. But several
companies have also been caught, red-faced and red-handed, using their
clients’ data in dubious ways. The most notorious of these potential abuses
came to light in the aftermath of the 2016 US presidential election—but it was
many years in the making.

In 2007, an application called myPersonality was uploaded to Facebook,
which had launched its social media website three years prior. The quiz app,
created by Cambridge University researcher David Stillwell, not only gathered
respondents’ answers; it also collected some of their personal data. About two
fifths of respondents agreed to share information from their Facebook
profiles, ostensibly to enhance the results of the personality quiz, generating a
database that contained information about the backgrounds and preferences
of six million people over the course of five years. [1]

Two years after myPersonality’s collection efforts ended, another app called
This is My Life was uploaded to Facebook. Although it was created by
Cambridge University researcher Aleksandr Kogan, the app wasn’t made
strictly for academic purposes. Inspired by myPersonality, Kogan originally
hoped that any data gathered by his app could be used in a partnership using
a model developed by fellow Cambridge colleague Michal Kosinski. Kogan
proposed that Kosinski license his analytical model to a United Kingdom
marketing company called SCL, where it would be applied to any data
collected by This is My Life. Kosinski refused Kogan’s proposal, however, so
Kogan took his app to SCL and created his own model for interpreting the
data. SCL later launched an offshoot, Cambridge Analytica, with backing from
Robert Mercer of the American hedge fund Renaissance Technologies. The data
firm has since become synonymous with political scandal and the misuse of
private data gathered online. [2]

Between 2014 and 2018, Kogan’s app and analysis model helped Cambridge
Analytica collect data on more than 40 million Facebook users. [3] The
personal data was later used by the campaign for Donald Trump, which
deployed Facebook ads to influence voter behavior in the 2016 election. [4]
Cambridge Analytica has also since been accused of illegally participating in a
pro-Brexit campaign in the United Kingdom. [5]

Cambridge Analytica’s most notable backers include Republican hedge fund
manager Robert Mercer and former Trump adviser Stephen K. Bannon.
Christopher Wylie, who helped found Cambridge Analytica, told The New York
Times in 2018 that the company’s focus was not on protecting user privacy,
but instead on shaping the behavior of the general public. For Cambridge
Analytica’s leaders, the Facebook data presented an opportunity to fight a
perceived culture war in both the United States and Britain. Conservative
investors knew the company would be able to use the data it had collected to
predict voter turnout in key regions of the United States, to create ads that
could be targeted at certain populations, and to spread messages meant to
sway political opinions among specific demographics. [6]

Cambridge Analytica isn’t the only company to collect and distribute vast
amounts of information about people’s private lives. Data brokering, or the
practice of gathering and selling data to parties who hope to use it for their
own purposes, has become a lucrative industry in an era when many internet
users freely share personal details in exchange for online convenience. Every
time a user clicks a link, shops online, opens an app, or goes on a walk while
carrying a cell phone or smart device, information about that behavior is
added to a dossier that, in many cases, is sold to the highest bidder. [7]

The most well-known companies that engage in data brokering, such as
Experian, TransUnion, and Equifax, largely gather data to sell to marketers
who want to create more effective advertisements. However, data brokers
have also sold to companies with other goals in mind for the information they
purchase. Personal data has been used to exploit emotional volatility in
teenagers and even to target high-interest loans at low-income families.
Collecting and selling data may be
disconcerting, but in the United States, it’s completely legal. Without stronger
legislative protection, consumers and other online users have little hope of
preventing their data from being exploited. [8]

Before Big Data

Americans haven’t always been so complacent about their data privacy. In the
late 1980s, when the US Senate held confirmation hearings on the nomination
of Robert Bork to the Supreme Court, a journalist named Michael Dolan
learned that he and Bork used the same video rental company. On a whim,
Dolan decided to go ask a store clerk whether he could look over Bork’s rental
list. [9] The list contained nothing controversial—mostly Alfred Hitchcock and
James Bond movies, along with a few titles featuring Meryl Streep. [10]
However, the story Dolan wrote, which appeared in the Washington City
Paper, almost immediately sparked controversy. Before long, politicians were
introducing laws at the local, state, and federal levels to prevent anyone from
leaking another rental list. [11] The Video Privacy Protection Act (VPPA),
passed in 1988, has since served as one of the major pieces of legislation cited
whenever concerns about the distribution of private data are brought up. [12]

In the twenty-first century, however, the VPPA doesn’t do much to protect
users’ data. The language in the law is outdated and doesn’t cover many of the
innovations that have occurred in the past 30 years. Court interpretation of
the law has also limited its usefulness. For example, judges have ruled that the
law doesn’t cover a user’s IP address, which can easily be used by third parties
to determine the identity of the person using the computer. [13] Some
companies that benefit from sharing user data have also lobbied to change the
law so that information can be more easily shared. In 2012, Netflix lobbied
Congress to amend the VPPA so that Facebook users could share what they were
watching on Netflix with their Facebook friends. An amendment along those
lines was signed into law in January 2013, and the campaign gave large
companies the opportunity to argue that users should be able to consent to
data distribution online. [14]

The VPPA is notable not only for being outdated, but also for being one of the few
laws the United States has passed to protect private data. Although federal
and state regulations limiting the type and amount of personal data that can
be shared do exist, there is no overarching, comprehensive law that protects
American consumers. [15] That lack of legislation has left the United States
behind other government efforts to protect private information, such as the
European Union’s General Data Protection Regulation, or GDPR. [16] Among
other protections, GDPR requires that companies offering services in the EU
provide clearer consent notices that alert users to whether and how their data
will be collected. It also prevents those companies from bundling together
permissions that have little to do with one another so that a user can be
induced to agree to multiple policies at once. The GDPR additionally requires
that companies gain permission from a parent or guardian before gathering
information on users under the age of 16. [17] Without stricter regulations,
American consumers who want to ensure that their data isn’t distributed must
either rely on companies’ own internal ethical guidelines to keep the
information private or forgo internet use altogether. The lack of
comprehensive legislation also means that even if companies promise not to
misuse the data they collect, they have no legal incentive to keep their
word. For many, buying and selling data is too lucrative an opportunity to
pass up.

The Distributors Are the Collectors

During a 2018 US Senate hearing on data privacy, Facebook founder Mark
Zuckerberg repeatedly told elected officials that the social media site doesn’t
sell user data. Instead, Zuckerberg said, the company offers advertisers access
to specific demographics, which Facebook targets by analyzing the data it
collects. However, a company doesn’t have to sell data directly to third-party
companies in order to provide the same services that data brokers do. By not
selling its data to third parties, Facebook essentially is able to act as its own
data broker and sell advertisers direct access to ideal customer bases. [18]
That practice, along with Facebook’s habit of buying digital competitors such
as Instagram, has prompted some regulators in both Europe and the United
States to express concerns that Facebook and other tech giants have become
monopolies that own and profit from the majority of online activities. [19]

Even if Facebook weren’t acting as its own data broker, Zuckerberg’s
comments gloss over the fact that other parties, including Cambridge
Analytica, have been able to sell Facebook’s data without the company’s
consent. [20] Zuckerberg likewise failed to mention that a number of other
companies scrape user data using games and surveys developed for Facebook;
some of those apps are ultimately owned by Facebook itself, even if they are
distributed under a different company name. Since 2014, Google, Amazon,
Facebook, and Apple have purchased more than 200 digital companies. Some
were bought and shuttered so that they couldn’t become future competitors;
others simply continued to operate under a new corporate umbrella. [21]

A study published in the 2018 Proceedings of the 10th ACM Conference on
Web Science found that because of these frequent company
acquisitions, most data collected by smartphones ultimately ends up back in
the hands of the biggest tech corporations. In particular, Alphabet, the parent
company of Google, owned tracking companies associated with nearly 88
percent of the apps examined by researchers. A significant portion of that
data is also
routed back to Facebook, Twitter, and Verizon. The same study found that out
of nearly 1,000,000 apps examined in the British and American versions of the
Google Play Store, nine out of 10 apps had at least one tracker—a section of
code dedicated to tracking user activity. Almost 20 percent of the examined
apps had more than 20 trackers. In many cases, those trackers were added not
by the company hosting the app, but by another tracking company that has
received permission to embed a tracker in the code. Users of the app may
have a difficult time determining what these other companies are, what kind
of data they’re collecting, and how exactly they’re using the data they’re
scraping. [22]
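
The kind of tally behind figures like these can be sketched in a few lines of Python. This is only an illustration, not the researchers' method: the app names, tracker hostnames, and parent companies below are invented, and the two helper functions are hypothetical. Given a list of trackers found in each app and a map of which company owns each tracker, the sketch computes the share of apps with at least one tracker and the share of apps whose data reaches each parent company.

```python
from collections import Counter

# Hypothetical sample data: app name -> tracker hostnames found in its code.
APP_TRACKERS = {
    "weather_app": ["ads.example-a.com", "metrics.example-b.com"],
    "puzzle_game": ["ads.example-a.com"],
    "notes_app": [],
    "fitness_app": ["metrics.example-b.com", "pixel.example-c.com"],
}

# Hypothetical ownership map: tracker hostname -> parent company.
# Two differently named trackers can belong to the same parent.
TRACKER_OWNER = {
    "ads.example-a.com": "ParentCo A",
    "metrics.example-b.com": "ParentCo A",
    "pixel.example-c.com": "ParentCo C",
}


def share_with_tracker(app_trackers):
    """Return the fraction of apps that embed at least one tracker."""
    if not app_trackers:
        return 0.0
    with_tracker = sum(1 for trackers in app_trackers.values() if trackers)
    return with_tracker / len(app_trackers)


def owner_reach(app_trackers, tracker_owner):
    """Return, per parent company, the fraction of apps that report to it."""
    counts = Counter()
    for trackers in app_trackers.values():
        # Count each parent once per app, even if it owns several trackers.
        owners = {tracker_owner.get(t, "unknown") for t in trackers}
        counts.update(owners)
    total = len(app_trackers)
    return {owner: n / total for owner, n in counts.items()}


if __name__ == "__main__":
    print(share_with_tracker(APP_TRACKERS))
    print(owner_reach(APP_TRACKERS, TRACKER_OWNER))
```

Grouping trackers by parent company rather than by hostname is what reveals the concentration the study describes: an app can list several unfamiliar tracker names that all lead back to the same owner.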

Data can also be harvested through extensions that are downloaded onto
internet browsers. These extensions usually help users improve their online
experience by offering services like password storage, the ability to block
certain websites, or the ability to quickly search for discounts on online
shopping. When the extensions are downloaded, however, users often
unintentionally agree to also sell access to information about anything they
did through the browser while the extension was installed. The companies
behind the extension can then use the data themselves, or sell it to data
brokers. In a 2019 episode of NPR’s Fresh Air, Washington Post tech columnist
Geoffrey Fowler explained how he worked with an independent researcher to
identify a website called Nacho Analytics, which sells access to information
that can reveal which sites individual users go to, down to web addresses.
Fowler was able to use this information to find tax returns and medical
documents that had been uploaded to online storage services. Fowler even
found out that an extension used by a colleague had gathered that coworker’s
work username, information that was then available for sale on
NachoAnalytics.com. When Fowler reached out to Nacho Analytics for
comment, they pointed out, truthfully, that their business model was not
illegal. As Fowler said of the incident, “I think it’s really telling about the state
of the economy, the internet economy, that what they’re doing is actually
considered pretty common.” [23]

What’s in a Name?

For avid users of Facebook, Twitter, and other social media websites, the
collection and sale of personal data may not seem like a big deal. After all,
people reveal plenty of personal details about themselves directly through
their profile pages. Being unconcerned about data privacy is unwise, however,
if only because the monetization and distribution of personal information by
private companies could increase someone’s risk of identity theft. Add the fact
that many dossiers constructed by data brokers aren’t even accurate, and the
practice of selling data becomes even more alarming. [24]

Imagine, for example, deciding to search Google for information about
managing diabetes after a close family member’s diagnosis with the disease. A
data company, seeing that search, could match it to any other searches that
were performed from that same computer, and then use that information to
determine which person likely entered the search. A life insurance company
buying the dossier from a data broker could now make the assumption that
you have diabetes, and require higher premiums as a result. [25]

That hypothetical situation isn’t too far-fetched. In 2019, New York became
the first state to allow life insurance companies to use information entered on
social media to determine premiums. [26] In a 2019 joint feature, ProPublica
and NPR explained how other online interactions and lifestyle choices can
shape someone’s health care. A woman who buys plus-size clothing, for
example, is likely to be marked as someone suffering from depression. Weight,
marriage status, race, and other factors can also significantly increase health
care costs. Even if the data is accurate, the conclusions drawn from it might
not be. A married woman isn’t more likely to be pregnant than any other
woman, but if she’s changed her name recently, she’s more likely to see her
health costs increase in anticipation of future prenatal care. [27]
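
The matching step in the diabetes example can be sketched in Python to show why accurate data can still produce a wrong conclusion. Everything here is an assumption made for illustration: the device IDs, the search log, the keyword rule, and the `build_dossiers` function are invented and do not describe any real broker's system. The sketch groups searches by the device they came from and attaches a label whenever a keyword appears.

```python
from collections import defaultdict

# Hypothetical search log: (device_id, query text).
SEARCH_LOG = [
    ("device-1", "managing diabetes diet"),
    ("device-1", "train times to work"),
    ("device-2", "cheap flights"),
]

# Hypothetical inference rules: keyword -> label added to the dossier.
RULES = {"diabetes": "possible diabetic"}


def build_dossiers(log, rules):
    """Attach labels to a device whenever a query contains a rule keyword."""
    dossiers = defaultdict(set)
    for device_id, query in log:
        for keyword, label in rules.items():
            if keyword in query.lower():
                # The rule cannot tell whether the search was made on
                # behalf of the device owner or of someone else.
                dossiers[device_id].add(label)
    return dict(dossiers)


if __name__ == "__main__":
    print(build_dossiers(SEARCH_LOG, RULES))
```

In this toy version, the owner of device-1 is labeled a possible diabetic even though the search was made for a relative. The log itself is accurate; the inference drawn from it is not.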

Opting Out

At the moment, there’s little that individual users can do to correct inaccurate
dossiers created by data brokers. The data is hard to find and is distributed in
such a manner that incorrect copies might proliferate even if the data broker
makes every effort to remove inaccuracies from a profile. In some states,
consumers have filed lawsuits to force data collectors to correct inaccurate
information. A Virginia resident named Thomas Robins, for example, sued the
website Spokeo after he discovered that the site listed him as a 50-year-old
married man with children who was working in a technical field, a description
that didn’t match him at all. Spokeo is a people search engine, a website that
uses publicly available data from courts and local governments to list people’s
names, phone numbers, addresses, and other personal information in a
manner similar to a phone book. Robins’s case was initially escalated to the
Supreme Court, which found that Robins had provided insufficient evidence
for his claim that Spokeo’s inaccuracies had harmed his ability to get a job.
The US Court of Appeals for the Ninth Circuit eventually ruled in Robins’s
favor, but since his case was not taken up again by the Supreme Court, it
hasn’t resulted in any broader changes in how companies collect, list, and
verify information online. [28] [29]

Limiting the amount of information data brokers can gather, and opting out of
services like Spokeo that collect and display information from public records,
can provide internet users with some amount of protection. However, users
who decide to actively opt out of data broker services should know that they
may have to repeat the process every few months, as some companies use
automatic collection methods to grow their lists. Using ad blockers, deleting
unnecessary smartphone apps, and opting out of pre-approved credit cards
can also limit the amount of information that third parties can track.
Ultimately, however, the most effective method for limiting data collection is
to never go online in the first place. Even then, personal information can still
end up in the hands of data brokers through the activity of friends and family.
Individual responses cannot address the wider privacy concerns caused by
data brokering. Comprehensive legislation could provide users with more
online protections, and allow them to opt out of any service that might make
nefarious use of personal information. [30] But so far that legislation has not
been enacted. For now, at least, the data trade is still heating up.

References

1. “MyPersonality Database.” myPersonality database, July 4, 2013. Accessed September 3, 2019. https://www.psychometrics.cam.ac.uk/productsservices/mypersonality
2. Bowen, Flora. “How is Cambridge University linked to Cambridge Analytica and the Facebook data scandal?” The Cambridge Tab, April 13, 2018. Accessed September 3, 2019. https://thetab.com/uk/cambridge/2018/04/13/how-is-cambridge-university-linked-to-cambridge-analytica-and-the-facebook-data-scandal-110205
3. Ibid.
4. Rosenberg, Matthew et al. “How Trump Consultants Exploited the Facebook Data of Millions.” The New York Times, March 17, 2018. Accessed September 3, 2019. https://www.nytimes.com/2018/03/17/us/politics/cambridge-analytica-trump-campaign.html?module=inline
5. Scott, Mark. “Cambridge Analytica did work for Brexit groups, says ex-staffer.” Politico, July 31, 2019. Accessed September 3, 2019. https://www.politico.eu/article/cambridge-analytica-leave-eu-ukip-brexit-facebook/
6. Rosenberg.
7. Grauer, Yael. “What Are ‘Data Brokers,’ and Why Are They Scooping Up Information About You?” Vice, March 27, 2018. Accessed September 3, 2019. https://www.vice.com/en_us/article/bjpx3w/what-are-data-brokers-and-how-to-stop-my-private-data-collection
8. Ibid.
9. Dolan, Michael. “Borking Around.” The New Republic, December 20, 2012. Accessed September 3, 2019. https://newrepublic.com/article/111331/robert-bork-dead-video-rental-records-story-sparked-privacy-laws
10. Maass, Peter. “Was Petraeus Borked?” The New Yorker, June 18, 2017. Accessed September 3, 2019. https://www.newyorker.com/news/news-desk/was-petraeus-borked
11. Dolan.
12. “18 U.S. Code § 2710 - Wrongful Disclosure of video tape rental or sale records.” Legal Information Institute. Accessed September 3, 2019. https://www.law.cornell.edu/uscode/text/18/2710
13. McAllister, Marc Chase. “Modernizing the Video Privacy Protection Act.” George Mason Law Review, September 22, 2017. Accessed September 3, 2019. http://georgemasonlawreview.org/wp-content/uploads/2018/10/25-1_5-McAllister.pdf
14. Robertson, Adi. “Netflix urges Senate to let users share viewing data on Facebook.” The Verge, February 1, 2012. Accessed September 3, 2019. https://www.theverge.com/web/2012/2/1/2764465/netflix-senate-video-privacy-protection-act-change
15. McAllister.
16. Tiku, Nitasha. “How Europe’s New Privacy Law Will Change the Web, and More.” Wired, March 19, 2018. Accessed September 3, 2019. https://www.wired.com/story/europes-new-privacy-law-will-change-the-web-and-more/
17. Kharpal, Arjun. “Everything you need to know about a new EU data law that could shake up big US tech.” CNBC, May 25, 2018. Accessed September 11, 2019. https://www.cnbc.com/2018/03/30/gdpr-everything-you-need-to-know.html
18. Rogers, Kaleigh. “Let’s Talk About Mark Zuckerberg’s Claim that Facebook ‘Doesn’t Sell Data’.” Vice, April 11, 2018. Accessed September 3, 2019. https://www.vice.com/en_us/article/8xkdz4/does-facebook-sell-data
19. Laurent, Lionel. “Apple, Facebook and Google Have Lost the Monopoly Argument.” The Washington Post, June 5, 2019. Accessed September 11, 2019. https://www.washingtonpost.com/business/apple-facebook-and-google-have-lost-the-monopoly-argument/2019/06/05/2248ae34-8763-11e9-9d73-e2ba6bbf1b9b_story.html
20. Rogers.
21. Laurent.
22. Binns, Reuben et al. “Third Party Tracking in the Mobile Ecosystem.” Proceedings of the 10th ACM Conference on Web Science - WebSci 18, 2018. Accessed September 3, 2019. https://arxiv.org/pdf/1804.03603.pdf
23. Gross, Terry and Dave Davies. “How Tech Companies Track Your Every Move & Sell Your Data.” Fresh Air, NPR, July 31, 2019. Accessed September 11, 2019. https://www.npr.org/2019/07/31/746970018/how-tech-companies-track-your-every-move-sell-your-data
24. Grauer.
25. Ibid.
26. Chen, Angela. “Why the Future of Life Insurance May Depend on Your Online Presence.” The Verge, February 7, 2019. Accessed September 3, 2019. https://www.theverge.com/2019/2/7/18211890/social-media-life-insurance-new-york-algorithms-big-data-discrimination-online-records
27. Allen, Marshall. “Health Insurers Are Vacuuming Up Details About You — And It Could Raise Your Rates.” ProPublica, March 9, 2019. Accessed September 3, 2019. https://www.propublica.org/article/health-insurers-are-vacuuming-up-details-about-you-and-it-could-raise-your-rates
28. Grauer.
29. “Spokeo, Inc. v. Robins.” SCOTUSblog. Accessed September 11, 2019. https://www.scotusblog.com/case-files/cases/spokeo-inc-v-robins/
30. Grauer.
