You are on page 1of 133

CS 6290

Privacy Enhancing Technologies

Department of Computer Science

Slides credit in part from A. Juels, N. Bielova, M. Langheinrich, S. Yuan, H.


Fosdick, A. Judmayer, V. Shmatikov, I. Gonshorovitz, and L. Cranor
CS6290: Privacy Enhancing Technologies 1
CS6290: Privacy Enhancing Technologies 2
What is Privacy?
Privacy is control over your own information. Freedom from intrusion into
personal matters
Privacy is a person’s right or expectation to control the disclosure of his/her
personal information, including activity metadata
Privacy is the “right to be let alone” — Louis Brandeis
Privacy means something like what the Founders meant by “liberty”
Free speech, free association, autonomy, …
freedom from censorship and constant surveillance
Privacy-motivating examples in U.S. History
Martin Luther King Jr. “blackmailed” by FBI
McCarthyism witch-hunt for communists

3
Direct Sharing

4
5
• Billions of web users surf the web daily
• Some websites have billions of user accounts

➢Tracking users and their online habits


• Lucrative for advertising companies
• Privacy-intrusive to the users

CS6290: Privacy Enhancing Technologies 6


Forms of Tracking
• Client-side • What to track?
• Cookies • Daily expenses
• Server-side • Life pattern
• User accounts • Preference on
commodities
• Device fingerprinting
• Location
• Political choice
• Religion
• …

CS6290: Privacy Enhancing Technologies 7


Cookies
• HTTP protocol is stateless
• Cookies introduced by Napster
• Allows a web server to store a small amount of data on the computers of
visiting users
• Usage: authentication, e.g., sent back to the web server upon subsequent
requests
• Acore technology on which complex, stateful web applications are built

CS6290: Privacy Enhancing Technologies 8


How Cookies Work – the Basics
• A cookie stores a small string of characters
• A web site asks your browser to “set” a cookie
• Whenever you return to that site your browser sends the cookie back automatically

Please store Here is cookie


cookie xyzzy xyzzy

site browser site browser

First visit to site Later visits


CS6290: Privacy Enhancing Technologies 9
Cookie Terminology
• Cookie Replay – sending a cookie back to a site
• Session cookie – cookie replayed only during current browsing session
• Persistent cookie – cookie replayed until expiration date
• First-party cookie – cookie associated with the site the user requested
• Third-party cookie – cookie associated with an image, ad, frame, or other
content from a site with a different domain name that is embedded in the site
the user requested
• Browser interprets third-party cookie based on domain name, even if different domains are
owned by the same company

CS6290: Privacy Enhancing Technologies 10


But … Cookie Abuse?
• Can be used to track users

• A webpage contains various resources


• HTML, images, CSS, JavaScript
• Located on the hosting server, or a third-party server
➢ Third-party cookies

CS6290: Privacy Enhancing Technologies 11


Third-party Cookies
tracking.com
• NYTimes.com contains an image from
tracking.com
• When downloaded, creates a cookie
• When user goes to okcupid.com, also
containing a tracking.com image,
tracking.com detects the user
• Allows tracking.com to profile the user
• Mostly done by Ad networks for
behavioral advertising

User
CS6290: Privacy Enhancing Technologies 12
Third Party Cookies
Facebook, Google, etc. know much more about you than actual website does
because they can track you across websites.

13
How websites profit?
• Subscription-based paid membership?
• Paywall, e.g., NY Times, Wall Street Journal
• Can be easily bypassed, though
• Substack, Patreon, Linkedin premium
• Limited impact…
• Can also be bypassed by bots

• Free access is much more common


• But allow third-party ads

CS6290: Privacy Enhancing Technologies 10


Advertising
• Online Advertising plays a critically important role in the Internet world.
• Advertising is the main way of profiting from the Internet, the history of
Internet advertising developed alongside the growth of the medium itself

CS6290: Privacy Enhancing Technologies 10


Facts and Short History
• First internet banner, 1994, AT&T.

• Also in 1994, the first commercial spam, a "Green Card Lottery".


• The first ad server was developed by FocaLink Media Services and
introduced on 1995.
• In March 2008, Google acquired DoubleClick for US$3.1 billion in cash.

CS6290: Privacy Enhancing Technologies 11


Parties
• Advertiser • Ad-network
• Got money, wants publicity • Got advertising infrastructure,
• e.g., Coca-Cola wants money
• e.g., Google AdSense, Yahoo
• Publisher
• Got content, wants money • Consumer
• Cnn.com • Wants free content

CS6290: Privacy Enhancing Technologies 12


https://www.webfx.com/how-much-does-
Business Model social-media-advertising-cost.html, as in 2022

CPM = Cost
Per • Impression: user just
thousand sees the ad.
impressions

• This is the cost charged


CPC = Cost to an advertiser every
Per Click time their ad is "clicked"
on

CS6290: Privacy Enhancing Technologies 13


Online Behavioral Advertising
• Online behavioral advertising refers to the practice of ad-
networks tracking users across web sites in order to learn
user interests and preferences.
• Benefits
• Advertisers targets a more focused audience which increases the
effectiveness.
• Consumer is “bothered” by more relevant and interesting ads.

CS6290: Privacy Enhancing Technologies 15


How Ad-Networks Match Ads

Most behavioral targeting systems work by categorizing


users into one or more audience segments.

Search history – analyzing search keywords


Profiling users based on Browse history - analyzing content of visited pages
collected data Purchase history, Social networks, Geography

CS6290: Privacy Enhancing Technologies 16


Ad Networks
search for buy CD
medical
information

set cookie replay cookie

Ad Ad

Ad company
can get your
name and
address from
CD order and
link them to
Search Service your search CD Store
CS6290: Privacy Enhancing Technologies 17
CS6290: Privacy Enhancing Technologies
Behavioral Targeting: A Walk-through

Ad network
Advertisers

Publishers

CS6290: Privacy Enhancing Technologies 22


CS6290: Privacy Enhancing Technologies
Partial List of Ad Networks

CS6290: Privacy Enhancing Technologies 23


CS6290: Privacy Enhancing Technologies
CS6290: Privacy Enhancing Technologies 24
CS6290: Privacy Enhancing Technologies
“Surveillance is the Business Model of the Internet”
– Bruce Schneier

CS6290: Privacy Enhancing Technologies 21


Start
11:56am Google/search

Each circle is an event.

Each dot is a tracker for


that event.

End
3:17pm ElizabethWarren.com

https://www.nytimes.com/interactive/2019/08/23/opinion/data-internet-privacy-tracking.html

CS6290: Privacy Enhancing Technologies 26


Tracking You: Psyching You Out

They collected his:


– Location
– Device info
– Behavior
– More...

CS6290: Privacy Enhancing Technologies 27


What Ad Networks may Know…
• Personal data: • Transactional data:
• Email address • Details of plane trips
• Full name • Search phrases used at search
• Mailing address (street, city, state, engines
and Zip code) • Health conditions
• Phone number

“It was not necessary for me to click on the banner ads for information to
be sent to DoubleClick servers.”
– Richard M. Smith

CS6290: Privacy Enhancing Technologies 24


What They Actually Know About You?
• Why not download your raw data and find out?
• Google
• Click on http://takeout.google.com/settings/takeout
• Select the Product data to dump
• Select a file format
• Choose delivery method
Far more intimate than any diary!
• Create Archive
• Facebook
• Go to Settings section, click on “download my data” in General
• Account Settings. They’ll send you a link to your data.

CS6290: Privacy Enhancing Technologies 29


CS6290: Privacy Enhancing Technologies 30
Countermeasures?
• Deleting cookies frequently
• One out of three users do it every month
• Browser extensions that reveal third-
party tracking
• Modern browsers have native support
to reject all third-party cookies
• Private browsing mode
Courtesy: Chris Slane

CS6290: Privacy Enhancing Technologies 31


Popular browser
extension: Ghostery

32
“Do Not Track”
(DNT) header
• The companies “promise”
not to use the information
for advertising purposes
• Websites and may
either honor or ignore
DNT requests
• Hard to verify misuse

CS6290: Privacy Enhancing Technologies 33


DNT Adoption Issues
“But the NAI [Network Advertising Initiative] code also recognizes
that companies sometimes need to continue to collect data for
operational reasons that are separate from ad targeting based on a
user’s online behavior. For example, online advertising companies
may need to gather data to prove to advertisers that an ad has
been delivered and should be paid for; to limit the number of times
a user sees the same ad; or to prevent fraud.”

Translation: we’re going to keep tracking you,


but we’ll simply call it “operational reasons.”

CS6290: Privacy Enhancing Technologies 34


35
Chrome Third Party Cookies

March 2021 — Google began to block third-party cookies in Chrome

Suggesting alternative — "FLoC" — Federated Learning of Cohorts


- Idea: Let advertisers track semi-anonymous groups of thousands of people
- Large number of “cohorts,” groups of people that share interests
- SimHash of domains visited in the last week
- Each flock is assigned an ID — up to advertiser to make sense of what the ID

36
Counter-countermeasure (!)
By Trackers
• Counter-countermeasure: cookie-less tracking
• Browser (device) fingerprinting

CS6290: Privacy Enhancing Technologies 37


Everything Has a Fingerprint

CS6290: Privacy Enhancing Technologies 38


CS6290: Privacy Enhancing Technologies
Browser Fingerprinting
• Features of a browser and its plugins
• Introduced in 2009
• Small number of commercial companies use such methods to provide
“device identification” through web-based fingerprinting

CS6290: Privacy Enhancing Technologies 39


Browser Fingerprinting (cont.)
• Constructive use:
• Combat fraud
• Destructive use:
• Track users between sites, without their knowledge and without a simple way
of opting-out
• Deliver tailored malware

CS6290: Privacy Enhancing Technologies 40


Fingerprinting Web Browsers
• User agent • Installed fonts
• HTTP ACCEPT headers • Cookies enabled?
• Browser plug-ins • Browser add-ons
• Screen resolution
• MIME support
• Clock skew

CS6290: Privacy Enhancing Technologies 41


• First demonstrated in P. Eckersley, “How Unique Is Your Browser?
• Try your own browser here

CS6290: Privacy Enhancing Technologies 42


Your browser fingerprint appears to be unique among the 3,435,834 tested so
far

CS6290: Privacy Enhancing Technologies 43


Panopticlick Example
Plugin 0: Adobe Acrobat; Adobe Acrobat Plug-In Version 7.00 for Netscape; nppdf32.dll; (Acrobat Portable Document
Format; application/pdf; pdf) (Acrobat Forms Data Format; application/vnd.fdf; fdf) (XML Version of Acrobat Forms Data
Format; application/vnd.adobe.xfdf; xfdf) ( Acrobat XML Data Package; application/vnd.adobe.xdp+xml; xdp) (Adobe
FormFlow99 Data File; application/vnd.adobe.xfd+xml; xfd). Plugin 1: Adobe Acrobat; Adobe PDF Plug-In For Firefox and
Netscape; nppdf32.dll; (Acrobat Portable Document Format; application/pdf; pdf) (Adobe PDF in XML Format;
application/vnd.adobe.pdfxml; pdfxml) (Adobe PDF in XML Format; application/vnd.adobe.x-mars; mars) (Acrobat Forms
84% of browser fingerprints are unique
Data Format; application/vnd.fdf; fdf) (XML Version of Acrobat Forms Data Format; application/vnd.adobe.xfdf; xfdf) (
Acrobat XML Data Package; application/vnd.adobe.xdp+xml; xdp) (Adobe FormFlow99 Data File;
With Flash or Java, 94% are unique
application/vnd.adobe.xfd+xml; xfd). Plugin 2: Google Update; Google Update; npGoogleOneClick8.dll; (; application/x-
vnd.google.oneclickctrl.8; ). Plugin 3: Microsoft® Windows Media Player Firefox Plugin; np-mswmp; np-mswmp.dll; (np-
mswmp; application/x-ms-wmp; *) (; application/asx; *) (; video/x-ms-asf-plugin; *) (; application/x-mplayer2; *) (; video/x-
ms-asf; asf,asx,*) (; video/x-ms-wm; wm,*) (; audio/x-ms-wma; wma,*) (; audio/x-ms-wax; wax,*) (; video/x-ms-wmv;
wmv,*) (; video/x-ms-wvx; wvx,*). Plugin 4: Move Media Player; npmnqmp 07103010; npmnqmp07103010.dll; (npmnqmp;
application/x-vnd.moveplayer.qm; qmx,qpl) (npmnqmp; application/x-vnd.moveplay2.qm; ) (npmnqmp; application/x-
vnd.movenetworks.qm; ). Plugin 5: Mozilla Default Plug-in; Default Plug-in; npnul32.dll; (Mozilla Default Plug-in; *; *). Plugin
6: Shockwave Flash; Shockwave Flash 10.0 r32; NPSWF32.dll; (Adobe Flash movie; application/x-shockwave-flash; swf)
(FutureSplash movie; application/futuresplash; spl). Plugin 7: Windows Genuine Advantage; 1.7.0059.0;
npLegitCheckPlugin.dll; (npLegitCheckPlugin; application/WGA-plugin; *).

CS6290: Privacy Enhancing Technologies 44


Commercial Trackers

Cookieless Monster: Exploring the Ecosystem of Web-based Device Fingerprinting

CS6290: Privacy Enhancing Technologies 45


WebFonts
• Problem: Clients ship with ugly fonts
• Solution: Browsers should download fonts from the Internet on
demand!

@font-face { font-family: 'Sirin Stencil'; font-style:


normal; font-weight: 400; src:
url(http://themes.googleusercontent.com/static/fonts/sirinst
encil/v1/[...].woff) format('woff');}

CS6290: Privacy Enhancing Technologies 46


45 Ways To Sirin Stencil
context.font = "12pt 'Sirin Stencil'";

Windows OS X

Reveals:
• Operating system family
• Browser family (except
Linux Chrome, Safari on OS X)
• Installed fonts
• Font smoothing
parameters

CS6290: Privacy Enhancing Technologies 47


History Sniffing
How can a webpage figure out which sites you
visited previously?
• Color of links
• CSS :visited property
• getComputedStyle()
• Cached Web content timing
• DNS timing

CS6290: Privacy Enhancing Technologies 48


CS6290: Privacy Enhancing Technologies 49
Identity Sniffing
[Wondracek et al. Oakland 2010]
• All social networking sites allow users to join groups
• Users typically join multiple groups
• Some of these groups are public
• Group-specific URLs are predictable

• Intersection of group affiliations acts as a fingerprint


• Can sometimes infer identity by computing the
intersection of group membership lists

CS6290: Privacy Enhancing Technologies 50


Brave New World?

How are these identifiers different from


Today, we are announcing the
third-party cookies… launch of Atlas. We’ve rebuilt
Atlas from the ground up to
tackle today’s marketing
Google AdID challenges, like reaching people
across devices and bridging the
gap between online impressions
and offline purchases

CS6290: Privacy Enhancing Technologies 51


Case study: Google Ads

CS6290: Privacy Enhancing Technologies 52


Google’s Place in Advertising

CS6290: Privacy Enhancing Technologies 53


Sponsored Search

CS6290: Privacy Enhancing Technologies 54


Behind the Scene

CS6290: Privacy Enhancing Technologies 55


Display Advertising

CS6290: Privacy Enhancing Technologies 56


State-of-the-art Practice
• Internet Advertising Frontier: Real-Time Bidding (RTB) based Display
Advertising
• What is Real-Time Bidding (RTB)?
• Every online ad view can be evaluated, bought, and sold, all individually, and
all instantaneously.
• Instead of buying keywords or a bundle of ad views, advertisers are now
buying users directly.

CS6290: Privacy Enhancing Technologies 57


Suppose you have just checked the London hotels

CS6290: Privacy Enhancing Technologies 58


Relevant Ads appears in the next webpage you visited!

CS6290: Privacy Enhancing Technologies 59


RTB Advertising Mechanism

CS6290: Privacy Enhancing Technologies 60


Auction Schemes
How to bid so that
I can gain the
maximum profit?

CS6290: Privacy Enhancing Technologies 61


One Natural Idea
• Based on the prediction that how likely the user is going to click the
displayed ad!

CTR: click through rate

CS6290: Privacy Enhancing Technologies 62


Methods
Linear Models Non-linear Models
• Logistic Regression • Gradient Boosting
• With stochastic gradient Decision Trees
descent (SGD) learning
• Factorization Machines
• Sparse solution
• Combined Models
• Online Bayesian Profit
Regression • Deep Neural Networks
•… •…

CS6290: Privacy Enhancing Technologies 63


Data Management Platform (DMP)
• A data warehouse that stores, merges, and sorts, and labels it out in
a way that’s useful for marketers, publishers and other businesses.

How DMP track


a user across
different sites?

CS6290: Privacy Enhancing Technologies 64


Cookie Sync: Merging Audience Data

The process by which two different trackers link the IDs they’ve
given to the same user

CS6290: Privacy Enhancing Technologies 65


Cookie Sync

CS6290: Privacy Enhancing Technologies 66


Cookie Sync (cont.)

Is cookie syncing privacy violating?

CS6290: Privacy Enhancing Technologies 67


Case study: Location privacy

CS6290: Privacy Enhancing Technologies 68


Location privacy?

What’s so special about “location” that it is


worth inventing a special category for it?

CS6290: Privacy Enhancing Technologies 69


CS6290: Privacy Enhancing Technologies 70
CS6290: Privacy Enhancing Technologies 71
CS6290: Privacy Enhancing Technologies 72
CS6290: Privacy Enhancing Technologies 73
CS6290: Privacy Enhancing Technologies 74
https://www.nytimes.com/interactive/2019/12/20/opinion/locati
on-data-national-security.html

CS6290: Privacy Enhancing Technologies 75


Background
• A dataset with more than 50 billion location pings from the phones of
more than 12 million people in the USA.
• Randomly sampled from 2016 and 2017
• Anonymous
A typical day at Grand Central
Terminal in New York City

CS6290: Privacy Enhancing Technologies 76


• “it took only minutes — with assistance from publicly
available information — for us to deanonymize location data
and track the whereabouts of President Trump.”

CS6290: Privacy Enhancing Technologies 77


CS6290: Privacy Enhancing Technologies 78
CS6290: Privacy Enhancing Technologies 79
CS6290: Privacy Enhancing Technologies 80
Who Reveals the Trace?
• The same phone pinged a dozen times at the nearby Secret Service
field office and events with elected officials
→ The phone is believed belonged to a Secret Service agent
→ Trump’s entourage
• Connecting the agent’s home to public deeds further reveals the
person’s name

“The Secret Service declined to comment on


our findings or describe its policies regarding
location data.”

CS6290: Privacy Enhancing Technologies 81


In Fact, Smartphones Track Everyone,
Everyday

CS6290: Privacy Enhancing Technologies 82


Players
• In addition to the most known giant players like Google Maps and
Facebook
• There are plenty more…

CS6290: Privacy Enhancing Technologies 83


How They Get Data?
• Big players have their own apps
• For small players (companies), it’s costly and challenging to build apps
and large audiences from scratch for gaining location data
• Easier to just buy location data from other big players
• E.g., piggyback on bigger app developers, inserting their tracking programs
into established apps via something called software development kits (known
as S.D.K.s)

CS6290: Privacy Enhancing Technologies 84


Companies often pay the apps for access, doling out as
much as USD$20 per 1,000 unique users each month or as
little as $2 per 1,000, depending on the value they can gain

CS6290: Privacy Enhancing Technologies 85


“every app is potentially leaking data to five
or 10 other apps. Every S.D.K. is taking your
data and doing something different … to
learn more about you.”

CS6290: Privacy Enhancing Technologies 86


IOS + Android

CS6290: Privacy Enhancing Technologies 87


The Bright Side?
• The collected location data can help catch criminals
• Help relieve traffic congestion
• Etc…

CS6290: Privacy Enhancing Technologies 88


How to balance the trade-offs
between privacy and utilities?

CS6290: Privacy Enhancing Technologies 89


Transparency Enforcement
• Users should be able to cleanly opt out the tracking practice
• Clearly know what data have been collected and what they will do
with them

CS6290: Privacy Enhancing Technologies 90


Privacy Laws (to list a few)
• The General Data Protection Regulation in the European (GDPR)
• The California Consumer Privacy Act (CCPA)
•…

After Brexit…
CS6290: Privacy Enhancing Technologies 91
The Game Changer?
• It have positive impacts
• Giant companies have to be careful now, otherwise they will receive huge
fines…

• They will surely figure out workarounds, though…

CS6290: Privacy Enhancing Technologies 92


Solution: hiding?

CS6290: Privacy Enhancing Technologies 93


Enforce confidentiality over the Internet
• Methods for protecting personal data

• Most Common/Successful? TLS.

• Comes with browser. Also used for protecting email. It just works, without you having to
configure anything. Protects contents of communication from passive eavesdroppers and
active MITM attacks.

• Tools that provide confidentiality also provide some privacy. You probably don't want
your landlord or coffee shop customers to learn things about you.

94
Encouraging HTTPS Adoption

2014: HTTPS used as a page rank indicator

Early 2018: Mozilla announces that new features will require HTTPS

Late 2018: New Chrome HTTPS indicators

(HTTPS)

(HTTP)

95
Chrome Page Loads over HTTPS

Google Transparency Report


96
STARTTLS as seen by Gmail
100%Yahoo and Hotmail
deploy STARTTLS

75%
Today, 92-93% of
messages are encrypted
50%

Gmail rolls out indicators


25%

0% Gmail Inbound Gmail Outbound


2013 2014 2016 2017 2018 2020
97
Protecting Metadata

TLS only protects content. What doesn’t TLS protect against?


We may want to protect metadata:
• Who is visiting what websites? Who is sending messages to whom?
• Gov’t might not like that you’re visiting Human Rights Watch website
• Gov’t might not be amused that you’re sending messages to Human
Rights Watch
• We may want to hide the existence of the message (maybe sending an
encrypted message at all is going to cause you problems)

98
Solution: anonymization

CS6290: Privacy Enhancing Technologies 99


What is Anonymity?

Anonymity (“without name’’) means that a person is not identifiable within a set of
subjects
Unlinkability of action and identity
- For example, sender and his email are no more related after adversary’s observations
than they were before
- Who talks to whom
Unobservability
- Adversary cannot tell whether someone is using a particular system and/or protocol

100
What is Anonymity?
To protect privacy:
- Avoid tracking by advertising companies
- Viewing sensitive content
- Information on medical conditions
- Advice on bankruptcy
Protection from prosecution
- Not every country guarantees free speech
To prevent chilling-effects
- It’s easier to voice unpopular or controversial opinions if you are anonymous

101
What is Anonymity?

Internet anonymity is hard…


• Right there in every packet is the source and destination IP address
• ISPs store communications records
- Law enforcement can subpoena these records
• Wireless traffic can be trivially intercepted
• Tier 1 ASs and IXPs are compromised — NSA, GCHQ, “Five Eyes”

102
Anonymity

Difficult if not impossible to achieve on your own


You generally need help.

State of the art technique: Ask someone else to send it for you

103
Naive approach .... VPNs

104
Naive approach .... VPNs

105
• Tor is a successful
privacy enhancing
technology that works
at the transport layer
• Millions of active users.
• Normally, a TCP
connection reveals your
IP address
• Tor allows TCP
connections without
revealing your IP

106
Tor (“The Onion Router”)
Tor operates by tunneling traffic through multiple “onion routers” using
public key cryptography

107
Who Knows What?
Entry node: knows Alice is using Tor, and identity of middle node,
but not destination
Exit node: knows some Tor user is connecting to destination, but
not which user
Destination: knows a Tor user is connecting to it via the exit node

Tor does not provide encryption between exit and destination


(use HTTPS!)

108
Does Tor Provide Anonymity?
• Tor provides for anonymity in TCP connections over the Internet, both unlinkably (long-
term) and linkably (short-term).

• What does this mean?


• There’s no long-term identifier for a Tor user
• If a web server gets a connection from Tor today, and another one tomorrow, it won’t
be able to tell whether those are from the same person
• But two connections in quick succession from the same Tor node are more likely to in
fact be from the same person

109
Tor Challenges
Attack: government can Attack: adversary operates
Performance: message coerce server operates in all of the mixes
bounces around a lot (can one country • Defense: have lots of mix servers
• Defense: use mix servers in (Tor has ~7,000 onion routers
be slow) today). Use diverse set.
different legal jurisdictions

A side channel attack –


Attack: adversary observes exploits timing information
when Alice sends and • Defenses: pad messages,
when Bob receives, links introduce significant delays
the two together • Tor does the former, but notes
that it’s not enough for defense

110
Guard Relays
How do you protect against an adversary creating a large number of onion routers
and performing timing observation at entrance and exits?

Limit the servers used for initial connection to a subset of trusted nodes:
- Have long and consistent uptimes…
- Have high bandwidth…
- Are manually vetted by the Tor community

Tor client selects 3 guard relays and uses them for 3 months

111
Exit Nodes
Relays must self-elect to be exit nodes. Why?

- Legal problems

- If someone does something malicious or illegal using Tor and the police trace the
traffic, the trace leads to the exit node

- By self-electing to become an exit node, the relay operator is aware of the


potential legal risks

112
Tor Hidden Services
As described, Tor protects the identity of the client, but not the server

What if we want to run an anonymous service?


- a website, where nobody knows the IP address?

Tor supports Hidden Services…


- Allows you to run a server without disclosing the IP or DNS name

Many hidden services


- Duck Duck Go, Tor Chat, Wikileaks

113
114
115
116
117
118
119
120
Silk Road Marketplace

121
Who uses anonymity systems?
“If you’re not doing anything wrong, you shouldn’t have anything to
hide.”

- Implies that anonymous communication is for criminals

The truth: who uses Tor?

- Journalists, Law Enforcement, Human Rights Activists, Business


Executives, Intelligence/Military, Normal People

122
Internet Censorship
Government censors
Block websites containing “offensive” content
Commonly employ blacklist approach

Observed techniques
IP blocking, DNS blackholes, forged RST packets

Popular countermeasures
Mostly proxy based — Tor, Freenet, Ultrasurf, …
Problem: Cat-and-mouse game

123
Internet Censorship

124
Tor Bridges
Anyone can look up the IP addresses of Tor relays
- Public information in the consensus file

Many countries block traffic to these IPs


- Essentially a denial-of-service against Tor

Solution: Tor Bridges


- Tor proxies that are not publicly known

125
Tor Bridges

126
Obfuscating Tor Traffic
Bridges alone may be insufficient to get around all types of censorship
- DPI can be used to locate and drop Tor frames

Countries would proactively detect and block bridges


- Tor client is public, simply download it
- cat-and-mouse game

Tor adopts a pluggable transport design


- Tor traffic is forwarded to an obfuscation program
- Obfuscator transforms the Tor traffic to look like some other protocol
- BitTorrent, Skype, HTTP, streaming audio, etc.
127
Decoy Routing (Telex)

128
Decoy Routing (Telex)

129
Decoy Routing (Telex)

130
Decoy Routing (Telex)

131
Decoy Routing (Telex)

132
Related References
• New York Times, “The Privacy Project”,
https://www.nytimes.com/interactive/2019/opinion/internet-privacy-project.html
• Vincent Toubiana, et al., “Adnostic: Privacy Preserving Targeted Advertising”, in NDSS’10
• Saikat Guha, Bin Cheng, and Paul Francis, “Privad: Practical Privacy in Online Advertising”, in
NDSS’11
• Michael Backes et al., “ObliviAd: Provably Secure and Practical Online Behavioral Advertising”, in
IEEE S&P’ 12
• Alexey Reznichenko et al., “Private-by-Design Advertising Meets the Real World”, in CCS’14
• Mathias Lecuyer et al., “Sunlight: Fine-grained Targeting Detection at Scale with Statistical
Confidence”, in CCS’15
• Athanasios Andreou et al., “Investigating Ad Transparency Mechanisms in Social Media: A Case
Study of Facebook’s Explanations”, in NDSS’18
• Giridhari Venkatadri et al., “Treads: Transparency-Enhancing Ads”, in HotNets’19

CS6290: Privacy Enhancing Technologies 133

You might also like