Privacy
USC CSci430
Privacy
• The state or condition of being free from observation.
Privacy
• The right of people to choose freely under what circumstances and to what extent they will reveal themselves, their attitudes, and their behavior to others.
Privacy is not black and white
• Lots of grey areas and points for discussion
• What seems private to you may not seem private to me
• Three examples to start us off:
  – HTTP Cookies
  – Google Street View
  – Facebook
HTTP cookies: What are they?
• Cookie = small text file
• Received from a server (usually a web server), stored on your machine
• Purpose: HTTP is stateless, so cookies maintain state for the HTTP protocol
  – E.g., keeping the contents of your “shopping cart” while you browse a site
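A minimal sketch of this state-keeping exchange, using Python’s standard-library `http.cookies` module (the cookie name `cart` and value `sku123` are invented for illustration):

```python
from http.cookies import SimpleCookie

# Server side: attach a cookie to the HTTP response so the next request
# can be tied back to this visitor's shopping cart.
response = SimpleCookie()
response["cart"] = "sku123"
response["cart"]["path"] = "/"
set_cookie_header = response.output()  # a "Set-Cookie: ..." header line

# Browser side: store the cookie and echo it back on every later request,
# which is what restores the "state" that HTTP itself does not keep.
jar = SimpleCookie()
jar.load("cart=sku123")
cookie_header = "Cookie: " + "; ".join(
    f"{name}={morsel.value}" for name, morsel in jar.items())
```

The server never stores the session; the browser carries it back on each request.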
HTTP cookies: 3rd party cookies
• You visit your favorite site, unicornsareawesome.com
• unicornsareawesome.com pulls ads from lameads.com
• You get a cookie from lameads.com, even though you never visited lameads.com
• lameads.com can track your browsing habits every time you visit any page with ads from lameads.com… those might be a lot of pages
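A toy model of what the ad server can log: every ad request carries the same lameads.com cookie plus the page that embedded the ad (all cookie IDs and page names here are invented):

```python
from collections import defaultdict

# Server-side log at the hypothetical ad network: one entry per ad impression.
profile = defaultdict(list)

def serve_ad(cookie_id, embedding_page):
    # The third-party cookie links impressions from unrelated sites together.
    profile[cookie_id].append(embedding_page)

serve_ad("u-42", "unicornsareawesome.com/home")
serve_ad("u-42", "news.example.com/politics")
serve_ad("u-42", "health.example.com/symptoms")

# One cookie, three unrelated sites: a cross-site browsing history.
history = profile["u-42"]
```

No single site told lameads.com anything unusual; the profile emerges purely from the shared cookie.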
HTTP cookies: Grey Area?
• 3rd party cookies allow ad servers to personalize your ads = more useful to you. Good!
• But:
  – You chose to go to unicornsareawesome.com = OK with unicornsareawesome.com knowing how you use their site
  – Nowhere did you choose to let lameads.com monitor your browsing habits
Short Discussion:
• Collusion: tool to track these 3rd party cookies
• TED talk on “Tracking the Trackers”
  – http://www.ted.com/talks/gary_kovacs_tracking_the_trackers.html
Google Street View: What is it?
• Google cars drive around and take 360° panoramic pictures
• Images are stitched together and can be browsed on the Internet
Google Street View: Me
[Street View image]
Google Street View: Lots to See
[Street View images]
Google Street View: Grey Area
• Expectation of privacy?
  – I’m in public, I can expect people will see me
• Expectations?
  – Picture linked to location
  – Searchable
  – Widely available
  – Available for a long time to come
Facebook: What is it?
• Social networking site
– Connect with friends
– Share pictures, interests (“likes”)
Facebook: Grey Area
• Who uses Facebook data, and how is it used?
  – 4.7 million liked a page about health conditions or treatments. Insurance agents?
  – 4.8 million shared information about dates of vacations. Burglars?
  – 2.6 million discussed recreational use of alcohol. Employers?
Facebook: More Grey
• Security issues with Facebook
• Confusion over privacy settings
• Sudden changes in default privacy settings
• Facebook tracks browsing habits, even if a user isn’t logged in (third-party cookies)
Today’s Agenda
• Privacy and Privacy & Security
• How do we “safely” share private data?
• Privacy and Inferred Information
• Privacy and Social Networks
• How do we design a system with privacy in mind?
Privacy and Privacy & Security
Examples of private information
• Tons of information can be gained from Internet use:
  – Behavior
    • E.g., Person X reads reddit.com at work.
  – Preferences
    • E.g., Person Y likes high heel shoes and uses Apple products.
  – Associations
    • E.g., Person X and Person Y are friends.
  – PPI (private, personal/protected information)
    • E.g., credit card #s, SSNs, nicknames, addresses
  – PII (personally identifying information)
    • E.g., your age + your address = I know who you are, even if I’m not given your name.
How do we achieve privacy?
• Policy + security mechanisms + law + ethics + trust
• Anonymity & anonymization mechanisms
  – Make each user indistinguishable from the next
  – Remove PPI & PII
  – Aggregate information
Who wants private info?
• Governments – surveillance
• Businesses – targeted advertising, following trends
• Attackers – monetize information or cause havoc
• Researchers – medical, behavioral, social, computer
Who has private info?
• You and me
– End-users
– Customers
– Patients
• Businesses
– Protect mergers, product plans, investigations
• Government & law enforcement
– National security
– Criminal investigations
Privacy and Security
• Security enables privacy
  – Data is only as safe as the system it’s on
How do we “safely” share private data?
Why do we want to share?
• Share existing data sets:
  – Research
  – Companies
    • Buy data from each other
    • Check out each other’s assets before mergers/buyouts
• Start a new dataset:
  – Mutually beneficial relationships
    • Share data with me and you can use this service
Sharing everything?
• Easy, but what are the ramifications?
• Legal/policy may limit what can be shared/collected
  – IRBs: Institutional Review Boards
  – HIPAA: Health Insurance Portability and Accountability Act; HITECH: Health Information Technology for Economic and Clinical Health Act
• Future use and protection of data?
Mechanisms for limited sharing
• Remove really sensitive stuff (sanitization)
  – PPI & PII (private/protected and personally identifying information)
  – Without a crystal ball, this is hard
• Anonymization
  – Replace information to limit the ability to tie entities to meaningful identities
• Aggregation
  – Remove PII by only collecting/releasing statistics
Anonymization Example
• Network trace:
  [packet diagram: headers with source and destination IPs, followed by the PAYLOAD]
• Removing IPs severely limits what you can do with the data.
• Replace with something identifying, but not the same data:
  – IP1 = A
  – IP2 = B
  – Etc.
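The IP1 = A, IP2 = B idea can be sketched as a consistent pseudonymization pass over a trace: the same address always maps to the same label, so flows stay analyzable without exposing real hosts (the IPs below are made-up examples):

```python
import itertools

# Lazily generate labels host-1, host-2, ... and remember each assignment.
labels = (f"host-{i}" for i in itertools.count(1))
mapping = {}

def pseudonymize(ip):
    # Same input IP always yields the same label.
    if ip not in mapping:
        mapping[ip] = next(labels)
    return mapping[ip]

trace = [("10.0.0.5", "93.184.216.34"),
         ("10.0.0.5", "151.101.1.69")]
anon_trace = [(pseudonymize(src), pseudonymize(dst)) for src, dst in trace]
# Both flows still visibly share a source, but the real addresses are gone.
```

Note that consistency is exactly what keeps the data useful, and also what an attacker with outside knowledge can exploit, as the inference section later shows.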
Aggregation Example
• “Fewer U.S. Households Have Debt, But Those Who Do Have More, Census Bureau Reports”
Methods can be bad or good
• Just because someone uses aggregation or anonymization doesn’t mean the data is safe
• Example:
  – Release aggregate stats of people’s favorite color? If some group has only one member, the “aggregate” for that group is that one person’s answer.
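One common fix for the favorite-color problem is suppressing small cells before release, in the spirit of k-anonymity: any category with fewer than k members is withheld (the threshold and answers below are invented):

```python
from collections import Counter

# A cell with one member IS that person's answer, so don't release it.
K = 3
answers = ["blue", "blue", "blue", "green", "green", "green", "red"]

counts = Counter(answers)
released = {color: n for color, n in counts.items() if n >= K}
suppressed = sorted(color for color, n in counts.items() if n < K)
```

Here the lone “red” respondent’s answer never leaves the dataset, while the large cells are still published.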
Privacy and Inferred Information
What is Inferred?
• Take 2 sources of information and correlate the data
• X + Y = ….
• Example: Google Street View + what my car looks like + where I live = you know where I was back in November
Another example
• Paula Broadwell, who had an affair with CIA director David Petraeus, took extensive precautions to hide her identity. She never logged in to her anonymous e-mail service from her home network. Instead, she used hotel and other public networks when she e-mailed him. The FBI correlated hotel registration data from several different hotels – and hers was the common name.
Another example: Netflix & IMDB
• Netflix Prize: Netflix released an anonymized dataset of movie ratings
• Researchers at the University of Texas correlated it with public IMDB reviews and undid the anonymization
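A toy version of that linkage idea: records in an “anonymized” dataset are matched against a public dataset on quasi-identifiers – here, (movie, date) pairs. All records, names, and IDs below are invented; the real attack used fuzzier matching over far larger data:

```python
# "Anonymized" ratings: user identities replaced with opaque numbers.
anon_ratings = {
    7: {("MovieA", "2006-03-01"), ("MovieB", "2006-03-04")},
    8: {("MovieA", "2006-03-02"), ("MovieC", "2006-05-11")},
}
# Public reviews with real names attached (the IMDB side of the attack).
public_reviews = {
    "alice": {("MovieA", "2006-03-01"), ("MovieB", "2006-03-04")},
}

reidentified = {}
for name, reviews in public_reviews.items():
    for anon_id, ratings in anon_ratings.items():
        if reviews <= ratings:  # every public review matches an "anonymous" record
            reidentified[anon_id] = name
```

Neither dataset is sensitive on its own; the privacy loss comes entirely from the join.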
Privacy and Social Networks
What is social networking data?
• Associations
• Not what you say, but who you talk to
Why is social data interesting?
• From a privacy point of view:
  – Guilt by association
  – E.g., governments are very interested:
    • Phone records (US)
    • Facebook activity (Iran)
Computer Communication
• Computer communication = a social network
• What sites/servers you visit/use = information on your relationship with those sites/servers
  [diagram: You ↔ unicornsareawesome.com]
Don’t we bounce around already?
• Yes, we route packets through a series of routers
• BUT this doesn’t protect the privacy of who’s talking to whom…
• Why? Even when the payload is encrypted, the source and destination addresses are not
Yes, we bounce… but:
• Everyone along the way can see src & dst, even on encrypted traffic
• Routes are easy to figure out
Onion routing saves us
• Each router only knows about the last/next hop
• Routes are hard to figure out
  – Change frequently
  – Chosen by the source
The Onion part of Onion Routing
• Layers of encryption: the source wraps the payload in one layer per router, and each router peels off only its own layer
  [diagram: payload wrapped in nested encryption layers, traveling from You to unicornsareawesome.com]
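The layering can be sketched in a few lines. This is a toy illustration only: the XOR “cipher” below is NOT secure and merely stands in for real public-key crypto so the wrap/peel structure is visible; router names and keys are invented:

```python
import hashlib
import json

def toy_cipher(key, data):
    # Symmetric XOR with a SHA-256-derived keystream (encrypt == decrypt).
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(payload, destination, route):
    # route = [(router_name, router_key), ...]; build layers from the inside out.
    packet, next_hop = payload, destination
    for name, key in reversed(route):
        layer = json.dumps({"next": next_hop, "data": packet.hex()}).encode()
        packet, next_hop = toy_cipher(key, layer), name
    return packet  # hand this to the first router

def peel(packet, key):
    # A router sees only its own layer: the next hop and an opaque inner blob.
    layer = json.loads(toy_cipher(key, packet))
    return layer["next"], bytes.fromhex(layer["data"])

route = [("1st", b"k1"), ("2nd", b"k2"), ("3rd", b"k3")]
packet = wrap(b"GET /", "unicornsareawesome.com", route)
hops = []
for name, key in route:
    next_hop, packet = peel(packet, key)
    hops.append((name, next_hop))
# Each router learned only its next hop; only the exit sees the payload.
```

Note that only the source ever knows the full route; each router’s key opens exactly one layer.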
Onion Routing Example: Tor
• Get a list of Tor routers from the publicly known Tor directory
• Pick a route, e.g. You → 1st → 2nd → 3rd → unicornsareawesome.com
  [diagram sequence: the packet travels hop by hop; each router removes one layer of encryption and sees only the previous and next hops]
• What would happen here if You were the only one using Tor?
Side note: Tor is an overlay
• Tor routers are ordinary hosts on the Internet; Tor traffic rides on top of normal IP routing
Designing privacy preserving systems
• Aim for the minimum amount of information needed to achieve goals
• Think through how information can be inferred
  – Inference is often a gotcha! x + y = something private, even when x and y by themselves don’t seem all that special
• Think through how information can be gained
  – On the wire? Stored in logs? At a router? At an ISP?
Privacy and Stored Information
• Data is only as safe as the system it’s on
• How long data is stored affects privacy
• Longer term = bigger privacy risk (in general)
  – Longer time frame, more data to correlate & infer
  – Longer opportunity for data theft
  – Increased chance of mistakes, lapsed security, etc.
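A retention policy is one concrete way to shrink that window. A minimal sketch, assuming ISO-dated log entries (the window length, dates, and entries are invented):

```python
import datetime

# Drop log entries older than the retention window: less history to
# correlate, steal, or leak through a lapse in security.
RETENTION_DAYS = 30
today = datetime.date(2024, 1, 31)

logs = [("2024-01-30", "GET /cart from host-1"),
        ("2023-11-02", "GET /login from host-2")]  # outside the window

kept = [(stamp, entry) for stamp, entry in logs
        if (today - datetime.date.fromisoformat(stamp)).days <= RETENTION_DAYS]
```

Run periodically, this bounds how far back any future correlation or breach can reach.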