
Privacy Enhancing Technologies: Overview
DT 306 Privacy in the Digital Age
T1 2023-24
T K Srikanth
Fair Information Principles (OECD Guidelines)
OECD Guidelines on the Protection of Privacy, 1980, revised 2013

● Collection Limitation
● Data Quality
● Purpose Specification
● Use Limitation
● Security Safeguards
● Openness
● Individual Participation
● Accountability

Privacy by Design - Foundational Principles
1. Proactive not reactive; preventive not remedial
2. Privacy as the default
3. Privacy embedded into design
4. Full functionality – positive-sum, not zero-sum
5. End-to-end security – full lifecycle protection
6. Visibility and transparency – keep it open
7. Respect for user privacy – keep it user-centric

Privacy by Design - The 7 Foundational Principles: Implementation and Mapping of Fair Information Practices, Ann Cavoukian, 2009
GDPR: Key principles
● Lawfulness, fairness and transparency in processing the data of individuals
● Purpose limitation: data processed in a manner consistent with the purpose for which it is collected
● Data minimisation: adequate, relevant and minimal for the intended purpose
● Accuracy: ensure that personal data is correct and up-to-date
● Storage limitation: identifiable data kept only as long as needed for the intended processing
● Integrity and confidentiality (security): appropriate technical mechanisms and organizational policies in place to ensure security of the data
● Accountability: identified officer who is accountable for compliance
● Ensure “privacy by design”

GDPR: Rights of individuals
● The right to be informed: about collection and use
● The right of access: for example, to check for accuracy
● The right to rectification: when inaccurate or incomplete
● The right to erasure: or the “right to be forgotten”
● The right to restrict processing: suppress or restrict use of data
● The right to data portability: obtain and reuse personal data across service providers
● The right to object: to certain uses (such as direct marketing)
● Rights in relation to automated decision making and profiling: request human intervention in certain situations

Security and Privacy
● Information Security
○ Securing data at rest and data in transit
○ Securing computers, servers, networks …
○ Securing application components
○ Access control mechanisms
○ Authentication, Authorization, Auditing (AAA)
● Privacy requirements: Techniques to enable privacy
○ Anonymity
○ Unlinkability
○ Unobservability
○ Data Minimization
○ Consent-based
Privacy, Personalization, Data Utility
Would like to maximize the “utility” of data while ensuring privacy.

Trade-off between personalization and privacy: contexts such as location services, web search, recommendation systems, and a number of applications that leverage personal, location, and behavioural data.
Privacy Enhancing Technologies (PETs)
Can broadly be classified based on where they are applied:
- Client side
- Server side
- Communication or channel side
Some techniques are applicable in more than one of these.
Many techniques are based on cryptography and/or controlled data sharing.
Nesrine Kaaniche, Maryline Laurent, Sana Belguith, Privacy enhancing
technologies for solving the privacy-personalization paradox: Taxonomy and
survey, Journal of Network and Computer Applications, Volume 171, 2020
(Taxonomy figure from Kaaniche et al., 2020)
User-side PETs
● Controlling access: passwords, biometrics, multi-factor
○ Standards such as OAuth
○ Privacy preserving certification:
■ Access Credentials
■ Passwordless logins - WebAuthn, FIDO. Passkeys stored on devices and tied
to user account
● Managing consent: generate, modify, revoke
● Browser settings and extensions
○ Cookie management, Ad Blockers, Do Not Track, Block 3rd party cookies
○ Protection from Fingerprinting
● Encrypting communication: given a message m, find functions D and E such that D(E(m)) = m; transmit/store E(m)
○ HTTPS
○ End-to-end encryption
○ PGP and application-level encryption
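As a toy illustration of the D(E(m)) = m relationship, here is a one-time-pad-style XOR cipher (this is not the actual scheme used by HTTPS or PGP; it only shows the invertibility property):

```python
import os

def encrypt(key: bytes, m: bytes) -> bytes:
    # E(m): XOR each message byte with the corresponding key byte
    return bytes(k ^ b for k, b in zip(key, m))

def decrypt(key: bytes, c: bytes) -> bytes:
    # D(c): XOR is its own inverse, so the same operation recovers m
    return bytes(k ^ b for k, b in zip(key, c))

m = b"hello"
key = os.urandom(len(m))   # key as long as the message, used only once
c = encrypt(key, m)        # transmit/store E(m)
assert decrypt(key, c) == m    # D(E(m)) = m
```

Real systems negotiate keys via public-key cryptography and use authenticated ciphers (e.g. AES-GCM); the sketch above omits all of that.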
User-side PETs
● Data Obfuscation and Perturbation
○ Modifying data at collection time to minimize identification
○ Obfuscate data so that actual data is not easily available to the server
■ Server and client side uses
■ E.g. client generates dummy queries with “incorrect” attributes, so server does not
know exactly which attribute instance is correct. Useful in Location based services
○ Differential Privacy: Control how much information is revealed - “privacy budget”
■ Google RAPPOR
■ Apple DP
■ Microsoft Telemetry
● Cookie-less tracking
○ Federated Learning of Cohorts (FLoC) (also involves techniques on the server side)
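The differential-privacy idea can be sketched with classic randomized response, the mechanism underlying local-DP systems such as RAPPOR (greatly simplified here; real deployments add Bloom filters and per-client memoization):

```python
import random

def randomized_response(truth: bool, p: float = 0.75) -> bool:
    # With probability p, report the true answer; otherwise flip a fair coin.
    # The server never knows whether any single report is genuine.
    if random.random() < p:
        return truth
    return random.random() < 0.5

# The aggregate is still estimable:
# observed = p*true + (1-p)*0.5  =>  true = (observed - (1-p)*0.5) / p
random.seed(0)
true_rate = 0.30
reports = [randomized_response(random.random() < true_rate) for _ in range(100_000)]
observed = sum(reports) / len(reports)
estimate = (observed - 0.125) / 0.75   # unbias for p = 0.75
```

Each individual's report is deniable, yet the population statistic is recovered to within sampling noise; the choice of p sets the "privacy budget".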
Privacy Preserving Computations: Secure Multiparty Computation
● Enable distributed computing tasks among participating entities in a
secure manner.
● Enable parties to compute a joint task without revealing each other’s data
● Privacy-preserving computations such as private queries, private set intersections, and privacy-preserving data mining
● High computation and communication overheads
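One core SMPC building block, additive secret sharing, can be sketched as follows (illustrative only; production protocols such as SPDZ add authentication and handle malicious parties):

```python
import random

PRIME = 2**61 - 1  # all arithmetic is done modulo a large prime

def share(secret: int, n: int = 3):
    # Split a secret into n random shares that sum to it mod PRIME.
    # Any n-1 shares together reveal nothing about the secret.
    shares = [random.randrange(PRIME) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

# Three parties each hold a private input. Each distributes shares of its
# input; each party publishes only the sum of the shares it received.
inputs = [12, 30, 7]
all_shares = [share(x) for x in inputs]
partial_sums = [sum(s[i] for s in all_shares) % PRIME for i in range(3)]
total = sum(partial_sums) % PRIME   # joint sum, no individual input revealed
assert total == sum(inputs)
```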
Server-side PETs
Database and API access:
● Basic security techniques:
○ Encryption of sensitive data
○ Access control: role-based, attribute-based
● Privacy-aware access control
○ Authorization based on privacy requirements: minimization, retention period, etc.
○ Consent-based access: take application state/context into account
Data Anonymization
● Modify the database to minimize the possibility of identifying individuals in the dataset, while still preserving its statistical utility
● Typically involves:
○ Data suppression - rows or columns
○ Generalization: replace data with a “range”
○ Perturbation: add noise and/or modify data while preserving statistical measures
● Common techniques:
○ k-anonymity
○ l-diversity
○ t-closeness
● All can be shown to be vulnerable to attacks if additional information is available
● Differential Privacy increasingly being applied where practical
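Suppression and generalization for k-anonymity can be sketched as below (hypothetical records; `age` and `zip` play the role of quasi-identifiers):

```python
from collections import Counter

def generalize_age(age: int) -> str:
    # Generalization: replace an exact age with a 10-year range
    lo = (age // 10) * 10
    return f"{lo}-{lo + 9}"

def k_of(records):
    # k-anonymity: size of the smallest equivalence class
    # over the quasi-identifier attributes
    counts = Counter((r["age"], r["zip"]) for r in records)
    return min(counts.values())

raw = [
    {"age": 23, "zip": "560001"}, {"age": 27, "zip": "560002"},
    {"age": 25, "zip": "560003"}, {"age": 41, "zip": "560011"},
    {"age": 44, "zip": "560012"},
]
# Generalize ages to ranges and suppress the last zip digit
anon = [{"age": generalize_age(r["age"]), "zip": r["zip"][:5] + "*"}
        for r in raw]
```

Here the raw table is 1-anonymous (every record is unique on the quasi-identifiers), while the generalized table is 2-anonymous: each record is indistinguishable from at least one other.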
Obfuscation: Private Information Retrieval
Hide the intended meaning of a query or transmission: the query does not reveal which attribute is of interest.

● Searchable encryption schemes: keywords, queries, etc. are not revealed in plaintext

Homomorphic Encryption

● Perform computations directly on encrypted data to produce results that mirror computation on plaintext
● Run an encrypted query without revealing details to the server
● Perform queries without the client getting information about the contents of the database
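As a small illustration of the idea, textbook (unpadded) RSA is multiplicatively homomorphic: multiplying ciphertexts multiplies the underlying plaintexts. The tiny parameters below are insecure and for demonstration only:

```python
# Textbook RSA with toy parameters: n = 61 * 53 = 3233, phi(n) = 3120
p, q = 61, 53
n = p * q
e = 17
d = 2753   # modular inverse of e modulo phi(n): 17 * 2753 = 1 (mod 3120)

def enc(m: int) -> int:
    return pow(m, e, n)

def dec(c: int) -> int:
    return pow(c, d, n)

a, b = 7, 6
c = (enc(a) * enc(b)) % n   # the server multiplies ciphertexts only
assert dec(c) == a * b       # result mirrors the plaintext computation
```

Fully homomorphic schemes (supporting both addition and multiplication) exist but carry the high computational overheads noted above.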
Communication PETs
Encryption of messages: TLS, SSH, HTTPS are all part of internet standards.

IPSec: secure tunnels between machines, typically used within organizations.

Protecting location and browsing information:

● Virtual Private Networks (VPN): create secure, encrypted connections over public and private networks
● The Onion Router (Tor): anonymizes traffic, making it difficult to identify the communication channel between client and server
● Proxies: trusted third parties

Blockchain
Can be leveraged for privacy in a system of mutually mistrusting entities.

Ensures authenticity and validity of data, and controlled access: immutable and valid.

Audit capability: transaction history is verifiable.

Transactional privacy is not available by default, since data added to the chain is visible across the network; hence, encrypted data is often stored instead.

Potential system for managing consent artefacts.
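The immutability and audit properties can be illustrated with a minimal hash chain (a toy single-writer sketch; real blockchains add consensus, signatures, and Merkle trees):

```python
import hashlib, json

def add_block(chain, record):
    # Each block commits to the previous block's hash, so tampering with
    # any earlier record invalidates every subsequent hash.
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"prev": prev, "record": record}, sort_keys=True)
    chain.append({"prev": prev, "record": record,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify(chain):
    # Audit: recompute every hash and check the links
    for i, blk in enumerate(chain):
        prev = chain[i - 1]["hash"] if i else "0" * 64
        body = json.dumps({"prev": prev, "record": blk["record"]}, sort_keys=True)
        if blk["prev"] != prev or blk["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
    return True

chain = []
add_block(chain, {"user": "alice", "consent": "granted"})
add_block(chain, {"user": "alice", "consent": "revoked"})
assert verify(chain)
chain[0]["record"]["consent"] = "granted-forever"   # tamper with history
assert not verify(chain)
```

This is also why consent artefacts are a natural fit: grants and revocations become an append-only, verifiable history.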


Privacy in the times of ML
Privacy of training data:
● Should not be able to infer whether a given subject was part of the training data
Privacy of query:
● Enable prediction while maintaining privacy of the query input
Training models on distributed datasets without access to the entire data.
Federated Learning:
● Define a model
● Train the model locally on each local dataset
● Exchange weights/metadata, and recompute new weights
● Share the new weights and continue training
E.g. keyboard prediction, health data across hospitals
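The federated-learning loop above can be sketched with federated averaging on a one-parameter linear model (hypothetical data; real deployments such as keyboard prediction use neural networks plus secure aggregation):

```python
# Each client fits y = w*x on its own private data; only weights leave the device.
def local_step(w, data, lr=0.01, epochs=50):
    # Plain gradient descent on the local squared-error loss
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

clients = [
    [(1.0, 2.1), (2.0, 3.9)],   # client A's private data (roughly y = 2x)
    [(3.0, 6.2), (4.0, 8.1)],   # client B's private data (never shared)
]
w = 0.0
for _ in range(10):                                 # communication rounds
    local = [local_step(w, d) for d in clients]     # train locally
    w = sum(local) / len(local)                     # server averages weights
```

After a few rounds the shared weight approaches 2.0 even though no raw (x, y) pair ever leaves its client.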
