
RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response
PRIVACY PRESERVING MODEL (ANONYMOUS DATA COLLECTION)

ZOHAIB AKRAM (FA21-RIS-017)
AHMAD FIAZ (FA21-RIS-001)
Agenda
• Privacy Preservation
• Problem Statement
• Proposed Solution
• Crowdsourcing
• RAPPOR Algorithm
• RAPPOR in Chrome
• RAPPOR Modifications
• Results: RAPPOR implementation in Chrome
Privacy Preservation
• Our data may contain two types of information: sensitive and
non-sensitive. Sharing it with a third party without any privacy
protection can be dangerous.
• Before releasing data, we have to take precautions in terms of
privacy. Hiding the sensitive information in our data is what
privacy preservation means.
Problem Statement
• Statistical data is very important for a company because it helps
the company grow and compete in the market. However, collecting
users' true values directly is not appropriate, because doing so
compromises users' privacy and invites adversaries to attack.
Proposed Solution
• To address the above problem, RAPPOR was proposed to provide
strong privacy to users. RAPPOR was introduced specifically to
collect statistical data anonymously. The technique was introduced
by Google, and Chrome currently uses it for privacy preservation.
Crowdsourcing
• Crowdsourcing data to make better, more informed decisions is
becoming increasingly commonplace. For any such crowdsourcing,
privacy-preservation mechanisms should be applied to reduce and
control the privacy risks introduced by the data collection process,
and to balance that risk against the beneficial utility of the
collected data.
• RAPPOR is based on the concept of randomized response.
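Randomized response can be illustrated with the classic coin-flip survey. The following is a minimal sketch, assuming a single yes/no question and fair coins; the function names are illustrative and this is not RAPPOR itself, only the idea it builds on.

```python
import random

def randomized_answer(truth, rng):
    """Classic coin-flip randomized response for one yes/no question."""
    if rng.random() < 0.5:       # first coin is heads: answer truthfully
        return truth
    return rng.random() < 0.5    # otherwise: answer with a second fair coin

def estimate_true_rate(answers):
    """Invert the noise: P(yes) = 0.5 * rate + 0.25, so rate = 2 * P(yes) - 0.5."""
    observed = sum(answers) / len(answers)
    return 2 * observed - 0.5
```

No single answer reveals the respondent's true value, yet the population rate can still be estimated from many answers.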
Introduction
• RAPPOR stands for Randomized Aggregatable Privacy-Preserving Ordinal Response.
• It is a technology for collecting information from end-user, client-side
software that uses randomized response techniques to provide strong
privacy protection.
• RAPPOR is meant to collect statistics on client-side values and strings,
such as their categories, frequencies, histograms, and other set statistics,
over a large number of clients.
• It provides a strong privacy guarantee to the reporting client for every
value reported. The guarantee strictly limits the private information
disclosed, as measured by a differential privacy bound, and it holds even
for a single client that reports on the same value repeatedly.
Algorithm
• The client's value, the string "The number 68", is hashed onto the
Bloom filter B using h (here 4) hash functions. For this string, a
Permanent randomized response B' is produced and memorized by
the client. This B' is used (and reused in the future) to generate
Instantaneous randomized responses S (the bottom row of the
figure), which are sent to the collecting service.
• The reported bit array sent to the server is shown at the bottom of
the figure.
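The three client-side steps (Bloom filter B, Permanent response B', Instantaneous response S) can be sketched as follows. This is a minimal illustration, not Chrome's implementation: the SHA-256 based hashing, the filter size k = 32, and the parameter values f = 0.5, p = 0.5, q = 0.75 are assumptions chosen for the example. Here f is the probability that a bit of B is replaced by a random bit in B', and q and p are the probabilities of reporting 1 when the B' bit is 1 or 0.

```python
import hashlib
import random

def bloom_bits(value, k, h):
    """Step 1: set up to h positions of a k-bit Bloom filter B for the value."""
    bits = [0] * k
    for i in range(h):
        # Hypothetical hash family: SHA-256 of the hash index plus the value.
        digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
        bits[int.from_bytes(digest[:4], "big") % k] = 1
    return bits

def permanent_response(bits, f, rng):
    """Step 2: B'. Each bit is 1 w.p. f/2, 0 w.p. f/2, else it keeps B's bit."""
    out = []
    for b in bits:
        u = rng.random()
        if u < f / 2:
            out.append(1)
        elif u < f:
            out.append(0)
        else:
            out.append(b)
    return out

def instantaneous_response(perm, p, q, rng):
    """Step 3: S. Report 1 w.p. q where B' is 1, and w.p. p where B' is 0."""
    return [int(rng.random() < (q if b else p)) for b in perm]

rng = random.Random(1)
B = bloom_bits("The number 68", k=32, h=4)
B_perm = permanent_response(B, f=0.5, rng=rng)               # computed once, memorized
S = instantaneous_response(B_perm, p=0.5, q=0.75, rng=rng)   # fresh for every report
```

Only S leaves the client. B' is stored and reused so that repeated reports of the same value cannot be averaged to recover B.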
RAPPOR in Chrome
• Understanding population statistics is a key part of the effective, reliable
operation of online services by cloud service and software platform
operators.
• The collection of up-to-date crowdsourced statistics raises a dilemma for
service operators.
• Reporting on Chrome homepages: the Chrome Web browser has
implemented and deployed RAPPOR to collect data about Chrome clients.
• URLs
• Top-level domains
• Hosts
RAPPOR Modifications
• One-time RAPPOR: A one-time collection, enforced by the client,
does not require longitudinal privacy protection. The Instantaneous
randomized response step can be skipped in this case, and a direct
randomization of the client's true value is sufficient to provide
strong privacy protection.
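One reading of the one-time variant is sketched below: the client randomizes its true bits once and sends the result, with no second randomization layer. The parameter f (the probability that a bit is replaced by a random one) and the function name are assumptions for illustration.

```python
import random

def one_time_report(true_bits, f, rng):
    """Randomize the true bits once and report the result directly.

    Each bit is forced to 1 w.p. f/2, forced to 0 w.p. f/2, and kept w.p. 1 - f.
    """
    report = []
    for b in true_bits:
        u = rng.random()
        report.append(1 if u < f / 2 else 0 if u < f else b)
    return report

report = one_time_report([0, 1, 0, 0, 1, 0, 0, 0], f=0.5, rng=random.Random(7))
```

Because the value is reported only once, there is no sequence of reports to average over, which is why the extra layer can be dropped.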
• Basic RAPPOR: If the set of strings is small and well-defined, such
that each string can be deterministically mapped to a single bit,
there is no need to use a Bloom filter with multiple hash
functions. For example, when collecting gender, the effective
number of hash functions, h, would be 1.
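A minimal sketch of this idea, assuming a hypothetical three-value category list and a single randomization step in which a 1 bit is reported as 1 with probability q and a 0 bit with probability p. The estimator follows from E[count_i] = p*N + (q - p)*n_i, where n_i is the true count for category i.

```python
import random

CATEGORIES = ["female", "male", "other"]   # hypothetical small, fixed value set

def encode(value, p, q, rng):
    """Map the value to its single bit (h = 1), then randomize each bit once."""
    true_bits = [int(c == value) for c in CATEGORIES]
    return [int(rng.random() < (q if b else p)) for b in true_bits]

def estimate_counts(reports, p, q):
    """Unbiased per-category estimate: (ones_in_bit - p * N) / (q - p)."""
    n = len(reports)
    return [(sum(r[i] for r in reports) - p * n) / (q - p)
            for i in range(len(CATEGORIES))]

rng = random.Random(3)
reports = [encode("male", p=0.5, q=0.75, rng=rng) for _ in range(1000)]
estimates = estimate_counts(reports, p=0.5, q=0.75)
```

The estimates are noisy for a small number of reports and tighten as the number of clients grows.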
Result
• Here the RAPPOR privacy parameters are q = 0.75 and p = 0.5,
corresponding to ε = ln(3). The true sample distribution is shown in
black; light green shows the estimated distribution based on the
decoded RAPPOR reports. We do not assume a priori knowledge of
the Normal distribution in learning. If such prior information were
available, we could significantly improve upon learning the shape of
the distribution via smoothing.
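The stated ε can be checked from q and p. The sketch below assumes the single-step, h = 1 setting (Basic, one-time reports) and uses the bound ε = h · ln(q(1 − p) / (p(1 − q))); the function name is illustrative.

```python
import math

def one_time_epsilon(p, q, h=1):
    """epsilon = h * ln( q(1-p) / (p(1-q)) ) for a single randomized report."""
    return h * math.log((q * (1 - p)) / (p * (1 - q)))

print(one_time_epsilon(p=0.5, q=0.75))   # ln(3), about 1.0986
```

With q = 0.75 and p = 0.5 the ratio is 0.375 / 0.125 = 3, which gives ε = ln(3).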
Thanks

Any questions?
