
IEEE INTERNET OF THINGS JOURNAL, VOL. 10, NO. 8, 15 APRIL 2023 6733

A Numerical Splitting and Adaptive Privacy Budget-Allocation-Based LDP Mechanism for Privacy Preservation in Blockchain-Powered IoT
Kai Zhang, Jiao Tian, Hongwang Xiao, Ying Zhao, Wenyu Zhao, and Jinjun Chen, Fellow, IEEE

Abstract—Blockchain has gradually attracted widespread attention from the IoT research community due to its decentralization, consistency, and other attributes. It builds a secure and robust system by generating a local backup on each participant node so that the nodes collectively maintain the network. However, this feature raises privacy concerns: since all nodes can access the chain data, users’ sensitive information is at risk of leakage. The local differential privacy (LDP) mechanism is a promising way to address this issue because it perturbs data before they are uploaded to the chain. Traditional LDP mechanisms, however, do not fit the blockchain well, since they require a fixed input range, a large data volume, and the same privacy budget for all users, which are practically difficult to guarantee in a decentralized environment. To overcome these problems, we propose a novel LDP mechanism that splits numerical input data into digits and perturbs them bit by bit, which requires neither a fixed input range nor a large data volume. In addition, we use an iteration approach to adaptively allocate the privacy budget across the perturbation procedures, which minimizes the total deviation of the perturbed data and increases data utility. We employ mean estimation as the statistical utility metric, under both identical and randomized privacy budgets, to evaluate the performance of the proposed mechanism. The experimental results indicate that the proposed LDP mechanism performs better in different scenarios, and that the adaptive privacy budget allocation model significantly reduces the deviation of the perturbation function, providing high data utility while maintaining privacy.

Index Terms—Adaptive privacy budget allocation, blockchain, local differential privacy (LDP), mean estimation, numerical splitting.

Manuscript received 10 December 2021; accepted 7 January 2022. Date of publication 25 January 2022; date of current version 7 April 2023. (Corresponding author: Kai Zhang.)
The authors are with the Department of Computing Technologies, Swinburne University of Technology, Melbourne, VIC 3122, Australia (e-mail: kevin.zhang0522@gmail.com; 102346450@student.swin.edu.au; hxiao@swin.edu.au; yingzhao@swin.edu.au; 102506526@student.swin.edu.au; jinjun.chen@gmail.com).
Digital Object Identifier 10.1109/JIOT.2022.3145845

I. INTRODUCTION

IN RECENT years, IoT applications and relevant research [1], [2] have been growing at an impressive speed. When people use IoT devices, a large amount of personal data is collected automatically by the equipment. For data sharing or additional services (prediction, statistics, etc.), some people allow the IoT system to upload their data to cloud servers, which leads to substantial data interaction with third-party applications. This poses strict requirements on the performance and stability of the central server. To build a stable and robust interactive system, many researchers [3]–[8] have explored employing a decentralized blockchain network in IoT scenarios.

In the blockchain, each node generates a local backup of the whole chain data to maintain and synchronize the network [9]. However, such a deep supervision mechanism raises increasing privacy concerns. Since all nodes have access to the chain data, sensitive information faces the threat of leakage. To address this privacy issue, researchers have applied privacy-preserving algorithms to the blockchain, such as secure multiparty computation [10], zero-knowledge proof [11], homomorphic computation [12], [13], and so on. These studies focus on encryption approaches that work on the ciphertext, which provide higher security in theory but cost substantial computing resources. This is practically unacceptable for IoT devices given their constraints in storage and computing capability. In addition, encryption methods lose statistical utility when other users try to extract valuable information through database queries.

Since there is no centralized server in the blockchain, the local differential privacy (LDP) mechanism is a promising approach to address privacy issues, as it provides both a privacy guarantee and data utility without increasing the computational complexity. However, traditional LDP mechanisms [14]–[17] do not fit the blockchain well in terms of the mean estimation of numerical data. The reasons are as follows.
1) The input range is limited in advance. In the Laplace mechanism and Duchi’s solution, all input data need to be mapped into a range of [−A, A] in preprocessing. However, in a decentralized scenario, it is practically difficult to ask every participant to follow the same input rules.
2) The mean estimation result does not perform well for a small amount of data. A large amount of data is required to balance the positive and negative noise. When the input data volume is small, the perturbation degree is hard to control.
3) The employed privacy budget needs to be the same for all users. The privacy budget corresponds to the protection strength of LDP mechanisms. In a decentralized system, it is challenging to request all participants to choose the same protection strength, since privacy is a very subjective factor.
2327-4662 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

To address these issues, we propose a novel LDP mechanism adapted to the nature of blockchain in IoT. The usage scenario focuses on privacy preservation for numerical data in the statistical mean estimation query. In general, for any numerical input, the method splits it into individual decimal digits and translates each digit into a binary value for perturbation. After perturbation, the perturbed digits are aggregated by the inverse function. In addition, our method employs an iteration approach to adaptively allocate privacy budgets to the different perturbation procedures so as to minimize the total deviation of the perturbation functions. According to the experimental results, the proposed mechanism provides both a strong privacy guarantee and high utility on mean estimation, with no requirement of a fixed input range, a large data volume, or the same privacy budget. The main contributions of this article are summarized as follows.
1) We propose a novel LDP mechanism that fits the blockchain well, with no requirement of a limited input range or data volume.
2) We solve the problem of implementing the LDP mechanism in a blockchain whose participants employ different privacy budgets.
3) We provide an iteration approach to adaptively allocate privacy budgets for different perturbation procedures.

Paper Organization: Section II presents related work and problem analysis. Section III introduces preliminaries. Section IV expounds on our algorithm and approaches. The experimental results and evaluation metrics are demonstrated in Section V. Section VI concludes the article and discusses future directions.

II. RELATED WORK AND PROBLEM ANALYSIS

In the past few years, many researchers have contributed to privacy preservation in the blockchain. Wu et al. [18] presented blockchain-based solutions to address the privacy issues in 5G-enabled drone communications. They also explored [19] privacy preservation in the blockchain and edge computing for Industry 4.0. Regarding studies of DP and LDP mechanisms in the blockchain: Hassan et al. [20] discussed privacy-preserving solutions for blockchain-based IoT systems. Mohanta et al. [21] employed the blockchain as a secure database to address the privacy issues in IoT scenarios. Zhao et al. [22] used the blockchain to trace the update operations in a federated learning model to avoid malicious attacks. Gai et al. [23] integrated IoT with edge computing and blockchain, and proposed a framework to establish a privacy-preserving mechanism for industrial IoT scenarios. Zhao et al. [24] proposed a blockchain-based approach to save and track the cost of the DP model.

In general, many studies regard the blockchain system as a secure data collector, while paying less attention to the privacy issues of the blockchain itself. According to the previous discussion, the LDP mechanism can be a feasible solution for the blockchain system in IoT scenarios. However, traditional LDP mechanisms are designed for a centralized server environment, which is the essential reason why they do not adapt well to the blockchain. In the centralized scenario, there is only one data collector. Hence, users have to follow the same data uploading rules, including a fixed input data range and the same privacy budget. Otherwise, the traditional LDP mechanisms cannot conduct preprocessing (the Laplace mechanism needs to work out the global sensitivity Δf, and Duchi’s solution needs to implement discretization) and the data utility degrades greatly. In the blockchain, by contrast, each node can be regarded as a data collector. Nodes are equal to each other, which means that each of them has the right to set the data upload regulation. The same holds for the users, who tend to set the privacy protection strength (privacy budget ε) by themselves.

Based on the above discussion, we propose a novel LDP mechanism for the blockchain in IoT scenarios to overcome these problems, and it is expounded in the following sections.

III. PRELIMINARIES

The LDP mechanism removes the requirement of a trusted third-party collector. The users perturb their data before uploading them to the server, and since the raw data stay hidden locally, privacy is protected. The LDP model is defined as follows: given a mechanism M with domain Dom(M) and range Ran(M), if any two input records t and t′ (t, t′ ∈ Dom(M)) and any output t∗ (t∗ ⊆ Ran(M)) satisfy (1), then the mechanism M satisfies ε-local differential privacy:

$$ P[M(t) = t^{*}] \le e^{\varepsilon} \times P[M(t') = t^{*}]. \tag{1} $$

From the definition, we can learn that the key point of the LDP model is to control the output of mechanism M. For any two inputs t and t′ from Dom(M), the outputs of mechanism M have a similar distribution. According to (1), when the privacy budget ε is close to 0, the probability P(M(t) = t∗) is nearly equal to P(M(t′) = t∗), which indicates that the algorithm is highly protective of the input data. The smaller the privacy budget, the stronger the data privacy protection, and the worse the data utility we obtain.

IV. OUR PROPOSED APPROACH

A. LDP-Based Data Interaction Framework in Blockchain

The overall framework of the proposed LDP model and data stream is demonstrated in Fig. 1. Specifically, the IoT devices collect users’ data and implement perturbation locally, then upload the perturbed data to the intermediate client module. The intermediate client module is directly connected to the smart contract through API functions and forwards the perturbed data from users to the blockchain network. All data uploading processes must pass the consensus. Whenever an upload fails because of a network malfunction, the intermediate client module resends the request to fix the error. Third-party data analysts can send a query to the smart contract to obtain the mean estimation result; the querying process is presented in detail in the part on the smart contract.
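Returning to definition (1) in Section III, the inequality can be checked numerically for a small mechanism. The following is a minimal sketch, assuming plain binary randomized response (one input bit, one output bit) rather than the mechanism proposed in this article; the function names `rr_probabilities` and `satisfies_ldp` are hypothetical and introduced only for illustration.

```python
import math


def rr_probabilities(epsilon: float) -> dict:
    """Output distribution of binary randomized response (an assumed toy mechanism).

    table[t][out] is P[M(t) = out]; the input bit is kept with
    probability p = e^eps / (e^eps + 1) and flipped otherwise.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return {0: {0: p, 1: 1.0 - p}, 1: {0: 1.0 - p, 1: p}}


def satisfies_ldp(table: dict, epsilon: float, tol: float = 1e-12) -> bool:
    """Check inequality (1) for every pair of inputs and every output."""
    bound = math.exp(epsilon)
    for t in table:
        for t_prime in table:
            for out in (0, 1):
                # tol absorbs floating-point rounding when (1) holds with equality.
                if table[t][out] > bound * table[t_prime][out] + tol:
                    return False
    return True


# A mechanism built for budget 0.5 meets the 0.5-LDP bound ...
ok = satisfies_ldp(rr_probabilities(0.5), 0.5)
# ... while one built for the weaker budget 1.0 does not.
too_weak = satisfies_ldp(rr_probabilities(1.0), 0.5)
```

The check enumerates all input pairs, which is only practical for tiny domains; it is meant to make the role of ε in (1) concrete, not to replace a proof.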


Fig. 1. Framework of data perturbation and interaction in the IoT scenario.

Algorithm (a) Split Numerical Data into Digits by Bit. (b) Transfer Digits into Four-bit Binary Mode
Input: Users’ sensitive numeric data Nr,
       Dp denotes the accuracy of the number of decimal places.
Output: Four-bit binary mode array of integer part int_ANr
        Four-bit binary mode array of decimal part decimal_ANr
Function transfer(num)
    Bm[0 . . . 9] ← ["0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111", "1000", "1001"]
    for j < 10 do
        if num = j then
            return Bm[j]
        end if
    end for
End
Set arrays int_ANr, decimal_ANr to be empty
int_Nr ← int(Nr)
decimal_Nr ← Nr − int_Nr
int_LNr ← length(int_Nr)
decimal_LNr ← Dp
######### For Integer part #########
for i < int_LNr do
    Di ← int_Nr % 10
    int_Nr ← int_Nr // 10
    Bi ← transfer(Di)
    int_ANr[i] ← Bi
end for
######### For Decimal part #########
for j < decimal_LNr do
    decimal_Nr ← decimal_Nr ∗ 10
    Di ← int(decimal_Nr)
    decimal_Nr ← decimal_Nr − Di    (remove the extracted digit before the next step)
    Bi ← transfer(Di)
    decimal_ANr[j] ← Bi
end for
Return: int_ANr, decimal_ANr

B. Novel LDP Algorithm

In general, the proposed algorithm can be divided into three steps: 1) encoding; 2) perturbation; and 3) aggregation. Each step is described as follows.

1) Encoding: For any numerical input, the encoding function splits it into its individual decimal digits. It then translates each digit from decimal mode into binary mode.

As shown in the pseudocode above, during the encoding procedure the model takes the user’s sensitive numerical data Nr as input, and the parameter Dp specifies the number of decimal places to retain. The procedure produces the 4-bit binary arrays of both the integer part int_ANr and the decimal part decimal_ANr. The function transfer uses enumeration to convert between decimal and binary mode: once the input digit matches an index of array Bm, the function returns the corresponding binary string value. Di denotes the decimal value of each digit split from Nr, and Bi denotes the corresponding 4-bit binary value. When the input data have a decimal part, the integer part and the decimal part need to be processed separately. Specifically, the decimal part needs to be converted into an integer before processing.

To explain the encoding procedure more clearly, we give an example here (as shown in Table I): for the input number 256.48, we set Dp = 2, then it is as follows.

TABLE I
ENCODING EXAMPLE OF NUMERICAL NUMBER Nr = 256.48

The following perturbation approach is based on the generalized randomized response (GRR) algorithm, which requires the input data to be a binary value before the algorithm is applied. That is the reason why we transfer each digit into binary mode.

2) Data Perturbation: The perturbation function is applied to each 4-bit binary value. To be specific, for each bit the perturbation function returns the original value with probability p and the opposite value with probability 1 − p. In this way, the perturbed 4-bit binary value is obtained. Before perturbation, we need to set the key parameter, the total privacy budget ε. Here, we use α to denote the length of the valid input number (2), and allocate the privacy budget evenly to each digit

$$ \alpha = \mathrm{int\_L}_{N_r} + \mathrm{decimal\_L}_{N_r}. \tag{2} $$

Then, the privacy budget of the perturbation for each digit equals ε/α (i.e., the sequential composition feature of the differential privacy model proposed by McSherry [25]). Since the perturbation function applies GRR to a 4-bit binary value, according to the definition of differential privacy

$$ p = \frac{e^{\varepsilon/(4\alpha)}}{e^{\varepsilon/(4\alpha)} + 1}. \tag{3} $$
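The encoding step and the keep probability in (3) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the input is assumed to be a non-negative decimal string, `Decimal` is used to avoid floating-point digit errors, and the names `encode` and `keep_probability` are hypothetical.

```python
import math
from decimal import Decimal


def encode(number: str, dp: int):
    """Split a non-negative decimal string into 4-bit strings, one per digit.

    Integer digits are returned least-significant first, mirroring the
    modulo/division loop of the encoding pseudocode; dp is the number of
    decimal places to keep (Dp in the text).
    """
    value = Decimal(number)
    int_part = int(value)
    frac_part = value - int_part

    int_bits = []
    for _ in range(len(str(int_part))):
        int_part, digit = divmod(int_part, 10)
        int_bits.append(format(digit, "04b"))

    dec_bits = []
    for _ in range(dp):
        frac_part *= 10
        digit = int(frac_part)
        frac_part -= digit  # drop the extracted digit before the next step
        dec_bits.append(format(digit, "04b"))
    return int_bits, dec_bits


def keep_probability(epsilon: float, alpha: int) -> float:
    """Per-bit probability p of keeping the original bit, following (3)."""
    x = math.exp(epsilon / (4 * alpha))
    return x / (x + 1.0)


# The example from the text: Nr = 256.48 with Dp = 2, so alpha = 3 + 2 = 5.
int_bits, dec_bits = encode("256.48", dp=2)
p = keep_probability(epsilon=1.0, alpha=len(int_bits) + len(dec_bits))
```

For 256.48 the integer digits are emitted as 6, 5, 2 (least significant first) and the decimal digits as 4, 8. With ε = 0 the keep probability is exactly 0.5, that is, each bit is replaced by a fair coin flip.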


TABLE II
PROBABILITY OF PERTURBATION FUNCTION

Algorithm (a) Implement GRR Algorithm on Each Four-bit Binary Digit by Bit. (b) Transfer Perturbed Four-bit Binary Digit into Decimal Value
Input: Four-bit binary mode array of integer part int_ANr
       Four-bit binary mode array of decimal part decimal_ANr
Output: Perturbed array of integer part int_RP
        Perturbed array of decimal part decimal_RP
Function perturb(bin)
    Set Ret[0 . . . 3] to be empty
    for i < 4 do
        r ← rand.uniform(0, 1)
        if r > p do
            if bin[i] = 0 then
                Ret[i] = 1
            else
                Ret[i] = 0
            end if
        else
            Ret[i] = bin[i]    (keep the original bit with probability p)
        end if
    end for
    return Ret
End
Function transfer(num)
    Bm[0 . . . 15] ← ["0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111", "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"]
    for j < 16 do
        if num = Bm[j] then
            return j
        end if
    end for
End
Giving ε, int_LNr, decimal_LNr
Set p, int_RP, decimal_RP to be empty
α ← int_LNr + decimal_LNr
p ← e^(ε/(4α)) / (e^(ε/(4α)) + 1)
######### For Integer part #########
for i < int_LNr do
    Tmpi ← perturb(int_ANr[i])
    Ri ← transfer(Tmpi)
    int_RP[i] ← Ri
end for
######### For Decimal part #########
for i < decimal_LNr do
    Tmpi ← perturb(decimal_ANr[i])
    Ri ← transfer(Tmpi)
    decimal_RP[i] ← Ri
end for
Return: int_RP, decimal_RP

As Table II shows, there are five possible perturbation results: 1) the binary value is unchanged; 2) 1 bit is changed; 3) 2 bits; 4) 3 bits; and 5) 4 bits. The next step is to transfer the perturbed 4-bit binary value into a decimal value. The detailed enumeration result for input digits (from 0 to 9) is given in the Appendix. According to the enumeration methodology, the general mathematical expectation of the output of the perturbation algorithm for an input digit n is

$$ \mathrm{Result}_P(n) = (-15 + 2n)p + 15 - n. \tag{4} $$

As shown in the pseudocode above, during the perturbation procedure the input is the 4-bit binary arrays of the integer part int_ANr and the decimal part decimal_ANr. The perturbed outputs int_RP and decimal_RP correspond to the integer part and the decimal part, respectively, and have been transferred into decimal mode. Here α denotes the length of the valid input number and p is the probability that a bit keeps its original value.

3) Aggregation: From the output of the second step, we have obtained the perturbed decimal value of each digit. However, the mathematical expectation of the perturbation function is biased for the input digits. To rectify the output, we adopt a coefficient C to adjust the output result, according to (4)

$$ C = 2n + (15 - 2n)p - 15. \tag{5} $$

After adding the coefficient C, the output result is unbiased. Then, the last step is to aggregate all perturbed digits.

Algorithm (a) Rectify the Perturbed Decimal Array RP. (b) Aggregate All Perturbed Digits
Input: Perturbed array of integer part int_RP
       Perturbed array of decimal part decimal_RP
Output: Final output result Sp
Giving Dn, int_LNr, decimal_LNr
Set Sp, int_Sp, decimal_Sp to be empty
######### For Integer part #########
T ← 1
for i < int_LNr do
    n ← Dn[i]
    C ← 2n + (15 − 2n)p − 15
    int_Sp ← int_Sp + (C + int_RP[i]) ∗ T
    T ← T ∗ 10
end for
######### For Decimal part #########
T ← 1
for i < decimal_LNr do
    n ← Dn[i]
    C ← 2n + (15 − 2n)p − 15
    T ← T ∗ 0.1
    decimal_Sp ← decimal_Sp + (C + decimal_RP[i]) ∗ T
end for
Sp ← int_Sp + decimal_Sp
Return: Sp

Here, array Dn denotes the raw input digits, and the final output result is Sp. This completes the whole process of the novel LDP mechanism. Since the proposed LDP mechanism can be applied directly to numerical data with no other requirements, it adapts well to the decentralized system. Furthermore, because the perturbation function is based on the 4-bit binary value, the output result for each digit is bounded in the range of [−C, 15 + C]. Hence, the bounded output has better utility performance when different privacy budgets are employed.

C. Iteration Approach to Minimize Deviation

With the proposed LDP mechanism, we can obtain great data utility in the statistical mean estimation query. However, in the previous procedures, we allocate the privacy budget evenly to each digit, which leads to a larger deviation in the high-order digits. To further improve data utility, we employ an iteration approach to adaptively allocate the privacy budget to each digit so as to minimize the total deviation of our perturbation function. The deviation and its decrement rate are defined as follows:

$$ \mathrm{deviation} = \mathrm{abs}(\mathrm{perturbed\_data} - \mathrm{original\_data}) \tag{6} $$

$$ \mathrm{decrement\_rate} = \frac{\mathrm{abs}(\mathrm{deviation}_{\mathrm{last}} - \mathrm{deviation}_{\mathrm{current}})}{\mathrm{deviation}_{\mathrm{last}}}. \tag{7} $$
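The perturbation and rectification steps of Section IV-B, together with the deviation in (6), can be illustrated for a single digit. The sketch below is a hedged example rather than the authors' code: it assumes the per-bit budget is passed in directly (instead of being derived from ε/(4α)), uses a seeded `random.Random` for repeatability, and the helper names are hypothetical.

```python
import math
import random


def keep_probability(bit_budget: float) -> float:
    """Probability of keeping a bit, given an assumed per-bit budget."""
    e = math.exp(bit_budget)
    return e / (e + 1.0)


def perturb_digit(digit: int, p: float, rng: random.Random) -> int:
    """Flip each of the 4 bits of `digit` independently with probability 1 - p."""
    out = 0
    for k in range(4):
        bit = (digit >> k) & 1
        if rng.random() > p:
            bit ^= 1
        out |= bit << k
    return out  # a value in 0..15


def expected_output(digit: int, p: float) -> float:
    """Expectation of the perturbed digit, equation (4)."""
    return (-15 + 2 * digit) * p + 15 - digit


def correction(digit: int, p: float) -> float:
    """Rectification coefficient C, equation (5)."""
    return 2 * digit + (15 - 2 * digit) * p - 15


rng = random.Random(0)
p = keep_probability(1.0)
digit = 6
samples = [perturb_digit(digit, p, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)
rectified_mean = empirical_mean + correction(digit, p)
deviation = abs(rectified_mean - digit)  # equation (6) applied to the mean
```

Adding (4) and (5) gives exactly n for any p, which is why the rectified mean concentrates around the original digit as the number of samples grows.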

Intuitively speaking, higher order digits should be allocated more privacy budget than lower ones, since perturbing higher order digits generates much more deviation. Based on this, we use the iteration approach to allocate the privacy budget more reasonably. Before starting the iteration, we allocate the privacy budget evenly to each digit as the initial status. Then, we employ the learning rate (LR) as a parameter to adjust the allocation, transferring privacy budget from a lower order digit to a higher one. The LR value is set empirically: after more than 1000 experiments, we found 0.2 to be the most suitable value. Each iteration works out a deviation result, and we take the minimal one as the start value of the next batch. After many iterations, the deviation curve tends to be smooth and the decrement rate of the deviation becomes very small; we then regard the iteration process as converged. The pseudocode of the iteration procedure is as follows.

Algorithm (a) Iterate the Allocation of Privacy Budget
Input: Initial Eparray
Output: Final Allocation Privacy Budget Array Eparray
Giving Input Number N
deviation_original = Perturb(N, Eparray)
Set Eparray to be empty
Epoch ← 500
LR ← 0.2
LN ← length(N)
deviation_min ← deviation_original
for i < Epoch do
    Eptmp ← Eparray
    for j < LN do
        Eptmp[j] ← Eptmp[j] − LR
        for k < LN do
            Eptmp[k] ← Eptmp[k] + LR
            deviation_tmp ← Perturb(N, Eptmp)
            if deviation_tmp < deviation_min do
                deviation_min ← deviation_tmp
                Eparray ← Eptmp
            end if
        end for
        Eptmp[j] ← Eptmp[j] + LR
    end for
end for
Return: Eparray

In this way, users can work out the best privacy budget allocation for each digit, which provides better data utility than the average allocation method. To evaluate how well the proposed iteration approach reduces the deviation error, we set a benchmark of applying 1-bit perturbation to the lowest integer digit. Obviously, the benchmark achieves the lowest deviation in theory, since it only perturbs 1 bit. The final output of the iteration approach is nevertheless very close to it, which shows how much the proposed approach reduces the deviation.

D. Fusion With Smart Contract

The smart contract plays a significant role in the blockchain network: it stores the uploaded information through predeployed functions. Since it is programmed in the Solidity language, the smart contract does not support the float data type well. Therefore, a preprocessing step is necessary for noninteger input data. Specifically, the preprocessing step splits numerical data into two parts: 1) the integer part and 2) the decimal part. It then converts the decimal part into an integer according to the precision setting. Finally, it stores both the integer and decimal parts in a two-dimensional array under the same first index.

When the blockchain receives a mean estimation query, the smart contract adds up all stored values for the integer and decimal parts, respectively. It then returns these results to the external interface function together with the precision rate of the decimal part. The external interface function works out the final result according to (8) and responds to the query

$$ R_{\mathrm{final}} = \mathrm{Sum}_{\mathrm{integer}} + \mathrm{Sum}_{\mathrm{decimal}} \ast \mathrm{Rate}_{\mathrm{precision}}. \tag{8} $$

E. Differential Privacy of Proposed Algorithm

According to the abovementioned procedures, we can obtain perturbed digits that satisfy the definition of the LDP mechanism. We provide the proof as follows.

Theorem 1: The proposed algorithm satisfies the definition of ε-LDP.

Proof: According to (1), any two input digits t and t′ have some probability of producing the same output t∗. As demonstrated in Table III, each input digit is transferred into a 4-bit binary value, then it is as follows.

TABLE III
4-BIT BINARY VALUE OF t, t′, AND t∗

To obtain the output t∗, the input digits t and t′ need to perturb each binary bit into the same value as the corresponding bit of t∗. Since a binary bit only has two values (0 or 1), we use N_t and N_t′ to denote the numbers of binary bits of t and t′, respectively, that originally equal those of t∗. Then, 4 − N_t and 4 − N_t′ represent the numbers of bits unequal to those of t∗

$$
\begin{aligned}
\frac{P(t \mid t^{*})}{P(t' \mid t^{*})}
&= \frac{P(n_1 \mid n_1^{*}) \cdot P(n_2 \mid n_2^{*}) \cdot P(n_3 \mid n_3^{*}) \cdot P(n_4 \mid n_4^{*})}{P(n_1' \mid n_1^{*}) \cdot P(n_2' \mid n_2^{*}) \cdot P(n_3' \mid n_3^{*}) \cdot P(n_4' \mid n_4^{*})} \\
&= \frac{\prod_{i=1}^{4} P(n_i \mid n_i^{*})}{\prod_{i=1}^{4} P(n_i' \mid n_i^{*})} \\
&= \frac{P^{N_t} \cdot (1 - P)^{4 - N_t}}{P^{N_{t'}} \cdot (1 - P)^{4 - N_{t'}}} \\
&= P^{|N_t - N_{t'}|} \cdot (1 - P)^{|N_{t'} - N_t|} \\
&\le e^{\varepsilon}
\end{aligned}
$$

where P denotes the probability that the perturbation function returns the same value (P ∈ [0, 1]). Since the minimal value of e^ε is equal to 1, P^{|N_t − N_t′|} ∗ (1 − P)^{|N_t′ − N_t|} is always smaller than or equal to e^ε.

V. EXPERIMENT AND EVALUATIONS

A. Experimental Setup and Evaluation Metrics

Data Sets: There are two data sets used in this article: 1) heart rate (HR) [26] and 2) insurance cost (IC) [27]. The


HR data set contains HR time-series data from a person’s daily record. It has 42 963 samples recorded over six days. The IC data set includes insurance details of different persons; we focus on the charges of the individual medical cost. It contains 1338 samples covering different groups of people.

Comparative Approaches: For comparison purposes, we have conducted the experiments on several existing approaches: the Laplace mechanism [14], Duchi’s solution [15], [16], and Harmony-mean [17]. Harmony-mean is an improvement based on Duchi’s solution and has lower time complexity and computational overhead on high-dimensional data sets. On low-dimensional data sets, there is little difference between them.

Evaluation Metrics: To estimate the performance of the proposed LDP mechanism, we use utility loss (9) as the evaluation metric. It indicates the degree of deviation of the perturbed result. The smaller the utility loss, the better the performance of the LDP mechanism. In the experiment, we consider three scenarios.
1) The utility loss of employing different privacy budgets among different algorithms.
2) The utility loss of employing the same privacy budget among different algorithms with the increasing data volume.
3) The utility loss of employing randomized privacy budgets among different algorithms with the increasing data volume.

For each round, we repeat the experiment 1000 times and query the mean estimation result for different algorithms

$$ \mathrm{Utility}_{\mathrm{loss}} = \frac{|\mathrm{Perturbed\_Data}_{\mathrm{mean}} - \mathrm{Raw\_Data}_{\mathrm{mean}}|}{\mathrm{Raw\_Data}_{\mathrm{mean}}} \times 100\%. \tag{9} $$

Blockchain Network: We use Fisco-Bcos [28] as the underlying network, which is a completely open-source blockchain platform independently developed by the Financial Blockchain Shenzhen Consortium Group.

Experimental Environment: The experiments are implemented using the following environmental settings: Hardware: CPU: Intel I9-10900, and Memory: 32 GB. Software: Python 3.6, Numpy 1.15, Permissioned blockchain: Fisco-Bcos, and Virtual machine for smart contract: EVM.

B. Experimental Result and Analysis

The first round of experiments compares our novel LDP mechanism, the Laplace mechanism, Harmony-mean, and Duchi’s solution on statistical mean estimation. This round of experiments is suitable for both one-dimensional and high-dimensional numeric data. For simplicity, we only implement the comparison on one-dimensional numeric data.

We employ utility loss as the main evaluation metric according to (9). We conduct experiments on both the HR and IC data sets to compare the performance of all LDP mechanisms with privacy budgets from 0 to 1. According to Fig. 2, the performance of the existing LDP mechanisms gradually improves as the privacy budget increases. Our proposed mechanism, however, performs well in a variety of situations and always has less utility loss than the other methods.

Fig. 2. Utility loss for LDP mechanisms in different privacy budgets.

Besides, we also consider the scenarios under different data volumes. According to Fig. 3, traditional LDP mechanisms require a large quantity of data to balance the positive and negative noise, so their utility loss decreases as the data volume increases. In contrast, our proposed mechanism performs consistently well, with no requirement of a large data volume.

Fig. 3. Utility loss for LDP mechanisms in different data volume.

The second round of experiments compares the utility loss of different mechanisms employing randomized privacy budgets. In this round of experiments, the perturbation functions employ a randomized privacy budget value each time, in the range of 0–10. This aims to simulate the decentralized environment of the blockchain, in which different users tend to choose their own privacy budget values.

According to Fig. 4, employing a randomized privacy budget generates uncontrollable noise in traditional LDP mechanisms and leads to a great loss of data utility. However, since the output of our novel mechanism is fixed in the range of [−C, 15 + C], the total noise is limited within an acceptable range.

Fig. 4. Utility loss for LDP mechanisms by employing randomized privacy budget.

The third round of experiments examines the iteration process on numerical numbers. In this round of experiments, we apply the iteration approach to adaptively allocate the privacy budget for the number 256 as an example, so as to minimize the deviation of the perturbation function. The iteration algorithm has been demonstrated in Section IV-C.
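The budget-transfer idea behind the iteration approach can be sketched as a small greedy search. The example below is a simplified variant built on assumptions, not the procedure evaluated in the article: instead of sampling the deviation with Perturb(N, Eparray), it scores an allocation by the analytic standard deviation of the rectified output (each 4-bit digit contributes 85·p·(1 − p) times the square of its place value), it applies only the best single transfer per epoch, and it stops once the decrement rate (7) drops below 0.1%. The names `expected_deviation` and `allocate` are hypothetical.

```python
import math


def keep_probability(digit_budget: float) -> float:
    """Per-bit keep probability when a digit's budget is split over 4 bits."""
    e = math.exp(digit_budget / 4.0)
    return e / (e + 1.0)


def expected_deviation(budgets) -> float:
    """Analytic stand-in for the sampled deviation (an assumption of this sketch).

    budgets[i] is the budget of the digit with place value 10**i. The four
    independent bits of a digit have total variance (1 + 4 + 16 + 64) * p * (1 - p).
    """
    variance = 0.0
    for i, eps in enumerate(budgets):
        p = keep_probability(eps)
        variance += (10 ** i) ** 2 * 85.0 * p * (1.0 - p)
    return math.sqrt(variance)


def allocate(budgets, lr=0.2, epochs=500, tol=1e-3):
    """Greedily move `lr` of budget between digits while the deviation shrinks."""
    best = list(budgets)
    best_dev = expected_deviation(best)
    for _ in range(epochs):
        candidate, candidate_dev = None, best_dev
        for j in range(len(best)):        # donor digit
            if best[j] < lr:
                continue                  # keep every budget non-negative
            for k in range(len(best)):    # receiver digit
                if k == j:
                    continue
                trial = list(best)
                trial[j] -= lr
                trial[k] += lr
                dev = expected_deviation(trial)
                if dev < candidate_dev:
                    candidate, candidate_dev = trial, dev
        if candidate is None:
            break                         # no transfer reduces the deviation
        rate = (best_dev - candidate_dev) / best_dev  # decrement rate (7)
        best, best_dev = candidate, candidate_dev
        if rate < tol:
            break
    return best, best_dev


# Three digits (as for the number 256), a budget of 1 per binary bit, LR = 0.2.
initial = [4.0, 4.0, 4.0]
final, final_dev = allocate(initial)
```

Under this surrogate the search shifts budget toward the highest-order digit while keeping the total budget unchanged, which matches the intuition stated in Section IV-C; a sampled deviation, as used in the article, would make the trajectory noisy rather than monotone.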

TABLE IV
PRIVACY BUDGET VALUES DURING ITERATION

Fig. 5. Iteration process of number 256.

First, as an example, we employ an LR equal to 0.2 and a privacy budget of 1 for each binary bit. Table IV demonstrates the iteration process of allocating privacy budgets to different digits. We transfer the privacy budget from lower order digits to higher ones and calculate the deviation of each batch. If the deviation is smaller than the previous one, we employ it as the start value of the next iteration. We use the decrement rate (7) as the stop condition for the iteration procedure. When the decrement rate is less than 0.1%, the process is regarded as converged, since the curve has become stable. After more than 40 iterations, the deviation curve converges, as shown in Fig. 5. To evaluate the performance of the iteration algorithm, we set a 1-bit perturbation of the lowest integer digit as the benchmark of minimum deviation, which is applied to digit 6. We also consider the scenario of employing different privacy budgets for each binary bit: as Fig. 5 shows, a larger privacy budget leads to a smaller deviation at the end of the iteration process.

According to Fig. 5, the iteration approach can significantly decrease the deviation of the perturbation function by adaptively allocating the privacy budget. Although the iteration approach cannot obtain a smaller deviation than the benchmark, it comes very close (when ε > 1.75 for each binary bit). Hence, our algorithm can provide both a privacy guarantee and a smaller deviation through the iteration approach.

VI. CONCLUSION AND FUTURE WORK

Blockchain plays a significant role in IoT scenarios in building a secure and robust system, owing to its decentralized, co-maintained, and consistent characteristics. Given the fragmented and lightweight structure of IoT devices, the blockchain can build a stable system at a low cost, which is difficult for a traditional centralized system. However, privacy concerns continue to rise: since all participant nodes can access the chain data, users’ sensitive information is vulnerable in the blockchain network.

In order to address the privacy issues, this article proposes a novel LDP mechanism to protect sensitive data with no requirement of a fixed input range, a large data volume, or the same privacy budget. In addition, we use the iteration approach to adaptively allocate the privacy budget to different procedures and minimize the total deviation of the perturbation function. The experiments have compared our proposed mechanism with traditional ones in different cases and demonstrated the details of the iteration approach.

In conclusion, our novel mechanism performs better (i.e., with higher utility) under different circumstances. In future work, we will consider extending our proposed mechanism to the exponential mechanism, so that it can deal with categorical data in acceptable time complexity.

APPENDIX

Ying Zhao is currently pursuing the Ph.D. degree with the Department of Computing Technologies, Swinburne University of Technology, Melbourne, VIC, Australia.
Her current research focuses on data privacy.

Kai Zhang is currently pursuing the Ph.D. degree with the Department of Computing Technologies, Swinburne University of Technology, Melbourne, VIC, Australia.
He is currently focused on privacy-preserving technology, machine learning, and blockchain.

Jiao Tian is currently pursuing the Ph.D. degree with the Department of Computing Technologies, Swinburne University of Technology, Melbourne, VIC, Australia.
Her current research interests include federated learning and deep learning.

Hongwang Xiao is currently pursuing the Ph.D. degree with the Swinburne University of Technology, Melbourne, VIC, Australia.
His research interests include computer vision, capsule networks, and related topics.

Wenyu Zhao is currently pursuing the Ph.D. degree with the Department of Computing Technologies, Swinburne University of Technology, Melbourne, VIC, Australia.
Her research interests include information retrieval, natural language processing, and personalized search.

Jinjun Chen (Fellow, IEEE) received the Ph.D. degree in information technology from Swinburne University of Technology, Melbourne, VIC, Australia.
He is a Professor with the Swinburne University of Technology. His research has been published widely in various venues. His research is in data security and privacy, cloud computing, and scalable data processing.
