
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JSAC.2019.2904349, IEEE Journal on Selected Areas in Communications

P2DCA: A Privacy-Preserving based Data Collection and Analysis Framework for IoMT Applications
Muhammad Usman, Member, IEEE, Mian Ahmad Jan*, Member, IEEE, Xiangjian He*, Senior Member, IEEE, and Jinjun Chen*, Senior Member, IEEE

A * indicates the corresponding author.
Muhammad Usman and Jinjun Chen are with the Department of Computer Science and Software Engineering, Swinburne University of Technology, Australia. (E-mail: muhammad.usmanskk@gmail.com, jinjun.chen@gmail.com)
Xiangjian He is with the Global Big Data Technologies Center (GBDTC), School of Electrical and Data Engineering, University of Technology Sydney, Australia. (E-mail: xiangjian.he@uts.edu.au)
Mian Ahmad Jan is with the Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan. (E-mail: mianjan@awkum.edu.pk)
This research is partially supported by Australian Research Council projects DP170100136 and LP140100816.

Abstract—The concept of Internet of Multimedia Things (IoMT) is becoming popular nowadays and can be used in various smart city applications, e.g., traffic management, healthcare, and surveillance. In the IoMT, devices, e.g., Multimedia Sensor Nodes (MSNs), are capable of generating both multimedia and non-multimedia data. The generated data are forwarded to a cloud server via a Base Station (BS). However, it is possible that the Internet connection between the BS and cloud server may be temporarily down. The limited computational resources restrict the MSNs from holding the captured data for a long time. In this situation, mobile sinks can be utilized to collect data from MSNs and upload them to the cloud server. However, this data collection may create privacy issues, e.g., revealing identities and location information of MSNs. Therefore, there is a need to preserve the privacy of MSNs during mobile data collection. In this paper, we propose an efficient Privacy-Preserving based Data Collection and Analysis (P2DCA) framework for IoMT applications. The proposed framework partitions an underlying wireless multimedia sensor network into multiple clusters. Each cluster is represented by a Cluster Head (CH). The CHs are responsible for protecting the privacy of member MSNs through data and location coordinates aggregation. Later, aggregated multimedia data are analyzed on the cloud server using a counter-propagation artificial neural network to extract meaningful information through segmentation. Experimental results show that the proposed framework outperforms the existing privacy-preserving schemes, and can be used to collect multimedia data in various IoMT applications.

Index Terms—IoMT, MSNs, privacy, aggregation, clusters, counter-propagation artificial neural network.

I. INTRODUCTION

Recent developments in the electronic industry have enabled sensing devices to capture high-resolution multimedia data, and transformed the concept of Internet of Things (IoT) into Internet of Multimedia Things (IoMT). The most common example of these devices is Multimedia Sensor Nodes (MSNs). The MSNs form a network known as a Wireless Multimedia Sensor Network (WMSN). In the WMSN, captured multimedia data are forwarded to a nearby Base Station (BS) to perform computationally complex tasks and offload processed data to a cloud server. The BS is connected to the cloud server through a high-speed Internet connection.

However, it is possible that the BS is unable to upload processed multimedia data to the cloud server due to technical problems in the underlying telecommunication network. In this particular situation, mobile sinks can be utilized to collect data from nominated MSNs, known as Cluster Heads (CHs), and forward them to the cloud server. However, the involvement of mobile sinks for data collection in the IoMT may put the privacy of original data sources, e.g., MSNs, at risk.

In recent years, the General Data Protection Regulation and the California Consumer Privacy Act have addressed the privacy issues for the export of personal data [1], [2]. The term privacy is a subjective concept and can have different illustrations. In this paper, we consider the scenario of IoMT where the end-devices are continuously capturing and uploading the multimedia data. These devices can be a part of different sensitive applications, e.g., surveillance, healthcare, and transportation management, and need to be protected from various security and privacy threats. Preserving the privacy of these devices is highly important and if it gets compromised, an intruder can use and manipulate these devices for malicious purposes. The IoMT applications, e.g., surveillance, healthcare, and transportation management, generate sensitive multimedia data that need to be uploaded on an urgent basis for quick actions. To avoid the end-to-end delay, mobile sinks can be utilized to upload the collected data to cloud servers. In the mobile data collection scenario, establishing the credibility of mobile sinks becomes a challenging task if they are anonymous. Even if the mobile sinks are registered devices, there is still a chance that the privacy of IoMT devices may be compromised through malicious applications running on mobile sinks. Identities (IDs) and location information of member MSNs can easily be determined by analyzing the shared data. If the MSNs are compromised, an attacker can not only manipulate the forwarded data to generate misleading results, but also can gain access to and control the underlying network. Therefore, protecting the privacy of MSNs becomes an important and pivotal security concern in the IoMT applications.

The concept of differential privacy was introduced when the data and their sources were available to a global audience through cloud computing platforms. However, in recent years, the computing resources have been distributed on a local level through edge, fog, and cluster computing to address

0733-8716 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

issues like latency and end-to-end delay. Furthermore, due to technological growth in the electronic industry, end-user devices are now powerful enough and can participate in different computing tasks, e.g., mobile computing in IoT and IoMT. This participation puts the privacy of end-devices at risk, and as a result, a new term called Local Differential Privacy (LDP) has been introduced. The purpose of LDP is to protect the privacy of data and their sources from an end-user's perspective. In recent years, many privacy-preserving techniques based on popular machine learning algorithms, e.g., k-nearest neighbor, support vector machine, and artificial neural network, were proposed to protect location information of participating devices [3]. In these techniques, naive approaches are used to hide sensitive information of participating devices and release non-sensitive data. However, the act of hiding the sensitive information can itself allow the hidden information to be inferred. An adversary can easily determine a temporal correlation between actions and situations and make the aforementioned privacy-preserving approaches fail. Furthermore, applying complex machine learning algorithms on resource-constrained devices is not an optimal solution. This problem can easily be solved by using simple privacy-preserving techniques on participating devices during data collection, and utilizing complex machine learning algorithms on cloud platforms for further analysis.

In this paper, we propose a Privacy-Preserving based Data Collection and Analysis (P2DCA) framework for IoMT applications. The proposed framework offers LDP services to protect the privacy of end-devices in a way that if malicious users manage to hack the transmitted data, they should not be able to locate the actual sources of data. This framework can be useful for various multimedia data generating applications of IoMT, e.g., surveillance, healthcare, and transportation management. The proposed framework operates in two phases. In the first phase, a WMSN is partitioned into multiple clusters and CHs are selected to collect data from member MSNs. Next, the mobile sinks are registered with the BS before collecting data from CHs. Data and location information are aggregated at CHs to preserve the privacy of member MSNs. This scenario can be considered as a special case of local differential privacy. In the second phase, the mobile sinks forward the collected data to the cloud server. Once received, a Counter-Propagation Artificial Neural Network (CP-ANN) is applied to segment foreground and background regions. To the best of our knowledge, there is no existing framework for the IoMT paradigm to preserve the privacy of member MSNs using data and location aggregation and analyze the aggregated data on the cloud servers, using a machine learning technique. Major contributions of the proposed framework are as follows.

• A lightweight handshaking mechanism is applied to partition the WMSN into multiple clusters. These clusters are represented by CHs. In each round of simulation, new CHs are selected, based on their current energy levels. The CHs are responsible for various activities, such as nodes' authentication, data collection and aggregation, and sharing the aggregated data with the mobile sinks. A similar handshaking mechanism is used to register the mobile sinks with the BS.
• A lightweight aggregation technique is applied by CHs to the received multimedia data and location information. The aggregation helps in protecting the sensitive information of member MSNs, such as angular position, location information, and IDs. If the sensitive information leaks, an adversary can easily identify the exact location of a data source by penetrating the network and cause serious malfunctions.
• A CP-ANN is used at the cloud server to process the aggregated data, and segment the foreground and background regions for extracting meaningful information, e.g., tracking the moving objects and identifying the malicious activities. The aggregated data are compressed on mobile sinks using a video coding technique before forwarding to the cloud server. Experimental results show that our proposed framework is still able to extract the required information from compressed data.

The rest of this paper is structured as follows. Section II provides the literature review on efforts made in recent years. The proposed framework is explained in Section III. Experimental setup and simulation results are discussed in Section IV. Finally, the paper is concluded in Section V.

II. LITERATURE REVIEW

We divide this section into two subsections. Both subsections discuss various privacy-preserving techniques. In the first subsection, we provide an overview of privacy-preserving techniques based on random methodologies. In the second subsection, the focus is on privacy-preserving techniques based on machine learning algorithms.

A. Privacy-Preserving Using Random Methodologies

A survey on security and privacy-related issues in vehicular ad-hoc networks was presented in [4]. In this survey, various methods addressing security and privacy challenges in vehicular devices are reviewed and explained to detect and revoke malicious nodes. A survey on existing authentication and privacy-preserving schemes to secure 4G and 5G cellular networks was presented in [5]. In this survey, various schemes are discussed and analyzed from the perspective of four different types of attacks, i.e., privacy, integrity, availability, and authentication. Surveys presented in [4], [5] highlight various security and privacy challenges for mobile devices that can be used to collect data from different devices. However, a discussion from IoT and IoMT perspectives is missing in these surveys.

A fusion of IoT, big data, and cloud storage was presented to preserve the privacy of sensory data collected from e-health systems in [6], [7]. In this work, IoT-related group keys are used to authenticate medical nodes and encrypt messages in a batch processing style to minimize the computational time. An advanced framework for opportunistic routing in delay-tolerant networks was proposed in [8]. In this framework, the main focus is to protect the confidentiality of nodes and perform anonymous authentication using a pairwise communication. A
privacy-preserving sensory data collection scheme was proposed in [9]. In this scheme, the location privacy is preserved through a tree-based diversionary routing and a proxy re-encryption technique is used to preserve the privacy at the network edge. Approaches presented in [6]–[9] are based on cryptography-based solutions and may not be feasible for real-time IoMT applications.

B. Privacy-Preserving Using Machine Learning Methodologies

In recent years, many machine learning techniques have been used in the networking domain to offer various services, such as providing security and privacy [10]–[12]. Distributed machine learning algorithms, based on an alternating direction method of multipliers, were proposed to preserve the privacy of a network in [13]. In these algorithms, dual and primal variable perturbations are used to provide dynamic differential privacy. A privacy-preserving machine learning based collaborative intrusion detection system for vehicular ad-hoc networks was proposed in [14]. In this system, an alternating direction method of multipliers and a dual-variable perturbation are used to train a classifier to detect intruders and provide dynamic differential privacy, respectively. A privacy-preserving technique based on sparse representation for cloud-enabled mobile applications was proposed in [15]. In this technique, the privacy of data contributors and application users is protected in the presence of an untrusted cloud server. Schemes presented in [13]–[15] are able to stand against various types of privacy attacks; however, these schemes are not feasible for real-time IoMT applications.

III. PRIVACY-PRESERVING-BASED DATA COLLECTION AND ANALYSIS

In this section, we explain our proposed P2DCA framework for IoMT applications. A block diagram of our proposed framework is shown in Fig. 1. The proposed framework not only protects the location privacy of member MSNs using data and location information aggregation but, at the same time, is capable of analyzing multimedia data using the CP-ANN to extract meaningful information at the cloud server. The aggregation is required to ensure that no one can locate the data sources in the WMSN. The P2DCA framework is divided into two phases, i.e., location privacy protection and data analysis. In the following subsections, these phases are explained in detail.

Fig. 1: Proposed P2DCA Framework (components shown: BS, MSN, CH, Wireless Multimedia Sensor Network, Mobile Sinks, Cloud Multimedia Servers)

A. Location Privacy Protection

In the P2DCA framework, the WMSN is partitioned into multiple clusters. Each cluster consists of at least $l$ member MSNs, out of which only one is selected as a CH. The selected CHs collect data from member MSNs and forward them to the BS. The BS is responsible for performing computationally complex tasks and forwarding the processed data to the cloud server. In the P2DCA framework, we use mobile sinks to collect data from CHs and forward them to the cloud server. Notations used in this phase are illustrated in Table I. The operations performed in this phase are summarized in the following subsections.

TABLE I: Notations of Location Privacy Protection

Notation                  Description
$x, y$                    Coordinates of a node
$e$                       Energy of an MSN
$E$                       Average energy threshold
$d$                       Shortest distance
$\mu$                     Session key
$R$                       Request for registration
$\gamma_1, \gamma_2$      Mapping functions
$\vartheta$               Challenge
$\Delta$                  Time-stamp
$\delta$                  Authentication value
$H_r, H_s$                Histograms
$S$                       Similarity value
$\theta, \Theta$          Information loss

1) Information Sharing: In each round of simulation, the MSNs forward their information, e.g., ID (i.e., $i$ where $i \in \{1, 2, \cdots, I\}$), location coordinates (i.e., $(x_i, y_i)$), and current energy level (i.e., $e_i$) to the BS. Upon reception, the BS extracts all embedded information, stores it in its database, and computes the average energy threshold (i.e., $E$). The average energy threshold is computed using the following equation.

$$E = \sum_{i=1}^{I} \frac{e_i}{I}. \tag{1}$$

After computing the value of $E$, it is compared with the energies of all MSNs. The MSNs having energy levels equal to or greater than the average energy threshold are eligible to be selected as CHs. In a given simulation round, multiple MSNs may report nearly the same energy level. In this situation, the selection is based on very minor differences in the shared energy levels, compared up to four decimal places. If some eligible MSNs still have the same energy level, priority is given to those MSNs that were not selected as CHs in the past four rounds of simulation.

2) Cluster Formation: After finalizing the CHs, the BS broadcasts a message in the WMSN field. This message contains the IDs (i.e., $j$, where $j \in \{1, 2, \cdots, J\}$ and $J \subset I$) and location coordinates (i.e., $(x_j, y_j)$) of selected CHs. The BS also shares the IDs and location coordinates of the MSNs with the selected CHs. Upon reception, the following operations are performed by CHs: 1) retrieving and storing IDs and
location coordinates of MSNs, 2) sending an acknowledgment message back to the BS to confirm the successful arrival of the information, and 3) advertising themselves to their neighboring MSNs by sharing their current energy levels. It is possible that an MSN may receive invitations from multiple CHs. In this situation, the MSN associates itself with the CH having the shortest distance and maximum energy level, after verifying its ID from the list forwarded by the BS. The shortest distance (i.e., $d$) is estimated using the following equation.

$$d = \sqrt{(x - x_i)^2 + (y - y_i)^2}. \tag{2}$$

It is possible that there are multiple CHs fulfilling the shortest distance criterion. In this situation, the join-request is sent to all the shortlisted CHs. Upon approval, the MSN associates itself with the CH that replies first. This joining process is based on a mutual authentication scheme which is explained in our previous works published in [16]–[18].

3) Mobile Sink Registration: In the cluster-based communication, the CHs are responsible for collecting data from member MSNs and forwarding them to the BS for further processing. In the P2DCA framework, we assume that the Internet connection between the BS and cloud server is temporarily down, and the BS is unable to upload the processed data to the cloud server. In this situation, the mobile sinks are used to collect data from CHs and upload them to the cloud server. However, it is important to register the mobile sinks with the BS. As shown in Fig. 2, the registration with the BS is a four-step process. In the first step, a mobile sink (i.e., $m$ where $m \in \{1, 2, \cdots, M\}$) generates a session key (i.e., $\mu_m$) and a random integer (i.e., $a$ where $a \in \mathbb{Z}$), and uses them to create an encrypted registration request (i.e., $R_m$) as shown in the following equation.

$$R_m = AES\{m, (\gamma_1(a \oplus \mu_m))\}. \tag{3}$$

Here, the messages are encrypted using AES-128 and $\gamma_1$ is a mapping function used to perform one-way secure hashing with a value within the range [0, 1].

Fig. 2: Mobile Sink Registration (messages exchanged between the Mobile Sink and the Base Station with its Cluster Heads Database: Registration Request, Encrypted Challenge, Encrypted Response)

Upon receiving the $R_m$, the BS decrypts and retrieves the embedded values, and generates an encrypted challenge (i.e., $\vartheta$) using the following equation and forwards it to the requesting mobile sink.

$$\Delta_m = \gamma_2(m \oplus \gamma_1(b \oplus a)), \tag{4a}$$
$$\delta_m = \gamma_2(m \oplus \gamma_1(c \oplus \mu_m)), \tag{4b}$$
$$\vartheta = AES\{\Delta_m, \delta_m\}, \tag{4c}$$

where $\gamma_2$ is a hashing function used to perform a map-to-point hashing operation with a value within the range [0, 1], $\Delta_m$ is a time-stamp, $\delta_m$ is an authentication value, and $b$ and $c$ are random integers, where $\{b, c\} \in \mathbb{Z}$. The time-stamp represents the total amount of time that a mobile sink can spend in the underlying WMSN and the authentication value is used to create a response.

Upon receiving the $\vartheta$, the mobile sink decrypts and retrieves the embedded values to create a response (i.e., $R_m$) using the following equation and forwards it to the BS.

$$R_m = AES\{\Delta_m, (c \,\|\, \gamma_1(b \oplus \mu_m))\}. \tag{5}$$

Upon receiving the $R_m$, the BS verifies the identity by retrieving the forwarded time-stamps and embedded random integers. After verification, the following operations are performed by the BS: 1) sharing of time-stamps with all CHs and 2) forwarding the information of selected CHs to the registered mobile sinks.

4) Coordinates Aggregation: In real-world scenarios, the MSNs within a cluster usually focus on the same scene but at different angles. As a result, the same information is captured and transmitted, which consumes computational resources and network bandwidth. Applying aggregation on captured data may produce different results as compared to the original data; however, it cannot be considered as a significant loss. The aggregated data make it harder to estimate the exact angle at which the data are captured. As a result, the actual source of data cannot be located and the privacy of individual sources is preserved. Furthermore, the data aggregation helps in efficiently utilizing the available storage space at CHs and the bandwidth during data transmission.

In the P2DCA framework, the CHs apply similarity-based data aggregation to minimize data redundancy and preserve the privacy of member MSNs. A video is a combination of multiple frames. The videos are usually transmitted as a group of frames. Before transmission, complex video coding algorithms are applied to reduce the size of a video by discarding the redundant information in video frames. In the P2DCA framework, videos captured by CHs are used as standard data. On the other hand, videos from member MSNs within the same cluster are considered as redundant data and are compared with the standard data as shown in Fig. 3. To find the similarity, the comparison is performed on a frame-by-frame basis. Each video frame is partitioned into multiple blocks of equal size, i.e., 64 × 64. The blocks in a video frame from the redundant data are compared against the blocks of a corresponding video frame from the standard data. This comparison is performed using a histogram normalization technique. If the normalized histograms of blocks from the redundant and standard video frames are represented by $H_r$ and $H_s$, respectively, then the similarity value (i.e., $S$) can be computed using the following equation.

$$S = H_s \times \log_2\left(\frac{H_s}{H_r}\right). \tag{6}$$
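The block comparison of Eq. 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the 16-bin histogram, the small epsilon that keeps empty bins away from zero, and the summation of the per-bin terms into a single scalar are all choices made here for the example.

```python
import math
from typing import List, Sequence

BLOCK = 64  # block edge length in pixels, as stated in the text


def normalized_histogram(block: Sequence[Sequence[int]], bins: int = 16) -> List[float]:
    """Histogram of 8-bit pixel values, normalized so the bins sum to 1.

    The bin count and the epsilon added to each bin are assumptions of this
    sketch; the epsilon keeps log2(Hs / Hr) defined when a bin is empty.
    """
    eps = 1e-9
    counts = [0] * bins
    total = 0
    for row in block:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    raw = [c / total + eps for c in counts]
    norm = sum(raw)
    return [r / norm for r in raw]


def similarity(h_s: Sequence[float], h_r: Sequence[float]) -> float:
    """Eq. 6 evaluated bin by bin and summed (the summation is assumed here)."""
    return sum(s * math.log2(s / r) for s, r in zip(h_s, h_r))


# Hypothetical standard (CH) block and redundant (member MSN) block.
standard = [[10] * BLOCK for _ in range(BLOCK)]
redundant = [[200] * BLOCK for _ in range(BLOCK)]
s_value = similarity(normalized_histogram(standard), normalized_histogram(redundant))
```

With this reading, two blocks with identical histograms give a value of zero, and the value grows as the histograms diverge.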


Fig. 3: Similarity-based Data Aggregation (panels: Redundant Video Frames, Standard Video Frames)

The computed similarity value $S$ is compared against a predefined threshold chosen from the range $(S_{min}, S_{max})$. If $S$ is below the threshold, then the block is considered dissimilar; otherwise, it is considered similar. If at least 25% of the blocks are found dissimilar, then it is assumed that the video frame from redundant data contains important information and needs to be transmitted.

The location coordinates are also aggregated to preserve the privacy of member MSNs. Such an aggregation of data and location coordinates helps in preserving the privacy of member MSNs, based on the k-anonymity rule. Due to this aggregation, the destination, i.e., a mobile sink in this case, assumes that the data are coming from a single entity rather than $k$ nodes. The CHs aggregate the location coordinates by computing a mean value of location coordinates of member MSNs. Later, the aggregated multimedia data along with the aggregated location coordinates are shared with the mobile sinks.

Although the aggregation of data and location coordinates helps in preserving the privacy of member MSNs, an information loss may also occur. To ensure a minimum information loss during data aggregation, we assume that all member MSNs in a cluster are located within close proximity. This assumption can be verified for each member MSN, using the following equation.

$$\theta = \sum_{l=1}^{L} (x_l - \hat{x}_l)^2 + (y_l - \hat{y}_l)^2, \tag{7a}$$

$$\theta = \sum_{l=1}^{L} \left[ \left( x_l - \frac{\sum_{g=1}^{L} x_g}{L} \right)^2 + \left( y_l - \frac{\sum_{g=1}^{L} y_g}{L} \right)^2 \right] = \sum_{l=1}^{L} x_l^2 - \frac{\left(\sum_{g=1}^{L} x_g\right)^2}{L} + \sum_{l=1}^{L} y_l^2 - \frac{\left(\sum_{g=1}^{L} y_g\right)^2}{L}, \tag{7b}$$

$$\Theta = \sum_{f=1}^{F} \theta_f, \tag{7c}$$

where $L$ represents the total number of MSNs in a cluster and $L \subset I$, $(x_l, y_l)$ are coordinates of an MSN, $(\hat{x}_l, \hat{y}_l)$ is the mean value of location coordinates of an entire cluster, $\theta$ is the information loss per cluster, and $\Theta$ is the total information loss of an entire network.

In Eq. 7a-7c, small values of $\theta$ and $\Theta$ mean a low information loss when the aggregation is applied to data.

In traditional privacy-preserving frameworks based on local differential privacy, some noise is usually added to data to protect the privacy of data sources. However, the addition of noise may increase the size of data, especially in the case of multimedia data. Therefore, it cannot be considered an ideal solution to preserve the privacy in scenarios containing multimedia data, low-energy nodes, and limited bandwidth. In the P2DCA framework, we apply aggregation to original data to preserve the privacy of member MSNs in the underlying WMSN. This aggregation can be considered as noise, as it modifies the original data. Due to the aggregation of data and location coordinates, the compromised mobile sinks cannot determine the actual data and location coordinates of member MSNs.

B. Data Analysis

After receiving the multimedia data from CHs, the mobile sinks upload them to a cloud server. However, prior to the upload, the multimedia data need to be encoded to meet the available bandwidth requirement between the mobile sinks and cloud server. For encoding, we apply scalable video encoding to generate video bitstreams with variable bit-rates. The aggregated videos encoded in the scalable style make it a challenging task for the data analysis applications on the cloud servers to perform the segmentation and detection of moving objects. To detect moving objects with a maximum accuracy, we use a CP-ANN on the cloud server. The CP-ANN is a three-layer architecture where the first layer is an input layer, the second layer is a Kohonen layer, and the third layer is a Grossberg layer. These layers are distributed into two modules, i.e., Background Generation Module (BGM) for training, and Object Extraction Module (OEM) for extracting objects. In the CP-ANN, neurons from one layer are connected to the neurons of the next layer in a sequence. Due to an unsupervised learning style, each neuron in the Kohonen layer characterizes the input patterns based on a winner-takes-all rule. A winning neuron is selected based on the longest distance between the input and Kohonen layers. For each category, the Grossberg layer produces an output. Based on this working style, the BGM can predict properties of the incoming video bitstreams by exploiting various properties of pixels in each video frame. Notations used in this phase are illustrated in Table II. In the following subsections, we explain the modules of the CP-ANN in detail.

1) Background Generation Module: A video frame is represented by one luminance (i.e., $Y$) and two chroma (i.e., $C_b$ and $C_r$) components. If a pixel from a video frame is represented by $p(x, y)$ and the luminance and chroma components are the input patterns of the input layer, then the distance (i.e., $\Psi$) of a neuron (i.e., $n$) between the input and Kohonen layers can be computed using the following equation.

$$\Psi_n = \frac{\sqrt{(p_c - v)^2}}{\sqrt{\sum (p_c)^2}}, \quad 1 \le n \le N, \tag{8}$$

where $p_c$ represents $p(x, y)$'s component value in the $YC_bC_r$ color space and $v$ represents the current weight of the corresponding neuron in the Kohonen layer.
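As a rough illustration of the distance in Eq. 8, the sketch below computes a normalized distance between one pixel and each Kohonen neuron. It rests on assumptions that are not confirmed by the text: each neuron is taken to hold one weight per color component, both the numerator and the denominator are summed over the three YCbCr components, and the function names and sample values are hypothetical.

```python
import math
from typing import List, Sequence


def kohonen_distance(pixel: Sequence[float], weight: Sequence[float]) -> float:
    """Assumed reading of Eq. 8.

    Euclidean distance between the pixel's (Y, Cb, Cr) values and a neuron's
    weights, divided by the Euclidean norm of the pixel vector.
    """
    numerator = math.sqrt(sum((p - v) ** 2 for p, v in zip(pixel, weight)))
    denominator = math.sqrt(sum(p ** 2 for p in pixel))
    if denominator == 0:
        raise ValueError("pixel vector must be non-zero")
    return numerator / denominator


def all_distances(pixel: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Distance of the pixel to every neuron n = 1..N in the Kohonen layer."""
    return [kohonen_distance(pixel, w) for w in weights]


# Hypothetical pixel and two hypothetical neurons.
pixel = (120.0, 128.0, 128.0)
neurons = [(120.0, 128.0, 128.0), (0.0, 0.0, 0.0)]
psi = all_distances(pixel, neurons)
```

Under this reading the distance is zero for a neuron whose weights equal the pixel values and one for a neuron with all-zero weights, so the value is independent of the overall brightness scale of the pixel.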


Notation    Description
p           Pixel
Y           Luminance
Cb, Cr      Chroma components
Ψ           Distance between neurons
v           Current weight of neurons
T           Empirical tolerance
h           Learning rate
w           Weight between neurons
b           Background pixel
B           Block of pixels

TABLE II: Notations of Data Analysis

A winner neuron is selected based on a maximum distance described by the following equation.

\[ \Psi_{\max} = \max_{n=1,2,\cdots,N} \Psi_n. \tag{9} \]

To declare a neuron as the winner neuron, an empirical tolerance (i.e., T) is used for comparison as shown in the following equation.

\[ n = \begin{cases} \text{winner}, & \text{if } \Psi_{\max} < T \\ \text{loser}, & \text{otherwise} \end{cases}. \tag{10} \]

Once the winner neuron is selected, weights around its position in the Kohonen layer are adjusted using the following equation.

\[ \hat{v} = v + h(p_c - v), \tag{11} \]

where v̂ represents the newly adjusted weight and h is the learning rate set to 0.01 in the P2DCA framework.

The neurons between the Kohonen and Grossberg layers are connected on a one-to-one basis, therefore, the weight (i.e., w) between these two layers is set to 1 to maintain the characteristics of each neuron. This entire unsupervised strategy helps to determine characteristics of each video bitstream, where the characteristics are distributed into the neurons of the Kohonen layer. The distribution in this module helps in identifying moving objects and background regions in the incoming video bitstreams.

2) Object Extraction Module: After determining the relationship between neurons in the input and Kohonen layers and deciding the winner neuron in the Kohonen layer, the next step is to compute the output for the Grossberg layer. If the winner neuron exists in the Kohonen layer, then the output of the Grossberg layer is set to 1. If no winner neuron exists, then the similarity between a pixel and a neuron is determined through a Gaussian function. If the similarity exists, then there is a high probability that a pixel belongs to the background region in a video frame, and it is treated as a background pixel (i.e., b(x, y)). This Gaussian function based similarity estimation is used to estimate the output of the Grossberg layer and is computed using the following equation.

\[ b(x, y) = \begin{cases} 1, & \text{if } \Psi_{\min} < T \\ w \cdot e^{-\frac{\left(\sum_{n=1}^{N} p_c - v\right)}{T^2}}, & \text{otherwise} \end{cases}. \tag{12} \]

Unlike Eq. 10, empirical tolerance in Eq. 12 is based on the minimum value of Ψ. A small value of T in Eq. 10 generates new neurons in the Kohonen layer. On the other hand in Eq. 12, it generates a gradual Gaussian curve in the Grossberg layer. To avoid misjudgment of background pixels, we set a low value (e.g., T = 0.12) in the P2DCA framework.

During data analysis, the video frame is partitioned into multiple square-shaped blocks of equal size, i.e., 64 × 64. Abnormal activities do not occur all the time in real-time scenarios. Furthermore, a video sequence consists of multiple frames and it is unlikely that consecutive video frames always contain very rapid motion. Sizes like 2 × 2, 4 × 4, 8 × 8, 16 × 16, and 32 × 32 can be adopted for an in-depth analysis when there is a very rapid motion in the captured videos or when some intermediate video frames are lost during transmission [19]. As far as the shape of the block is concerned, the square shape is the optimum one as compared to other complex shapes like the hexagonal shape. Furthermore, it covers more area and does not require jumping from one pixel location to another and changing of axis information. In each block, the CP-ANN architecture is applied to each pixel to determine how many pixels belong to the background region. This entire procedure can be represented by the following equation.

\[ B_q = \begin{cases} 0, & \text{if } \sum b(x, y) > 256 \\ 1, & \text{otherwise} \end{cases}, \tag{13} \]

where B represents a block of pixels and q represents the block number, where q ∈ {1, 2, · · · , Q}. Here, a 0 indicates that most of the pixels in a block belong to the background region and B is classified as the background region block. On the other hand, a 1 means that the block belongs to a moving object.

IV. EXPERIMENTAL SETUP AND SIMULATION RESULTS

In this section, we compare the performance of the P2DCA framework with the existing privacy-preserving schemes, i.e., SLICER with Transfer on Meet Up (SLICER-TMU), Minimal Cost Transfer (SLICER-MCT), and Simple Exchanging (SE) [20], [21]. For a performance comparison, we consider six different metrics, i.e., computational overhead, communication overhead, data freshness, packet delivery ratio, reconstruction ratio, and segmentation accuracy.

During simulation, we build a WMSN of 500 MSNs out of which only 5% are selected as CHs. This network is set up in Matlab 2017a. The transmission range of each MSN is set to 100 m. Mobile sinks are continuously moving with a constant speed based on a random way-point mobility model [22].

In the P2DCA framework, we introduce an extra task for CHs, i.e., data and location coordinates aggregation. Unlike location coordinates aggregation, data aggregation is a computationally intense task and requires availability of computational

resources. The computational overhead metric represents the total amount of time required to aggregate data at CHs alongside performing other tasks. As shown in Fig. 4, the P2DCA framework shows a better performance as compared to the SLICER-TMU and SLICER-MCT schemes. In the SE scheme, no aggregation is applied to multimedia data, therefore, it shows less computational overhead. At the start of simulations, the existing schemes and the P2DCA framework show similar results. However, the computational overhead remains higher in the SLICER-TMU and SLICER-MCT schemes with an increase in data transmitting MSNs. We use a lightweight mutual handshaking mechanism to verify the identities of newly joining MSNs. To further reduce the computational load on CHs, registration of mobile sinks is performed at the BS. Once registered, the CHs receive information about registered mobile sinks from the BS. These features are missing in the targeted schemes.

[Fig. 4: Computational Overhead. Plot of computational overhead (ms) versus size of multimedia data (MB) for the SE, SLICER-MCT, SLICER-TMU, and P2DCA schemes.]

A comparison based on the communication overhead is provided in Fig. 5. This comparison represents the total amount of data transmitted from CHs to the cloud server via mobile sinks. As shown in this figure, the communication overhead of the P2DCA framework is lower than other schemes in the presence of an increasing number of data transmitting MSNs. Unlike the targeted schemes, the P2DCA framework uses efficient data and location aggregation techniques which reduce the amount of data to be transmitted. Multimedia data coming from the same geographical region mostly contain repetitive information. Furthermore, the mobile sinks and MSNs operate in open environments with limited available bandwidth. Processing and transmission of redundant multimedia data require extra computational resources and bandwidth, respectively, and are not feasible for resource-constrained devices.

[Fig. 5: Communication Overhead. Plot of communication overhead (MB) versus total number of nodes for the SE, SLICER-MCT, SLICER-TMU, and P2DCA schemes.]

The data freshness is evaluated in Fig. 6. The data freshness metric indicates the resilience of a scheme against various attacks, especially the replay attack. In this figure, the malicious nodes are the ones that somehow manage to infiltrate the proposed framework by registering with the BS. We evaluated the P2DCA framework for data freshness against the existing schemes in the presence of 10 malicious nodes. The average data freshness of the P2DCA framework is 97.09%, the SLICER-TMU scheme is 96.3%, the SLICER-MCT scheme is 95.5%, and the SE scheme is 92.01%, over a period of 150 minutes. The P2DCA framework achieves a higher percentage of data freshness and sustains itself against the replay attacks. Unlike the P2DCA framework, the SLICER-TMU and SLICER-MCT schemes experience a significant decrease in the data freshness after 45 minutes of the network deployment. The SE scheme, on the other hand, experiences a significant drop of 7.5% after the initial 20 minutes of the network deployment.

[Fig. 6: Data Freshness. Plot of data freshness (%) versus total time (minutes) for the SE, SLICER-MCT, SLICER-TMU, and P2DCA schemes.]

In Fig. 7, the packet delivery ratio of the P2DCA framework is compared against the existing schemes for a varying number of malicious nodes. The average packet delivery ratio for the P2DCA framework is 95.56%, the SLICER-TMU scheme is 92.23%, the SLICER-MCT scheme is 85.31%, and the SE scheme is 74.1%. Unlike the P2DCA framework, the existing schemes experience a significant drop in the packet delivery ratio as the number of malicious nodes increases. In general, this metric is associated with the verification process at CHs. During the verification process, the malicious nodes that go unnoticed acquire time division multiple access slots from CHs and broadcast their malicious/fabricated data packets to the BS. As a result, the legitimate nodes are unable to transmit their data and ultimately start dropping data packets due to buffer overloading.

A comparison based on the reconstruction ratio is shown in Fig. 8. The reconstruction ratio represents the percentage of multimedia data successfully reconstructed by the cloud server. As shown in this figure, the P2DCA framework shows a better

performance in the presence of a packet-drop ratio of 5%. Unlike the targeted schemes, the P2DCA framework transmits a small amount of data. In the presence of packet-drop, missing data packets are recovered either by retransmission or by applying an error concealment technique [23]–[25]. The recovery of data packets may introduce excessive delays and increase the network load due to retransmission of data packets. As a result, the cloud server needs to wait until all the data packets have successfully arrived prior to starting the segmentation process. In real-time IoMT applications, such as surveillance, healthcare, and transportation management, quick actions need to be taken. Due to heavy multimedia data transmission and processing delay, the targeted schemes may not be feasible for real-time IoMT applications.

[Fig. 7: Packet Delivery Ratio. Plot of packet delivery ratio (%) versus total number of malicious nodes for the SE, SLICER-MCT, SLICER-TMU, and P2DCA schemes.]

[Fig. 8: Reconstruction Ratio. Plot of reconstruction ratio (%) versus total number of malicious nodes for the SE, SLICER-MCT, SLICER-TMU, and P2DCA schemes.]

In real-world scenarios, multimedia data are always compressed before transmission. In our experiments, videos are encoded using the Scalable High-efficiency Video Coding (SHVC) standard. Here, we assume that MSNs are fixed nodes. Therefore, the selected test video sequences contain moving objects with static backgrounds. Once the multimedia data have successfully arrived at the cloud server, the segmentation task is performed to extract meaningful information, e.g., tracking moving objects. We compare the performance of the P2DCA framework against the targeted schemes in terms of segmentation accuracy. The segmentation accuracy represents the percentage of successful segmentation and reconstruction of meaningful information from reconstructed data. For performance analysis, we use the following metrics, i.e., True-Positive (TP), False-Positive (FP), False-Negative (FN), and F-measure. These are standard metrics used to measure the accuracy of binary segmentation. To compute the average performance, we use the following mathematical expressions.

\[ \begin{aligned} \text{Recall (Re)} &= \frac{\#TP}{\#TP + \#FN}, \\ \text{Precision (Pr)} &= \frac{\#TP}{\#TP + \#FP}, \\ \text{F-measure} &= \frac{2 \times \text{Pr} \times \text{Re}}{\text{Pr} + \text{Re}}, \end{aligned} \tag{14} \]

where #TP, #FN, and #FP represent the total number of TP, FN, and FP, respectively.

In a quantitative evaluation, based on Re, Pr, and F-measure metrics, a high score means a better performance. An overall quantitative comparison is summarized in Table III. As shown in the table, better results in terms of average recall and average precision are obtained for the multimedia data processed through the P2DCA framework as compared to the multimedia data processed through the targeted schemes. In the SHVC-based encoding, compression of videos means loss of information and introduction of visual artifacts in videos. As the data analysis phase of the P2DCA framework is running at the cloud server, it is important to know that the background and foreground segmentation are performed on compressed aggregated videos. Despite this fact, the overall accuracy of the P2DCA framework on video sequences with moving objects is still better than the existing schemes as shown in the last column representing the average F-measure. Due to the absence of aggregation in the SE scheme, extensive compression is applied to reduce the size of multimedia data before transmission. High compression and packet-drop together make it difficult for the cloud server to extract meaningful information from the received multimedia data. As a result, this scheme lags behind the P2DCA framework.

Method        Average Re    Average Pr    Average F-Measure
SLICER-TMU    0.7666        0.7643        0.7627
SLICER-MCT    0.7873        0.7973        0.7751
SE            0.7951        0.8451        0.806
P2DCA         0.8151        0.8784        0.8339

TABLE III: Quantitative Comparison

V. CONCLUSION

In this paper, we have proposed a privacy-preserving-based data collection and analysis framework, called P2DCA, for IoMT applications. This framework is designed to collect multimedia data using mobile sinks from WMSNs. In the P2DCA framework, the underlying WMSN is partitioned into multiple small clusters. Before sharing data with mobile sinks, both data and location coordinates are aggregated on CHs to hide data sources (i.e., member MSNs) from mobile sinks. After receiving the data from mobile sinks, the cloud server performs an analysis using a CP-ANN to segment received videos into

foreground and background regions to track moving objects. Simulation results have shown that the P2DCA framework performs better as compared to the existing schemes in terms of preserving the privacy of member MSNs and segmenting the received data. Regarding future work, we are planning to use the simulation results produced in this paper as a base to perform further enhancements in the P2DCA framework to protect the privacy of visual contents in captured videos.

REFERENCES

[1] G. D. P. Regulation, “Regulation (eu) 2016/679 of the european parliament and of the council of 27 april 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing directive 95/46,” Official Journal of the European Union (OJ), vol. 59, no. 1-88, p. 294, 2016.
[2] C. Legislature, “California consumer privacy act,” 2018. [Online]. Available: https://www.caprivacy.org/
[3] K. Ota, M. S. Dao, V. Mezaris, and F. G. De Natale, “Deep learning for mobile multimedia: A survey,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 13, no. 3s, p. 34, 2017.
[4] F. Qu, Z. Wu, F.-Y. Wang, and W. Cho, “A security and privacy review of vanets,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 6, pp. 2985–2996, 2015.
[5] M. A. Ferrag, L. Maglaras, A. Argyriou, D. Kosmanos, and H. Janicke, “Security for 4g and 5g cellular networks: A survey of existing authentication and privacy-preserving schemes,” Journal of Network and Computer Applications, 2017.
[6] E. Luo, M. Z. A. Bhuiyan, G. Wang, M. A. Rahman, J. Wu, and M. Atiquzzaman, “Privacyprotector: Privacy-protected patient data collection in iot-based healthcare systems,” IEEE Communications Magazine, vol. 56, no. 2, pp. 163–168, 2018.
[7] H. Tao, M. Z. A. Bhuiyan, A. N. Abdalla, M. M. Hassan, J. M. Zain, and T. Hayajneh, “Secured data collection with hardware-based ciphers for iot-based healthcare,” IEEE Internet of Things Journal, 2018.
[8] J. Long, A. Liu, M. Dong, and Z. Li, “An energy-efficient and sink-location privacy enhanced scheme for wsns through ring based routing,” Journal of parallel and Distributed computing, vol. 81, pp. 47–65, 2015.
[9] J. Long, M. Dong, K. Ota, and A. Liu, “Achieving source location privacy and network lifetime maximization through tree-based diversionary routing in wireless sensor networks,” IEEE Access, vol. 2, pp. 633–651, 2014.
[10] S. Homayoun, A. Dehghantanha, M. Ahmadzadeh, S. Hashemi, R. Khayami, K.-K. R. Choo, and D. E. Newton, “Drthis: Deep ransomware threat hunting and intelligence system at the fog layer,” Future Generation Computer Systems, vol. 90, pp. 94–104, 2019.
[11] S. A. Miraftabzadeh, P. Rad, K.-K. R. Choo, and M. Jamshidi, “A privacy-aware architecture at the edge for autonomous real-time identity reidentification in crowds,” IEEE Internet of Things Journal, vol. 5, no. 4, pp. 2936–2946, 2018.
[12] H. HaddadPajouh, A. Dehghantanha, R. Khayami, and K.-K. R. Choo, “A deep recurrent neural network based approach for internet of things malware threat hunting,” Future Generation Computer Systems, vol. 85, pp. 88–96, 2018.
[13] T. Zhang and Q. Zhu, “Dynamic differential privacy for admm-based distributed classification learning,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 1, pp. 172–187, 2017.
[14] T. Zhang and Q. Zhu, “Distributed privacy-preserving collaborative intrusion detection systems for vanets,” IEEE Transactions on Signal and Information Processing over Networks, 2018.
[15] Y. Shen, C. Luo, D. Yin, H. Wen, R. Daniela, and W. Hu, “Privacy-preserving sparse representation classification in cloud-enabled mobile applications,” Computer Networks, vol. 133, pp. 59–72, 2018.
[16] M. Usman, M. A. Jan, X. He, and P. Nanda, “Data sharing in secure multimedia wireless sensor networks,” in Trustcom/BigDataSE/ISPA, 2016 IEEE. IEEE, 2016, pp. 590–597.
[17] M. Usman, N. Yang, M. A. Jan, X. He, M. Xu, and K.-M. Lam, “A joint framework for qos and qoe for video transmission over wireless multimedia sensor networks,” IEEE Transactions on Mobile Computing, vol. 17, no. 4, pp. 746–759, 2018.
[18] M. Usman, M. A. Jan, X. He, and J. Chen, “A mobile multimedia data collection scheme for secured wireless multimedia sensor networks,” IEEE Transactions on Network Science and Engineering, 2018.
[19] H. Zhang, A. Kankanhalli, and S. W. Smoliar, “Automatic partitioning of full-motion video,” Multimedia systems, vol. 1, no. 1, pp. 10–28, 1993.
[20] F. Qiu, F. Wu, and G. Chen, “Privacy and quality preserving multimedia data aggregation for participatory sensing systems,” IEEE Transactions on Mobile Computing, vol. 14, no. 6, pp. 1287–1300, 2015.
[21] D. Christin, J. Guillemet, A. Reinhardt, M. Hollick, and S. S. Kanhere, “Privacy-preserving collaborative path hiding for participatory sensing applications,” in Mobile Adhoc and Sensor Systems (MASS), 2011 IEEE 8th International Conference on. IEEE, 2011, pp. 341–350.
[22] C. Bettstetter, G. Resta, and P. Santi, “The node distribution of the random waypoint mobility model for wireless ad hoc networks,” IEEE Transactions on mobile computing, vol. 2, no. 3, pp. 257–269, 2003.
[23] M. Usman, X. He, M. Xu, and K. M. Lam, “Survey of error concealment techniques: Research directions and open issues,” in Picture Coding Symposium (PCS), 2015. IEEE, 2015, pp. 233–238.
[24] M. Usman, X. He, K.-M. Lam, M. Xu, S. M. M. Bokhari, and J. Chen, “Frame interpolation for cloud-based mobile video streaming,” IEEE Trans. Multimedia, vol. 18, no. 5, pp. 831–839, 2016.
[25] M. Usman, X. He, K. K. Lam, M. Xu, J. Chen, S. M. M. Bokhari, and M. A. Jan, “Error concealment for cloud-based and scalable video coding of hd videos,” IEEE Transactions on Cloud Computing, 2017.

Muhammad Usman received a Ph.D. in Computer Systems from the School of Electrical and Data Engineering, University of Technology Sydney, Australia, in 2017. Currently, he is working as a Postdoc Research Assistant in the Department of Computer Science and Software Engineering, Swinburne University of Technology, Australia. He has published his research work in many well-known international journals and conferences. His research interests include multimedia processing and communication, security and privacy in communication networks, Internet of Things, Internet of Multimedia Things, and error concealment techniques.

Mian Ahmad Jan is an Assistant Professor in the Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan. He holds a Ph.D. in Computer Systems from the University of Technology Sydney, Australia. He has published his research work in many well-known international journals and conferences. His research interests include cluster-based hierarchical routing protocols, congestion detection, mitigation, and intrusion and malicious attack detection in wireless sensor networks, Internet and web of things.

Xiangjian He received a Ph.D. in Computing Sciences from the University of Technology, Sydney, Australia, in 1999. He is currently a professor and the director of Computer Vision and Pattern Recognition Laboratory at the University of Technology, Sydney. He has received many competitive national and regional grants. He has published his research work in many well-known international journals and conferences. His research interests include image processing, computer vision, and pattern recognition.

Jinjun Chen is a professor from Faculty of Science, Engineering and IT, Swinburne University of Technology, Australia. He holds a PhD in Computer Science and Software Engineering (2007) from the Swinburne University of Technology, Melbourne, Australia. His research interests include cloud computing, big data, and data intensive systems. He has published more than 130 papers in high quality journals and conferences.
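As a companion to Eq. 9 to Eq. 11 of the data analysis phase, the Kohonen-layer step can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the distance Ψ is defined earlier in the paper and is replaced here by an absolute difference on pixel values normalised to [0, 1]; every neuron weight is updated when a winner is declared; and the function names (`psi_max`, `is_winner`, `update_weight`, `kohonen_step`) are hypothetical. The learning rate h = 0.01 and the example tolerance T = 0.12 are the values quoted in the text.

```python
from typing import List, Sequence


def psi_max(distances: Sequence[float]) -> float:
    """Eq. 9: the largest distance among the N Kohonen neurons."""
    return max(distances)


def is_winner(distances: Sequence[float], tolerance: float) -> bool:
    """Eq. 10: a winner is declared only when psi_max is below T."""
    return psi_max(distances) < tolerance


def update_weight(v: float, p_c: float, h: float = 0.01) -> float:
    """Eq. 11: move the weight v towards the current pixel value p_c."""
    return v + h * (p_c - v)


def kohonen_step(weights: Sequence[float], p_c: float,
                 tolerance: float, h: float = 0.01) -> List[float]:
    """Apply Eq. 9 to Eq. 11 once for a single pixel value p_c.

    ASSUMPTION: the distance is |p_c - v|, and all weights are adjusted
    when a winner exists. The paper only says that weights "around" the
    winner are adjusted, so this is a simplification.
    """
    distances = [abs(p_c - v) for v in weights]
    if not is_winner(distances, tolerance):
        return list(weights)  # no winner: weights stay unchanged
    return [update_weight(v, p_c, h) for v in weights]


if __name__ == "__main__":
    # Hypothetical pixel value and neuron weights, for illustration only.
    print(kohonen_step([0.45, 0.52, 0.55], p_c=0.50, tolerance=0.12))
```

With h = 0.01, each update moves a weight by only 1% of its gap to the pixel value, so a neuron's weight drifts slowly across frames.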

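The block-level decision of Eq. 13 can be illustrated in the same way. The sketch below assumes that the per-pixel Grossberg outputs b(x, y) of Eq. 12 are already available as a two-dimensional list, that blocks are numbered left to right and top to bottom starting from q = 1, and that incomplete blocks at the frame border are skipped. None of these details are specified in the text, and the names `iter_blocks`, `classify_block`, and `classify_frame` are hypothetical. The block size of 64 and the threshold of 256 are taken from the text.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

Grid = Sequence[Sequence[float]]


def iter_blocks(b_map: Grid, size: int = 64) -> Iterator[Tuple[int, List[Sequence[float]]]]:
    """Yield (q, block) pairs for every complete size x size block.

    ASSUMPTION: row-major block numbering starting at q = 1; partial
    blocks at the right and bottom borders are ignored.
    """
    rows, cols = len(b_map), len(b_map[0])
    q = 0
    for top in range(0, rows - size + 1, size):
        for left in range(0, cols - size + 1, size):
            q += 1
            yield q, [row[left:left + size] for row in b_map[top:top + size]]


def classify_block(block: Grid, threshold: float = 256.0) -> int:
    """Eq. 13: 0 (background block) if the sum of b(x, y) exceeds the
    threshold, otherwise 1 (block belongs to a moving object)."""
    total = sum(sum(row) for row in block)
    return 0 if total > threshold else 1


def classify_frame(b_map: Grid, size: int = 64,
                   threshold: float = 256.0) -> Dict[int, int]:
    """Map every block number q to its Eq. 13 label B_q."""
    return {q: classify_block(block, threshold)
            for q, block in iter_blocks(b_map, size)}


if __name__ == "__main__":
    # Hypothetical 64 x 128 map: left block all background, right block none.
    demo = [[1.0] * 64 + [0.0] * 64 for _ in range(64)]
    print(classify_frame(demo))
```

Keeping the block size and threshold as parameters makes it easy to try the smaller block sizes the text mentions for rapid motion, although the text does not say how the threshold should change with block size.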

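For the segmentation metrics of Eq. 14, a minimal sketch of the computation from raw counts is given here. The counts in the usage example are hypothetical and are not taken from the experiments. Returning 0.0 when a denominator is zero is our assumption, since the text does not discuss that case.

```python
def recall(tp: int, fn: int) -> float:
    """Eq. 14: Re = #TP / (#TP + #FN)."""
    denom = tp + fn
    return tp / denom if denom else 0.0  # ASSUMPTION: 0.0 for empty denominator


def precision(tp: int, fp: int) -> float:
    """Eq. 14: Pr = #TP / (#TP + #FP)."""
    denom = tp + fp
    return tp / denom if denom else 0.0  # ASSUMPTION: 0.0 for empty denominator


def f_measure(pr: float, re: float) -> float:
    """Eq. 14: F-measure = 2 * Pr * Re / (Pr + Re)."""
    denom = pr + re
    return 2 * pr * re / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical pixel counts for one segmented frame.
    tp, fp, fn = 80, 10, 20
    re, pr = recall(tp, fn), precision(tp, fp)
    print(f"Re={re:.4f} Pr={pr:.4f} F={f_measure(pr, re):.4f}")
```

Since the F-measure is the harmonic mean of Pr and Re, it is pulled towards the lower of the two, so a method needs both high precision and high recall to score well.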