
SPECIAL SECTION ON ADVANCES ON HIGH PERFORMANCE WIRELESS NETWORKS FOR AUTOMATION AND IIOT

Received 31 October 2022, accepted 14 November 2022, date of publication 28 November 2022, date of current version 18 August 2023.
Digital Object Identifier 10.1109/ACCESS.2022.3224922

Innovative Variational AutoEncoder for an End-to-End Communication System
MOHAMAD A. ALAWAD¹ (Student Member, IEEE),
MUTASEM Q. HAMDAN² (Student Member, IEEE),
AND KHAIRI A. HAMDI¹ (Senior Member, IEEE)
¹ Department of Electrical and Electronic Engineering, The University of Manchester, M13 9PL Manchester, U.K.
² 5GIC & 6GIC, Institute for Communication Systems (ICS), University of Surrey, GU2 7XH Guildford, U.K.
Corresponding author: Mohamad A. Alawad (mohamad.alawad@manchester.ac.uk)

ABSTRACT Powered by deep learning (DL), autoencoder (AE)-based end-to-end (E2E) communication systems have been developed to merge all physical layer blocks of traditional communication systems and have achieved great success. In this paper, a new probabilistic model based on the variational autoencoder (VAE) is proposed for short-packet wireless communication systems. In this approach, the information messages are represented by so-called packet hot vectors (PHV), which are inferred by the VAE latent random variables (LRVs). Only the LRVs' parameters then need to be transmitted through the physical wireless channel. This results in a significant improvement in spectral efficiency compared with the pure AE approach, where longer hot vectors are to be transmitted. Specific VAE models have been developed for both binary phase shift keying (BPSK) and quadrature phase shift keying (QPSK) systems. Simulation and numerical results are given to demonstrate the performance of the proposed method in different realistic scenarios, including Rayleigh and Rician fading channels with shadowing and Doppler effects. These results show that the proposed VAE with a DL classifier provides better symbol error rate (SER) performance than both the baseline AE and the classical Hamming code with hard-decision decoding. Furthermore, as far as spectral efficiency is concerned, we show that the proposed VAE using two channels outperforms the baseline AE using seven channels.

INDEX TERMS Variational autoencoder, machine learning, auto-encoder, Hamming code, latent random
variable, probabilistic models, wireless communications, binary phase shift keying, quadrature phase shift
keying.

The associate editor coordinating the review of this manuscript and approving it for publication was Dave Cavalcanti.

I. INTRODUCTION
Wireless networks and other related services are becoming more intelligent with innovative advances and unprecedented levels of computing capability. The advent of numerous unprecedented services, such as self-driving cars, smart cities, factories, telemedicine, and remote diagnostics, presents a challenge to classical communication in terms of latency, flexibility, reliability, energy efficiency, and connection density. All of these technologies require new architectures, approaches, and algorithms in almost all layers of the communications systems. An advanced artificial intelligence (AI)-based approach can significantly improve the design and management of communication components. AI, represented by machine learning (ML) and deep learning (DL), has attracted tremendous attention as it has successfully transformed the manner in which humans work and communicate. This has been addressed in [1] and [2]. Some of these techniques have been applied in the communication literature, have triggered extensive research, and have greatly impacted the solutions to some communication problems. Various emerging trends for the DL method are also considered based on information theory, probability, statistics, and solid mathematical modelling. The primary function of a communication system is to transmit a message, such as a bit stream, from the source to the destination over

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
86834 VOLUME 11, 2023
M. A. Alawad et al.: Innovative VAE for an E2E Communication System

a channel through the accurate use of a transmitter and receiver. In order to achieve this optimally, the transmitter and receiver are segmented into strings of multiple independent blocks, each of which is responsible for a particular mini-task. Many approaches have been demonstrated in various applications such as modulation recognition [3], signal detection [4], channel coding [5], channel decoding [6], [7], [8], [9], channel estimation and detection [10], [11], [12], [13], [14], and replacement of the total communication system with a novel architecture based on an auto-encoder (AE). In [3] and [15], the authors show a significant gain by introducing an AE as a communication system, in which the modulation and coding are jointly designed as one end-to-end (E2E) DL model. The work in [3] showed how the use of block structures typically enables individual optimization, analysis, and control of each block, without the need for any domain-specific information. The E2E AE can achieve a performance similar to the conventional method in additive white Gaussian noise (AWGN) channels. However, the block-based approach is sub-optimal in certain cases [3]. Considering the DL-based communications system design, the optimization of E2E as one black box block is proposed in [3] and [16]. All previous work has shown that the idea of E2E learning in communication systems has received widespread attention in the wireless communications community [17], [18]. In our paper, we use generative models known as variational autoencoders (VAEs) [20], [46], as they have been extensively used for unsupervised and semi-supervised DL. Moreover, since most of the current mobile systems generate unlabeled or semi-labelled data, the VAE is well suited to learning in wireless environments.

A. RELATED WORKS
As DL advances, the research paradigm can shift away from designing schemes using mathematical models to autonomously constructing E2E DL schemes based on observations of large quantities of data. For example, when DL is employed for image classification, feature detectors that are far more accurate than conventional detectors can be derived from a large set of image inputs using DNN structures. Therefore, in the age of DL, the design starts with preparing, selecting, and pre-processing the data to be used in the DNN structure, followed by determining the appropriate structure for the DNN. Lastly, interpreting the output of the DNN becomes more important than developing analytic schemes from mathematical systems that typically contain assumptions necessary to enable analysis.
Recently, DL has been applied to many areas of wireless communications research. Besides improving conventional communication modules, DL-based E2E communication systems have recently been developed, in which DNNs represent both the transmitter and receiver. A framework with block structures under the AWGN channels was proposed in [3] and performs similarly to traditional approaches. There is also an E2E framework in the OFDM (orthogonal frequency division multiplexing) system [21] and the singular value decomposition (SVD) precoding-based MIMO system [15], which view the channels as a group of independent sub-channels.
Moreover, recent research has examined how to learn an E2E communication system without prior knowledge of channel models. A reinforcement learning (RL) approach was developed in [22] to optimize the transmitter DNN without regard to the channel transfer function or channel state information (CSI). The stochastic perturbation approach was used in [23] to design a model-free E2E communication framework. In [24], a conditional generative adversarial network (GAN) approach has been developed for building E2E communications, where the channel effects are modelled by a conditional GAN.
In contrast to other ML techniques that do not require communication resources, federated learning (FL) utilizes communication between the central server and distributed local clients in order to train and optimize the model. ML-based FL allows training models to be distributed between multiple clients, each with a certain amount of training data and coordinated through a central server. The computation can therefore be offloaded from the central server to the clients. In brief, in FL, the local clients communicate with the central server using only model parameters learned locally rather than raw data, preserving privacy and reducing communication overhead [25], [26], [27].
A part of the artificial intelligence field is ML, which includes algorithms for classification, clustering, and dimensionality reduction (DR). Over the last decade, various classification algorithms have been developed, including Deep Convolutional Neural Networks (DCNNs) [28] and Variational AutoEncoders (VAEs) [46]. The VAE inherits the traditional AE architecture, meaning it is composed of two neural networks (NNs), an encoder and a decoder. The encoder decreases the dimensionality of the inputs into a latent space. On the other hand, the decoder can reconstruct the inputs from the latent space through learning. Thus, VAEs can be used for classification [29], [30], [31] and generation [32], [33], [34]. Moreover, the VAE can learn a data generation distribution from which random samples can be taken in the latent space. After decoding the random samples using the decoder network, it then generates unique images with features similar to those on which the network was trained. Using the Bayes rule [35], the VAE can learn the joint probability of input data and labels simultaneously. Bayesian inference is a method of statistical inference that provides a powerful framework for reasoning and prediction under uncertainty. However, the limitation of computing the posterior with only a few parametric distributions makes wider applications of Bayesian inference difficult [36]. Recently, particle-based variational inference (ParVI) methods have been proposed [37], [38], [39] to approximate the posterior by representing the variational distribution with a set of particles and updating them through a deterministic optimization process. Although the ParVI method can achieve computational efficiency and asymptotic accuracy, it is restricted to a fixed number of particles and lacks the ability to draw


new samples beyond the initial set of particles [37]. Generally, variational inference and Markov chain Monte Carlo (MCMC) methods have been used to give tractable approximate inference, but these approaches bring their own set of challenges when the space's dimensionality is particularly high. Bayesian neural networks (BNNs) are a recent example of interest. These apply Bayesian inference to deep neural network training to provide a principled mechanism to analyze model uncertainty. Developing efficient computational strategies to estimate this intractable posterior with exceptionally high dimensionality, on the other hand, remains challenging.
On the basis of the above and the development of DL, semantic communication is again being considered a key technology and has received great attention. As the 5G system has approached the Shannon limit, semantic communication aims at the successful transmission of the semantic information conveyed by the source rather than the accurate reception of each bit or single symbol regardless of its meaning. Semantic communication is at the second level of communication according to Shannon and Weaver [40], aiming to accurately convey the semantic information of the transmission symbols, rather than accurately recovering the transmitted information.
Recently, several semantic communication concepts have been developed based on NNs to replace conventional communication blocks. In [41], the conditional generative adversarial net (GAN) was designed to represent channel effects, while in [42], a complete point-to-point communication system in the physical layer was developed using NNs. The authors of [43] show that the network can learn a projection function from feature space to a semantic embedding space in zero-shot learning (ZSL) models. The work in [44] developed a DL-based semantic communication system (DeepSC) for text transmission, with the aim of maximizing the capacity of the system and minimizing semantic errors, as it would recover the meaning of sentences rather than the bit or symbol error. Moreover, the authors in [45] proposed a semantic communication approach based on AE for the wireless relay channel (AESC) to extract and compress semantic information and reconstruct its semantic features. However, there are some key differences between semantic communication systems and conventional communication that can be defined as follows [44]:
• The design and optimization of the information transmission module in conventional systems are contained in the transceiver, unlike the semantic system, where the whole information processing block is jointly designed from the source information to sink.
• Recovering the exact data is the focus of conventional communication systems; however, semantic communication systems are intended for transmission decisions.
• Conventional communication systems compress data in the entropy domain, while semantic communication systems process data in the semantic domain.

B. MAIN CONTRIBUTIONS
In this paper, a new approach has been proposed and investigated with the help of a variational autoencoder (VAE) as a probabilistic model to reconstruct the transmitted symbol without sending the data bits out of the transmitter. Our main contributions are summarized as follows:
• We propose an E2E communication system that represents the symbol as PHV and operates over BPSK modulation in AWGN channels, where modulation and demodulation are performed by a deep neural network (DNN) based on a VAE architecture.
• We extend our experiment to investigate QPSK modulation, Rayleigh and Rician fading channels, shadowing, and the Doppler effect for a limited range of Doppler frequency shifts and phase offsets.
• While the baseline AE in [3] uses 4 and 7 channels to achieve its results, our work efficiently uses only two channels to achieve better performance than the AE baseline.
• Our work considers a VAE with two LRVs, and a simple classifier can reconstruct the transmitted message by sending only the LRVs' parameters, with the message error rate (MER) as the performance measure. The result shows that the performance of our proposed system is better than that of the existing classical scheme.

C. PAPER STRUCTURE AND NOTATIONS
The rest of this paper is organized as follows. Section II describes the system model, starting from the anatomy of the VAE and then formulating the wireless system model and VAE model. Section III outlines the experiment setup, the classifier training algorithm, and the VAE training algorithm. Section IV evaluates the performance of the proposed VAE and compares it with several benchmarks. Finally, Section V draws conclusions.
Furthermore, a list of important symbols used throughout this paper is summarized in Table 1.

II. SYSTEM MODEL
In this paper, the wireless communication system model has a simple setup to allow the reader to follow the proposed idea. Our goal is to design a probabilistic model that can reconstruct the transmitted information without sending the exact bits or the deterministically transformed bits of the exact symbol (e.g., channel coding using Hamming codes). Instead, the statistical parameters of the LRVs are transmitted through the physical layer rather than the data bits of the original symbol.

A. VARIATIONAL AUTOENCODER (VAE)
A brief description of the basic VAE, on which this work builds, is required to clearly grasp what follows. The VAE is a popular generative model, allowing us to solve problems in the framework of probabilistic graphical models with latent variables [46], [47]. VAEs can be considered as two independently parameterized models: the recognition model,


TABLE 1. List of symbols.

FIGURE 1. VAE stochastic mapping.

known as the encoder, and the generative model or decoder. The encoder delivers to the decoder an approximation of its posterior over the latent random variables, which the decoder requires to update its parameters inside an iteration of expectation-maximization learning. Conversely, the decoder is a scaffolding of sorts for the encoder to learn meaningful representations of the data besides class labels. In other words, the VAE helps the encoder infer the distribution of the original data rather than the original data itself. By employing a properly designed objective function, the distribution of the original data can be encoded into certain low-dimensional distributions. Similarly, the decoder training allows the decoder to transform the distributions into the approximate original data distribution to obtain a new sample that represents the reconstruction of the
original ones.
Moreover, as probabilistic models, VAEs also contain data and unknowns. Therefore, we need to assume some level of uncertainty around this aspect of the model. This uncertainty can be specified in terms of a conditional probability distribution, where the model can contain both discrete and continuous variable values. In addition, this probabilistic model is able to specify all correlations and higher-order dependencies between these variables in the form of a joint probability distribution.
As shown in Fig. 1, VAEs can learn the stochastic mappings between the observed x-space, which has distribution qD(x), and the latent z-space. The generative model learns the joint distribution pθ(x, z), which is factorized as pθ(x, z) = pθ(z)pθ(x|z) with a prior distribution over latent space pθ(z) and a stochastic decoder pθ(x|z). The inference model, or stochastic encoder, qφ(z|x) approximates the true but intractable posterior pθ(z|x) of the generative model [47].
Specifically, we use the vector x to represent the set of all observed variables whose joint distribution we want to model. We assume the observed variable x is a random sample from an unknown underlying process with an unknown probability distribution p∗(x). To approximate this underlying process, we use a chosen model pθ(x), with parameters θ, which can be written as:

$$x \sim p_\theta(x). \tag{1}$$

To find the value of the parameter θ, we use the learning¹ process, which is the most commonly used search process. The model pθ(x) should approximate the true distribution of the data, denoted by p∗(x), so that for any observed x:

$$p_\theta(x) \approx p^*(x). \tag{2}$$

Often, in the case of classification or regression problems, we are interested in learning a conditional model pθ(y|x) that approximates the underlying conditional distribution p∗(y|x), where the distribution over the values of the variable y is conditioned on the value of the observed variable x. In this case, x is the input of the model. As in the previous paragraph, the model pθ(y|x) is chosen and optimized to be close to the unknown underlying distribution for any x and y:

$$p_\theta(y|x) \approx p^*(y|x). \tag{3}$$

¹ Learning: In terms of ML, the concept of learning can be formulated as Tom Mitchell defines it, as a "problem of searching through a predefined space of potential hypotheses for the hypothesis that best fits the training examples." [48].


One of the most common examples of conditional modelling is image classification, where x is an image and y is the image's class that we want to predict.
We can extend the models discussed above into directed models with latent variables, where the latent variables can be defined as variables that are part of the model but are not part of the data-set, and which, therefore, we do not observe. Normally, we use z to denote the latent variables. In the case of unconditional modelling of the observed variable x, we can represent the directed graphical model by a joint distribution pθ(x, z) over the observed variable x and the latent variables z. The marginal distribution over the observed variables pθ(x) can be written as:

$$p_\theta(x) = \int p_\theta(x, z)\,dz. \tag{4}$$

The model pθ(x, z) can be conditioned on some context, such as pθ(x, z | y), and we use the term "deep latent variable model" (DLVM) when the distributions are parameterized by NNs. The advantage of the DLVM is that even when each factor in the directed model, whether its prior or conditional distribution, is relatively simple, the marginal distribution pθ(x) can be very complex. This expressivity makes the DLVM attractive for approximating complicated underlying distributions. One of the most common and simplest DLVMs is specified by the factorization:

$$p_\theta(x, z) = p_\theta(z)\,p_\theta(x|z). \tag{5}$$

Fig. 2 shows a simple schematic of computational flow in the VAE with the evidence lower bound (ELBO), which is the optimization objective of the VAE.

FIGURE 2. Simple schematic of computational flow in a VAE.

More details about the VAE can be found in [19], [46], and [47].

B. WIRELESS SYSTEM MODELS
We build up an end-to-end communication system that consists of a transmitter sending the desired signal to the receiver, as shown in Fig. 3. We assume that the wireless channels have AWGN. The equation below formulates the received signal vector sr:

$$s_r = H s_d + n_o, \tag{6}$$

where H = diag(h) is the channel coefficient matrix with h = [h1, …, hi, …, hN], hi ∈ C^{1×1}, sd is the desired transmitted signal vector propagated from the transmitter to the receiver, and no is the AWGN noise vector, no ∼ CN(0, σ), where No = σ² is the noise power variance that contaminates the transmitted signal power, as shown in Fig. 3.

FIGURE 3. Simple wireless system with AWGN channel.

By definition, the signal-to-noise ratio (SNR) is:

$$\Gamma = S_r / N_o, \tag{7}$$

where Sr is the power of the desired signal received, and No is the AWGN power. For t bits per symbol, in terms of Eb/No, (6) and (7) can be written as:

$$\gamma = \frac{S_r}{N_o \times t}, \tag{8}$$

where hi ∈ C^{1×1} is sampled from the Rayleigh distribution, the Rician distribution, and the log-normal shadowing distribution for the Rayleigh, Rician, and shadowing models, respectively [49]. In the Doppler model, we use the theoretical flat Doppler spectrum S(f), where S(f) = 1/(2 f_d), and phase shift φd [50].

C. THE VAE AS A WIRELESS SYSTEM MODEL
The proposed VAE model design learns the features of the noise, multipath, line-of-sight, and non-line-of-sight effects using a directed probabilistic graph model (DPGM) as in Fig. 5, where z represents the LRVs that are used to infer the signal features from the packet hot vector (PHV). Using this method, the relation between the transmitted signal and the received signal patterns can be represented using the inferred LRVs.
Inspired by semantic-level communication and the VAE, our work considers the use of variational inference for generative modelling; however, we reinterpret the variational inference from a new perspective. We use generative modelling, which refers to the process of generating valid samples from p(x). Fig. 5 shows our generative model. In this work, the samples of x are generated from a latent variable z, and θ represents the associated parameters, while the solid lines denote the generative model pθ(z) pθ(x | z). For example, to generate valid samples of x, we first sample z, then use z and θ to generate x. The dashed lines represent the inference procedure with a variational approximation of the intractable posterior pθ(z | x). Moreover, we apply DL with a stochastic-optimization-based technique to approximate the inference of p(z | x), with an appropriate prior p(z), using an encoder network qφ(z | x). After that comes the decoder network pθ(x | z) to compute the reconstruction x̂ of the message x, which is learned during the training phase. Given a neural network model with sufficient learning


FIGURE 4. End-to-end wireless communication architecture consists of the VAE and classifier DNNs.

capability and a good prior distribution p(z), this high-capacity model will approximate the posterior by qφ(z | x) ≈ pθ(z | x). Since this model is structured as an encoder-decoder, the technique is known as autoencoding variational Bayes (AVB), where the expected marginal likelihood pθ(x) of the datapoint x ∈ X, under an encoding function qφ(·), can be computed as in [51]:

$$\mathbb{E}_{p(x)}\left[\log p_\theta(x)\right] = \mathbb{E}_{p(x)}\left[D_{KL}\left(q_\phi(z|x)\,\|\,p_\theta(z|x)\right)\right] + \mathcal{L}_{\theta,\phi}(x). \tag{9}$$

The first term in (9) is the Kullback-Leibler (KL) divergence between qφ(z|x) and pθ(z|x).
The second term in (9) is called the evidence lower bound (ELBO):

$$\mathcal{L}_{\theta,\phi}(x) = \mathbb{E}_{p(x)}\left[\mathbb{E}_{q_\phi(z|x)}\left[\log \frac{p_\theta(x, z)}{q_\phi(z|x)}\right]\right], \tag{10}$$

and

$$D_{KL}\left(q_\phi(z|x)\,\|\,p_\theta(z|x)\right) = \mathbb{E}_{q_\phi(z|x)}\left[\log \frac{q_\phi(z|x)}{p_\theta(z|x)}\right]. \tag{11}$$

FIGURE 5. DPGM for used VAE.

FIGURE 6. Reparameterization trick used for training VAE.
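The decomposition in (9) to (11) can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: it uses a hypothetical binary latent variable and a single observed datapoint (so the outer expectation over p(x) is dropped), and the probabilities and the function name `decompose` are made up for the example rather than taken from the paper.

```python
import math

# Hypothetical toy model: binary latent z, one observed datapoint x = 1.
prior = [0.6, 0.4]          # p(z = 0), p(z = 1)
likelihood_x1 = [0.2, 0.9]  # p(x = 1 | z = 0), p(x = 1 | z = 1)


def decompose(q):
    """Return (log p(x), KL(q || posterior), ELBO, posterior) for x = 1.

    q is a candidate encoder distribution q(z | x) over the two latent states.
    """
    joint = [pz * px for pz, px in zip(prior, likelihood_x1)]  # p(x, z)
    evidence = sum(joint)                                      # p(x), cf. (4)
    posterior = [j / evidence for j in joint]                  # p(z | x)
    # ELBO: E_q[log p(x, z) / q(z | x)], cf. (10) for a single datapoint.
    elbo = sum(qz * math.log(j / qz) for qz, j in zip(q, joint) if qz > 0)
    # KL divergence: E_q[log q(z | x) / p(z | x)], cf. (11).
    kl = sum(qz * math.log(qz / pz) for qz, pz in zip(q, posterior) if qz > 0)
    return math.log(evidence), kl, elbo, posterior


log_px, kl, elbo, posterior = decompose([0.5, 0.5])
print(f"log p(x) = {log_px:.6f}, KL + ELBO = {kl + elbo:.6f}")
```

Because the KL term is non-negative, the ELBO never exceeds log p(x); passing the exact posterior as `q` drives the KL term to zero and makes the bound tight.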
We have to maximize Lθ,φ(x) by minimizing DKL(qφ(z|x) || pθ(z|x)) in order to maximize the penalized likelihood of the reconstruction of x from z using:

$$\mathcal{L}_{\theta,\phi}(x) = \mathbb{E}_{p(x)}\left[\log p_\theta(x)\right] - \mathbb{E}_{p(x)}\left[D_{KL}\left(q_\phi(z|x)\,\|\,p_\theta(z|x)\right)\right]. \tag{12}$$

Moreover, since backpropagation through a random operation is not possible in the training stage, we use the reparameterization trick to move the random sampling operation to an auxiliary variable ε that is shifted by the mean µi and scaled by the standard deviation σi, respectively, representing the distribution Φ that the network is trying to learn, as in Fig. 6. This allows backpropagation through the deterministic nodes f, z, Φ. The idea here is that sampling from N(µi, σi²) is the same as sampling from (µi + ε·σi), where ε ∼ N(0, 1).
Next, we describe the architecture of the VAE in the proposed E2E wireless communication system shown in Fig. 4 and compare this transformation with a simple wireless system as shown in Fig. 3.

1) VAE INPUT
The hot vector in [3] can be replaced with a new concept known as the PHV, in the same way that [19] used it to represent the constellation of a symbol. However, in this work, we present the symbol as a packet of ones and zeroes, where the inputs s0 and s1 to the transmitter are encoded as a one-PHV 1_s ∈ R^M. The sent binary phase shift keying (BPSK) message s0 is represented by a packet of B bits. This packet consists of K sub-packets, where each


sub-packet ki , i ∈ {1, . . . , K } contains b bits. For example, 2 × by + 1 is the final length of bits code that the
this means that the total length of our PHV is 1 × bK . Let modulator receives. After signal demodulation, the BCD
the space of possible messages be M = 2bK and bK be the will use the binary decoded bits to convert it back as a
necessary number of bits to represent each message m. Then decimal integer and fraction parts before combining both
transmit input message st ∈ {1, . . . , m, . . . , M }, where M is using a fixed point radix to retrieve the decimal value.
the space size of the possible messages as in Fig. 7. This proposed method eliminates any digitization error
for the y values when by length satisfies the required
2) VAE ENCODER significant figures for precision sf .
Each PHV x fed into the input layer will be transformed by • BPSK modulation and demodulation components: The
f : R1×bK → R1×c , where c is the dimension of the last layer BPSK is used to modulate the output of the DCB using
in the encoder. Looking at Fig. 4, the encoder layers include a standard modulation, whereas the demodulated output
two-dimensional convolution (2DConv) layers, each of which is used to feed the BCD input.
is configured with several filters (each filter has a size of h̄ • AWGN noise channel: the physical AWGN noise is
height and $ width). The features output by each layer are ∼ CN (0, ξ ), where ξ is the fixed standard deviation
mapped to a number of filters ν1 and ν2 , respectively. The value that contaminates the amplitude and phase of the
filter shifts by ς strides at each convolution step, while the received signal.
padding size ℘ can be calculated using: size (2DConv) = The purpose of this is to represent the posterior of every
k−h̄$ +2℘ parameter in all weight tensors from each layer of deep
ς + 1, to keep the output size equal to the input.
A rectified linear unit (ReLU) layer is used after each 2DConv networks. The number of channels in the physical wireless
to eliminate any negative output value. A final fully connected component medium has the dimension of R1×c , and c is the
(FC) layer was added to the encoder with the dimension of number of channels that the proposed communication system
1×2c. The output of the FC layer is divided into two sets µz = uses to send one message out of the 2bK messages. The
[µ1 , . . . , µc ] and σ z = [σ1+c , . . . , σ2c ], which represents the E2E rate of this communication can be measured by rE2E =
bK
latent variables’ distributions parameters (the expectation and c [bits/channel use]. However, over the physical wireless
the variance, respectively). components medium, the rate of the physical transmission
The transformation can be formulated using the DNN is rPH = (2by + 1) [bits/channel use]. This leads to the
hyperparameters θT : compression rate (CR) formula:

yn = f (xn , θT ), (13) (2by + 1)c


CR = . (14)
where $x_n \in X$, $x_n$ is the input data point and $y_n$ is the output of the FC layer, which has decimal format. After this, the FC decimal output values use the physical decimal-to-binary converter (DCB) component to start sending the LRVs' distribution parameters over the physical layer.

FIGURE 7. Packet representation for VAE input.

3) PHYSICAL MEDIUM
In this paper, our unique approach is to explain the practical aspect of implementing an E2E system that includes the realization of the physical wireless transmission and the receiving components, such as the digitization of the $\mu$ and $\sigma$ values for each LRV, the modulator, demodulator, and AWGN channel:
• Decimal-coded binary (DCB) and binary-coded decimal (BCD) converters: In the DCB component, the received decimal integer part will be represented by $b_y$ bits, and the same for the fractional part of the decimal value. In addition, an extra bit for the sign has been added as the most significant bit (MSB), which means

2bK

The channel noise is AWGN, due to the assumption that the main source of the noise is on the receiver side [3]. The channel uses a fixed variance $\xi^2 = (E_b/N_o)^{-1}$ and is characterized as a distribution $N(0, \xi^2 I)$, where $E_b/N_o$ is the ratio of the energy per bit $E_b$ to the noise power spectral density $N_o$ that contaminates the desired signal at the receiver after converting the values from binary to decimal using the BCD.

4) VAE DECODER
The BCD output of the physical medium represents the LRVs' contaminated expectation and variance decimal parameter vectors as a function of $E_b/N_o$, which are $\hat{\mu}_z(E_b/N_o) = [\hat{\mu}_1(E_b/N_o), \ldots, \hat{\mu}_c(E_b/N_o)]$ and $\hat{\sigma}_z(E_b/N_o) = [\hat{\sigma}_{1+c}(E_b/N_o), \ldots, \hat{\sigma}_{2c}(E_b/N_o)]$, respectively. In this paper, we propose to use the sampling layer inside the receiver to realize a practical architecture of the E2E wireless system. The dimensions of the sampling input layers are equal to those of the last encoder output layer, $1 \times 2c$. However, for the following layers, the reparameterization trick is necessary to allow the VAE to perform backpropagation in the training phase and to sample $\hat{z}$ as shown in Fig. 6. It has been formulated using $\epsilon \sim [N_1(0, 1), \ldots, N_c(0, 1)]$ as:

$$\hat{z}(E_b/N_o) = \hat{\mu}_z(E_b/N_o) + \hat{\sigma}_z(E_b/N_o) \odot \epsilon. \quad (15)$$
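As an illustration of the receiver-side path described above, the following minimal Python sketch quantizes a decimal parameter into sign, integer and fractional bits, sends the bits through a BPSK/AWGN link, converts them back to decimal, and applies the sampling step of (15). It is a sketch under stated assumptions rather than the authors' implementation: the function names `dcb`, `bcd`, `awgn_bpsk`, `sample_latent` and `receive_parameters`, the word length `b_y = 4`, the saturation behaviour and the hard-decision detector are all hypothetical choices. The noise variance follows the text, $\xi^2 = (E_b/N_o)^{-1}$, with $E_b/N_o$ taken in linear scale.

```python
import numpy as np


def dcb(value, b_y=4):
    """Decimal -> bits: sign bit (MSB), then b_y integer and b_y fractional bits."""
    sign = 1 if value < 0 else 0
    scaled = int(round(abs(float(value)) * (1 << b_y)))   # fixed-point magnitude
    scaled = min(scaled, (1 << (2 * b_y)) - 1)             # saturate (assumption)
    mag = [(scaled >> i) & 1 for i in range(2 * b_y - 1, -1, -1)]
    return np.array([sign] + mag, dtype=np.uint8)


def bcd(bits, b_y=4):
    """Bits -> decimal: inverse of dcb for values inside the representable range."""
    magnitude = 0
    for bit in bits[1:]:
        magnitude = (magnitude << 1) | int(bit)
    value = magnitude / (1 << b_y)
    return -value if bits[0] else value


def awgn_bpsk(bits, ebno_db, rng):
    """BPSK over AWGN with variance xi^2 = (Eb/No)^-1 and hard-decision detection."""
    xi2 = 1.0 / (10.0 ** (ebno_db / 10.0))
    symbols = 1.0 - 2.0 * bits.astype(float)               # bit 0 -> +1, bit 1 -> -1
    received = symbols + rng.normal(0.0, np.sqrt(xi2), size=symbols.shape)
    return (received < 0).astype(np.uint8)


def sample_latent(mu_hat, sigma_hat, rng):
    """Sampling layer at the receiver: z_hat = mu_hat + sigma_hat * eps, eps ~ N(0, 1)."""
    mu_hat = np.asarray(mu_hat, dtype=float)
    sigma_hat = np.asarray(sigma_hat, dtype=float)
    eps = rng.standard_normal(mu_hat.shape)
    return mu_hat + sigma_hat * eps


def receive_parameters(params, ebno_db, rng, b_y=4):
    """Send each decimal parameter through DCB -> channel -> BCD."""
    return np.array([bcd(awgn_bpsk(dcb(p, b_y), ebno_db, rng), b_y) for p in params])
```

With these choices each parameter occupies $2b_y + 1$ bits on the channel. At very high $E_b/N_o$ the received parameters equal the quantized transmitted ones, so the only randomness left in $\hat{z}$ comes from $\epsilon$.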

86840 VOLUME 11, 2023


M. A. Alawad et al.: Innovative VAE for an E2E Communication System

At high $E_b/N_o$ values, $\hat{z} = z$ as a result of eliminating any contamination of the $z$ values due to the AWGN channel effect. This is stated mathematically as:

$$\lim_{E_b/N_o \to \infty} \hat{z}(E_b/N_o) = z, \quad (16)$$

which is the input of the decoder that is transformed back through $f^{-1}: \mathbb{R}^{1 \times c} \to \mathbb{R}^{bK}$ to reconstruct the input symbol $s$ as $\hat{s}$. The transformation can be formulated using the DNN hyperparameters $\theta_R$:

$$\hat{x} = f^{-1}(\tau(f(x, \theta_T)), \theta_R). \quad (17)$$

The DNN consists of one input layer and three transposed 2DConv layers, and a ReLU layer is used to eliminate the negative values at each output. Lastly, a 2DConv is used to reconstruct the transmitted image.

D. SIMPLE DNN IMAGE CLASSIFIER
To classify the final reconstructed symbol $\hat{x} \to \hat{s}_d \in \{1, \ldots, m, \ldots, M\}$, a simple DNN classifier has been used. Fig. 4 shows the architecture of the classifier block, which uses convolution, batch normalization, ReLU and max-pooling layers to extract the features of $\hat{x}$. The classifier output layer learns the final message $\hat{s}_d$ from the output of the previous fully connected and softmax layers, with an output size of $M$ possible messages.

E. PROPOSED NUMERICAL PERFORMANCE MEASUREMENT METHODS FOR THE NEW E2E WIRELESS SYSTEM
To measure the performance of the proposed E2E VAE wireless system, we suggest the following methods:
• $\mathrm{BER}_{E2E}$ definition: This is the ratio of bit errors of the transmitted PHV from transmitter to receiver:

$$\mathrm{BER}_{E2E} = \frac{\sum_{n=1}^{N} \sum_{i=1}^{bK} |x_i - \hat{x}_i|}{N\,bK}, \quad (18)$$

where $N$ is the number of transmitted PHVs (symbols), and $x_i$ and $\hat{x}_i$, with $i \in \{1, \ldots, bK\}$, are the bits produced by converting $x$ and $\hat{x}$ from decimal to binary, respectively.
• $\mathrm{BER}_{PH}$ definition: This is the ratio of bit errors of the transmitted LRV values between the DCB and BCD components:

$$\mathrm{BER}_{PH} = \frac{\sum_{n=1}^{N} \sum_{i=1}^{(2b_y+1)} |DCB(y_i) - DCB(BCD(w_i))|}{N(2b_y + 1)}. \quad (19)$$

• MER definition: This is the ratio of the wrongly classified messages at the receiver to the transmitted ones:

$$\mathrm{MER} = \frac{\sum_{n=1}^{N} (s_d - \hat{s}_d)}{N}. \quad (20)$$

It is important to mention that the SER is the analogue of the proposed MER measurement in classical wireless communication. More discussion regarding this point can be found in Section IV. However, the most important of the three methods is the MER, because it reflects how many of the transmitted messages are finally received correctly, which is the ultimate goal of the proposed system.

III. EXPERIMENT SETUP, E2E WIRELESS SYSTEM TRAINING AND SIMULATION
A. EXPERIMENT SETUP
The main parameters for the VAE, classifier and physical wireless component layers are summarized in Table 2.

TABLE 2. Parameters used for simulations.

B. CLASSIFIER TRAINING
The classifier is trained with stochastic gradient descent with momentum (SGDM) on PHVs under AWGN contamination at $E_b/N_o = 0$ dB to produce the final retrieved sent message. The stochastic gradient descent algorithm can oscillate along the path of the steepest descent towards the optimum. Adding the momentum term with the contribution factor $\Upsilon$ to the parameter update is one way to reduce this oscillation, as in (21). Algorithm 1 describes the classifier training process [52].

$$\theta_c^{l+1} = \theta_c^{l} - \eta_c \nabla L(\theta_c^{l}) + \Upsilon(\theta_c^{l} - \theta_c^{l-1}), \quad (21)$$

where $\theta_c^{l}$ is the vector of weight and bias parameters for the DNN classifier in iteration $l$, $\eta_c$ is the learning rate, and $L(\theta_c^{l})$ is the loss function, while $\nabla L(\theta_c^{l})$ is the gradient of the loss function used to train the entire training set.

C. THE VAE TRAINING
VAE training aims to reconstruct the sent PHV from a meaningful continuous space produced by the ranges of the LRVs $z$, using the ELBO as in:

$$\min_{\theta_V} \mathrm{ELBO} = \mathbb{E}[L(\theta_V) + \beta \times KL(\theta_V)], \quad (22)$$
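To make the objective in (22) concrete, the following minimal sketch evaluates a $\beta$-weighted loss for one PHV, using a squared-error reconstruction term and the closed-form KL divergence between a diagonal Gaussian and a standard normal prior. It is an illustration rather than the authors' code: the names `reconstruction_loss`, `kl_term` and `elbo_loss` are hypothetical, and the sketch assumes that the encoder's second output is the log-variance of each LRV, a common VAE convention that the text here does not confirm.

```python
import numpy as np


def reconstruction_loss(x, x_hat):
    """Half the summed squared error between the sent and reconstructed PHV."""
    x, x_hat = np.asarray(x, dtype=float), np.asarray(x_hat, dtype=float)
    return 0.5 * np.sum((x_hat - x) ** 2)


def kl_term(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over the latent variables.

    Assumption: the variance is parameterized as log-variance.
    """
    mu, log_var = np.asarray(mu, dtype=float), np.asarray(log_var, dtype=float)
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))


def elbo_loss(x, x_hat, mu, log_var, beta=1.0):
    """Loss to minimize: reconstruction term plus beta times the KL term."""
    return reconstruction_loss(x, x_hat) + beta * kl_term(mu, log_var)
```

A perfect reconstruction with $\mu = 0$ and unit variance gives a loss of zero; moving $\mu$ away from zero or the variance away from one increases the KL term, and $\beta$ controls how strongly that deviation is penalized relative to the reconstruction error.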


where the reconstruction loss $L(\theta_V)$ and the KL divergence term $KL(\theta_V)$ in (22) are given by

$$\min_{\theta_V} L(\theta_V) = \sum_{1}^{k} \frac{1}{2} (\hat{x} - x)^2. \quad (23)$$

$$\max_{\theta_V} KL(\theta_V) = -\frac{1}{2} \sum_{1}^{k} \left(1 + \log(\sigma_z) - \mu_z^2 - e^{\sigma_z}\right). \quad (24)$$

However, unlike the existing references, the contamination of the LRVs' inferred parameters occurs at the level of the transmitted binary bits (not the decimal values) while they propagate through the wireless channel, to imitate the practical aspects of the experiment. In addition, the sampling layer has been moved to the receiver side to produce the contaminated LRVs' $z$ values from the received contaminated LRVs' inferred parameters. The optimization algorithm used for the DL network weights is adaptive moment estimation (Adam) with an added momentum term. It keeps an element-wise moving average of both the parameter gradients and their squared values [53]. The VAE training was done at a fixed value of $E_b/N_o = 7$ dB with a learning rate of 0.001 and a batch size of 64. More details about the training set are given in Section IV. Algorithm 2 describes the proposed VAE training process.

Algorithm 1 Classifier Training
1: Initialization: {Epo, Itr, $x$, $N_o$, $\eta_c$, $\Upsilon$, $\theta_c$}, where
   Epo: number of training epochs.
   Itr: number of iterations per epoch.
   $x$: training packet hot vectors.
   $s_d$: the desired output message.
   $N_o$: noise sample $\sim N(0, \xi^2)$.
   $\theta_c$: DNN perceptron weights and biases matrix.
2: for each Epo do
3:   for each Itr do
4:     The input layer passes the PHV values to the 2DConv layer.
5:     The 2DConv layer produces the first feature map.
6:     The output of the 2DConv layer passes through batch normalization to speed up the training and reduce the sensitivity to network initialization. Then the output passes through the ReLU layer to remove any negative values.
7:     To reduce the spatial size of the feature map and redundant spatial information, the ReLU output uses the max-pooling layer to down-sample the input.
8:     Repeat steps 5 to 7 to fine-tune the detection of the important features in the message. (The gradient threshold = $+\infty$)
9:     Apply the SGDM algorithm to optimize $\theta_c$ as in (21) using initial parameters $\eta_c$ (learning rate) and $\Upsilon$ (the momentum contribution factor) to get the gradient $g_{Itr}$:
       $g_{Itr} \leftarrow \nabla_{\theta_c} L(x, \hat{x}, \theta_c)$
10:    Use $g_{Itr}$ to update $\theta_c$ according to [52].
11:  end for
12: end for
13: Output: Return the up-to-date $\theta_c$ and save the DNN "Classifier-PHV" model.

Algorithm 2 VAE E2E Wireless System Training
1: Initialization: {Epo, Itr, $x$, $N_o$, $\eta_V$, $\theta_V$}, where
   Epo: number of training epochs.
   Itr: number of iterations per epoch.
   $x$: training input vector, which also serves as the desired output.
   $N_o$: noise sample $\sim N(0, \xi^2)$.
   $\theta_V$: DNN weights and biases matrix for $\theta_T$ and $\theta_R$.
   $n$: number of re-sampled PHVs at the receiver.
2: for each Epo do
3:   for each Itr do
4:     Use $x$ for the input layer to produce $y = f(x, \theta_T)$.
5:     Use $y$ for the physical wireless layer to produce $w = \tau(y)$.
6:     Use $w$ for the sampling layer input to produce the LRVs' $z$ values using (15).
7:     Use the sampling layer output to reconstruct the PHV: $\hat{x} = f^{-1}(w, \theta_R)$.
8:     Apply (22) to find the ELBO.
9:     Apply the Adam optimization algorithm to optimize $\theta_V$ using initial parameters $\eta_V$ (learning rate), $\lambda_1$ and $\lambda_2$ (the exponential decay rates for the 1st and 2nd moment estimates, respectively) and $\varepsilon_V$ (a small constant value for numerical stability) to get the gradient $g_{Itr}$:
       $g_{Itr} \leftarrow \nabla_{\theta_V} L(x, \hat{x}, \theta_V)$
10:    Use $g_{Itr}$ to update $\theta_V^{Itr}$ according to [53].
11:  end for
12: end for
13: Output: Return the up-to-date $\theta_V$ and save the DNN "VAE-Wireless" model.

D. E2E WIRELESS SYSTEM SIMULATION REALIZATION
Once both the "Classifier" and "VAE-Wireless" models have been trained, the two models are cascaded as in Fig. 4, and then the real data transmission starts. In this experiment, $10^6$ PHVs have been sent from the transmitter through the VAE encoder, the physical wireless component layer and the VAE decoder, and finally pass through the classifier, at each $E_b/N_o$ value under observation. The proposed system has a novel method to re-sample the retrieved message $N$ times using parallel computing techniques and hardware such as graphics processing units (GPUs), and then to find the mode (the data value with the highest count) of the $N$ re-sampled messages $(\hat{s}_d^n)_{n=1}^{N}$:

$$\hat{s}_d = \mathrm{mode}\left((\hat{s}_d^n)_{n=1}^{N}\right). \quad (25)$$

Algorithm 3 describes the realization process.


Algorithm 3 VAE E2E Wireless System Realization
1: Initialization: {$x$, Exp, $E_b/N_o$, $N$, "Classifier" model, "VAE-Wireless" model}, where
   Exp: number of transmitted messages.
   $x$: test PHV for each of the Exp messages.
   $E_b/N_o$: the range of power contamination at the physical wireless layer.
   $N$: number of re-sampled PHVs at the receiver (indexed by $n$).
2: for each $E_b/N_o$ do
3:   for each Exp do
4:     Use $x$ for the input layer to produce $y = f(x, \theta_T)$, where $\theta_T \in$ "VAE-Wireless".
5:     Use $y$ for the physical wireless layer to produce $w = \tau(y)$.
6:     for each $n$ do
7:       Use $w$ for the sampling layer input to produce the LRVs' $z$ values using (15).
8:       Use the sampling layer output to reconstruct the PHV: $\hat{x} = f^{-1}(w, \theta_R)$, where $\theta_R \in$ "VAE-Wireless".
9:       Use $\hat{x}$ as input for the "Classifier-PHV" model.
10:      Provide the final class of the received message $\hat{x} \Rightarrow \hat{s}_d \in \{1, \ldots, m, \ldots, M\}$.
11:    end for
12:    Find $\hat{s}_d = \mathrm{mode}\left((\hat{s}_d^n)_{n=1}^{N}\right)$ for the $N$ re-sampled messages.
13:    Compare $\hat{s}_d$ to $s_d$.
14:  end for
15:  Calculate the MER at the specific $E_b/N_o$.
16: end for
17: Output: Return the MER for all $E_b/N_o$.

FIGURE 8. SER performance with fixed learning rate versus Eb/No for different batch sizes.
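The control flow of Algorithm 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulation: the trained "VAE-Wireless" and "Classifier-PHV" models are replaced by hypothetical toy stand-ins (`toy_encode`, `toy_channel`, `toy_sample_and_classify`), the function names `mode`, `message_error_rate` and `realize` are assumptions, and the MER is computed as the fraction of messages whose final decision differs from the sent one.

```python
from collections import Counter

import numpy as np


def mode(values):
    """Most frequent value among the re-sampled decisions, as in (25)."""
    return Counter(values).most_common(1)[0][0]


def message_error_rate(sent, decided):
    """Fraction of messages whose final decision differs from the sent one."""
    return float(np.mean(np.asarray(sent) != np.asarray(decided)))


def realize(messages, ebno_db_values, n_resamples, encode, channel,
            sample_and_classify, rng):
    """Loop structure of Algorithm 3; returns {Eb/No in dB: MER}."""
    mer = {}
    for ebno_db in ebno_db_values:
        decisions = []
        for s in messages:
            w = channel(encode(s), ebno_db, rng)                 # steps 4-5
            votes = [sample_and_classify(w, rng)                 # steps 6-11
                     for _ in range(n_resamples)]
            decisions.append(mode(votes))                        # step 12
        mer[ebno_db] = message_error_rate(messages, decisions)   # steps 13-15
    return mer


# Toy stand-ins (assumptions, not trained networks): the "encoder" maps a
# message index to [mu, sigma], and the "classifier" rounds the sampled z.
def toy_encode(s):
    return np.array([float(s), 0.1])


def toy_channel(y, ebno_db, rng):
    xi = np.sqrt(1.0 / 10.0 ** (ebno_db / 10.0))   # xi^2 = (Eb/No)^-1
    return y + rng.normal(0.0, xi, size=y.shape)


def toy_sample_and_classify(w, rng):
    z = w[0] + w[1] * rng.standard_normal()        # sampling step of (15)
    return int(round(float(z)))
```

Because the $N$ re-samples of one received $w$ are independent of each other, the inner loop is the part that lends itself to parallel execution, which matches the paper's remark about using GPUs for the re-sampling.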

IV. NUMERICAL RESULTS
In this section, a series of experiments is carried out to evaluate the performance of the proposed approach under various scenarios and to compare it with several benchmarks. In particular, we consider QPSK modulation in AWGN and BPSK modulation with the effects of AWGN, fading, shadowing and Doppler on the model. We compare our results with the commonly used QPSK and BPSK expert modulation schemes, which have long been used [49].

We start this section with the training process by using BPSK modulation in AWGN with the parameter settings recommended by Adam [53]. To begin with, we fix the learning rate to 0.001 and increase the batch size from 32 to 128. From the simulation results shown in Fig. 8, we can see that all the curves have a similar trend, but the curve for batch size = 64 is smoother and more stable than the other curves. This is due to the effect of underfitting and overfitting the data while calculating the loss function at the training stage [54]. As a result, we choose batch size = 64 in our training process.

Next, it is important to choose appropriate learning parameters. The parameters are adjusted by observing the SER values as shown in Fig. 8 and Fig. 9. In this case, we fix the batch size to 64 and increase the learning rate from 0.0001 to 0.01. The lowest SER is obtained with a learning rate of 0.001, which is the value we used in our training procedure. The results using $\eta_V = 0.0001$ show a deterioration in SER, as the search for the optimal solution required more iterations than were used (in this work: 300 iterations/epoch × 50 epochs = 1500 iterations).

On the other hand, choosing $\eta_V = 0.01$ produces results between the other choices, due to utilising the iterations but with less resolution in the loss function [55]. Similarly, with a fixed learning rate and different batch sizes, we observe a similar trend in SER. As the $E_b/N_o$ increases, the SER constantly decreases.

FIGURE 9. SER performance with fixed batch size versus Eb/No for different learning rates.

Having established the feasible learning parameters, we simulate the performance of the proposed algorithm as follows:


A. BPSK CHANNEL
The numerically computed SER values versus $E_b/N_o \in [0, 9]$ dB with BPSK modulation in AWGN are depicted in Fig. 10. The proposed VAE with two LRVs is capable of reconstructing the transmitted message by only sending the LRVs' parameters $(\mu_z, \sigma_z)$, and the MER (in our work, MER = SER) decreases when the $E_b/N_o$ increases, as the green curve shows. As the AE and VAE state-of-the-art articles assume that the encoder output has decimal output values only, we used a Hamming code to add protection and correction to the binary transmitted values of the encoder output, after converting it to binary, by adding two bits for each transmitted bit over the physical layer. However, when comparing the numerical performance of the VAE SER with the theoretical Hamming (3,1) decoded by the hard-decision method, our proposed VAE outperforms Hamming (3,1), as shown in Fig. 10. Furthermore, even if Hamming (3,1) decoded by the soft-decision method performs better until $E_b/N_o = 2$ dB, the VAE outperforms this scheme at $E_b/N_o > 2$ dB. From this result, we observe that the VAE at low $E_b/N_o$ cannot outperform the optimal soft-decision scheme, as it does not learn the distribution of the LRVs properly, which is one of our research findings. Moreover, the proposed VAE outperforms the hard-decision-decoded Golay scheme with a semi-constant (parallel) gap of 0.5 dB on average.

FIGURE 10. SER performance of the BPSK VAE vs baseline schemes.

Moreover, comparing the performance of the VAE SER to the classical AE [3], the dashed curves show that at the same number of channels used to transmit the encoder outputs (AE(1,4) in brown), the proposed VAE outperforms the AE scheme, as shown in Fig. 11. However, as the number of channels of the AE increases, the performance gap decreases, as shown by the blue dashed curve AE(7,4) in comparison to the amber curve (VAE with 2 LRVs), which means that the VAE uses fewer channels than the classical AE to achieve the same SER numerical performance.

FIGURE 11. SER performance of the BPSK VAE vs AE baseline schemes.

B. QPSK CHANNEL
Fig. 12 shows a similar comparison, but for a higher-order modulation scheme, specifically quadrature phase shift keying (QPSK) under an AWGN channel, between the classical AE [3] and the proposed VAE. This result shows that the proposed VAE with different modulations (BPSK and QPSK) achieves better performance than the classical AE. Notice that even the QPSK VAE still performs better than AE (7,4) at low $E_b/N_o$, as in [3].

FIGURE 12. SER performance BPSK and QPSK of the VAE scheme under AWGN vs AE baseline schemes.

C. RAYLEIGH FADING CHANNEL
The numerically computed SER values versus $E_b/N_o \in [0, 20]$ dB with Rayleigh fading are depicted in Fig. 13. The proposed VAE with two LRVs is capable of reconstructing the transmitted message by only sending the LRVs' parameters $(\mu_z, \sigma_z)$, and the SER decreases as the $E_b/N_o$ increases, as in Fig. 13. As with BPSK VAE SER performance, when comparing


the VAE SER numerical performance with the theoretical Rayleigh [49], our proposed VAE with Rayleigh outperforms the theoretical one.

FIGURE 13. SER performance of the BPSK VAE scheme under AWGN vs Rayleigh channel.

D. RICIAN FADING CHANNEL
The numerically computed SER values versus $E_b/N_o \in [0, 16]$ dB with Rician fading are shown in Fig. 14. The proposed VAE with two LRVs is capable of reconstructing the transmitted message by only sending the LRVs' parameters $(\mu_z, \sigma_z)$, and the MER (in the BPSK case, MER = SER) decreases as the $E_b/N_o$ increases, as shown in Fig. 14. The numerical performance of the VAE SER is shown for different Rician factors $K$, which measure the relative strength of the line-of-sight (LoS) component and hence the severity of fading, with $K = 2$ being the most severe case of fading (very close to Rayleigh fading), while $K = 14$ represents almost no fading, and $K = 7$ lies in between. Our proposed VAE with Rician fading performs better as the value of $K$ increases, until it gets close to the performance of the AWGN channel.

FIGURE 14. SER performance of the BPSK VAE scheme under AWGN vs Rician channel.

E. SHADOWING EFFECT
Fig. 15 shows the proposed BPSK VAE SER performance compared to the VAE with shadowing behaviour regarding the $\sigma$ of lognormal fading for different values of $E_b/N_o$. In this figure, it is possible to observe that increasing $E_b/N_o$ in the presence of the shadowing effect decreases the SER.

FIGURE 15. SER performance of the BPSK VAE with Shadowing σ = 4.5 dB.

FIGURE 16. SER performance of the Doppler effect of the BPSK VAE with different phase and frequency offsets.

F. DOPPLER EFFECT
Fig. 16 presents the proposed VAE with a variation of the Doppler shift value under the non-stationary case. From the simulated results, we can notice that the SER increases as the Doppler shift increases if we assume that both transmitter


and receiver are moving along the same axis with different phase offsets of 5° and 45°, which demonstrates that increasing mobility causes an SER increment compared to the stationary scenario.

G. FURTHER RESULTS DISCUSSION
We extend our experiment to investigate the shadowing, Rayleigh and Rician fading channels in addition to the AWGN. Moreover, the QPSK modulation under AWGN has been used to investigate the possibility of applying higher modulation schemes, which provides promising insight. However, this work is limited to the proof of the proposed concept, in which short packets can be transmitted through a wireless E2E VAE-based system. Further work can be conducted to find the SER performance for 64PSK and 128PSK. Furthermore, the Doppler effect has been added to the experiment to show that the proposed design has potential for non-stationary scenarios, and it requires the study of more varying channel parameters in order to overcome the limitation of this experiment from the perspective of transmitter-receiver mobility.

V. CONCLUSION
This paper introduced a novel approach using the VAE as a probabilistic model to reconstruct the transmitted symbol by transmitting the statistical parameters of the LRVs through the physical layer instead of sending the data bits of the original symbol out of the transmitter. We show significantly improved PHV or SER performance compared to the baseline Hamming code with hard-decision decoding and the classical AE E2E, where increasing the $E_b/N_o$ improves the SER of the proposed system in comparison to the baseline schemes. In addition, the proposed VAE shows promising channel utilization efficiency in comparison to the classical AE, where the results show that the VAE with two channels (BPSK and QPSK) under AWGN outperforms the classical AE with 4- and 7-channel schemes. Moreover, the performance of the proposed approach in the presence of fading (Rayleigh, Rician and shadowing) is promising too, as the results show the performance improvement towards the BPSK VAE SER. Furthermore, other cases such as the Doppler effect have been simulated and discussed, showing that the proposed model can be generalized to the case in which the LRVs' parameters are transmitted, rather than the original bits. Our findings illustrate the importance of using the VAE approach and may inspire other researchers to use a similar approach for future communication systems. Nevertheless, while we are concentrating on the proof of the proposed concept, there are some limitations to our work; further work can be conducted to find the SER performance for 64PSK and 128PSK. In addition, this paper probes the applications of the proposed design for a non-stationary case. Further investigation is required for both high mobility and higher modulation schemes to find how such limitations can be overcome.

REFERENCES
[1] M. E. Morocho-Cayamcela, H. Lee, and W. Lim, "Machine learning for 5G/B5G mobile and wireless communications: Potential, limitations, and future directions," IEEE Access, vol. 7, pp. 137184–137206, 2019.
[2] M. E. M. Cayamcela and W. Lim, "Artificial intelligence in 5G technology: A survey," in Proc. Int. Conf. Inf. Commun. Technol. Converg. (ICTC), Oct. 2018, pp. 860–865.
[3] T. O'Shea and J. Hoydis, "An introduction to deep learning for the physical layer," IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 4, pp. 563–575, Dec. 2017.
[4] X. Jin and H.-N. Kim, "Deep learning detection in MIMO decode-forward relay channels," IEEE Access, vol. 7, pp. 99481–99495, 2019.
[5] E. Bourtsoulatze, D. B. Kurka, and D. Gündüz, "Deep joint source-channel coding for wireless image transmission," IEEE Trans. Cogn. Commun. Netw., vol. 5, no. 3, pp. 567–579, Sep. 2019.
[6] E. Nachmani, Y. Be'ery, and D. Burshtein, "Learning to decode linear codes using deep learning," in Proc. 54th Annu. Allerton Conf. Commun., Control, Comput. (Allerton), Sep. 2016, pp. 341–346.
[7] E. Nachmani, E. Marciano, L. Lugosch, W. J. Gross, D. Burshtein, and Y. Be'ery, "Deep learning methods for improved decoding of linear codes," IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 119–131, Feb. 2018.
[8] T. Gruber, S. Cammerer, J. Hoydis, and S. T. Brink, "On deep learning-based channel decoding," in Proc. 51st Annu. Conf. Inf. Sci. Syst. (CISS), Mar. 2017, pp. 1–6.
[9] S. Cammerer, T. Gruber, J. Hoydis, and S. T. Brink, "Scaling deep learning-based decoding of polar codes via partitioning," in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2017, pp. 1–6.
[10] M. A. Alawad and K. A. Hamdi, "A deep learning-based detector for IM-MIMO-OFDM," in Proc. IEEE 94th Veh. Technol. Conf. (VTC-Fall), Sep. 2021, pp. 1–5.
[11] N. Samuel, T. Diskin, and A. Wiesel, "Deep MIMO detection," in Proc. IEEE 18th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jul. 2017, pp. 1–5.
[12] N. Farsad and A. Goldsmith, "Detection algorithms for communication systems using deep learning," 2017, arXiv:1705.08044.
[13] D. Neumann, T. Wiese, and W. Utschick, "Learning the MMSE channel estimator," IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2905–2917, Jun. 2018.
[14] H. Ye, G. Y. Li, and B.-H. Juang, "Power of deep learning for channel estimation and signal detection in OFDM systems," IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.
[15] T. J. O'Shea, T. Erpek, and T. C. Clancy, "Deep learning based MIMO communications," 2017, arXiv:1707.07980.
[16] T. J. O'Shea, K. Karra, and T. C. Clancy, "Learning to communicate: Channel auto-encoders, domain specific regularizers, and attention," in Proc. IEEE Int. Symp. Signal Process. Inf. Technol. (ISSPIT), Dec. 2016, pp. 223–228.
[17] Z. Qin, H. Ye, G. Y. Li, and B.-H. F. Juang, "Deep learning in physical layer communications," IEEE Wireless Commun., vol. 26, no. 2, pp. 93–99, Apr. 2019.
[18] M. A. Alawad, M. Q. Hamdan, and K. A. Hamdi, "End-to-end deep learning IRS-assisted communications systems," in Proc. IEEE 94th Veh. Technol. Conf. (VTC-Fall), Sep. 2021, pp. 1–6.
[19] M. Q. Hamdan and K. A. Hamdi, "Variational auto-encoders application in wireless vehicle-to-everything communications," in Proc. IEEE 91st Veh. Technol. Conf. (VTC-Spring), May 2020, pp. 1–6.
[20] D. J. Rezende, S. Mohamed, and D. Wierstra, "Stochastic backpropagation and approximate inference in deep generative models," in Proc. Int. Conf. Mach. Learn., Jun. 2014, pp. 1278–1286.
[21] A. Felix, S. Cammerer, S. Dörner, J. Hoydis, and S. Ten Brink, "OFDM-autoencoder for end-to-end learning of communications systems," in Proc. IEEE 19th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jun. 2018, pp. 1–5.
[22] F. A. Aoudia and J. Hoydis, "Model-free training of end-to-end communication systems," IEEE J. Sel. Areas Commun., vol. 37, no. 11, pp. 2503–2516, Nov. 2019.
[23] V. Raj and S. Kalyani, "Backpropagating through the air: Deep learning at physical layer without channel models," IEEE Commun. Lett., vol. 22, no. 11, pp. 2278–2281, Nov. 2018.
[24] H. Ye, G. Y. Li, B.-H.-F. Juang, and K. Sivanesan, "Channel agnostic end-to-end learning based communication systems with conditional GAN," in Proc. IEEE Globecom Workshops (GC Wkshps), Dec. 2018, pp. 1–5.


[25] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. Y. Arcas, "Communication-efficient learning of deep networks from decentralized data," in Proc. Artif. Intell. Statist., Apr. 2017, pp. 1273–1282.
[26] P. Kairouz et al., "Advances and open problems in federated learning," Found. Trends Mach. Learn., vol. 14, nos. 1–2, pp. 1–210, 2021.
[27] P. Bellavista, L. Foschini, and A. Mora, "Decentralised learning in federated deployment environments: A system-level survey," ACM Comput. Surv., vol. 54, no. 1, pp. 1–38, Apr. 2021.
[28] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017.
[29] A. M. Abdelhameed and M. Bayoumi, "Semi-supervised EEG signals classification system for epileptic seizure detection," IEEE Signal Process. Lett., vol. 26, no. 12, pp. 1922–1926, Dec. 2019.
[30] X. Chen, Y. Sun, M. Zhang, and D. Peng, "Evolving deep convolutional variational autoencoders for image classification," IEEE Trans. Evol. Comput., vol. 25, no. 5, pp. 815–829, Oct. 2021.
[31] J. Klys, J. Snell, and R. Zemel, "Learning latent subspaces in variational autoencoders," in Proc. Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 31. Red Hook, NY, USA: Curran Associates, 2018, pp. 1–15.
[32] P. Cristovao, H. Nakada, Y. Tanimura, and H. Asoh, "Generating in-between images through learned latent space representation using variational autoencoders," IEEE Access, vol. 8, pp. 149456–149467, 2020.
[33] J. Bao, D. Chen, F. Wen, H. Li, and G. Hua, "CVAE-GAN: Fine-grained image generation through asymmetric training," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 2764–2773.
[34] A. Hawkins-Hooker, F. Depardieu, S. Baur, G. Couairon, A. Chen, and D. Bikard, "Generating functional protein variants with variational autoencoders," PLOS Comput. Biol., vol. 17, no. 2, Feb. 2021, Art. no. e1008736.
[35] J. Jagannath, N. Polosky, A. Jagannath, F. Restuccia, and T. Melodia, "Machine learning for wireless communications in the Internet of Things: A comprehensive survey," Ad Hoc Netw., vol. 93, Oct. 2019, Art. no. 101913.
[36] A. A. Pourzanjani, R. M. Jiang, B. Mitchell, P. J. Atzberger, and L. R. Petzold, "Bayesian inference over the stiefel manifold via the Givens representation," Bayesian Anal., vol. 16, no. 2, pp. 639–666, Jun. 2021.
[37] Q. Liu and D. Wang, "Stein variational gradient descent: A general purpose Bayesian inference algorithm," in Proc. Adv. Neural Inf. Process. Syst., vol. 29, 2016, pp. 1–9.
[38] Q. Liu, "Stein variational gradient descent as gradient flow," in Proc. Adv. Neural Inf. Process. Syst., vol. 30, 2017, pp. 1–9.
[39] C. Liu, J. Zhuo, P. Cheng, R. Zhang, and J. Zhu, "Understanding and accelerating particle-based variational inference," in Proc. Int. Conf. Mach. Learn., May 2019, pp. 4082–4092.
[40] C. E. Shannon, The Mathematical Theory of Communication, W. Weaver, Ed. Champaign, IL, USA: Univ. of Illinois Press, 1949.
[41] H. Ye, L. Liang, G. Y. Li, and B.-H. F. Juang, "Deep learning-based end-to-end wireless communication systems with conditional GANs as unknown channels," IEEE Trans. Wireless Commun., vol. 19, no. 5, pp. 3133–3143, May 2020.
[42] S. Dörner, S. Cammerer, J. Hoydis, and S. T. Brink, "Deep learning based communication over the air," IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 132–143, Feb. 2018.
[43] E. Kodirov, T. Xiang, and S. Gong, "Semantic autoencoder for zero-shot learning," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 3174–3183.
[44] H. Xie, Z. Qin, G. Y. Li, and B.-H. Juang, "Deep learning enabled semantic communication systems," IEEE Trans. Signal Process., vol. 69, pp. 2663–2675, 2021.
[45] X. Luo, Z. Chen, B. Xia, and J. Wang, "Autoencoder-based semantic communication systems with relay channels," 2021, arXiv:2111.10083.
[46] D. P. Kingma and M. Welling, "Auto-encoding variational Bayes," 2013, arXiv:1312.6114.
[47] D. P. Kingma and M. Welling, "An introduction to variational autoencoders," Found. Trends Mach. Learn., vol. 12, no. 4, pp. 307–392, 2019.
[48] T. M. Mitchell, Machine Learning, vol. 1, no. 9. New York, NY, USA: McGraw-Hill, 1997.
[49] J. Oetting, "A comparison of modulation techniques for digital radio," IEEE Trans. Commun., vol. COM-27, no. 12, pp. 1752–1762, Dec. 1979.
[50] C. Neipp, A. Hernández, J. J. Rodes, A. Márquez, T. Beléndez, and A. Beléndez, "An analysis of the classical Doppler effect," Eur. J. Phys., vol. 24, no. 5, p. 497, 2003.
[51] V. Raj and S. Kalyani, "Design of communication systems using deep learning: A variational inference perspective," IEEE Trans. Cogn. Commun. Netw., vol. 6, no. 4, pp. 1320–1334, Dec. 2020.
[52] K. P. Murphy, Machine Learning: A Probabilistic Perspective. Cambridge, MA, USA: MIT Press, 2012.
[53] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," 2014, arXiv:1412.6980.
[54] H. Zhang, L. Zhang, and Y. Jiang, "Overfitting and underfitting analysis for deep learning based end-to-end communication systems," in Proc. 11th Int. Conf. Wireless Commun. Signal Process. (WCSP), Oct. 2019, pp. 1–6.
[55] X. Mao, Q. Li, H. Xie, R. Y. K. Lau, Z. Wang, and S. P. Smolley, "On the effectiveness of least squares generative adversarial networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 12, pp. 2947–2960, Dec. 2019, doi: 10.1109/TPAMI.2018.2872043.

MOHAMAD A. ALAWAD (Student Member, IEEE) received the B.Sc. degree in electrical engineering from Qassim University, Qassim, Saudi Arabia, in 2010, and the M.Sc. degree in electrical engineering from the Rochester Institute of Technology, Rochester, NY, USA, in 2016. He is currently pursuing the Ph.D. degree with the Department of Electrical and Electronic Engineering, The University of Manchester, Manchester, U.K. His research interests include 5G wireless communication and machine learning for 5G and beyond 5G network technology.

MUTASEM Q. HAMDAN (Student Member, IEEE) received the B.Eng. degree from The University of Jordan, in 2004, and the M.Sc. degree in broadband communications from Lancaster University, in 2011. He is currently pursuing the Ph.D. degree in electrical and electronic engineering with The University of Manchester. He was an Electrical Engineer. He worked internationally at many telecom infrastructure implementations for 2G/3G Motorola, NEC, and Ericsson RAN technologies, from 2004 to 2010. He joined as a Solutions Architect with the Handsfree Group for telematics and fleet management company at Manchester, where he worked on different telecom technologies and deployed 4G, 4G+, Huawei and Ericsson RAN technologies, 3/4G Cameras. At the end of 2021, he started working as a Research Fellow in 5G and beyond 5G network technology and research time synchronization reliability and resilience at the University of Surrey, in addition to machine learning for open RAN intelligent controller (RIC).

KHAIRI A. HAMDI (Senior Member, IEEE) received the B.Sc. degree in electrical engineering from Alfateh University, Tripoli, Libya, in 1981, the M.Sc. degree (Hons.) from the Technical University of Budapest, Budapest, Hungary, in 1988, and the Ph.D. degree in telecommunication engineering from the Hungarian Academy of Sciences, Budapest, in 1993. Previously, he held research and academic posts with the Department of Computer Science, The University of Manchester, and the Department of Electronic Systems Engineering, University of Essex. He was a BT Research Fellow, in Summer 2002, and was a Visiting Assistant Professor with Stanford University, during the academic year 2007–2008. He is currently a Senior Lecturer with The University of Manchester. His current research interests include modeling and performance analysis of wireless communication systems, and networks.
