Professional Documents
Culture Documents
Abstract—In this paper, we investigate the problem of robust designs (e.g., different RISs, propagation environments, users
Reconfigurable Intelligent Surface (RIS) phase-shifts configura- distribution, etc.). Moreover, these approaches produce a high
tion over heterogeneous communication environments. The prob- inference accuracy within a fixed environment, from which
lem is formulated as a distributed learning problem over different
arXiv:2306.01306v1 [cs.LG] 2 Jun 2023
environments in a Federated Learning (FL) setting. Equivalently, the training and testing data are drawn. They subsequently
this corresponds to a game played between multiple RISs, as fail to have a good Out-of-Distribution (OoD) generalization
learning agents, in heterogeneous environments. Using Invariant in unseen environments.
Risk Minimization (IRM) and its FL equivalent, dubbed FL Although FL provides a learning framework where multiple
Games, we solve the RIS configuration problem by learning agents train a collaborative model while preserving privacy,
invariant causal representations across multiple environments
and then predicting the phases. The solution corresponds to its state-of-the-art approach, such as Federated Averaging
playing according to Best Response Dynamics (BRD) which yields (F EDAVG) [8], is known to perform poorly when the local
the Nash Equilibrium of the FL game. The representation learner data is non-identical across participating agents. This is due
and the phase predictor are modeled by two neural networks, to the fact that F EDAVG (and its variants) solve the distributed
and their performance is validated via simulations against other learning problem via Empirical Risk Minimization (ERM),
benchmarks from the literature. Our results show that causality-
based learning yields a predictor that is 15% more accurate in that minimizes the empirical distribution of the local losses
unseen Out-of-Distribution (OoD) environments. assuming that the data is identically distributed. To mitigate
Index Terms—Reconfigurable Intelligent Surface (RIS), Fed- this issue, the authors in [9] leveraged Distributionally Robust
erated Learning, Causal Inference, Invariant Learning. Optimization (DRO) [10] and proposed a federated DRO for
RIS configurations. Therein, the distributed learning problem
I. I NTRODUCTION is cast as a minimax problem, where the model’s parameters
The advent of Reconfigurable Intelligent Surfaces (RISs) are updated to minimize the maximal weighted combination
will substantially boost the performance of wireless commu- of local losses. This ensures a good performance for the
nication systems. These surfaces are manufactured by layering aggregated model over the worst-case combination of local
stacks of sheets made out of engineered materials, called distributions.
meta-materials, built on a planar structure. The reflection On the other hand, [11] used Invariant Risk Minimization
coefficients of the meta-material elements, called meta-atoms, (IRM) [12] to formulate the problem of learning optimal RIS
vary depending on their physical states. Thus, the direction of phase-shifts. The aim of IRM is to capture causal dependencies
incident electromagnetic waves on RISs can be manipulated in the data, that are invariant across different environments. In
with the aid of simple integrated circuit controllers that modify [11], the authors empirically showed that using relative dis-
meta-atoms’ states. In this view, the RIS technology provides tances and angles between the RIS and the transmitter/receiver
a partial control over the wireless propagation environment as causal representations for the CSI, improves the robustness
rendering improved spectral efficiency with a minimal power of the RIS phase predictor. However, these representations
footprint [1]. RIS is considered a fundamental enabler to were not learned by the configuration predictor, but were
achieve the 6G vision of smart radio environments [2]. predefined and fixed. Also, this solution assumes that multiple
One of the major challenges in the RIS technology is the environments are known to the predictor beforehand, which is
accurate tuning of RIS phases. To this extent, a vast majority an unfeasible assumption.
of the existing literature on RIS-assisted communication relies The main contribution of this paper is a novel distributed
on the use of Channel State Information (CSI) to train Machine IRM-based solution to the RIS configuration problem. We cast
Learning (ML) models that predict the optimal RIS configura- the problem of RIS phase control as a federated learning prob-
tion [3]–[5]. These methods seek either a centralized-controller lem with multiple RISs controllers defined over heterogeneous
driven approach, or a distributed multi-agent optimization environments, using a game-theoretic reformulation of IRM,
technique, such as Federated Learning (FL). Other works such referred to as Federated Learning Games (FL G AMES) [13],
as [6], [7] exploit the users’ locations to employ a location- [14]. The solution of this problem is proven to be the Nash
based passive RIS beamforming. Either way, their main focus Equilibrium of a strategic game where each RIS updates its
is to draw on the statistical correlations of the observed configuration predictor by minimizing its local loss function.
data, while overlooking the impacts of heterogeneous system This game is indexed by a representation learner that is
Representation
Learner
Predictor by denoting φr and ϑr as the azimuth and elevation angles-
RIS 1 RIS 2 of-departure (AoD) from the RIS respectively, the channel is
h modeled as
h √
g = αr aN (φr , ϑr ) , (1)
Transmitter
g g Transmitter
For our experiments, we consider three different environ- Broadcast ϕ to all agents r ∈ R
ments R. In all three environments, the Tx is located at the for each agent r ∈ R parallel do
coordinate (0, 35, 3) and the RIS comprising N = 10 × 10 Sample mini-batch Br of size m from Dr
reflective elements is at (10, 20, 1), with the coordinates given Update predictor:
in meters within a Cartesian system. The Rx is located in wr ← wr − ηw ∇wr ℓr (fwav ◦ fϕ ; Br )
an annular region around the RIS with inner and outer radii Communicate wr to the server
Rmin = 1 m and Rmax = 5 m. The Rician factor κ is set to 5, end for
Average Predictor: wav ← r∈R P Dr Dn wk
P
N dx dy
and the pathloss coefficients are calculated by αt = 4πd 2 and n∈R
Nd d x y
t
Broadcast {wr }r∈R to all agents r ∈ R
αr = 4πd 2 . To simplify the exhaustive search for optimal
r round ← round + 1
phases, we assume C contains two configurations classes,
T T end while
namely θ1 = [0, 0, . . . , 0] and θ2 = [0, π, 0, π, . . . , 0, π] .
In environment 1, the RIS elements are distanced by dx =
Environment 1 Environment 2
dy = 0.5λ. Therein, the Rx is uniformly placed around the RIS 2 2
Rmax-Rmin 1 Rmax-Rmin 1
with a tendency to be deployed closer to the RIS following 2p 2p
the distributions illustrated under Environment 1 in Fig. 2. In pr pq pr pq
environment 2, the RIS is characterized by dx = dy = 0.25λ. 0 0 0
0 2p
Rmin Rmax 0 2p Rmin Rmax 0
In contrast to environment 1, here, the Rx is higher likely r q r q
to be placed far from the RIS (see the distributions under Environment 3
2
Environment 2 in Fig. 2). The RIS in environment 3 has dx = Rmax-Rmin 1
p
dy = 0.4λ. Therein, the Rx’s distance to the RIS is uniform pr pq
but the angle distribution is concentrated in one direction as
0 0
illustrated in Fig. 2 under Environment 3. Note that the data Rmin r Rmax 0
q
2p
from environments 1 and 2 are used to compose the training
data while environment 3 is used only for testing. Fig. 2: Distributions of the receiver’s position (distance r and
In this work, we leverage V-FL G AMES that exhibits a angle θ from the RIS) in different environments.
superior performance than F-FL G AMES (where fϕ = I). On Representation Learner Predictor
the other hand, the authors in [11] showed by empirical 128
64 64
simulations, due to the fact that RIS channels have strong LoS
h 32 32
components, that one can select fixed causal representations 2
based on the channels in (1) and (5). The causal representa-
RIS
tions in this case are the AoA and AoD at the RIS, (φt , ϑt ) configuration
and (φr , ϑr ), and the relative distances RIS-Tx dt and RIS-Rx softmax
Test Accuracy
1500 83.3%
70 70 79.1%
80.4% 66.7%
60 68.7%
60 FedAvg - Same environment 1000
F-FLGames - Same environment
50 V-FLGames - Same environment
50 FedAvg - Different environment 500
0 250 500 F-FLGames - Different environment
V-FLGames - Different environment
40 0
0 500 1000 1500 2000 2500 3000 4 Agents 8 Agents
Communication Rounds Agents per environment
(a) Test accuracy convergence for all algorithms compared (e) Communication rounds needed for convergence with
in the same or in a different environment. multiple agents per environment. The number on top of
the bar is the corresponding test accuracy.
85
0.55
80
80
Test Accuracy
70
65 0.40
70
60 0.35
Best V-FLGames
55 F-FL Games 65 F-FL Games F-FLGames Random
0.30
V-FL Games V-FL Games FedAvg
50 FedAvg FedAvg
0.25
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 2 3 4 5 7 10 12 15 100 250 500 750 1000 1250
Weight E Local Steps Samples per Environment
(b) Impact of the weight αe on the (c) Impact of number of local itera- (d) Impact of the number of samples per envi-
testing accuracy. tions on the testing accuracy. ronment on the achievable spectral efficiency.
Fig. 4: Performance comparison of different algorithms in OoD testing datasets.
is zero and the normalized variance is one. We also record implying they do not overfit to the statistical correlations in
the parameters (φt , ϑt , φr , ϑr , dt , dr ) to train F-FL G AMES, the channels. It is also interesting to observe the convergence
which are scaled using a minmax scaler that normalizes rate of the different methods. We notice that V-FL G AMES
the data to the interval [−1, 1] by dividing by the absolute converges faster than F-FL G AMES, requiring around 500
maximum. Unless stated otherwise, we use 1000 samples communication epochs in both cases.
collected from environment 3 for testing. The design of the Fig. 4(b) demonstrates the impact of mixing data from
neural networks of the extractor and the predictor of V-FL different environments on model training. Here, we keep the
G AMES is based on the multi-layer perceptrons architecture, total number samples to be D1 + D2 = 3000 and vary the
and is shown in Fig. 3. Note that F EDAVG and F-FL G AMES amount of data from the two environments, in which, αe = D D2
1
only use the predictor part. The considered loss function is represents the fraction of samples of environment 1 compared
the cross-entropy. The mini-batch size used for training the to environment 2. We first notice that all algorithms reach
predictor is m = 32, and the learning rates are fixed at their peak performance with balanced datasets, i.e. αe = 0.5.
ηϕ = 5 × 10−4 and ηw = 2 × 10−3 . In the figures, lines When the datatsets are biased towards one of the environments,
correspond to the simulation results that are averaged over five F EDAVG loses 15% of its performance. On the other hand,
runs while the their respective standard deviations are shown F-FL G AMES and V-FL G AMES maintain a steady accuracy
as shaded areas. with a slight degradation of about 3–4%.
Inspired by F EDAVG’s flexibility in allowing more local
B. Discussion computations at each agent before sharing their models, we
We first plot the evolution of the testing accuracy of all modify our proposed algorithm to study the effect of the
algorithms in Fig. 4(a). Within the same environment (En- number of local iterations on the model accuracy. Insofar, each
vironments 1 and 2), F EDAVG and V-FL G AMES perform agent performs a few SGD updates locally prior to model
similarly with an accuracy of 90%, while F-FL G AMES gives sharing. Fig. 4(c) shows the impact of the number of local
an accuracy of 87%. When testing in a different environment steps on the testing accuracy. It can be noted that the accuracy
(Environment 3), F EDAVG’s accuracy drops to 68%, high- drops when each RIS performs more local computations
lighting its lack of robustness. In this case, F-FL G AMES and when using FL G AMES due to the fact that the models are
V-FL G AMES yield slightly lower accuracies of 85% and 80%, overfitting to the local datasets. However, the performance of
F-FL G AMES is more consistent with the local steps, losing fashion, and the results are compared with the environment-
only 2% of accuracy with seven local iterations, while V-FL unaware F EDAVG and an IRM-based predictor. The numerical
G AMES loses around 8% of accuracy with 15 local updates. results show that a phase predictor trained with the geometric
The reason for this behavior is that by letting each agent properties of the environments demonstrated a better perfor-
run over more samples from its local dataset, the testing mance than a representation learner followed by a predictor.
accuracy at equilibrium decreases, since the played strategies Moreover, the extractor-predictor network exhibits faster train-
do not account for the opponents’ actions. On the other hand, ing convergence when using more RISs. The extensions for
the accuracy of F EDAVG slightly increases with more local multiple users and multiple antennas at Tx and Rx are left for
iterations, but still performs poorly. future works.
The effect of the dataset size Dr per agent on the achievable
R EFERENCES
spectral efficiency is illustrated in Fig. 4(d). Note that all
agents use an equal amount of samples, i.e., αe = 0.5 is held. [1] C. Huang, A. Zappone, G. C. Alexandropoulos, M. Debbah, and
C. Yuen, “Reconfigurable intelligent surfaces for energy efficiency in
For the comparison, we additionally present the optimal rate wireless communication,” IEEE transactions on wireless communica-
given by the best configurations (indicated by Best) and the tions, vol. 18, no. 8, pp. 4157–4170, 2019.
rates given by random phase decision making (indicated by [2] M. D. Renzo, M. Debbah, D.-T. Phan-Huy, A. Zappone, M.-S. Alouini,
C. Yuen, V. Sciancalepore, G. C. Alexandropoulos, J. Hoydis et al.,
Random). All algorithms reach their best performance with “Smart radio environments empowered by reconfigurable ai meta-
the highest number of samples Dr = 1250 with about 21%, surfaces: An idea whose time has come,” EURASIP Journal on Wireless
14% and 32% losses compared to Best rates in V-FL G AMES, Communications and Networking, vol. 2019, no. 1, pp. 1–20, 2019.
[3] Ö. Özdoğan and E. Björnson, “Deep learning-based phase reconfigura-
F-FL G AMES, and F EDAVG, respectively. The advantage of tion for intelligent reflecting surfaces,” in 2020 54th Asilomar Confer-
learning invariant causal representations with minimal amount ence on Signals, Systems, and Computers. IEEE, 2020, pp. 707–711.
of data is highlighted when Dr ≤ 750. F EDAVG looses its [4] K. Stylianopoulos, N. Shlezinger, P. Del Hougne, and G. C. Alexan-
dropoulos, “Deep-learning-assisted configuration of reconfigurable in-
performance rapidly. Even with 100 samples per environment, telligent surfaces in dynamic rich-scattering environments,” in ICASSP
FL G AMES algorithms lose only 6% of their performance, 2022-2022 IEEE International Conference on Acoustics, Speech and
while F EDAVG incurs more than 15% of its accuracy. F E - Signal Processing (ICASSP). IEEE, 2022, pp. 8822–8826.
[5] G. C. Alexandropoulos, K. Stylianopoulos, C. Huang, C. Yuen, M. Ben-
DAVG requires around 750 samples per agent to reach its nis, and M. Debbah, “Pervasive machine learning for smart radio en-
best performance, that is more than 10% less than that given vironments enabled by reconfigurable intelligent surfaces,” Proceedings
by V-FL G AMES, underscoring its weakness in OoD settings. of the IEEE, vol. 110, no. 9, pp. 1494–1525, 2022.
[6] J. Park, S. Samarakoon, H. Shiri, M. K. Abdel-Aziz, T. Nishio,
Additionally, with 1250 samples per environment, the error A. Elgabli, and M. Bennis, “Extreme ultra-reliable and low-latency
variance of V-FL G AMES and F-FL G AMES is 73% and 98% communication,” Nature Electronics, vol. 5, no. 3, pp. 133–141, 2022.
less than that of F EDAVG. [7] G. C. Alexandropoulos, S. Samarakoon, M. Bennis, and M. Debbah,
“Phase configuration learning in wireless networks with multiple recon-
Finally, we vary the number of agents per environment figurable intelligent surfaces,” in 2020 IEEE Globecom Workshops (GC
as shown in Fig. 4(e). For this experiment, 1500 samples Wkshps. IEEE, 2020, pp. 1–6.
[8] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas,
from each environment are shared among all agents, so more “Communication-efficient learning of deep networks from decentralized
agents having less data are involved. The achieved testing data,” in Artificial intelligence and statistics. PMLR, 2017, pp. 1273–
accuracies of the FL G AMES algorithms are still superior than 1282.
[9] C. B. Issaid, S. Samarakoon, M. Bennis, and H. V. Poor, “Federated
the F EDAVG benchmark. Surprisingly, doubling the number distributionally robust optimization for phase configuration of riss,” in
of collaborating RISs from 8 to 16 induces an 82% increase 2021 IEEE Global Communications Conference (GLOBECOM). IEEE,
in the number of training epochs for convergence in F-FL 2021, pp. 1–6.
[10] Y. Deng, M. M. Kamani, and M. Mahdavi, “Distributionally robust fed-
G AMES. The same does not hold for V-FL G AMES that suffers erated averaging,” Advances in neural information processing systems,
from an 8% increase, while losing 3-4% in accuracy compared vol. 33, pp. 15 111–15 122, 2020.
to F-FL G AMES. This implies that, with more agents owning [11] S. Samarakoon, J. Park, and M. Bennis, “Robust reconfigurable intel-
ligent surfaces via invariant risk and causal representations,” in 2021
fewer data, the training of a causal extractor and a predictor IEEE 22nd International Workshop on Signal Processing Advances in
converges faster than training of only a predictor. Wireless Communications (SPAWC), 2021, pp. 301–305.
[12] M. Arjovsky, L. Bottou, I. Gulrajani, and D. Lopez-Paz, “Invariant risk
minimization,” arXiv preprint arXiv:1907.02893, 2019.
V. C ONCLUSIONS [13] K. Ahuja, K. Shanmugam, K. Varshney, and A. Dhurandhar, “Invariant
risk minimization games,” in International Conference on Machine
This paper proposes a distributed phase configuration con- Learning. PMLR, 2020, pp. 145–155.
trol for RIS-assisted communication systems. The rate max- [14] S. Gupta, K. Ahuja, M. Havaei, N. Chatterjee, and Y. Bengio, “Fl games:
A federated learning framework for distribution shifts,” arXiv preprint
imization problem is formulated using federated IRM as arXiv:2205.11101, 2022.
opposed to a heterogeneity-unaware ERM approach. Our novel [15] E. Björnson and L. Sanguinetti, “Rayleigh fading modeling and channel
robust RIS phase-shifts controller leverages the underlying hardening for reconfigurable intelligent surfaces,” IEEE Wireless Com-
munications Letters, vol. 10, no. 4, pp. 830–834, 2020.
causal representations of the data that are invariant over dif-
ferent environments. A neural network based feature extractor
first uncovers the causal structure of the CSI data, then feeds it
to another neural network based configuration predictor. Both
neural networks are trained in a distributed supervised learning