Noise-Resilient Federated Learning: Suppressing Noisy Labels in the Local Datasets of Participants
NRL initially sets λ1 = λ2 = 1, which makes it similar to the cross-entropy loss. During training, λ1 and λ2 decrease asynchronously to reduce the impact of the noisy labels.

4.) Local training and global aggregation: The server S designs and randomly initializes a model M and uploads M to the participants of G1. The participants train M on their local datasets by minimizing the NRL loss and, finally, transfer the WPM of the trained M to S for aggregation using FL. The NRL loss adjusts λ1 and λ2 to reduce the impact of noisy labels. Let pq denote a participant in G1, whose target is to estimate the optimal WPM (wq) by minimizing the NRL loss (L^q_NRL), q ∈ {1, 2, ..., N1}.
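Eq. 1 (the NRL loss itself) is defined earlier in the paper and is not reproduced here. As a rough illustration only, the following sketch assumes a loss of this general shape: a cross-entropy term plus a noise-robust companion term (reverse cross-entropy is one plausible choice, not the paper's definition), weighted by λ1 and λ2 that start at 1 and decay at different rates across rounds. The names nrl_loss and decayed_lambdas are hypothetical.

import torch
import torch.nn.functional as F

# Hypothetical sketch of an NRL-style loss: lam1 * CE + lam2 * robust term.
# The exact form of Eq. 1 is NOT reproduced here; reverse cross-entropy is
# only one plausible choice of noise-robust companion term.
def nrl_loss(logits, labels, lam1=1.0, lam2=1.0, num_classes=10):
    ce = F.cross_entropy(logits, labels)                      # standard cross-entropy
    probs = torch.softmax(logits, dim=1).clamp(min=1e-7)
    one_hot = F.one_hot(labels, num_classes).float().clamp(min=1e-4)
    rce = -(probs * one_hot.log()).sum(dim=1).mean()          # reverse cross-entropy
    return lam1 * ce + lam2 * rce

# Asynchronous decay: both weights start at 1 (round 0) and shrink at
# different rates, gradually down-weighting noisy-label contributions.
def decayed_lambdas(round_idx, r1=0.99, r2=0.95):
    return r1 ** round_idx, r2 ** round_idx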
During aggregation, the server estimates the aggregated loss as L_1(w_1) = \sum_{q=1}^{N_1} L^q_{NRL}, where w1 is the aggregated WPM of G1. w1 and the gradient ∇L1 are estimated as:

    w_1 = \frac{1}{N_1} \sum_{q=1}^{N_1} w_q \quad \text{and} \quad \nabla L_1 = \frac{1}{N_1} \sum_{q=1}^{N_1} \nabla L^q_{NRL}. \qquad (2)

Algorithm 1: Noise-Resilient Federated Learning
Input: Set of participants P = {p1, p2, ..., pN} with local datasets with or without noisy labels.
    for each pi ∈ {p1, p2, ..., pN} do
        Estimate noise ratio βi;
    Obtain k clusters {G1, G2, ..., Gk} using βi, 1 ≤ i ≤ N;
    while not converged do
        for each group Gl ∈ {G1, G2, ..., Gk} do
            if Gl == G1 then
                G = G1;
            else
                G = G1 ∪ · · · ∪ Gl;
            for each participant pq in G do
                Obtain model M from the server;
                Train M using NRL, given in Eq. 1;
                Send the WPM of M to the server;
            Server aggregates the WPMs from all participants in G;
    return Trained models on all the participants, {p1, ..., pN}.
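In code, the uniform averaging of Eq. 2 (and of Eq. 3 below) is plain FedAvg-style parameter averaging. A minimal sketch, assuming state_dicts holds the WPMs (PyTorch state_dicts) returned by the participants of the active group:

import torch

# Eq. 2 as code: uniform averaging of the participants' weight parameter
# matrices (WPMs). state_dicts is a list of model.state_dict() objects
# received from the N1 participants of G1.
def aggregate_wpm(state_dicts):
    avg = {}
    for key in state_dicts[0]:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg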
The trained model M on the server achieves higher performance and provides a well-refined WPM for the subsequent groups. Next, the server includes the participants of G2, trains M using the participants of G1 and G2 through the mechanism described above, and obtains the WPM (w2) of the trained model as w_2 = \frac{1}{N_1 + N_2} \sum_{q=1}^{N_1 + N_2} w_q. Similarly, we iterate over the k groups to obtain the final WPM (wk) of Gk as:

    w_k = \frac{1}{N_1 + \cdots + N_k} \sum_{q=1}^{N_1 + \cdots + N_k} w_q = \frac{1}{N} \sum_{q=1}^{N} w_q. \qquad (3)

wk is the well-refined WPM for all N participants and achieves adequate performance in less convergence time. Fig. 1 illustrates the different steps of the proposed noise-resilient approach, which are summarised in Algorithm 1.
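Putting the pieces together, the overall flow of Algorithm 1 can be sketched in a few lines. This is schematic only: estimate_noise_ratio, cluster_by_noise, and local_train are hypothetical stand-ins for the steps the paper describes in prose, and aggregate_wpm is the averaging sketch given above.

# Schematic sketch of Algorithm 1; the helper functions are hypothetical
# stand-ins for steps the paper describes in prose.
def noise_resilient_fl(participants, model, k, rounds):
    betas = [estimate_noise_ratio(p) for p in participants]   # noise ratio per participant
    groups = cluster_by_noise(participants, betas, k)          # k clusters G1, ..., Gk
    for _ in range(rounds):                                    # until convergence
        active = []
        for G_l in groups:                                     # grow G = G1 ∪ ... ∪ Gl
            active.extend(G_l)
            wpms = [local_train(p, model) for p in active]     # minimize NRL loss locally
            model.load_state_dict(aggregate_wpm(wpms))         # Eq. 3 uniform averaging
    return model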
Fig. 1: An overview of the noise-resilient approach (Step 1: participants in FL; Step 2: estimating noise ratios β1, ..., βN; Step 3: group formation G1, ..., Gk; followed by training on G1 and then on G1, ..., Gk with the noise-robust loss, round by round). ① and ② denote the transmission of WPM from the server and a participant, respectively.

III. PRELIMINARY RESULTS

This section illustrates the preliminary results to verify the effectiveness of the proposed noise-resilient approach. We presented the technique for handling noisy labels in [4]; however, its scope is limited to centralized training. We also collected the locomotion mode recognition dataset in [4], which is utilized in this preliminary evaluation. In addition, we utilized the deep learning-based conventional model of [4]. Further, we considered three schemes, i.e., the baseline FedAvg, ICRP [2], and the proposed noise-resilient approach, denoted as S1, S2, and S3, respectively. We set a total of 10 participants, where 5 participants have noise-free labels and 5 have equal and randomly assigned noisy labels in their datasets.
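The paper does not spell out the exact label-corruption scheme; for reproduction purposes, symmetric label flipping at a given noise ratio β is one common choice. A sketch under that assumption (add_label_noise is a hypothetical helper):

import numpy as np

# Inject synthetic label noise at ratio beta by flipping a random subset of
# labels to a different, uniformly chosen class. Symmetric flipping is an
# assumption; the paper does not state the exact corruption scheme.
def add_label_noise(labels, beta, num_classes, seed=0):
    rng = np.random.default_rng(seed)
    labels = labels.copy()
    flip = rng.random(len(labels)) < beta
    for i in np.where(flip)[0]:
        choices = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(choices)
    return labels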
Fig. 2: Illustration of accuracy on S1, S2, and S3: (a) accuracy versus communication rounds, and (b) accuracy versus noise ratio.

Fig. 2(a) illustrates a rapid increase in the accuracy of S3 (proposed) up to 13 rounds and a marginal increase afterwards. The performance improvement is negligible after 60 rounds, which indicates that S3 converged after 60 rounds; it achieves an accuracy of 91.43%. We set the noise ratio to 0.1 during this experiment. Similarly, S1 and S2 converged after 80 and 70 rounds and achieved accuracies of 86.52% and 89.22%, respectively. Fig. 2(b) depicts the reduction in accuracy as the noise ratio increases. S3 and S1 achieve the highest and lowest performance, respectively. This is because S1 did not incorporate any mechanism for handling noisy labels, whereas S3 (proposed) estimated the noise ratio and used NRL to mitigate the impact of noisy labels.
REFERENCES

[1] B. Luo, X. Li, S. Wang, J. Huang, and L. Tassiulas, "Cost-effective federated learning design," in Proc. IEEE INFOCOM, 2021, pp. 1-10.
[2] T. Tuor, S. Wang, B. J. Ko, C. Liu, and K. K. Leung, "Overcoming noisy and irrelevant data in federated learning," in Proc. IEEE ICPR, 2021, pp. 5020-5027.
[3] P. Chen, B. B. Liao, G. Chen, and S. Zhang, "Understanding and utilizing deep neural networks trained with noisy labels," in Proc. ICML, 2019, pp. 1062-1070.
[4] R. Mishra, A. Gupta, and H. P. Gupta, "Locomotion mode recognition using sensory data with noisy labels: A deep learning approach," IEEE Transactions on Mobile Computing, pp. 1-1, 2021, doi: 10.1109/TMC.2021.3135878.