ABSTRACT
The efficient and effective handling of few-shot learning tasks on mobile devices is challenging due to the small training set issue and the physical limitations in power and computational resources on these devices. We propose a framework that combines federated learning and meta-learning to handle independent few-shot learning tasks on multiple devices. In particular, we utilize Prototypical Networks to perform meta-learning on all devices to learn multiple independent few-shot learning models, and we aggregate the device models using federated learning so that the aggregated model can subsequently be reused by the devices. We perform extensive experiments to (1) compare three different federated learning approaches, namely Federated Averaging (FedAvg), Federated Proximal (FedProx), and Federated Personalization (FedPer), on the proposed framework, and (2) investigate the effect of data heterogeneity across multiple devices on their few-shot learning performance. Our empirical results show that our proposed framework is feasible, that it improves the devices’ individual prediction performance and achieves significant performance improvement with the aggregated model under any of the federated learning approaches when the few-shot learning tasks are from the same source, and that data heterogeneity continues to be a challenging issue to overcome.

CCS CONCEPTS
• Computing methodologies → Neural networks; Cooperation and coordination.

KEYWORDS
Federated Learning, Meta-Learning, Few-Shot Learning

ACM Reference Format:
Kousalya Soumya Lahari Voleti and Shen-Shyang Ho. 2023. Personalized Learning with Limited Data on Edge Devices using Federated Learning and Meta-Learning. In The Eighth ACM/IEEE Symposium on Edge Computing (SEC ’23), December 6–9, 2023, Wilmington, DE, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3583740.3626811

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
SEC ’23, December 6–9, 2023, Wilmington, DE, USA
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 979-8-4007-0123-8/23/12...$15.00
https://doi.org/10.1145/3583740.3626811

1 INTRODUCTION
There has been rapid growth in mobile device usage over the last decade. Moreover, there is a need to independently build effective personalized predictive models on these mobile devices for different user needs; in other words, predictive models differ across devices. The main challenge in building these predictive models is the limited amount of data for each object class (e.g., five to ten images available for each class) available to the predictive model on a device. This is the so-called few-shot learning problem [19]. Federated Learning (FL) [13, 16] is a recent technique that can mitigate the few-shot learning issue [19] by allowing edge devices to collaboratively train and share knowledge to improve the prediction accuracy at each device. In particular, distributed devices can effectively train their models and aggregate them to form an effective global model shared by the devices.

The key difference between our few-shot learning scenario and the scenarios in existing federated few-shot learning work is that each device has its own distinct prediction task, different from the others. Our proposed solution combines federated learning (using aggregated models trained for the few-shot learning tasks) with meta-learning [6] to fine-tune the predictive models (at the devices) so that they work well when the number of data samples for each class is limited on unrelated few-shot learning tasks at the mobile devices. Fig. 1 shows an example of our problem scenario and a high-level sketch of the proposed solution, which utilizes federated learning for sharing knowledge (models) among multiple devices to perform few-shot learning at these devices, driven by meta-learning to fine-tune the individual models.

To enhance the training process for few-shot learning on the devices, a meta-learning technique is applied on each device to a collection of few-shot learning tasks so that global and local predictive models can be efficiently learned for previously unseen few-shot learning tasks. Many meta-learning methods have been presented for the few-shot learning problem, such as Task Agnostic Meta-Learning [9] and meta-learning over a pre-trained model [3]. We investigate a federated meta-learning framework for few-shot learning using Prototypical Networks [18] as the base classifier to perform meta-learning on all devices in a centralized, data-distributed architecture. In particular, the different few-shot predictive models can be executed on the devices and the server. We perform extensive experiments on three real-world datasets, namely CIFAR-100 [10], Fashion-MNIST [20], and Omniglot [11], to (1) compare three different federated learning approaches, namely Federated Averaging (FedAvg) [14], Federated Proximal (FedProx) [12], and Federated Personalization (FedPer) [2], on our proposed framework and (2) explore the effect of data heterogeneity (using different
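As a concrete illustration of the aggregation step in such a framework, the FedAvg [14] server update is a weighted average of the device models' parameters, with weights proportional to each device's local sample count. The sketch below is an assumed minimal implementation; the function and variable names are ours, not from the paper's code.

```python
# Hedged sketch of FedAvg-style server aggregation (McMahan et al. [14]):
# the global model is the sample-count-weighted average of device weights.
import numpy as np

def fedavg_aggregate(device_weights, sample_counts):
    """device_weights: list of per-device parameter lists (one array per layer).
    sample_counts: local training set size of each device.
    Returns the weighted-average parameter list."""
    total = sum(sample_counts)
    aggregated = []
    for layer_idx in range(len(device_weights[0])):
        # weight each device's layer by its share of the total samples
        layer = sum(w[layer_idx] * (n / total)
                    for w, n in zip(device_weights, sample_counts))
        aggregated.append(layer)
    return aggregated

# toy example: two devices, each with a single-layer "model"
w1 = [np.array([1.0, 3.0])]
w2 = [np.array([3.0, 5.0])]
global_w = fedavg_aggregate([w1, w2], sample_counts=[1, 3])
# device 2 holds 3/4 of the data, so the average leans toward w2
```

FedProx [12] and FedPer [2] modify the local training objective (a proximal term) and the aggregated layers (personalization layers kept local), respectively, but share this same server-side averaging idea.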
datasets on different edge devices) on the few-shot learning performance of the different edge devices.
The main observations and conclusions from our empirical results are as follows:
1. For few-shot classification tasks of reasonable difficulty (above 50% accuracy initially), the proposed approach is able to improve the devices’ individual prediction performance and improves significantly on the global model (on the server) using any of the federated learning approaches when the few-shot learning tasks are from the same dataset.
2. Using the FedAvg performance as a baseline, we observe that FedPer and FedProx performed well on our proposed framework.
3. The data heterogeneity problem affects the prediction performance of our current implementation of the framework no matter which federated learning approach is used.

2 BACKGROUND
2.1 Meta-Learning for Few-Shot Learning
Meta-learning [6] is learning one model from a large amount of data and fine-tuning it to another model for a new, similar task. Recently proposed meta-learning approaches include the synthetic information bottleneck method [7], which is based on an empirical Bayes formulation with a transductive approach to obtain the variational posterior, and a variant of MAML (Model-Agnostic Meta-Learning) [4] called Sharp-MAML [1], which utilizes a sharpness-aware minimization algorithm to overcome the generalization issues related to the nonconvex nature of the MAML task.

One recent research question of interest is whether one can utilize meta-learning techniques to support few-shot learning tasks [19]. Unlike supervised learning, which trains on one set and tests on another, a meta-learning technique uses three different sets: a base set for prior training, a support set for fine-tuning, and a query set on which one performs prediction. The support set and query set mostly contain classes with few samples that are unseen in the base set. Every support set and query set is specified as an 𝑛-way 𝑘-shot 𝑞-query task, where 𝑛-way denotes the number of classes in the sets, 𝑘-shot the number of images per class in the support set, and 𝑞-query the number of images per class in the query set. The two main steps of an inductive meta-learning method for few-shot learning are as follows:
1. Creating a set of support and query sets for training. To learn to handle an unseen few-shot learning task, a training set is needed to build the model’s prior knowledge. The training set is randomly sampled to create multiple support and query sets of a pre-defined 𝑛-way 𝑘-shot 𝑞-query configuration for 𝑛-way 𝑘-shot few-shot learning training purposes. By replicating the process of predicting with support and query sets during training, the training is more effective when the model needs to train on an unseen support and query set.
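For the Prototypical Networks [18] base classifier used in our framework, prediction within an episode reduces to nearest-prototype classification: each class prototype is the mean of its embedded support examples, and each query is assigned to the class of the nearest prototype. The following is an assumed minimal sketch of that rule (names and the toy 2-D "embeddings" are ours), omitting the embedding network itself.

```python
# Hedged sketch of the Prototypical Networks classification rule [18]:
# class prototype = mean support embedding; query -> nearest prototype.
import numpy as np

def prototypes(support_emb, support_labels):
    """Mean embedding per class. support_emb: (N, D) array; support_labels: length-N list."""
    classes = sorted(set(support_labels))
    labels = np.array(support_labels)
    protos = np.stack([support_emb[labels == c].mean(axis=0) for c in classes])
    return protos, classes

def classify(query_emb, protos, classes):
    """Assign each query embedding to the class of its nearest prototype."""
    # squared Euclidean distance from every query to every prototype
    dists = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]

# toy 2-way 2-shot example with hand-made 2-D embeddings
sup = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.2, 1.0]])
labels = ["a", "a", "b", "b"]
protos, classes = prototypes(sup, labels)
pred = classify(np.array([[0.1, 0.1], [1.1, 0.9]]), protos, classes)
# pred -> ["a", "b"]
```

In the full method the inputs are embeddings produced by a learned network, and training minimizes the softmax cross-entropy over negative distances; the nearest-prototype rule above is the inference step.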
independent devices and exploring the federated meta-few-shot learning with decentralized architectures.

REFERENCES
[1] Momin Abbas, Quan Xiao, Lisha Chen, Pin-Yu Chen, and Tianyi Chen. 2022. Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning. In International Conference on Machine Learning. PMLR, 10–32.
[2] Manoj Ghuhan Arivazhagan, Vinay Aggarwal, Aaditya Kumar Singh, and Sunav Choudhary. 2019. Federated learning with personalization layers. arXiv preprint arXiv:1912.00818 (2019).
[3] Yinbo Chen, Zhuang Liu, Huijuan Xu, Trevor Darrell, and Xiaolong Wang. 2021. Meta-baseline: Exploring simple meta-learning for few-shot learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 9062–9071.
[4] Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning. PMLR, 1126–1135.
[5] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770–778.
[6] Timothy Hospedales, Antreas Antoniou, Paul Micaelli, and Amos Storkey. 2021. Meta-learning in neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 9 (2021), 5149–5169.
[7] Shell Xu Hu, Pablo Garcia Moreno, Yang Xiao, Xi Shen, Guillaume Obozinski, Neil D. Lawrence, and Andreas C. Damianou. 2020. Empirical Bayes Transductive Meta-Learning with Synthetic Gradients. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. https://openreview.net/forum?id=Hkg-xgrYvH
[8] Wenke Huang, Mang Ye, Bo Du, and Xiang Gao. 2022. Few-shot model agnostic federated learning. In Proceedings of the 30th ACM International Conference on Multimedia. 7309–7316.
[9] Muhammad Abdullah Jamal and Guo-Jun Qi. 2019. Task agnostic meta-learning for few-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 11719–11727.
[10] Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. (2009).
[11] Brenden M Lake, Ruslan Salakhutdinov, and Joshua B Tenenbaum. 2015. Human-level concept learning through probabilistic program induction. Science 350, 6266 (2015), 1332–1338.
[12] Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith. 2020. Federated optimization in heterogeneous networks. Proceedings of Machine Learning and Systems 2 (2020), 429–450.
[13] Wei Yang Bryan Lim, Nguyen Cong Luong, Dinh Thai Hoang, Yutao Jiao, Ying-Chang Liang, Qiang Yang, Dusit Niyato, and Chunyan Miao. 2020. Federated learning in mobile edge networks: A comprehensive survey. IEEE Communications Surveys & Tutorials 22, 3 (2020), 2031–2063.
[14] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics. PMLR, 1273–1282.
[15] Deanna Needell, Rachel Ward, and Nati Srebro. 2014. Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm. Advances in Neural Information Processing Systems 27 (2014).
[16] Dinh C Nguyen, Ming Ding, Pubudu N Pathirana, Aruna Seneviratne, Jun Li, and H Vincent Poor. 2021. Federated learning for internet of things: A comprehensive survey. IEEE Communications Surveys & Tutorials 23, 3 (2021), 1622–1658.
[17] Debaditya Shome and Tejaswini Kar. 2021. FedAffect: Few-shot federated learning for facial expression recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 4168–4175.
[18] Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems 30 (2017).
[19] Yaqing Wang, Quanming Yao, James T Kwok, and Lionel M Ni. 2020. Generalizing from a few examples: A survey on few-shot learning. ACM Computing Surveys (CSUR) 53, 3 (2020), 1–34.
[20] Han Xiao, Kashif Rasul, and Roland Vollgraf. 2017. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv:cs.LG/1708.07747 [cs.LG]