

Mobile Device Training Strategies in Federated Learning: An Evolutionary Game Approach

Yuze Zou†*, Shaohan Feng†, Dusit Niyato†, Yutao Jiao†, Shimin Gong‡, and Wenqing Cheng*
† School of Computer Science and Engineering, Nanyang Technological University, Singapore
* School of Electronic Information and Communications, Huazhong University of Science and Technology, China
‡ School of Intelligent Systems Engineering, Sun Yat-sen University, China

Abstract—With the tremendous success of machine learning and increasingly powerful mobile devices, federated learning has gained growing attention from both academia and industry. It capitalizes on a vast amount of distributed data to support machine learning based applications while maintaining data privacy. In this paper, we consider a federated learning system in which the mobile devices allocate their data and computation resources among the machine learning applications, i.e., the model owners. Specifically, we formulate an evolutionary game model for the mobile devices with bounded rationality to adapt their training strategies so as to maximize each device's individual utility. The uniqueness and stability of the equilibrium of the game are analysed theoretically. Besides, extensive experiments are conducted to determine the functions fitted for the accuracy and energy consumption metrics.

Index Terms—Federated Learning, Evolutionary Game, Equilibrium

I. Introduction

Recently, owing to the great success of machine learning, extensive applications based on machine learning techniques, e.g., face recognition, image classification, and natural language processing, have been developed [1]. However, conventional centralized machine learning techniques face thorny issues concerning data privacy, e.g., data abuse and leakage, which hinder machine learning based applications from touching the massive data generated by mobile devices. To address these issues, federated learning has been introduced as one of the most promising solutions. Lately, Google has built a scalable production system for federated learning on mobile devices, based on TensorFlow [2]. Briefly, federated learning is collaborative machine learning that fully utilizes powerful mobile devices, takes advantage of the intelligence at the edge of the network, and accomplishes the learning tasks in a distributed manner [3], [4]. In a federated learning system, a centralized server working as a coordinating cloud is responsible for training machine learning models for the model owners by using the federation of mobile devices, e.g., by averaging the weights trained by the federation. However, with machine learning based applications springing up, a number of federated learning based applications are emerging, each of which is run by an individual model owner. These model owners all need the data and training services of the federation of mobile devices. At the same time, asking the mobile devices to work as sacrificial volunteers is not a viable and sustainable option. Therefore, mobile devices with limited resources have to choose the machine learning applications, i.e., the model owners, to which they contribute their training services. This selection is made based on the utility, which consists of the incentives from the model owners and the costs of executing the different training services. Moreover, bounded rationality generally exists in the federated learning environment. For example, the accuracy of a model is determined by all mobile devices participating in the training for that model, while the training strategy of one mobile device is generally not known to the others. Motivated by these observations, we propose an evolutionary game model to investigate the training strategies of the mobile devices. Specifically, the contributions of this paper are twofold:

• We propose two metrics, i.e., the accuracy and the energy consumption of federated learning, which correspond to the benefit and the cost, respectively. Extensive experiments are conducted to determine the function approximations fitted for the accuracy and energy consumption metrics. Accordingly, a power law function and a linear model are obtained, respectively.
• An evolutionary game model is formulated to investigate the mobile devices' training strategies under the scenario of multiple model owners. The uniqueness and stability of the evolutionary equilibrium are analysed, and the stability is further validated numerically.

The remainder of the paper is organized as follows: Section II gives a brief introduction to federated learning and describes the system model. An evolutionary game is applied to formulate the problem in Section III, where the uniqueness and Lyapunov stability are proved. Numerical performance and the conclusion are given in Section IV and Section V, respectively.

[Fig. 1. Federated learning workflow, system model, and training service process: (a) federated learning workflow (① download model, ② train model, ③ upload weights, ④ average weights); (b) system model (model owners served by mobile devices); (c) training service process (model receiver, training unit, utility monitor, decision maker, weights sender, dataset).]

II. System Description

A. Preliminary of Federated Learning

The federated learning system works in four phases, as shown in Fig. 1(a): 1) The central coordinator, i.e., the model owner, distributes its model to the mobile devices. 2) The mobile devices train the machine learning model received from the model owner using their local datasets. 3) After training, the mobile devices upload the trained weights back to the model owner. 4) The model owner averages all the weights uploaded by the mobile devices. This process repeats until the machine learning model achieves the desired performance.
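Phase 4 corresponds to model averaging as in [3]. The sketch below is a minimal illustration of one aggregation round; the paper only states that the weights are averaged, so the dataset-size weighting and all identifiers here are our own assumptions:

```python
import numpy as np

def average_weights(client_weights, client_data_sizes):
    """One aggregation round (phase 4): dataset-size-weighted average
    of the weight vectors uploaded by the mobile devices."""
    total = sum(client_data_sizes)
    return sum(w * (n / total)
               for w, n in zip(client_weights, client_data_sizes))

# Toy example: three devices upload trained weight vectors of equal size.
weights = [np.array([0.1, 0.2]), np.array([0.3, 0.1]), np.array([0.2, 0.4])]
sizes = [20000, 20000, 20000]
global_weights = average_weights(weights, sizes)  # -> array([0.2, 0.2333...])
```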
B. System Model

We consider a federated learning system that consists of $K$ model owners served by $N$ mobile devices, as depicted in Fig. 1(b). The index sets of the model owners and the mobile devices are denoted by $\mathcal{K} \triangleq \{1, 2, \ldots, K\}$ and $\mathcal{N} \triangleq \{1, 2, \ldots, N\}$, respectively. Each model owner has a particular machine learning model to train. For example, the model owners may have different types of models, such as different neural networks, e.g., convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM), or the same type of model with different parameter settings, e.g., the number of layers and neurons. Likewise, the mobile devices also come in different types, such as smartphones, tablets, and wearable devices.

The training service process of a mobile device is shown in Fig. 1(c). The mobile device feeds the model received from the model owner to the training unit, and the utility monitor records the utility of the training strategy. After that, the decision maker determines the training strategy for the next period according to the utility. In the sequel, a certain amount of the dataset is selected for the training service. After training, the updated weights are sent back to the model owner.

Let $M_i^d$ denote the dataset size of mobile device $i$ and $x_{i,k}$ denote the probability that mobile device $i$ provides training service to model owner $k$. Let $\mathbf{x}_i \triangleq [x_{i,1}, x_{i,2}, \ldots, x_{i,K}]$ denote the training strategy of mobile device $i$ and $\mathbf{X} \triangleq [\mathbf{x}_1^\top, \mathbf{x}_2^\top, \ldots, \mathbf{x}_N^\top]^\top$ denote the training profile of the federation of mobile devices. Then, the total amount of data offered to model owner $k$ is $\sum_{j=1}^{N} x_{j,k} M_j^d$. Let $f_k(\cdot)$ denote the update function, which describes the training benefit of model owner $k$ against the training dataset size, e.g., the accuracy of the model. The update that model owner $k$ receives from the mobile devices is thus $f_k\big(\sum_{j=1}^{N} x_{j,k} M_j^d\big)$. In the sequel, we assume that the payoff that model owner $k$ offers to the mobile devices is proportional to the update it receives, i.e., $\gamma_{i,k} f_k\big(\sum_{j=1}^{N} x_{j,k} M_j^d\big)$, where $\gamma_{i,k}$ is the payoff coefficient of model owner $k$. Furthermore, the mobile devices receive the payoff from the model owners in proportion to their training contributions. For example, the contribution of mobile device $i$ to model owner $k$ is $\frac{x_{i,k} M_i^d}{\sum_{j=1}^{N} x_{j,k} M_j^d}$. Accordingly, the payoff that mobile device $i$ receives from model owner $k$ is given by
$$p_{i,k} = \gamma_{i,k} f_k\Big(\sum_{j=1}^{N} x_{j,k} M_j^d\Big) \frac{x_{i,k} M_i^d}{\sum_{j=1}^{N} x_{j,k} M_j^d} + \alpha_k,$$

where $\alpha_k \geq 0$ is a bias term of model owner $k$, which offers an additional incentive for the mobile devices to train for it.

On the other hand, model training incurs costs, e.g., energy consumption, due to the usage of computing resources. Let $\eta_{i,k}$ denote the cost per unit of data of mobile device $i$ when training for model owner $k$; the coefficient may differ due to the different complexities of the training models. Then, the cost incurred by mobile device $i$ for training owner $k$'s model is given by

$$c_{i,k} = \eta_{i,k} x_{i,k} M_i^d.$$

In summary, we define the utility that mobile device $i$ gains from training for model owner $k$ as

$$u_{i,k} = p_{i,k} - c_{i,k}. \qquad (1)$$

Thus, we can define $\mathbf{u}_i \triangleq [u_{i,1}, u_{i,2}, \ldots, u_{i,K}]^\top$ as the utility vector of mobile device $i$.
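To make the payoff, cost, and utility in (1) concrete, they can be evaluated directly from a training profile $\mathbf{X}$. The following is a minimal sketch under our own naming conventions (none of these identifiers come from the paper); it assumes every model owner receives a nonzero amount of data:

```python
import numpy as np

def utilities(X, M, gamma, eta, alpha, f):
    """Utility u[i, k] = p[i, k] - c[i, k] for a training profile X.

    X:     N x K matrix of probabilities x_{i,k}
    M:     length-N vector of dataset sizes M_i^d
    gamma: N x K payoff coefficients; eta: N x K cost coefficients
    alpha: length-K bias terms; f: list of K update functions f_k
    """
    data = X * M[:, None]              # x_{i,k} * M_i^d
    totals = data.sum(axis=0)          # total data offered to each owner k
    p = np.empty_like(X)
    for k in range(X.shape[1]):
        # payoff: coefficient * update * contribution share + bias
        p[:, k] = gamma[:, k] * f[k](totals[k]) * data[:, k] / totals[k] + alpha[k]
    c = eta * data                     # training cost c_{i,k}
    return p - c                       # utility (1)
```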
III. Evolutionary Game Formulation

In this section, we give the evolutionary game formulation for the training strategies of the mobile devices in the federated learning system. Replicator dynamics is used to model the strategy adaptation process [5].

A. Game Formulation

The federated learning problem can be formulated as an evolutionary game as follows:

Set of players $(\mathcal{N})$: The set of mobile devices $\mathcal{N}$ is the set of players of the game, i.e., the federation of mobile devices in the system.

Set of actions $(\mathcal{A})$: The mobile devices, i.e., the players, aim to obtain the training strategies that maximize their individual utilities. According to the system model, mobile device $i$'s training strategy is given by $\mathbf{x}_i$. Accordingly, we define the set of actions of each mobile device as $\mathcal{A} \triangleq \prod_{k \in \mathcal{K}} [0, 1]$.

Utility function: The utility of mobile device $i$ training for model owner $k$ evaluates its performance when choosing training strategy $x_{i,k}$, as elaborated in (1). Let $\mathbf{U} \triangleq [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_N]$ denote the utility function matrix of the federation of mobile devices.

With the above definitions, we define the evolutionary game $\mathcal{G}$ as follows:

$$\mathcal{G} = (\mathcal{N}, \mathcal{A}, \mathbf{U}). \qquad (2)$$

B. Dynamics of Training Strategy Adaptation

In the context of the evolutionary game, each mobile device dynamically adapts its strategy according to its utility. This is referred to as the evolution of the game: the strategy adaptation of the mobile devices affects the data assignment to the model owners, and therefore the training strategies evolve over time.

In this case, the data assignment $x_{i,k}$ is a function of time $t$, denoted by $x_{i,k}(t)$. The strategy adaptation process and the corresponding training strategy evolution can be modelled and analysed by replicator dynamics [5], a set of ordinary differential equations defined as follows:

$$\dot{x}_{i,k}(t) = x_{i,k}(t)\big(u_{i,k}(t) - \bar{u}_i(t)\big), \qquad (3)$$

for all $x_{i,k} \in [0, 1]$, with the initial data assignment strategy $x_{i,k}(0) = x_{i,k}^0$, where $\bar{u}_i$ is the average utility of the training strategy of mobile device $i$, given by $\bar{u}_i = \sum_{k \in \mathcal{K}} u_{i,k} x_{i,k}$.

The replicator dynamics governs the rate of adaptation of the training strategy. As the game is repeated, each mobile device observes its own utility and compares it with the average utility. Then, in the next period, the mobile device forms another strategy if its utility is less than the average. Under the replicator dynamics, the probability of a mobile device training for model owner $k$ increases if the corresponding utility is higher than the average (i.e., $u_{i,k} > \bar{u}_i$).
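The trajectory of (3) can be approximated numerically, e.g., by forward-Euler integration. The sketch below is ours, reusing the hypothetical `utilities` helper from Section II; the step size and horizon are illustrative choices, not values from the paper:

```python
import numpy as np

def evolve(X0, M, gamma, eta, alpha, f, dt=0.01, steps=5000):
    """Forward-Euler integration of the replicator dynamics (3)."""
    X = X0.copy()
    for _ in range(steps):
        u = utilities(X, M, gamma, eta, alpha, f)
        u_bar = (u * X).sum(axis=1, keepdims=True)  # average utility per device
        X_dot = X * (u - u_bar)                     # replicator equation (3)
        # If each row of X0 sums to 1, the row sums of X_dot are exactly 0,
        # so the strategies stay (approximately) on the simplex; the clip
        # only guards against floating-point drift.
        X = np.clip(X + dt * X_dot, 0.0, 1.0)
    return X  # approximate evolutionary equilibrium
```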
C. Uniqueness and Stability Analysis of the Evolutionary Equilibrium

Definition 1. Evolutionary Equilibrium: The evolutionary equilibrium is the solution of the game defined in (3).

Firstly, the uniqueness of the evolutionary equilibrium is guaranteed by the following theorem.

Theorem 1. The evolutionary game defined in (3) is uniquely solvable and hence admits a unique evolutionary equilibrium.

Proof. Proposition 1 below guarantees that $\dot{x}_{i,k}(t)$ satisfies the Lipschitz condition with respect to all $x_{j,l}$, $(j, l) \in \mathcal{N} \times \mathcal{K}$. Hence, the evolutionary game defined in (3) is uniquely solvable according to the Cauchy–Lipschitz theorem [6].
Proposition 1. Let $g_{i,k}(t) \triangleq x_{i,k}(t)\big(u_{i,k}(t) - \bar{u}_i(t)\big)$; then the first derivatives of $g_{i,k}(t)$ with respect to $x_{j,l}(t)$, for all $(j, l) \in \mathcal{N} \times \mathcal{K}$, are bounded.

Proof. Please see Appendix A.

Secondly, according to Lyapunov's second method for stability [7], we justify the stability of the evolutionary equilibrium of the evolutionary game defined in (3), as presented in Theorem 2.

Theorem 2. The evolutionary game defined in (3) admits a stable evolutionary equilibrium.

Proof. Following Lyapunov's second method for stability, we design a Lyapunov function as follows:

$$V(\mathbf{X}(t)) = \Big\{\sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} x_{i,k}(t)\Big\}^2, \qquad (4)$$

where $V(\mathbf{X}(t))$ satisfies

$$V(\mathbf{X}(t)) \begin{cases} = 0, & \text{if } x_{i,k} = 0, \ \forall i \in \mathcal{N}, k \in \mathcal{K}, \\ > 0, & \text{otherwise.} \end{cases} \qquad (5)$$

Using the Lyapunov function defined in (4), the evolutionary equilibrium of the evolutionary game defined in (3) can be proven to be stable. Please refer to Appendix B for details.

IV. Performance Evaluation

A. Parameters Setting and Measurement

We evaluate three neural networks with different numbers of hidden-layer units ($H$). Basically, the more hidden units a model has, the higher the accuracy it can yield, at the cost of computational complexity, which leads to higher energy consumption for the mobile devices. Models with $H \in \{0, 64, 512\}$ are evaluated (to keep the notation consistent, we use $H = 0$ to denote the neural network with no hidden layer). The parameter settings are listed in Table I.

TABLE I
Parameter Settings

  N   = 3                      K   = 3
  M^d ∈ [100, 50000]           η   : cf. Table II
  f(·): cf. (6)                α_k = [1, 1, 2]
  X_0 = [0.2 0.5 0.3; 0.3 0.5 0.2; 0.2 0.5 0.3]
  γ_{i,k} = [1, 0.8, 0.6]^T × [8, 32, 64] / 100

According to Table I, model owner 3 provides the highest $\gamma$ and $\alpha$ compared with the other two model owners, which represents the highest incentive for the mobile devices. Apart from that, all model owners prefer mobile device 1 to the other two devices in terms of $\gamma$. The reason may be that mobile device 1 has the highest reputation according to its training history with the model owners.

1) Update Function Fitting: To obtain the update function $f(\cdot)$ of the machine learning models, we conduct a set of machine learning experiments. In particular, we use the multi-class classification dataset MNIST [8]. The update function describes the model performance, e.g., accuracy, against the training dataset size ($m$). The update function may vary across models. Generally, however, the update functions are nondecreasing, which means that more training data contributes to a more accurate model, or at least a model with the same accuracy. In order to obtain a precise expression of $f(\cdot)$, we train each model with the training dataset size varied from 100 to 50000 and then test its accuracy on the test dataset. The results and the fitted update functions are given in Fig. 2. We use a power law function to fit the accuracy curve, which is given by

$$f(m) = c - a m^{-b}, \qquad (6)$$

where $a, b, c > 0$. Here $c$ captures the upper bound of the accuracy, i.e., $\lim_{m \to \infty} f(m) = c$. Similar results also appeared in [9]. The fitted parameters of the accuracy models are listed in Table II. It is worth noting that these fits are qualified with a coefficient of determination higher than 0.99, i.e., $R^2 > 0.99$.

[Fig. 2. Accuracy and energy consumption fitting for different learning models: (a) accuracy vs. dataset size; (b) energy consumption (J) vs. dataset size.]
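The power law in (6) can be fitted with standard nonlinear least squares. The sketch below uses `scipy.optimize.curve_fit` on illustrative (dataset size, accuracy) pairs of our own invention, not the measured MNIST data; the energy curve in Fig. 2(b) can be fitted analogously with a linear model:

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(m, a, b, c):
    """Update function (6): f(m) = c - a * m**(-b)."""
    return c - a * m ** (-b)

# Illustrative measurements only (dataset size, test accuracy).
m = np.array([100, 500, 1000, 5000, 10000, 50000], dtype=float)
acc = np.array([0.60, 0.78, 0.84, 0.91, 0.93, 0.95])

params, _ = curve_fit(power_law, m, acc, p0=[1.0, 0.5, 1.0])
a, b, c = params  # c approximates the accuracy upper bound
```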
2) Energy Consumption Measurement: To measure the energy consumption of the mobile devices, we adopt a Raspberry Pi 3 Model B [10] to mimic the mobile device; it is a portable single-board computer with hardware comparable to that of mobile phones. The energy consumption of the three models above is depicted in Fig. 2(b), along with the fitted linear regression models. The energy consumption coefficients $\eta$ are listed in Table II; they are given in mJ per 100 training samples, i.e., $10^{-5}$ J per sample trained. Similarly, these fits are also qualified with $R^2 > 0.99$.
TABLE II
Fitted Model Coefficients for Accuracy and Energy Consumption against Training Set Size

        Accuracy                           Energy
  H     a       b       c       R^2        η        R^2
  0     20.74   0.8090  0.9228  0.9979     1.2303   0.9991
  64    10.18   0.7017  0.9608  0.9902     2.9587   0.9987
  512   2.26    0.4686  0.9875  0.9909     21.4793  1.0000

B. Demonstration of the Equilibrium Stability

The stability of the evolutionary equilibrium can be verified and presented by using a direction field of the replicator dynamics. As examples, we use the evolution of the training strategies of mobile devices 1 and 2 for model owner 1, as well as the training strategies of mobile device 1 for all model owners. The direction fields of $(x_{1,1}, x_{2,1})$ and $(x_{1,1}, x_{1,2}, x_{1,3})$ are shown in Fig. 3(a) and (b), respectively. As shown in Fig. 3, any unstable strategy follows the arrows to reach the evolutionary equilibrium, which is marked by the black circle. This demonstrates the stability of the evolutionary equilibrium and is consistent with Theorem 2.

[Fig. 3. Direction field of the replicator dynamics showing the stability of the evolutionary equilibrium: (a) (x_{1,1}, x_{2,1}); (b) (x_{1,1}, x_{1,2}, x_{1,3}).]

C. Evolutionary Game Dynamics

Next, we investigate the dynamics of the mobile devices' utilities and training strategies, along with the learning rate of the evolutionary game. The dataset sizes of the mobile devices are the same: each of them has 20000 training data items. The dynamics are shown in Fig. 4.

1) Utility Variations: The utility variations of the mobile devices against time are shown in Fig. 4(a). The utilities fluctuate first and then converge to the evolutionary equilibrium. At the evolutionary equilibrium, the utilities of the three mobile devices are ordered as $\bar{u}_1 > \bar{u}_2 > \bar{u}_3$. These results are consistent with the setting $\gamma_{1,k} > \gamma_{2,k} > \gamma_{3,k}$, which means that the model owners prefer mobile device 1.

2) Data Assignment Variations: The dataset size variations of the model owners against time are shown in Fig. 4(b). Similarly, the dataset sizes converge to the equilibrium after fluctuations. Besides, $D_1 > D_3 > D_2$ at the stable state. This shows that although model 3 has the highest accuracy, it is not the most energy efficient one. In contrast, model 1 achieves the highest energy efficiency, and thus the mobile devices tend to offer their training service to model owner 1. On the other hand, no model owner obtains all the data from the mobile devices. The reason is that the accuracies of the models tend to saturate when the dataset size is larger than 10000 according to Fig. 2(a), while the energy consumption still increases linearly with the dataset size according to Fig. 2(b). As a result, the mobile devices prefer to spread their training across the models.

3) Training Strategy Variations: Figures 4(c)-(e) depict the variations of the training strategies of mobile devices 1, 2, and 3 against time, respectively. Although these three devices are initialized with the same amount of data, they converge to different training strategies due to the model owners' different preferences with respect to these mobile devices.

4) Learning Rate Variations: Figure 4(f) shows the learning rate of the evolutionary dynamics against time. We define the learning rate ($l_r$) as the $L_1$ norm of the replicator dynamics $\dot{\mathbf{X}}$, i.e., $l_r \triangleq \sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} |\dot{x}_{i,k}(t)|$. As shown in Fig. 4(f), the learning rate decreases dramatically with time, which indicates that the evolution converges quickly. In particular, $l_r$ approaches 0 when $t \geq 30$.
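Tracking this convergence metric amounts to one extra line on top of the replicator step; the snippet below is our illustration, again reusing the hypothetical `utilities` helper from Section II:

```python
import numpy as np

def learning_rate(X, M, gamma, eta, alpha, f):
    """l_r = sum_{i,k} |x_dot_{i,k}|, the L1 norm of the replicator dynamics."""
    u = utilities(X, M, gamma, eta, alpha, f)
    u_bar = (u * X).sum(axis=1, keepdims=True)
    return np.abs(X * (u - u_bar)).sum()  # ~0 once the equilibrium is reached
```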
V. Conclusion

In this paper, we have investigated the mobile devices' training strategies in a federated learning system. Evolutionary game theory has been adopted to model the dynamic strategies of the mobile devices with bounded rationality. Two metrics, i.e., the accuracy and the energy consumption of federated learning, which define the benefit and the cost, respectively, have been measured and modeled for three different learning models. Based on the measurements, we have conducted numerical evaluations, and the results have validated the stability of the evolutionary equilibrium. These results can be applied to scenarios in which multiple model owners are served by a federation of mobile devices, which is the trend of the machine learning era.
[Fig. 4. Evolutionary game dynamics: utility and training strategy variations of the mobile devices, and learning rate vs. time. (a) Utility; (b) data assignment; (c) mobile device 1; (d) mobile device 2; (e) mobile device 3; (f) learning rate.]

VI. Acknowledgment

This work was supported in part by the "National Key R&D Program of China" (2018YFB1004504), the A*STAR-NTU-SUTD Joint Research Grant Call on Artificial Intelligence for the Future of Manufacturing RGANS1906, WASP/NTU M4082187 (4080), Singapore MOE Tier 1 under Grant 2017-T1-002-007 RG122/17, MOE Tier 2 under Grant MOE2014-T2-2-015 ARC4/15, Singapore NRF2015-NRF-ISF001-2277, and Singapore EMA Energy Resilience under Grant NRF2017EWT-EP003-041.

Appendix A
Proof of Proposition 1

To prove Proposition 1, we give the derivative of $g_{i,k}$ with respect to $x_{j,l}$ as follows:

$$\frac{dg_{i,k}}{dx_{j,l}} = \frac{dx_{i,k}}{dx_{j,l}}\big(u_{i,k} - \bar{u}_i\big) + x_{i,k}\Big(\frac{du_{i,k}}{dx_{j,l}} - \frac{d\bar{u}_i}{dx_{j,l}}\Big),$$

where $\frac{du_{i,k}}{dx_{j,l}}$ is given by

$$\frac{du_{i,k}}{dx_{j,l}} = \gamma_k \left[ f'\big(h(\mathbf{x}_k)\big) M_j^d \frac{x_{i,k} M_i^d}{h(\mathbf{x}_k)} + f_k\big(h(\mathbf{x}_k)\big) \left( \frac{dx_{i,k}}{dx_{j,l}} \frac{M_j^d}{h(\mathbf{x}_k)} - \frac{x_{i,k} M_i^d M_j^d}{h^2(\mathbf{x}_k)} \right) \right] - \eta_{i,k} M_i^d \frac{dx_{i,k}}{dx_{j,l}},$$

where $h(\mathbf{x}_k) \triangleq \sum_{i \in \mathcal{N}} x_{i,k} M_i^d$, and $t$ is omitted for notational convenience. In the sequel, for all $(i, k) \in \mathcal{N} \times \mathcal{K}$, $\big|\frac{du_{i,k}}{dx_{j,l}}\big|$ is bounded for all $(j, l) \in \mathcal{N} \times \mathcal{K}$ due to the boundedness of $f(\cdot)$ and $f'(\cdot)$. Similarly, $\big|\frac{d\bar{u}_i}{dx_{j,l}}\big|$ is also bounded. These facts ensure that $\big|\frac{dg_{i,k}}{dx_{j,l}}\big|$ is bounded, which completes the proof.

Appendix B
Proof of Theorem 2

To prove that the function defined in (4) meets the Lyapunov conditions, we need to verify that $\nabla_t(y(t)) \leq 0$ for all values of $y(t) \neq 0$. The $\nabla_t(y(t))$ is given as follows:

$$\nabla_t\big(y(t)\big) = 2\Big[\sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} x_{i,k}(t)\Big] \Big[\sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} \dot{x}_{i,k}(t)\Big] = 2N \sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} x_{i,k}(t)\big(u_{i,k}(t) - \bar{u}_i(t)\big) = 2N \Big(\sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} x_{i,k}(t) u_{i,k}(t) - \sum_{i \in \mathcal{N}} \bar{u}_i\Big) = 0,$$

where the second equality uses $\sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} x_{i,k}(t) = N$, since each device's strategy probabilities sum to one, and the last equality follows from $\sum_{k \in \mathcal{K}} x_{i,k}(t) u_{i,k}(t) = \bar{u}_i(t)$. This states that the evolutionary game defined in (3) satisfies Lyapunov stability, i.e., the equilibrium is stable.

References

[1] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[2] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, C. Kiddon, J. Konecny, S. Mazzocchi, and H. B. McMahan, "Towards federated learning at scale: System design," arXiv preprint arXiv:1902.01046, 2019.
[3] H. B. McMahan, E. Moore, D. Ramage, and B. A. y Arcas, "Federated learning of deep networks using model averaging," CoRR, vol. abs/1602.05629, 2016.
[4] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, "Practical secure aggregation for federated learning on user-held data," arXiv preprint arXiv:1611.04482, 2016.
[5] J. Hofbauer and K. Sigmund, "Evolutionary game dynamics," Bulletin of the American Mathematical Society, vol. 40, no. 4, pp. 479–519, 2003.
[6] Encyclopedia of Mathematics, "Cauchy-Lipschitz theorem," https://www.encyclopediaofmath.org/index.php/Cauchy-Lipschitz_theorem.
[7] S. Sastry, "Lyapunov stability theory," Nonlinear Systems, pp. 182–234, 1999.
[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[9] B. Gu, F. Hu, and H. Liu, "Modelling classification performance for large data sets," in International Conference on Web-Age Information Management, 2001, pp. 317–328.
[10] Raspberry Pi Foundation, "Raspberry Pi 3 Model B," https://www.raspberrypi.org/products/raspberry-pi-3-model-b/.