Modeling Multileader-Follower Noncooperative Stackelberg Games

Cybernetics and Systems
An International Journal
ISSN: 0196-9722 (Print) 1087-6553 (Online) Journal homepage: http://www.tandfonline.com/loi/ucbs20
Cesar U. Solis, Julio B. Clempner & Alexander S. Poznyak
To cite this article: Cesar U. Solis, Julio B. Clempner & Alexander S. Poznyak (2016) Modeling
Multileader–Follower Noncooperative Stackelberg Games, Cybernetics and Systems, 47:8,
650-673, DOI: 10.1080/01969722.2016.1232121
To link to this article: http://dx.doi.org/10.1080/01969722.2016.1232121
Published online: 26 Oct 2016.
Cesar U. Solis (a), Julio B. Clempner (b), and Alexander S. Poznyak (a)
(a) Department of Control Automatics, Center for Research and Advanced Studies, Mexico City, Mexico
(b) Centro de Investigaciones Economicas, Administrativas y Sociales, Instituto Politecnico Nacional, Mexico City, Mexico
ABSTRACT
This paper presents a Stackelberg–Nash game for modeling multiple leaders and followers. The model involves two Nash games restricted by a Stackelberg game. We propose a computational approach to find the equilibrium point based on the extraproximal method for ergodic controlled finite Markov chains. The extraproximal method consists of a two-step iterated procedure: the first step is a prediction and the second is a basic adjustment of the previous step. We formulate the game as coupled nonlinear programming problems using the Lagrange principle. Tikhonov's regularization method is used to guarantee convergence to a unique equilibrium point. The validity of the method is demonstrated by applying this framework to model an oligopoly competition.
KEYWORDS
Extraproximal method; Markov chains; multiple leader–follower; Nash; Stackelberg games
Introduction
Brief Review
The equilibrium concept most commonly employed in multiplayer game theory is the Nash equilibrium, which assumes that players always make a best reply to what the other players are doing (Clempner and Poznyak 2011; Nash 1951). It describes a mathematical model in which all players simultaneously compete against each other in a noncooperative game. A strategic game in which there exists a leader that moves first, after which the followers move sequentially, must be modeled as a Stackelberg game (Van Stackelberg 1952, 2011). A classic Stackelberg game involves a single leader and a single follower. However, games with a small number of competing players are more common in practice.
We consider a multileader–follower game that has several dominant leaders and many dominated followers (Van Stackelberg 1952, 2011). Specifically, we model a multileader–follower Stackelberg game in which the leaders play a noncooperative game among themselves and the followers do likewise. For instance, in an oligopoly, a company must consider the effects of its actions on the rest of the companies: the actions of a major (leader) firm cause reactions in the other firms in the industry, whereas minor (follower) firms operate practically without affecting the environment of the other firms in the industry. For example, if a leader company tries to underprice the others, then the other leaders and followers will respond by also lowering prices.

CONTACT Julio B. Clempner, julio@clempner.name, Centro de Investigaciones Economicas, Administrativas y Sociales, Instituto Politecnico Nacional, Lauro Aguirre 120, col. Agricultura, Del. Miguel Hidalgo, Mexico City 11360, Mexico.
© 2016 Taylor & Francis Group, LLC

We use the joint format proposed by Tanaka and Yokoyama (1991) to implement the Nash equilibrium for leaders and followers. This approach does not require the compactness condition on the strategy set of each player and allows the characterization of the equilibrium point in each of the n-player games. Ekeland's theorem plays a fundamental role in the nonconvex minimization problem (Ekeland 1974, 1979).
In general, games of this type have many equilibria. The uniqueness of an equilibrium point is guaranteed by the so-called "strict diagonal condition" (Rosen 1965), which practically never holds. To establish the existence and characterization of the Nash equilibrium we present an original formulation in terms of coupled nonlinear programming problems, implementing the Lagrange principle and employing Tikhonov's regularization method. Regularization refers to a process of introducing additional information in order to solve an ill-posed problem. Specifically, Tikhonov regularization is a tradeoff between fitting the data and reducing a norm of the solution, ensuring the convergence of the objective functions to a local optimal policy.
Related Work
Sherali, Soyster, and Murphy (1983) develop a one leader and N followers
model where the followers operate under the Cournot assumption of zero
conjectural variation and the leader, called a Stackelberg firm, specifically
takes into account the reaction of the Cournot firms to its output proving
the existence and uniqueness of a Stackelberg–Nash–Cournot equilibrium.
Sherali (1984) presents a multiple leader Stackelberg model and analysis that
demonstrates how the leader-firms can utilize the true reaction curve of the
follower-firms providing sufficient conditions for some useful convexity
and differentiability properties of this function. Nishizaki and Sakawa
(2005) presented a method for obtaining Stackelberg solutions to two-level
integer problems with the upper- and lower-bound constraints through gen-
etic algorithms employing a zero-one bit string as an individual which is
penalized if the conditions are not satisfied in the artificial genetic systems.
Leyffer and Munson (2010) examine, for multileader–follower games, a variety of nonlinear optimization and nonlinear complementarity formulations of equilibrium problems with equilibrium constraints, distinguishing two broad cases: problems where the leaders can cost-differentiate and problems with price-consistent followers. Lu et al. (2007) extend the branch and bound algorithm to deal with a linear bilevel Stackelberg single-leader, multifollower decision problem by means of a linear referential-uncooperative bilevel multifollower decision model. Lung and Dumitrescu (2008) present a multiplayer game and propose a domination concept to detect the Nash equilibria using an evolutionary multi-objective algorithm. Zhang, Lu, and Gao (2008) propose a set of models to describe fuzzy multi-objective, multifollower (cooperative) bilevel programming problems with a single leader where the functions involve fuzzy coefficients and multiple objectives
at both levels. De Miguel and Xu (2009) study a noncooperative oligopoly
consisting of M leaders and N followers showing the existence of a unique
stochastic multiple-leader Stackelberg–Nash–Cournot equilibrium, and
proposing a computational approach to find the equilibrium and analyzing
its rate of convergence. Koh (2012) introduces an evolutionary algorithm
for the solution of a class of hierarchical leader-follower games known as
Equilibrium Problems with Equilibrium Constraints where the leader’s
payoffs are constrained not only by their competitor’s actions but also by
the behavior of the followers at the lower level which manifests in the form
of an equilibrium constraint. Sinha et al. (2014) consider a special case of a multiperiod multileader–follower Stackelberg competition model with nonlinear cost and demand functions and discrete production variables, using a computationally intensive nested evolutionary strategy to find an optimal solution for the model. Trejo, Clempner, and Poznyak (2015) present an approach for representing a leader and N followers based on the extraproximal method.
Main Results
In this paper we present a particular Stackelberg–Nash game approach for modeling multiple leaders and followers. We propose a computational approach to find the equilibrium point based on the extraproximal method for a class of ergodic controlled finite Markov chain games. In this game, each of the leaders proceeds as in a Nash game with respect to the other leaders (selecting their strategies simultaneously and noncooperatively among themselves). Likewise, the followers proceed as in a Nash game with respect to the other followers. However, leaders and followers are together in a Stackelberg game: the model involves two Nash games restricted by a Stackelberg game. The extraproximal method consists of a two-step iterated procedure where the first step is a prediction that calculates a preliminary approximation to the equilibrium point and the second is a basic adjustment of the previous prediction. Each equation of the extraproximal method is an optimization problem itself. We formulate the game as coupled nonlinear programming problems using the Lagrange principle, and Tikhonov's regularization method is used to guarantee the convergence of the method to a unique equilibrium point. Finally, the validity of the proposed method is demonstrated experimentally by applying this framework to model an oligopoly competition in the car industry.
Organization of the Paper

The rest of this paper is organized as follows. The next section presents the mathematical background needed for the rest of the paper. We then give a mathematical description of the Stackelberg–Nash solution for M leaders and N followers. Thereafter, the Lagrange principle and Tikhonov's regularization are presented. Subsequently, we describe the extraproximal Markov game model and some implementation details. Then, we present a numerical example of the multileader–follower Stackelberg optimization problem that models and analyzes competition in the car industry. Lastly, the conclusions and future work are summarized.
Controllable Markov Games

A controllable Markov chain is a 4-tuple $MC = \{S, A, \Upsilon, \Pi\}$ where $S$ is a finite set of states, $S \subset \mathbb{N}$; $A$ is the set of actions, which is a metric space (Clempner and Poznyak 2014; Poznyak, Najim, and Gomez-Ramirez 2000). For each $s \in S$, $A(s) \subset A$ is the non-empty set of admissible actions at state $s \in S$. Without loss of generality we may take $A = \cup_{s \in S} A(s)$; $\Upsilon = \{(s, a) \mid s \in S,\ a \in A(s)\}$ is the set of admissible state-action pairs, which is a measurable subset of $S \times A$; $\Pi = [\pi_{(i,j|k)}]$ is a stationary controlled transition matrix, where
\[
\pi_{(i,j|k)} \equiv P\left( s(n+1) = s_{(j)} \mid s(n) = s_{(i)},\ a(n) = a_{(k)} \right)
\]
represents the probability associated with the transition from state $s_{(i)}$ to state $s_{(j)}$ under an action $a_{(k)} \in A(s_{(i)})$ $(k = 1, \dots, M)$ at time $n = 0, 1, \dots$.
The dynamics of the game for Markov chains is described as follows. The game consists of $\iota = \overline{1, M+N}$ players and begins at the initial state $s^{\iota}(0)$ which (as well as the states further realized by the process) is assumed to be completely measurable. Each player $\iota$ is allowed to randomize, with distribution $d^{\iota}_{(k|i)}(n)$, over the pure action choices $a^{\iota}_{(k)} \in A^{\iota}\left(s^{\iota}_{(i)}\right)$, $i = \overline{1, N_\iota}$ and $k = \overline{1, M_\iota}$. The leaders correspond to $m = \overline{1, M}$ and the followers to $l = \overline{1, N}$. At each fixed strategy of the leaders $d^{m}_{(k|i)}(n)$ the followers make their strategy selection trying to realize a Nash equilibrium. From now on we will consider only stationary strategies $d^{\iota}_{(k|i)}(n) = d^{\iota}_{(k|i)}$. In the ergodic case, when all Markov chains are ergodic for any stationary strategy $d^{\iota}_{(k|i)}$, the distributions $P^{\iota}\left(s^{\iota}(n+1) = s_{(j_\iota)}\right)$ converge exponentially quickly to their limits $P^{\iota}\left(s = s_{(i)}\right)$ satisfying
\[
P^{\iota}\left(s_{(j_\iota)}\right) = \sum_{i_\iota = 1}^{N_\iota} \left( \sum_{k_\iota = 1}^{M_\iota} \pi^{\iota}_{(i_\iota, j_\iota | k_\iota)}\, d^{\iota}_{(k_\iota | i_\iota)} \right) P^{\iota}\left(s_{(i_\iota)}\right). \tag{1}
\]
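As a hedged illustration of Eq. (1), the following Python sketch computes the limiting distribution of a single player's chain under a fixed stationary strategy. The function name `stationary_distribution`, the array layout `pi[i, j, k]`, and the toy numbers are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def stationary_distribution(pi, d):
    """Solve P(s_j) = sum_i (sum_k pi[i, j, k] * d[i, k]) * P(s_i), cf. Eq. (1).

    pi[i, j, k]: probability of moving from state i to state j under action k.
    d[i, k]:     probability of choosing action k in state i (rows sum to 1).
    The layout and names are illustrative assumptions.
    """
    n_states = pi.shape[0]
    # Transition matrix of the chain induced by the stationary strategy d.
    T = np.einsum("ijk,ik->ij", pi, d)
    # Stack the balance equations (T^T - I) P = 0 with the normalization sum(P) = 1.
    A = np.vstack([T.T - np.eye(n_states), np.ones((1, n_states))])
    b = np.zeros(n_states + 1)
    b[-1] = 1.0
    P, *_ = np.linalg.lstsq(A, b, rcond=None)
    return P

# Toy two-state, two-action chain (made-up numbers).
pi = np.zeros((2, 2, 2))
pi[:, :, 0] = [[0.9, 0.1], [0.5, 0.5]]
pi[:, :, 1] = [[0.2, 0.8], [0.3, 0.7]]
d = np.array([[0.5, 0.5], [1.0, 0.0]])
P = stationary_distribution(pi, d)
```

Solving the balance equations directly, rather than iterating the chain, is a design choice of this sketch; it relies on the ergodicity assumption so that the solution is unique.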
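The cost function of each player, depending on the states and actions of all participants, is given by the values $W^{\iota}_{(i_1,k_1;\dots;i_{M+N},k_{M+N})}$, so that the "average cost function" $J^{\iota}$ for each player $\iota$ in the stationary regime can be expressed as
\[
J^{\iota}\left(c^{1}, \dots, c^{M+N}\right) := \sum_{i_1,k_1} \cdots \sum_{i_{M+N},k_{M+N}} W^{\iota}_{(i_1,k_1;\dots;i_{M+N},k_{M+N})} \prod_{\iota=1}^{M+N} c^{\iota}_{(i_\iota,k_\iota)}, \tag{2}
\]
where $c^{\iota} := \left[ c^{\iota}_{(i_\iota,k_\iota)} \right]_{i_\iota = \overline{1,N_\iota};\ k_\iota = \overline{1,M_\iota}}$ is a matrix with elements
\[
c^{\iota}_{(i_\iota,k_\iota)} = d^{\iota}_{(k_\iota|i_\iota)}\, P^{\iota}\left( s^{(\iota)} = s_{(i_\iota)} \right), \tag{3}
\]
satisfying
\[
c^{(\iota)} \in C^{(\iota)}_{adm} = \left\{ c^{(\iota)} :
\begin{array}{l}
\sum\limits_{i_\iota,k_\iota} c^{\iota}_{(i_\iota,k_\iota)} = 1, \quad c^{\iota}_{(i_\iota,k_\iota)} \ge 0, \\[4pt]
\sum\limits_{k_\iota} c^{\iota}_{(j_\iota,k_\iota)} = \sum\limits_{i_\iota,k_\iota} \pi^{\iota}_{(i_\iota,j_\iota|k_\iota)}\, c^{\iota}_{(i_\iota,k_\iota)}
\end{array}
\right\}, \tag{4}
\]
where
\[
W^{\iota}_{(i_1,k_1;\dots;i_{M+N},k_{M+N})} = \sum_{j_1} \cdots \sum_{j_{M+N}} J^{\iota}_{(i_1,j_1,k_1;\dots;i_{M+N},j_{M+N},k_{M+N})}(n) \prod_{\iota=1}^{M+N} \pi^{\iota}_{(i_\iota, j_\iota | k_\iota)}.
\]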
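Notice that by (3) it follows that
\[
P^{\iota}\left(s_{(i_\iota)}\right) = \sum_{k_\iota} c^{\iota}_{(i_\iota,k_\iota)}, \qquad
d^{\iota}_{(k_\iota|i_\iota)} = \frac{c^{\iota}_{(i_\iota,k_\iota)}}{\sum\limits_{k_\iota} c^{\iota}_{(i_\iota,k_\iota)}}. \tag{5}
\]
In the ergodic case $\sum_{k_\iota} c^{\iota}_{(i_\iota,k_\iota)} > 0$ for all $\iota = \overline{1, M+N}$. The individual aim of each participant is $J^{\iota}(c^{\iota}) \to \min\limits_{c^{(\iota)} \in C^{(\iota)}_{adm}}$. We have a conflict situation which can be resolved by the Stackelberg–Nash equilibrium concept discussed in detail below.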
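The change of variables in Eqs. (3)-(5) can be sketched as follows. This is a minimal illustration for a single player; the helper names `to_c`, `from_c`, and `is_admissible`, the array layout `pi[i, j, k]`, and the toy chain are assumptions introduced here, not part of the paper.

```python
import numpy as np

def to_c(d, P):
    """Eq. (3): c[i, k] = d[i, k] * P[i]."""
    return d * P[:, None]

def from_c(c):
    """Eq. (5): P[i] = sum_k c[i, k] and d[i, k] = c[i, k] / sum_k c[i, k]."""
    P = c.sum(axis=1)
    return c / P[:, None], P

def is_admissible(c, pi, tol=1e-9):
    """Check the simplex and balance constraints of Eq. (4) for pi[i, j, k]."""
    on_simplex = abs(c.sum() - 1.0) < tol and (c >= -tol).all()
    inflow = np.einsum("ijk,ik->j", pi, c)  # sum_{i,k} pi[i, j, k] * c[i, k]
    balanced = np.allclose(c.sum(axis=1), inflow, atol=tol)
    return bool(on_simplex and balanced)

# Toy two-state, two-action chain (made-up numbers) with its limiting distribution.
pi = np.zeros((2, 2, 2))
pi[:, :, 0] = [[0.9, 0.1], [0.5, 0.5]]
pi[:, :, 1] = [[0.2, 0.8], [0.3, 0.7]]
d = np.array([[0.5, 0.5], [1.0, 0.0]])
P = np.array([10 / 19, 9 / 19])  # stationary distribution of the induced chain

c = to_c(d, P)
d_back, P_back = from_c(c)
```

The division in `from_c` is well defined only when every state has positive stationary probability, which is the ergodicity condition stated above.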
The Stackelberg–Nash Solution

To satisfy the Nash condition for both leaders and followers, we require that, once each leader has chosen a strategy, no leader can benefit by changing strategy while the other leaders keep theirs unchanged. Then, they maximize the followers' utility subject to this constraint, satisfying the Nash condition as well.
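Let us introduce the variables
\[
\begin{aligned}
u^{m} &:= \mathrm{col}\, c^{(m)}, & U^{m} &:= C^{(m)}_{adm} & (m = \overline{1, M}), \\
v^{l} &:= \mathrm{col}\, c^{(l)}, & V^{l} &:= C^{(l)}_{adm} & (l = \overline{1, N}),
\end{aligned} \tag{6}
\]
and let us consider a game with M leaders and N followers.

The leaders' strategies are denoted by $u^{m} \in U^{m}$ $(m = \overline{1, M})$, where $U$ is a convex and compact set. Denote by $u = (u^{1}, \dots, u^{M})^{T} \in U$ the joint strategy of the leaders, and let $u^{\hat m}$ be the strategy of the rest of the leaders adjoint to $u^{m}$, namely,
\[
u^{\hat m} := \left( u^{1}, \dots, u^{m-1}, u^{m+1}, \dots, u^{M} \right)^{T} \in U^{\hat m} := \bigotimes_{q=1,\, q \ne m}^{M} U^{q},
\]
such that $u = (u^{m}, u^{\hat m})$ $(m = \overline{1, M})$.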
Leaders try to reach one of the Nash equilibria, that is, to find a joint strategy $u^{*} = (u^{1*}, \dots, u^{M*}) \in U$ satisfying, for any admissible $u^{m} \in U^{m}$ and any $m = \overline{1, M}$,
\[
F(u, \hat u(u)) := \sum_{m=1}^{M} \left[ \min_{u^{m} \in U^{m}} \psi_m\left(u^{m}, u^{\hat m}\right) - \psi_m\left(u^{m}, u^{\hat m}\right) \right], \tag{7}
\]
where $\hat u(u) = \left( u^{\hat 1 T}, \dots, u^{\hat M T} \right)^{T} \in \hat U \subseteq \mathbb{R}^{M(M-1)}$ (Tanaka and Yokoyama 1991). Here $\psi_m\left(u^{m}, u^{\hat m}\right)$ is the cost function of leader $m$, who plays the strategy $u^{m} \in U^{m}$ while the rest of the leaders play the strategy $u^{\hat m} \in U^{\hat m}$.
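If we consider the utopia point
\[
\bar u^{m} := \arg\min_{u^{m} \in U^{m}} \psi_m\left(u^{m}, u^{\hat m}\right), \tag{8}
\]
then we can rewrite Eq. (7) as follows:
\[
F(u, \hat u(u)) := \sum_{m=1}^{M} \left[ \psi_m\left(\bar u^{m}, u^{\hat m}\right) - \psi_m\left(u^{m}, u^{\hat m}\right) \right]. \tag{9}
\]
The functions $\psi_m\left(u^{m}, u^{\hat m}\right)$ $(m = \overline{1, M})$ are assumed to be convex in all their arguments.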
Remark 1: The function $F(u, \hat u(u))$ satisfies the Nash property
\[
\psi_m\left(\bar u^{m}, u^{\hat m}\right) - \psi_m\left(u^{m}, u^{\hat m}\right) \le 0 \tag{10}
\]
for any $u^{m} \in U^{m}$ and all $m = \overline{1, M}$.
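To make Eqs. (8)-(10) concrete, here is a small Python sketch for two leaders in a Cournot-style duopoly. The quadratic cost, the parameters `a` and `unit_cost`, and the function names are hypothetical choices for illustration only; the toy cost is convex in each leader's own strategy, which is all this sketch relies on.

```python
def cost(own, other, a=10.0, unit_cost=1.0):
    """Negative Cournot profit of one firm; a hypothetical toy cost, convex in `own`."""
    return -own * (a - own - other) + unit_cost * own

def best_response(other, a=10.0, unit_cost=1.0):
    """Utopia point of Eq. (8) for the toy cost, restricted to U^m = [0, a]."""
    return min(max((a - unit_cost - other) / 2.0, 0.0), a)

def regret_F(u):
    """F(u, u_hat(u)) of Eq. (9) for two leaders with strategies u = (u1, u2)."""
    total = 0.0
    for m in range(2):
        own, other = u[m], u[1 - m]
        # Each summand is the Nash-property term of Eq. (10), hence non-positive.
        total += cost(best_response(other), other) - cost(own, other)
    return total

f_eq = regret_F((3.0, 3.0))   # at the Cournot-Nash point (a - unit_cost) / 3
f_off = regret_F((1.0, 5.0))  # away from equilibrium
```

In this toy game `regret_F` vanishes exactly when every leader already plays a best response, and it is strictly negative otherwise.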
Definition 1: A strategy $u^{*} \in U_{adm}$ is said to be a Nash equilibrium of the leaders if
\[
u^{*} \in \operatorname{Arg} \min_{u \in U_{adm}} \left\{ F(u, \hat u(u)) \right\}. \tag{11}
\]
If $F(u, \hat u(u))$ is strictly convex then we have that
\[
u^{*} = \arg \min_{u \in U_{adm}} \left\{ F(u, \hat u(u)) \right\}.
\]
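Lemma 1: The Nash equilibrium $u^{*} \in U_{adm}$ for the leaders can be expressed in the joint format (Tanaka and Yokoyama 1991) as follows:
\[
\begin{aligned}
u^{*} &= \arg \min_{\hat u \in \hat U} F(u, \hat u(u)), \\
F(u, \hat u(u)) &:= \sum_{m=1}^{M} \left[ \psi_m\left(\bar u^{m}, u^{\hat m}\right) - \psi_m\left(u^{m}, u^{\hat m}\right) \right], \\
\psi_m\left(\bar u^{m}, u^{\hat m}\right) &:= \min_{z^{m} \in U^{m}} \psi_m\left(z^{m}, u^{\hat m}\right),
\end{aligned} \tag{12}
\]
for any $u^{m} \in U^{m}$ and all $m = \overline{1, M}$.
Proof. Summing (10) over $m = \overline{1, M}$ implies (12). Conversely, taking $\hat u^{l} = u^{l}$ for all $l \ne m$ in (12), which is valid for any admissible $\hat u^{l}$, we obtain (10).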
Remark 2: Notice that the condition $F(u, \hat u(u)) \le 0$ in (12) is equivalent to
\[
\max_{\hat u \in \hat U} F(u, \hat u(u)) \le 0
\]
for any fixed $\hat u \in \hat U$ and any $u \in U$.

Similarly, in this process the followers try to reach one of the Nash equilibria for the fixed strategies $v^{l} \in V^{l}$ $(l = \overline{1, N})$, where $V$ is a convex and compact set. Denote by $v = (v^{1}, \dots, v^{N})^{T} \in V$ the joint strategy of the followers, and let $v^{\hat l}$ be the strategy of the rest of the followers adjoint to $v^{l}$, namely,
\[
v^{\hat l} := \left( v^{1}, \dots, v^{l-1}, v^{l+1}, \dots, v^{N} \right)^{T} \in V^{\hat l} := \bigotimes_{q=1,\, q \ne l}^{N} V^{q},
\]
such that $v = (v^{l}, v^{\hat l})$ $(l = \overline{1, N})$.
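Followers try to reach one of the Nash equilibria, that is, to find a joint strategy $v^{*} = (v^{1*}, \dots, v^{N*}) \in V$ satisfying, for any admissible $v^{l} \in V^{l}$ and any $l = \overline{1, N}$,
\[
G(v, \hat v(v)) := \sum_{l=1}^{N} \left[ \min_{v^{l} \in V^{l}} \varphi_l\left(v^{l}, v^{\hat l}\right) - \varphi_l\left(v^{l}, v^{\hat l}\right) \right], \tag{13}
\]
where $\hat v(v) = \left( v^{\hat 1 T}, \dots, v^{\hat N T} \right)^{T} \in \hat V \subseteq \mathbb{R}^{N(N-1)}$ (Tanaka and Yokoyama 1991). Here $\varphi_l\left(v^{l}, v^{\hat l}\right)$ is the cost function of follower $l$, who plays the strategy $v^{l} \in V^{l}$ while the rest of the followers play the strategy $v^{\hat l} \in V^{\hat l}$.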
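We also consider the utopia point
\[
\bar v^{l} := \arg\min_{v^{l} \in V^{l}} \varphi_l\left(v^{l}, v^{\hat l}\right); \tag{14}
\]
then we can rewrite Eq. (13) as follows:
\[
G(v, \hat v(v)) := \sum_{l=1}^{N} \left[ \varphi_l\left(\bar v^{l}, v^{\hat l}\right) - \varphi_l\left(v^{l}, v^{\hat l}\right) \right]. \tag{15}
\]
The functions $\varphi_l\left(v^{l}, v^{\hat l}\right)$ $(l = \overline{1, N})$ are assumed to be convex in all their arguments.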
Remark 3: The function $G(v, \hat v(v))$ satisfies the Nash property
\[
\varphi_l\left(\bar v^{l}, v^{\hat l}\right) - \varphi_l\left(v^{l}, v^{\hat l}\right) \le 0 \tag{16}
\]
for any $v^{l} \in V^{l}$ and all $l = \overline{1, N}$.
Definition 2: A strategy $v^{*} \in V_{adm}$ is said to be a Nash equilibrium of the followers if
\[
v^{*} \in \operatorname{Arg} \min_{v \in V_{adm}} \left\{ G(v, \hat v(v)) \right\}. \tag{17}
\]
If $G(v, \hat v(v))$ is strictly convex then we have that
\[
v^{*} = \arg \min_{v \in V_{adm}} \left\{ G(v, \hat v(v)) \right\}.
\]
Remark 4: A result analogous to Lemma 1 holds for the followers.

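Definition 3: A game with M leaders and N followers is said to be a noncooperative Stackelberg–Nash game if
\[
F(u, \hat u(u) \mid v) = \sum_{m=1}^{M} \left[ \psi_m\left(\bar u^{m}, u^{\hat m} \mid v\right) - \psi_m\left(u^{m}, u^{\hat m} \mid v\right) \right],
\]
where
\[
F(u, \hat u(u) \mid v) \le 0 \quad \text{for any } u^{m} \in U^{m} \text{ and all } m = \overline{1, M},
\]
and
\[
G(v, \hat v(v)) \le 0, \qquad
G(v, \hat v(v)) := \sum_{l=1}^{N} \left[ \varphi_l\left(\bar v^{l}, v^{\hat l}\right) - \varphi_l\left(v^{l}, v^{\hat l}\right) \right] \quad \text{for any } v^{l} \in V^{l} \text{ and all } l = \overline{1, N},
\]
where $v^{\hat l}$ is the strategy of the rest of the followers adjoint to $v^{l}$, namely,
\[
v^{\hat l} := \left( v^{1}, \dots, v^{l-1}, v^{l+1}, \dots, v^{N} \right) \in V^{\hat l} := \bigotimes_{q=1,\, q \ne l}^{N} V^{q}
\]
and
\[
\bar v^{l} := \arg\min_{v^{l} \in V^{l}} \varphi_l\left(v^{l}, v^{\hat l} \mid u\right).
\]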
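Definition 4: Let $\psi_m\left(u^{m}, u^{\hat m} \mid v\right)$ be the cost functions of the leaders $(m = \overline{1, M})$ and $\varphi_l\left(v^{l}, v^{\hat l} \mid u\right)$ the cost functions of the followers $(l = \overline{1, N})$, where $u \in U$ and $v \in V$. A strategy $u^{*} \in U$ of the leaders together with the collection $v^{*} \in V$ of the followers is said to be a Stackelberg–Nash equilibrium if
\[
(u^{*}, v^{*}) \in \operatorname{Arg} \min_{u \in U,\, \hat u \in \hat U}\ \max_{v \in V,\, \hat v \in \hat V} \left\{ F(u, \hat u(u) \mid v) \;\middle|\; F(u, \hat u(u) \mid v) \le 0,\ G(v, \hat v(v) \mid u) \le 0 \right\}. \tag{18}
\]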
Lagrange Principle and Tikhonov’s Regularization

Tikhonov Regularization
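Since the loss functions are allowed to be non-strictly convex, an equilibrium point in the leaders' and the followers' game may not be unique. To provide the uniqueness of an equilibrium let us associate problem (18) with the so-called regularized problem, i.e.,
\[
\begin{aligned}
\left(u^{*}_{\delta}, v^{*}_{\delta}\right) &\in \operatorname{Arg} \min_{u \in U,\, \hat u \in \hat U}\ \max_{v \in V,\, \hat v \in \hat V} \left\{ F_{\delta}(u, \hat u(u) \mid v) \;\middle|\; F_{\delta}(u, \hat u(u) \mid v) \le 0,\ G_{\delta}(v, \hat v(v) \mid u) \le 0 \right\}, \\
F_{\delta}(u, \hat u(u) \mid v) &:= \sum_{m=1}^{M} \left[ \psi_m\left(\bar u^{m}, u^{\hat m} \mid v\right) - \psi_m\left(u^{m}, u^{\hat m} \mid v\right) \right] + \frac{\delta}{2} \left( \|u\|^{2} + \|\hat u(u)\|^{2} \right), \\
G_{\delta}(v, \hat v(v)) &:= \sum_{l=1}^{N} \left[ \varphi_l\left(\bar v^{l}, v^{\hat l}\right) - \varphi_l\left(v^{l}, v^{\hat l}\right) \right] + \frac{\delta}{2} \left( \|v\|^{2} + \|\hat v(v)\|^{2} \right).
\end{aligned} \tag{19}
\]
Now, the function $F_{\delta}(u, \hat u(u) \mid v)$ and the joint function $G_{\delta}(v, \hat v(v))$ are strictly convex if $\delta > 0$. It is evident that, for $\delta = 0$, problem (19) reduces to problem (18).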
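The effect of the regularizing term can be seen on a toy problem that is unrelated to the game itself. In the sketch below, which is only an illustration under assumed names, the function $(x_1 + x_2 - 1)^2$ is convex but not strictly convex (every point on the line $x_1 + x_2 = 1$ minimizes it), whereas adding $\frac{\delta}{2}\|x\|^2$ with $\delta > 0$ yields a unique minimizer.

```python
import numpy as np

def regularized_minimizer(delta):
    """Minimize (x1 + x2 - 1)^2 + (delta / 2) * ||x||^2 via its normal equations."""
    ones = np.ones((2, 2))
    # The gradient is hessian @ x - 2 * [1, 1]; the Hessian is invertible only for delta > 0.
    hessian = 2.0 * ones + delta * np.eye(2)
    return np.linalg.solve(hessian, 2.0 * np.ones(2))

x_small = regularized_minimizer(1e-6)  # close to the minimum-norm solution (0.5, 0.5)
x_large = regularized_minimizer(1.0)   # stronger regularization pulls toward the origin
```

For this toy function the regularized minimizer is $x_1 = x_2 = 2/(4 + \delta)$, so it approaches the minimum-norm point of the original solution set as $\delta \to 0$.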
Lagrange Principle Implementation

The nonlinear programming problem (19) may be resolved by the Lagrange
principle implementation. To do this, consider the Lagrange function
^; t; v; ^v; k; h; lÞ ¼ t
Ld ðu; u ^ðuÞjvÞ
lðFd ðu; u ^ðuÞjvÞ
tÞ þ hðFd ðu; u tÞ
d 2 �
þ kGd ðv; ^vðvÞjuÞ k þ h2 :
2
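It is easy to see that, if $\mu \ne 1$, the Lagrange function has no minimum in $t \in \mathbb{R}$ (we exclude this case). Taking $\mu = 1$ (which removes $t$ from $L_{\delta}(u, \hat u, t, v, \hat v, \lambda, \theta, 1)$), we have
\[
L_{\delta}(u, \hat u, t, v, \hat v, \lambda, \theta, \mu) = (1 + \theta)\, F_{\delta}(u, \hat u \mid v) + \lambda\, G_{\delta}(v, \hat v \mid u) - \frac{\delta}{2} \left( \lambda^{2} + \theta^{2} \right) := L_{\delta}(u, \hat u, v, \hat v, \lambda, \theta). \tag{20}
\]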
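In view of the strict convexity of (20) for $\delta > 0$, there exists a $\lambda^{*}_{\delta} \ge 0$ such that the following saddle-point inequalities hold:
\[
L_{\delta}\left(u^{*}_{\delta}, \hat u^{*}_{\delta}, v, \hat v, \lambda, \theta\right) \le L_{\delta}\left(u^{*}_{\delta}, \hat u^{*}_{\delta}, v^{*}_{\delta}, \hat v^{*}_{\delta}, \lambda^{*}_{\delta}, \theta^{*}_{\delta}\right) \le L_{\delta}\left(u, \hat u, v^{*}_{\delta}, \hat v^{*}_{\delta}, \lambda^{*}_{\delta}, \theta^{*}_{\delta}\right). \tag{21}
\]
The vector $\left(u^{*}_{\delta}, \hat u^{*}_{\delta}, v^{*}_{\delta}, \hat v^{*}_{\delta}, \lambda^{*}_{\delta}, \theta^{*}_{\delta}\right)$ can be interpreted as the $\delta$-approximation of the solution of problem (20). Thus, we can rewrite (19) using (20) as follows:
\[
\left(u^{*}_{\delta}, \hat u^{*}_{\delta}, v^{*}_{\delta}, \hat v^{*}_{\delta}, \lambda^{*}_{\delta}, \theta^{*}_{\delta}\right) \in \arg \min_{u \in U,\, \hat u \in \hat U}\ \max_{v \in V,\, \hat v \in \hat V,\, \lambda \ge 0,\, \theta \ge 0} \left\{ L_{\delta}(u, \hat u, v, \hat v, \lambda, \theta) \right\}. \tag{22}
\]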
The Extraproximal Method

The Proximal Format
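In the proximal format (see Antipin 2005), the relation (22) can be expressed as
\[
\begin{aligned}
\lambda^{*}_{\delta} &= \arg\max_{\lambda \ge 0} \left\{ -\tfrac{1}{2} \left\| \lambda - \lambda^{*}_{\delta} \right\|^{2} + \gamma L_{\delta}\left(u^{*}_{\delta}, \hat u^{*}_{\delta}, v^{*}_{\delta}, \hat v^{*}_{\delta}, \lambda, \theta^{*}_{\delta}\right) \right\}, \\
\theta^{*}_{\delta} &= \arg\max_{\theta \ge 0} \left\{ -\tfrac{1}{2} \left\| \theta - \theta^{*}_{\delta} \right\|^{2} + \gamma L_{\delta}\left(u^{*}_{\delta}, \hat u^{*}_{\delta}, v^{*}_{\delta}, \hat v^{*}_{\delta}, \lambda^{*}_{\delta}, \theta\right) \right\}, \\
u^{*}_{\delta} &= \arg\min_{u \in U} \left\{ \tfrac{1}{2} \left\| u - u^{*}_{\delta} \right\|^{2} + \gamma L_{\delta}\left(u, \hat u^{*}_{\delta}, v^{*}_{\delta}, \hat v^{*}_{\delta}, \lambda^{*}_{\delta}, \theta^{*}_{\delta}\right) \right\}, \\
\hat u^{*}_{\delta} &= \arg\min_{\hat u \in \hat U} \left\{ \tfrac{1}{2} \left\| \hat u - \hat u^{*}_{\delta} \right\|^{2} + \gamma L_{\delta}\left(u^{*}_{\delta}, \hat u, v^{*}_{\delta}, \hat v^{*}_{\delta}, \lambda^{*}_{\delta}, \theta^{*}_{\delta}\right) \right\}, \\
v^{*}_{\delta} &= \arg\max_{v \in V} \left\{ -\tfrac{1}{2} \left\| v - v^{*}_{\delta} \right\|^{2} + \gamma L_{\delta}\left(u^{*}_{\delta}, \hat u^{*}_{\delta}, v, \hat v^{*}_{\delta}, \lambda^{*}_{\delta}, \theta^{*}_{\delta}\right) \right\}, \\
\hat v^{*}_{\delta} &= \arg\max_{\hat v \in \hat V} \left\{ -\tfrac{1}{2} \left\| \hat v - \hat v^{*}_{\delta} \right\|^{2} + \gamma L_{\delta}\left(u^{*}_{\delta}, \hat u^{*}_{\delta}, v^{*}_{\delta}, \hat v, \lambda^{*}_{\delta}, \theta^{*}_{\delta}\right) \right\},
\end{aligned} \tag{23}
\]
where the solutions $u^{*}_{\delta}$, $\hat u^{*}_{\delta}$, $v^{*}_{\delta}$, $\hat v^{*}_{\delta}$, $\lambda^{*}_{\delta}$ and $\theta^{*}_{\delta}$ depend on the small parameters $\delta, \gamma > 0$.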
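To simplify the restrictions of the game, let us define the following extended variables:
\[
\tilde x := \begin{pmatrix} u \\ \hat u \end{pmatrix} \in \tilde X := U \times \hat U, \qquad
\tilde y := \begin{pmatrix} v \\ \hat v \\ \lambda \\ \theta \end{pmatrix} \in \tilde Y := V \times \hat V \times \mathbb{R}^{+} \times \mathbb{R}^{+}.
\]
The regularized Lagrange function can be represented by
\[
\tilde L_{\delta}(\tilde x, \tilde y) := L_{\delta}(u, \hat u, v, \hat v, \lambda, \theta).
\]
The equilibrium point that satisfies Eq. (23) can be expressed by
\[
\begin{aligned}
\tilde x^{*}_{\delta} &= \arg\min_{\tilde x \in \tilde X} \left\{ \tfrac{1}{2} \left\| \tilde x - \tilde x^{*}_{\delta} \right\|^{2} + \gamma \tilde L_{\delta}\left(\tilde x, \tilde y^{*}_{\delta}\right) \right\}, \\
\tilde y^{*}_{\delta} &= \arg\max_{\tilde y \in \tilde Y} \left\{ -\tfrac{1}{2} \left\| \tilde y - \tilde y^{*}_{\delta} \right\|^{2} + \gamma \tilde L_{\delta}\left(\tilde x^{*}_{\delta}, \tilde y\right) \right\}.
\end{aligned}
\]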
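Now, let us introduce the regularized function and the following vectors:
\[
\tilde w = \begin{pmatrix} \tilde w_1 \\ \tilde w_2 \end{pmatrix} \in \tilde X \times \tilde Y, \qquad
\tilde v = \begin{pmatrix} \tilde v_1 \\ \tilde v_2 \end{pmatrix} \in \tilde X \times \tilde Y,
\]
\[
\Theta_{\delta}(\tilde w, \tilde v) := \tilde L_{\delta}(\tilde w_1, \tilde v_2) - \tilde L_{\delta}(\tilde v_1, \tilde w_2).
\]
For $\tilde w_1 = \tilde x$, $\tilde w_2 = \tilde y$, $\tilde v_1 = \tilde v^{*}_1 = \tilde x^{*}_{\delta}$, and $\tilde v_2 = \tilde v^{*}_2 = \tilde y^{*}_{\delta}$ we have
\[
\Theta_{\delta}(\tilde w, \tilde v) := \tilde L_{\delta}\left(\tilde x, \tilde y^{*}_{\delta}\right) - \tilde L_{\delta}\left(\tilde x^{*}_{\delta}, \tilde y\right).
\]
In these variables, the relation (22) can be represented in a "short format" as
\[
\tilde v^{*} = \arg\min_{\tilde w \in \tilde X \times \tilde Y} \left\{ \tfrac{1}{2} \left\| \tilde w - \tilde v^{*} \right\|^{2} + \gamma \Theta_{\delta}\left(\tilde w, \tilde v^{*}\right) \right\}. \tag{24}
\]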
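Finally, for the extraproximal method we have that
1. First step
\[
\hat v_n = \arg\min_{\tilde w \in \tilde X \times \tilde Y} \left\{ \tfrac{1}{2} \left\| \tilde w - \tilde v_n \right\|^{2} + \gamma \Theta_{\delta}\left(\tilde w, \tilde v_n\right) \right\}
\]
2. Second step
\[
\tilde v_{n+1} = \arg\min_{\tilde w \in \tilde X \times \tilde Y} \left\{ \tfrac{1}{2} \left\| \tilde w - \tilde v_n \right\|^{2} + \gamma \Theta_{\delta}\left(\tilde w, \hat v_n\right) \right\}
\]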
The Extraproximal Format

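The general iterative format ($n = 0, 1, \dots$) of the extraproximal method (Antipin 2005), with some fixed admissible initial values ($u_0 \in U$, $\hat u_0 \in \hat U$, $v_0 \in V$, $\hat v_0 \in \hat V$, $\lambda_0 \ge 0$, and $\theta_0 \ge 0$), is as follows.
1. The first half-step (prediction):
\[
\begin{aligned}
\bar \lambda_n &= \arg\min_{\lambda \ge 0} \left\{ \tfrac{1}{2} \left\| \lambda - \lambda_n \right\|^{2} - \gamma L_{\delta}\left(u_n, \hat u_n, v_n, \hat v_n, \lambda, \theta_n\right) \right\}, \\
\bar \theta_n &= \arg\min_{\theta \ge 0} \left\{ \tfrac{1}{2} \left\| \theta - \theta_n \right\|^{2} - \gamma L_{\delta}\left(u_n, \hat u_n, v_n, \hat v_n, \bar \lambda_n, \theta\right) \right\}, \\
\bar u_n &= \arg\min_{u \in U} \left\{ \tfrac{1}{2} \left\| u - u_n \right\|^{2} + \gamma L_{\delta}\left(u, \hat u_n, v_n, \hat v_n, \bar \lambda_n, \bar \theta_n\right) \right\}, \\
\bar{\hat u}_n &= \arg\min_{\hat u \in \hat U} \left\{ \tfrac{1}{2} \left\| \hat u - \hat u_n \right\|^{2} + \gamma L_{\delta}\left(u_n, \hat u, v_n, \hat v_n, \bar \lambda_n, \bar \theta_n\right) \right\}, \\
\bar v_n &= \arg\min_{v \in V} \left\{ \tfrac{1}{2} \left\| v - v_n \right\|^{2} - \gamma L_{\delta}\left(u_n, \hat u_n, v, \hat v_n, \bar \lambda_n, \bar \theta_n\right) \right\}, \\
\bar{\hat v}_n &= \arg\min_{\hat v \in \hat V} \left\{ \tfrac{1}{2} \left\| \hat v - \hat v_n \right\|^{2} - \gamma L_{\delta}\left(u_n, \hat u_n, v_n, \hat v, \bar \lambda_n, \bar \theta_n\right) \right\},
\end{aligned} \tag{25}
\]
or, in the extended variables,
\[
\hat v_n = \arg\min_{\tilde w \in \tilde X \times \tilde Y} \left\{ \tfrac{1}{2} \left\| \tilde w - \tilde v_n \right\|^{2} + \gamma \Theta_{\delta}\left(\tilde w, \tilde v_n\right) \right\}. \tag{26}
\]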
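2. The second (basic) half-step:
\[
\begin{aligned}
\lambda_{n+1} &= \arg\min_{\lambda \ge 0} \left\{ \tfrac{1}{2} \left\| \lambda - \lambda_n \right\|^{2} - \gamma L_{\delta}\left(\bar u_n, \bar{\hat u}_n, \bar v_n, \bar{\hat v}_n, \lambda, \bar \theta_n\right) \right\}, \\
\theta_{n+1} &= \arg\min_{\theta \ge 0} \left\{ \tfrac{1}{2} \left\| \theta - \theta_n \right\|^{2} - \gamma L_{\delta}\left(\bar u_n, \bar{\hat u}_n, \bar v_n, \bar{\hat v}_n, \bar \lambda_n, \theta\right) \right\}, \\
u_{n+1} &= \arg\min_{u \in U} \left\{ \tfrac{1}{2} \left\| u - u_n \right\|^{2} + \gamma L_{\delta}\left(u, \bar{\hat u}_n, \bar v_n, \bar{\hat v}_n, \bar \lambda_n, \bar \theta_n\right) \right\}, \\
\hat u_{n+1} &= \arg\min_{\hat u \in \hat U} \left\{ \tfrac{1}{2} \left\| \hat u - \hat u_n \right\|^{2} + \gamma L_{\delta}\left(\bar u_n, \hat u, \bar v_n, \bar{\hat v}_n, \bar \lambda_n, \bar \theta_n\right) \right\}, \\
v_{n+1} &= \arg\min_{v \in V} \left\{ \tfrac{1}{2} \left\| v - v_n \right\|^{2} - \gamma L_{\delta}\left(\bar u_n, \bar{\hat u}_n, v, \bar{\hat v}_n, \bar \lambda_n, \bar \theta_n\right) \right\}, \\
\hat v_{n+1} &= \arg\min_{\hat v \in \hat V} \left\{ \tfrac{1}{2} \left\| \hat v - \hat v_n \right\|^{2} - \gamma L_{\delta}\left(\bar u_n, \bar{\hat u}_n, \bar v_n, \hat v, \bar \lambda_n, \bar \theta_n\right) \right\},
\end{aligned} \tag{27}
\]
or, in the "short format",
\[
\tilde v_{n+1} = \arg\min_{\tilde w \in \tilde X \times \tilde Y} \left\{ \tfrac{1}{2} \left\| \tilde w - \tilde v_n \right\|^{2} + \gamma \Theta_{\delta}\left(\tilde w, \hat v_n\right) \right\}. \tag{28}
\]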
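The prediction and basic half-steps can be tried out on a scalar toy saddle-point problem. The sketch below is not the game of this paper: it assumes the regularized function $L(x, y) = xy + \frac{\delta}{2}(x^2 - y^2)$, minimized in $x$ and maximized in $y$, keeps $\gamma$ and $\delta$ fixed instead of using decreasing sequences, and solves each proximal subproblem in closed form. The function name and default values are hypothetical.

```python
def extraproximal(x, y, gamma=0.5, delta=0.1, steps=200):
    """Two-step extraproximal iteration for L(x, y) = x*y + (delta/2)*(x**2 - y**2).

    x is the minimizing variable and y the maximizing one; the unique saddle
    point is (0, 0). Each proximal subproblem is solved in closed form.
    """
    scale = 1.0 / (1.0 + gamma * delta)
    for _ in range(steps):
        # Prediction half-step: proximal steps evaluated at the current point.
        x_bar = scale * (x - gamma * y)
        y_bar = scale * (y + gamma * x)
        # Basic half-step: same proximal steps, evaluated at the prediction.
        x, y = scale * (x - gamma * y_bar), scale * (y + gamma * x_bar)
    return x, y

x_end, y_end = extraproximal(1.0, 1.0)
```

For example, the prediction for $x$ comes from setting the derivative of $\frac{1}{2}(x - x_n)^2 + \gamma L(x, y_n)$ to zero, which gives $\bar x_n = (x_n - \gamma y_n)/(1 + \gamma\delta)$; the other three updates follow the same pattern.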
Convergence of the Parameters

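Theorem 1: Within the class of numerical sequences
\[
\gamma_n = \frac{\gamma_0}{(n + n_0)^{\gamma}}, \quad \gamma_0, n_0, \gamma > 0; \qquad
\delta_n = \frac{\delta_0}{(n + n_0)^{\delta}}, \quad \delta_0, \delta > 0,
\]
the step size $\gamma_n$ and the regularizing parameter $\delta_n$ satisfy the following conditions:
\[
0 < \gamma_n \to 0, \quad 0 < \delta_n \to 0 \quad \text{when } n \to \infty,
\]
\[
\sum_{n=0}^{\infty} \gamma_n \delta_n = \infty,
\]
\[
\frac{\gamma_n}{\delta_n} \to \lambda \ \text{which is small enough}, \qquad \frac{\left| \delta_{n+1} - \delta_n \right|}{\gamma_n \delta_n} \to 0 \quad \text{when } n \to \infty,
\]
for $\gamma + \delta \le 1$, $\gamma \ge \delta$, $\gamma < 1$.
Proof. From the estimate
\[
\gamma_n \delta_n = O\left( \frac{1}{n^{\gamma + \delta}} \right)
\]
we have that
\[
\left| \delta_{n+1} - \delta_n \right| = O\left( \frac{1}{n^{\delta}} - \frac{1}{(n+1)^{\delta}} \right)
= O\left( \frac{1}{(n+1)^{\delta}} \left[ \left( 1 + \frac{1}{n} \right)^{\delta} - 1 \right] \right)
= O\left( \frac{1}{(n+1)^{\delta}} \left[ \frac{\delta}{n} + o\left(\frac{1}{n}\right) \right] \right)
= O\left( \frac{1}{n^{\delta + 1}} \right),
\]
and
\[
\frac{\left| \delta_{n+1} - \delta_n \right|}{\gamma_n \delta_n} = O\left( \frac{1}{n^{1 - \gamma}} \right).
\]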
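Theorem 2: Let $u$ and $\hat u$ be two variables with nonnegative components of the leaders. Then, within the class of numerical sequences
\[
\gamma_n = \frac{\gamma_0}{(n + n_0)^{\gamma}}, \quad \gamma_0, n_0, \gamma > 0; \qquad
\delta_n = \frac{\delta_0}{(n + n_0)^{\delta}}, \quad \delta_0, \delta > 0,
\]
of the procedure given in Eq. (23), the rate of convergence for the leaders is given by
\[
\left\| u_n - u^{**} \right\| + \left\| \hat u_n - \hat u^{**} \right\| = O\left( \frac{1}{n^{\aleph}} \right),
\]
where
\[
\aleph = \min\{ \gamma - \delta;\ 1 - \gamma;\ \delta \}. \tag{29}
\]
The maximal rate $\aleph^{*}$ of convergence is attained for
\[
\gamma = \gamma^{*} = 2/3, \qquad \delta = \delta^{*} = 1/3. \tag{30}
\]
Proof. For $\aleph_0$ characterizing the rate of convergence given by
\[
r_n = \left\| u_n - u^{*}(\delta_n) \right\| + \left\| \hat u_n - \hat u^{*}(\delta_n) \right\| = O\left( \frac{1}{n^{\aleph_0}} \right),
\]
we have $\aleph_0 = \min\{\gamma - \delta;\ 1 - \gamma;\ \delta\}$. It follows from the linear dependence of the regularized Lagrange function on $\delta$ that
\[
\left\| u_n - u^{**} \right\| + \left\| \hat u_n - \hat u^{**} \right\| = r_n + O(\delta_n)
= O\left( \frac{1}{n^{\aleph_0}} \right) + O\left( \frac{1}{n^{\delta}} \right) = O\left( \frac{1}{n^{\min\{\aleph_0,\, \delta\}}} \right),
\]
which implies (29). The maximal value $\aleph^{*}$ of $\aleph$ is attained when $\gamma - \delta = 1 - \gamma = \delta$, i.e., when (30) holds.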
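The choice of exponents in (30) can be checked numerically. The following sketch evaluates the exponent of Eq. (29) on a coarse grid of admissible pairs $(\gamma, \delta)$ from Theorem 1; the grid resolution, the default constants, and the function names are assumptions made only for this illustration.

```python
def rate_exponent(gamma, delta):
    """Exponent of Eq. (29): min{gamma - delta, 1 - gamma, delta}."""
    return min(gamma - delta, 1.0 - gamma, delta)

def schedules(n, gamma0=1.0, delta0=1.0, n0=1, gamma=2 / 3, delta=1 / 3):
    """Step size gamma_n and regularizer delta_n from the class in Theorem 1."""
    return gamma0 / (n + n0) ** gamma, delta0 / (n + n0) ** delta

# Coarse grid search over admissible exponents
# (gamma + delta <= 1, gamma >= delta, gamma < 1).
grid = [i / 30 for i in range(1, 30)]
best = max(
    ((g, d) for g in grid for d in grid if g + d <= 1 + 1e-12 and g >= d),
    key=lambda gd: rate_exponent(*gd),
)
```

On this grid the maximizer coincides with the pair given in (30), where the three terms inside the minimum are equal.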
Remark 5: We have a similar rate of convergence for the followers, given by
\[
\left\| v_n - v^{**} \right\| + \left\| \hat v_n - \hat v^{**} \right\| = O\left( \frac{1}{n^{\aleph}} \right).
\]
The Extraproximal Markov Game Model

We consider a four-player Stackelberg game with two leaders (M = {1, 2}) and two followers (N = {3, 4}). Numbers are used to refer to players in a specific expression. We also assume that the number of strategies and actions in this description is finite and fixed. Here we apply the iterative quadratic method, which provides a significantly faster rate of convergence.
The Extraproximal Stackelberg Leader–Follower Nash Model

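The procedure for computing the Stackelberg leader–follower Nash equilibrium point with the extraproximal method consists of implementing the following "iterative rules". For simplicity, let us introduce the following notation:
\[
W^{(l)}_{(i_3,k_3;i_4,k_4;i_1,k_1;i_2,k_2)} = W^{(l)}, \qquad
c^{(l)}_{(i_l,k_l)} = c^{(l)}, \qquad
c^{(\hat l)}_{(i_l,k_l)} = c^{(\hat l)}.
\]
For $n = 0, 1, \dots$ let us define the vectors
\[
u = \begin{pmatrix} \mathrm{col}\, c^{(1)}(n) \\ \mathrm{col}\, c^{(2)}(n) \end{pmatrix}, \qquad
\hat u_n = u^{\hat m} = \begin{pmatrix} \mathrm{col}\, c^{(\hat 1)}(n) \\ \mathrm{col}\, c^{(\hat 2)}(n) \end{pmatrix},
\]
\[
\bar u^{m} = \begin{pmatrix} \mathrm{col}\, \bar c^{(1)}(n) \\ \mathrm{col}\, \bar c^{(2)}(n) \end{pmatrix}
= \begin{pmatrix}
\mathrm{col}\, \arg\min\limits_{z \in C^{(1)}_{adm}} J^{(1)}\left( c^{(4)}(n), c^{(3)}(n), z, c^{(\hat 2)}(n) \right) \\
\mathrm{col}\, \arg\min\limits_{z \in C^{(2)}_{adm}} J^{(2)}\left( c^{(4)}(n), c^{(3)}(n), c^{(\hat 1)}(n), z \right)
\end{pmatrix},
\]
\[
v = \begin{pmatrix} \mathrm{col}\, c^{(3)}(n) \\ \mathrm{col}\, c^{(4)}(n) \end{pmatrix}, \qquad
\hat v_n = v^{\hat l} = \begin{pmatrix} \mathrm{col}\, c^{(\hat 3)}(n) \\ \mathrm{col}\, c^{(\hat 4)}(n) \end{pmatrix},
\]
\[
\bar v^{l} = \begin{pmatrix} \mathrm{col}\, \bar c^{(3)}(n) \\ \mathrm{col}\, \bar c^{(4)}(n) \end{pmatrix}
= \begin{pmatrix}
\mathrm{col}\, \arg\min\limits_{z \in C^{(3)}_{adm}} J^{(3)}\left( z, c^{(\hat 4)}(n), c^{(1)}(n), c^{(2)}(n) \right) \\
\mathrm{col}\, \arg\min\limits_{z \in C^{(4)}_{adm}} J^{(4)}\left( c^{(\hat 3)}(n), z, c^{(1)}(n), c^{(2)}(n) \right)
\end{pmatrix}.
\]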
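For example, for the first half-step we have
\[
\bar \lambda_n = \arg\min_{\lambda \ge 0} \left\{ \tfrac{1}{2} (\lambda - \lambda_n)^{2} - \gamma L_{\delta}\left(u_n, \hat u_n, v_n, \hat v_n, \lambda, \theta_n\right) \right\},
\]
that is, with $\bar c^{(l)}$ denoting the blocks of the utopia points defined above,
\[
\begin{aligned}
\bar \lambda_n = \Bigg[ \frac{1}{1 + \gamma\delta} \Bigg( \lambda_n + \gamma \Bigg( & \sum_{i_3,k_3} \sum_{i_4,k_4} \sum_{i_1,k_1} \sum_{i_2,k_2} \left[ W^{(3)} \bar c^{(3)} c^{(\hat 4)}(n) c^{(1)}(n) c^{(2)}(n) - W^{(3)} c^{(3)}(n) c^{(\hat 4)}(n) c^{(1)}(n) c^{(2)}(n) \right] \\
+ & \sum_{i_3,k_3} \sum_{i_4,k_4} \sum_{i_1,k_1} \sum_{i_2,k_2} \left[ W^{(4)} c^{(\hat 3)}(n) \bar c^{(4)} c^{(1)}(n) c^{(2)}(n) - W^{(4)} c^{(\hat 3)}(n) c^{(4)}(n) c^{(1)}(n) c^{(2)}(n) \right] \\
+ & \frac{\delta}{2} \left( \left\| \begin{matrix} c^{(3)}(n) \\ c^{(4)}(n) \end{matrix} \right\|^{2} + \left\| \begin{matrix} c^{(\hat 3)}(n) \\ c^{(\hat 4)}(n) \end{matrix} \right\|^{2} \right) \Bigg) \Bigg) \Bigg]_{+},
\end{aligned}
\]
\[
\bar \theta_n = \arg\min_{\theta \ge 0} \left\{ \tfrac{1}{2} \left\| \theta - \theta_n \right\|^{2} - \gamma L_{\delta}\left(u_n, \hat u_n, v_n, \hat v_n, \bar \lambda_n, \theta\right) \right\},
\]
where
\[
\begin{aligned}
\bar \theta_n = \Bigg[ \frac{1}{1 + \delta\gamma} \Bigg( \theta_n + \gamma \Bigg( & \sum_{i_3,k_3} \sum_{i_4,k_4} \sum_{i_1,k_1} \sum_{i_2,k_2} \left[ W^{(1)} c^{(3)}(n) c^{(4)}(n) \bar c^{(1)} c^{(\hat 2)}(n) - W^{(1)} c^{(3)}(n) c^{(4)}(n) c^{(1)}(n) c^{(\hat 2)}(n) \right] \\
+ & \sum_{i_3,k_3} \sum_{i_4,k_4} \sum_{i_1,k_1} \sum_{i_2,k_2} \left[ W^{(2)} c^{(3)}(n) c^{(4)}(n) c^{(\hat 1)}(n) \bar c^{(2)} - W^{(2)} c^{(3)}(n) c^{(4)}(n) c^{(\hat 1)}(n) c^{(2)}(n) \right] \\
+ & \frac{\delta}{2} \left( \left\| \begin{matrix} c^{(1)}(n) \\ c^{(2)}(n) \end{matrix} \right\|^{2} + \left\| \begin{matrix} c^{(\hat 1)}(n) \\ c^{(\hat 2)}(n) \end{matrix} \right\|^{2} \right) \Bigg) \Bigg) \Bigg]_{+},
\end{aligned}
\]
and
\[
\bar u_n = \arg\min_{u \in U} \left\{ \tfrac{1}{2} \left\| u - u_n \right\|^{2} + \gamma L_{\delta}\left(u, \hat u_n, v_n, \hat v_n, \bar \lambda_n, \bar \theta_n\right) \right\},
\]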
where for Quadratic Programming we obtain the following matrices:
n!
Jðl1;l2Þ ¼ Q1 Q2 þ Q3 Q4 and I ¼ IðN1 M1 þN2 M2 �N1 M1 þN2 M2 Þ
2 2 r!ðn rÞ!
� � !
�
1 þ 1 þ hn cd I �
ckn Jðl1;l2Þ
H¼ � �
ck�n Jðl1;l2Þ
T
1 þ 1 þ h�n cd I
0 � P P P ð2Þ ð3Þ ^
1
c 1 þ h�n W c ðnÞcð4Þ ðnÞcð1Þ ðnÞ !
B i3 ;k3 i4 ;k4 i1 ;k1 C c ð0Þ
ðnÞ
f ¼ B � P P P ð1Þ ð3Þ C
@ ^ A
c 1 þ h�n W c ðnÞcð4Þ ðnÞcð2Þ ðnÞ cð1Þ ðnÞ
i3 ;k3 i4 ;k4 i2 ;k2
such that
XX ^
Q1ði1 ;k1 ;i2 ;k2 Þ ¼ W ð3Þ cð3Þ cð4Þ ðnÞ
i3 ;k3 i4 ;k4
XX ^
ð3Þ
ðnÞcð4Þ ðnÞ
Q2ði1 ;k1 ;i2 ;k2 Þ ¼ W ð3Þc
i3 ;k3 i4 ;k4
XX ^
Q3ði1 ;k1 ;i2 ;k2 Þ ¼ W ð4Þ cð3Þ ðnÞcð4Þ ðnÞ
i3 ;k3 i4 ;k4
XX ^
Q4ði1 ;k1 ;i2 ;k2 Þ ¼ W ð4Þ cð3Þ ðnÞcð4Þ ðnÞ
i3 ;k3 i4 ;k4
We leave the rest of the development for the reader.
Application Example
In this problem we consider the market for the car industry. There are four noncooperative players: two leaders and two followers that together capture approximately 95% of the total market. Leaders operate at a national level, whereas followers are regional carriers operating in specific major cities of the country. We suppose that collusion between firms to fix car prices is illegal, so firms reach industry agreements on pricing indirectly: companies accept a price leader in the oligopoly and set their prices with reference to the industry's price leaders. These companies benefit from significant cost advantages that make it difficult for new firms to enter the industry and that result from: (1) large scale of production, (2) experience in keeping operating costs down, and (3) predatory practices such as obtaining lower prices from suppliers and establishing exclusive agreements with the government.
Cars are classified in four segments as follows: (1) Luxury: higher-end cars that are not classified as sports cars; (2) Large: cars with greater speed, capacity, and occupant protection, designed with a sportier, crossover approach; (3) Midsize: cars with a sedan shape designed to seat four to six passengers comfortably; and (4) Small: hatchbacks and the shortest saloons, marketed at a low price.
Let us consider a customer who wishes to purchase a new car (and has never purchased one before). The customer will visit the location where all four carriers have an agency. Because prices are fixed, the carriers focus on strategies other than price in order to win customers and increase profits. These strategies may consist of each carrier charging a different amount for its monthly purchase plans and its special customer service price. After gathering the relevant information from all four, the customer is able to make a final purchasing decision.
Let $N_1 = N_2 = N_3 = N_4 = 4$ and $M_1 = M_2 = M_3 = M_4 = 2$ be the number of states and the number of actions, respectively.
The transition matrices are given by:
\[
\pi^{1}_{(i,j,1)} = \begin{bmatrix}
0.0004 & 0.0008 & 0.0001 & 0.9988 \\
0.5798 & 0.0000 & 0.3940 & 0.0262 \\
0.0106 & 0.2029 & 0.3649 & 0.4217 \\
0.4447 & 0.0714 & 0.0364 & 0.4475
\end{bmatrix}, \qquad
\pi^{1}_{(i,j,2)} = \begin{bmatrix}
0.2299 & 0.3494 & 0.1783 & 0.2424 \\
0.0215 & 0.0125 & 0.9528 & 0.0133 \\
0.9753 & 0.0128 & 0.0000 & 0.0119 \\
0.9910 & 0.0052 & 0.0038 & 0.0000
\end{bmatrix},
\]
\[
\pi^{2}_{(i,j,1)} = \begin{bmatrix}
0.1675 & 0.0041 & 0.8284 & 0.0000 \\
0.9424 & 0.0000 & 0.0272 & 0.0304 \\
0.2782 & 0.0000 & 0.3091 & 0.4127 \\
0.1897 & 0.6267 & 0.1836 & 0.0000
\end{bmatrix}, \qquad
\pi^{2}_{(i,j,2)} = \begin{bmatrix}
0.9844 & 0.0083 & 0.0024 & 0.0049 \\
0.1537 & 0.2547 & 0.4966 & 0.0951 \\
0.0086 & 0.0106 & 0.0000 & 0.9808 \\
0.3499 & 0.0193 & 0.5721 & 0.0587
\end{bmatrix},
\]
\[
\pi^{3}_{(i,j,1)} = \begin{bmatrix}
0.0080 & 0.0003 & 0.0006 & 0.9911 \\
0.5830 & 0.0628 & 0.0418 & 0.3124 \\
0.7968 & 0.0675 & 0.1355 & 0.0003 \\
0.0196 & 0.0171 & 0.0027 & 0.9606
\end{bmatrix}, \qquad
\pi^{3}_{(i,j,2)} = \begin{bmatrix}
0.3481 & 0.0460 & 0.0650 & 0.5410 \\
0.0104 & 0.0032 & 0.0684 & 0.9180 \\
0.0097 & 0.0005 & 0.0004 & 0.9894 \\
0.1310 & 0.5207 & 0.3221 & 0.0261
\end{bmatrix},
\]
\[
\pi^{4}_{(i,j,1)} = \begin{bmatrix}
0.0120 & 0.0067 & 0.8893 & 0.0920 \\
0.9638 & 0.0148 & 0.0120 & 0.0094 \\
0.1635 & 0.2017 & 0.2316 & 0.4031 \\
0.0711 & 0.0000 & 0.0219 & 0.9070
\end{bmatrix}, \qquad
\pi^{4}_{(i,j,2)} = \begin{bmatrix}
0.6844 & 0.0136 & 0.3017 & 0.0003 \\
0.8966 & 0.0081 & 0.0000 & 0.0954 \\
0.4769 & 0.2865 & 0.0000 & 0.2365 \\
0.3542 & 0.0761 & 0.0000 & 0.5697
\end{bmatrix},
\]
and the utility matrices are as follows:

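$$
J^{1}_{(i,j,1)} = \begin{bmatrix}
4 & 30 & 24 & 6 \\
2 & 26 & 0 & 12 \\
0 & 32 & 22 & 24 \\
28 & 30 & 26 & 14
\end{bmatrix}, \quad
J^{1}_{(i,j,2)} = \begin{bmatrix}
38 & 32 & 38 & 18 \\
12 & 28 & 22 & 30 \\
38 & 32 & 22 & 2 \\
30 & 32 & 26 & 2
\end{bmatrix}
$$

$$
J^{2}_{(i,j,1)} = \begin{bmatrix}
15 & 51 & 9 & 24 \\
3 & 30 & 18 & 21 \\
3 & 48 & 51 & 12 \\
6 & 45 & 27 & 3
\end{bmatrix}, \quad
J^{2}_{(i,j,2)} = \begin{bmatrix}
8 & 12 & 16 & 6 \\
8 & 20 & 16 & 12 \\
8 & 12 & 16 & 20 \\
4 & 10 & 18 & 0
\end{bmatrix}
$$

$$
J^{3}_{(i,j,1)} = \begin{bmatrix}
12 & 17 & 13 & 11 \\
1 & 17 & 9 & 7 \\
0 & 11 & 12 & 4 \\
5 & 15 & 11 & 1
\end{bmatrix}, \quad
J^{3}_{(i,j,2)} = \begin{bmatrix}
10 & 18 & 8 & 9 \\
4 & 5 & 1 & 3 \\
4 & 6 & 7 & 10 \\
5 & 7 & 8 & 10
\end{bmatrix}
$$

$$
J^{4}_{(i,j,1)} = \begin{bmatrix}
5 & 8 & 7 & 8 \\
1 & 4 & 3 & 5 \\
4 & 6 & 17 & 1 \\
12 & 0 & 7 & 3
\end{bmatrix}, \quad
J^{4}_{(i,j,2)} = \begin{bmatrix}
9 & 13 & 7 & 9 \\
5 & 0 & 7 & 4 \\
11 & 2 & 19 & 6 \\
3 & 10 & 14 & 5
\end{bmatrix}
$$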
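Because the tabulated probabilities are rounded to four decimals, it can be worth checking the data before using it. The following Python sketch, restricted to player 1 for brevity, verifies that every row of the two transition matrices sums to one within rounding error and computes a probability-weighted one-step utility for each state-action pair. The names `P1`, `J1`, and `W1`, the tolerance `2e-4`, and the choice of this particular aggregation are assumptions made for illustration, not quantities taken from the paper.

```python
import numpy as np

# Player 1 transition matrices, indexed P1[k][i, j]: action k, state i -> state j.
P1 = np.array([
    [[0.0004, 0.0008, 0.0001, 0.9988],
     [0.5798, 0.0000, 0.3940, 0.0262],
     [0.0106, 0.2029, 0.3649, 0.4217],
     [0.4447, 0.0714, 0.0364, 0.4475]],
    [[0.2299, 0.3494, 0.1783, 0.2424],
     [0.0215, 0.0125, 0.9528, 0.0133],
     [0.9753, 0.0128, 0.0000, 0.0119],
     [0.9910, 0.0052, 0.0038, 0.0000]],
])

# Player 1 utility matrices with the same indexing, J1[k][i, j].
J1 = np.array([
    [[4, 30, 24, 6],
     [2, 26, 0, 12],
     [0, 32, 22, 24],
     [28, 30, 26, 14]],
    [[38, 32, 38, 18],
     [12, 28, 22, 30],
     [38, 32, 22, 2],
     [30, 32, 26, 2]],
], dtype=float)

# Largest deviation of any row sum from 1; rounding to 4 decimals
# makes a deviation of about 1e-4 plausible, hence the assumed tolerance.
row_sums = P1.sum(axis=2)
max_dev = np.abs(row_sums - 1.0).max()
assert max_dev < 2e-4, "a row is not stochastic within the assumed tolerance"

# Illustrative aggregation: W1[i, k] = sum_j J1[k][i, j] * P1[k][i, j].
W1 = np.einsum("kij,kij->ik", J1, P1)

print("max row-sum deviation:", max_dev)
print("expected one-step utility (rows: states, columns: actions):")
print(np.round(W1, 4))
```

The same check applies unchanged to the other three players once their matrices are stacked in the same `[k][i, j]` layout.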
Figures 1 and 2 show the convergence of the parameters λ and θ, respectively.
Figures 3–6 show the convergence of the c variables.
Finally, applying Eq. (5), we have that the optimal strategies for each player
are as follows:
$$
d^{1*} = \begin{bmatrix}
0.3958 & 0.6042 \\
0.2709 & 0.7291 \\
0.6644 & 0.3356 \\
0.5158 & 0.4842
\end{bmatrix}, \quad
d^{2*} = \begin{bmatrix}
0.9568 & 0.0432 \\
0.0653 & 0.9347 \\
0.9763 & 0.0237 \\
0.9485 & 0.0515
\end{bmatrix}
$$

$$
d^{3*} = \begin{bmatrix}
0.4095 & 0.5905 \\
0.5909 & 0.4091 \\
0.8564 & 0.1436 \\
0.3905 & 0.6095
\end{bmatrix}, \quad
d^{4*} = \begin{bmatrix}
0.5286 & 0.4714 \\
0.5203 & 0.4797 \\
0.4740 & 0.5260 \\
0.4639 & 0.5361
\end{bmatrix}
$$

and the corresponding final utilities are given by:

$$
J^{1*} = \begin{bmatrix} 191.1676 \\ 61.1780 \\ 96.5544 \\ 171.6424 \end{bmatrix}, \quad
J^{2*} = \begin{bmatrix} 26.1762 \\ 36.4509 \\ 360.2074 \\ 261.3396 \end{bmatrix}
\tag{30}
$$

$$
J^{3*} = \begin{bmatrix} 25.2610 \\ 4.4015 \\ 4.7361 \\ 15.8519 \end{bmatrix}, \quad
J^{4*} = \begin{bmatrix} 20.5306 \\ 0.9443 \\ 17.0718 \\ 7.8000 \end{bmatrix}
\tag{31}
$$

Figure 1. Convergence of the parameter λ.

Figure 2. Convergence of the parameter θ.

Figure 3. Convergence of the strategies of Leader 1.

Figure 4. Convergence of the strategies of Leader 2.
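To connect the reported strategies with the ergodic behaviour of the chains, the sketch below builds the Markov chain that leader 1 induces by playing its reported strategy, computes its stationary distribution, and forms joint state-action weights from which the strategy can be recovered by row normalization. The names (`P1`, `d1`, `P_d`, `mu`, `c1`), the renormalization of the rounded rows, and the least-squares solve are illustrative assumptions. In particular, the normalization step is the usual relation between a joint distribution and a conditional policy, and it is only assumed here to correspond to Eq. (5).

```python
import numpy as np

# Player 1 transition matrices P1[k][i, j] (action k, state i -> state j), as tabulated above.
P1 = np.array([
    [[0.0004, 0.0008, 0.0001, 0.9988],
     [0.5798, 0.0000, 0.3940, 0.0262],
     [0.0106, 0.2029, 0.3649, 0.4217],
     [0.4447, 0.0714, 0.0364, 0.4475]],
    [[0.2299, 0.3494, 0.1783, 0.2424],
     [0.0215, 0.0125, 0.9528, 0.0133],
     [0.9753, 0.0128, 0.0000, 0.0119],
     [0.9910, 0.0052, 0.0038, 0.0000]],
])

# Reported strategy of leader 1: d1[i, k] = probability of action k in state i.
d1 = np.array([
    [0.3958, 0.6042],
    [0.2709, 0.7291],
    [0.6644, 0.3356],
    [0.5158, 0.4842],
])

# Transition matrix induced by the strategy: P_d[i, j] = sum_k d1[i, k] * P1[k][i, j].
P_d = np.einsum("ik,kij->ij", d1, P1)
# The tabulated entries are rounded, so renormalize each row (an assumption of this sketch).
P_d /= P_d.sum(axis=1, keepdims=True)

# Stationary distribution: solve mu @ P_d = mu together with sum(mu) = 1 by least squares.
n = P_d.shape[0]
A = np.vstack([P_d.T - np.eye(n), np.ones((1, n))])
b = np.append(np.zeros(n), 1.0)
mu, *_ = np.linalg.lstsq(A, b, rcond=None)

# Joint state-action weights, and the strategy recovered from them by row normalization.
c1 = mu[:, None] * d1
d1_recovered = c1 / c1.sum(axis=1, keepdims=True)

print("stationary distribution:", np.round(mu, 4))
print("recovered strategy matches:", np.allclose(d1_recovered, d1))
```

All entries of the induced matrix are strictly positive for this strategy, so the chain is ergodic and the stationary distribution is unique.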
Explanation of the Strategic Behavior

For segment 1 (Luxury cars), the most profitable segment for leader 1 (see (30)),
leader 1 divides its emphasis between monthly purchase plans (0.3958) and
customer service price (0.6042). Leader 2, however, focuses on offering monthly
purchase plans (0.9568) for the same segment, on the supposition that Luxury
cars are usually purchased on credit. Observing the strategies of leader 1 and
leader 2, follower 3 and follower 4 opt for a balance, competing on customer
service price with (0.5905) and (0.4714) and offering monthly purchase plans
with (0.4095) and (0.5286), respectively.

Figure 5. Convergence of the strategies of Follower 3.

Figure 6. Convergence of the strategies of Follower 4.
For segment 2, corresponding to Large cars, leader 1 chooses a mixed
strategy that assigns (0.7291) to customer service price and the remaining
(0.2709) to monthly purchase plans. Leader 2 competes even more strongly on
customer service price (0.9347). Again, follower 3 and follower 4 opt for a
balance, competing on customer service price with (0.4091) and (0.4797) and
offering monthly purchase plans with (0.5909) and (0.5203), respectively.
For Midsize cars (segment 3), leader 1 again selects a mixed strategy, giving
weight to monthly purchase plans (0.6644) and customer service price
(0.3356). Leader 2 concentrates almost entirely on monthly purchase plans
(0.9763), leaving little weight for customer service price (0.0237). Follower 3
competes strongly with leader 1 in this segment, giving importance to monthly
purchase plans (0.8564). Follower 4, on the other hand, assigns (0.4740) to
monthly purchase plans and the rest (0.5260) to customer service price.
For segment 4 (Small cars), leader 1 selects a balanced strategy, while leader
2 competes strongly on monthly purchase plans (0.9485), based on the
assumption that many people in this segment are not in a position to pay
cash for their cars. Follower 3 and follower 4 also select balanced
strategies.
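The reading of the strategies given in this section can be reproduced mechanically. The following sketch lists, for every player and segment, the action that receives the larger probability in the reported strategies. The dictionary layout and the names `SEGMENTS`, `ACTIONS`, `strategies`, and `dominant_actions` are hypothetical and serve only as an illustration.

```python
SEGMENTS = ["Luxury", "Large", "Midsize", "Small"]
ACTIONS = ["monthly purchase plans", "customer service price"]

# Reported strategies: one tuple per segment, one entry per action.
strategies = {
    "leader 1": [(0.3958, 0.6042), (0.2709, 0.7291), (0.6644, 0.3356), (0.5158, 0.4842)],
    "leader 2": [(0.9568, 0.0432), (0.0653, 0.9347), (0.9763, 0.0237), (0.9485, 0.0515)],
    "follower 3": [(0.4095, 0.5905), (0.5909, 0.4091), (0.8564, 0.1436), (0.3905, 0.6095)],
    "follower 4": [(0.5286, 0.4714), (0.5203, 0.4797), (0.4740, 0.5260), (0.4639, 0.5361)],
}


def dominant_actions(rows):
    """Return (segment, action, probability) for the most likely action in each segment."""
    result = []
    for segment, probs in zip(SEGMENTS, rows):
        k = max(range(len(probs)), key=lambda a: probs[a])
        result.append((segment, ACTIONS[k], probs[k]))
    return result


summary = {player: dominant_actions(rows) for player, rows in strategies.items()}

for player, rows in summary.items():
    for segment, action, prob in rows:
        print(f"{player:10s} {segment:8s} {action:24s} {prob:.4f}")
```

A probability close to 0.5 in this listing corresponds to what the text calls a balanced strategy, while values above 0.9 correspond to the cases described as competing strongly.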
Conclusion and Future Work

We have presented a novel formulation of a multileader–follower Stackelberg
model for M leaders and N followers. The approach is based on the extraproximal
method. To compute the Nash equilibrium for the leaders and for the followers,
we adopted the joint format proposed by Tanaka and Yokoyama (1991). We
formulated the game as coupled nonlinear programming problems using the
Lagrange principle and Tikhonov's regularization method, and solved them with
a quadratic programming method. Finally, the formulation was applied to an
oligopoly problem: we solved a concrete instance of the multileader–follower
Stackelberg model for the car industry market with two leaders and two
followers. A multi-objective extension of the problem will be considered in
future work. Moreover, an approach for computationally intensive settings
involving several leaders and followers would be of further interest.
References
Antipin, A. S. 2005. An extraproximal method for solving equilibrium programming problems
and games. Computational Mathematics and Mathematical Physics 45 (11):1893–1914.
Clempner, J. B., and A. S. Poznyak. 2011. Convergence method, properties and computational
complexity for Lyapunov games. International Journal of Applied Mathematics and
Computer Science 21 (2):349–61. doi:10.2478/v10006-011-0026-x
Clempner, J. B., and A. S. Poznyak. 2014. Simple computing of the customer lifetime value:
A fixed local-optimal policy approach. Journal of Systems Science and Systems Engineering
23 (4):439–59. doi:10.1007/s11518-014-5260-y
De Miguel, V., and H. Xu. 2009. A stochastic multiple-leader Stackelberg model: Analysis,
computation, and application. Operations Research 57:1220–35. doi:10.1287/opre.1080.0686
Ekeland, I. 1974. On the variational principle. Journal of Mathematical Analysis and
Applications 47 (2):324–53. doi:10.1016/0022-247x(74)90025-0
Ekeland, I. 1979. Nonconvex minimization problems. Bulletin of the American Mathematical
Society 1 (3):443–75. doi:10.1090/s0273-0979-1979-14595-6
Koh, A. 2012. An evolutionary algorithm based on Nash dominance for equilibrium problems
with equilibrium constraints. Applied Soft Computing 12 (1):161–73.
doi:10.1016/j.asoc.2011.08.056
Leyffer, S., and T. Munson. 2010. Solving multi-leader-follower games. Optimization Methods
and Software 25 (4):601–23. doi:10.1080/10556780903448052
Lu, J., C. Shi, G. Zhang, and D. Ruan. 2007. An extended branch and bound algorithm
for bilevel multi-follower decision making in a referential-uncooperative situation.
International Journal of Information Technology and Decision-Making 6 (2):371–88.
doi:10.1142/s0219622007002459
Lung, R. I., and D. Dumitrescu. 2008. Computing Nash equilibria by means of evolutionary
computation. International Journal of Computers Communications and Control 3:364–68.
Nash, J. F. 1951. Non-cooperative games. Annals of Mathematics 54:286–95.
doi:10.2307/1969529
Nishizaki, I., and M. Sakawa. 2005. Computational methods through genetic algorithms for
obtaining Stackelberg solutions to two-level integer programming problems. Cybernetics
and Systems 36 (6):565–79. doi:10.1080/01969720590961718
Poznyak, A. S., K. Najim, and E. Gomez-Ramirez. 2000. Self-learning control of finite Markov
chains. New York: Marcel Dekker.
Rosen, J. B. 1965. Existence and uniqueness of equilibrium points for concave n-person
games. Econometrica 33:520–34. doi:10.2307/1911749
Sherali, H. D. 1984. A multiple leader Stackelberg model and analysis. Operations Research 32
(2):390–404. doi:10.1287/opre.32.2.390
Sherali, H. D., A. L. Soyster, and F. Murphy. 1983. Stackelberg–Nash–Cournot equilibria:
Characterizations and computations. Operations Research 31 (2):253–76.
doi:10.1287/opre.31.2.253
Sinha, A., P. Malo, A. Frantsev, and K. Deb. 2014. Finding optimal strategies in a multi-period
multi-leader-follower Stackelberg game using an evolutionary algorithm. Computers &
Operations Research 41:374–85. doi:10.1016/j.cor.2013.07.010
Tanaka, K., and K. Yokoyama. 1991. On ε-equilibrium point in a noncooperative n-person
game. Journal of Mathematical Analysis and Applications 160:413–23.
doi:10.1016/0022-247x(91)90314-p
Trejo, K. K., J. B. Clempner, and A. S. Poznyak. 2015. Computing the Stackelberg/Nash
equilibria using the extraproximal method: Convergence analysis and implementation
details for Markov chains games. International Journal of Applied Mathematics and
Computer Science 25 (2):337–51. doi:10.1515/amcs-2015-0026
Van Stackelberg, H. 1952. The theory of market economy. New York, NY: Oxford University
Press.
Van Stackelberg, H. 2011. Market structure and equilibrium. 1st ed. Translation into English.
Berlin-Heidelberg: Springer.
Zhang, G., J. Lu, and Y. Gao. 2008. Fuzzy bilevel programming: multi-objective and multi-
follower with shared variables. International Journal of Uncertainty Fuzziness and
Knowledge 16:105–33. doi:10.1142/s0218488508005510
