
Defence Technology 18 (2022) 368-383

Contents lists available at ScienceDirect

Defence Technology
journal homepage: www.keaipublishing.com/en/journals/defence-technology

Air combat decision-making of multiple UCAVs based on constraint strategy games
Shou-yi Li, Mou Chen*, Yu-hui Wang, Qing-xian Wu
College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 211100, China

Article info

Article history:
Received 7 September 2020
Received in revised form 10 January 2021
Accepted 20 January 2021
Available online 5 February 2021

Keywords:
Game theory
Time-sensitive information
Constraint strategy games
Polytope strategy games
Multiple UCAVs
Air combat decision-making

Abstract

Game theory can be applied to the air combat decision-making problem of multiple unmanned combat air vehicles (UCAVs). However, it is difficult to have satisfactory decision-making results completely relying on air combat situation information, because there is a lot of time-sensitive information in a complex air combat environment. In this paper, a constraint strategy game approach is developed to generate intelligent decision-making for multiple UCAVs in complex air combat environment with air combat situation information and time-sensitive information. Initially, a constraint strategy game is employed to model attack-defense decision-making problem in complex air combat environment. Then, an algorithm is proposed for solving the constraint strategy game based on linear programming and linear inequality (CSG-LL). Finally, an example is given to illustrate the effectiveness of the proposed approach.

© 2021 China Ordnance Society. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

* Corresponding author.
E-mail address: chenmou@nuaa.edu.cn (M. Chen).
Peer review under responsibility of China Ordnance Society.

https://doi.org/10.1016/j.dt.2021.01.005
2214-9147/© 2021 China Ordnance Society. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

The development of science, technology and artificial intelligence makes it possible to achieve autonomous air combat decision-making for unmanned combat air vehicles (UCAVs) in military warfare [1,2]. Owing to the limitations of detection capabilities and weapon attacks, it is difficult for a single UCAV to complete a complex air combat mission. Multiple UCAVs can therefore cooperate in combat, perceive the battlefield situation through the information shared over the communication equipment, and realize coordinated task allocation, search, reconnaissance and attack, which can effectively improve the survivability of UCAVs and the overall combat effectiveness [3,4]. It is thus particularly important to realize autonomous air combat decision-making during the cooperative operation of UCAVs. The development of this technology will open up a new situation in future wars [5-7].

In the existing literature, the air combat decision-making of multiple UCAVs is usually treated as a unilateral optimization problem, which considers only the best strategies of one UCAV group and does not predict or analyze the opponent's strategies. It usually adopts the following intelligent methods to obtain the decision result. In Ref. [8], a decision method based on an expert system was proposed, but this method relies on expert experience, and the selected strategy may not be the best one. A framework for multiple-UAV task assignment based on a hierarchical task assignment method was constructed in Ref. [9], in which the original problem was broken down into three levels of sub-problems (target clustering, cluster allocation and target assignment), which reduces the computational complexity of the task assignment problem. An evolutionary algorithm from machine learning was used to train large-scale UCAVs in Ref. [10]; as the number of training runs increases, the UCAVs show increasing intelligence. These results represent valuable progress, but the impact of enemy strategies on the optimal solution has not been considered. Since the UCAV air combat decision-making system involves the interaction of two or more participants, it is unreasonable to consider only the actions of one party in the air combat decision-making of multiple UCAVs.

Game theory is one of the effective tools for analyzing the competitive relationship among multiple participants, and it has played a significant role in economics, management sciences, operational research, engineering, social sciences, biology, etc. [11]. Game theory also has many applications to special issues in the field of air combat decision-making, such as the use of differential games to study pursuit-evasion, missile guidance, etc. [12],
but the study of game theory in the air combat decision-making of multiple UCAVs has only just started. Game theory was first used to study the confrontation between two helicopters in Ref. [13]. Although the game model was relatively simple, it inspired later research. An attrition-type discrete-time dynamic model was formulated for two opposing forces in Ref. [14], in which a one-step look-ahead moving horizon optimization was considered. However, this is only a static game model and does not always have a pure strategy Nash equilibrium solution. A moving horizon solution approach was proposed in Ref. [15] to reduce the computational difficulty of complex air combat systems. Many researchers have also applied other game methods to decision-making problems, such as stochastic games [16], dynamic games [4], differential games [17] and fuzzy games [18], and achieved many valuable results. However, these methods rarely take into account the impact of time-sensitive information on decision-making results in the complex air combat environment.

There is often a lot of time-sensitive information in a complex air combat environment [19], such as detecting that a certain target is the control center of the enemy UCAVs, that some enemy UCAVs are predicted to attack our ground base, that some of our UCAVs have an equipment failure, or that certain targets are entering dangerous airspace. Time-sensitive information can greatly affect air combat decisions, so it has a higher priority than air combat situation information and needs to be considered first [20,21]. Decisions for time-sensitive information are given by the airborne expert system, which specifies the probability range for executing certain strategies under specific time-sensitive information. The decision-making rules of the airborne expert system are based on the experience of military experts, which are quantitative and vague. Therefore, the decisions for time-sensitive information are manifested in the estimation of the probability of executing specific strategies. For example, if a target is the control center of the fleet, the target should be attacked with a greater probability; if a certain UCAV has an equipment failure, that UCAV should perform the attack mission with a small probability. Decisions for time-sensitive information form linear constraints on the mixed strategies in the game, and under these linear constraints, air combat decisions are made based on the air combat situation information. This kind of game model with linear constraints is called the constraint strategy game.

Traditionally, the Nash equilibria of mixed strategy games can be calculated by solving a pair of dual linear programming problems [22], because each mixed strategy of a player is a convex combination of pure strategies, which turns the infinitely many inequality constraints of the Nash equilibrium into finitely many inequality constraints. We find that in constraint strategy games, the strategy space of each player is a polytope, and each strategy of a player is a convex combination of the extreme points of its strategy set. Therefore, the constraint strategy game can be solved in a similar way to the mixed strategy game.

The main contributions of this paper are given as follows:

(i) Considering the time-sensitive information in the complex air combat environment, a constraint strategy game model is presented.
(ii) The polytope strategy games are studied and some general results are given, which are specified as an algorithm (CSG-LL algorithm) to solve the constraint strategy game. Thereby, the air combat decision-making results of multiple UCAVs are obtained.

The structure of this paper is as follows. In Section 2, an air combat situation assessment approach and a constraint strategy game are introduced. In Section 3, the polytope strategy game is discussed and some results on its solutions are given, which are specified as the CSG-LL algorithm to solve the constraint strategy game. Thereby, the air combat decision-making results are obtained. In Section 4, an example is given to illustrate the effectiveness of the proposed approach.

The notations in this paper are shown in Table 1.

2. Problem formulation and preliminaries

Assume that there are two confrontational UCAV groups in air combat, the Red UCAV group $R$ and the Blue UCAV group $B$. UCAV group $R$ consists of $m$ UCAVs, denoted as $R_1, R_2, \ldots, R_m$, and UCAV group $B$ consists of $n$ UCAVs, denoted as $B_1, B_2, \ldots, B_n$. The air combat decision-making process of $R$ and $B$ includes modeling the air combat situation assessment and solving the proposed constraint strategy game, which is shown in Fig. 1 [23].

2.1. Review of air combat situation assessment

The air combat situation of $R_i$ $(i = 1, 2, \ldots, m)$ and $B_j$ $(j = 1, 2, \ldots, n)$ is given in Fig. 2 [24], where the symbols are defined in Table 1.

UCAVs with different positions, speeds, weapons and equipment have different decision results. To describe these differences, the air combat situation assessment is conducted first, which is the foundation of the following air combat decision-making.

Through the airborne sensors and ground station, the following parameters of each UCAV can be obtained: the position, the speed, the maximum missile attack distance, the maximum radar detection distance and the number of missiles carried, as shown in Table 1.

The situation assessment includes angle priority, velocity priority, distance priority, high priority and performance priority [24,25]. The priority computation equations for $R$ to $B$ are as follows:

(1) Angle priority $P^{r}_{ija}$ [24]. It depicts the effect of the position angle $\varphi_{ij}$ and the target entry angle $\rho_{ij}$ on air combat situation assessment. The smaller $|\varphi_{ij}|$ is, the more favorable the position for attack occupied by $R_i$, and the smaller $|\rho_{ij}|$ is, the less likely $R_i$ will be attacked by $B_j$. The angle priority equation is given by

$$P^{r}_{ija} = 1 - \frac{|\varphi_{ij}| + |\rho_{ij}|}{2\pi} \tag{1}$$

where $\varphi_{ij} = \arccos \dfrac{(\boldsymbol{P}^b_j - \boldsymbol{P}^r_i) \cdot \boldsymbol{V}^r_i}{|\boldsymbol{P}^b_j - \boldsymbol{P}^r_i| \cdot |\boldsymbol{V}^r_i|}$ is the position angle of $R_i$, $\rho_{ij} = \arccos \dfrac{(\boldsymbol{P}^b_j - \boldsymbol{P}^r_i) \cdot \boldsymbol{V}^b_j}{|\boldsymbol{P}^b_j - \boldsymbol{P}^r_i| \cdot |\boldsymbol{V}^b_j|}$ is the target entry angle, in which $\boldsymbol{P}^r_i = (P^r_{ix}, P^r_{iy}, P^r_{iz})$, $\boldsymbol{P}^b_j = (P^b_{jx}, P^b_{jy}, P^b_{jz})$, $\boldsymbol{V}^r_i = (V^r_{ix}, V^r_{iy}, V^r_{iz})$, $\boldsymbol{V}^b_j = (V^b_{jx}, V^b_{jy}, V^b_{jz})$, "$\cdot$" is the inner product of two vectors, and "$|\cdot|$" is the modulus of a vector. The definitions of the following symbols are similar.

(2) Velocity priority $P^{r}_{ijv}$ [24]. It depicts the effect of the relative velocity on air combat situation assessment. A larger $|\boldsymbol{V}^r_i| / |\boldsymbol{V}^b_j|$ gives $R_i$ more initiative, so that it is easier to shoot down the target. The velocity priority equation is written as

$$P^{r}_{ijv} = \begin{cases} 0.1, & |\boldsymbol{V}^r_i| / |\boldsymbol{V}^b_j| < 0.6 \\ |\boldsymbol{V}^r_i| / |\boldsymbol{V}^b_j| - 0.5, & 0.6 \le |\boldsymbol{V}^r_i| / |\boldsymbol{V}^b_j| \le 1.5 \\ 1, & |\boldsymbol{V}^r_i| / |\boldsymbol{V}^b_j| > 1.5 \end{cases} \tag{2}$$

Table 1
Notation table.

Symbol | Meaning
$R$, $B$ | Red UCAV group, Blue UCAV group
$R_i$, $B_j$ | $i$-th UCAV of $R$, $j$-th UCAV of $B$
$P^r_{ix}, P^r_{iy}, P^r_{iz}$ | X, Y, Z coordinates of the position of $R_i$
$V^r_{ix}, V^r_{iy}, V^r_{iz}$ | X, Y, Z coordinates of the speed of $R_i$
$P^b_{jx}, P^b_{jy}, P^b_{jz}$ | X, Y, Z coordinates of the position of $B_j$
$V^b_{jx}, V^b_{jy}, V^b_{jz}$ | X, Y, Z coordinates of the speed of $B_j$
$M^r_i$, $M^b_j$ | missile attack distance of $R_i$, of $B_j$
$A^r_i$, $A^b_j$ | radar detection distance of $R_i$, of $B_j$
$L^r_i$, $L^b_j$ | number of carried missiles of $R_i$, of $B_j$
$\varphi_{ij}$ | position angle of $R_i$
$\rho_{ij}$ | target entry angle of $R_i$
$\boldsymbol{V}^r_i$, $\boldsymbol{V}^b_j$ | speed vector of $R_i$, of $B_j$
$h_{ij}$ | height difference from $R_i$ to $B_j$
$D_{ij}$ | distance between $R_i$ and $B_j$
$P^r_{ija}$ | angle priority of $R_i$ to $B_j$
$P^r_{ijv}$ | velocity priority of $R_i$ to $B_j$
$P^r_{ijd}$ | distance priority of $R_i$ to $B_j$
$P^r_{ijh}$ | high priority of $R_i$ to $B_j$
$P^r_{ijc}$ | performance priority of $R_i$ to $B_j$
$P^r_{ij}$ | situation priority of $R_i$ to $B_j$
$\cdot$ | inner product of two vectors
$|\cdot|$ | vector modulus or number of elements
$G$ | pure strategy game
$S^r$, $S^b$ | pure strategy set of $R$, of $B$
$u^r$, $u^b$ | utility function of $R$, of $B$ in $G$
$s^r_m$, $s^b_n$ | a pure strategy of $R$, of $B$
$P^r$, $P^b$ | attack probability matrix of $R$, of $B$
$V^r$, $V^b$ | value matrix of $R$, of $B$
$P^r_i$ | joint attack probability to $R_i$
$P^b_j$ | joint attack probability to $B_j$
$(s^{r*}, s^{b*})$ | a Nash equilibrium of $G$
$G_e$ | mixed strategy game
$\mathcal{S}^r$, $\mathcal{S}^b$ | mixed strategy set of $R$, of $B$
$\sigma^r$, $\sigma^b$ | a mixed strategy of $R$, of $B$
$U^r$, $U^b$ | utility function of $R$, of $B$ in $G_e$
$\varepsilon^r$, $\varepsilon^b$ | a perturbation vector of $R$, of $B$
$\varepsilon$ | a perturbation vector of $G$
$G_e(\varepsilon)$ | constraint strategy game
$\mathcal{S}^r(\varepsilon^r)$, $\mathcal{S}^b(\varepsilon^b)$ | the constraint strategy set of $R$, of $B$
$U^r_\varepsilon$, $U^b_\varepsilon$ | utility function of $R$, of $B$ in $G_e(\varepsilon)$
$(\sigma^{r*}, \sigma^{b*})$ | a Nash equilibrium of $G_e(\varepsilon)$
$\mathbb{R}^d$ | $d$-dimensional Euclidean space
$c^r_l$, $c^b_k$ | extreme points of a polytope
$G_p$ | polytope strategy game
$X^r$, $X^b$ | strategy set of $R$, of $B$ in $G_p$
$x^r$, $x^b$ | a strategy in $X^r$, in $X^b$
$U^r_p$, $U^b_p$ | utility function of $R$, of $B$ in $G_p$
$U_p$ | $U_p \triangleq U^r_p = -U^b_p$
$\underline{v}$ | maximum (maxmin) value of $G_p$
$\overline{v}$ | minimum (minmax) value of $G_p$
$v$ | the value of $G_p$
$(x^{r*}, x^{b*})$ | a Nash equilibrium of $G_p$
$x_l$, $w$ | variables of a linear program
$V$ | optimal value of a linear program
$\eta^r_l$, $\eta^b_k$ | variables of linear inequalities

(3) Distance priority $P^{r}_{ijd}$ [25]. It depicts the effect of the relative distance, the maximum missile attack distance and the maximum radar detection distance. $R_i$ can attack effectively only if the relative distance between $R_i$ and $B_j$ is within its maximum missile attack distance and its maximum radar detection distance.

The distance priority $P^{r}_{ijd}$ is discussed according to the maximum missile attack distance and the maximum radar detection distance of $R_i$ and $B_j$.

a) If the performance of $B_j$ is better than that of $R_i$, that is, $M^r_i < M^b_j < A^r_i < A^b_j$, we have

$$P^{r}_{ijd} = \begin{cases} 0, & D_{ij} \le M^r_i, \text{ or } M^b_j \le D_{ij} \le A^r_i, \text{ or } A^b_j \le D_{ij} \\ 0.4 \, \dfrac{M^b_j - D_{ij}}{M^b_j - M^r_i}, & M^r_i < D_{ij} < M^b_j \\ 0.2 \, \dfrac{A^b_j - D_{ij}}{A^b_j - A^r_i}, & A^r_i < D_{ij} < A^b_j \end{cases} \tag{3}$$

where $D_{ij} = |\boldsymbol{P}^r_i - \boldsymbol{P}^b_j|$ is the distance between $R_i$ and $B_j$.

b) If the performance of $R_i$ is better than that of $B_j$, that is, $M^b_j < M^r_i < A^b_j < A^r_i$, we obtain

$$P^{r}_{ijd} = \begin{cases} 0, & D_{ij} \le M^b_j, \text{ or } M^r_i \le D_{ij} \le A^b_j, \text{ or } A^r_i \le D_{ij} \\ 0.4 \, \dfrac{M^r_i - D_{ij}}{M^r_i - M^b_j}, & M^b_j < D_{ij} < M^r_i \\ 0.2 \, \dfrac{A^r_i - D_{ij}}{A^r_i - A^b_j}, & A^b_j < D_{ij} < A^r_i \end{cases} \tag{4}$$

Fig. 1. Air combat decision-making process of $R$ and $B$.

(4) High priority $P^{r}_{ijh}$ [25]. It depicts the effect of the height difference. When $h_{ij}$ is larger, $R_i$ is at a higher position above $B_j$, which gives $R_i$ a larger maneuvering space and a larger guidance space, so that it is easier for $R_i$ to shoot down $B_j$. The high priority equation is given by

$$P^{r}_{ijh} = \begin{cases} 0.1, & h_{ij} < -5 \\ 0.5 + 0.1 h_{ij}, & -5 \le h_{ij} \le 5 \\ 1.0, & h_{ij} > 5 \end{cases} \tag{5}$$

where $h_{ij} = P^r_{iz} - P^b_{jz}$ is the height difference from $R_i$ to $B_j$.

(5) Performance priority $P^{r}_{ijc}$ [26]. It depicts the effect of the maximum missile attack distance, the maximum radar detection distance and the number of missiles carried. Larger missile attack distance, larger radar detection distance and more missiles carried make the UCAV have better attack


performance. The performance priority equation is described by

$$P^{r}_{ijc} = \frac{2}{\pi} \arctan\left[ 0.5 \left( \frac{A^r_i}{A^b_j} + \frac{M^r_i}{M^b_j} \right) \sqrt{\frac{L^r_i}{L^b_j}} \right]. \tag{6}$$

Taking into account the above priorities, the air combat situation priority value $P^r_{ij}$ of $R_i$ to $B_j$ is defined as follows:

$$P^r_{ij} = P^{r}_{ijc} \left( \omega_1 P^{r}_{ija} + \omega_2 P^{r}_{ijv} + \omega_3 P^{r}_{ijd} + \omega_4 P^{r}_{ijh} \right) \tag{7}$$

where $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ are the weighting coefficients, determined by experts.

Similarly, the situation priority value $P^b_{ji}$ of $B_j$ to $R_i$ can be easily obtained.

Fig. 2. Air combat situation of $R$ and $B$.

2.2. The constraint strategy game of $R$ and $B$

In this part, we establish a constraint strategy game model for the air combat decision-making process of $R$ and $B$.

The air combat decision-making process of $R$ and $B$ is modeled as a matrix game $G = \langle S^r, S^b, u^r, u^b \rangle$ [27,28], where $S^r$ and $S^b$ are the pure strategy sets of $R$ and $B$, respectively. $u^r$ and $u^b$ are the utility functions of $R$ and $B$, respectively. For $s^r_m \in S^r$ $(m = 1, 2, \ldots, |S^r|)$, it represents a combination of $m$ UCAVs from $R$ against $n$ UCAVs from $B$, which can be described by a 0-1 matrix [27]:

$$s^r_m = \begin{array}{c|cccc} & B_1 & B_2 & \cdots & B_n \\ \hline R_1 & s^r_{m,11} & s^r_{m,12} & \cdots & s^r_{m,1n} \\ R_2 & s^r_{m,21} & s^r_{m,22} & \cdots & s^r_{m,2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ R_m & s^r_{m,m1} & s^r_{m,m2} & \cdots & s^r_{m,mn} \end{array} \tag{8}$$

where $|S^r|$ represents the number of strategies in the set $S^r$, $s^r_{m,ij} \in \{0, 1\}$ $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$, $s^r_{m,ij} = 1$ means that $R_i$ attacks $B_j$ with all its missiles, $s^r_{m,ij} = 0$ means that $R_i$ does not attack $B_j$, and $s^r_m$ satisfies $\sum_{i=1}^{m} s^r_{m,ij} = 1$, which means each UCAV of $R$ attacks one UCAV of $B$. Similarly, for $s^b_n \in S^b$ $(n = 1, 2, \ldots, |S^b|)$, it can be presented as [27]:

$$s^b_n = \begin{array}{c|cccc} & B_1 & B_2 & \cdots & B_n \\ \hline R_1 & s^b_{n,11} & s^b_{n,12} & \cdots & s^b_{n,1n} \\ R_2 & s^b_{n,21} & s^b_{n,22} & \cdots & s^b_{n,2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ R_m & s^b_{n,m1} & s^b_{n,m2} & \cdots & s^b_{n,mn} \end{array} \tag{9}$$

such that $s^b_{n,ij} \in \{0, 1\}$ $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$, $\sum_{j=1}^{n} s^b_{n,ij} = 1$.

Remark 1. In the above representations, $s^r_m$ and $s^b_n$ are both $m \times n$ matrices for convenience. Similar representations are considered in the following context.

Utility functions $u^r$ and $u^b$ are functions of $(s^r_m, s^b_n)$, where $(s^r_m, s^b_n) \in S^r \times S^b$ is called a pure strategy combination. Since $R$ and $B$ are in a completely competitive relationship in air combat, the two utility functions satisfy $u^r + u^b = 0$. In the following, the utility functions are constructed on the basis of the situation assessment and the definition of the pure strategy set.

According to the air combat situation priority Eq. (7), we can calculate the situation priority values $P^r_{ij}$ and $P^b_{ij}$ $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$. $P^r_{ij} \in [0, 1]$ can be regarded as the attack ability of $R_i$ to $B_j$, and $P^b_{ij} \in [0, 1]$ can be seen as the attack ability of $B_j$ to $R_i$. Thus, the attack probability matrices $P^r = (P^r_{ij})_{m \times n}$ and $P^b = (P^b_{ij})_{m \times n}$ of both parties can be calculated. Denote $V^r_i = \sum_{j=1}^{n} P^r_{ij}$ $(i = 1, 2, \ldots, m)$, which is regarded as the value of $R_i$, and the value matrix $V^r$ of $R$ is written as $V^r = (V^r_1, V^r_2, \ldots, V^r_m)$. Similarly, denote $V^b_j = \sum_{i=1}^{m} P^b_{ji}$ $(j = 1, 2, \ldots, n)$, and the value matrix $V^b$ can be written as $V^b = (V^b_1, V^b_2, \ldots, V^b_n)$.

Corresponding to each pure strategy combination $(s^r_m, s^b_n)$, a real value $u^r(s^r_m, s^b_n)$ is given to describe the utility of $R$. Construct the utility function of $R$ as follows [27]:

$$u^r(s^r_m, s^b_n) = \sum_{j=1}^{n} P^b_j(s^r_m) \cdot V^b_j - \sum_{i=1}^{m} P^r_i(s^b_n) \cdot V^r_i, \quad s^r_m \in S^r,\ s^b_n \in S^b \tag{10}$$

where $P^b_j(s^r_m) = 1 - \prod_{i=1}^{m} (1 - P^r_{ij})^{s^r_{m,ij}}$ $(j = 1, 2, \ldots, n)$ is the joint attack probability of $R$ to $B_j$ when given $s^r_m \in S^r$,

$P^r_i(s^b_n) = 1 - \prod_{j=1}^{n} (1 - P^b_{ij})^{s^b_{n,ij}}$ $(i = 1, 2, \ldots, m)$ is the joint attack probability of $B$ to $R_i$ when given $s^b_n \in S^b$. Since $u^r + u^b = 0$, we have

$$u^b(s^r_m, s^b_n) = -u^r(s^r_m, s^b_n), \quad s^r_m \in S^r,\ s^b_n \in S^b \tag{11}$$

Nash equilibrium is a widely recognized solution concept in game theory. A pure strategy combination $(s^{r*}, s^{b*})$ is the Nash equilibrium of $G = \langle S^r, S^b, u^r, u^b \rangle$, if it satisfies the following inequalities [29]:

$$u^r(s^r_m, s^{b*}) \le u^r(s^{r*}, s^{b*}), \quad \forall s^r_m \in S^r \tag{12}$$

$$u^b(s^{r*}, s^b_n) \le u^b(s^{r*}, s^{b*}), \quad \forall s^b_n \in S^b \tag{13}$$

Since the pure strategy game $G = \langle S^r, S^b, u^r, u^b \rangle$ does not always have a Nash equilibrium solution, the mixed extension of $G$ (the mixed strategy game) is given as $G_e = \langle \mathcal{S}^r, \mathcal{S}^b, U^r, U^b \rangle$, which always has a Nash equilibrium [29], where $\mathcal{S}^r = \{\sigma^r : S^r \to [0, 1] \mid \sum_{m=1}^{|S^r|} \sigma^r(s^r_m) = 1\}$ is the mixed strategy set of $R$, which represents all probability distributions over $S^r$, and $\mathcal{S}^b = \{\sigma^b : S^b \to [0, 1] \mid \sum_{n=1}^{|S^b|} \sigma^b(s^b_n) = 1\}$ is the mixed strategy set of $B$, which represents all probability distributions over $S^b$. Then, the utility function of $R$ can be given by

$$U^r(\sigma^r, \sigma^b) = \sum_{m=1}^{|S^r|} \sum_{n=1}^{|S^b|} \sigma^r(s^r_m) \, \sigma^b(s^b_n) \cdot u^r(s^r_m, s^b_n), \quad (\sigma^r, \sigma^b) \in \mathcal{S}^r \times \mathcal{S}^b \tag{14}$$

where $U^r(\sigma^r, \sigma^b)$ is the expected utility under the mixed strategy combination $(\sigma^r, \sigma^b)$, and $U^b(\sigma^r, \sigma^b) = -U^r(\sigma^r, \sigma^b)$ is the utility function of $B$.

As we discussed in Section 1, there is a lot of time-sensitive information in air combat, and the time-sensitive information is more important than the air combat situation information, so it needs to be prioritized in air combat decision-making. Firstly, the expert system gives a rough decision based on time-sensitive information. In the expert system, the inference engine matches the time-sensitive information with the rules in the knowledge base. In this case, the execution probabilities of some pure strategies are given, which are the linear constraints on the mixed strategies. Then, on the basis of the linear constraints, the numerical computing system gives a precise decision based on situation information. The matrix game of $R$ and $B$ with linear constraints forms the constraint strategy game; then, based on situation information, the Nash equilibrium strategy is obtained by solving the constraint strategy game. The roles of time-sensitive information and air combat situation information in air combat decision-making are shown in Fig. 3.

Fig. 3. Roles of time-sensitive information and air combat situation information.

In addition, as long as the constraints are linear, the final strategy set constitutes a polytope, and the game can be solved by our method. In this paper, we consider a simple scenario: a certain UCAV is the control center of its fleet and needs to be shot down with larger probabilities. Therefore, the linear constraints can be added to the execution probabilities of certain strategies. A perturbation vector of $G = \langle S^r, S^b, u^r, u^b \rangle$ is applied to characterize these constraints [29].

Definition 1. A perturbation vector of $G = \langle S^r, S^b, u^r, u^b \rangle$ is defined as $\varepsilon = (\varepsilon^r, \varepsilon^b)$, where $\varepsilon^r = (\varepsilon^r_1, \varepsilon^r_2, \ldots, \varepsilon^r_{|S^r|})$ is a perturbation vector of $R$, satisfying $\varepsilon^r_m \ge 0$ $(m = 1, 2, \ldots, |S^r|)$, $\sum_{m=1}^{|S^r|} \varepsilon^r_m \le 1$; $\varepsilon^b = (\varepsilon^b_1, \varepsilon^b_2, \ldots, \varepsilon^b_{|S^b|})$ is a perturbation vector of $B$, satisfying $\varepsilon^b_n \ge 0$ $(n = 1, 2, \ldots, |S^b|)$, $\sum_{n=1}^{|S^b|} \varepsilon^b_n \le 1$.

In the following, the constraint strategy game of $R$ and $B$ is given.

Definition 2. Letting $\varepsilon = (\varepsilon^r, \varepsilon^b)$ be a perturbation vector of $G = \langle S^r, S^b, u^r, u^b \rangle$, the constraint strategy game of $R$ and $B$ is defined as $G_e(\varepsilon) = \langle \mathcal{S}^r(\varepsilon^r), \mathcal{S}^b(\varepsilon^b), U^r_\varepsilon, U^b_\varepsilon \rangle$, where

• $\mathcal{S}^r(\varepsilon^r) = \{\sigma^r \in \mathcal{S}^r \mid \sigma^r(s^r_m) \ge \varepsilon^r_m,\ m = 1, 2, \ldots, |S^r|\}$ is the constraint strategy set of $R$, and $\mathcal{S}^b(\varepsilon^b) = \{\sigma^b \in \mathcal{S}^b \mid \sigma^b(s^b_n) \ge \varepsilon^b_n,\ n = 1, 2, \ldots, |S^b|\}$ is the constraint strategy set of $B$.

• $U^r_\varepsilon : \mathcal{S}(\varepsilon) \to \mathbb{R}$, with $\mathcal{S}(\varepsilon) = \mathcal{S}^r(\varepsilon^r) \times \mathcal{S}^b(\varepsilon^b)$, is the utility function of $R$, which is the restriction of $U^r$ to $\mathcal{S}^r(\varepsilon^r) \times \mathcal{S}^b(\varepsilon^b)$, satisfying

$$U^r_\varepsilon(\sigma^r, \sigma^b) = U^r(\sigma^r, \sigma^b), \quad (\sigma^r, \sigma^b) \in \mathcal{S}^r(\varepsilon^r) \times \mathcal{S}^b(\varepsilon^b) \tag{15}$$

where $U^r$ is defined in Eq. (14).

• $U^b_\varepsilon : \mathcal{S}(\varepsilon) \to \mathbb{R}$ is the utility function of $B$, which is the restriction of $U^b$ to $\mathcal{S}(\varepsilon)$, satisfying

$$U^b_\varepsilon(\sigma^r, \sigma^b) = U^b(\sigma^r, \sigma^b), \quad (\sigma^r, \sigma^b) \in \mathcal{S}^r(\varepsilon^r) \times \mathcal{S}^b(\varepsilon^b) \tag{16}$$

The definition of the Nash equilibrium solution of $G_e(\varepsilon)$ is given below, which is similar to that of $G$.

Definition 3. A strategy combination $(\sigma^{r*}, \sigma^{b*})$ is the Nash equilibrium of $G_e(\varepsilon) = \langle \mathcal{S}^r(\varepsilon^r), \mathcal{S}^b(\varepsilon^b), U^r_\varepsilon, U^b_\varepsilon \rangle$, if it satisfies

$$U^r(\sigma^r, \sigma^{b*}) \le U^r(\sigma^{r*}, \sigma^{b*}), \quad \forall \sigma^r \in \mathcal{S}^r(\varepsilon^r) \tag{17}$$

$$U^b(\sigma^{r*}, \sigma^b) \le U^b(\sigma^{r*}, \sigma^{b*}), \quad \forall \sigma^b \in \mathcal{S}^b(\varepsilon^b) \tag{18}$$

2.3. Geometric structures of constraint strategy set

We know that a polytope in $\mathbb{R}^d$ is a bounded set which is the intersection of a finite number of half-spaces; a simplex in $\mathbb{R}^d$ is a special polytope whose extreme points are the unit vectors $e_1, e_2, \ldots, e_d$. It is obvious that $\mathcal{S}^r$ and $\mathcal{S}^b$ are both simplexes. In the following we will show that $\mathcal{S}^r(\varepsilon^r)$ and $\mathcal{S}^b(\varepsilon^b)$ are polytopes in $\mathbb{R}^{|S^r|}$ and $\mathbb{R}^{|S^b|}$, respectively.

Given a perturbation vector of $R$, $\varepsilon^r = (\varepsilon^r_1, \varepsilon^r_2, \ldots, \varepsilon^r_{|S^r|})$, notice that $\sigma^r(s^r_m) \ge \varepsilon^r_m$ $(m = 1, 2, \ldots, |S^r|)$ is a half-space in $\mathbb{R}^{|S^r|}$, denoted as $I^r(\varepsilon^r_m)$, thereby $\mathcal{S}^r(\varepsilon^r) = \left[ \bigcap_{m=1}^{|S^r|} I^r(\varepsilon^r_m) \right] \cap \mathcal{S}^r$. So, we prove that $\mathcal{S}^r(\varepsilon^r)$ is a polytope in $\mathbb{R}^{|S^r|}$. In the same way, it can be proved that $\mathcal{S}^b(\varepsilon^b)$ is a polytope in $\mathbb{R}^{|S^b|}$.

Obviously, the extreme points of $\mathcal{S}^r(\varepsilon^r)$ and $\mathcal{S}^b(\varepsilon^b)$ are both finite. Assume that $\mathcal{S}^r(\varepsilon^r)$ has $N_r$ extreme points, denoted as $c^r_1, c^r_2, \ldots, c^r_{N_r}$, and $\mathcal{S}^b(\varepsilon^b)$ has $N_b$ extreme points, denoted as $c^b_1, c^b_2, \ldots, c^b_{N_b}$. Then, we have

$$\mathcal{S}^r(\varepsilon^r) = \left\{ \sigma^r \,\middle|\, \sigma^r = \sum_{l=1}^{N_r} \eta^r_l c^r_l,\ 0 \le \eta^r_l \le 1,\ \sum_{l=1}^{N_r} \eta^r_l = 1 \right\} \tag{19}$$

$$\mathcal{S}^b(\varepsilon^b) = \left\{ \sigma^b \,\middle|\, \sigma^b = \sum_{k=1}^{N_b} \eta^b_k c^b_k,\ 0 \le \eta^b_k \le 1,\ \sum_{k=1}^{N_b} \eta^b_k = 1 \right\} \tag{20}$$

In other words, $\mathcal{S}^r(\varepsilon^r)$ and $\mathcal{S}^b(\varepsilon^b)$ are the convex hulls of their extreme points, respectively. A method is given to calculate the extreme points of $\mathcal{S}^r(\varepsilon^r)$ and $\mathcal{S}^b(\varepsilon^b)$ in Appendix A. Algorithms for calculating the extreme points of an arbitrary polytope can be found in Ref. [30].

So far, we have conducted the air combat situation assessment and established the constraint strategy game model for the air combat decision-making problem with constraints from human experience. In the following, we will give an algorithm to calculate the Nash equilibrium solution of the proposed model, thereby the air combat decision-making results of $R$ and $B$ are given.

3. Main result

In this section, the polytope strategy game is studied and some general results are given on the basis of [29], and these results are specified as an algorithm to solve the constraint strategy game $G_e(\varepsilon)$ based on the algorithm for solving the mixed strategy game in Ref. [22].

3.1. Polytope strategy games

In Section 2.3, we have stated that $\mathcal{S}^r(\varepsilon^r)$ and $\mathcal{S}^b(\varepsilon^b)$ are polytopes in $\mathbb{R}^{|S^r|}$ and $\mathbb{R}^{|S^b|}$, respectively. In order to solve $G_e(\varepsilon)$, we will discuss a more general game: the polytope strategy game $G_p$, where the strategy sets of $R$ and $B$ are both polytopes. In fact, the polytope strategy game is a generalized matrix game, since the mixed strategy set is a special polytope (simplex). The polytope strategy game of $R$ and $B$ is a quadruplet $G_p = \langle X^r, X^b, U^r_p, U^b_p \rangle$, where $X^r$ is the strategy set of $R$, which is a polytope in Euclidean space, $X^b$ is the strategy set of $B$, which is a polytope in Euclidean space, $U^r_p$ is the utility function of $R$, which is a continuous bilinear function on $X^r \times X^b$, and $U^b_p = -U^r_p$ is the utility function of $B$ [29]. The existence of a Nash equilibrium of $G_p$ is guaranteed by Theorem 5.32 in Ref. [29].

Define the maximum and minimum values of $G_p$ as follows:

$$\underline{v} \triangleq \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) \tag{21}$$

$$\overline{v} \triangleq \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b) \tag{22}$$

where $U_p \triangleq U^r_p = -U^b_p$. The rationality of the definition can be guaranteed by the compactness of the strategy sets and the continuity of the utility function, since a continuous function on any compact set has maximum and minimum values.

Then we give the definition of the value of the polytope strategy game.

Definition 4. $G_p$ has a value if $\underline{v} = \overline{v}$, and the quantity $v \triangleq \underline{v} = \overline{v}$ is then called the value of $G_p$.

The maxmin value and minmax value have the following properties.

Lemma 1. Supposing that $G_p = \langle X^r, X^b, U^r_p, U^b_p \rangle$ is a polytope strategy game of $R$ and $B$, we have

$$\max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) \le \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b) \tag{23}$$

Proof. See Appendix B.

Lemma 2. $G_p$ has a value if and only if $G_p$ has at least one Nash equilibrium. What is more, we have

$$U_p(x^{r*}, x^{b*}) = v \tag{24}$$

where $(x^{r*}, x^{b*})$ is a Nash equilibrium of $G_p$, and $v$ is the value of $G_p$.

Proof. See Appendix C.

Theorem 1. If $G_p = \langle X^r, X^b, U^r_p, U^b_p \rangle$ is a polytope strategy game, then $G_p$ has a value. Namely, the following equation holds:

$$\max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) = \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b) \tag{25}$$

Proof. As we stated above, every polytope strategy game of $R$ and $B$ has a Nash equilibrium. By using Lemma 2, a polytope strategy game has a value if and only if it has at least one Nash equilibrium. Thus, we proved that every polytope strategy game has a value.
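As a small numerical illustration of Eqs. (19)-(22) and (25), the following Python sketch builds the extreme points of two constraint strategy sets and checks that the maxmin and minmax values coincide. It is a sketch under stated assumptions, not the CSG-LL algorithm or the method of Appendix A: the utility matrix `u`, the perturbation vectors `eps_r` and `eps_b`, and all function names are hypothetical. The vertex formula relies on the observation that a simplex with lower bounds only is a shifted and shrunk simplex, and the one-dimensional search is exact only when a player has exactly two extreme points.

```python
from fractions import Fraction
from itertools import combinations


def constraint_set_vertices(eps):
    """Extreme points of {sigma : sigma_m >= eps_m, sum(sigma) = 1}.

    Assumption (not taken from Appendix A): with lower bounds only, the set
    equals eps + (1 - sum(eps)) * simplex, so vertex m is eps with the
    remaining probability mass added to component m.
    """
    slack = 1 - sum(eps)
    return [tuple(e + (slack if i == m else 0) for i, e in enumerate(eps))
            for m in range(len(eps))]


def expected_utility(u, sigma_r, sigma_b):
    """Bilinear expected utility of Eq. (14) for a payoff matrix u[m][n]."""
    return sum(sigma_r[m] * sigma_b[n] * u[m][n]
               for m in range(len(sigma_r)) for n in range(len(sigma_b)))


def max_of_min_on_segment(lines):
    """max over t in [0, 1] of min_k (t * a_k + (1 - t) * b_k).

    `lines` is a list of (a_k, b_k). The objective is concave and piecewise
    linear, so the maximum is at t = 0, t = 1, or where two lines cross.
    """
    points = {Fraction(0), Fraction(1)}
    for (a1, b1), (a2, b2) in combinations(lines, 2):
        denom = (a1 - b1) - (a2 - b2)
        if denom != 0:
            t = Fraction(b2 - b1) / Fraction(denom)
            if 0 <= t <= 1:
                points.add(t)
    return max(min(t * a + (1 - t) * b for a, b in lines) for t in points)


# Hypothetical data: a 2x2 utility matrix for R and perturbation vectors.
u = [[3, -1], [-2, 4]]
eps_r = (Fraction(7, 10), Fraction(0))  # R plays its first strategy w.p. >= 0.7
eps_b = (Fraction(0), Fraction(0))      # B is unconstrained

c_r = constraint_set_vertices(eps_r)    # extreme points c^r_1, c^r_2
c_b = constraint_set_vertices(eps_b)    # extreme points c^b_1, c^b_2

# A[l][k] = U_p(c^r_l, c^b_k); by bilinearity this matrix determines U_p.
A = [[expected_utility(u, cr, cb) for cb in c_b] for cr in c_r]

# Eq. (21): R mixes its two extreme points, B answers with an extreme point.
lower = max_of_min_on_segment([(A[0][k], A[1][k]) for k in range(len(c_b))])
# Eq. (22): min-max rewritten as -(max-min) of the negated utilities.
upper = -max_of_min_on_segment([(-A[l][0], -A[l][1]) for l in range(len(c_r))])

print(c_r, lower, upper)
```

With these hypothetical numbers both `lower` and `upper` evaluate to 1/2, whereas the unconstrained matrix game with the same `u` has value 1, so the constraint on $R$ lowers its guaranteed utility. Exact rational arithmetic (`Fraction`) is used so that the equality in Eq. (25) can be checked without floating-point tolerance.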

The above theorem guarantees the existence of the value of $G_p$, and the value of $G_p$ can be calculated by the following theorem.

Theorem 2. The value of $G_p = \langle X^r, X^b, U^r_p, U^b_p \rangle$ is the optimal objective function value of the following linear program in the variables $x_1, x_2, \ldots, x_{N_r}, w$:

$$\text{Compute: } V \triangleq \max w \tag{26}$$

$$\text{Subject to: } \sum_{l=1}^{N_r} x_l U_p(c^r_l, c^b_k) \ge w, \quad k = 1, 2, \ldots, N_b \tag{27}$$

$$\sum_{l=1}^{N_r} x_l = 1 \tag{28}$$

$$x_l \ge 0, \quad l = 1, 2, \ldots, N_r. \tag{29}$$

where $c^r_1, c^r_2, \ldots, c^r_{N_r}$ are all the extreme points of $X^r$ and $c^b_1, c^b_2, \ldots, c^b_{N_b}$ are the extreme points of $X^b$.

Proof. Denote the value of $G_p$ as $v$. We will show that $V = v$ by showing that $V \ge v$ and $V \le v$.

Step 1: $V \ge v$. If $v$ is the value of $G_p$, then $R$ has a strategy $\tilde{x}^r$ such that

$$\min_{x^b \in X^b} U_p(\tilde{x}^r, x^b) = \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) = v \tag{30}$$

thereby $\tilde{x}^r$ is a strategy that guarantees a utility of at least $v$ for every strategy of $B$:

$$U_p(\tilde{x}^r, x^b) \ge v, \quad \forall x^b \in X^b \tag{31}$$

In particular, for each extreme point $c^b_k$ $(k = 1, 2, \ldots, N_b)$, we have

$$U_p(\tilde{x}^r, c^b_k) \ge v, \quad k = 1, 2, \ldots, N_b. \tag{32}$$

Since $X^r$ is a polytope, $\tilde{x}^r$ can be written as a convex combination of $c^r_1, c^r_2, \ldots, c^r_{N_r}$, and we obtain $\tilde{x}^r = \sum_{l=1}^{N_r} \tilde{x}^r_l c^r_l$, where $\tilde{x}^r_l$ $(l = 1, 2, \ldots, N_r)$ are convex combination coefficients. Combined with the bilinearity of $U_p$, Eq. (32) can be rewritten as

$$\sum_{l=1}^{N_r} \tilde{x}^r_l U_p(c^r_l, c^b_k) \ge v, \quad k = 1, 2, \ldots, N_b \tag{33}$$

Note that $(\tilde{x}^r_1, \tilde{x}^r_2, \ldots, \tilde{x}^r_{N_r}, v)$ is a vector that satisfies constraints (27)-(29), and $V$ is the largest real value $w$ over all vectors that satisfy constraints (27)-(29). Thus, we have $V \ge v$.

Step 2: $V \le v$. Since $V$ is the optimal objective function value of the linear program, there exists a vector $x = (x_1, x_2, \ldots, x_{N_r})$ such that $(x, V)$ satisfies constraints (27)-(29), in other words, $\sum_{l=1}^{N_r} x_l U_p(c^r_l, c^b_k) \ge V$ $(k = 1, 2, \ldots, N_b)$. Denoting by $x^r$ the convex combination of $c^r_1, c^r_2, \ldots, c^r_{N_r}$ with respect to $x$, i.e. $x^r = \sum_{l=1}^{N_r} x_l c^r_l$, we have $U_p(x^r, c^b_k) \ge V$ $(k = 1, 2, \ldots, N_b)$. The bilinearity of $U_p$ implies

$$U_p(x^r, x^b) \ge V, \quad \forall x^b \in X^b \tag{34}$$

Then, we have

$$\min_{x^b \in X^b} U_p(x^r, x^b) \ge V \tag{35}$$

Thereby,

$$v = \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) \ge \min_{x^b \in X^b} U_p(x^r, x^b) \ge V \tag{36}$$

Hence $V \le v$.

A Nash equilibrium strategy of $G_p$ can be obtained by the following theorem.

Theorem 3. A strategy combination $(x^{r*}, x^{b*})$ is the Nash equilibrium of $G_p = \langle X^r, X^b, U^r_p, U^b_p \rangle$, if the following inequality holds:

$$U_p(c^r_l, x^{b*}) \le v \le U_p(x^{r*}, c^b_k), \quad l = 1, 2, \ldots, N_r;\ k = 1, 2, \ldots, N_b \tag{37}$$

where $v$ is the value of $G_p$.

Proof. Let $v$ be the value of $G_p$ and assume that $(x^{r*}, x^{b*})$ satisfies (37). Since $X^r$ and $X^b$ are both polytopes, each $x^r \in X^r$ can be written as a convex combination of all the extreme points of $X^r$. Similarly, each $x^b \in X^b$ can be written as a convex combination of all the extreme points of $X^b$:

$$x^r = \sum_{l=1}^{N_r} \eta^r_l c^r_l, \quad x^b = \sum_{k=1}^{N_b} \eta^b_k c^b_k \tag{38}$$

where $\eta^r_l$ and $\eta^b_k$ are convex combination coefficients. Combining the bilinearity of $U_p$, we have

$$U_p(x^r, x^{b*}) = U_p\left( \sum_{l=1}^{N_r} \eta^r_l c^r_l, x^{b*} \right) = \sum_{l=1}^{N_r} U_p(c^r_l, x^{b*}) \, \eta^r_l \le v \sum_{l=1}^{N_r} \eta^r_l = v, \quad \forall x^r \in X^r \tag{39}$$

$$v = v \sum_{k=1}^{N_b} \eta^b_k \le \sum_{k=1}^{N_b} U_p(x^{r*}, c^b_k) \, \eta^b_k = U_p\left( x^{r*}, \sum_{k=1}^{N_b} \eta^b_k c^b_k \right) = U_p(x^{r*}, x^b), \quad \forall x^b \in X^b \tag{40}$$

Then, we get

$$U_p(x^r, x^{b*}) \le v \le U_p(x^{r*}, x^b), \quad \forall x^r \in X^r,\ \forall x^b \in X^b \tag{41}$$

In particular, we have

$$U_p(x^{r*}, x^{b*}) \le v \le U_p(x^{r*}, x^{b*}) \tag{42}$$

thereby $v = U_p(x^{r*}, x^{b*})$, so $(x^{r*}, x^{b*})$ is the Nash equilibrium of $G_p$.

Remark 2. Theorem 3 simplifies the definition of the Nash equilibrium of $G_p$. To judge whether a strategy combination $(x^{r*}, x^{b*})$ is a Nash equilibrium of $G_p$, we only need to check whether it satisfies the finite number of inequalities in (37). The theorem makes it unnecessary to check infinitely many inequalities in the definition of Nash

equilibrium, which makes it possible to solve or verify the Nash equilibrium of $G_p$.

We know that there must be a Nash equilibrium solution in a polytope strategy game, but the uniqueness of the Nash equilibrium of a polytope strategy game cannot be guaranteed. In fact, any solutions of (45) and (46) constitute a Nash equilibrium of $G_p$. Fortunately, the Nash equilibria of $G_p$ have the following good properties, which generalize the properties of matrix games [31].

Theorem 4. For any two Nash equilibria $(x^{r*}, x^{b*})$ and $(\bar{x}^r, \bar{x}^b)$ of $G_p$, we have

• the indifference of the value of $G_p$: $U_p(x^{r*}, x^{b*}) = U_p(\bar{x}^r, \bar{x}^b)$;
• the ordered interchangeability of Nash equilibria of $G_p$: $(x^{r*}, \bar{x}^b)$ and $(\bar{x}^r, x^{b*})$ are also Nash equilibria of $G_p$.

Proof. See Appendix D.

Because $G_p$ has the above properties, even if R and B choose strategies from different Nash equilibria, these strategies will also constitute a Nash equilibrium of $G_p$, and different Nash equilibria have the same utility value for R and B.

Through Theorem 3, a strategy combination $(x^{r*}, x^{b*})$ is a Nash equilibrium of $G_p$ if it satisfies

\[ U_p(x^{r*}, c^b_k) \ge v, \quad k = 1, 2, \cdots, N_b \tag{43} \]

\[ U_p(c^r_l, x^{b*}) \le v, \quad l = 1, 2, \cdots, N_r \tag{44} \]

Denote $x^{r*} = \sum_{l=1}^{N_r} h^{r*}_l c^r_l$ and $x^{b*} = \sum_{k=1}^{N_b} h^{b*}_k c^b_k$, where $h^{r*}_l$ and $h^{b*}_k$ are convex combination coefficients that can be calculated from the following inequalities in the variables $h^r_1, h^r_2, \cdots, h^r_{N_r}$ and $h^b_1, h^b_2, \cdots, h^b_{N_b}$, respectively,

\[ \begin{cases} \sum_{l=1}^{N_r} h^r_l U_p(c^r_l, c^b_k) \ge v, & k = 1, 2, \cdots, N_b; \\ \sum_{l=1}^{N_r} h^r_l = 1; \\ h^r_l \ge 0, & l = 1, 2, \cdots, N_r. \end{cases} \tag{45} \]

\[ \begin{cases} \sum_{k=1}^{N_b} h^b_k U_p(c^r_l, c^b_k) \le v, & l = 1, 2, \cdots, N_r; \\ \sum_{k=1}^{N_b} h^b_k = 1; \\ h^b_k \ge 0, & k = 1, 2, \cdots, N_b. \end{cases} \tag{46} \]

3.2. An algorithm for solving constraint strategy games

Since $G_e(\varepsilon)$ is a special polytope strategy game, the above method of solving the game $G_p$ is specialized as the CSG-LL algorithm for solving $G_e(\varepsilon) = \langle S^r(\varepsilon^r), S^b(\varepsilon^b), U^r_\varepsilon, U^b_\varepsilon \rangle$.
The detailed CSG-LL algorithm is described as follows.

Step 1. Compute the optimal objective function value $V$ of linear programming (26)-(29), where $U_p = U_\varepsilon$, $c^r_1, c^r_2, \cdots, c^r_{N_r}$ are the extreme points of $S^r(\varepsilon^r)$, and $c^b_1, c^b_2, \cdots, c^b_{N_b}$ are the extreme points of $S^b(\varepsilon^b)$. Then $G_e(\varepsilon)$ has a value $V$.

Step 2. Solve the linear inequalities (45) and (46), where $U_p = U_\varepsilon$ and $v = V$ from Step 1. If $h^{r*} = (h^{r*}_1, h^{r*}_2, \cdots, h^{r*}_{N_r})$ and $h^{b*} = (h^{b*}_1, h^{b*}_2, \cdots, h^{b*}_{N_b})$ are their solutions respectively, then $s^{r*} \triangleq \sum_{l=1}^{N_r} h^{r*}_l c^r_l$ and $s^{b*} \triangleq \sum_{k=1}^{N_b} h^{b*}_k c^b_k$ form a Nash equilibrium $(s^{r*}, s^{b*})$ of $G_e(\varepsilon)$.

Remark 3. In our algorithm, we implicitly assume that both parties have established the same utility function. But as we have seen, the utility function involves many parameters, and assuming that both sides establish the same utility function without any communication is very restrictive.
In fact, as long as the utility function established by a party itself is accurate, the decision results it obtains by the proposed algorithm will be effective, no matter whether the other party establishes the same utility function or not. The following proof is given. Suppose that $U_\varepsilon$ and $U_\varepsilon'$ are the utility functions of R established by R and B respectively, $(s^{r*}, s^{b*})$ is the Nash equilibrium calculated by $U_\varepsilon$ and $(s^{r*\prime}, s^{b*\prime})$ is the Nash equilibrium calculated by $U_\varepsilon'$. We assume that $U_\varepsilon$ is the real utility function. According to the definition of Nash equilibrium, for all $s^b \in S^b(\varepsilon^b)$, we have $U_\varepsilon(s^{r*}, s^b) \ge U_\varepsilon(s^{r*}, s^{b*})$, thereby $U_\varepsilon(s^{r*}, s^{b*\prime}) \ge U_\varepsilon(s^{r*}, s^{b*})$. In other words, even though R and B have established different utility functions and both adopt the Nash equilibrium strategy calculated by themselves, as long as the utility function of R is accurate, R can ensure that the utility obtained is not lower than $U_\varepsilon(s^{r*}, s^{b*})$. The above conclusion is guaranteed by $U^r_\varepsilon + U^b_\varepsilon = 0$. For general-sum games, utility cannot be guaranteed without establishing the same utility function.

Remark 4. We implicitly assume that both parties have established the same utility, so the utility function is common knowledge. In other words, the constraint strategy game we proposed is a complete information game. When one party does not know what utility function the other party has established, that is, when the utility function is not common knowledge, the game is an incomplete information game.

Remark 5. As we have seen, our decisions are made under the current air combat situation. However, air combat is a dynamic process. If the air combat situation changes, the decisions must be adjusted accordingly. Therefore, in the execution of the attack decision process, a module is introduced to detect whether the air combat situation has changed. If the air combat situation changes, the algorithm will run again based on the new air combat situation, and new assignments will be made; if the air combat situation does not change, the attack will be executed according to the decision result until the end of the air combat (win or lose). The air combat decision-making of multiple UCAVs based on the CSG-LL algorithm is shown in Fig. 4.

In summary, we have studied the polytope strategy games (a generalized matrix game) and applied the research results to the constraint strategy game (a matrix game with linear constraints). An algorithm for solving the constraint strategy game has been given, and with it intelligent decisions for multiple UCAVs in air combat with air combat situation information and time-sensitive information are obtained.

4. Numerical example

In this section, an example is given to verify the effectiveness of the CSG-LL algorithm.
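Before turning to the numbers, the finite check behind Theorem 3 and inequalities (43) and (44) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the utilities at the extreme points are already available as a matrix `A` with `A[l][k]` standing for $U_p(c^r_l, c^b_k)$, and it takes $v = U_p(x^{r*}, x^{b*})$, which is what Eq. (42) implies. The function names and the 2 x 2 example matrix are hypothetical.

```python
from typing import List, Sequence

def bilinear_value(A: List[List[float]], eta_r: Sequence[float], eta_b: Sequence[float]) -> float:
    """U_p(x^r, x^b) for x^r = sum_l eta_r[l] c^r_l and x^b = sum_k eta_b[k] c^b_k."""
    return sum(eta_r[l] * A[l][k] * eta_b[k]
               for l in range(len(A)) for k in range(len(A[0])))

def satisfies_theorem3(A: List[List[float]], eta_r: Sequence[float],
                       eta_b: Sequence[float], tol: float = 1e-9) -> bool:
    """Check the finitely many inequalities (43) and (44) on the extreme points."""
    n_r, n_b = len(A), len(A[0])
    v = bilinear_value(A, eta_r, eta_b)  # assumed: v = U_p(x^{r*}, x^{b*})
    # (44): no extreme point of X^r does better than v against x^{b*}
    rows_ok = all(sum(A[l][k] * eta_b[k] for k in range(n_b)) <= v + tol
                  for l in range(n_r))
    # (43): no extreme point of X^b pushes the utility of x^{r*} below v
    cols_ok = all(sum(eta_r[l] * A[l][k] for l in range(n_r)) >= v - tol
                  for k in range(n_b))
    return rows_ok and cols_ok

# Hypothetical toy utilities on two extreme points per side (matching pennies).
A_TOY = [[1.0, -1.0],
         [-1.0, 1.0]]
UNIFORM = [0.5, 0.5]

if __name__ == "__main__":
    print(satisfies_theorem3(A_TOY, UNIFORM, UNIFORM))     # candidate passes the check
    print(satisfies_theorem3(A_TOY, [1.0, 0.0], UNIFORM))  # candidate fails the check
```

The point of the sketch is the one made in Remark 2: only $N_r + N_b$ inequalities are tested, however many strategies the polytopes contain.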

Fig. 4. Air combat decision-making of multiple UCAVs based on CSG-LL algorithm.


Table 2
Parameters of R.

Symbol        Description                   R1       R2       Unit
$P^r_{ix}$    X coordinate of position      12835    14506    m
$P^r_{iy}$    Y coordinate of position      17421    23495    m
$P^r_{iz}$    Z coordinate of position      5504     6488     m
$V^r_{ix}$    X coordinate of speed         42       25       km·h^-1
$V^r_{iy}$    Y coordinate of speed         21       77       km·h^-1
$V^r_{iz}$    Z coordinate of speed         64       61       km·h^-1
$M^r_i$       Missile attack distance       93       101      km
$A^r_i$       Radar detection distance      163      173      km
$L^r_i$       Number of missiles carried    6        4        -

Suppose that UCAV group R has 2 UCAVs: $R_1$ and $R_2$, and UCAV group B has 4 UCAVs: $B_1$, $B_2$, $B_3$ and $B_4$. Based on the related literature and our previous work, their parameter values are given in Table 2 and Table 3, respectively.

In Eq. (7), taking $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.2$, $u_4 = 0.1$ according to expert experience, the attack probability matrices of R and B are obtained:

\[ P^r = \begin{pmatrix} 0.2492 & 0.1960 & 0.3250 & 0.3659 \\ 0.2047 & 0.1320 & 0.4261 & 0.3881 \end{pmatrix}, \]

\[ P^b = \begin{pmatrix} 0.1516 & 0.2276 & 0.0765 & 0.0971 \\ 0.2077 & 0.2919 & 0.0684 & 0.0752 \end{pmatrix}. \]

Based on the attack probability matrices $P^r$ and $P^b$, the value matrices of R and B are calculated as:

\[ V^r = (1.1361 \;\; 1.1509), \qquad V^b = (0.3592 \;\; 0.5195 \;\; 0.1448 \;\; 0.1723) \]

As explained in Eq. (8), $S^r$ consists of 16 strategies: $s^r_1, s^r_2, \cdots, s^r_{16}$, $S^b$ consists of 16 strategies: $s^b_1, s^b_2, \cdots, s^b_{16}$, and their detailed forms are given in Appendix E. We use $s^r_2$ and $s^b_6$ as examples to explain the attack methods they represent: $s^r_2$ means that $R_1$ attacks $B_1$ and $R_2$ attacks $B_2$ with all their missiles; $s^b_6$ means that $B_1$ attacks $R_1$, $B_2$ attacks $R_2$, $B_3$ attacks $R_1$, and $B_4$ attacks $R_2$ with all their missiles.

The intelligence reconnaissance unit of R detects that $B_3$ is the command center of B. According to the decision rules of the airborne expert system, the strategy $s^r_{11}$ should be executed with a probability greater than or equal to 0.5, where $s^r_{11}$ represents the joint attack of $R_1$ and $R_2$ on $B_3$. Similarly, the intelligence reconnaissance unit of B detects that $R_1$ is the command center of R, and executes the strategy $s^b_1$ with a probability greater than or equal to 0.5. The above constraints can be described by a perturbation vector $\varepsilon = (\varepsilon^r, \varepsilon^b)$, where

\[ \varepsilon^r = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0), \]

\[ \varepsilon^b = (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0). \]

Then we obtain constraint strategy sets $S^r(\varepsilon^r)$ and $S^b(\varepsilon^b)$, which are all polytopes in $\mathbb{R}^{16}$:

\[ S^r(\varepsilon^r) = \Big\{ (c^r_1, \cdots, c^r_{16}) : \; c^r_1 + c^r_2 + \cdots + c^r_{16} = 1; \;\; c^r_{11} \ge 0.5; \;\; c^r_i \ge 0, \; i = 1, \cdots, 16, \; i \ne 11 \Big\}, \]

\[ S^b(\varepsilon^b) = \Big\{ (c^b_1, \cdots, c^b_{16}) : \; c^b_1 + c^b_2 + \cdots + c^b_{16} = 1; \;\; c^b_1 \ge 0.5; \;\; c^b_i \ge 0, \; i = 2, \cdots, 16 \Big\}. \]

By the method introduced in Appendix A, the extreme points of

Table 3
Parameters of B.

Symbol        Description                   B1       B2       B3       B4       Unit
$P^b_{jx}$    X coordinate of position      24537    12253    36432    18752    m
$P^b_{jy}$    Y coordinate of position      11412    13273    26732    27891    m
$P^b_{jz}$    Z coordinate of position      5449     7236     5367     6723     m
$V^b_{jx}$    X coordinate of speed         25       91       53       33       km·h^-1
$V^b_{jy}$    Y coordinate of speed         63       27       33       35       km·h^-1
$V^b_{jz}$    Z coordinate of speed         51       12       47       41       km·h^-1
$M^b_j$       Missile attack distance       89       120      56       78       km
$A^b_j$       Radar detection distance      153      186      103      127      km
$L^b_j$       Number of missiles carried    2        2        4        2        -

Table 4
Extreme points of $S^r(\varepsilon^r)$.

Symbol        Description                                     Value
$c^r_1$       Extreme point 1 of $S^r(\varepsilon^r)$         (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_2$       Extreme point 2 of $S^r(\varepsilon^r)$         (0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_3$       Extreme point 3 of $S^r(\varepsilon^r)$         (0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_4$       Extreme point 4 of $S^r(\varepsilon^r)$         (0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_5$       Extreme point 5 of $S^r(\varepsilon^r)$         (0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_6$       Extreme point 6 of $S^r(\varepsilon^r)$         (0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_7$       Extreme point 7 of $S^r(\varepsilon^r)$         (0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_8$       Extreme point 8 of $S^r(\varepsilon^r)$         (0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_9$       Extreme point 9 of $S^r(\varepsilon^r)$         (0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0.5, 0, 0, 0, 0, 0)
$c^r_{10}$    Extreme point 10 of $S^r(\varepsilon^r)$        (0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0.5, 0, 0, 0, 0, 0)
$c^r_{11}$    Extreme point 11 of $S^r(\varepsilon^r)$        (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0)
$c^r_{12}$    Extreme point 12 of $S^r(\varepsilon^r)$        (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0.5, 0, 0, 0, 0)
$c^r_{13}$    Extreme point 13 of $S^r(\varepsilon^r)$        (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0.5, 0, 0, 0)
$c^r_{14}$    Extreme point 14 of $S^r(\varepsilon^r)$        (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0.5, 0, 0)
$c^r_{15}$    Extreme point 15 of $S^r(\varepsilon^r)$        (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0.5, 0)
$c^r_{16}$    Extreme point 16 of $S^r(\varepsilon^r)$        (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0.5)

$S^r(\varepsilon^r)$ and $S^b(\varepsilon^b)$ are calculated, and they are presented in Table 4 and Table 5, respectively. The utility $U^r_\varepsilon(c^r_l, c^b_k)$ of R under different extreme point combinations can be calculated by Equation (15). Since $U^b_\varepsilon = -U^r_\varepsilon$, the utility $U^b_\varepsilon(c^r_l, c^b_k)$ of B can be easily obtained. $U^r_\varepsilon(c^r_l, c^b_k)$ and $U^b_\varepsilon(c^r_l, c^b_k)$ $(l = 1, 2, \cdots, N_r; \; k = 1, 2, \cdots, N_b)$ are presented in Fig. 5.

The value of the game can be calculated as $-0.4726$ by (47)-(50), and the Nash equilibrium strategy $(s^{r*}, s^{b*})$ is obtained according to (51) and (52), where $s^{r*} = (0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)$ and $s^{b*} = (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0)$, which gives the probabilities of R and B choosing their respective pure strategies. According to the pure strategies given in Appendix E, the Nash equilibrium strategy can be written as the

Table 5
Extreme points of $S^b(\varepsilon^b)$.

Symbol        Description                                     Value
$c^b_1$       Extreme point 1 of $S^b(\varepsilon^b)$         (1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
$c^b_2$       Extreme point 2 of $S^b(\varepsilon^b)$         (0.5, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
$c^b_3$       Extreme point 3 of $S^b(\varepsilon^b)$         (0.5, 0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
$c^b_4$       Extreme point 4 of $S^b(\varepsilon^b)$         (0.5, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
$c^b_5$       Extreme point 5 of $S^b(\varepsilon^b)$         (0.5, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
$c^b_6$       Extreme point 6 of $S^b(\varepsilon^b)$         (0.5, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
$c^b_7$       Extreme point 7 of $S^b(\varepsilon^b)$         (0.5, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0)
$c^b_8$       Extreme point 8 of $S^b(\varepsilon^b)$         (0.5, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0)
$c^b_9$       Extreme point 9 of $S^b(\varepsilon^b)$         (0.5, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0)
$c^b_{10}$    Extreme point 10 of $S^b(\varepsilon^b)$        (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0)
$c^b_{11}$    Extreme point 11 of $S^b(\varepsilon^b)$        (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
$c^b_{12}$    Extreme point 12 of $S^b(\varepsilon^b)$        (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0)
$c^b_{13}$    Extreme point 13 of $S^b(\varepsilon^b)$        (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0)
$c^b_{14}$    Extreme point 14 of $S^b(\varepsilon^b)$        (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0)
$c^b_{15}$    Extreme point 15 of $S^b(\varepsilon^b)$        (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0)
$c^b_{16}$    Extreme point 16 of $S^b(\varepsilon^b)$        (0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5)
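The entries of Tables 4 and 5 follow the closed form derived in Appendix A: the $l$-th extreme point equals $\varepsilon$ in every coordinate except the $l$-th, which takes whatever probability mass is left. A short Python sketch of that rule is given below. The function name `extreme_points` and the constant names are illustrative, and the sketch assumes the lower bounds sum to at most 1, as they do in this example.

```python
from typing import List, Sequence

def extreme_points(eps: Sequence[float]) -> List[List[float]]:
    """Extreme points of {c : sum(c) = 1, c_i >= eps_i} following Appendix A.

    Point l keeps c_i = eps_i for i != l and sets c_l = 1 - sum_{i != l} eps_i.
    """
    total = sum(eps)
    if total > 1.0:
        raise ValueError("lower bounds sum to more than 1, the set is empty")
    points = []
    for l in range(len(eps)):
        point = list(eps)
        point[l] = 1.0 - (total - eps[l])
        points.append(point)
    return points

# Perturbation vectors of the numerical example (0-based index 10 is s^r_11).
EPS_R = [0.0] * 16
EPS_R[10] = 0.5
EPS_B = [0.0] * 16
EPS_B[0] = 0.5

C_R = extreme_points(EPS_R)  # rows correspond to Table 4
C_B = extreme_points(EPS_B)  # rows correspond to Table 5

if __name__ == "__main__":
    for name, pts in (("c^r", C_R), ("c^b", C_B)):
        for l, p in enumerate(pts, start=1):
            print(f"{name}_{l} = {tuple(p)}")
```

For a general polytope this shortcut does not apply, and a vertex enumeration method such as the one in Ref. [30] would be needed instead.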


Fig. 5. The attack methods of R and B according to the Nash equilibrium strategy.

attack probabilities of each UCAV to different targets. The attack methods of R and B according to the Nash equilibrium strategy are shown in Fig. 6.

In addition, when we do not consider the constraints, by the method in Refs. [22,29], the game has a pure strategy Nash equilibrium $(s^r_5, s^b_{13})$ with value $-0.5186$, where $s^r_5 = (0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)$ and $s^b_{13} = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0)$ in mixed strategy form. The comparisons of the solutions of both parties obtained by the CSG-LL algorithm and the traditional algorithm are shown in Fig. 7. R's Nash equilibrium strategy obtained by the CSG-LL algorithm fully meets the constraints $\varepsilon^r$, while the one obtained by the traditional algorithm cannot satisfy the constraints $\varepsilon^r$ at strategy $s^r_{11}$. Similarly, B's Nash equilibrium strategy obtained by the CSG-LL algorithm satisfies the constraints $\varepsilon^b$, but the one obtained by the traditional algorithm cannot satisfy the constraints $\varepsilon^b$ at strategy $s^b_1$, which illustrates the effectiveness of the CSG-LL algorithm.

5. Conclusions

In view of the fact that there is time-sensitive information in air combat that needs to be prioritized in a complex air combat environment, a constraint strategy game model has been proposed, and the CSG-LL algorithm has been given to solve the constraint strategy game. The numerical example has shown that the Nash equilibrium obtained by the CSG-LL algorithm can meet the requirements of the given constraints. In addition, in complex decision-making problems, the strategy sets of decision makers are often constrained, so the proposed constraint strategy game with its solution algorithm, the CSG-LL algorithm, will provide a powerful tool for such problems.
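The figures reported in Section 4 can be cross-checked with the following Python sketch. It rests on assumptions that are not confirmed by the text shown here: Eq. (15) is not reproduced, so the utility of R is assumed to be the expected value destroyed on B's side minus the expected value lost on R's side, with independent attacks on a shared target; `P_B[i][j]` is assumed to be the probability that $B_{j+1}$ hits $R_{i+1}$; and the pure strategies are assumed to be enumerated in the pattern of Appendix E, with the pattern continued for $s^b_{13}$ to $s^b_{16}$. Under these assumptions the sketch gives roughly $-0.4726$ for the constrained equilibrium and $-0.5186$ for the unconstrained one from R's point of view, which matches the magnitudes reported in Section 4.

```python
from itertools import product

P_R = [[0.2492, 0.1960, 0.3250, 0.3659],   # P_R[i][j]: R_(i+1) attacking B_(j+1)
       [0.2047, 0.1320, 0.4261, 0.3881]]
P_B = [[0.1516, 0.2276, 0.0765, 0.0971],   # ASSUMED P_B[i][j]: B_(j+1) attacking R_(i+1)
       [0.2077, 0.2919, 0.0684, 0.0752]]
V_R = [1.1361, 1.1509]
V_B = [0.3592, 0.5195, 0.1448, 0.1723]
P_B_BY_ATTACKER = [list(col) for col in zip(*P_B)]  # index as [attacker][target]

# Pure strategies as target tuples; index 4 is s^r_5, index 12 is s^b_13.
R_PURE = list(product(range(4), repeat=2))  # (target of R1, target of R2)
B_PURE = list(product(range(2), repeat=4))  # (target of B1, ..., target of B4)

def expected_damage(targets, p_kill, values):
    """ASSUMED damage model: a target survives only if every attacker misses."""
    total = 0.0
    for tgt, value in enumerate(values):
        survive = 1.0
        for attacker, chosen in enumerate(targets):
            if chosen == tgt:
                survive *= 1.0 - p_kill[attacker][tgt]
        total += value * (1.0 - survive)
    return total

GAIN_R = [expected_damage(t, P_R, V_B) for t in R_PURE]
GAIN_B = [expected_damage(t, P_B_BY_ATTACKER, V_R) for t in B_PURE]

def utility_r(s_r, s_b):
    """Bilinear extension of the assumed utility to mixed strategies."""
    return (sum(p * g for p, g in zip(s_r, GAIN_R))
            - sum(q * h for q, h in zip(s_b, GAIN_B)))

def meets_constraints(s, eps, tol=1e-12):
    """Membership test for the constraint strategy set: s_i >= eps_i for all i."""
    return all(x >= e - tol for x, e in zip(s, eps))

def unit(i, n=16):
    return [1.0 if j == i else 0.0 for j in range(n)]

EPS_R = [0.5 if i == 10 else 0.0 for i in range(16)]
EPS_B = [0.5 if i == 0 else 0.0 for i in range(16)]
S_R_STAR = [0.5 if i in (4, 10) else 0.0 for i in range(16)]   # reported s^{r*}
S_B_STAR = [0.5 if i in (0, 12) else 0.0 for i in range(16)]   # reported s^{b*}

V_CONSTRAINED = utility_r(S_R_STAR, S_B_STAR)
V_UNCONSTRAINED = utility_r(unit(4), unit(12))

if __name__ == "__main__":
    print(round(V_CONSTRAINED, 4), round(V_UNCONSTRAINED, 4))
    print(meets_constraints(S_R_STAR, EPS_R), meets_constraints(unit(4), EPS_R))
```

The last line mirrors the comparison in Fig. 7: the constrained solution keeps at least 0.5 probability on $s^r_{11}$, while the unconstrained pure strategy $s^r_5$ puts none there.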

Fig. 6. The attack methods of R and B according to the Nash equilibrium strategy.


Fig. 7. Comparison of the CSG-LL algorithm and traditional algorithm.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by Major Projects for Science and Technology Innovation 2030 (Grant No. 2018AA0100800), Equipment Pre-research Foundation of Laboratory (Grant No. 61425040104) and in part by Jiangsu Province “333” project under Grant BRA2019051.

Appendix A. Calculate the extreme points of constraint strategy set

Consider the constraint strategy game of R and B: $G_e(\varepsilon) = \langle S^r(\varepsilon^r), S^b(\varepsilon^b), U^r, U^b \rangle$, where $\varepsilon = (\varepsilon^r, \varepsilon^b)$, $\varepsilon^r = (\varepsilon^r_1, \varepsilon^r_2, \cdots, \varepsilon^r_{|S^r|})$ and $\varepsilon^b = (\varepsilon^b_1, \varepsilon^b_2, \cdots, \varepsilon^b_{|S^b|})$. We will take $S^r(\varepsilon^r)$ as an example to illustrate how to calculate all the extreme points of a constraint set. The strategy set $S^r(\varepsilon^r)$ can be described as

\[ \begin{cases} c_1 + c_2 + \cdots + c_{|S^r|} = 1; \\ c_1 \ge \varepsilon^r_1; \\ c_2 \ge \varepsilon^r_2; \\ \quad \vdots \\ c_{|S^r|} \ge \varepsilon^r_{|S^r|}; \end{cases} \tag{A1} \]

which is a polytope in $\mathbb{R}^{|S^r|}$.

Every $|S^r|$ linearly independent hyperplanes in $\mathbb{R}^{|S^r|}$ have a unique intersection point. The extreme points of $S^r(\varepsilon^r)$ are the intersection points of the hyperplane $(H_0)$ $c_1 + c_2 + \cdots + c_{|S^r|} = 1$ with any $|S^r| - 1$ hyperplanes of the following $|S^r|$ hyperplanes

\[ (H_1) \;\; c_1 = \varepsilon^r_1; \qquad (H_2) \;\; c_2 = \varepsilon^r_2; \qquad \cdots \qquad (H_{|S^r|}) \;\; c_{|S^r|} = \varepsilon^r_{|S^r|}. \]

So $S^r(\varepsilon^r)$ has $|S^r|$ extreme points. Let $c^r_l$ be the intersection point of the $|S^r|$ hyperplanes $H_0, H_1, \cdots, H_{l-1}, H_{l+1}, \cdots, H_{|S^r|}$. We give an example to calculate $c^r_1$, which is the unique solution of the following equations in the variables $c_1, c_2, \cdots, c_{|S^r|}$.

\[ \begin{cases} c_1 + c_2 + \cdots + c_{|S^r|} = 1; \\ c_2 = \varepsilon^r_2; \\ \quad \vdots \\ c_{|S^r|} = \varepsilon^r_{|S^r|}; \end{cases} \tag{A2} \]

so $c^r_1 = (1 - \varepsilon^r_2 - \varepsilon^r_3 - \cdots - \varepsilon^r_{|S^r|}, \; \varepsilon^r_2, \; \varepsilon^r_3, \; \cdots, \; \varepsilon^r_{|S^r|})$, and other extreme points can be calculated similarly.

Appendix B. Proof of Lemma 1

According to Ref. [32], for each $\hat{x}^r \in X^r$, $\hat{x}^b \in X^b$, we have

\[ \min_{x^b \in X^b} U_p(\hat{x}^r, x^b) \le U_p(\hat{x}^r, \hat{x}^b), \qquad U_p(\hat{x}^r, \hat{x}^b) \le \max_{x^r \in X^r} U_p(x^r, \hat{x}^b). \]

Thus, we obtain

\[ \min_{x^b \in X^b} U_p(\hat{x}^r, x^b) \le \max_{x^r \in X^r} U_p(x^r, \hat{x}^b). \]

Because $\hat{x}^r$ is an arbitrary element in $X^r$, we obtain

\[ \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) \le \max_{x^r \in X^r} U_p(x^r, \hat{x}^b). \]

Since $\hat{x}^b$ is an arbitrary element in $X^b$, we have

\[ \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) \le \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b). \]

Appendix C. Proof of Lemma 2

(Sufficiency) On the basis of [32], suppose that $(x^{r*}, x^{b*})$ is a Nash

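equilibrium of $G_p$, we have

\[ \max_{x^r \in X^r} U_p(x^r, x^{b*}) \le U_p(x^{r*}, x^{b*}) \le \min_{x^b \in X^b} U_p(x^{r*}, x^b), \]

so one has

\[ \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b) \le U_p(x^{r*}, x^{b*}) \le \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b). \]

On the other hand, by Lemma 1, we have

\[ \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) \le \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b), \]

thereby,

\[ \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) = \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b) = U_p(x^{r*}, x^{b*}). \]

This completes the proof of sufficiency and equation (24).

(Necessity) Suppose that $G_p$ has a value $v$, i.e.

\[ \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b) = \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b) = v. \tag{C1} \]

It is easy to see that there are $x^{r*} \in X^r$ and $x^{b*} \in X^b$ satisfying

\[ \min_{x^b \in X^b} U_p(x^{r*}, x^b) = \max_{x^r \in X^r} \min_{x^b \in X^b} U_p(x^r, x^b), \tag{C2} \]

\[ \max_{x^r \in X^r} U_p(x^r, x^{b*}) = \min_{x^b \in X^b} \max_{x^r \in X^r} U_p(x^r, x^b). \tag{C3} \]

Then by equations (C1), (C2) and (C3), we have

\[ v = \max_{x^r \in X^r} U_p(x^r, x^{b*}) = \min_{x^b \in X^b} U_p(x^{r*}, x^b) \le U_p(x^{r*}, x^{b*}) \le \max_{x^r \in X^r} U_p(x^r, x^{b*}) = \min_{x^b \in X^b} U_p(x^{r*}, x^b) = v, \]

thereby

\[ \max_{x^r \in X^r} U_p(x^r, x^{b*}) = U_p(x^{r*}, x^{b*}) = v = \min_{x^b \in X^b} U_p(x^{r*}, x^b), \]

so we obtain

\[ U_p(x^r, x^{b*}) \le U_p(x^{r*}, x^{b*}) = v \le U_p(x^{r*}, x^b), \quad \forall x^r \in X^r, \; \forall x^b \in X^b. \]

This completes the proof of necessity and equation (24).

Appendix D. Proof of Theorem 4

For any two Nash equilibria $(x^{r*}, x^{b*})$ and $(\bar{x}^r, \bar{x}^b)$ of $G_p$, we have

\[ U_p(x^r, x^{b*}) \le U_p(x^{r*}, x^{b*}) \le U_p(x^{r*}, x^b), \quad \forall x^r \in X^r, \; \forall x^b \in X^b; \tag{D1} \]

\[ U_p(x^r, \bar{x}^b) \le U_p(\bar{x}^r, \bar{x}^b) \le U_p(\bar{x}^r, x^b), \quad \forall x^r \in X^r, \; \forall x^b \in X^b. \tag{D2} \]

Taking $x^r$ as $\bar{x}^r$ and $x^b$ as $\bar{x}^b$ in Eq. (D1), and taking $x^r$ as $x^{r*}$ and $x^b$ as $x^{b*}$ in Eq. (D2), we obtain

\[ U_p(\bar{x}^r, x^{b*}) \le U_p(x^{r*}, x^{b*}) \le U_p(x^{r*}, \bar{x}^b), \]

\[ U_p(x^{r*}, \bar{x}^b) \le U_p(\bar{x}^r, \bar{x}^b) \le U_p(\bar{x}^r, x^{b*}), \]

thereby,

\[ U_p(\bar{x}^r, x^{b*}) \le U_p(x^{r*}, x^{b*}) \le U_p(x^{r*}, \bar{x}^b) \le U_p(\bar{x}^r, \bar{x}^b) \le U_p(\bar{x}^r, x^{b*}), \]

so we have

\[ U_p(\bar{x}^r, x^{b*}) = U_p(x^{r*}, x^{b*}) = U_p(x^{r*}, \bar{x}^b) = U_p(\bar{x}^r, \bar{x}^b), \tag{D3} \]

which proves that $U_p(x^{r*}, x^{b*}) = U_p(\bar{x}^r, \bar{x}^b)$.

From the definition of Nash equilibrium and equation (D3), we have

\[ U_p(x^r, x^{b*}) \le U_p(x^{r*}, x^{b*}) = U_p(\bar{x}^r, x^{b*}), \quad \forall x^r \in X^r \tag{D4} \]

\[ U_p(x^r, \bar{x}^b) \le U_p(\bar{x}^r, \bar{x}^b) = U_p(x^{r*}, \bar{x}^b), \quad \forall x^r \in X^r \tag{D5} \]

\[ U_p(x^{r*}, \bar{x}^b) = U_p(x^{r*}, x^{b*}) \le U_p(x^{r*}, x^b), \quad \forall x^b \in X^b \tag{D6} \]

\[ U_p(\bar{x}^r, x^{b*}) = U_p(\bar{x}^r, \bar{x}^b) \le U_p(\bar{x}^r, x^b), \quad \forall x^b \in X^b \tag{D7} \]

From equations (D4) and (D7), there is

\[ U_p(x^r, x^{b*}) \le U_p(\bar{x}^r, x^{b*}) \le U_p(\bar{x}^r, x^b), \quad \forall x^r \in X^r, \; \forall x^b \in X^b, \]

and from equations (D5) and (D6), we obtain

\[ U_p(x^r, \bar{x}^b) \le U_p(x^{r*}, \bar{x}^b) \le U_p(x^{r*}, x^b), \quad \forall x^r \in X^r, \; \forall x^b \in X^b. \]

The above two inequalities prove that $(\bar{x}^r, x^{b*})$ and $(x^{r*}, \bar{x}^b)$ are Nash equilibria of $G_p$.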
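The two properties of Theorem 4 can be seen on a small hypothetical zero-sum matrix game, sketched in Python below. The 3 x 3 matrix is made up for illustration and is unrelated to the air combat example; the row player maximizes, and pure strategies are written as unit vectors. The helper `is_nash` tests the saddle-point condition against pure deviations only, which is enough for a bilinear utility.

```python
def expected_utility(A, x, y):
    """Bilinear utility of the row player for mixed strategies x and y."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(A)) for j in range(len(A[0])))

def is_nash(A, x, y, tol=1e-9):
    """Saddle-point test: no pure row deviation gains, no pure column deviation hurts."""
    v = expected_utility(A, x, y)
    best_row = max(sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(A)))
    worst_col = min(sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0])))
    return best_row <= v + tol and worst_col >= v - tol

# Hypothetical game with several equilibria (saddle points at the four corners).
A = [[1, 2, 1],
     [0, 3, 0],
     [1, 2, 1]]

X_STAR, Y_STAR = [1, 0, 0], [1, 0, 0]   # first equilibrium
X_BAR, Y_BAR = [0, 0, 1], [0, 0, 1]     # second equilibrium

if __name__ == "__main__":
    # Both pairs are equilibria with the same value (indifference of the value).
    print(is_nash(A, X_STAR, Y_STAR), is_nash(A, X_BAR, Y_BAR))
    print(expected_utility(A, X_STAR, Y_STAR), expected_utility(A, X_BAR, Y_BAR))
    # Mixing strategies from different equilibria still gives equilibria.
    print(is_nash(A, X_STAR, Y_BAR), is_nash(A, X_BAR, Y_STAR))
    # A pair that is not an equilibrium, for contrast.
    print(is_nash(A, [0, 1, 0], [0, 1, 0]))
```

This is the practical content of the theorem for R and B: each side can pick any of its equilibrium strategies without coordinating on which equilibrium the other side has in mind.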


Appendix E. Strategies of R and B

\[ s^r_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad s^r_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad s^r_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad s^r_4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

\[ s^r_5 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad s^r_6 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad s^r_7 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad s^r_8 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

\[ s^r_9 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad s^r_{10} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad s^r_{11} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad s^r_{12} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

\[ s^r_{13} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad s^r_{14} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad s^r_{15} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad s^r_{16} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

\[ s^b_1 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad s^b_2 = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad s^b_3 = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad s^b_4 = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} \]

\[ s^b_5 = \begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad s^b_6 = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}, \quad s^b_7 = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{pmatrix}, \quad s^b_8 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{pmatrix} \]

\[ s^b_9 = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad s^b_{10} = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}, \quad s^b_{11} = \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}, \quad s^b_{12} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \end{pmatrix} \]

References

[1] Duan H, Zhang Q. Visual measurement in simulation environment for vision-based UAV autonomous aerial refueling. IEEE Trans Instrument Meas 2015;64(9):2468-80.
[2] Padhy RP, Xia F, Choudhury SK, Sa PK, Bakshi S. Monocular vision aided autonomous UAV navigation in indoor corridor environments. IEEE Trans Sustain Comput 2019;4(1):96-108.
[3] Xu M, Wang S, Tao J, Liang G. Research on cooperative task allocation for multiple UCAVs based on modified co-evolutionary genetic algorithm. In: 2013 international conference on computational and information sciences; 2013. p. 125-8.
[4] Duan H, Li P, Yu Y. A predator-prey particle swarm optimization approach to multiple UCAV air combat modeled by dynamic game theory. IEEE/CAA J Automat Sinica 2015;2(1):11-8.
[5] Yang Z, Zhou D, Piao H, Zhang K, Kong W, Pan Q. Evasive maneuver strategy for UCAV in beyond-visual-range air combat based on hierarchical multi-objective evolutionary algorithm. IEEE Access 2020;8:46605-23.
[6] Zhang Y, Chen J, Shen L. Hybrid hierarchical trajectory planning for a fixed-wing UCAV performing air-to-surface multi-target attack. J Syst Eng Electron 2012;23(4):536-52.
[7] Zhang H, Huang C. Maneuver decision-making of deep learning for UCAV thorough azimuth angles. IEEE Access 2020;8:12976-87.
[8] Chin HH. Knowledge-based system of supermaneuver selection for pilot aiding. J Aircraft 1989;26(12):1111-7.
[9] Hu X, Ma H, Ye Q, Luo H. Hierarchical method of task assignment for multiple cooperating UAV teams. J Syst Eng Electron 2015;26(5):1000-9.
[10] Fan DD, Theodorou EA, Reeder J. Model-based stochastic search for large scale optimization of multi-agent UAV swarms. In: 2018 IEEE symposium series on computational intelligence (SSCI); 2018. p. 2216-22.
[11] Brooks RR, Pang J, Griffin C. Game and information theory analysis of electronic countermeasures in pursuit-evasion games. IEEE Trans Syst Man Cybern Syst Hum 2008;38(6):1281-94.
[12] Lopez VG, Lewis FL, Wan Y, Sanchez EN, Fan L. Solutions for multiagent pursuit-evasion games on communication graphs: finite-time capture and asymptotic behaviors. IEEE Trans Automat Contr 2020;65(5):1911-23.
[13] Austin F, Carbone G, Hinz H, Lewis M, Falco M. Game theory for automated maneuvering during air-to-air combat. J Guid Contr Dynam 1990;13(6):1143-9.
[14] Cruz JB, Simaan MA, Gacic A, Jiang H, Letelliier B, Li M, Liu Y. Game-theoretic modeling and control of a military air operation. IEEE Trans Aero Electron Syst 2001;37(4):1393-405.
[15] Cruz JB, Simaan MA, Gacic A, Liu Y. Moving horizon nash strategies for a military air operation. IEEE Trans Aero Electron Syst 2002;38(3):989-99.
[16] McEneaney WM, Fitzpatrick BG, Lauko IG. Stochastic game approach to air operations. IEEE Trans Aero Electron Syst 2004;40(4):1191-216.
[17] Garcia E, Casbeer DW, Pachter M. Active target defence differential game: fast defender case. IET Control Theory & Appl 2017;11(17):2985-93.
[18] Nishizaki I, Sakawa M. Equilibrium solutions in multiobjective bimatrix games with fuzzy payoffs and fuzzy goals. Fuzzy Set Syst 2000;111(1):99-116.
[19] Levitskii SV. System analysis of close air combat for the development of the knowledge base of an onboard operative-advising expert system. J Comput Syst Sci Int 2002;41(6):908-20.
[20] Wang Z, Li Z. Underground building fire emergency communication network based on mesh networks. In: 2012 international conference on computer science and service system; 2012. p. 914-7.
[21] Yeh C, Hsu U. The feasibility strategy to enhance rate of emergency rescue dispatch with optimized way of dressing. In: 2016 international conference on applied system innovation (ICASI); 2016. p. 1-4.
[22] Parthasarathy T, Raghavan TES. Some topics in two-person games. American Elsevier Publishing Company; 1971.
[23] Wang X, Liu Z, Hou C, Yuan J. Modeling and decision-making of multi-target optimization assignment for aerial defense weapon. Acta Armamentarii 2007;28(2):228-31.
[24] Dong Y, Feng J, Zhang H. Cooperative tactical decision methods for multi-aircraft air combat simulation. J Syst Simul 2002;14(6):723-5.
[25] Jiang C, Ding Q, Wang J, Wang J. Research on threat assessment and target distribution for multi-aircraft cooperative air combat. Fire Control Command Control 2008;33(11):8-12+21.
[26] Shen Z, Xie W, Zhao X, Yu C. Modeling of UAV battlefield threats based on artificial potential field. Comput Simul 2014;31(2):60-4.
[27] Zhao M, Li B, Wang M. On game strategy for multi-UAV beyond-visual-range air combat. Electron Optic Contr 2015;22(4):41-4.
[28] Yao M, Zhu Y, Zhao M. Study on cooperative attacking strategy of unmanned aerial vehicles in adversarial environment. Chin J Sci Instrum 2011;32(8):1891-7.
[29] Maschler M, Solan E, Zamir S. Game theory. Cambridge University Press; 2013.
[30] Dyer ME, Proll LG. An algorithm for determining all extreme points of a convex polytope. Math Program 1977;12(1):81-96.
[31] Basar T, Olsder GJ. Dynamic noncooperative game theory. Society for Industrial and Applied Mathematics; 1999.
[32] Xie Z. Introduction to game theory. Science Press; 2010.
