
    Cooperative maneuver decision making for multi-UAV air combat based on incomplete information dynamic game

Defence Technology, 2023, Issue 9

Zhi Ren, Dong Zhang *, Shuo Tang, Wei Xiong, Shu-heng Yang

a School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China

b Shaanxi Aerospace Flight Vehicle Design Key Laboratory, Northwestern Polytechnical University, Xi'an 710072, China

Keywords: Cooperative maneuver decision; Air combat; Incomplete information dynamic game; Perfect Bayes-Nash equilibrium; Reinforcement learning

ABSTRACT Cooperative autonomous air combat of multiple unmanned aerial vehicles (UAVs) is one of the main combat modes in future air warfare, which becomes even more complicated with a highly changeable situation and uncertain information about the opponents. As such, this paper presents a cooperative decision-making method based on an incomplete information dynamic game to generate maneuver strategies for multiple UAVs in air combat. Firstly, a cooperative situation assessment model is presented to measure the overall combat situation. Secondly, an incomplete information dynamic game model is proposed to model the dynamic process of air combat, and a dynamic Bayesian network is designed to infer the tactical intention of the opponent. Then a reinforcement learning framework based on multi-agent deep deterministic policy gradient is established to obtain the perfect Bayes-Nash equilibrium solution of the air combat game model. Finally, a series of simulations is conducted to verify the effectiveness of the proposed method, and the simulation results show effective synergies and cooperative tactics.

1. Introduction

Unmanned aerial vehicles (UAVs) play an important role in air combat, achieving complex missions through coordinated tactics and maneuvers [1,2]. Thus, cooperative maneuver decision making is the key technology of multi-UAV air combat, where cooperative maneuvers are taken to get the better of the opponent and thus accomplish certain missions. Since the maneuver performance and strategies of the opponent are uncertain, the maneuver decision space of a UAV becomes more complex as the combat situation varies [3,4]. Therefore, how to infer the opponent's strategy and determine the optimal cooperative maneuvering decision according to the current combat situation is the key point of research on cooperative maneuver decision making [5,6].

At present, typical methods commonly used for air combat decision making include expert systems, optimization theory, reinforcement learning, and game theory. The expert system is the earliest decision-making method applied to air combat; it encodes the analysis and judgments of experienced pilots and models the process of decision making in air combat [7]. On the basis of expert systems, the air combat model is refined by decomposing the process of decision making into sequential links, including situation awareness assessment [8-10], maneuver intention prediction [11-13], and combat tactic decision [14-17]. Considering the requirements and constraints of specific air combat scenarios, optimization theory is introduced to determine the optimal maneuver for certain countermeasures [18,19]. With the development of artificial intelligence technology, reinforcement learning has also been applied to air combat decision making. Different from the results of traditional methods, neural network training can give rise to new combat tactics and maneuver strategies [20-24]. The above studies mainly focus on one-on-one air combat scenarios with few variables and small dimensions, while cooperative maneuvering in multi-UAV combat is more complex and requires further consideration of the highly dynamic combat process and the uncertain maneuver strategies of the rivals. Thus, the traditional methods are difficult to apply directly.

Game theory is a key method for studying multilateral decision making, and it is applicable to strategy optimization of multiple agents or groups in adversarial scenarios with constraints. Li et al. [25] proposed a constraint strategy game approach for cooperative decision making of multiple UAVs in air combat under strict constraints. Ha et al. [26] developed a stochastic game model with a sequence of normal-form games for various beyond-visual-range air combat scenarios. Cao et al. [27] proposed a cooperative maneuver decision-making algorithm for weakly connected adversarial environments by combining game models with intuitionistic fuzzy sets. The above studies establish game models for specific scenarios, while the opponent's maneuver strategies are treated as complete information, which is not available in actual air combat. Therefore, it is essential to establish an incomplete information dynamic game model for cooperative maneuver decision making.

The perfect Bayes-Nash equilibrium (PBE) is the Nash equilibrium of the incomplete information dynamic game, which is usually solved by linear programming [28], the counterfactual regret minimization (CFR) algorithm [29], and other designed methods. However, the incomplete information dynamic game model for multi-UAV air combat has more game phases and larger computational complexity, which makes it difficult to apply the traditional methods directly to determine the PBE. Since neural networks have strong fitting ability, they are commonly used in reinforcement learning methods to solve the PBE [30,31]. In this paper, a PBE solution strategy based on the multi-agent deep deterministic policy gradient (MADDPG) method is proposed, where the actions, states, and rewards are designed based on the corresponding elements of the multi-UAV combat scenario and the game model.

    The main contributions of this paper are given as follows:

(i) A cooperative maneuver decision-making method based on an incomplete information dynamic game model is proposed, considering the high dynamics and uncertainties of the opponents in air combat.

(ii) A reinforcement learning framework based on MADDPG is established to determine the perfect Bayes-Nash equilibrium of the proposed air combat game model.

The remainder of this paper is organized as follows. Section 2 introduces the UAV maneuver model and the cooperative air combat situation assessment model. In Section 3, the incomplete information dynamic game model for cooperative maneuver decision making in multi-UAV air combat is proposed, and the dynamic Bayesian network is applied to infer the incomplete information. Section 4 discusses the existence of the PBE and establishes the MADDPG framework. The simulation experiments are conducted and analyzed in Section 5, and Section 6 concludes this paper.

2. Problem formulation

Cooperative maneuver decision making requires consideration of the real-time situation and the opponent's maneuver strategies, where the determined tactical maneuvers are carried out to achieve an advantage in the overall air combat situation and thus win the combat.

2.1. UAV maneuver model

The flight trajectory of a UAV can be decomposed into seven elemental maneuvers: max load factor turns (left and right), max longitudinal acceleration, steady flight, max longitudinal deceleration, max load factor pull up, and max load factor push over [32]. The combination of the elemental maneuvers allows for complex tactical maneuvers such as the kindle, split-S, yo-yo, etc. In this paper, the elemental maneuvers in five directions are expanded by considering the speed variation. The control strategy for maneuvering in any direction with speed variation can be described by the tangential overload, normal overload and bank angle, as shown in Fig. 1.

Fig. 1. Control strategy of maneuvering in any direction with speed variation.

The kinematics and dynamics model of the UAV in the inertial coordinate system is as follows:
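The original equation is reconstructed here as the standard three-degree-of-freedom point-mass model, consistent with the variable definitions that follow:

$$
\begin{cases}
\dot{x} = V\cos\theta\cos\varphi\\
\dot{y} = V\cos\theta\sin\varphi\\
\dot{z} = V\sin\theta\\
\dot{V} = g\,(n_x - \sin\theta)\\
\dot{\theta} = \dfrac{g}{V}\,(n_y\cos\gamma - \cos\theta)\\
\dot{\varphi} = \dfrac{g\,n_y\sin\gamma}{V\cos\theta}
\end{cases}
$$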

where V is the velocity of the UAV, x, y are the horizontal coordinates, z is the altitude, θ is the flight path angle, φ is the heading angle, γ is the bank angle, n_x is the tangential overload, n_y is the normal overload and g is the acceleration of gravity. Besides, the tangential overload enables acceleration, deceleration and uniform flight; the normal overload enables pitching motion; and the bank angle enables yaw motion. Therefore, with certain control strategies, maneuvers in all directions with speed variation can be generated under the constraints of the control variables and flight performance, which are considered as follows:
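Reconstructed from the definitions below, the constraints take the form

$$
n_{x\min} \le n_x \le n_{x\max},\qquad
n_{y\min} \le n_y \le n_{y\max},\qquad
V_{\min} \le V \le V_{\max}
$$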

where n_x,min is the minimum available tangential overload, n_x,max is the maximum available tangential overload, n_y,min is the minimum available normal overload, n_y,max is the maximum available normal overload, V_min is the minimum velocity, and V_max is the maximum velocity that the UAV can reach. In particular, the control boundaries of overload and velocity represent the maneuverability of the UAV.
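As a minimal illustration, the following Python sketch integrates the point-mass model above with the overload and velocity limits clipped at each step. The Euler integration scheme and the function name are our own assumptions; the default numeric bounds are taken from the parameter settings in Section 5.1.

```python
import numpy as np

G = 9.81  # acceleration of gravity, m/s^2

def step(state, nx, ny, gamma, dt=0.1,
         nx_bounds=(-1.0, 2.0), ny_bounds=(0.0, 8.0), v_bounds=(90.0, 400.0)):
    """One Euler step of the 3-DOF point-mass model under overload/velocity limits.

    state = [x, y, z, V, theta, phi]; gamma is the bank angle (rad).
    Default bounds follow the paper's parameter settings (Section 5.1).
    """
    nx = np.clip(nx, *nx_bounds)          # tangential overload limit
    ny = np.clip(ny, *ny_bounds)          # normal overload limit
    x, y, z, v, theta, phi = state
    dx = v * np.cos(theta) * np.cos(phi)
    dy = v * np.cos(theta) * np.sin(phi)
    dz = v * np.sin(theta)
    dv = G * (nx - np.sin(theta))
    dtheta = G / v * (ny * np.cos(gamma) - np.cos(theta))
    dphi = G * ny * np.sin(gamma) / (v * np.cos(theta))
    return np.array([x + dx * dt, y + dy * dt, z + dz * dt,
                     np.clip(v + dv * dt, *v_bounds),     # velocity limit
                     theta + dtheta * dt, phi + dphi * dt])
```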

2.2. Cooperative situation assessment model

Situation assessment is the premise of maneuver decision making, where the survival threat and combat edge are measured to provide the basis for maneuver decisions. In one-on-one air combat situation assessment, the air combat situation includes advantage, balance and disadvantage, according to the relative position and velocity direction of the two combatants. The spatial geometric relationship of the UAVs is shown in Fig. 2.

Fig. 2. Air combat geometry situation.

Fig. 2 shows the relative course angle λ_i between the combatants, which can be calculated by
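In the standard form implied by the vector definitions below, this is

$$
\lambda_i = \arccos\!\left(\frac{\mathbf{R}\cdot\mathbf{v}_i}{\lVert\mathbf{R}\rVert\,\lVert\mathbf{v}_i\rVert}\right)
$$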

where R = [x_r - x_b, y_r - y_b, z_r - z_b]^T is the position vector, v_i = [cos θ_i cos φ_i, cos θ_i sin φ_i, sin θ_i]^T is the velocity vector, and the subscripts r and b denote the UAVs of the red side and the blue side, respectively.

The relative situation of UAV air combat is quantitatively described by a situation function based on the relative position and posture. The angle advantages u_θ, u_φ and the distance advantage u_d are taken as the main components of the situation function [33], which can be calculated by
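The individual advantage terms follow Ref. [33]; the weighted combination implied by the weight coefficients defined below is, in our notation and assuming normalized weights,

$$
u = \omega_\theta u_\theta + \omega_\varphi u_\varphi + \omega_d u_d,\qquad
\omega_\theta + \omega_\varphi + \omega_d = 1
$$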

where ω_θ, ω_φ and ω_d are the weight coefficients of the angle advantages and the distance advantage, respectively.

It is easy to see that the larger the situation function is, the more likely the UAV is to survive and destroy the opponent. On this basis, the situation function matrix is designed as follows:
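Reconstructed from the definitions in the next paragraph, the matrix is

$$
U=\begin{bmatrix}
u_{11} & u_{12} & \cdots & u_{1n}\\
u_{21} & u_{22} & \cdots & u_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
u_{m1} & u_{m2} & \cdots & u_{mn}
\end{bmatrix}
$$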

where U is the situation function matrix of m-on-n air combat and u_ij is the one-on-one combat situation between UAV i = 1, 2, …, m and UAV j = 1, 2, …, n. On this basis, the overall situation is determined by considering the cooperative combat tactics. Assume that each UAV can only choose one rival as its target in the process of multi-UAV air combat; then the opponent e_i with the largest value of the situation function will be selected as the target, which is described as follows:
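Reconstructed from the description above, the target selection rule is

$$
e_i = \arg\max_{j\in\{1,\dots,n\}} u_{ij}
$$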

Suppose that the UAVs with the same target can form cooperative combat tactics through certain maneuvers, and the overall cooperative situation will be improved by such a cooperative effect. Assume that the combat situations among the UAVs with the same target are independent of each other; then the overall cooperative situation of UAV i is defined as follows:
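Under the stated independence assumption, a natural product form (our notation, not necessarily the authors' exact expression) is

$$
\bar{u}_i = 1-\prod_{k:\,e_k=e_i}\bigl(1-u_{k\,e_i}\bigr)
$$

so that every teammate attacking the same target raises the overall cooperative situation. The following Python sketch, with hypothetical function names, computes the target assignment and this cooperative situation from a given situation matrix U:

```python
import numpy as np

def cooperative_situation(U):
    """Target selection and cooperative situation from an m-by-n situation matrix.

    Each red UAV picks the opponent with the largest one-on-one situation value
    (e_i = argmax_j u_ij); UAVs sharing a target reinforce each other under the
    independence assumption via the hypothetical product form above.
    """
    m, _ = U.shape
    targets = U.argmax(axis=1)                         # e_i for each UAV i
    coop = np.empty(m)
    for i in range(m):
        mates = np.flatnonzero(targets == targets[i])  # UAVs sharing target e_i
        coop[i] = 1.0 - np.prod(1.0 - U[mates, targets[i]])
    return targets, coop

# Example: two red UAVs against two blue UAVs
U = np.array([[0.6, 0.3],
              [0.7, 0.2]])
print(cooperative_situation(U))  # both pick blue 0; coop = 1 - 0.4*0.3 = 0.88
```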

3. Multi-UAV air combat game model

3.1. Incomplete information dynamic game

The incomplete information of the game model refers to a lack of full information by the players about the normal form of the game, such as some other players' strategy spaces and utility functions [34]. In air combat problems, the tactical intention refers to the UAV's adoption of a pursuit or evasion strategy according to the real-time situation, which determines the combat strategies and corresponding maneuvers. Since the opponents' preference of tactical intention is uncertain, the incomplete information of the players in the game model corresponds to the preference of tactical intention. On this basis, a virtual player "nature" is introduced through the Harsanyi transformation [35] to set the types of the players, which transforms a dynamic game of incomplete information into a dynamic game of complete but imperfect information. Fig. 3 shows the extensive form of the incomplete information dynamic game model for multi-UAV air combat.

Fig. 3. Extensive form game.

On this basis, we formulate the incomplete information dynamic game G = [N, T, p_T, Π, p_π, V] as follows:

• N = {r_1, r_2, …, r_m, b_1, b_2, …, b_n} denotes the player set of the multi-UAV air combat game, where r represents the UAVs of the red side and b represents the UAVs of the blue side.

• T = {T_1, T_2, …, T_l} is the type set of the players and p_T is the corresponding probability density, where T_i is the type of player i and represents the tactical intention of the UAV. Suppose that the T_i are independent of each other and obey normal distributions. The larger T_i is, the more likely the UAV is to adopt a pursuit and attack strategy rather than an evasion and defense strategy.

• Π = {π_1, π_2, …, π_k} is the strategy set and p_π is the probability density of the mixed strategies, which represents that player i takes the action of the corresponding maneuver based on the determined tactical intention T_i, upon which the state transfers to the successor state.

• V denotes the utility function of the game and represents the pay-off of certain mixed strategies.

Suppose that the main goal of cooperative maneuver decision making is to maintain or improve the overall air combat situation with consideration of synergies and coordination. Therefore, the overall cooperative situation of the players after the maneuver state transfer is used as the utility function of the corresponding mixed strategies, which is defined as follows:
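In our notation, with s_{t+1} denoting the state reached after the joint maneuver π, this utility can be written as

$$
V_i(\pi) = \bar{u}_i(s_{t+1})
$$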

3.2. Dynamic Bayesian network inference

In the incomplete information dynamic game of multi-UAV air combat, the type of the red side is set by "nature", while the type of the blue side is inferred through a dynamic Bayesian network based on prior knowledge, including the state transition probabilities and the prior probability distribution. A Bayesian network consists of nodes and directed edges, where each node represents a random variable and each directed edge represents the causal relationship between two variables, which can be denoted as follows:
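Reconstructed from the definitions that follow, the network can be written as the tuple

$$
B=\bigl\langle X,\;Y,\;P(x_t\mid x_{t-1}),\;P(y_t\mid x_t),\;P(x_0)\bigr\rangle
$$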

where X = {x_0, x_1, …, x_{n-1}} is the joint probability distribution of the state variables, Y = {y_0, y_1, …, y_{n-1}} is the joint probability distribution of the observed variables, P(x_t|x_{t-1}) is the transition probability between neighboring state variables, P(y_t|x_t) is the observation probability relating the observed variable to the corresponding state variable, and P(x_0) is the probability of the initial state variable.

Since the type of the player determines the tactical intention preference and the corresponding maneuvers of the UAV in the real-time combat situation, the change pattern of the state parameters indicates the characteristics of certain maneuvers. Moreover, the state parameters of the opponent, such as the position and velocity, can be observed by the airborne radar during the process of air combat. Therefore, the dynamic Bayesian network is divided into a decision layer and a feature layer. Fig. 4 shows the structure of the designed dynamic Bayesian network, where the feature layer selects the velocity, heading angle and altitude as the observed variables, and the decision layer selects the type of the blue side, the tactical intention decision and the maneuver decision as the state variables.

Fig. 4. Structure of dynamic Bayesian network.

According to the Bayesian theorem and the designed dynamic Bayesian network, the conditional probability of the tactical intention that the blue side takes under the corresponding states is calculated by the recursion given below, where g_b is the specific tactical intention of the blue side, e_{1:t} denotes the sequence of the maneuver state variables and the neighboring observed variables before time t, and A_t represents the state variable of tactical intention.
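The standard forward-filtering recursion for such a two-layer network, consistent with the quantities defined above, is

$$
P(A_t=g_b\mid e_{1:t})\;\propto\;P(e_t\mid A_t=g_b)\sum_{A_{t-1}}P(A_t=g_b\mid A_{t-1})\,P(A_{t-1}\mid e_{1:t-1})
$$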

In each stage of the multi-UAV air combat game, the players of the red side iteratively estimate the type and maneuver of the blue side based on the probability distribution set by "nature". Let T_r be the type of the red side set by "nature" and s_b be the prior state information of the blue side; then the inferred type of the blue side T_b is calculated by
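A Bayesian posterior of the generic form below (our reconstruction) serves this purpose:

$$
P(T_b\mid s_b, T_r)=\frac{P(s_b\mid T_b, T_r)\,P(T_b)}{\sum_{T'}P(s_b\mid T', T_r)\,P(T')}
$$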

4. PBE solution strategy

4.1. PBE existence analysis

The perfect Bayes-Nash equilibrium is the strategy evaluation of the incomplete information dynamic game model, consisting of a combination of inferred probabilities and mixed strategies [36]. In multi-UAV air combat, when the state and posture of the blue side are given, the red side can infer its type and tactical intention based on the real-time combat situation, and the corresponding PBE ensures that the overall combat situation in the subsequent game at any decision stage is optimal compared with other strategies. The PBE of the proposed multi-UAV air combat game model is defined as follows:
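In textbook form (our notation), a strategy profile π* together with a belief system μ is a PBE if the beliefs are updated by Bayes' rule wherever possible and, at every decision stage, for every player i and every type T_i,

$$
V_i\bigl(\pi_i^{*},\pi_{-i}^{*}\mid\mu\bigr)\;\ge\;V_i\bigl(\pi_i,\pi_{-i}^{*}\mid\mu\bigr)\qquad\forall\,\pi_i\in\Pi
$$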

Combining Eq. (6) and Eq. (7), it is easy to see that the utility function is continuous, with function values bounded within (0,1). Meanwhile, the utility function is discrete across the separate game stages. Therefore, for each maneuver strategy in any stage of the air combat game, any other maneuver will either improve or worsen the situation, which satisfies the following:
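Formally, this can be expressed as (our reconstruction): for any stage strategy π and any alternative π′ ≠ π,

$$
V(\pi')>V(\pi)\quad\text{or}\quad V(\pi')<V(\pi)
$$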

Thus, the proposed game G = [N, T, p_T, Π, p_π, V] satisfies strong uniform payoff security [36], and for each T ∈ T, the maneuver strategy Π described by the tangential overload, normal overload and bank angle is upper semicontinuous. Then, according to the PBE existence theorem proposed by Carbonell et al. [37], the proposed air combat game must possess a perfect Bayes-Nash equilibrium, which shows that the proposed maneuver decision method can achieve the tactical intention and guarantee the advantage of the overall air combat situation.

4.2. MADDPG training framework

The basic process of a reinforcement learning method is to iteratively transfer actions and states through policy optimization so as to acquire the maximum sum of rewards. MADDPG is based on the actor-critic method, in which multiple neural networks are connected in parallel with centralized learning and distributed execution. During the training process, the actor network uses an off-line policy to generate the training dataset and replays the empirical data. The critic network samples the training dataset based on the strategy network and then iteratively learns the optimal strategy. The specific structure of the MADDPG designed for determining the PBE of the proposed game model is shown in Fig. 5.

Fig. 5. Structure of MADDPG network.

The MADDPG training model is established based on the proposed game model to solve the perfect Bayes-Nash equilibrium. In the MADDPG model, the agents correspond to the players in the game model, which represent the UAVs in the air combat. The state is set as the position of the UAV in the air combat, and the action is set as the maneuver control of the UAV, which corresponds to the action of the pure strategy in the proposed game model. Thus, the training result of the policy corresponds to the perfect Bayes-Nash equilibrium of the mixed strategy.

The reward of the reinforcement learning model determines the expectation toward which the policy converges over the strategy learning iterations, and the agent is trained to maximize the final cumulative reward through iterative learning. As noted in Eq. (11), the mixed strategy of the perfect Bayes-Nash equilibrium yields the largest utility. Therefore, the utility function of the dynamic game model can be used as the reward function of the reinforcement learning model, which is designed as follows:
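With r_i^t the reward of agent i at step t, this amounts to (in our notation)

$$
r_i^{t}=V_i(\pi^{t})=\bar{u}_i(s_{t+1})
$$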

Suppose that the MADDPG trains m agents of the red side against the blue side, with the policy of each agent denoted accordingly. To estimate the parameters of the actor and critic networks σ ∈ {θ_μ, θ_Q}, respectively, the gradient descent method is introduced and calculated as follows:
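The standard MADDPG updates consistent with this description are the deterministic policy gradient for each actor and the temporal-difference loss for each centralized critic:

$$
\nabla_{\theta_{\mu_i}}J=\mathbb{E}\Bigl[\nabla_{\theta_{\mu_i}}\mu_i(o_i)\,\nabla_{a_i}Q_i\bigl(\mathbf{o},a_1,\dots,a_m\bigr)\Big|_{a_i=\mu_i(o_i)}\Bigr]
$$

$$
\mathcal{L}(\theta_{Q_i})=\mathbb{E}\Bigl[\bigl(Q_i(\mathbf{o},a_1,\dots,a_m)-y_i\bigr)^{2}\Bigr],\qquad
y_i=r_i+\gamma\,Q_i'\bigl(\mathbf{o}',a_1',\dots,a_m'\bigr)\Big|_{a_j'=\mu_j'(o_j')}
$$

where the primed quantities denote the target networks and next-step observations.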

In summary, the decision-making process of multi-UAV cooperative maneuvering based on the incomplete information dynamic game is illustrated in Fig. 6. During the combat, each step of maneuver decision making is considered as a stage of the dynamic game. Firstly, the proposed dynamic game of incomplete information is transformed into a dynamic game of complete but imperfect information by the Harsanyi transformation, where "nature" selects the types of both players and the Bayesian network infers the tactical intention of the opponents. Then the perfect Bayes-Nash equilibrium is solved, and the cooperative maneuver is decided to gain the advantage of the overall situation. In particular, the perfect Bayes-Nash equilibrium is a mixed strategy of certain maneuvers with corresponding probabilities, where the state of the UAV after the maneuver decision is determined in each step as follows:
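In our notation, a maneuver a_t is sampled from the equilibrium mixed strategy and propagated through the maneuver model:

$$
a_t\sim p_{\pi^{*}}(\cdot),\qquad s_{t+1}=f(s_t,a_t)
$$

where f denotes the point-mass dynamics of Section 2.1. The following Python sketch, with hypothetical helpers (dbn.infer_type for the Section 3.2 inference, actors[i] for the trained MADDPG policies, and the step integrator sketched earlier), illustrates one such decision stage:

```python
import numpy as np

def combat_stage(red_states, blue_states, dbn, actors, rng, dt=0.1):
    """One stage of the dynamic game: infer, solve, sample, propagate."""
    blue_type = dbn.infer_type(blue_states)            # DBN intention inference
    next_red = []
    for i, s in enumerate(red_states):
        obs = np.concatenate([s, blue_states.ravel(), [blue_type]])
        maneuvers, probs = actors[i](obs)              # PBE mixed strategy
        k = rng.choice(len(maneuvers), p=probs)        # a_t ~ p_pi*
        nx, ny, gamma = maneuvers[k]
        next_red.append(step(s, nx, ny, gamma, dt))    # s_{t+1} = f(s_t, a_t)
    return np.array(next_red)
```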

Fig. 6. Cooperative maneuver decision making process.

5. Experimental results and analysis

5.1. Platform setting

In the process of air combat, suppose that the effective attack range of a UAV is a sector determined by the firing angle and attack distance of the airborne weapon. When the opponent is within the attack range for a sustained period of time, the air combat is considered a success. Also, suppose that the effective detection range of a UAV is a sector determined by the directional angle of the airborne radar and the detection distance. When the opponent is outside the detection range for a sustained period of time, the UAV is considered to have evaded the opponent. The typical air combat processes of pursuit and evasion are shown in Fig. 7.

Fig. 7. Typical air combat process: (a) Pursuit; (b) Evasion.

The simulation experiments are designed with several sets of different postures of the air combat scenario based on Zhang et al. [38]. Firstly, a two-on-two air combat scenario is designed to verify the effectiveness of the proposed method. On this basis, further simulations are conducted for two-on-three and three-on-two air combat scenarios. In the simulations, the red side adopts the proposed cooperative maneuver decision method, and the blue side adopts the maneuver strategy proposed by Ma et al. [39]. Assume that the UAVs of both sides have the same performance. The parameter settings are designed based on Zhang et al. [38]: the initial flight speed of the UAV is 200 m/s, the maximum detection distance is 6000 m, the maximum directional angle is 45°, the maximum attack distance is 1000 m, the maximum attack angle is 60°, the minimum available tangential overload is -1, the maximum available tangential overload is 2, the minimum available normal overload is 0, the maximum available normal overload is 8, the minimum flight velocity is 90 m/s, and the maximum flight velocity is 400 m/s. The air combat comes to an end when either side completes its tactical intention.

The equipment configuration is an Intel(R) Core(TM) i9-10885H CPU and an NVIDIA GeForce RTX 2070 Super GPU, and the TensorFlow-GPU 2.1.0 environment is built with CUDA 10.1 to construct the MADDPG training framework for the proposed game model. The neural network parameters are shown in Table 1.

    Table 1 Convolutional neural network parameters.

5.2. Simulation results and analysis

5.2.1. Algorithm performance comparison and analysis

A two-on-two combat scenario (scenario 1) is designed to verify the perfect Bayes-Nash equilibrium solution strategy and the combat effectiveness of the proposed model. The specific initial conditions are shown in Table 2, where the red side is within the detection range ahead of the blue side, and the overall air combat situation is at a disadvantage.

    Table 2 Initial state of two-on-two air combat.

(A) PBE solution strategy simulation

The perfect Bayes-Nash equilibrium of the two-on-two combat scenario is first trained by the proposed MADDPG method, and the TD3 method [40] is then applied as a comparative test to verify the PBE solution strategy. The cumulative rewards of the proposed MADDPG method and the TD3 method are shown in Fig. 8, and the exploitability of both is shown in Fig. 9(a) and Fig. 9(b), respectively.

Fig. 8. Cumulative reward of PBE solution strategy: (a) MADDPG; (b) TD3.

Fig. 9. Exploitability of PBE solution strategy: (a) MADDPG; (b) TD3.

The cumulative reward and exploitability of the training results show that the proposed solution strategy converges to an estimate of the perfect Bayes-Nash equilibrium with fewer iterations and oscillations than TD3. This is because the MADDPG method performs well in large-scale spaces and depths through the centralized training of multiple actor and critic networks rather than delaying the update of the actor network, and it thus achieves better results with higher efficiency.

    (B) Deviation simulation case

During the process of air combat, the observation of the opponents' states has a great impact on maneuver decision making. To further verify the combat effectiveness of the proposed incomplete information dynamic game model, noise is used to deviate the input observations of the opponents' state parameters.

Fig. 10 shows the combat trajectory of the two-on-two air combat with accurate observation. In the initial stage of the combat, the two UAVs of the red side separate and use outflanking tactics, where the left-wing UAV takes an S-shaped maneuver to evade and the right-wing UAV staggers to the left to accelerate in a dive; the blue side also adopts a separation strategy to track and pursue. In the later stages of the combat, the left-wing UAV of the red side makes a large-overload kindle maneuver to gain the advantage over the blue side and counter back, and the other one accelerates with a yo-yo maneuver to turn in the direction of its teammate, while the blue side maintains constant tracking but loses its original advantage in the overall air combat situation.

Fig. 10. Combat trajectory of two-on-two air combat with accurate observation: (a) Top view of two-on-two combat; (b) Three-dimensional view of two-on-two combat.

Fig. 11 shows the combat trajectory of the two-on-two air combat with deviations, where the red side retains the outflanking tactic and the blue side keeps constant tracking after separation. Although the UAVs of the red side take maneuvering trajectories different from the accurate case during the combat, they realize the outflanking tactic and form a pincer situation at the back of the opponent in the later stages of the combat. The deviation of the expected opponents' state parameters directly affects the calculation of the overall situation and misleads the optimal response maneuvering against the opponent. Therefore, the mixed strategies of the perfect Bayes-Nash equilibrium change and the maneuvering trajectories differ from the accurate case. However, the comparative simulation shows that the perfect Bayes-Nash equilibrium retains a coherent tactical strategy, which verifies the combat effectiveness of the proposed model.

Fig. 11. Combat trajectory of two-on-two air combat with deviations: (a) Top view of two-on-two combat; (b) Three-dimensional view of two-on-two combat.

5.2.2. Air combat simulation examples

    (A) Two-on-three air combat simulation

To further verify the effectiveness of the proposed method, a two-on-three air combat scenario (scenario 2) is designed based on scenario 1, where the red side is within the detection range of the blue side, and the overall air combat situation is at a disadvantage. The initial conditions are shown in Table 3. The proposed method is solved by MADDPG for 25,000 generations, and the cumulative reward is shown in Fig. 12.

    Table 3 Initial state of two-on-three air combat.

Fig. 12. Cumulative reward for two-on-three air combat.

The resulting combat trajectory is shown in Fig. 13. During the air combat, the UAVs of the red side adopt decoy tactics, where one UAV makes a large-overload detour-and-pull-up maneuver to evade, and the other UAV slows down as a decoy and attempts an S-shaped maneuver to achieve a situation advantage. The three UAVs of the blue side are distracted by the decoy UAV of the red side. The right-wing UAV approaches its teammate and pulls up to slow down in accordance with the overall combat situation, while the other UAVs continuously track and pursue the decoy UAV of the red side. In the later stage of the combat, the red side completes the tactical escape and takes the advantage, while the blue side has difficulty coping with the red side due to its decision to pursue the decoy UAV at the early stage of the air combat. At last, two UAVs of the blue side end up in a disadvantageous situation of being within the attack range of the red side.

Comparing the simulation results of scenario 1 and scenario 2, it can be seen that the red side determines a tactical intention to escape in both, but adopts a different maneuver strategy. In scenario 1, the UAVs of the red side separate first but approach each other at the end of the combat for coordinated support, while in scenario 2 the UAVs of the red side use decoy tactics to leave the battlefield and do not support their teammate. This comparison verifies the effectiveness of the proposed method under different scenarios, and the Nash equilibrium combat strategies effectively show the synergies of multi-UAV air combat.

    (B) Three-on-two air combat simulation

To further study the superiority of the proposed method, a three-on-two air combat scenario (scenario 3) is established based on scenario 2, where the initial states of the red side and the blue side are swapped to compare the air combat results of the switched strategies. The initial conditions are shown in Table 4. The proposed method is solved by MADDPG for 25,000 generations, and the cumulative reward is shown in Fig. 14.

Table 4 Initial state of three-on-two air combat.

Fig. 14. Cumulative reward for three-on-two air combat.

The resulting combat trajectory is shown in Fig. 15. In the initial stage of the combat, the blue side is within the detection range ahead of the red side, where the UAVs of the red side take the overall situation advantage. During the air combat, the two flanking UAVs of the red side first pull up and take the altitude advantage to track the left-wing UAV of the blue side, and the middle UAV dives to accelerate and keeps tracking the right-wing UAV of the blue side. Then the red side completes the formation change to keep the target pursuit and the overall situation advantage. The UAVs of the blue side adopt a half roll and split-S maneuver to reverse the posture and pitch down to accelerate and storm the front. In the end, the red side forms a pincer movement and the blue side loses the combat.

Fig. 15. Combat trajectory of three-on-two air combat: (a) Top view of three-on-two combat; (b) Three-dimensional view of three-on-two combat.

Comparing the simulation results of scenario 2 and scenario 3, it can be seen that the Nash equilibrium strategy of the proposed method shows effective synergies, rediscovering and verifying existing cooperative tactics in multi-UAV air combat, including the decoy tactics and pincer movement shown in Fig. 16.

6. Conclusions

In this paper, an incomplete information dynamic game approach is used to model the process of multi-UAV air combat, and a dynamic Bayesian network is designed to infer the incomplete information of the opponent's preference type of tactical intention. The perfect Bayes-Nash equilibrium strategy is solved by a reinforcement learning framework based on MADDPG to address the cooperative maneuver decision-making problem. The simulations verify the effectiveness of the proposed method, and the results show that the perfect Bayes-Nash equilibrium strategy can generate multiple effective synergies and cooperative tactics in different scenarios. The proposed method is superior in achieving the advantage of the overall combat situation.

    Declaration of competing interest

    The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

    Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61933010 and 61903301) and the Shaanxi Aerospace Flight Vehicle Design Key Laboratory.
