
Discrete-time dynamic graphical games: model-free reinforcement learning solution

Control Theory and Technology, 2015, No. 1


Mohammed I. ABOUHEAF 1†, Frank L. LEWIS 2,3, Magdi S. MAHMOUD 1, Dariusz G. MIKULSKI 4

1. Systems Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia;

2. UTA Research Institute, University of Texas at Arlington, Fort Worth, Texas, U.S.A.;

3. State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang, Liaoning 110819, China;

4. Ground Vehicle Robotics (GVR), U.S. Army TARDEC, Warren, MI, U.S.A.

Received 31 December 2013; revised 2 January 2015; accepted 15 January 2015

This paper introduces a model-free reinforcement learning technique that is used to solve a class of dynamic games known as dynamic graphical games. The graphical game results from multi-agent dynamical systems, where pinning control is used to make all the agents synchronize to the state of a command generator or a leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. The Hamiltonian mechanics are used to derive the necessary conditions for optimality. The solution for the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. The Nash equilibrium solution for the graphical game is given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. An online model-free policy iteration algorithm is developed to learn the Nash solution for the dynamic graphical game. This algorithm does not require any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under a mild assumption about the interconnectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm and solve the graphical game online in real time.

Dynamic graphical games, Nash equilibrium, discrete mechanics, optimal control, model-free reinforcement learning, policy iteration

    DOI 10.1007/s11768-015-3203-x

    1 Introduction

This paper develops an online model-free policy iteration solution [1,2] for a class of discrete-time dynamic graphical games developed in [3]. The information flow between the agents is governed by a communication graph. Continuous-time differential graphical games have been developed in [4]. A novel model-free policy iteration algorithm is developed to learn the Nash solution for the graphical game in real time. This paper brings together cooperative control, optimal control, game theory, and reinforcement learning techniques to find online solutions for the graphical game.

Optimal control theory uses the Hamilton-Jacobi-Bellman (HJB) equation, whose solution is the optimal cost-to-go function [5,6]. Discrete-time canonical forms for the Hamiltonian functions are found in [7–9]. Cooperative control problems involve consensus control problems and synchronization control problems [10–16]. In the cooperative consensus problem, the agents synchronize to uncontrollable node dynamics, while in the synchronization control problem the control protocols are designed such that each agent reaches the same state [17–21].

Game theory provides a solution framework for multi-agent control problems [22]. Non-cooperative dynamic game theory provides an environment for formulating multi-player decision control problems [23]. Each agent finds its optimal control policy by optimizing its performance index independently [23]. Offline solutions for the games are given in terms of the respective coupled Hamilton-Jacobi (HJ) equations [23–25].

Approximate dynamic programming (ADP) is an approach to solving dynamic programming problems [1,2,26–31]. ADP combines adaptive critics and reinforcement learning (RL) techniques with dynamic programming [2]. ADP techniques are developed to solve the optimal control problem online in [2] and offline in [29]. Morimoto et al. used the concept of Q-learning to solve the differential dynamic programming problem [32]. Landelius used action-dependent heuristic dynamic programming (ADHDP) to solve the linear quadratic optimal control problem and showed that the solution is equivalent to iterating the underlying Riccati equations [33]. RL is concerned with learning from interaction in a dynamic environment [34,35]. RL algorithms are used to learn the optimal control solutions for dynamic systems in real time [1,2,34,36,37]. These algorithms involve policy iteration (PI) or value iteration (VI) techniques [29,36,38–42]. A new policy iteration approach employed ADP to find online solutions for the continuous-time Riccati equations in [43]. A policy iteration solution for the adaptive optimal control problem can be obtained by relaxing the HJB equation to an equivalent optimization problem [44]. RL algorithms are used to solve multi-player games for finite-state systems in [40–42] and to learn online in real time the solutions for the optimal control problems of differential games in [36–38,45,46]. Actor-critic neural network structures are used to solve the graphical game using heuristic dynamic programming (HDP) in real time [3].

In this paper, the dynamic graphical game developed herein is a special case of the standard dynamic game [23] and explicitly captures the structure of the communication graph topology. The ADHDP structure for a single agent [2] is extended to the case of dynamic graphical games and is used to solve the game in a distributed fashion, unlike [23]. Usually, offline methods are employed to find Nash solutions for the games in terms of the coupled HJ equations (which are difficult to solve) [23–25]. Herein, an online adaptive learning solution for the graphical game is given in terms of the solution to a set of novel coupled graphical game Hamiltonian functions and Bellman equations. A policy iteration convergence proof for the graphical game is given under a mild condition about the graph interconnectivity. In [42], the Q-learning update rule converges to the optimal response Q-function as long as all the other agents converge in behavior. The developed online adaptive learning solution allows model-free tuning of the critic networks, while partial knowledge about the game was required in [3], [4], and [37].

The paper is organized as follows. Section 2 reviews the synchronization control problem for multi-agent systems on graphs. Section 3 formulates the dynamic graphical game in terms of the coupled Bellman equations and the respective Hamiltonian functions. This section finds the solution for the graphical game in terms of the solution to a set of coupled HJB equations. Section 4 shows that the Nash solution for the graphical game is given in terms of the underlying coupled HJB equations. Section 5 develops an online adaptive model-free policy iteration algorithm to solve the dynamic graphical game in real time, along with its convergence proof. Section 6 implements the online policy iteration algorithm using critic network structures.

    2 Synchronization of multi-agent systems on graphs

    This section reviews the synchronization control problem on communication graphs.

    2.1 Graphs

    2.2 Synchronization and tracking error dynamics

The dynamics of each agent i are given by

where x_i(k) ∈ R^n is the state vector of agent i, and u_i(k) ∈ R^{m_i} is the control input vector for agent i.

A leader agent v_0 with state x_0(k) ∈ R^n has the dynamics [47] given by

To study synchronization on graphs, define the local neighborhood tracking error [48] ε_i(k) ∈ R^n for each agent i as

where g_i is the pinning gain of agent i, which is nonzero (g_i > 0) if agent i is coupled to the leader agent x_0 [18].
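The equations referenced in this subsection follow the standard linear graphical-games setup. A minimal sketch, assuming linear time-invariant agents sharing the state matrix A and adopting one particular sign convention for the tracking error (the opposite convention only flips signs in later expressions), is

\[
x_i(k+1) = A\,x_i(k) + B_i\,u_i(k), \qquad
x_0(k+1) = A\,x_0(k), \qquad
\varepsilon_i(k) = \sum_{j\in N_i} e_{ij}\bigl(x_i(k)-x_j(k)\bigr) + g_i\bigl(x_i(k)-x_0(k)\bigr),
\]

where e_{ij} > 0 is the weight of the edge carrying information from agent j to agent i and N_i is the set of in-neighbors of agent i.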

The overall tracking error vector ε = [ε_1^T ε_2^T ··· ε_N^T]^T is given by

where G = diag{g_i} ∈ R^{N×N} is a diagonal matrix of the pinning gains. The global synchronization error vector η [13] is given by

If the graph is strongly connected and the pinning gain g_i > 0 for at least one agent i, then the graph matrix (L+G) is nonsingular [48]. The synchronization error is bounded such that

Our objective is to minimize the local neighborhood tracking errors ε_i(k), which, in view of (6), will guarantee approximate synchronization.

The local neighborhood tracking error dynamics for agent i are given by

where u_{-i} = {u_j | j ∈ N_i} are the control actions of the neighbors of agent i.
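With the same assumed convention, d_i = \sum_{j\in N_i} e_{ij} the weighted in-degree of agent i, and \bar{x}_0 = \mathbf{1}_N \otimes x_0, a sketch of the global error quantities, the synchronization bound, and the local error dynamics is

\[
\varepsilon = \bigl((L+G)\otimes I_n\bigr)\,\eta, \qquad
\eta = x - \bar{x}_0, \qquad
\|\eta\| \le \frac{\|\varepsilon\|}{\underline{\sigma}(L+G)},
\]
\[
\varepsilon_i(k+1) = A\,\varepsilon_i(k) + (d_i+g_i)\,B_i\,u_i(k) - \sum_{j\in N_i} e_{ij}\,B_j\,u_j(k),
\]

where L is the graph Laplacian and \underline{\sigma}(L+G) is the minimum singular value of (L+G); this bound is what makes driving the ε_i(k) to zero sufficient for approximate synchronization.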

Define the group of control actions of the neighbors of each agent i and the control actions of the neighbors of the neighbors of agent i as u_{-i,-{-i}} = {u_j | j ∈ N_i, N_{-i}}, and the actions of all the agents in the graph excluding agent i as ū_i = {u_j | j ∈ N, j ≠ i}.

3 Multi-player cooperative games on graphs

In this section, solutions for the dynamic games on graphs are developed. These dynamic interacting games are based on the error systems (7), which are locally coupled in the sense that they are driven by the agent's control actions and those of its neighbors. The solution for the dynamic graphical game is given in terms of the solution to a set of novel coupled Hamiltonian functions and Bellman equations developed herein. The ADHDP structure for a single agent [2] is extended to solve the dynamic graphical game without knowing any of the agents' dynamics.

3.1 Graphical games: performance evaluation

Graphical games are based on the interactions of each agent i with the other players in the graph. The local neighborhood dynamics (7) arise from the nature of the synchronization problem for dynamic systems on communication graphs. Therefore, in order to define a dynamic graphical game, the local performance index for each agent i is written as

and the utility function U_i for each agent i is given by

where Q_ii ≥ 0 ∈ R^{n×n}, R_ii > 0 ∈ R^{m_i×m_i}, and R_ij > 0 ∈ R^{m_j×m_j} are symmetric time-invariant weighting matrices.

For the multi-player graphical game, it is desired to determine the optimal non-cooperative solutions such that the following distributed coupled optimizations are solved simultaneously,

Given fixed policies (μ_i^l, μ_{-i}^l) of agent i and its neighbors, the value function for each agent i is given by
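For concreteness, a minimal sketch of the performance index, the utility, and the value function, assuming the customary quadratic form (the 1/2 factor is a convention), is

\[
J_i = \sum_{k=0}^{\infty} U_i\bigl(\varepsilon_i(k), u_i(k), u_{-i}(k)\bigr), \qquad
U_i = \tfrac{1}{2}\Bigl(\varepsilon_i^{\mathrm T} Q_{ii}\,\varepsilon_i + u_i^{\mathrm T} R_{ii}\,u_i + \sum_{j\in N_i} u_j^{\mathrm T} R_{ij}\,u_j\Bigr),
\]
\[
V_i\bigl(\varepsilon_i(k)\bigr) = \sum_{l=k}^{\infty} U_i\bigl(\varepsilon_i(l), u_i(l), u_{-i}(l)\bigr),
\]

so that each agent's value depends only on its own error and on the control actions of itself and its neighbors.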

Remark 1 The performance index (8) measures the performance of each agent i. The value function (11) for each agent i captures local information. Thus, the solution structure of the value function will be given in terms of the local vector ε̄_i(k). This value structure will be used in the mathematical setup for the graphical game.

Definition 2 The dynamic graphical game with local dynamics (7) and performance indices (8) is well-formed if R_ij ≠ 0 for all e_ij ∈ E.

    3.2 Hamiltonian function for graphical games

Given the dynamics (7) and the performance indices (8), define the Hamiltonian function [6] of each agent i as

    3.3 Bellman equation for graphical games

Herein, Bellman optimality equations are developed to solve the graphical games. The value function (11), given stationary admissible policies, yields the dynamic graphical game Bellman equations such that

with initial conditions V_i^π(0) = 0.
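A minimal sketch of such a Bellman equation, written for fixed admissible policies (μ_i, μ_{-i}) and the value function introduced above, is

\[
V_i^{\pi}\bigl(\varepsilon_i(k)\bigr) = U_i\bigl(\varepsilon_i(k), \mu_i(k), \mu_{-i}(k)\bigr) + V_i^{\pi}\bigl(\varepsilon_i(k+1)\bigr), \qquad V_i^{\pi}(0) = 0,
\]

which is the one-step recursion that the policy evaluation steps later in the paper solve from data.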

    and its gradient as

Given stationary admissible policies for the neighbors of agent i, applying the Bellman optimality principle yields

Consequently, the optimal control policy for each agent i is given by

Substituting (23) into (21) yields the graphical game Bellman optimality equations
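A sketch of this step under the assumptions above (quadratic utility and the error dynamics sketched in Section 2.2) is

\[
V_i^{*}\bigl(\varepsilon_i(k)\bigr) = \min_{u_i(k)}\Bigl[\,U_i\bigl(\varepsilon_i(k), u_i(k), u_{-i}(k)\bigr) + V_i^{*}\bigl(\varepsilon_i(k+1)\bigr)\Bigr],
\]
with the stationarity condition \(\partial U_i/\partial u_i(k) + \bigl(\partial \varepsilon_i(k+1)/\partial u_i(k)\bigr)^{\mathrm T}\nabla V_i^{*}\bigl(\varepsilon_i(k+1)\bigr) = 0\) giving
\[
u_i(k) = -(d_i+g_i)\,R_{ii}^{-1} B_i^{\mathrm T}\,\nabla V_i^{*}\bigl(\varepsilon_i(k+1)\bigr),
\]

where the sign of the policy follows from the assumed sign convention of ε_i; note that this form requires knowledge of B_i, which is exactly what the Q-function formulation below removes.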

    3.4 Q-function based Bellman equations

Herein, the ADHDP structure for a single agent is extended to formulate and solve the dynamic graphical game without knowing any of the agents' dynamics, where only local measurements are used. The solution for the graphical game is given in terms of the solution to a set of coupled DTHJB equations.

The Q-function for each agent i is defined as follows:

Therefore, the best response Bellman equation is given by
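A minimal sketch of the Q-function and of the best response Bellman equation it satisfies along fixed policies (μ_i, μ_{-i}), assuming the standard ADHDP construction, is

\[
Q_i\bigl(\varepsilon_i(k), u_i(k), u_{-i}(k)\bigr) = U_i\bigl(\varepsilon_i(k), u_i(k), u_{-i}(k)\bigr) + V_i\bigl(\varepsilon_i(k+1)\bigr),
\]
\[
Q_i\bigl(\varepsilon_i(k), u_i(k), u_{-i}(k)\bigr) = U_i\bigl(\varepsilon_i(k), u_i(k), u_{-i}(k)\bigr) + Q_i\bigl(\varepsilon_i(k+1), \mu_i(k+1), \mu_{-i}(k+1)\bigr),
\]

so that the greedy policy u_i(k) = arg min_{u_i} Q_i(ε_i(k), u_i, u_{-i}(k)) can be computed from the learned Q-function alone, without the dynamics.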

    and its gradient as

The optimal control policy for each agent i is given such that

Therefore, the optimal control policy is given by

Rearranging this equation yields

which is the same as (23). Then,

Thus, the best response Bellman optimality equation based on the Q-function (25) and optimal policies (34) is given by

which is equivalent to (24).

    3.5 Coupled Hamilton-Jacobi-Bellman equations

The Hamilton-Jacobi theory [9] is used to relate the Hamiltonian functions (13) and the Bellman equations (27).

This equation provides the motivation for defining the costate variable λ_i(k+1) in terms of the Q-function such that

The optimal control policy based on the Bellman optimality equation (27) is given by (34). The next result relates the Hamiltonian (13) along the optimal trajectories and the Bellman optimality equation (36).

Theorem 1 (Discrete-time coupled HJB equation)

Introducing the Hamiltonian (42) into this equation yields

Equations (46) and (16) yield

    The reachability matrix

    4 Nash solution for the dynamic graphical game

The objective of the dynamic graphical game is to solve the non-cooperative minimization problems (20), which lead to the Bellman optimality equations (36). In this section, it is shown that the solution of the coupled Bellman optimality equations (36) is a Nash equilibrium solution for the dynamic graphical game.

4.1 Nash equilibrium for the graphical games

The Nash equilibrium solution for the game is defined as follows [23]:

    4.2 Stability and Nash solution for the graphical games

The stable Nash solution for the dynamic graphical game is shown to be equivalent to the solution of the underlying coupled Bellman optimality equations (24) or (36).

a) all agents synchronize to the leader's dynamics (2);

b) Using Theorem 1 and the DTHJB equation (39), the Hamiltonian (41) for arbitrary control policies is given by

The performance index at time index l is given by

Rearranging (53) yields

    The Hamiltonian for arbitrary control inputs is given by

    The Hamiltonian for optimal control inputs is given by

Using (55) and (56) in (54) yields

c) Given that the summation of the performance index (54) is always positive for arbitrary control policies such that

Then, according to (59), the argument of the performance index (57) is positive for arbitrary control policies. Then, (59) and (54) yield

Therefore, a Nash equilibrium exists according to Definition 3.

    5 Model-free reinforcement learning solution

In this section, an online model-free policy iteration algorithm is developed to solve the discrete-time dynamic graphical games in real time. This is a cooperative version of the single-agent ADP methods introduced in [1,2]. Specifically, the single-agent ADHDP algorithm is extended to solve the multi-player dynamic graphical game.

The Q-function is expressed in terms of the agent's control input, the state of each agent i, and the states of its neighbors such that

The following algorithm solves the Bellman optimality equations (36) for the optimal game values and policies.
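A minimal sketch of the two steps that such a model-free policy iteration alternates, consistent with the remarks that follow (l is the iteration index, and the evaluation step is solved from measured data only), is

\[
\text{Policy evaluation: } Q_i^{l+1}\bigl(\varepsilon_i(k), u_i^{l}(k), u_{-i}^{l}(k)\bigr) = U_i\bigl(\varepsilon_i(k), u_i^{l}(k), u_{-i}^{l}(k)\bigr) + Q_i^{l+1}\bigl(\varepsilon_i(k+1), u_i^{l}(k+1), u_{-i}^{l}(k+1)\bigr),
\]
\[
\text{Policy improvement: } u_i^{l+1}(k) = \arg\min_{u_i} Q_i^{l+1}\bigl(\varepsilon_i(k), u_i, u_{-i}^{l}(k)\bigr),
\]

repeated for every agent i until the Q-functions and policies converge.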

Remark 3 This algorithm does not require knowledge of any of the agents' dynamics in the systems (7).

Remark 4 The policy improvement step (65) in Algorithm 1 does not require the graph to be undirected, as was previously imposed by the optimal policy structure (34), where out-neighbor information is required. Thus, step (65) enables Algorithm 1 to solve dynamic graphical games with directed or undirected graph topologies.

Remark 5 To solve the Bellman equations (64), numerous instances of (64) must be obtained at successive time instants. For these equations to be independent, a persistence of excitation condition is needed, as per standard usage. This is further discussed in Remark 6 immediately before Section 6.1.

    The following theorem provides the convergence proof for Algorithm 1 when all agents update their policies simultaneously.

Proof a) Equations (27) or (64) yield

Using the norm properties on this inequality yields

Under assumption (72), inequalities (68) and (70) yield

b) Equation (66) yields

Equation (26) or (64) yields

Equations (74), (75), and the assumption (72) yield

Applying the summation on (76) such that

This reduces to

Therefore, by induction, (77) yields

    This result shows that Algorithm 1 converges when the performance indices are suitably chosen.

    6 Critic network solutions for the graphical games

It is not clear how to best implement Algorithm 1. In the single-agent case the implementation details do not matter too much, but a proper implementation of Algorithm 1 is needed for multi-agent graphical games where numerous agents are learning. There are different methods that can be used to implement the policy iteration Algorithm 1; these involve least squares, batch least squares, actor-critic neural networks, etc. Herein, we present a novel method for implementing Algorithm 1 for multi-agent learning on graphs, wherein all agents learn simultaneously and computations are reduced. Algorithm 2 will be formulated such that it uses only critic networks. This is the easiest way to implement Algorithm 1, without the difficulties that are associated with the other methods. Algorithm 2 will use gradient descent to tune the weights of the critic at each iteration.

This section develops an online model-free critic network structure based on the Q-function value approximation (61) that is used to solve the dynamic graphical games in real time. This is motivated by the graph games Algorithm 1. Each agent i has its own critic to perform the value evaluation using local information. The policy of each agent i is improved using (65).

The policy iteration Algorithm 1 requires the approximation of the Q-function (80) to be written as

Using (82), the Bellman equation (64) is written such that
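A minimal sketch of this critic parameterization, assuming a quadratic basis over the locally measured quantities (the exact stacking of the regression vector is an assumption), is

\[
Q_i \approx W_i^{\mathrm T}\,\sigma_i\bigl(z_i(k)\bigr), \qquad
z_i(k) = \bigl[\varepsilon_i(k)^{\mathrm T}\;\; u_i(k)^{\mathrm T}\;\; u_{-i}(k)^{\mathrm T}\bigr]^{\mathrm T}, \qquad
\sigma_i(z) = z \otimes z,
\]

so that the Bellman equation takes the linear-in-the-weights residual form

\[
W_i^{\mathrm T}\bigl(\sigma_i(z_i(k)) - \sigma_i(z_i(k+1))\bigr) = U_i\bigl(\varepsilon_i(k), u_i(k), u_{-i}(k)\bigr),
\]

which each critic can solve from measured data without any model information.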

The critic network structure for each agent i performs the evaluation (64). The policy improvement (65) depends on the evaluated value function (64).

The neural network approximation error is given by

The sum of squares of the approximation error for each neural network i can be written as

The change in the critic neural network weights is given by gradient descent on this function, whose gradient is

Therefore, the update rules for the network weights are given by
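A sketch of such a gradient-descent update, written for the quadratic critic above with Bellman residual e_i(k) and learning rate μ_ic (whether the next-step term σ_i(z_i(k+1)) is differentiated or held fixed as a semi-gradient is an implementation choice), is

\[
e_i(k) = W_i^{\mathrm T}\sigma_i\bigl(z_i(k)\bigr) - U_i(k) - W_i^{\mathrm T}\sigma_i\bigl(z_i(k+1)\bigr), \qquad
E_i = \tfrac{1}{2}\,e_i(k)^2,
\]
\[
W_i \leftarrow W_i - \mu_{ic}\,\bigl(\sigma_i(z_i(k)) - \sigma_i(z_i(k+1))\bigr)\,e_i(k),
\]

applied by each agent using only its locally measured data.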

Remark 6 To solve for the weights in the approximated Bellman equation (83), numerous instances of (83) must be obtained at successive time instants. For these equations to be independent, a persistence of excitation (PE) condition is needed. Our approach uses the gradient descent algorithm (87) to solve these equations, and hence a PE condition is also needed. This can be achieved by adding probing noise to the control, which is decayed to zero with time as the solution to (83) is learned.

    6.1 Network weights online tuning in real-time

    The following algorithm is used to tune the critic network weights in real-time using data measured along the system trajectories.

Algorithm 2 (Network weights online tuning)

2) Do loop (l iterations).

Do loop (s iterations).

2.2) Critic network weights update rule

Remark 7 Algorithm 2 uses a gradient descent technique to tune the critic network weights at each iteration. Theorem 3 proved the convergence of Algorithm 1 at each step. Assuming that the gradient descent algorithms converge exactly at each iteration, Algorithm 2 at each step solves first the Bellman equation (64) and then performs the action update (65). Unfortunately, gradient descent cannot always be guaranteed to converge to the exact solutions in the approximation structures. However, simulations have shown the effectiveness of this algorithm.

    6.2 Graphical game example and simulation results

In this section, the graphical game problem is solved online in real time using Algorithm 2. Simulations are performed to verify the proper performance of Algorithm 2.

    Consider the directed graph with four agents shown in Fig.1.

    The data of the graph example are given as follows:

Agents' dynamics:

Pinning gains: g_1 = 0, g_2 = 0, g_3 = 0, g_4 = 1.

Graph connectivity matrix: e_12 = 0.8, e_14 = 0.7, e_23 = 0.6, e_31 = 0.8.

Performance index weighting matrices: Q_11 = Q_22 = Q_33 = Q_44 = I_{2×2}, R_11 = R_22 = R_33 = R_44 = 1, R_13 = R_21 = R_32 = R_41 = 0, R_12 = R_14 = R_23 = R_31 = 1.

The learning rates are μ_ic = 0.0001, ∀i.
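As a sanity check of this setup, a short script (hypothetical, not part of the original simulation code) builds the graph matrices from the data above and verifies the nonsingularity of (L+G) required by the bound (6); the edge convention (e_ij meaning agent i receives information from agent j) and all variable names are assumptions.

import numpy as np

# Adjacency weights e_ij: agent i receives information from agent j (assumed convention).
E = np.zeros((4, 4))
E[0, 1] = 0.8   # e12
E[0, 3] = 0.7   # e14
E[1, 2] = 0.6   # e23
E[2, 0] = 0.8   # e31

D = np.diag(E.sum(axis=1))           # weighted in-degree matrix
L = D - E                            # graph Laplacian
G = np.diag([0.0, 0.0, 0.0, 1.0])    # pinning gains g1, ..., g4 (only agent 4 is pinned)

# (L + G) must be nonsingular for the synchronization error bound to hold.
print("det(L + G)     =", np.linalg.det(L + G))                          # nonzero for this graph
print("sigma_min(L+G) =", np.linalg.svd(L + G, compute_uv=False).min())

# Performance index weights from the example.
Q = {i: np.eye(2) for i in range(1, 5)}                                  # Q11 = ... = Q44 = I2
R_self = {i: 1.0 for i in range(1, 5)}                                   # R11 = ... = R44 = 1
R_neigh = {(1, 2): 1.0, (1, 4): 1.0, (2, 3): 1.0, (3, 1): 1.0}           # nonzero only on graph edges
mu_c = 1e-4                                                              # critic learning rate for all agents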

    Fig.1 Graphical game example.

Fig. 2 shows that the local neighborhood tracking error dynamics go to zero. Fig. 3 shows that the dynamics of the agents synchronize to the leader while preserving the optimal behavior. Fig. 4 shows a 3D phase-plane plot of the agents' dynamics. This figure shows that the agents synchronize to the leader agent's dynamics. These figures show that Algorithm 2 yields stability and synchronization to the leader's state. As noted in Remark 7, the gradient descent technique is assumed to converge at each step; thus, a slow learning rate is chosen. The policy iteration Algorithm 2 guarantees the convergence of the agents to the leader's dynamics. For this graphical example, l = 60 to l = 70 outer-loop iterations in Algorithm 2 are enough to maintain the synchronization.

    Fig.2 Tracking error dynamics.

    Fig.3 Agents’dynamics.

    Fig.4 Phase plot of the agents.

    7 Conclusions

This paper studies a class of discrete-time dynamical games known as dynamic graphical games. Novel coupled Bellman equations and Hamiltonian functions are developed to solve the graphical game. Optimal control solutions for the dynamic graphical game are given in terms of the solutions to a set of coupled DTHJB equations. The stability and Nash solutions for the dynamic graphical game are proved. An online model-free policy iteration algorithm is developed to solve the dynamic graphical game in real time. The developed algorithm does not require knowledge of any of the agents' dynamics. A policy iteration convergence proof for the dynamic graphical game is given. A gradient descent technique with critic network structures is used to implement the online policy iteration algorithm and solve the dynamic graphical game.

    [1]P.J.Werbos.Neural networks for control and system identification.Proceedings of the 28th IEEE Conference on Decision and Control,New York:IEEE,1989:260–265.

    [2]P.J.Werbos.Approximate Dynamic Programming for Real-time Control and Neural Modeling.Handbook of Intelligent Control.D.A.White,D.A.Sofge(ed.).New York:Van Nostrand Reinhold,1992.

    [3]M.I.Abouheaf,F.L.Lewis,S.Haesaert,et al.Multi-agent discrete-time graphical games:interactive Nash equilibrium and value iteration solution.Proceedings of the American Control Conference,New York:IEEE,2013:4189–4195.

    [4]K.G.Vamvoudakis,F.L.Lewis,G.R.Hudas.Multi-agent differential graphical games:online adaptive learning solution for synchronization with optimality.Automatica,2012,48(8):1598–1611.

[5]A.E.Bryson.Optimal control - 1950 to 1985.IEEE Control Systems,1996,16(3):26–33.

[6]F.L.Lewis,D.Vrabie,V.L.Syrmos.Optimal Control.3rd ed.New York:John Wiley & Sons,2012.

    [7]J.E.Marsden,M.West.Discrete mechanics and variational integrators.Acta Numerica,2001,10(5):357–514.

    [8] Y.B.Suris.Discrete Lagrangian models.International School on Discrete Integrable Systems,Berlin:Springer,2004:111–184.

    [9]S.Lall,M.West.Discrete variational Hamiltonian mechanics.Journal of Physics A:Mathematical and General,2006,39(19):5509–5519.

    [10]S.Mu,T.Chu,L.Wang.Coordinated collective motion in a motile particle group with a leader.Physica A,2005,351(2/4):211–226.

    [11]R.W.Beard,V.Stepanyan.Synchronization of information in distributed multiple vehicle coordinated control.Proceedings of the IEEE Conference on Decision and Control,Maui:IEEE,2003:2029–2034.

    [12]A.Jadbabaie,J.Lin,A.Morse.Coordination of groups of mobile autonomous agents using nearest neighbor rules.IEEE Transactions on Automatic Control,2003,48(6):988–1001.

    [13]R.Olfati-Saber,R.Murray.Consensus problems in networks of agents with switching topology and time-delays.IEEE Transactions on Automatic Control,2004,49(9):1520–1533.

    [14]Z.Qu.Cooperative Control of Dynamical Systems:Applications to Autonomous Vehicles.New York:Springer,2009.

    [15]W.Ren,R.Beard,E.Atkins.A survey of consensus problems in multi-agent coordination.Proceedings of the American Control Conference,New York:IEEE,2005:1859–1864.

    [16]J.Tsitsiklis.Problems in Decentralized Decision Making and Computation.Ph.D.dissertation.Cambridge,MA:Department of Electrical Engineering and Computer Science,Massachusetts Institute of Technology,1984.

    [17]Z.Li,Z.Duan,G.Chen,et al.Consensus of multi-agent systems and synchronization of complex networks:A unified viewpoint.IEEE Transactions on Circuits and Systems,2010,57(1):213–224.

    [18]X.Li,X.Wang,G.Chen.Pinning a complex dynamical network to its equilibrium.IEEE Transactions on Circuits and Systems,2004,51(10):2074–2087.

    [19]W.Ren,K.Moore,Y.Chen.High-order and model reference consensus algorithms in cooperative control of multivehicle systems.Journal of Dynamic Systems,Measurement and Control,2007,129(5):678–688.

    [20]J.Kuang,J.Zhu.On consensus protocols for high-order multiagent systems.Journal of Control Theory and Applications,2010,8(4):406–412.

[21]S.Zhang,G.Duan.Consensus seeking in multi-agent cooperative control systems with bounded control input.Journal of Control Theory and Applications,2011,9(2):210–214.

    [22]R.Gopalakrishnan,J.R.Marden,A.Wierman.An architectural view of game theoretic control.Performance Evaluation Review,2011,38(3):31–36.

[23]T.Başar,G.J.Olsder.Dynamic Non-cooperative Game Theory.Classics in Applied Mathematics.2nd ed.Philadelphia:SIAM,1999.

[24]G.Freiling,G.Jank,H.Abou-Kandil.On global existence of solutions to coupled matrix Riccati equations in closed-loop Nash games.IEEE Transactions on Automatic Control,2002,41(2):264–269.

    [25]Z.Gajic,T.-Y.Li.Simulation results for two new algorithms for solving coupled algebraic Riccati equations.Proceedings of the 3rd International Symposium on Differential Games.Sophia Antipolis,France,1988.

    [26]A.G.Barto,R.S.Sutton,C.W.Anderson.Neuron like adaptive elements that can solve difficult learning control problems.IEEE Transactions on Systems Man and Cybernetics,1983,13(5):834–846.

    [27]R.Howard.Dynamic Programming and Markov Processes.Cambridge,MA:MIT Press,1960.

[28]R.Bellman.Dynamic Programming.Princeton:Princeton University Press,1957.

[29]D.P.Bertsekas,J.N.Tsitsiklis.Neuro-dynamic Programming.Belmont,MA:Athena Scientific,1996.

[30]P.J.Werbos.Intelligence in the brain:a theory of how it works and how to build it.Conference on Goal-Directed Neural Systems,Oxford:Pergamon-Elsevier Science Ltd.,2009:200–212.

[31]D.Vrabie,F.L.Lewis.Adaptive dynamic programming for online solution of a zero-sum differential game.Journal of Control Theory and Applications,2011,9(3):353–360.

    [32]J.Morimoto,G.Zeglin,C.Atkeson.Minimax differential dynamic programming:application to a biped walking robot.IEEE/RSJ International Conference on Intelligent Robots and Systems,New York:IEEE,2003:1927–1932.

    [33]T.Landelius.Reinforcement Learning and Distributed Local Model Synthesis.Ph.D.dissertation.Sweden:Linkoping University,1997.

    [34]R.S.Sutton,A.G.Barto.Reinforcement Learning-An Introduction.Cambridge,MA:MIT Press,1998.

    [35]S.Sen,G.Weiss.Learning In Multiagent Systems,in Multiagent Systems:A Modern Approach to Distributed Artificial Intelligence.Cambridge,MA:MIT Press,1999:259–298.

    [36]K.G.Vamvoudakis,F.L.Lewis.Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem.Automatica,2010,46(5):878–888.

    [37]K.G.Vamvoudakis,F.L.Lewis.Multi-player non-zero sum games:online adaptive learning solution of coupled Hamilton-Jacobi equations.Automatica,2011,47(8):1556–1569.

    [38]D.Vrabie,O.Pastravanu,F.L.Lewis,et al.Adaptive optimal control for continuous-time linear systems based on policy iteration.Automatica,2009,45(2):477–484.

    [39]D.P.Bertsekas.Approximate policy iteration:A survey and some new methods.Journal of Control Theory and Applications,2011,9(3):310–335.

    [40]L.Busoniu,R.Babuska,B.De-Schutter.A comprehensive survey of multiagent reinforcement learning.IEEE Transactions on Systems,Man and Cybernetics,2008,38(2):156–172.

[41]P.Vrancx,K.Verbeeck,A.Nowe.Decentralized learning in Markov games.IEEE Transactions on Systems,Man and Cybernetics,2008,38(4):976–981.

    [42]M.L.Littman.Value-function reinforcement learning in Markov games.Cognitive Systems Research,2001,2(1):55–66.

    [43]Y.Jiang,Z.Jiang.Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics.Automatica,2012,48(10):2699–2704.

    [44]Y.Jiang,Z.Jiang.Global adaptive dynamic programming for continuous-time nonlinear systems.2013:http://arxiv.org/abs/1401.0020.

    [45]T.Dierks,S.Jagannathan.Optimal control of affine nonlinear continuous-time systems using an online Hamilton-Jacobi-Isaacs formulation.Proceedings of the 49th IEEE Conference on Decision and Control,New York:IEEE,2010:3048–3053.

    [46]M.Johnson,T.Hiramatsu,N.Fitz-Coy,et al.Asymptotic Stackelberg optimal control design for an uncertain Euler Lagrange system.Proceedings of the 49th IEEE Conference on Decision and Control,New York:IEEE,2010:6686–6691.

    [47]F.L.Lewis.Applied Optimal Control and Estimation:Digital Design and Implementation.Englewood Cliffs:Prentice Hall,1992.

    [48]S.Khoo,L.Xie,Z.Man.Robust finite-time consensus tracking algorithm for multirobot systems.IEEE/ASME Transactions on Mechatronics,2009,14(2):219–228.

Mohammed I. ABOUHEAF was born in Smanoud, Egypt. He received his B.Sc. and M.Sc. degrees in Electronics and Communication Engineering, Mansoura College of Engineering, Mansoura, Egypt, in 2000 and 2006, respectively. He worked as an assistant lecturer with the Air Defense College, Alexandria, Egypt (2001-2002). He worked as a planning engineer for the Maintenance Department, Suez Oil Company (SUCO), South Sinai, Egypt (2002-2004). He worked as an assistant lecturer with the Electrical Engineering Department, Aswan College of Energy Engineering, Aswan, Egypt (2004-2008). He received his Ph.D. degree in Electrical Engineering, University of Texas at Arlington (UTA), Arlington, Texas, U.S.A. in 2012. He worked as a postdoctoral fellow with the University of Texas at Arlington Research Institute (UTARI), Fort Worth, Texas, U.S.A. (2012-2013). He worked as Adjunct Faculty with the Electrical Engineering Department, University of Texas at Arlington (UTA), Arlington, Texas, U.S.A. (2012-2013). He was a member of the Advanced Controls and Sensor Group (ACS) and the Energy Systems Research Center (ESRC), University of Texas at Arlington, Arlington, Texas, U.S.A. (2008-2012). Currently, he is Assistant Professor with the Systems Engineering Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, Saudi Arabia. His research interests include optimal control, adaptive control, reinforcement learning, fuzzy systems, game theory, microgrids, and economic dispatch. E-mail: abouheaf@kfupm.edu.sa.

Frank L. LEWIS is Member, National Academy of Inventors, Fellow IEEE, Fellow IFAC, Fellow, U.K. Institute of Measurement & Control, PE Texas, U.K. Chartered Engineer, UTA Distinguished Scholar Professor, UTA Distinguished Teaching Professor, and Moncrief-O'Donnell Chair at the University of Texas at Arlington Research Institute. He is Qian Ren Thousand Talents Consulting Professor, Northeastern University, Shenyang, China. He obtained the B.S. degree in Physics/EE and the MSEE at Rice University, the M.S. in Aeronautical Engineering from University of West Florida, and the Ph.D. degree at Georgia Institute of Technology. He works in feedback control, intelligent systems, cooperative control systems, and nonlinear systems. He is Author of 6 U.S. patents, numerous journal special issues, journal papers, and 20 books, including Optimal Control, Aircraft Control, Optimal Estimation, and Robot Manipulator Control, which are used as university textbooks worldwide. He received the Fulbright Research Award, NSF Research Initiation Grant, ASEE Terman Award, Int. Neural Network Soc. Gabor Award, U.K. Inst. Measurement & Control Honeywell Field Engineering Medal, and IEEE Computational Intelligence Society Neural Networks Pioneer Award. He received the Outstanding Service Award from Dallas IEEE Section, and was selected as Engineer of the Year by Ft. Worth IEEE Section. He was listed in Ft. Worth Business Press Top 200 Leaders in Manufacturing. He received the Texas Regents Outstanding Teaching Award 2013. He is Distinguished Visiting Professor at Nanjing University of Science & Technology and Project 111 Professor at Northeastern University in Shenyang, China. He is Founding Member of the Board of Governors of the Mediterranean Control Association. E-mail: lewis@uta.edu.

Magdi S. MAHMOUD obtained the B.Sc. degree (Honors) in Communication Engineering, M.Sc. degree in Electronic Engineering, and Ph.D. degree in Systems Engineering, all from Cairo University in 1968, 1972 and 1974, respectively. He has been a professor of Engineering since 1984. He is now a distinguished professor at KFUPM, Saudi Arabia. He was on the faculty at different universities worldwide including Egypt (CU, AUC), Kuwait (KU), U.A.E. (UAEU), U.K. (UMIST), U.S.A. (Pitt, Case Western), Singapore (Nanyang) and Australia (Adelaide). He lectured in Venezuela (Caracas), Germany (Hanover), U.K. (Kent), U.S.A. (UoSA), Canada (Montreal) and China (BIT, Yanshan). He is the principal author of thirty-four (34) books, inclusive book-chapters, and the author/co-author of more than 510 peer-reviewed papers. He is the recipient of two national, one regional and four university prizes for outstanding research in engineering and applied mathematics. He is a fellow of the IEE, a senior member of the IEEE, the CEI (U.K.), and a registered consultant engineer of information engineering and systems (Egypt). He is currently actively engaged in teaching and research in the development of modern methodologies to distributed control and filtering, networked control systems, triggering mechanisms in dynamical systems, fault-tolerant systems and information technology. E-mail: msmahmoud@kfupm.edu.sa, magdisadekmahmoud@gmail.com, magdim@yahoo.com.

Dariusz MIKULSKI is a research computer scientist in Ground Vehicle Robotics at the U.S. Army Tank-Automotive Research Development and Engineering Center in Warren, MI. He currently works on research to improve cooperative teaming and cyber security in military unmanned convoy operations. Dr. Mikulski earned his Ph.D. degree in Electrical and Computer Engineering at Oakland University in Rochester Hills, Michigan in 2013. He also earned his B.Sc. in Computer Science from the University of Michigan in Ann Arbor and Masters in Computer Science and Engineering from Oakland University. E-mail: dariusz.g.mikulski.civ@mail.mil.

†Corresponding author.

E-mail: abouheaf@kfupm.edu.sa. Tel.: +966 (13) 860 2968; fax: +966 (13) 860 2965. King Fahd University of Petroleum & Minerals, P.O. Box 1956, Dhahran 31261, Saudi Arabia.

This work was supported by the Deanship of Scientific Research at King Fahd University of Petroleum & Minerals Project (No. JF141002), the National Science Foundation (No. ECCS-1405173), the Office of Naval Research (Nos. N000141310562, N000141410718), the U.S. Army Research Office (No. W911NF-11-D-0001), the National Natural Science Foundation of China (No. 61120106011), and the Project 111 from the Ministry of Education of China (No. B08015).

© 2015 South China University of Technology, Academy of Mathematics and Systems Science, CAS, and Springer-Verlag Berlin Heidelberg
