
    An Optimal Control-Based Distributed Reinforcement Learning Framework for A Class of Non-Convex Objective Functionals of the Multi-Agent Network

2023-10-21 03:09:54  Zhe Chen and Ning Li
IEEE/CAA Journal of Automatica Sinica, 2023, No. 11

Zhe Chen and Ning Li

Abstract—This paper studies a novel distributed optimization problem that aims to minimize the sum of the non-convex objective functionals of a multi-agent network under privacy protection, which means that the local objective of each agent is unknown to the others. The problem is complex in both the time and the space aspects. Existing works on distributed optimization mainly consider privacy protection in the space aspect, where the decision variable is a finite-dimensional vector. In contrast, once the time aspect is considered, as in this paper, the decision variable is a continuous function of time, so the minimization of the overall functional belongs to the calculus of variations. Traditional works usually seek the optimal decision function directly. Due to privacy protection and non-convexity, the Euler-Lagrange equation of the proposed problem is a complicated partial differential equation. Hence, we seek the optimal derivative of the decision function rather than the decision function itself. This can be regarded as seeking the control input of an optimal control problem, for which we propose a centralized reinforcement learning (RL) framework. In the space aspect, we further present a distributed reinforcement learning framework to deal with the impact of privacy protection. Finally, rigorous theoretical analysis and simulation validate the effectiveness of our framework.

    I.INTRODUCTION

DISTRIBUTED optimization [1]–[4] has been widely applied to many scenarios such as smart grids [5], [6], traffic networks [7], [8], and sensor networks [9], [10]. In these works, the whole task is accomplished by a group of agents under privacy protection, which means that the local objective function of each agent is unknown to the others.

In this paper, the existing distributed optimization model is extended to a more complex one, where the global decision variable is a time-varying continuous function rather than just a vector. For each agent, its local objective is the integral of a non-convex function, known as a "functional". These agents try to compute the decision function that minimizes the sum of their functionals under privacy protection. The proposed problem is referred to as the minimization of the non-convex objective functionals of the multi-agent network.

Most existing works [11]–[15] on distributed optimization focus on distributed gradient methods, where the agents reach consensus on the optimal solution through communication with their neighbors and the gradient information of their local functions. Reference [11] combines the push-sum protocol with a distributed inexact gradient method to solve the distributed optimization problem. Reference [12] proposes two bounded distributed protocols to address a global optimal consensus problem. However, these works concern the minimization of smooth and convex functions. For a non-convex function, the global minimizer cannot be obtained through the derivative. Moreover, the decision variable in the above works is a vector, whereas our decision variable is a continuous time-varying function. Hence, the local change of the functional is represented by the variation rather than the differential. Thus, these works cannot be directly applied to our problem.

Since the overall objective in our problem is the integral of a time-varying function, the problem shares features with distributed online optimization. In [16]–[21], the agents aim to minimize the regret, which quantifies the gap between the accumulated time-varying cost and the best fixed cost obtainable when the overall objective functions are known in advance. In [22]–[25], the overall objective is likewise the sum of time-varying cost functions, and the ramp-rate constraint further increases its complexity. However, their decision variable is a discrete-time function, so the number of variables is finite. In contrast, our decision variable is a continuous function over a bounded time interval. Since such an interval is uncountable [26], the number of variables is infinite. Moreover, our objective functional takes into account the influence of the derivative of the decision function. Therefore, these works cannot deal with our problem either.

The minimization of the overall objectives in this paper is essentially a problem in the calculus of variations (CVs). The core of CVs is the solution of the Euler-Lagrange (EL) equation. Due to unknown global information and non-convexity, the EL equation of the proposed problem is a complicated partial differential equation (PDE). Hence, we do not attempt to seek the decision function directly. Instead, a linear integrator system is constructed, in which the derivative of the decision function is regarded as the control input. Then, instead of seeking a function that minimizes the functional, we seek the optimal control input that minimizes the cost function, which is essentially an optimal control problem [27]. Each agent has a local cost function known only to itself. The target is to obtain the consistent optimal control input that minimizes the sum of the cost functions of these agents under privacy protection.

Solving an optimal control problem is usually transformed into obtaining the optimal value function of the corresponding Hamilton-Jacobi-Bellman (HJB) equation [28]. However, this equation is still challenging to solve due to the coupling between the control input and the value function. To handle this, policy iteration (PI) [29] in reinforcement learning (RL) [30] is a valid technique. Therefore, we propose a centralized RL framework to obtain the centralized solution without the need to solve a complicated PDE. The proposed RL framework is related to [31]–[35], which also apply RL to acquire the solution of the HJB equation. However, these works do not consider privacy protection.

Considering the existence of privacy protection, we further design a distributed RL framework. In each iteration of this framework, the agents cooperatively evaluate the value function and update the control input. For value function evaluation, each agent approximates the value function through a neural network with time-varying weights, so a static distributed optimization problem is generated at every time instant. A continuous protocol is designed so that the agents reach consensus on the value function based on their local information. To update the control input cooperatively, the agents estimate the control input at every time instant; similarly, another continuous protocol is designed for this purpose. The centralized RL framework and the two protocols make up the distributed RL framework.

    The main contributions of this paper are as follows.

1) This paper raises a novel distributed optimization problem. Existing works on distributed optimization seldom address the difficulty in the time aspect, where the decision variable is a time-varying continuous function. Moreover, our objective functional is the integral of a non-convex function over a continuous time interval.

2) Since the EL equation of the proposed problem is a complicated PDE that is difficult to solve, we convert the optimization of the functional into an optimal control problem. In this way, the relationship between distributed optimal control and distributed optimization is built in this paper. Based on the transformed problem, this paper proposes a centralized RL framework.

3) To eliminate the influence caused by privacy protection, this paper further puts forward a distributed RL framework. The convergence analysis is also provided.

The structure of this paper is organized as follows. Section II gives some preliminary knowledge and the problem formulation. Section III illustrates the transformation into a distributed optimal control problem. Section IV proposes a centralized RL framework for the transformed problem. Based on the centralized RL framework, Section V raises a distributed RL framework for the agents. Section VI reports the simulation results of our framework. Finally, conclusions and future directions are stated in Section VII.

    II.PROBLEM FORMULATION FOR THE NON-CONVEX OBJECTIVE FUNCTIONALS OF MULTI-AGENT NETWORK

This section first introduces some preliminary knowledge. Then, the exact problem formulation for the non-convex objective functionals of the multi-agent network is given.

    A. Preliminary Knowledge

    Our preliminary knowledge mainly involves relevant graph theory, static distributed optimization, and the calculus of variations.

3) Calculus of Variations: A functional can be understood as a particular kind of function whose domain of definition is composed of continuous functions. For any function x(t) in its domain of definition, there exists a unique real number that corresponds to x(t). The functional can therefore be regarded as "a function of a function". The mathematical model of a typical functional can be expressed as follows:
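The display equation (2) is not reproduced in the extracted text; a standard calculus-of-variations form consistent with the surrounding description (terminal cost h, running cost g, fixed t0, tf, and x0) would be

```latex
J\big(x(t)\big) \;=\; h\big(x(t_f)\big) \;+\; \int_{t_0}^{t_f} g\big(x(t),\,\dot{x}(t),\,t\big)\,\mathrm{d}t,
\qquad x(t_0)=x_0
```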

where x(t) ∈ R^n is a continuous vector function with respect to t, and its first-order derivative is ẋ(t). g(·) and h(·) are scalar functions. In this paper, t0, x0, and tf are fixed.

The calculus of variations is the origin of functional analysis in mathematics; it is a method that seeks the decision function optimizing the functional. According to the definition in (2), the mathematical model can be expressed as

where x(t) is the decision function.

The overall x(t) from t = t0 to t = tf can be understood as a "special" vector. We know from functional analysis that the bounded interval [t0, tf] is uncountable; therefore, this special vector x(t) has infinite dimensions.
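To make the "function of a function" idea concrete, the sketch below discretizes a candidate x(t) on [t0, tf] and evaluates a functional of the form discussed above by numerical quadrature. The particular integrand and terminal cost are illustrative assumptions, not the paper's problem data; the point is simply that an entire trajectory is mapped to a single real number.

```python
import numpy as np

# Illustrative functional J(x) = h(x(tf)) + ∫ g(x, ẋ, t) dt with hypothetical
# choices g(x, ẋ, t) = sin(x)^2 + x^2 + ẋ^2 and h(x) = x^2 (not from the paper).
def evaluate_functional(x_fn, t0=0.0, tf=1.0, n=1001):
    t = np.linspace(t0, tf, n)
    x = x_fn(t)                       # sample the decision function
    xdot = np.gradient(x, t)          # approximate its first-order derivative
    g = np.sin(x)**2 + x**2 + xdot**2
    integral = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t))   # trapezoidal rule
    return integral + x[-1]**2        # add the terminal cost h(x(tf))

# Two different trajectories x(t) are mapped to two different real numbers.
print(evaluate_functional(lambda t: t))        # x(t) = t
print(evaluate_functional(lambda t: t**2))     # x(t) = t^2
```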

    B. Non-Convex Objective Functionals of Multi-Agent Network

Now a network composed of N agents is given. The topology of this network is represented by the graph G. Each agent i (i = 1, 2, ..., N) in this network is equipped with a local functional known only to itself

Remark 1: The functional (4) of agent i maps the function x(t), t ∈ [t0, tf], to a real number. The agents in this network share the same decision function x(t). However, there is privacy protection: each agent only knows its own functional.

Assumption 1: In this paper, the functional (4) of agent i has the following form:
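The display form of (5) is likewise missing from the extracted text; judging from the evaluation step (26) and the cost functions in Section III, Assumption 1 presumably gives each local functional the form

```latex
J_i\big(x(t)\big) \;=\; h_i\big(x(t_f)\big) \;+\; \int_{t_0}^{t_f} \Big( Q_i\big(x(t)\big) \;+\; \dot{x}(t)^{\top} R_i\,\dot{x}(t) \Big)\,\mathrm{d}t
```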

where the symmetric matrices Ri > 0, and Qi(x), hi(x) of some agents i are non-convex continuous functions.

Under privacy protection, each agent aims to communicate with its neighbors to find the consistent decision function x(t) such that the sum of these functionals Ji(x(t)) is minimized. That is

Based on Assumption 1, the sum of these functionals is also non-convex, with the following form:

    III.PROBLEM TRANSFORMATION INTO DISTRIBUTED OPTIMAL CONTROL PROBLEM

The minimization of the overall objective functional in (6) is essentially a problem in the calculus of variations. The Euler-Lagrange (EL) equation for (6) is a complicated PDE. Moreover, non-convexity and privacy protection bring further difficulties in solving (6). This section mainly discusses how to handle non-convexity by converting problem (6) into a distributed optimal control problem. This transformation is also the basis of the RL frameworks in the following two sections.

    A. The Relationship Between Calculus of Variations and Optimal Control

According to the mathematical model defined in (2), the decision variable of the functional J(x(t)) is a function x(t), which is essentially a trajectory from x(t0) to x(tf). Its derivative function ẋ(t) can be regarded as its dynamics. Therefore, we regard x(t) as the system trajectory of the following control system:
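The equation for system (8) is not visible in the extracted text; from the description above, the intended system is presumably the simple integrator

```latex
\dot{x}(t) \;=\; u(t), \qquad x(t_0) = x_0
```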

where u(t) ∈ R^n is the control input of this system. As long as the control input u(t), t ∈ [t0, tf], is given, the corresponding system trajectory x(t) can be generated. Therefore, the functional J(x(t)) in (2) can be transformed into

The value of V(x0, u(t)) is essentially the same as that of J(x(t)). Since (9) can be regarded as the cost function for system (8), the minimization of J(x(t)) is essentially the following optimal control problem:
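The display form of (10) is missing here; combining system (8) with the cost description in (9), the transformed problem is presumably

```latex
\min_{u(t)} \; V\big(x_0, u(t)\big) \;=\; h\big(x(t_f)\big) \;+\; \int_{t_0}^{t_f} g\big(x(t),\,u(t),\,t\big)\,\mathrm{d}t
\qquad \text{s.t.} \quad \dot{x}(t) = u(t), \;\; x(t_0) = x_0
```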

Once the optimal control input u*(t) for (10) is obtained, the system trajectory x(t) together with its derivative function ẋ(t), i.e., u(t), minimizes the functional (2). Therefore, the minimization of (2) is transformed into the optimal control problem (10).

    B. Distributed Optimal Control Problem

According to the above discussion about optimal control, we can transform the minimization problem (6) into the following distributed optimal control problem.

For the linear system given by (8), each agent in the network G is equipped with its own cost function, unknown to the other agents

With Assumption 1, the cost function for each agent i is represented by

Under privacy protection, the agents aim to find the consistent optimal control policy u*(t) such that the sum of their cost functions is minimized, that is

Remark 3: The transformed problem (13) belongs to the class of distributed control problems. However, it represents a distinct philosophy compared with existing works [36], [37]. The distributed control problems discussed in [36], [37] mainly concern the optimal control of large-scale interconnected systems, where the distributed concept lies in the decomposition of the large-scale system. Differently, the distributed concept in our problem (13) emphasizes the decomposition of the overall cost function. Due to privacy protection, the cost function Vi(x0, t0, u(t)) of agent i is unknown to the other agents. The methods in [36], [37] cannot deal with problem (13).

    C. Centralized Solution for Distributed Optimal Control

Before solving the distributed optimal control problem (13), this subsection gives the property of the global solution. Similar to (7), the overall cost function is

The minimization of the cost function (14) is rewritten as

    Similarly, we have

    According to the Bellman principle of optimality, we have

When T is small enough, we have the Bellman equation

    The corresponding Hamiltonian function is defined as
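The defining equation for the Hamiltonian is not reproduced in the extracted text. Consistent with the running cost in (30) and the policy updates (27) and (31), it presumably collects the u-dependent terms of the Bellman equation,

```latex
H\big(x, t, u, V^{*}\big) \;=\; Q(x) \;+\; u^{\top} R\, u \;+\; \left(\frac{\partial V^{*}}{\partial x}\right)^{\!\top} u
```

where Q(x) and R presumably aggregate the local quantities Qi(x) and Ri; any term independent of u (such as ∂V*/∂t) does not affect the minimization over u.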

Remark 4: The target of (6) is to seek the optimal function x(t). However, the non-convexity in Q(x), h(x) makes it difficult to solve (6). In contrast, this subsection transforms (6) into (13) and seeks the optimal u*(t) for (13) by minimizing H(x, t, u, V*) at each time instant t. With respect to u, the Hamiltonian function H(x, t, u, V*) is convex. This is the reason why u*(t) can be derived through (20) even though Q(x), h(x) are non-convex. Moreover, the transformation in this section is the basis of the following two RL frameworks, in which solving a PDE is avoided.
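As a worked step consistent with the policy-update rules (27) and (31) given later, minimizing the Hamiltonian over u by setting its gradient to zero yields the closed-form minimizer referred to as (20):

```latex
\frac{\partial H}{\partial u} \;=\; 2 R\, u \;+\; \frac{\partial V^{*}}{\partial x} \;=\; 0
\quad\Longrightarrow\quad
u^{*}(t) \;=\; -\tfrac{1}{2}\, R^{-1} \frac{\partial V^{*}}{\partial x}
```

Since R > 0, the Hamiltonian is strictly convex in u, so this stationary point is its global minimizer.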

    IV.CENTRALIZED REINFORCEMENT LEARNING FRAMEWORK

Recalling the Bellman principle of optimality, the key to obtaining the optimal control policy u* for (17) is to accurately approximate the structure of V*, i.e., the solution of the HJB equation (21). However, we cannot obtain V* directly from (21). One reason is that (21) is still a complicated second-order PDE, and another is that the global information Q(x), R is unknown to all agents. This section mainly introduces a centralized reinforcement learning framework to deal with the former concern.

    A. Centralized Policy Iteration Framework for Optimal Control Problem

The reason why (21) is a complicated PDE lies in the coupling between V* and u*. Equation (20) implies that to obtain the optimal u*, V* should be known in advance. The HJB equation (21) originates from (18), in which u* is substituted with (20). Equation (18) implies that to obtain the optimal V*, u* should be known in advance. This phenomenon is a "chicken-and-egg" problem.

To deal with the coupling between V* and u*, this subsection considers policy iteration in reinforcement learning. Then, (17) becomes

    with

    With (23), (22) is equivalent to

Fig. 1. The structure diagram of the centralized PI framework.

Algorithm 1 Centralized Policy Iteration Framework
1: Initialization. Set j = 1. Start with a stabilizing control input uj.
2: Value function evaluation. Apply the current control policy uj to the system (8). Measure the corresponding system state trajectory x(t). Evaluate Vj by
   Vj(x(t), t) = h(x(tf)) + ∫_t^{tf} (Q(x) + uj^T R uj) dτ.   (26)
3: Policy update. Update the control policy uj+1(t) by
   uj+1(t) = -(1/2) R^{-1} ∂Vj/∂x.   (27)
4: Set j = j + 1, and go to Step 2 until ||Vj − Vj−1|| ≤ δ0, where δ0 > 0 is a constant known as the cyclic tolerance error.
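The following is a minimal numerical sketch of the evaluate-then-improve loop in Algorithm 1 for the integrator dynamics ẋ = u, using hypothetical scalar choices Q(x) = sin(x)² + x², R = 1, and h ≡ 0 as stand-ins for the paper's data. The value function Vj is evaluated as the tail cost of the current policy, as in (26), and its gradient for the update (27) is approximated by finite differences; it illustrates the structure only and is not the authors' implementation.

```python
import numpy as np

# Hypothetical problem data (stand-ins; the paper's Q, R, h are not specified here).
Q = lambda x: np.sin(x)**2 + x**2      # non-convex running cost
R = 1.0
t0, tf, N = 0.0, 2.0, 200
dt = (tf - t0) / N
x0 = 1.5

def tail_cost(u_seq, x_start, k_start=0):
    """V_j(x, t_k): cost-to-go of policy u_seq from step k_start, state x_start (h ≡ 0)."""
    x, cost = x_start, 0.0
    for k in range(k_start, N):
        cost += (Q(x) + R * u_seq[k] ** 2) * dt
        x += u_seq[k] * dt             # integrator dynamics ẋ = u
    return cost

def dV_dx(u_seq, x, k, eps=1e-4):
    """Finite-difference estimate of ∂V_j/∂x needed by the policy update (27)."""
    return (tail_cost(u_seq, x + eps, k) - tail_cost(u_seq, x - eps, k)) / (2 * eps)

u = -0.5 * np.ones(N)                  # Step 1: a stabilizing initial control input
for j in range(8):                     # outer policy-iteration loop
    # Step 2: apply u_j and record the state trajectory x(t).
    xs, x = np.empty(N), x0
    for k in range(N):
        xs[k] = x
        x += u[k] * dt
    # Step 3: policy update u_{j+1}(t) = -(1/2) R^{-1} ∂V_j/∂x along the trajectory.
    u_next = np.array([-0.5 / R * dV_dx(u, xs[k], k) for k in range(N)])
    print(f"iteration {j}: cost = {tail_cost(u, x0):.4f}, "
          f"max policy change = {np.max(np.abs(u_next - u)):.4f}")
    u = u_next                         # Step 4: repeat until the change is small
```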

Remark 5: Reference [38] proposes a similar algorithm and proves its convergence. However, when the global information about Q, R is unknown, the value function evaluation step in such algorithms cannot be directly implemented. This is also the motivation for the distributed algorithm in Section V.

    B. Centralized Reinforcement Learning Framework With Value Function Approximation

For convenience, we denote Vj(x(t), t) as Vj(x, t). The structure of Algorithm 1 is given in Fig. 1.

The key to Algorithm 1 is how to evaluate Vj in Step 2. Therefore, this subsection first approximates Vj(x, t) in (26) using a neural network with time-varying weights. That is
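The approximation (28) itself does not survive in the extracted text; judging from the left-hand side of (30), it is presumably

```latex
V_j(x, t) \;\approx\; \hat{W}_j(t)^{\top} \sigma(x)
```

where σ(x) is a vector of basis (activation) functions and Ŵj(t) is the time-varying weight vector.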

    Based on (28), we have
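Equation (29) is likewise missing; given (28) and the update rule (31), it is presumably the gradient expression

```latex
\frac{\partial V_j}{\partial x} \;\approx\; \left[\frac{\partial \sigma(x)}{\partial x}\right]^{\!\top} \hat{W}_j(t)
```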

    Based on (28) and (29), Algorithm 1 becomes Algorithm 2.

Algorithm 2 Centralized Reinforcement Learning Framework With Value Function Approximation
1: Initialization. Set j = 1. Start with a stabilizing control input uj.
2: Value function evaluation. Apply the current control policy uj to the system (8). Measure the corresponding system state trajectory x(t). Evaluate Ŵj(t) in Vj by
   Ŵj(t)^T σ(x) = h(x(tf)) + ∫_t^{tf} (Q(x) + uj^T R uj) dτ.   (30)
3: Policy update. Update the control policy uj+1(t) by
   uj+1(t) = -(1/2) R^{-1} [∂σ(x)/∂x]^T Ŵj(t).   (31)
4: Set j = j + 1, and go to Step 2 until ||Ŵj − Ŵj−1|| ≤ δ1, where δ1 > 0 is a constant known as the cyclic tolerance error.

Remark 6: In the j-th iteration of this framework, seeking the structure of Vj is transformed into seeking Ŵj(t).

    V.DISTRIBUTED REINFORCEMENT LEARNING FRAMEWORK

Algorithm 2 only settles the first concern, that (21) is a complicated second-order PDE. However, the second concern, privacy protection, implies that the global information about h(x(tf), tf), Q(x), R is unknown to all agents. Therefore, Steps 2 and 3 of Algorithm 2 cannot be directly implemented under this privacy protection. To deal with this concern, this section introduces the distributed reinforcement learning framework.

    A. Distributed Solution for Step 2 in Algorithm 2

At time instant t, agent i estimates θ as θi ∈ R^l with the corresponding equality constraint and the penalty term in the optimization function. On the basis of (36), the exact problem becomes

where L is the Laplacian matrix of the graph G, and

According to [39], the solution to problem (37) is the saddle point of its Lagrangian function. Similar to the procedure in [39], the continuous protocol for agent i can be derived as

    B. Distributed Solution for Step 3 in Algorithm 2

In Step 3 of Algorithm 2, the target is to update the new control policy uj+1. Since R is unknown to all agents, (31) in Step 3 cannot be directly implemented. Agent i only knows its own Ri, so it can only update the new control policy uj+1 to minimize the following Hamiltonian function:
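The local Hamiltonian itself is not reproduced in the extracted text; consistent with the gradient 2Ri ρi + ∂Vj/∂x appearing in the protocol (44) of Algorithm 3, it presumably takes the quadratic form (with ρi denoting agent i's estimate of the new control input)

```latex
H_i(\rho_i) \;=\; Q_i(x) \;+\; \rho_i^{\top} R_i\, \rho_i \;+\; \left(\frac{\partial V_j}{\partial x}\right)^{\!\top} \rho_i
```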

Problem (41) is also in the standard form of the static distributed optimization problem (1). Note that the values of Vj and Qi(x) change over time. Therefore, for each time instant t, there is a static distributed optimization problem (41) that differs from those at other time instants.

Similar to the procedure from (37) to (38), the continuous protocol for agent i is

    C. Distributed Reinforcement Learning Framework

The protocols (38) and (42) are implemented without the need for global information; each agent only needs its own local information. Now their convergence and optimality are given.

Theorem 1: The protocol (38) converges. When it converges, the θi reach consensus to Ŵj(t) for Step 2 of Algorithm 2.

    Proof: See Appendix.

Similar to the proof for the protocol (38), the protocol (42) also converges, and the new policy uj+1 for Step 3 of Algorithm 2 is obtained when (42) converges.

Since the convergence and optimality of the protocols (38) and (42) have been proved, Steps 2 and 3 in the j-th iteration of Algorithm 2 can be implemented in a distributed manner. We then give the following Algorithm 3 according to the framework of Algorithm 2.

Algorithm 3 Distributed Reinforcement Learning Framework
1: Initialization. Set j = 1. Start with a stabilizing control input uj.
2: Value function evaluation. Apply the current control policy uj to the system (8). Measure the corresponding system state trajectory x(t). For each time instant from t = t0 to t = tf, agent i calculates yi defined in (35). Then, its θi is updated through the protocol
   dθi/dτ = -α [ ∂θi fi(θi)/N + Σ_{j=1}^{N} aij(θi − θj) + Σ_{j=1}^{N} aij(λi − λj) ]
   dλi/dτ = α [ Σ_{j=1}^{N} aij(θi − θj) ].   (43)
   When (43) converges, Ŵj(t) is determined through the consistent value of θi.
3: Policy update.
   3.1: Discretize the time interval [t0, tf] with a small time period T. Set k = 1. For the initial time instant t0, x(t0) = x0 is given.
   3.2: At the time instant t = t0 + (k−1)T, based on Ŵj(t) obtained in Step 2 and the current state x(t), ∂Vj/∂x = [∂σ(x)/∂x]^T Ŵj(t) for the time instant t is obtained. Agent i is equipped with its own Hi(ρi), and the initial value of ρi is set as ρi0. Then, agent i calculates its own ∂Hi(ρi)/∂ρi = 2 Ri ρi + ∂Vj/∂x and updates its ρi through the protocol
   dρi/dτ = -β [ ∂ρi Hi(ρi)/N + Σ_{j=1}^{N} aij(ρi − ρj) + Σ_{j=1}^{N} aij(ηi − ηj) ]
   dηi/dτ = β [ Σ_{j=1}^{N} aij(ρi − ρj) ].   (44)
   When the protocol (44) converges, uj+1(t) is determined by the consistent value of ρi.
   3.3: Apply the updated uj+1(t) to system (8) to generate the new state x(t+T). Let k = k + 1, and go to Step 3.2 until t = tf.
4: Set j = j + 1, and go to Step 2 until ||Ŵj − Ŵj−1|| ≤ δ2, where δ2 > 0 is a constant known as the cyclic tolerance error.
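The protocols in Steps 2 and 3 of Algorithm 3 are continuous-time primal-dual consensus dynamics. As an illustration of how such a protocol can be simulated, the sketch below Euler-discretizes dynamics with the structure of (43). The network, the local objectives fi, and the step sizes are assumptions made for the example; in the paper's setting fi would be built from the local value-evaluation data in (35).

```python
import numpy as np

# Illustrative 4-agent ring network; L is the Laplacian of the graph G.
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
L = np.diag(A.sum(axis=1)) - A
N, l = A.shape[0], 3                    # number of agents, dimension of each θ_i

# Hypothetical local objectives f_i(θ) = ||P_i θ - y_i||²; the paper instead uses
# the local value-evaluation residuals defined in (35).
rng = np.random.default_rng(0)
P = rng.normal(size=(N, l, l))
y = rng.normal(size=(N, l))
grad_f = lambda i, th: 2.0 * P[i].T @ (P[i] @ th - y[i])

theta = np.zeros((N, l))                # primal estimates θ_i
lam = np.zeros((N, l))                  # dual variables λ_i
alpha, dtau = 1.0, 1e-3                 # gain and Euler step for the update time τ

for _ in range(20000):                  # Euler-discretized continuous protocol (43)
    disagreement = L @ theta            # row i: Σ_j a_ij (θ_i − θ_j)
    dual_coupling = L @ lam             # row i: Σ_j a_ij (λ_i − λ_j)
    grads = np.array([grad_f(i, theta[i]) for i in range(N)])
    theta += dtau * (-alpha) * (grads / N + disagreement + dual_coupling)
    lam += dtau * alpha * disagreement

print("consensus spread:", np.max(np.abs(theta - theta.mean(axis=0))))
```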

    VI.NUMERICAL SIMULATION

In this section, the simulation results of Algorithm 3 are mainly discussed. The agents are distributed over the network in Fig. 2, whose graph is represented by the following adjacency matrix:

Fig. 2. The topological graph of the agents.

For agent i ∈ {1, 2, 3, 4}, its objective functional Ji maps the decision function x(t) to a real number. These agents aim to minimize their overall functional

where the global information is unknown to the agents.

To seek the optimal decision function x(t) minimizing (47) in a distributed manner, the following linear system is considered:

where u(t) ∈ R^2 is the control input of this system. Then, the minimization of (47) is transformed into seeking the optimal control input u*(t) to minimize

The minimization of (49) is essentially an optimal control problem. Its HJB equation is

Since agent i only knows its local information Qi, Ri, hi in Q, R, h, the optimal solution cannot be obtained by solving (50) directly. That is why we propose Algorithm 3 to deal with this problem.

Now, we give some settings before implementing Algorithm 3. In the j-th iteration of this algorithm, the agents desire to cooperatively obtain Ŵj(t) making up the value function Vj = Ŵj(t)^T σ(x) that satisfies the HJB equation (50) as closely as possible. For this problem, the basis function σ(x) is selected as

    and the exact structure of the time-varying NN weights is

Fig. 3. The convergence tendency of the first nine elements of Ŵj(t).

Fig. 4. The convergence tendency of the last element of Ŵj(t).

Fig. 5. The convergence of the error functional Ew(Ŵj(t)).

Fig. 6. The exact procedure of how the agents reach consensus to the consistent Ŵj,6(t) for the j = 5 iteration.

Fig. 7. The exact procedure of how the agents reach consensus to the consistent Ŵj,8(t) for the j = 5 iteration.

Fig. 8. The exact procedure of how the agents reach consensus to the consistent uj+1,1(t) in the new control policy of the j = 5 iteration.

    VII.CONCLUSIONS AND FUTURE ORIENTATION

Fig. 9. The exact procedure of how the agents reach consensus to the consistent uj+1,1(t) at different time instants.

Fig. 10. The exact procedure of how the agents reach consensus to the consistent uj+1,2(t) in the new control policy of the j = 5 iteration.

This paper studies a novel distributed optimization problem, where each agent is equipped with a local functional unknown to others. Besides the privacy protection discussed in existing works on distributed optimization, the proposed problem involves the difficulty in the time aspect that the decision variable is a time-varying continuous function. In particular, the functionals of some agents are non-convex. Considering privacy protection and non-convexity, we transform the proposed problem into a distributed optimal control problem. Then, we propose a centralized RL framework to avoid solving a PDE directly. Moreover, we further put forward a distributed RL framework to handle privacy protection. The numerical simulation verifies the effectiveness of our framework. In the future, we will discuss a more representative case than that in Assumption 1.

    APPENDIX PROOF OF THEOREM 1

Proof (Stability): For convenience, we analyze the matrix form of (38)

Differentiating both sides of (59) with respect to the update time τ, we have

    Differentiating both sides of (58), we have

    Substituting (61) into (60) yields

Fig. 11. The exact procedure of how the agents reach consensus to the consistent uj+1,2(t) at different time instants.

Fig. 12. The optimal decision function x*(t) cooperatively obtained by the agents.

    Considering (33) and (35), we have

    Therefore,

Since (59) implies that ? is positive semi-definite, (58) will converge.

Proof (Optimality): The convergence of (38) has been proved. When it converges to the equilibrium point, we have

    Based on (65), we have

    and

Adding up (67) from i = 1 to i = N, we have

Therefore, the consistent θ, or equivalently Ŵj(t), is the optimal solution for (34). Moreover, based on (68), (33), and (35), we have

Therefore, the consistent θ satisfies (30) in Step 2 of Algorithm 2.
