
Convolutional Neural Network-Based Deep Q-Network (CNN-DQN) Resource Management in Cloud Radio Access Network

China Communications, 2022, Issue 10

Amjad Iqbal, Mau-Luen Tham, Yoong Choon Chang

Department of Electrical and Electronic Engineering, Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman (UTAR), Malaysia

*The corresponding author, email: thamml@utar.edu.my

Abstract: The recent surge of mobile subscribers and user data traffic has accelerated the telecommunication sector towards the adoption of fifth-generation (5G) mobile networks. Cloud radio access network (CRAN) is a prominent framework in the 5G mobile network to meet the above requirements by deploying low-cost and intelligent multiple distributed antennas known as remote radio heads (RRHs). However, achieving optimal resource allocation (RA) in CRAN using traditional approaches remains challenging due to the complex structure. In this paper, we introduce the convolutional neural network-based deep Q-network (CNN-DQN) to balance energy consumption and guarantee the user quality of service (QoS) demand in downlink CRAN. We first formulate the Markov decision process (MDP) for energy efficiency (EE) and build a 3-layer CNN to capture the environment features as the input state space. We then use DQN to turn the RRHs on/off dynamically based on the user QoS demand and energy consumption in the CRAN. Finally, we solve the RA problem based on the user constraints and transmit power to guarantee the user QoS demand and maximize the EE with a minimum number of active RRHs. In the end, we conduct simulations to compare our proposed scheme with the Nature DQN and the traditional approach.

Keywords: energy efficiency (EE); Markov decision process (MDP); convolutional neural network (CNN); cloud RAN; deep Q-network (DQN)

I. INTRODUCTION

The past two decades have witnessed the exponential growth of mobile subscribers and user data traffic. According to the 2020 Cisco annual report, mobile subscribers are expected to reach 5.7 billion, with monthly data traffic of 110 exabytes (EB), in 2023 [1]. To fulfill these requirements, a large number of base stations (BSs) need to be installed within the coverage area. However, installing more BSs increases infrastructure cost as well as energy and power consumption. Approximately 60-75% of the total energy consumption in a cellular network is due to BSs [2]. Therefore, it is essential to dynamically turn off BSs when the user demand is low to ensure low energy consumption.

In the existing radio access network (RAN) framework, capacity is limited by the isolated resource management among BSs. One way to improve the capacity of the existing RAN framework is network densification. However, such a process increases the capital and operational expenditure (CAPEX and OPEX), and thus existing RAN frameworks cannot support the ever-increasing user demand and number of mobile subscribers [3].

Cloud radio access network (CRAN) is a prominent architecture to overcome the above difficulties by providing reliable and fast real-time communication for the next-generation network [4]. The main idea of CRAN is to decouple the BS functionality into distributed low-cost, low-power remote radio heads (RRHs) and a centralized baseband unit (BBU). The transceiving of radio signals from/to end users is performed by the RRHs, while the BBU is responsible for baseband signal processing functions. Due to centralized processing, CRAN can allocate radio resources to RRHs with global knowledge of user demand and mobility. Although CRAN is a key enabler technology for the upcoming generation, adaptive resource allocation (RA) is still a topic worthy of investigation.

Many researchers have investigated the RA problem in CRAN from different perspectives, e.g., throughput, resource allocation, and joint cell activation [5-7]. However, these problems are formulated based on traditional model-based approaches with a static network environment. Such approaches become impractical, especially when user mobility affects the network state at each time step t. Therefore, in this work, we consider a model-free approach to optimize the RA problem over the entire operational period in real time.

Reinforcement learning (RL) is a machine learning (ML) approach in which a learning agent interacts continuously with an unknown environment to tackle complex decision-making problems based on the current state [8]. The learning agent chooses a possible action in each state and then trains the model based on the available data to make the decision at each time step t. Recently, deep learning (DL) has been applied successfully in many applications, e.g., image processing, computer vision (CV), natural language processing (NLP), and speech recognition. Similarly, DL has been used in wireless communication to learn sequential control tasks and support end-to-end RL algorithms [9]. The convolutional neural network (CNN) advances the DL method to extract more complex dynamic features in mobility scenarios [10]. The works in [11] and [12] use CNN to train the neural network (NN) to maximize the energy efficiency and throughput of multi-cell heterogeneous networks. However, they considered neither the setting of CRAN nor RL, which captures the interactions with the environment. Furthermore, most of the existing works define the state of the wireless network as the user demand and the RRHs, neglecting the relationship between them [13, 14]. The main drawback of these works is that users must report their information to the respective RRHs, increasing the signaling overhead incurred by feedback. Secondly, the above works [13, 14] usually exploit fully connected layers to train the NN, which increases the number of training parameters [15].

Keeping the above drawbacks in mind, we consider the relationship between users and RRHs as the raw observation at the input state and propose a three-layer relational CNN-based deep Q-network (CNN-DQN) that captures the environmental state features. We combine the CNN and DQN schemes in this paper to extract the raw observation between users and RRHs from the network. The CNN phase is responsible for extracting features and reducing the network parameter complexity, while the DQN phase is responsible for dynamically turning the RRHs on/off. To address the RA problem more efficiently, we first devise the Markov decision process (MDP) framework for EE by defining the state, action, reward, and next state. We then propose the CNN-DQN method to dynamically switch the RRHs on/off to maximize energy efficiency (EE) and satisfy the user quality of service (QoS) demand. Finally, we solve the RA problem based on the user constraints and transmit power to guarantee the user QoS demand and maximize the EE with a minimum number of active RRHs. The key contributions of this paper are as follows:

1. We propose a DRL-based autonomous RA decision-making approach that guarantees user satisfaction and maximizes EE while minimizing power consumption in downlink CRAN.

2. The RA problem is formulated as an MDP by defining the RL components, i.e., state space, action space, and reward function. We first build a three-layer CNN framework that captures the raw observation as the input state. The CNN output is fed to the DQN input to make the RRH on/off switching decision based on the user requirements.

3. We divide our algorithm into two phases, i.e., CNN and DQN. The CNN phase is used to extract the raw observation features from the environment, and the DQN phase determines the best possible action in a particular state.

Finally, we conduct a comprehensive simulation to validate the effectiveness of the proposed algorithm. The results show that the proposed solution performs better in terms of maximizing energy efficiency, saving power, and satisfying the user QoS requirements, compared to the Nature DQN and the traditional approach.

The rest of this paper is organized as follows. We present the work closely related to this research in Section II. The network model, along with the power consumption model and problem formulation, is described in Section III. The proposed scheme, followed by the 3-layer CNN phase, is presented in Section IV. Simulation results and conclusions are discussed in Section V and Section VI, respectively.

II. RELATED WORKS

The increasing popularity of smartphone applications has accelerated the development of wireless networks. Significant challenges of the wireless network include handling power consumption, maximizing the EE, and satisfying the user QoS requirements. As such, many scholars have shown interest in proposing lasting solutions for the above problems.

In [16], the EE maximization problem is studied based on a mode-selection algorithm using the transmission rate as a QoS requirement. Based on various device factors, it is concluded that EE is maximized successfully for each content delivery. In [17], a joint power and RRH selection problem is formulated to improve the EE for green CRAN. A comprehensive simulation in [17] shows that the proposed method can obtain a near-optimal solution for EE. The authors of [17] extend their work in [18] by using mixed-integer non-linear programming for joint RRH selection to reduce the computational complexity. In [19], two transmission strategies, i.e., data sharing and data compression, are formulated to minimize the total power in the wireless network. A radio resource management framework is proposed in heterogeneous CRAN to maximize the EE performance in [20]. Similarly, a load-aware maximization approach is proposed for the EE optimization problem in dense small-cell networks [21]. In [22], a soft fractional frequency reuse method is proposed to formulate a joint resource block and power allocation optimization problem to maximize the EE performance in heterogeneous CRANs. In [23], the user association problem is investigated to improve the EE performance in a small-cell heterogeneous network. All the above works [16-23] apply the model-based optimization approach to solve the RA management problem in the wireless network. These methods solve the utility function by assuming a static environment. However, such approaches are not practical, as the channel conditions of a typical mobile radio network change dynamically.

Recently, the deep learning (DL) branch of ML has been applied successfully to solve high-computation, massive-data problems. DL-based approaches significantly reduce the high data complexity and have been adopted for wireless network problems, e.g., RA [24] and physical-layer communication [25]. Going one step further, Mnih et al. [26] introduced an advancement of DL, known as deep reinforcement learning (DRL), that can solve human-level complicated control problems. DRL provides a promising solution to tackle RA problems in the wireless network [27, 28]. A DRL technique is applied to solve the power management and overall resource distribution problem in cloud computing systems [29]. A DRL-based algorithm is applied in multiple-relay cooperative networks to maximize the EE performance and overall data rate [30]. Furthermore, different DRL algorithms are used to solve the power management problem in CRAN [13, 14]. However, these works solve the RA problem with handcrafted features and do not explicitly describe the relationship between users and RRHs in the network state. If such information is captured between the users and RRHs, then the RRHs are responsible for recording all the valid information, and the users do not need to provide any such information through signaling, which reduces the signaling burden in the network. Secondly, the above works utilize fully connected layers to train the NN, which significantly increases the number of training parameters [15]. This motivates us to combine CNN with DQN. The CNN phase is responsible for extracting the input state features containing the user demand, the on/off state of the RRHs, and the relationship between users and RRHs. The DQN phase, in turn, speeds up the learning process of the algorithm and achieves better network performance.

Figure 1. DRL-based dynamic RA in CRAN.

III. SYSTEM MODEL

As depicted in Figure 1, we consider a typical downlink CRAN framework containing a set of RRHs R = {1, 2, ..., R}, a set of UEs U = {1, 2, ..., U}, and a single BBU. We also consider a time period T = {1, 2, ..., T}. The UEs change their positions randomly and report their data rate demand D_u ∈ [D_min, D_max] and channel state information (CSI) to the BBU pool. The BBU pool acts as the RL agent. The major notations are summarized in Table 1. Without loss of generality, we assume that the users and RRHs are each equipped with a single antenna.

Furthermore, we consider that the users can access all the RRHs, and the RRHs are connected to the BBU pool. Thus, all the information is shared in a centralized manner. The path-loss model of the system follows [19]:

where d_{r,u} indicates the distance between RRH r and user u. The channel fading model is defined as [19]:

where ζ_{r,u}, ρ_{r,u} and ω_{r,u} represent the antenna gain, small-scale fading and shadowing coefficient, respectively. According to [19], the signal-to-interference-plus-noise ratio (SINR) received by UE u at time t, denoted δ_u(t), can be represented as:

Here σ² denotes the background noise power, h_u(t) represents the channel gain vector between the RRHs and user u at time t, expressed as h_u(t) = [h_{1u}(t), h_{2u}(t), ..., h_{Ru}(t)]^T, and w_u(t) is the beamforming weight vector between the RRHs and user u at time t, denoted as w_u(t) = [w_{1u}(t), w_{2u}(t), ..., w_{Ru}(t)]^T. Thus, the achievable data rate for user u at time step t is given as [19]:

C_u(t) = W log_2(1 + δ_u(t)/J_m),

where W and J_m denote the channel bandwidth and the SNR gap, respectively. The SNR gap depends on the modulation scheme; we assume J_m = 1 according to [14].
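To make the rate model concrete, the following minimal NumPy sketch shows how the per-user SINR and achievable rate defined above could be evaluated for given channel and beamforming vectors. The array shapes, variable names and function signature are illustrative assumptions, not part of the paper.

import numpy as np

def achievable_rates(H, W_bf, noise_power, bandwidth, snr_gap=1.0):
    """Per-user SINR and achievable rate for one downlink CRAN snapshot.

    H     : (R, U) complex channel matrix; column u is h_u(t)
    W_bf  : (R, U) complex beamforming matrix; column u is w_u(t)
    """
    num_users = H.shape[1]
    rates = np.zeros(num_users)
    for u in range(num_users):
        gains = np.abs(H[:, u].conj() @ W_bf) ** 2      # |h_u^H w_k|^2 for every user k
        signal = gains[u]
        interference = gains.sum() - signal
        sinr = signal / (interference + noise_power)
        rates[u] = bandwidth * np.log2(1.0 + sinr / snr_gap)
    return rates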

    3.1 Power Consumption Model

According to [31], the relationship between BS power consumption and transmit power can be approximated linearly. Therefore, for each RRH, a linear power model is applied as:

Here, p_{r,A}(t) denotes the power of an active RRH without transmitting any signals, p_{r,S}(t) represents the sleep power when there is no need for transmission, and the remaining term indicates the RRH transmit power, where τ is the power amplifier drain efficiency and is considered constant. A and S denote the sets of RRHs in active and sleep mode, respectively. Thus, one has R = A ∪ S.

Most of the works in the literature, e.g., [17], [19], and [32], have ignored the transition power, i.e., the power consumed when an RRH changes mode, in calculating the total power consumption. In this paper, we also consider the transition power, denoted as p_{r,G}(t). Therefore, the total power consumption P_total(t) of all RRHs at time step t can be expressed mathematically as:

    3.2 Problem Formulation

This work aims to maximize the long-term EE performance by adjusting the per-RRH transmit power and user data rate. According to [33], EE (Mbits/J) is defined as the ratio between the sum throughput and the total power consumption at time t. Mathematically, we can express the EE as

EE(t) = Σ_{u∈U} C_u(t) / P_total(t).

Thus, the optimization problem of EE can be formulated as:

Constraint (8b) indicates that each user's target data rate must be less than or equal to the achievable data rate, whereas constraint (8c) specifies that each user's transmit power must be less than or equal to the maximum transmit power.
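To tie the power model of (6) and the EE definition above together, the snippet below sketches how the total power and the resulting EE of one time step could be evaluated. The linear active-power form, the transition-power handling and the function signature are illustrative assumptions rather than the paper's exact formulation.

import numpy as np

def total_power_and_ee(rates, active_mask, switched_mask, W_bf,
                       p_active, p_sleep, p_transition, drain_eff):
    """Total RRH power (state + transition + scaled Tx power) and the energy efficiency.

    rates         : (U,) achievable rates C_u(t)
    active_mask   : (R,) bool, True if RRH r is active
    switched_mask : (R,) bool, True if RRH r changed mode at this time step
    W_bf          : (R, U) complex beamforming matrix (rows of sleeping RRHs are zero)
    """
    tx_power = np.sum(np.abs(W_bf) ** 2, axis=1)         # per-RRH transmit power
    p_state = np.where(active_mask,
                       p_active + tx_power / drain_eff,  # active RRHs
                       p_sleep)                          # sleeping RRHs
    p_total = float(np.sum(p_state) + p_transition * np.sum(switched_mask))
    ee = float(np.sum(rates)) / p_total                  # EE = sum throughput / total power
    return p_total, ee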

IV. PROPOSED SOLUTION

In this paper, we present a DRL approach to maximize the long-term EE performance and satisfy the user QoS requirements in downlink CRAN. In this section, we first introduce the basic concept of RL for better readability, followed by a detailed description of the proposed scheme.

    4.1 RL Concept

RL is a powerful artificial intelligence (AI) technique in which an agent interacts solely with an unknown environment, monitors the current state and maps the situation to actions so as to maximize the reward value. Basically, RL follows the MDP framework for modeling complex decision-making problems. The MDP can be defined as a tuple N = (S, A, K(s,a), P(s′,k|s,a)), where S and A represent the discrete state and action space, respectively, K(s,a) denotes the reward function for a particular state and action, and P(s′,k|s,a) is the transition probability when the agent moves from the given state s ∈ S and action a ∈ A to the next state s′ ∈ S. The agent observes the current network state s_t at each time step t and executes an action a_t, as shown in Figure 2. Feedback is obtained from the environment after executing the action in the form of a scalar reward. The goal of the agent is to learn the near-optimal control policy a = π(s) that maximizes the reward value from a long-term perspective. A state value function V^π(s) is introduced to compute the average accumulative reward. This V^π(s) follows the recursive relationship based on the Bellman equation [26]:

Figure 2. RL basic form and components.

where μ is the discount factor, which specifies the importance of future rewards relative to the current reward. According to [26], two basic approaches are used to solve the MDP framework of (9), i.e., dynamic programming (DP) and Q-learning. DP is mostly used with the model-based approach, where the state transition probability is already known. However, in a complex 5G networking environment, the state transition probability changes at each time step t. Therefore, such an approach is not feasible for complex network application problems. Hence, in this work, we deal with unknown state transition probabilities to solve the MDP problem.
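For readers unfamiliar with (9), the following generic sketch illustrates iterative policy evaluation under a known transition model, i.e., the DP setting that is contrasted with Q-learning below. The tabular representation and the tolerance value are illustrative assumptions.

import numpy as np

def policy_evaluation(P, K, policy, mu=0.9, tol=1e-6):
    """Iterative policy evaluation via the Bellman expectation equation
    V(s) = sum_a pi(a|s) [ K(s,a) + mu * sum_s' P(s'|s,a) V(s') ].

    P      : (S, A, S) array of transition probabilities P(s'|s,a)
    K      : (S, A) reward table
    policy : (S, A) array of action probabilities pi(a|s)
    """
    V = np.zeros(P.shape[0])
    while True:
        # Backup: expected immediate reward plus discounted successor value
        Q = K + mu * np.einsum("san,n->sa", P, V)
        V_new = np.sum(policy * Q, axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new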

    4.2 Q-Learning

Q-learning is a basic algorithm for dealing with unknown state transitions based on the temporal-difference method. Before explaining Q-learning, we first introduce the concept of the Q-value function, also known as the state-action value function Q(s,a). The optimal Q-function can be represented as Q*(s,a) = max_π Q^π(s,a). The Bellman equation [26] for the optimal Q-function is then written as:

Action selection in Q-learning relies on ε-greedy exploration, where the agent chooses a random action with probability ε and the greedy action with probability 1-ε. The Q-value is initialized for the given state and action values and updated iteratively while evolving the action selection. The updated Q-value can be written as:

where γ is the learning rate. From (11), it can be concluded that every state-action value is stored in a Q-table, which works well for a limited state-action dimension. However, in real-time applications, the state-action space grows exponentially, making it problematic for Q-learning to store all the values in a lookup Q-table.
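The ε-greedy selection and the tabular update of (11) can be illustrated with the minimal sketch below. The environment interface (reset/step) and the hyperparameter values are assumptions made only for illustration.

import numpy as np

def q_learning(env, num_states, num_actions,
               episodes=1000, gamma_lr=0.1, mu=0.9, epsilon=0.1):
    """Tabular Q-learning with epsilon-greedy exploration.

    Update rule: Q(s,a) <- Q(s,a) + gamma_lr * (K + mu * max_a' Q(s',a') - Q(s,a)).
    env is assumed to expose reset() -> state and step(a) -> (next_state, reward, done).
    """
    Q = np.zeros((num_states, num_actions))
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy action selection
            if np.random.rand() < epsilon:
                a = np.random.randint(num_actions)
            else:
                a = int(np.argmax(Q[s]))
            s_next, reward, done = env.step(a)
            # Temporal-difference update of the Q-table entry
            td_target = reward + mu * np.max(Q[s_next]) * (not done)
            Q[s, a] += gamma_lr * (td_target - Q[s, a])
            s = s_next
    return Q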

    4.3 DQN-Learning

To avoid the dimensionality problem, a linear function approximation method was proposed to approximate the Q-value function. However, such a method cannot estimate the Q-value function accurately. This problem is solved by deep reinforcement learning (DRL), which uses a neural network known as a deep neural network (DNN). The basic idea is to use a non-linear function to approximate the Q-value function. The deep Q-network (DQN) is the most widely used DRL algorithm and has been proposed for different applications [34]. In DQN, a separate target network and an experience replay memory D are added alongside the DNN to reduce the correlation between data and make the convergence more stable. The learning agent collects all the information and then applies it to train the policy (offline) in the background. Thus, DQN makes all decisions efficiently and in a timely manner based on the already-learned policy. In DQN, the state-action value function Q(s,a) can be represented, based on the Bellman equation, as K + μ max_{a′} Q*(s′,a′). The loss function is then calculated as:

where y_t = K_t + μ max_{a′} Q(s′_t, a′; θ′) is the target value, and θ and θ′ indicate the weights of the evaluation and target networks, respectively. We optimize these weights using the stochastic gradient descent algorithm [35].
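As a concrete view of the loss in (12) and the role of the target network, one possible DQN training step is sketched below in PyTorch. The network objects, the replay-batch format and the hyperparameters are illustrative assumptions rather than the exact implementation used in the paper.

import torch
import torch.nn.functional as F

def dqn_train_step(q_net, target_net, optimizer, batch, mu=0.9):
    """One gradient step on the DQN loss L(theta) = E[(y - Q(s,a;theta))^2].

    batch: tuple of tensors (states, actions, rewards, next_states, dones),
    where actions is int64 and dones is a 0/1 float tensor.
    """
    states, actions, rewards, next_states, dones = batch

    # Q(s,a;theta) for the actions actually taken
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Target y = K + mu * max_a' Q(s',a';theta'), with no gradient through theta'
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + mu * next_q * (1.0 - dones)

    loss = F.mse_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()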

4.4 Proposed Convolutional Neural Network (CNN) Scheme

Due to random user movement at each time step t, the state-space dimension increases exponentially. We propose a relational CNN-DQN algorithm that significantly alleviates the state-space dimensionality issue and achieves the optimal control policy for RRH on/off switching. Considering the dynamic characteristics of the network state space, we propose three hidden convolutional layers, which contain 32, 32 and 64 convolutional filters, respectively, with an input matrix of M×M. The input matrix consists of the user demand, the RRH on/off state, and the CSI features. We use the Xavier normal initializer [36] to initialize each convolutional filter. The output size of a convolutional filter can be calculated as

O = (I − K + 2P)/S + 1,

where O is the convolutional filter output size, and I, K, P, and S represent the input size, kernel (filter) size, number of paddings, and stride, respectively. In this work, we set the kernel size of all hidden layers to 2×2, the padding of all hidden layers to 0, and the stride of all hidden layers to 1. Furthermore, we employ the rectified linear unit (ReLU) activation function for all hidden layers [35]. The proposed CNN-DQN algorithm consists of convolutional layers and pooling layers, followed by flatten and fully connected layers, as shown in Figure 3. The convolutional layers are responsible for extracting the environment state-space features. The pooling layers perform down-sampling of the extracted features; we apply a max filter that outputs the maximum value of a particular region. To prevent the NN from over-fitting, we apply dropout to the output of the last max-pooling layer with a probability of β = 0.25. The output of the last max-pooling layer is then flattened into a one-dimensional vector, which is connected to a 100×1 fully connected (FC) NN. The training process is then executed by the DQN algorithm, as shown in Figure 3. Therefore, the extracted state features of the CNN are fed to the DQN to make the RRH on/off switching decision. We define the state space s(t), action space a(t) and reward function K(t) for our problem as follows:

Algorithm 1. CNN-based DQN framework.
Input: user data rate demand D_u(t), RRH on/off state v_r(t), and channel gain H(t).
Output: energy efficiency EE(t).
1: Initialize the experience memory D with its capacity
2: Initialize the weights and biases of the main and target networks, θ and θ′
3: for each episode do
4:   Observe the initial state s_t
5:   Extract the CSI feature φ_t using the CNN
6:   Feed the extracted CSI feature φ_t to the DRL agent
7:   for each time step t do
8:     Choose a probability ρ
9:     if ε ≥ ρ then
10:      Select a random action a_t
11:     else
12:      Select the greedy action a_t = argmax_a Q(φ_t, a; θ)
13:     end if
14:     Solve (19) to obtain the optimal beamforming solution based on the active set of RRHs R
15:     Calculate the reward K_t and the successor state s′_t
16:     Store the transition (s_t, a_t, K_t, s′_t) into D
17:     Randomly sample a mini-batch of transitions (s_t, a_t, K_t, s′_t) from D
18:     Set the target y_t = K_t if the episode terminates, and y_t = K_t + μ max_{a′} Q(φ′, a′; θ′) otherwise   (15)
19:     Train the network to minimize the loss function of (12)
20:     Perform the stochastic gradient descent step on (y_t − Q(φ_t, a_t; θ))²
21:   end for
22: end for

Figure 3. Proposed CNN-based DQN framework.
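For concreteness, a possible PyTorch realization of the three-layer CNN feature extractor and DQN head described above is sketched below. The single-channel M×M input, the placement of the pooling and dropout layers, and the action-space size are our assumptions based on the description and Figure 3.

import torch
import torch.nn as nn

class CnnDqn(nn.Module):
    """Three hidden convolutional layers (32, 32, 64 filters) followed by the DQN head."""

    def __init__(self, input_size, num_actions):
        super().__init__()
        # 2x2 kernels, stride 1, no padding, ReLU after every convolution
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=2, stride=1, padding=0), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=2, stride=1, padding=0), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=2, stride=1, padding=0), nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),   # down-sampling of the extracted features
            nn.Dropout(p=0.25),            # beta = 0.25 against over-fitting
            nn.Flatten(),                  # one-dimensional feature vector
        )
        # Flattened size for an M x M single-channel input: three 2x2 valid
        # convolutions reduce M to M-3, and the 2x2 max pooling halves it.
        flat = 64 * ((input_size - 3) // 2) ** 2
        self.head = nn.Sequential(
            nn.Linear(flat, 100), nn.ReLU(),   # 100 x 1 fully connected layer
            nn.Linear(100, num_actions),       # one Q-value per RRH on/off action
        )
        # Xavier (Glorot) normal initialization of the convolutional filters
        for m in self.features:
            if isinstance(m, nn.Conv2d):
                nn.init.xavier_normal_(m.weight)

    def forward(self, x):                      # x: (batch, 1, M, M)
        return self.head(self.features(x))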

    4.4.1 State Space

At each time step t, we capture the state features, which consist of the user data rate demand D_u(t), the RRH on/off state v_r(t), and a relational matrix between the users U and the RRHs R. The relational matrix can be constructed as H ∈ R^{R×U} and defined as:

where the entries h_{u,r} indicate the CSI features between the users and RRHs. We then concatenate these three features into a single vector. Thus, the state space becomes:

    4.4.2 Action Space

At each time step t, we define the action based on the RRH on/off state, represented as a_r(t) ∈ {0, 1}. However, we restrict our RL agent to decide the action based on the active set of RRHs A.
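As an illustration of how the state vector of Section 4.4.1 and the binary RRH action could be represented in code, consider the sketch below; the flattening of the relational CSI matrix, the use of channel magnitudes and the toggle-style action mapping are assumptions made for illustration only.

import numpy as np

def build_state(user_demand, rrh_on_off, csi_matrix):
    """Concatenate the three state features s(t) into a single input vector.

    user_demand : (U,) data rate demands D_u(t)
    rrh_on_off  : (R,) binary on/off indicators v_r(t)
    csi_matrix  : (R, U) relational CSI matrix H between RRHs and users
    """
    csi_feature = np.abs(csi_matrix).flatten()   # channel magnitudes, flattened
    return np.concatenate([user_demand, rrh_on_off, csi_feature]).astype(np.float32)

def apply_action(rrh_on_off, action_index):
    """Toggle the on/off state a_r(t) of the RRH selected by the agent."""
    new_state = rrh_on_off.copy()
    new_state[action_index] = 1 - new_state[action_index]
    return new_state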

    4.4.3 Reward

The reward indicates whether to punish or encourage the actions. In our proposed framework, the reward is the objective function defined in (8), i.e., the improvement of EE, and can be written mathematically as:

    4.5 Resource Allocation Optimization

Recalling (6), we consider three essential power consumption components. The state powers (p_{r,A}, p_{r,S}) and the transition power (p_{r,G}) are composed of constant values and can be easily calculated; these powers depend on the current values of the state and action. Therefore, it is necessary to compute the minimum transmit power in order to minimize the total power consumption P_total(t) at each time step t. Thus, the allocation scheme at each time step t depends on the beamforming weights of the active set of RRHs. Therefore, we express the optimization problem as:

The objective is to achieve the minimum transmit power given the states of the RRHs R. The constraint on C_u(t) is determined by the user demand, whereas P_{r,T} indicates the maximum RRH transmit power. Constraint (19a) represents that all user demands must be met, whereas constraint (19b) ensures the transmit power limitation of each RRH. The problem in (19) belongs to the class of convex optimization problems and can be transformed into a second-order cone optimization problem [37]. Such a problem can be solved using iterative approaches [38]. It is worth noting that the optimization may have no feasible solution due to an insufficient number of active RRHs. In this case, a large negative reward is assigned to the DQN agent. To avoid the infeasibility issue, more RRHs are activated to satisfy the user demand. The pseudo-code of the proposed framework is summarized in Algorithm 1.
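To illustrate this per-time-step optimization, the sketch below formulates the beamforming power-minimization problem with CVXPY, using the standard second-order-cone reformulation of the rate constraints (rotating each user's effective channel so that h_u^H w_u is real). The variable names, the mapping from rate demand to a target SINR (with J_m = 1), and the infeasibility fallback are illustrative assumptions rather than the paper's exact formulation of (19).

import numpy as np
import cvxpy as cp

def min_power_beamforming(H, rate_demand, bandwidth, noise_power, p_max):
    """Minimum-power beamforming under per-user rate demands (SOCP form).

    H           : (R, U) complex channel matrix of the active RRHs
    rate_demand : (U,) target data rates D_u(t)
    p_max       : per-RRH transmit power limit P_{r,T}
    Returns the beamforming matrix W, or None if the problem is infeasible.
    """
    R, U = H.shape
    gamma = 2.0 ** (np.asarray(rate_demand) / bandwidth) - 1.0   # target SINR per user
    sigma = np.sqrt(noise_power)

    W = cp.Variable((R, U), complex=True)
    constraints = []
    for u in range(U):
        received = H[:, u].conj() @ W            # h_u^H w_k for every user k
        # SOC reformulation of SINR_u >= gamma_u, taking h_u^H w_u real w.l.o.g.
        lhs = cp.norm(cp.hstack([cp.real(received), cp.imag(received), np.array([sigma])]))
        constraints += [
            cp.imag(received[u]) == 0,
            lhs <= np.sqrt(1.0 + 1.0 / gamma[u]) * cp.real(received[u]),
        ]
    # Per-RRH transmit power limit (19b)
    constraints += [cp.norm(W[r, :]) <= np.sqrt(p_max) for r in range(R)]

    # Minimizing the total beamforming norm is equivalent to minimizing total Tx power
    problem = cp.Problem(cp.Minimize(cp.norm(cp.vec(W))), constraints)
    problem.solve()
    if problem.status not in (cp.OPTIMAL, cp.OPTIMAL_INACCURATE):
        return None   # infeasible: assign a large negative reward and activate more RRHs
    return W.value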

    4.6 Computational Complexity

The computational complexity of the proposed CNN-based DQN (CNN-DQN) algorithm is derived from (19). Since (19) can be transformed into a second-order cone program (SOCP), it can be solved in polynomial time by a standard interior-point method, e.g., [39]. The total number of variables of (19) is R+U, and the total number of constraints is 2R+2U+1. Thus, the worst-case computational complexity per episode is O((RU)^3.5). Therefore, the overall computational complexity of Algorithm 1 is O(R^3.5 U^3.5 K + Ψ·Ω + D + |G_θ|), where K is the number of episodes required for Algorithm 1 to converge, and Ψ·Ω, D and G_θ specify the size of the extracted channel gain feature, the number of experience samples from the replay buffer, and the number of hidden layers, respectively. Similarly, the computational complexity of [14] is O(R^3.5 U^3.5 K + D + |G_θ|). The computational complexity of the proposed algorithm is higher than that of [14], although the proposed algorithm limits the size of the channel gain feature at the input of the network state. However, the signaling overhead of our proposed algorithm is much lower than that of [14], because users do not have to report their information to the respective RRHs; the RRHs record all the information between RRHs and users, which reduces the signaling burden of the network.

Table 2. Simulation parameters.

V. RESULTS AND DISCUSSION

In this section, we describe the simulation settings and illustrate the performance of our proposed CNN-DQN algorithm. We compare our proposed algorithm with the Nature DQN and the traditional approach. Without loss of generality, we take the traditional approach to be a full coordinated association scheme, denoted as FA, where all RRHs are always turned on, followed by solving the convex optimization problem (19). This differs from the DRL-based approaches, where the agent learns and switches certain RRHs on/off based on the user demands and channel state information. The performance evaluation covers energy efficiency, total power consumption, and user QoS satisfaction. We vary the user demand in the range of [10, 60] Mbps with a step size of 10 Mbps. Furthermore, we consider two different scenarios to verify the effectiveness of increasing the number of RRHs when the RL agent cannot find a feasible solution to satisfy the user QoS demand. All simulations are performed in Python 3.7. First, we run 1000 training episodes for the DRL agent to learn the environment behavior, and then the performance is measured over 100 testing episodes. All other simulation parameters used in our work are summarized in Table 2.

    5.1 Convergence Analysis

Figure 4. Convergence of the algorithms.

We first compare and analyze the convergence of the algorithms. We evaluate the convergence when the number of RRHs is R=8 and the number of users is U=4. It can be seen from Figure 4 that both algorithms converge. The proposed CNN-DQN solution converges to its optimum at around 790 episodes. However, it can be clearly seen that the optimal value obtained at convergence by the proposed solution is far greater than that of the DQN solution. At the start of the algorithm, the convergence rate of the proposed solution is similar to that of the DQN solution. The DQN algorithm converges at episode 900; however, at that point, the weighted energy efficiency calculated by the proposed solution is already better than that of DQN. Therefore, when the proposed solution converges, the optimal energy efficiency of the system is superior to that of the DQN solution.

    5.2 Effect of Learning Rate

The learning rate γ is an important hyperparameter in machine learning (ML). γ is used to tune the NN to achieve optimal performance for the problem; therefore, it is necessary to choose an optimal value for γ. The larger the value of γ, the higher the chance of overfitting the model; however, a larger value of γ increases the learning speed of the NN. The smaller the value of γ, the easier it is to prevent the model from overfitting; however, a smaller value of γ requires tremendous computing power to train the NN. For epoch i, the learning rate γ_i is given by:

where d and γ_init indicate a positive integer and the initial learning rate, which control the decaying speed. In this work, we assume d ∈ {0.1, 0.2, 0.3, ..., 1.0}. The γ_i is constant for all epochs when d = 0. As shown in Figure 5, γ_i decreases sharply as d increases. Therefore, to avoid overfitting the NN, we use γ = 0.001 = 10^-3 and d = 1.
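The decay formula itself is not reproduced above; assuming the common inverse-time decay schedule, which matches the described behavior (constant when d = 0 and decaying faster as d grows), a sketch would look as follows. The exact formula is our assumption, not taken from the paper.

def decayed_learning_rate(gamma_init, d, epoch):
    """Assumed inverse-time decay: gamma_i = gamma_init / (1 + d * epoch).

    With d = 0 the learning rate stays constant; a larger d makes it decay
    more sharply, consistent with the trend shown in Figure 5.
    """
    return gamma_init / (1.0 + d * epoch)

# Example with the values used in the paper: gamma_init = 1e-3 and d = 1
rates = [decayed_learning_rate(1e-3, 1, i) for i in range(5)]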

Figure 5. Effect of the learning rate for different decay values over epochs.

    5.3 Power Minimization

This section demonstrates the power consumption performance of the proposed CNN-DQN algorithm for different values of the user demand. We compare our proposed algorithm with the Nature DQN and FA, as shown in Figure 6. We first consider R=6 and U=4. It can be observed from Figure 6 that the power increases rapidly with each increase in user demand for all three approaches. The proposed solution saves 5-10% more power at all points of user demand. It can also be noted that the proposed approach and the DQN-based approach consistently outperform the FA-based approach. The reason comes from learning the environment: at each time step t, the learning agent takes the best possible action from the action space, whereas FA randomly chooses an action from the current action space and does not learn anything from the environment. However, all three approaches become infeasible to satisfy the user QoS demand after it reaches 50 Mbps, because an insufficient number of active RRHs is available to satisfy the user requirements. To avoid this problem, we increase the number of RRHs to R=8 with the same number of users U=4, as shown in Figure 6. It can be concluded that the proposed solution can still save significantly more power while satisfying the user QoS requirement. However, increasing the number of RRHs affects the power consumption: we can observe from Figure 6 that when the user demand is 50 Mbps, the power consumed by the system is 48.73 W for R=6, whereas it is 55 W for R=8.

Figure 6. Comparison of the proposed algorithm with other algorithms in terms of power saving for different user demands.

Figure 7. Comparison of the proposed algorithm with other algorithms for EE maximization under different user demands.

    5.4 Energy Efficiency Maximization

Figure 8. Comparison of the proposed solution for EE maximization versus power consumption for R=6 and U=4.

Figure 9. Comparison of the proposed solution for EE maximization versus power consumption for R=8 and U=4.

In Figure 7, we plot the EE performance against different user demands. The EE increases linearly with increasing user demand. It can be noticed from Figure 7 that the DRL-based methods outperform the FA approach. In the FA approach, the EE performance depends on the immediate network state, making the decision only for the current action space. The DQN-based approach improves the EE performance at each point of increasing user demand compared to the FA-based approach. However, the DQN-based approach contains a large number of state-action pairs, which increases the computational complexity and reduces the system performance; it still achieves 4-8% better performance than the FA-based approach. From Figure 7, we can observe that our proposed approach reduces the training parameters and outperforms the other two approaches for increasing user demand. The proposed approach obtains 5-12% better performance at every point of user demand for both scenarios. These results are evidence for using a CNN-DQN approach in a high-mobility scenario. In Figure 8, we plot the EE against the power consumption for R=6 and U=4. It can be seen from Figure 8 that, at the start, the EE increases slightly over a small increase in power for all approaches. However, the EE starts to decline after reaching its maximum value for all three approaches, because high transmit power is required to satisfy the user QoS demand. From Figure 8, we observe that the proposed approach achieves a maximum EE of 4.10 Mbits/J with a power consumption of 48.73 W. Similarly, DQN and FA achieve 3.92 Mbits/J with a power consumption of 51.95 W and 3.61 Mbits/J with a power consumption of 54.07 W, respectively. A similar trend applies to Figure 9, where the number of RRHs is increased to R=8 with the same number of users U=4. As shown in Figure 9, the proposed approach achieves an EE of 3.95 Mbits/J with a power consumption of 61.15 W, while the DQN-based and FA-based approaches reach lower EE values (down to 3.55 Mbits/J) with power consumptions of 63 W and 65 W, respectively. These figures show the effectiveness of the proposed method in achieving higher EE with less power consumption.

Figure 10. Average EE performance vs. transmit power.

    5.5 Transmit Power Selection

Figure 10 demonstrates the average EE performance for different values of transmit power. From Figure 10, we observe that our proposed approach always outperforms the DQN and FA approaches in terms of EE. At the start, when the transmit power of the RRHs is very low, all three approaches achieve almost the same level of average EE. As the transmit power increases, the average EE performance increases linearly. The proposed solution achieves a higher value of average EE at every level of transmit power, which shows the effectiveness of the proposed approach for different values of transmit power.

VI. CONCLUSION

In this paper, we proposed a CNN-based DQN (CNN-DQN) approach in downlink CRAN to simultaneously balance the EE performance and satisfy the user QoS demand. First, we combined the CNN approach with DQN, where the CNN phase is responsible for extracting the input state information. The extracted CNN features are fed to the input of the DQN, which dynamically performs the RRH switching decision based on the energy consumption and user demand. Then, the RA optimization scheme was formulated based on the user constraints and transmit power to balance the EE performance and satisfy the user QoS requirements. Finally, comprehensive simulation results showed that the proposed solution achieves 10-15% higher efficiency than the baseline solutions and best balances the EE performance and user QoS satisfaction in different scenarios.

    ACKNOWLEDGEMENT

This work is supported by the Universiti Tunku Abdul Rahman (UTAR), Malaysia, under UTARRF (IPSR/RMC/UTARRF/2021-C1/T05).
