
    Plume: Lightweight and Generalized Congestion Control with Deep Reinforcement Learning

China Communications, December 2022 issue

Dehui Wei, Jiao Zhang, Xuan Zhang, Chengyuan Huang

State Key Laboratory of Networking and Switching Technology, BUPT, 100876, China

Abstract: Congestion control (CC) has always been an important issue in networking, and research enthusiasm for it has never diminished in either academia or industry. In recent years, owing to the rapid development of machine learning (ML), the combination of reinforcement learning (RL) and CC has produced striking results. However, these complicated schemes lack generalization and are too heavyweight in storage and computation to be implemented directly on mobile devices. To address these problems, we propose Plume, a high-performance, lightweight and generalized RL-CC scheme. Plume proposes a lightweight framework that reduces overhead while preserving the original performance. Besides, Plume innovatively modifies the parameters of the reward function during the retraining process, so that the algorithm can be applied to a variety of scenarios. Evaluation results show that Plume retains almost all the performance of the original model while the model size and decision latency are reduced by more than 50% and 20%, respectively. Moreover, Plume performs better in some special scenarios.

Keywords: congestion control; deep reinforcement learning; lightweight; generalization

    I.INTRODUCTION

Researchers’ enthusiasm for congestion control (CC) [1-8] has never faded, and papers about it are published every year. With the rapid growth of user traffic, this issue attracts widespread attention from both industry and academia. Under dynamic and complex network conditions, dynamically adjusting the transmission rate to make full use of network resources is both the function of CC and its challenge. To meet this requirement, a good CC scheme first needs to perceive the network conditions. It should consider the impact of various factors, including throughput, dynamic latency, and packet loss rate, and quickly select the optimal sending rate. Second, it should apply widely to a variety of scenarios, such as highly dynamic networks, scenarios that require low latency, highly competitive links, and so on. Obviously, traditional CC schemes fail to meet these requirements. They often rely on assumptions about the network and determine whether the network is congested based on certain signals. When those signals are misleading, performance degrades greatly and convergence takes a long time. They also often require manual parameter tuning for specific scenarios to be efficient.

In recent years, with the rapid development of machine learning (ML), learning-based CC schemes have sprung up, because ML is well suited to learning the complex behavior of the network and has sufficient ability to consider the influence of multiple factors on decision-making. Learning-based algorithms can learn the mapping from historical feedback to behavior directly from the environment, without manual tuning for special scenarios. Learning-based solutions can be divided into online learning and offline learning. Due to the need to explore the rate, online learning schemes [2, 4] usually have a longer convergence time. Since offline methods use trained models, they have a short decision time. Among offline learning schemes, reinforcement learning (RL) is usually chosen because network data is label-free and CC is a continuous decision-making problem. RL actively learns the relationship between input and output and seeks a policy that maximizes the long-term cumulative reward, so it can respond quickly to changes in network conditions without being confused by noisy signals.

Existing prominent RL-based CCs include Remy [8], Aurora [9], and Indigo [10]. Despite the advantages indicated in the previous paragraph, they are still difficult to deploy and apply in practice, because they have the following two problems. The first is overhead. The pursuit of performance by researchers makes RL-CC models increasingly complicated. Training and using them requires huge computing and memory resources that mobile devices (such as mobile phones, smart TVs, drones, etc.) cannot satisfy. However, the latest statistics reveal that these mobile terminals account for a large part of client demand. The second is the lack of flexibility and generalization. Generalization here means that the algorithm can be custom designed for a scenario without excessive cost. The goal of RL-CC is to maximize the long-term cumulative discounted reward. Therefore, current algorithms can only trade off between generality and performance, and may cause excessive bandwidth preemption or insufficient bandwidth utilization. Moreover, they depend strongly on the training environment: when the environmental conditions are too far from those seen in training, performance drops considerably. And because of the overhead problem, it is difficult to retrain them in the application environment. Therefore, we still need to design a new adaptive scheme that addresses these problems.

To address the above problems, we design Plume, a high-performance, lightweight and generalized RL-based CC scheme. Inspired by the Lottery Ticket Hypothesis [11], we still use a DRL algorithm to output a CC policy. At the same time, Plume designs a pruning policy that faithfully preserves the performance of the complex model while shrinking it. Moreover, we analyze the model structure of RL-based CC in detail and provide a method to replicate small networks, so that the model can be truly reduced and the training process accelerated. In the retraining stage, Plume innovatively explores the idea of customizing the reward function for specific scenarios to achieve generalization. The experiments verify that Plume faithfully retains the performance of complex schemes while reducing the model size by more than 50%, to a size that can be deployed on low-power devices, and reducing the decision latency by about 20%. Plume not only saves storage resources but also saves operating expenses. Moreover, retraining the pruned model structure with a modified reward function takes only a small number of steps to achieve the desired tendency.

Contributions: In summary, our key contributions are:

• We analyze the advantages of RL-based CC and illustrate the obstacles to using RL-based CC in practical applications, which motivates the design of Plume, a high-performance, lightweight and generalized CC algorithm using RL.

• In Plume, we first propose a pruning method to lighten the RL model and provide theoretical analysis on how to prune. We then make Plume generalized through customized reward functions and simple retraining.

• We evaluate Plume in a simulation framework. Results show that the pruned model retains almost all the performance of the original model while the size and decision latency are reduced by more than 50% and about 20%, respectively. Moreover, customized retraining exhibits better performance than the original model in some specific scenarios.

In what follows, we first summarize the ideas of related work and their shortcomings in §II, and present the motivation for designing Plume in the same section. In §III, we describe the challenges encountered in achieving lightweighting and generalization. In §IV, we first give an overview of Plume's design and then dig into its details. In §V, we evaluate the performance of Plume and report the results. We conclude the paper in §VI.

    II.RELATED WORK AND MOTIVATION

    2.1 Related Work

Table 1. Representative congestion control algorithms.

    2.1.1 Existing CC Algorithms

The research of CC has a long history. We consider a selection of representative CC algorithms and summarize them in the following three groups.

Heuristic schemes: As summarized in table 1, non-learning-based algorithms determine network conditions from locally collected signals such as packet loss and round-trip time (RTT). Early TCP variants, including TCP Tahoe, TCP Reno [19], and TCP NewReno [20], use packet loss as a signal of congestion and control the congestion window (CWND) with the AIMD (additive increase, multiplicative decrease) heuristic. When non-congestion packet loss occurs, they face severe performance degradation. BIC and Cubic later tried new increment functions for the CWND. However, algorithms relying on packet loss induce high latency, so latency-aware algorithms like TCP Vegas and Copa were derived. The recent proposal BBR treats the network as a white box and adjusts the sending rate according to its own assumed network conditions. None of them can accurately perceive changes in network conditions and respond promptly; we elaborate on this in §2.2.1.

Learning-based: Existing learning-based solutions can be divided into online and offline learning. Online learning-based schemes include PCC Allegro and PCC Vivace. They treat the Internet as a black box, adjusting the sending rate toward the most favorable utility. But they do not learn network utilization from experience, leading to over-utilization and slow convergence. Among offline schemes, TCP Remy models CC as a decision problem under uncertainty and uses network hypotheses and prior knowledge to map observations to the CWND and pacing rate. It helps improve transmission efficiency but has poor adaptability: when the environment violates its network assumptions, performance is greatly reduced. Remy also needs to store a lot of information, which limits its practicality. More recently, due to the unlabeled nature of CC, advanced learning algorithms generally use RL to train a model. QTCP designs CC with RL by combining Q-learning and TCP. Aurora generates a DRL-based policy that maps observed network statistics to the sending rate, while Indigo employs imitation learning. However, both suffer from over-utilization, and their practicality is limited to the network conditions seen in training.

Environment customization: Other algorithms are designed for specific environments, such as DCTCP, Timely, and pFabric for data centers, and Sprout, which focuses on cellular networks. They often need the cooperation of the underlying hardware, which is not universal, and the cost of updating them is high.

    2.1.2 Existing Lightweight Methods

Popular methods of network compression include knowledge distillation [21], network pruning [22-24] and weight quantization [25-27]. Knowledge distillation uses a teacher model to guide the learning of a student model. The student model is trained with supervised information provided by the teacher, so its precision is hard to guarantee, and because of the limitations of the teacher model, the generalization of the algorithm is not easy to improve. Weight quantization hashes the network weights into different groups before training, and the weight values within each group are shared. Because only the weights and hash indices need to be stored, storage resources can be saved. However, when the algorithm is used, the shared weights still need to be restored to their original positions, which cannot save computing resources. Some works use binary/ternary storage methods, with moderate loss in accuracy. Network pruning removes unimportant parameters or neurons to compress the model. In contrast to the other methods, network pruning has more practical implications. We describe the reasons for this selection in more detail in section 3.2.

Recent research that applies lightweighting includes PiTree [28], which transforms adaptive bitrate streaming (ABR) algorithms into decision trees to save operating expenses. To the best of our knowledge, ours is the first work that applies lightweight methods to RL-based CC algorithms.

    2.2 Motivation

    2.2.1 Advantages of RL-Based CC

    We will explain the benefits of using RL-based CC from the following two aspects.

Rapid response to changes in network conditions. RL-based algorithms are able to learn complex patterns of behavior: the trained agent can find complex relationships between the environment and its actions. Consider a simple network configuration: the bottleneck link bandwidth changes every 5 seconds as shown in figure 1, with a minimum latency of 10 ms. The shaded part is the actual bandwidth, which is also the ideal sending rate when there is only one flow. We take several representative and widely used CC schemes, namely TCP Cubic (a TCP variant widely used in Linux and other systems), BBR (released by Google), Copa (released by Microsoft), Sprout (for cellular networks), and PCC-Vivace (a representative online-learning algorithm), emulate them over this network, and report their sending rates. As shown in figure 1, they fail to perceive the network condition or react quickly to changes, and behave too aggressively or too sluggishly, causing congestion or insufficient bandwidth utilization. The sending decision is affected by many factors, and it is difficult for traditional algorithms to find the complicated relationship between input and output. Therefore, Plume employs an RL framework that can consider many factors such as packet loss, RTT, and throughput, and generates an input-output policy based on long-term experience to keep the sending rate close to the expected behavior.

Distinguish noisy information. As described in the related work, traditional heuristic algorithms determine the network condition based on locally collected information. But this information is sometimes misleading: its appearance does not indicate the real condition of the network, as with random packet loss. Relying on such noisy information leads to false hypotheses about congestion, and thus the algorithm chooses the wrong action. For instance, the well-known TCP family (e.g. Reno, BIC, Cubic) uses packet loss as a congestion signal, yet there are various reasons for packet loss. Traditional loss-based algorithms cannot identify non-congestion packet loss. As a result, they suffer a great reduction in performance (10x worse in [4]) on highly dynamic links with non-congestion loss (e.g. satellite links, cellular networks). Companies and research institutions have launched many "variant" protocols to compensate for the needs of different scenarios, but a single piece of bad information can still deal a devastating blow to performance. As shown in figure 2, we add 1% random packet loss to a dynamic link; even the robust Cubic fails to fully utilize the bandwidth of the link. In contrast, Plume effectively identifies non-congestion packet loss. RL-based algorithms break through the limitations of traditional models and learn empirical network rules, guided by the long-term cumulative reward. Therefore, Plume can filter out noisy information and maintain high link utilization.

    2.2.2 Problems with Existing RL-Based CC Algorithms

Although RL-based CC has the above advantages, it is difficult to deploy in practice. Existing approaches generally suffer from two issues: 1) they are too heavyweight, and 2) they lack generalization. Next, we analyze these two issues in detail; they are also the driving force behind Plume.

Make the model lightweight. Extensive previous work has shown that increasing the complexity of a model can improve its expressiveness, but each extra dimension significantly slows down the training process. The long training process and the high demand for computational and storage resources greatly hinder the application of RL-based CC on low-power devices such as mobile phones, drones and smart TVs. All of this poses a challenge to such advanced algorithms. What is more, the number of mobile-device users is growing rapidly. More and more enterprises are beginning to attach importance to lightweighting and constantly pursue the limits of deep-learning computation, hoping that its advantages can be realized on low-power devices. RL-based CCs show overwhelming strengths, but also leave many limitations. This motivates us to propose a solution that delivers the performance of RL-based CCs on such devices. [29] shows that a model needs to be less than 2 MB to be deployed on low-power devices, which is exactly the lightweight goal we pursue.

Make the scheme general. As discussed in the introduction, existing RL-based algorithms make a tradeoff between performance and generalization. Although they consider many factors, they can only strike a balance among a variety of metrics. As shown in figure 3, we run Aurora in a simulator where the link bandwidth varies between 1 Mbps and 2 Mbps every 2 s. Because its reward can only balance throughput against latency, the algorithm over-utilizes the link. Remy likewise states that it makes a compromise between generality and performance and does not clearly capture the user's true preferences. The lack of generalization greatly limits the scope of application of RL-based CC in practice. This motivates us to make the algorithm customizable for training toward different needs.

    III.CHALLENGES AND DESIGN CHOICE

    3.1 Challenges

Achieving high performance. A prerequisite for achieving lightweighting without losing precision is having an optimal pre-trained model. For CC as a decision problem, the first step is to determine the delivery rate based on historical information so as to achieve the expected performance. This expected performance can be higher link utilization, lower latency, or less packet loss. In RL algorithms, this expectation is expressed by the reward function, so the design of the reward function is critical, and making the initial model encompassing enough is our challenge. Once the reward function is determined, our goal becomes maximizing its total score. In theory, we could find a policy that maximizes it, but the search can be difficult: the history observed so far contains too much information, and we need to pick out the most crucial parts of it. Another issue is when to make decisions.

Table 2. The statistics of the input.

How to select and apply a lightweight method? The challenges of lightweighting lie mainly in the choice of method and its application. First, retaining the behavior of the original model is the basic requirement. If too much accuracy is lost while compressing the model and Plume cannot outperform traditional heuristics, lightweighting makes no sense. However, no prior work applies lightweight methods to RL, so choosing and designing an algorithm that achieves lightweighting without losing precision is a challenge. Second, Plume must be able to reduce the overhead easily, including shrinking the model and accelerating the computation, without any external support. If Plume needed new or modified hardware, it would be contrary to the original intent of generality. Most existing lightweighting papers focus on theoretical analysis and do not truly realize the model reduction; they often do not mention how to implement compression in practice. How to really reduce the model size and speed up computation is what we need to think about. Existing works analyze underlying networks such as multilayer perceptrons (MLP) and convolutional neural networks (CNN), whereas RL models are composed of multiple interrelated underlying networks. It is a challenge to achieve a real reduction of the model without affecting its performance.

Keep high performance and generalization. In pre-training, we can maximize the value of one expected reward function; however, the model cannot accommodate multiple objectives. In [8], Remy states that it makes a tradeoff between generalization and performance. Remy uses equation (1) to score a flow,

U_α(x) − δ · U_β(y),    (1)

where x and y represent the throughput and average round-trip delay of the connection, respectively, and U_α(x) and U_β(y) are the corresponding utility terms. α and β express the fairness-vs.-efficiency tradeoffs in throughput and latency, respectively, and δ expresses the balance between latency and throughput. Remy is an offline algorithm that allows users to select only one preference, so keeping high performance under different objectives is a challenge.

    3.2 Design Choice

Due to the challenges presented in §III, Plume makes the following choices:

Input variables: We synthesize a large number of experiments and analyze the essence of CC to pick the statistics in table 2 as input. These follow the idea of [9]; in addition, we add the packet loss rate parameter. Many existing state-of-the-art learning-based CCs optimize the reward function toward high throughput, low latency and low packet loss, and we likewise reward and penalize these three variables. Plume's input is mainly calculated and collected around them. These parameters cover a range of statistics that are representative of the network condition and greatly reduce the state the sender must track.
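Since the contents of table 2 are not reproduced here, the following is only a hedged sketch of how such per-monitor-interval statistics might be assembled, mirroring the Aurora-style inputs (latency gradient, latency ratio, sending ratio) plus the loss rate that Plume adds; all class, field, and function names are illustrative rather than Plume's actual code.

```python
from dataclasses import dataclass


@dataclass
class MonitorInterval:
    """Raw counters a sender could collect over one monitor interval (MI)."""
    bytes_sent: int
    bytes_acked: int
    bytes_lost: int
    mean_rtt: float   # seconds, averaged over the MI
    min_rtt: float    # smallest RTT observed on this flow so far
    rtt_slope: float  # fitted d(RTT)/dt over the MI


def build_state(mi: MonitorInterval) -> list:
    """Turn one MI into the per-interval statistics fed to the RL agent."""
    latency_gradient = mi.rtt_slope
    latency_ratio = mi.mean_rtt / mi.min_rtt if mi.min_rtt > 0 else 1.0
    sending_ratio = mi.bytes_sent / mi.bytes_acked if mi.bytes_acked > 0 else 1.0
    loss_rate = mi.bytes_lost / mi.bytes_sent if mi.bytes_sent > 0 else 0.0
    return [latency_gradient, latency_ratio, sending_ratio, loss_rate]
```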

Compression method: As described in the related work, we choose the network pruning technique, considering both precision preservation and practicability. Pruning can solve the second challenge described in §3.1 for two reasons:

• Over-parameterization: During the training stage we need a large number of parameters to capture the tiny details in the data, but once training is finished and we move to the inference stage, we no longer need so many parameters. Conversely, if the model is too small, its expression space is limited, which results in underfitting. Neural network pruning can remove a large fraction of parameters (over 90% [11]).

• Lottery ticket hypothesis: The paper [11], published in 2019, shows that the sparse network derived by pruning the initial network trains better than a small network initialized directly. By training a complex deep network, we make the model more expressive; by pruning it, the performance of the model can be retained more completely while achieving lightweighting.

    IV.DESIGN

    4.1 Overview

Figure 4 shows the overview of Plume. We introduce the algorithm in the following three parts:

• Training the initialization model (§4.2.1): In order to fully learn the rules of the network, Plume first trains a complex DRL model based on previous experience. In this way, the complex model can capture every detail of the environment that impacts the proper action.

• Model pruning (§4.2.2): Plume lightens the deep network using structured pruning [30] while preserving most of the performance. We propose a network replication approach to obtain the compressed model.

• Retraining the model structure (§4.2.3): In the retraining process, Plume modifies the reward function. We obtain the "winning ticket" (the effective model structure) through the second step. Our experiments show that modifying the reward function achieves the desired behavior, and the result is obtained faster than training and pruning another complex model.

    Next we will give a detailed introduction to these three modules.

    4.2 Design Details

    4.2.1 Training the Initialization Universal Model

Plume's task is to find the best mapping from states to actions; this mapping is called the policy in RL. Plume's job at this stage is to pre-compute the policy to be adopted by the model. Training the model requires the following elements, which follow the idea of [9]:

Input and output: In §3.2, we illustrate the input to the model. At each monitor interval (MI), the agent collects observation values; the algorithm then converts them into the input parameters and feeds them into the model to select the sending rate. Since CC has a huge action space, the output of the agent is a parameter a_t that is used to correct the current sending rate x_t, with each modification proportional to a_t · x_t. The sign of a_t determines whether the rate increases or decreases.
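As a hedged illustration of this update rule: Aurora [9] applies the action multiplicatively with a small damping factor, and one plausible reading of the description above is the same scheme (the function name and the damping value below are illustrative, not taken from the paper).

```python
def apply_action(rate: float, action: float, damping: float = 0.025) -> float:
    """Adjust the current sending rate x_t with the agent's output a_t.

    A positive action increases the rate and a negative one decreases it;
    the damping factor keeps each single-step change small.
    """
    if action >= 0:
        return rate * (1.0 + damping * action)
    return rate / (1.0 - damping * action)
```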

Reward: The reward defines the goal in an RL problem, guiding the tendency of policy changes. Different applications place different demands on the network. We first design Plume for the most common objective of CC, that is, high bandwidth and low latency. However, random packet loss is severe on high-loss networks, and packet loss is prone to occur when greedily chasing high bandwidth, so we include packet loss as a penalty term in the reward function. Therefore, following Aurora, we formalize the reward as

reward = α · throughput − β · latency − c · loss.    (3)

That is, high throughput is rewarded, while latency and packet loss are penalized, when training the policy.
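A minimal sketch of such a reward as a parameterized helper; the default weights 10, 1000 and 2000 are the values reported in §5.2, while the function name and signature are illustrative.

```python
def reward(throughput: float, latency: float, loss: float,
           alpha: float = 10.0, beta: float = 1000.0, c: float = 2000.0) -> float:
    """Linear reward of equation (3): reward throughput, penalize latency and loss."""
    return alpha * throughput - beta * latency - c * loss
```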

The training structure and algorithm: The model uses a fully connected neural network architecture and learns the policy with the Proximal Policy Optimization (PPO) [31] algorithm. PPO is a kind of policy gradient method based on the Actor-Critic framework. Plume tries to compute a new policy in each iteration while ensuring that the deviation from the policy of the previous iteration is relatively small, which suits a complex and huge exploration environment like the network. In the algorithm, the agent plays the role of the actor and trains its policy π with network parameters θ. The agent continuously interacts with the environment to form a trajectory τ = {s_1, a_1, s_2, a_2, . . .}.

Here s is the state and a is the action. The reward obtained by the sequence τ is the sum of the rewards obtained at each step, denoted R(τ). Therefore, when the actor's policy is π, the expected reward that can be obtained is E_{τ∼π_θ}[R(τ)].

In order to maximize the expected reward, gradient-based updates are applied to the network parameters θ. In PPO, to speed up training, importance sampling is used: experience is collected with a behavior policy π_old whose parameters are θ′, while π is the target policy being updated. The expected objective then becomes

J^{θ′}(θ) = E_{(s_t, a_t)∼π_{θ′}} [ (π_θ(a_t|s_t) / π_{θ′}(a_t|s_t)) · A^{θ′}(s_t, a_t) ].    (5)

A^{θ′}(s_t, a_t) is the advantage of (s_t, a_t). If the probability distributions of π and π_old differ too much and the learning step is too large, the policy will fluctuate greatly. The algorithm therefore adds the Kullback-Leibler (KL) divergence to measure the gap between the two distributions and limit the scope of each update, so the update maximizes J^{θ′}(θ) minus a penalty term λ · KL(π_θ, π_{θ′}), where the expression of J^{θ′}(θ) is described in Equation (5).
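A minimal NumPy sketch of this KL-penalized surrogate, included only to make the update concrete; it is a simplified stand-in (with an approximate KL estimate) rather than the PPO implementation Plume actually trains with.

```python
import numpy as np


def ppo_penalty_objective(logp_new: np.ndarray, logp_old: np.ndarray,
                          advantages: np.ndarray, kl_coeff: float = 0.2) -> float:
    """Importance-sampled policy objective with a KL penalty (PPO-penalty).

    logp_new / logp_old are log-probabilities of the taken actions under the
    target policy pi_theta and the behavior policy pi_theta', respectively.
    """
    ratio = np.exp(logp_new - logp_old)        # pi_theta(a|s) / pi_theta'(a|s)
    surrogate = np.mean(ratio * advantages)    # J^{theta'}(theta), Equation (5)
    approx_kl = np.mean(logp_old - logp_new)   # crude sample estimate of the KL gap
    return surrogate - kl_coeff * approx_kl    # maximize this with gradient ascent
```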

    4.2.2 Model Pruning

Following the discussion in §3.2, the structured pruning scheme stands out. To apply it to the CC model, we need to solve two problems: 1) how to decide which neurons to prune, and 2) how to make the model truly smaller and reusable. Next, we explain our design around these two issues.

Theoretical foundation of pruning. The basic method is to prune the initial model according to the importance of its neurons: the closer a weight is to 0, the less impact it has on the result. Pruning a neuron is equivalent to cutting all the connections associated with it; however, it is unlikely that all connections of a neuron are close to 0, so we use the L2 norm to measure the sparsity of a neuron. Why use the L2 norm to evaluate the importance of neurons? For an n-dimensional feature X = (x_1, x_2, x_3, . . . , x_n), the L2 norm is

||X||_2 = sqrt(x_1^2 + x_2^2 + · · · + x_n^2).    (7)

The L2 norm amplifies the effect of the values, especially small ones; it approaches its minimum when as many weights as possible in a neuron are close to zero. In this way, neurons with lower scores have less influence on the model.

Our goal is then to find the remaining neurons in each layer that maximize the pruning objective defined in (8),

max Σ_l Σ_n ||W_{l,n}^g||,    (8)

where l indexes each layer of the model network, n indexes the remaining neurons in that layer, g is the group of weights related to neuron n, and ||W_{l,n}^g|| is the norm of this weight group. We use the L2 norm to measure the sparsity of W_n, so the original problem becomes maximizing the L2 norm of the remaining neurons, and the score of each neuron is calculated as

score(l, n) = ||W_{l,n}^g||_2.    (9)
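A hedged NumPy sketch of this scoring step: each neuron of one fully connected layer is scored by the L2 norm of its weight group, and the highest-scoring fraction is kept. The matrix orientation (rows correspond to neurons) and all names are assumptions made for illustration, not Plume's actual code.

```python
import numpy as np


def score_and_select(weights: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Rank the neurons of one layer by the L2 norm of their weight group.

    weights has shape (n_neurons, n_inputs); returns the indices of the
    neurons to keep, e.g. the top half when keep_ratio = 0.5.
    """
    scores = np.linalg.norm(weights, ord=2, axis=1)      # equation (9) per neuron
    n_keep = max(1, int(round(keep_ratio * weights.shape[0])))
    keep = np.argsort(scores)[-n_keep:]                  # largest-norm neurons survive
    return np.sort(keep)
```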

Reduce the model. The last part described how to evaluate the importance of neurons; this part explains how to perform the pruning. The existing work referenced above does not describe how to apply pruning in practice; it simply sets the relevant weights to 0. In order to achieve a truly lightweight model, we designed and tried several approaches. 1) Express the tensor in a sparse way, i.e. store only the positions and values that are not 0. But then the tensor must be parsed whenever another application reuses the model, so it cannot work directly; the restored matrix still contains zeros and cannot speed up the computation. 2) Directly modify the stored model's matrices to remove the weights that are zero. But neither TensorFlow nor PyTorch supports this kind of operation well, and it destroys the original model, causing a loss of reusability. To solve these problems, we use network reproduction: according to the pruning percentage, we build a new model the same size as the remaining network and copy the unpruned parameters into it. In this way the original model is not broken, yet we obtain a truly reduced model. The difficulty is how to build the small model: because the RL algorithm we use is complicated, the resulting model network structure is also very complicated. To clarify the model architecture, we print out each layer of the model and analyze it carefully, as shown in figure 5.

Neural Architecture Learning. In figure 5, the solid black lines represent the flow of information, the solid blue lines represent back-propagation and weight updates, and the red dotted line is part of the loop update. It can be seen that the model saved by the PPO algorithm stores two policies, π and π_old. Each policy is divided into a critic network (the vf network) and an actor network (the pol network), and there is also a q-value network. Each network is fully connected. The pruning process only considers a neuron's lower-layer connections, but truly reducing the model requires removing all connections related to that neuron. The stored weight matrix of each layer represents the connections between the neurons of this layer and the neurons of the lower layer, and the pol layers and vf layers are stored alternately. For this reason, we first separate the pol layers and vf layers to find the matrix elements related to the neurons pruned in the lower layer. Plume then copies the remaining elements into the new network to form a new model, ready to be retrained for different scenarios.
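A minimal sketch of this network-reproduction step, under the assumption that each fully connected layer is stored as a (weights, bias) pair whose rows are indexed by neurons: removing neuron i of layer l drops row i of its weight matrix and bias, plus the matching column of the next layer's matrix. This is illustrative NumPy, not Plume's actual TensorFlow code.

```python
import numpy as np


def copy_pruned_layer(w: np.ndarray, b: np.ndarray, w_next: np.ndarray,
                      keep: np.ndarray):
    """Copy the surviving neurons of one layer into a genuinely smaller network.

    w:      (n_neurons, n_inputs) weight matrix of layer l
    b:      (n_neurons,) bias vector of layer l
    w_next: (n_next, n_neurons) weight matrix of layer l+1
    keep:   indices of the neurons of layer l that survived pruning
    """
    w_small = w[keep, :]            # drop the pruned neurons' incoming weights
    b_small = b[keep]               # and their biases
    w_next_small = w_next[:, keep]  # drop the matching columns of the next layer
    return w_small, b_small, w_next_small
```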

In pruning, we use a "one-shot" approach rather than the common iterative one: we cut off the excess neurons in the chosen proportion all at once. Previous work uses a prune-then-fine-tune loop to ensure performance is preserved; for example, to prune 50%, it would first prune 10%, then retrain, and so on iteratively. Plume, on the other hand, directly prunes 50% of the neurons.

We reserve retraining for when the model is employed in different scenarios. Making the model lightweight ensures it can run on the remote end, and it lays the foundation for customized training afterwards. The experiments in §5.2 demonstrate that pruning in this way also preserves most of the performance of the original model.

    4.2.3 Customized Retraining

Another disadvantage of RL-based CC is its lack of generalization, which is also a common problem of traditional heuristic algorithms. As shown in figure 3, in order to maintain high throughput, some latency has to be sacrificed, causing packet loss. We also tried other clean-slate reward functions. In [32], the reward function is designed around an important metric called Power, defined as Power = throughput/delay [33]. The results show that the same problem still exists; the detailed results are given in §5.5. Although this meets the general requirements of high throughput and low latency, pursuing more extreme performance in different scenarios requires customization for each network.

In order to solve the above problems, we innovatively propose to customize the reward function for the environment in the retraining phase. Retraining usually uses iterative pruning, but that is not our requirement and is better suited to unstructured pruning. We therefore adopt one-shot pruning as mentioned before; what matters is keeping the network structure after pruning, since this structure preserves the connection between input and output. When retraining, the framework of the reward function is still based on equation (3); we train the small model with different α, β, c combinations according to the needs, and the detailed results are described in §V.
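As a usage illustration, the customizations evaluated in §5.6 and §5.7 can be expressed as different (α, β, c) combinations passed to a reward helper such as the one sketched in §4.2.1; the scaling factors are taken from those sections, while the dictionary names are illustrative.

```python
# Default weights reported in Section 5.2: alpha = 10, beta = 1000, c = 2000.
DEFAULT = dict(alpha=10.0, beta=1000.0, c=2000.0)

# Low-latency retraining (Section 5.6): throughput weight x0.3, latency weight x2.
LOW_LATENCY = dict(alpha=10.0 * 0.3, beta=1000.0 * 2, c=2000.0)

# Competitive/fairness retraining (Section 5.7): throughput weight x4, latency weight x2.
COMPETITIVE = dict(alpha=10.0 * 4, beta=1000.0 * 2, c=2000.0)
```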

Why does this method work? First, the model has become lightweight. Previous RL-based CC training consumes a lot of computing resources and takes a long time to train again, so those schemes can only be used after offline training. After our lightweight processing, the retraining time is shorter and the resources occupied are smaller, so simple retraining can be performed where the scheme is deployed. Second, we obtain an effective model: it has been shown that our reduced small model retains the performance of the large model (§5.2), and only a small amount of retraining is needed to achieve the tendency expressed by a reward function with the same structure. Plume can therefore be customized for different scenarios to achieve optimal performance.

    V.EVALUATION

We verify Plume's performance from multiple perspectives through simulation experiments, including performance retention, overhead, and generalization capability. The evaluation results show that:

• Performance. Plume maintains the performance of the original model in terms of sending rate, throughput, and latency. Moreover, after retraining, latency can be reduced by over 50% and the packet loss rate by over 80%. In an environment that competes with TCP Cubic, Plume converges faster.

• Overhead. Plume reduces the model size by over 22% and the decision latency by over 19%.

• Necessity. The experimental results show that directly training a model of the same size as the pruned one does not perform well, and that customization for the environment is also very necessary.

    5.1 Simulation Framework Construction

We use Python to build a network simulation environment and use the open-source Gym toolkit to train the agent. We implement an interface in the framework to trim the model at different scales using TensorFlow. To facilitate modification and testing, we also expose the reward-function parameters through an interface.

The trained model is saved in ckpt format. In the next step, the model is re-read and another network is constructed according to the pruning size. Then, we set the pruned columns to 0 in the weight matrix. At this point only the lower connections of each pruned neuron are deleted while the upper connections remain unchanged; this does not affect the final computation result, but it would confuse the process of copying into the small network. Therefore, we disassemble the model according to the analysis in §4.2.2, pruning all connections related to each pruned neuron. A truly reduced model is then formed and retrained.

The network topology tested is an end-to-end structure, the same as in most CC papers. The network's packet loss rate fluctuates between 0 and 0.05, the link latency between 0.05 and 5 ms, and the buffer size between 0 and 8 orders of magnitude of the packet size. We test Plume over various link bandwidths and have run experiments in a large number of different environments; the following mainly shows results in two of them, and others can be found in the appendix. Environment 1: the link bandwidth fluctuates regularly every 2 s. Environment 2: the link bandwidth fluctuates randomly and irregularly.

    5.2 Performance After Pruning

Setup: First, we train the initial model. We test a variety of neural network architectures under different network conditions, including different numbers of layers and neurons. More complex network conditions require a more complex model to learn the rules; if the model is not large enough, it cannot fully learn them, and its performance under irregular network conditions is poor. We finally choose four hidden layers of 32→32→32→16 neurons for the simulation experiments. The reward function, following [9], is 10·throughput − 1000·latency − 2000·loss. We prune the model in different proportions, such as 25%, 50% and 75%, and compare sending rate, throughput and latency under different network conditions. Result: The results in environments 1 and 2 are shown in figure 6. In the three subfigures, the orange line represents the performance of the initial model. We focus on how closely the differently colored lines fit each other; a better fit means the performance is retained more completely. Figure 6a shows the retention of the sending rate: performance is still well preserved when the pruning ratio reaches 70%, and the sending rate is even more reasonable than that of the initial model. The same holds for throughput. However, as can be seen in figure 6c, latency starts to increase when the pruning ratio reaches 70%. This shows that the model still has a lot of room for lightweighting and that it maintains the original model's functionality after reduction.
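A hedged sketch of a policy head with the hidden sizes chosen above, written with tf.keras purely for illustration; the actual model is the PPO actor-critic pair described in §4.2.1, and the activation choice here is an assumption.

```python
import tensorflow as tf


def build_policy_net(action_dim: int = 1) -> tf.keras.Model:
    """Fully connected head with the 32-32-32-16 hidden layout used in Section 5.2.

    The input dimension is inferred from the state vector on the first call.
    """
    return tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="tanh"),
        tf.keras.layers.Dense(32, activation="tanh"),
        tf.keras.layers.Dense(32, activation="tanh"),
        tf.keras.layers.Dense(16, activation="tanh"),
        tf.keras.layers.Dense(action_dim),  # outputs the rate-adjustment action a_t
    ])
```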

    5.3 Why Not Train the Small Model Directly?

Setup: In order to prove the necessity of the lightweight pipeline, we directly train a network half the size of the model in §5.2. The training environment is the same, and we test it in environment 2; the results are shown in figure 7.

Result: It can be seen from figure 7 that the directly trained model has not learned the connection between network information and sending rate. It cannot respond effectively to bandwidth changes, which wastes network resources. In contrast, the pruned model, whose network has the same size as the small model, performs well, as also shown in figure 7. This shows the necessity of training a complex model and the effectiveness of pruning to achieve lightweighting.

    5.4 Overhead

    5.4.1 Decision Latency and Model Size

Setup: In §5.2, we confirmed that the pruned model still maintains proper performance. Next, we discuss the overhead of Plume by measuring both the size of the model and the decision latency of the scheme. To measure decision latency, we put each model into environment 2 for simulation. The CPU used in the simulation is an Intel Xeon(R) Gold 6230 @ 2.10 GHz × 32, and the memory is 16 GB. We record the time the model takes from obtaining information from the environment to making a decision, averaged over multiple tests. The decision latency results are shown in figure 8. We also compare the model size before and after pruning to check whether the model is suitable for deployment.

Result: As can be seen from the figure, the reduction in decision latency is proportional to the pruning ratio; the pruned model reduces decision latency by over 19%. A shorter decision time means the scheme can react to network conditions more quickly and use network resources more effectively. Repeated measurements show that the reduction in model size is also proportional to the pruning ratio, at least 22%. The size drops to about 800 KB, below the 2 MB threshold suitable for low-power devices.

    5.4.2 CPU Consumption

Setup: Since the experiment is conducted as a discrete-event simulation, events fill up the CPU while running, so it is difficult to measure the relationship between model pruning and CPU consumption directly. Instead, we estimate the effective CPU load as (decision latency / decision interval) × measured CPU load. The result is shown in figure 9.
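Restating that conversion as code, with purely illustrative numbers:

```python
def estimated_cpu_load(decision_latency: float, decision_interval: float,
                       measured_cpu_load: float) -> float:
    """Scale the measured (saturated) CPU load by the fraction of each
    decision interval actually spent computing a decision."""
    return (decision_latency / decision_interval) * measured_cpu_load


# e.g. a 2 ms decision every 10 ms at 100% measured load -> ~20% effective load
print(estimated_cpu_load(0.002, 0.010, 1.0))  # 0.2
```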

Result: As can be seen from the figure, the CPU utilization gradually decreases as the pruning ratio increases, with a maximum reduction of about 16%. This means that fewer computing resources are required when the scheme is applied, which is very friendly to mobile devices.

    5.5 Why Is Scene Customization Necessary?

Setup: As mentioned in §2.2.2, existing RL-based CC has the problem that the reward function is monotonous, so the schemes lack generalization. To prove that this problem is not unique to Aurora, we train another model with the Power-based reward function mentioned in §4.2.3 and put both into environment 2 for testing. Their sending rate and throughput are reported in figure 10.

Result: In the figure, Reward 1 represents the linear function we use, and Reward 2 is the Power function from §4.2.3. It can be seen that Reward 2 can quickly adapt its action to the bandwidth, but with large fluctuations. After many tests, this problem persists, and it shows the same issue as [9]: it can only balance among different performance metrics. Therefore, we need to customize the training for the environment so that the scheme can achieve extreme performance in different scenarios.

    5.6 Retrain for Low Latency

Setup: In order to achieve low latency, we redesign the reward function by multiplying the throughput parameter α by 0.3 and the latency parameter β by 2, and retrain. The link latency varies from 0.1 ms to 0.3 ms. We put each model into environment 2 for testing and record the sending rate, latency and packet loss rate of the models with pruning ratios of 25% and 75%, compared with the original model. The results are shown in figure 11.

Result: Figure 11a shows the latency results. Aurora's average latency is 0.7 ms, while the average latency after pruning and retraining is 0.34 ms, a reduction of about 50%. Moreover, no matter how the bandwidth changes, the latency of the retrained model fluctuates very little, unlike Aurora's. Figure 11b shows the packet loss rate under the same conditions. For universality, Aurora sends as many packets as possible into the network, causing serious packet loss, whereas the pruned model quickly reduces the packet loss rate to close to 0 after only a small amount of retraining. This is very important for links with high random loss.

Figure 1. Sending rate of representative traditional and online-learning CC algorithms over an emulated network whose bottleneck link bandwidth changes every 5 seconds.

Figure 2. A throughput trace of Plume and TCP Cubic on a link with random packet loss.

Figure 3. A trace of Aurora on a link which alternates between 1 and 2 Mbps every 2 s.

Figure 4. The overview of Plume.

Figure 5. Model architecture analysis.

Figure 6. Performance comparison of models before and after pruning under different network conditions.

Figure 7. Training directly with a network of the same size as the pruned one yields very poor performance.

Figure 8. Overhead of Plume: the time it takes for the model to go from obtaining information from the environment to making a decision.

Figure 9. Overhead of Plume: CPU consumption.

Figure 10. Results under different reward function training.

Figure 11. The reward function is modified (throughput parameter α multiplied by 0.3, latency parameter β multiplied by 2) and the model retrained: latency and packet loss rate comparison.

Figure 12. The reward function is modified (throughput parameter α multiplied by 4, latency parameter β multiplied by 2) and the model retrained: throughput standard deviation comparison.

Figure 13. Overall normalized average delay and average throughput in several real traces.

Figure 14. Performance comparison of models before and after pruning under different network conditions.

    5.7 Retrain for Multi-Algorithm Fairness

Setup: To improve the aggressiveness of the algorithm, we redesign the reward function by multiplying the throughput parameter α by 4 and the latency parameter β by 2, and retrain. We have implemented a TCP Cubic simulation in the framework; TCP Cubic is a representative loss-based CC algorithm with strong preemptiveness. We put Cubic and Plume into a network environment where the link bandwidth gradually decreases and then gradually increases. The standard deviation of the bandwidth obtained by each scheme is shown in figure 12.

Result: The blue solid line in the figure represents the ideal bandwidth allocation. As can be seen, Plume and Cubic eventually converge to a fair allocation. This experiment shows Plume's ability to adapt quickly to environmental changes, and it converges faster than Cubic. What is more, even facing such a "strong opponent" as Cubic, Plume is not disadvantaged and can maintain fairness. This demonstrates the advantages of customizing for the environment and Plume's flexibility and generalization.

    5.8 Test in Real-Trace

Setup: To demonstrate the performance of Plume under real traces, we acquire real traffic traces from the Mahimahi website [34]. These traces represent time-varying cellular network traffic of US users. We select several sets of scenarios and apply them under the framework described in §5.1. Each ellipse indicates the standard deviations from the average values of the delay and the throughput and their covariance, and the center of each ellipse shows the average of the results.

Result: We select Cubic as a representative traditional algorithm and Aurora as a representative learning-based algorithm for comparison with Plume. As can be seen in figure 13, Cubic's aggressiveness causes longer latency, and random packet loss degrades its throughput severely. Compared to Cubic, Aurora performs better. However, Plume retains consistently high performance under every trace.

    VI.CONCLUSION

In this paper, we propose Plume, a high-performance, lightweight and generalized congestion control scheme based on deep reinforcement learning. Plume uses pruning technology to faithfully retain the performance of the original model while reducing computing and storage resources. Moreover, Plume modifies the reward function to enhance the generalization of the scheme. Extensive experiments show that Plume accomplishes high performance and low overhead at the same time. Therefore, we believe that Plume can improve the friendliness of mobile-device deployment and is universal across different network conditions.

    ACKNOWLEDGEMENT

This work is partially supported by the National Natural Science Foundation of China (NSFC) under Grants No. 61872401 and No. 62132022, the Fok Ying Tung Education Foundation (No. 171059), and the BUPT Excellent Ph.D. Students Foundation (No. CX2021102).

    APPENDIX

To demonstrate the generality of Plume, we supplement the two environments presented in §5.1 with additional ones. As shown in figure 14, the shaded part represents the real bandwidth, including highly variable links. It can be seen that Plume maintains its pre-pruning performance in these different environments.
