
    Machine Learning-based Optimal Framework for Internet of Things Networks

    2022-08-23 02:19:54 Moath Alsafasfeh, Zaid Arida and Omar Saraereh
    Computers, Materials & Continua, 2022, Issue 6

    Moath Alsafasfeh, Zaid A. Arida and Omar A. Saraereh

    1 Department of Computer Engineering, College of Engineering, Al-Hussein Bin Talal University, Ma’an, Jordan

    2 Abdul Aziz Ghurair School of Advanced Computing (ASAC), LTUC, Amman, P11118, Jordan

    3 Department of Electrical Engineering, Engineering Faculty, The Hashemite University, Zarqa, 13133, Jordan

    Abstract: Deep neural networks (DNNs) are employed in a wide range of intelligent applications, including image and video recognition. However, because DNNs require an enormous amount of computation, performing DNN inference tasks locally is problematic for resource-constrained Internet of Things (IoT) devices. Existing cloud approaches are sensitive to problems such as erratic communication delays and unreliable remote server performance. Using IoT device collaboration to create distributed and scalable DNN task inference is a very promising strategy. Existing research, however, looks exclusively at static split methods for homogeneous IoT devices. As a result, there is a pressing need to investigate how to divide DNN tasks adaptively among IoT devices with varying capabilities and resource constraints, and how to execute the task inference cooperatively. Two major obstacles confront this research problem: 1) in a heterogeneous dynamic multi-device environment, it is difficult to estimate the multi-layer inference delay of DNN tasks; 2) it is difficult to intelligently adapt the collaborative inference approach in real time. Therefore, a multi-layer delay prediction model with fine-grained interpretability is proposed first. Furthermore, evolutionary reinforcement learning (ERL) is employed to adaptively discover an approximately optimal split strategy for DNN inference tasks. Experiments show that, in a heterogeneous dynamic environment, the proposed framework can provide considerable DNN inference acceleration. When the number of devices is 2, 3, and 4, the delay acceleration of the proposed algorithm is 1.81 times, 1.98 times and 5.28 times that of the EE algorithm, respectively.

    Keywords: IoT; distributed computing; neural networks; reinforcement learning

    1 Introduction

    In recent years, Internet of Things (IoT) devices have become more and more common. According to Gartner data, the number of IoT devices was expected to reach 25 billion by 2021 [1–5]. As a typical representative of “Internet+”, the IoT extends traditional information communication to a wider physical world, which greatly expands the coverage and composition of the Internet [6–9]. The wireless sensor network (WSN) is composed of a large number of sensor nodes with limited computing, communication, and energy resources, organized in a multi-hop and self-organizing manner [10]. It is the core support of the perception layer of the IoT. Since the WSN was proposed in the 1990s, it has received extensive attention worldwide, especially in developed countries or regions such as the United States, Europe, Japan, and South Korea [11–13], where related exploratory research has continued. It is commonly held that WSN technology can extend existing network functions and improve people’s perception of the world. The IoT based on WSN has great potential, and its constantly emerging innovative applications will have a disruptive impact on human life and social progress [14].

    At present, deep neural networks (DNNs) are developing rapidly and are widely used in various intelligent tasks (such as computer vision, video recognition and machine translation). IoT devices are expected to perform DNN inference tasks to achieve real-time data processing and analysis. For example, in a smart home scenario, a camera can perform video recognition and speech translation tasks based on a DNN model [15]. However, because IoT devices have limited resources while DNN tasks require substantial computing resources and memory, it is difficult for IoT devices to perform DNN inference tasks locally. To overcome this challenge, reference [16] proposed splitting the DNN model between a single IoT device and a cloud server to accelerate task inference. However, because of the large amount of transmitted data and the unpredictable network communication delay, cloud-assisted DNN inference can hardly guarantee the efficiency of data processing, and it increases the dependence on cloud services.

    Aggregating the computing power of multiple IoT devices to perform DNN tasks together is an effective solution. This approach reduces the dependence on cloud services, protects the privacy of IoT devices, and enables distributed collaborative computing. Reference [17] was the first to use multiple resource-constrained IoT devices to collaboratively perform DNN tasks such as voice and video recognition. Reference [18] proposed the DeepThings framework, which partitions the convolutional layers to reduce the overall execution delay and memory usage. However, existing research considers only homogeneous IoT devices and cannot achieve real-time dynamic DNN task splitting. How to efficiently split DNN tasks and perform collaborative inference in dynamic heterogeneous scenarios is a key issue that urgently needs to be solved.

    This research problem faces two important challenges. First, different parameter configurations (layer type, number of layers, convolution kernel size, input feature size, etc.) and heterogeneous device capabilities lead to significant differences in inference delay. It is impractical to run DNN inference tasks on demand to measure the inference delay under every system setting and task splitting strategy. Therefore, the inference delay that results from the current system state and the split collaboration strategy must be predicted in advance. Existing DNN delay prediction models are based on single-layer prediction, and the multi-layer delay is obtained by adding up the single-layer predictions. However, reference [19] found through experiments that the difference between the sum of the delays of individually executed layers and the overall execution delay becomes more pronounced as the number of convolutional layers increases, so existing DNN delay prediction models cannot evaluate and predict the inference delay within an acceptable error range. Moreover, existing delay prediction models consider only specific parameter configurations and do not consider the impact of device capabilities on the DNN inference delay. Therefore, it is of great significance to study an accurate multi-layer delay prediction model for multiple parameter configurations and heterogeneous devices.

    Second, DNN task splitting generates communication overhead while distributing the computation. Although increasing the number of devices that cooperate on a DNN task reduces the computation delay of a single device, it also increases the communication delay between devices. Therefore, the collaborative splitting strategy needs to balance computation and communication delays efficiently. Because the DNN structure, network status, and device capabilities change dynamically and are highly heterogeneous, the DNN task splitting and collaborative inference strategy must be adjusted dynamically and decided efficiently based on the current system state: it must determine the number of devices that perform the task, select the split positions of the DNN task, and decide the computing tasks assigned to each device, in order to obtain the optimal DNN inference acceleration and make full use of the computing power of the IoT devices [20]. For these problems, traditional optimization methods have high computational complexity and long solution times, which makes them difficult to apply. Data-driven artificial intelligence methods can build automated decision-making models through data processing, analysis, training and learning, and can make decisions directly from the learned model when the system status changes, thereby achieving adaptive, intelligent and real-time decision-making. This paper uses a data-driven learning algorithm to develop real-time intelligent DNN task splitting and collaborative inference strategies under diverse device capabilities, network statuses, and DNN tasks.

    This paper proposes a novel framework in which IoT devices collaboratively execute DNN task inference (IoT-CDI). Based on factors such as DNN structure, device capabilities, and network status, it adaptively adjusts the DNN splitting and task allocation strategies, and it can be used when resources are limited. It realizes DNN collaborative inference between heterogeneous IoT devices and makes full use of their computing power to minimize the inference delay of DNN tasks. The main contributions of this paper include three aspects:

    1) Fine-grained characterization of DNN layer types, parameter configurations, device capabilities, etc. The complex mapping between these features and the execution delay is mined to generate interpretable multi-layer delay prediction models, and a variety of common prediction models are evaluated through a large number of experiments to obtain an accurate model suitable for multi-layer delay prediction.

    2) The original DNN split and collaborative inference problem is converted into a shortest path discovery problem and shown to be NP-hard. An adaptive DNN splitting and collaborative inference algorithm based on evolutionary reinforcement learning (ERL) is proposed to realize real-time intelligent DNN inference acceleration among heterogeneous devices.

    3) Verification with real experiments. Five common DNN models and various types of Raspberry Pi devices are selected to verify the effectiveness of the proposed IoT-CDI framework. The experimental results show that the proposed IoT-CDI significantly improves the inference speed and outperforms the benchmark algorithms.

    The remainder of the paper is organized as follows. Section 2 reviews the literature. Section 3 introduces the background and the research motivation. Section 4 explains the proposed IoT-CDI model. Section 5 discusses the task splitting mechanism of the DNN. Section 6 presents the proposed framework. Section 7 analyzes the experimental results. Section 8 discusses the numerical results of the algorithms, and Section 9 concludes the article.

    2 Literature Review

    2.1 Research on End-Cloud Collaboration Inference

    Because of the memory and computing resource constraints of IoT devices, existing work is mainly devoted to DNN task collaboration inference strategies between IoT devices and cloud servers. Reference [21] proposed a DNN inference delay prediction algorithm based on a tree regression model. In [22], the authors designed a flexible and efficient two-step pruning algorithm. The pruning model and the optimal DNN splitting position are determined according to multiple factors, such as per-layer data transmission and computation delay, tolerable accuracy loss, wireless channel and device computing power. While reducing the computation and communication load, the algorithm also satisfies the inference accuracy requirements of DNN tasks. The authors in [23] designed an adaptive DNN splitting algorithm, which can find the optimal splitting strategy under dynamic and time-varying network load conditions.

    Although collaborative inference between IoT devices and cloud servers can use the computing power of cloud servers to reduce the inference delay, problems remain, such as high dependence on cloud servers, unscalable inference, long communication delay, and device privacy concerns.

    2.2 IoT Device Collaboration Inference

    Because cloud-assisted DNN task inference faces the above-mentioned problems, an emerging research trend is to aggregate the computing capabilities of resource-constrained IoT devices so that multiple IoT devices collaborate to perform DNN inference tasks. Reference [24] used multiple IoT devices to perform DNN inference for the first time, and achieved task inference acceleration by reducing the computational cost and memory usage of a single device. However, existing research does not consider the heterogeneous capabilities of IoT devices or dynamic changes in environmental conditions, and it struggles to achieve real-time adaptive decision-making given the diverse environment configurations and the high computational complexity of solving the problem. It is worth noting that the above work is orthogonal to compression and acceleration methods that use weight pruning [25,26], quantization [27,28] and low-precision inference [29,30] to reduce the computational cost of DNN models; the two kinds of techniques can be used together to accelerate DNN inference. Reference [31] proposes a novel system energy consumption model that considers the runtime, switching, and processing energy consumption of all involved servers (cloud and edge) and IoT devices. Then, using a Self-adaptive Particle Swarm Optimization algorithm with Genetic Algorithm operators (SPSO-GA), a novel energy-efficient offloading approach is developed. With layer partition procedures, this technique can efficiently make offloading decisions for DNN layers, reducing the encoding dimension and improving the SPSO-GA execution time. The authors in [32] provide a technology framework that supports fault-tolerant and low-latency AI predictions by combining the Edge-Cloud architectural concept with the advantages of BranchyNet. Deploying and evaluating this architecture makes it possible to assess the benefits of running a Distributed DNN (DDNN) in the Cloud-to-Things continuum. Reference [33] proposes a new convolutional neural network structure, BBNet, that speeds up collaborative inference on two levels: (1) through channel pruning, which reduces the number of calculations and parameters in the original network; and (2) by compressing the feature map at the split point, which further reduces the size of the transmitted data.

    Tab. 1 compares the related works with the proposed method.

    Table 1: Comparison of the related works and proposed work

    3 Background Introduction and Research Motivation

    This section first introduces the types and characteristics of DNN layers, and then leads to the research motivation of this article based on real experimental analysis.

    3.1 DNN Layer Type

    DNN tasks include multiple layer types, such as the convolutional layer (conv), fully connected layer (fc), pooling layer, activation layer and Softmax layer. Among them, the convolutional and fully connected layers account for most of the computational cost and memory usage. The fully connected layer has the largest memory overhead, accounting for more than 87%. Therefore, this article focuses only on the convolutional and fully connected layers in the DNN model.

    3.2 Real Problem

    1) Model prediction. Current research considers only single-layer delay prediction models for different layer types under different configuration parameters. However, it has been shown that there are obvious prediction errors when the multi-layer delay is estimated by accumulating single-layer delays. We conduct real experiments to analyze the multi-layer delay prediction problem comprehensively, and reveal the true relationship between the sum of the delays of layers executed separately and the actual delay of executing all layers together, on DNN models with different channel types. As the number of different channel types increases, the similarity among the layers of a DNN model decreases. In Fig. 1, the abscissa represents the number of different channel types and the ordinate represents the reduction ratio of the overall execution delay relative to the sum of the individual execution delays. When all convolutional layers have the same channel type, the overall execution delay is 50% lower than the sum of the separately executed delays. When the number of different channel types is large, the convolutional layers have low similarity, and the sum of the separately executed delays is approximately equal to the overall execution delay. This experiment motivates the development of a multi-layer delay prediction model, which is used to better guide DNN task splitting and collaborative inference.

    Figure 1: Comparison of latency

    2) Equipment heterogeneity. We first measure the inference delay of five common DNN models on three variants of Raspberry Pi (Raspberry Pi 2B, Raspberry Pi 3B and Raspberry Pi 3B+). All five DNN models are executed on each Raspberry Pi variant. The experimental results are shown in Fig. 2. The bar graph represents the inference latency, and the line graph represents the ratio of the execution latencies of different devices. For example, the inference delay of the AlexNet model on the Raspberry Pi 2B is 1.66 s, while on the Raspberry Pi 3B it is reduced to 1.06 s, which is only 64% of the delay on the Raspberry Pi 2B. The difference in device capabilities therefore significantly affects the inference delay of DNN tasks. Moreover, as the amount of computation of the DNN model increases, the difference in inference delay between devices becomes more prominent. The inference delays of the VGG16 model on the Raspberry Pi 2B and 3B are 11.68 and 5.24 s, respectively, so the execution speed increases by a factor of about 2.23. This experiment shows that DNN splitting should consider the heterogeneous capabilities of the devices and make full use of their computing resources to achieve approximately optimal inference acceleration. For this reason, it is necessary to design an accurate model to analyze the impact of heterogeneous device capabilities on the DNN inference delay.
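    The two ratios quoted above can be checked with a few lines of arithmetic. The sketch below uses only the four delays reported in the text; the dictionary layout and function names are illustrative choices, not part of the original work.

```python
# Measured inference delays in seconds, as reported in the text above.
delays = {
    "AlexNet": {"Raspberry Pi 2B": 1.66, "Raspberry Pi 3B": 1.06},
    "VGG16": {"Raspberry Pi 2B": 11.68, "Raspberry Pi 3B": 5.24},
}


def delay_ratio(model: str, fast: str, slow: str) -> float:
    """Delay on the faster device as a fraction of the delay on the slower one."""
    return delays[model][fast] / delays[model][slow]


def speedup(model: str, fast: str, slow: str) -> float:
    """How many times faster the faster device completes the same model."""
    return delays[model][slow] / delays[model][fast]


alexnet_ratio = delay_ratio("AlexNet", "Raspberry Pi 3B", "Raspberry Pi 2B")
vgg16_speedup = speedup("VGG16", "Raspberry Pi 3B", "Raspberry Pi 2B")

print(f"AlexNet: 3B delay is {alexnet_ratio:.0%} of the 2B delay")
print(f"VGG16: 3B is {vgg16_speedup:.2f}x faster than 2B")
```

    The larger model shows the larger gap between devices, which is the observation the paragraph relies on.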

    Figure 2: Comparison of latency of various deep neural network models

    4 IoT-CDI Model

    4.1 System Model

    The schematic diagram of the IoT-CDI scenario is shown in Fig. 3. It is assumed that there is a group of IoT devices with heterogeneous capabilities, N = {1, 2, ..., N}. Each device dev_i generates a DNN inference task m with a certain probability. DNN task inference is carried out layer by layer: the output of the previous layer is the input of the next layer, and the task terminates when all layers have been executed. Suppose a DNN inference task m contains K layers, and each layer is considered a subtask. For a DNN inference task such as video recognition, a series of data frames is usually input continuously to the DNN model for inference, and the sampling rate is assumed to be Q frames/second.

    Figure 3: Proposed system model

    Given the number of available devices N and the number of DNN subtasks (number of layers) K, the goal is to find the split positions of the DNN task and the optimal task allocation among these devices. For each subtask k, an IoT device dev_i is found to execute it. After each IoT device dev_i executes its assigned computing task (some layers of the DNN task), the output data is transmitted to the device that performs the next layer, until the DNN task inference is completed. The research goal is to minimize the overall execution delay of the DNN tasks. If all subtasks are executed on one IoT device, the limited resources of that device cause a long computation delay. However, if tasks are distributed to multiple IoT devices, the communication delay increases significantly. Therefore, it is necessary to split and allocate the DNN tasks reasonably, balance communication and computation delays effectively, and minimize the overall inference delay of DNN tasks.
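    The trade-off between computation and communication can be made concrete with a small delay model. The following is a minimal sketch under stated assumptions, not the paper's formulation: per-layer compute delays on each device are given as a table, a transfer is paid only when consecutive layers run on different devices, and a single link rate applies to every device pair. All names and numbers are hypothetical.

```python
from typing import Sequence


def total_inference_delay(
    assignment: Sequence[int],
    compute_delay: Sequence[Sequence[float]],
    output_size_mb: Sequence[float],
    rate_mb_per_s: float,
) -> float:
    """Single-frame delay of a layer-to-device assignment.

    assignment[k]       -> index of the device that runs layer k
    compute_delay[k][i] -> seconds needed by device i to run layer k
    output_size_mb[k]   -> size of the output of layer k in MB
    rate_mb_per_s       -> link rate, assumed identical for all device pairs
    """
    total = 0.0
    last = len(assignment) - 1
    for k, dev in enumerate(assignment):
        total += compute_delay[k][dev]
        # Pay a transfer only when the next layer runs on another device.
        if k < last and assignment[k + 1] != dev:
            total += output_size_mb[k] / rate_mb_per_s
    return total


# Hypothetical 3-layer task on 2 devices (device 1 is faster).
compute_delay = [[1.0, 0.6], [2.0, 1.2], [0.5, 0.3]]
output_size_mb = [4.0, 2.0, 0.1]
rate = 8.0

single_device = total_inference_delay([0, 0, 0], compute_delay, output_size_mb, rate)
split = total_inference_delay([0, 1, 1], compute_delay, output_size_mb, rate)
print(single_device, split)
```

    The memory limit of each device is deliberately left out here; a fuller model would reject assignments that exceed it.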

    4.2 Problem Description

    Figure 4: Schematic flow of the IoT-CDI

    The IoT-CDI problem can be transformed into an optimal path problem from the first layer to the K-th layer. The problem is expressed as:

    Eq. (1) indicates that, if the (k+1)-th layer is allocated to the IoT device dev_j, an edge starting from dev_j needs to be selected. Eq. (2) represents the memory limit of each device. Eq. (3) ensures that each layer is executed by only one device. In addition, DNN inference usually processes a stream of input data, so the optimization goal needs to be data stream-oriented. Once the DNN splitting strategy is determined, each frame is processed in order according to the strategy. We introduce the concept of pipeline processing, as shown in Fig. 5. Specifically, for two consecutive data frames, the IoT device dev_i first completes the task assigned to it for data frame 1, and when data frame 2 arrives, dev_i immediately executes its task for data frame 2. The bottleneck of pipeline processing is therefore the maximum value of Tijk, that is, the device with the longest processing time for a single frame. This fact is verified through experiments. The VGG16 model is divided into three parts that are executed on different devices. The times for the three devices to process one frame are 2.374, 7.768 and 1.456 s, so the maximum per-device delay for a single frame is 7.768 s, and the total execution delay of 100 frames in the experimental test is approximately equal to 100×7.768 s. To make the DNN split and task allocation strategy support pipeline processing, the delay calculation formula is modified to the maximum of the individual execution delays of the IoT devices. The multi-device cooperative execution method proposed in this paper aggregates the computing power of multiple devices and makes full use of their concurrent processing capabilities, which effectively improves the overall throughput. This is achieved by adaptively splitting the DNN tasks among multiple IoT devices in real time, with the goal of minimizing the total inference delay after processing all data frames.
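    The pipeline argument can be reproduced numerically. The sketch below assumes an idealized pipeline (identical frames, no communication delay, no queueing limits), in which the first frame passes through every stage and each further frame finishes one bottleneck interval later. The function name is illustrative; the three stage delays are the ones quoted above.

```python
from typing import Sequence


def pipeline_total_delay(stage_delays: Sequence[float], num_frames: int) -> float:
    """Completion time of num_frames identical frames in an ideal pipeline."""
    bottleneck = max(stage_delays)
    return sum(stage_delays) + (num_frames - 1) * bottleneck


# Per-device single-frame delays for the three VGG16 parts (seconds).
stage_delays = [2.374, 7.768, 1.456]
num_frames = 100

exact = pipeline_total_delay(stage_delays, num_frames)
approximation = num_frames * max(stage_delays)  # the 100 x 7.768 s estimate
relative_gap = (exact - approximation) / exact

print(f"ideal pipeline: {exact:.3f} s")
print(f"bottleneck estimate: {approximation:.3f} s")
print(f"relative gap: {relative_gap:.2%}")
```

    Under these assumptions the bottleneck estimate differs from the ideal pipeline time by well under 1%, which is why the maximum per-device delay is a reasonable optimization target for streamed input.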

    Figure 5: Processing illustration of deep neural network model

    4.3 Problem Solution

    We first show that the IoT-CDI problem is NP-hard, using a known NP-hard problem, the generalized assignment problem (GAP) [34,35]. GAP assumes that there are M items and N boxes, and that putting item i into box j yields the income M_{i,j}. The goal is to pack each item into an appropriate box and maximize the overall revenue under the cost constraint of each box. Through parameter mapping and conversion, GAP is reduced to the IoT-CDI problem, which proves that the IoT-CDI problem is NP-hard.

    Since the IoT-CDI problem is NP-hard, it is difficult to obtain the optimal DNN splitting and collaborative inference strategy in polynomial time. Therefore, exact algorithms such as enumeration are not suitable for solving this problem. In addition, because of the diversity of DNN model structures, heterogeneous device capabilities and dynamic changes in communication status, the collaborative inference strategy must be adjusted in real time. To this end, we adopt a data-driven artificial intelligence method, which can make real-time automated decisions based on environmental information. Reinforcement learning (RL) is an effective data-driven method that continuously learns and guides behavior by interacting with the environment and collecting rewards, with the aim of maximizing the total benefit. In this paper, a reinforcement learning algorithm is used to determine the optimal DNN splitting strategy and to perform collaborative inference between heterogeneous devices to achieve inference acceleration.
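    To illustrate why enumeration does not scale, the sketch below counts the layer-to-device assignments (N^K when every layer may go to any device) and runs an exhaustive search on a toy instance. The objective used here, the largest per-device compute load with communication ignored, is a simplifying assumption for illustration, and all names and numbers are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def search_space_size(num_devices: int, num_layers: int) -> int:
    """Number of ways to assign each layer to one device."""
    return num_devices ** num_layers


def brute_force_split(
    compute_delay: Sequence[Sequence[float]], num_devices: int
) -> Tuple[Tuple[int, ...], float]:
    """Try every assignment and keep the one with the smallest bottleneck."""
    num_layers = len(compute_delay)
    best_assignment: Tuple[int, ...] = ()
    best_cost = float("inf")
    for assignment in product(range(num_devices), repeat=num_layers):
        load: List[float] = [0.0] * num_devices
        for k, dev in enumerate(assignment):
            load[dev] += compute_delay[k][dev]
        cost = max(load)  # bottleneck device (communication ignored)
        if cost < best_cost:
            best_assignment, best_cost = assignment, cost
    return best_assignment, best_cost


# Toy instance: 3 identical layers, device 1 is twice as fast as device 0.
toy_delay = [[2.0, 1.0], [2.0, 1.0], [2.0, 1.0]]
best_assignment, best_cost = brute_force_split(toy_delay, num_devices=2)
print(best_assignment, best_cost)

# A 16-layer model on 4 devices already has more than 4 billion assignments.
print(search_space_size(4, 16))
```

    The toy search visits 8 assignments; the same loop over a 16-layer model and 4 devices would visit 4^16 of them for every change in network or device state, which is impractical for real-time decisions.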

    5 DNN Task Split Strategy

    In this section, we first elaborate and analyze the proposed accurate multi-layer delay prediction model through specific parameter configurations and a variety of typical prediction models. On this basis, the ERL algorithm is used to intelligently and adaptively determine the cooperative inference strategy between heterogeneous devices.

    5.1 Parameter Configuration of Convolutional Layer and Fully Connected Layer

    The parameter configuration of the convolutional layer includes the input feature dimensions (input height in_height, input width in_width), convolution kernel size (kernel_height, kernel_width), channel sizes (in_channel, out_channel), stride and padding. The parameter configuration of the fully connected layer includes the input feature dimension (in_dim) and the output feature dimension (out_dim). The parameter configuration range is shown in Tab. 2. The configurable parameters of each layer are generated by random combination, and the execution delay Y of each parameter combination is measured. Similar to [36], the interpretable parameter vector X is determined from the above model parameters and includes floating point operations (FLOPs), memory footprint and parameter scale. Specifically, X = (FLOPs, mem, param_size), where mem = mem_in + mem_out + mem_inter; mem_in represents the memory occupied by input data, mem_out the memory occupied by output data, and mem_inter the memory occupied by temporary data. Detailed definitions of the memory and parameter characteristics can be found in [37]. CPU operations and memory operations affect the execution time of a program to a certain extent. In the DNN model, they are reflected in the floating-point operations, memory footprint and parameter scale. A large number of [X, Y] data pairs are obtained through various parameter configuration combinations for training the delay model and making predictions.
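    As an illustration of how X = (FLOPs, mem, param_size) might be derived from a convolutional layer configuration, the sketch below uses common textbook conventions: two FLOPs per multiply-accumulate, 4 bytes per value, and one bias per output channel. These conventions, the default of zero for mem_inter, and the function name are assumptions; the exact definitions used by the authors are those of [37].

```python
from dataclasses import dataclass
from typing import Tuple

BYTES_PER_VALUE = 4  # assumption: 32-bit floating point values


@dataclass
class ConvConfig:
    in_height: int
    in_width: int
    kernel_height: int
    kernel_width: int
    in_channel: int
    out_channel: int
    stride: int = 1
    padding: int = 0


def conv_features(cfg: ConvConfig, mem_inter: int = 0) -> Tuple[int, int, int]:
    """Return (FLOPs, mem, param_size) for one convolutional layer.

    mem_inter (temporary buffers) depends on the implementation, so it is
    passed in by the caller and defaults to 0 in this sketch.
    """
    out_h = (cfg.in_height + 2 * cfg.padding - cfg.kernel_height) // cfg.stride + 1
    out_w = (cfg.in_width + 2 * cfg.padding - cfg.kernel_width) // cfg.stride + 1

    kernel_volume = cfg.kernel_height * cfg.kernel_width * cfg.in_channel
    macs = kernel_volume * cfg.out_channel * out_h * out_w
    flops = 2 * macs  # one multiply and one add per MAC

    param_size = (kernel_volume + 1) * cfg.out_channel  # weights plus biases

    mem_in = cfg.in_height * cfg.in_width * cfg.in_channel * BYTES_PER_VALUE
    mem_out = out_h * out_w * cfg.out_channel * BYTES_PER_VALUE
    mem = mem_in + mem_out + mem_inter
    return flops, mem, param_size


# Example: a 3x3 convolution from 3 to 64 channels on a 224x224 input.
flops, mem, param_size = conv_features(
    ConvConfig(224, 224, 3, 3, in_channel=3, out_channel=64, stride=1, padding=1)
)
print(flops, mem, param_size)
```

    Pairing each such feature vector with a measured delay Y on a given device yields the [X, Y] training pairs described above.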

    Table 2: Layers parameters

    5.2 Multi-Layer Delay Prediction Model

    In this section, we conduct a comprehensive study of the multi-layer delay prediction model for the convolutional and fully connected layers. The interpretable parameter vector X of the multi-layer delay prediction model includes the number of layers and the sums of the floating-point operations, memory footprints and parameter scales. To perform multi-layer predictive analysis, we first generate DNN models with an arbitrary number of layers and random combinations of characteristic parameters, and execute them on IoT devices with different computing capabilities to obtain the execution delay Y for any number of layers and different parameter configurations. After obtaining the [X, Y] data pairs, we establish a model that correlates device capabilities, task characteristics and execution delay, and study a variety of common predictive models to fit the multi-layer input data to the execution delay and to mine the mapping between the characteristic parameters and the execution delay. The coefficient of determination R², mean squared error (MSE) and mean absolute percentage error (MAPE) are used as the evaluation indicators of prediction accuracy. The models studied are linear regression (LR), RANdom SAmple Consensus regression (RANSAC), kernel ridge regression (KRR), k-nearest neighbor (KNN), decision tree (DT), support vector machine (SVM), random forest (RF), AdaBoost (ADA), gradient boosted regression trees (GBRT) and artificial neural network (ANN) models.

    Compared with the convolutional layer, the fully connected layer has a shorter execution time, fewer parameters and a smaller number of layers. For example, the AlexNet model contains only three fully connected layers, and the ResNet model contains only one. We show through experiments that, for fully connected layers, the difference between the overall execution delay and the sum of the individual execution delays is less than 2%. Therefore, we study only the single-layer prediction model for the fully connected layer on different devices, and compare the performance of different prediction models on this layer. Tab. 3 shows that a variety of prediction models can predict the execution delay of the fully connected layer. For the convolutional layer, delay prediction is considerably harder because of the many types of input feature parameters, the wider configuration range, the number of executed layers and the complex coupling between feature parameters. The ANN prediction model is therefore added, because a neural network can effectively capture nonlinear relationships, has strong generalization and fitting ability, and can approximate the actual model without assuming the form of the mapping between the feature variables and the result.
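    For reference, the three evaluation indicators can be computed as follows. This is a plain-Python sketch of the standard definitions; the function names and the sample values are illustrative and do not come from the paper's experiments.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted delays in seconds.
measured = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.4]

print(f"MSE  = {mse(measured, predicted):.3f}")
print(f"MAPE = {mape(measured, predicted):.1f}%")
print(f"R2   = {r2(measured, predicted):.3f}")
```

    MAPE is scale-free, which makes it convenient for comparing prediction quality across devices whose absolute delays differ widely.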

    Table 3: Comparative performance of different algorithms for single-layer

    Taking the Raspberry Pi 3B as an example, Tab. 4 compares the performance of different multi-layer delay prediction models for the convolutional layer. Tab. 4 shows that the RF, GBRT and ANN prediction models perform better than the other models. For example, compared with the RANSAC model and the ADA model, the MAPE of the ANN model is reduced by 43% and 81%, respectively. The experiments in Section 6 further verify the accuracy of these three multi-layer prediction models.

    Table 4: Performance comparison of the algorithms for conv multi-layer


    5.3 DNN Task Splitting Strategy Based on Evolutionary Reinforcement Learning

    1) Description

    Reinforcement learning (RL) is an effective machine learning approach for decision making. An agent observes the state of the environment and learns which behaviors obtain better returns. At each time step t, the agent observes the current environment state s_t and chooses a behavior a_t according to the strategy π(a_t|s_t). The instantaneous reward r_t is obtained after the behavior is executed, and the environment moves to the state s_{t+1} according to its state transition probability. The goal of the agent is to obtain the optimal strategy that maximizes the cumulative discounted income, where γ is the discount factor. Strategy learning is based on the behavior value function, which is defined as the expected cumulative discounted income that each state-behavior pair can obtain, and is calculated as:

    The goal of reinforcement learning is to find the optimal strategy to maximize the behavior value,which can be expressed as
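    In standard RL notation (an assumption here, since the paper's own typeset formulas are not reproduced in this text), the behavior value function and the optimal strategy are usually written as follows, with r_{t+k} the reward received k steps after time t and γ the discount factor:

```latex
Q^{\pi}(s_t, a_t) = \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^{k} r_{t+k} \,\middle|\, s_t, a_t \right],
\qquad
\pi^{*} = \arg\max_{\pi} Q^{\pi}(s_t, a_t)
```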

    Deep reinforcement learning (DRL) [38] was proposed to address the curse of dimensionality. DRL uses a DNN to approximate the Q function, Q(s_t, a_t) ≈ Q(s_t, a_t|θ), where θ represents the parameters of the neural network. The deep Q-network (DQN) is a typical DRL method [39]. DQN stores experience tuples in an experience pool; each time, a batch of samples is randomly selected from the pool for training, and the parameter θ is updated to minimize the loss function. However, DQN, which relies on back propagation, is hard to optimize over long time horizons, and it struggles to learn the optimal behavior when the reward is sparse (a reward is obtained only after a series of behaviors). In addition, with high-dimensional action and state spaces, efficient exploration remains a key challenge, which makes convergence difficult. In summary, traditional DRL algorithms such as DQN face sparse rewards, a lack of effective exploration, and difficulty in convergence. They therefore cannot be applied directly to the IoT-CDI problem, in which the behavior is decomposed into consecutive sub-behaviors, rewards are sparse, and the behavior and state space is huge. For this reason, the evolutionary reinforcement learning (ERL) algorithm [40] is used to realize DNN splitting and collaborative inference among heterogeneous devices.

    2) DNN Task Splitting Strategy Based On ERL

    From the perspective of DRL, the device that determines the DNN splitting strategy is modeled as an agent. To reduce the dimensions of the state and behavior spaces, the DNN splitting task is decomposed into a sequence of hierarchical subtasks, with each layer treated as one subtask. Each decision therefore only has to select a suitable execution device for one layer of the model. The per-layer behaviors together form the overall behavior set, and DNN task splitting and collaborative inference are carried out according to this set. The DNN task execution delay is used as the return that measures the quality of the behavior set. The basic elements of the problem, namely state, behavior, and revenue, are defined first.

    1) State. At each time $t$, the state $s_t$ contains five parts:

    i) $f_t$ represents the current layer number;

    ii) $com_t$ represents the current network status, that is, the communication rate;

    iii) $c_t = \{c_{1,t}, c_{2,t}, \ldots, c_{N,t}\}$ represents the capability of each IoT device;

    iv) $l_t = \{l_{1,t}, l_{2,t}, \ldots, l_{N,t}\}$ represents the cumulative delay required for each IoT device to complete its pre-allocated subtasks;

    v) $e_t = \{e_{1,t}, e_{2,t}, \ldots, e_{N,t}\}$ represents the inference delay that would result from executing the current subtask on each IoT device. From the above description, $s_t = (f_t, com_t, c_t, l_t, e_t)$, and the state dimension is $3N+2$.

    2) Behavior. $a_t$ selects one of the $N$ IoT devices to perform the current subtask.

    3) Revenue. If the current subtask is the last one, the revenue is the overall inference delay of the DNN task (in the data flow case, the revenue is the maximum of the delays required for each IoT device to perform its own tasks); otherwise the revenue is zero.
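
    The state and revenue definitions above can be written down as a small sketch. The function names and the flat-list encoding are assumptions made for illustration. How the delay is turned into a quantity to maximize (for example by negating it) is not specified in the text and is left to the caller.

```python
def build_state(layer_number, comm_rate, capabilities, cumulative_delays, layer_delays):
    """Flatten s_t = (f_t, com_t, c_t, l_t, e_t) into a list of length 3N + 2."""
    n = len(capabilities)
    if len(cumulative_delays) != n or len(layer_delays) != n:
        raise ValueError("c_t, l_t and e_t must each have one entry per device")
    return [float(layer_number), float(comm_rate),
            *capabilities, *cumulative_delays, *layer_delays]


def revenue(is_last_subtask, overall_delay):
    """Sparse revenue: the overall delay at the last subtask, zero before it."""
    return overall_delay if is_last_subtask else 0.0


def data_flow_delay(per_device_delays):
    """Data flow case: the slowest device determines the overall delay."""
    return max(per_device_delays)
```

    With $N = 3$ devices the state has $3 \cdot 3 + 2 = 11$ entries, and every intermediate decision receives zero revenue, which is the sparse-reward situation discussed earlier.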

    It is difficult for back-propagation-based DRL algorithms to obtain the optimal strategy for this problem, because it involves sparse rewards and difficult exploration. Compared with traditional DRL methods, ERL integrates the population-based approach of natural evolution strategies, which enables diversified exploration, and uses a fitness indicator to select and generate better offspring, so that multiple strategies are explored effectively and the population keeps evolving toward higher returns.

    The ERL process is as follows. Evolution is applied to a population of candidate samples, and new offspring are continuously generated by adding random deviations. Through the selection operation, offspring with higher fitness values have a greater chance of being retained and producing new offspring. A higher fitness value indicates better performance, so the next generation produced by selection is expected to perform better. In this article, each sample represents a set of neural network parameters, and the random deviation added to an offspring represents a random perturbation of the network weights.

    The overall algorithm flow is shown in Algorithm 1.

    Algorithm 1: ERL DNN algorithm
    Input: random weight θ of behavior value function Q, parent weight θp, number of children C, learning rate η
    Output: parent weight θp
    1: for episode := 1, 2, ..., E do
    2:   Initialize state s
    3:   for i in range C do
    4:     θi = θp + noise
    5:     Select the behavior ai and observe the revenue ri
    6:     Calculate the average return r̄ and the gain of each offspring gi = ri − r̄
    7:     θp = θp + η × Σ(i=1..C) gi × θi
    8:   end for
    9: end for

    In Algorithm 1, the parameters are initialized first, and the neural network is then updated during training as follows. The parent neural network generates $C$ child networks by perturbing its parameters, and in each iteration it evaluates the income obtained by each child, which serves as the fitness value. A child with a higher fitness value is selected with higher probability to generate offspring. The gain of each child is computed by normalizing the difference between its income and the average income of all children. The parameters of the parent network are then updated according to the gains $g$ of the $C$ children (steps 3~9).
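
    The parent-update loop can be sketched as follows. This is not the authors' implementation. It is a generic evolution-strategy step under stated assumptions: the gains weight the noise vectors rather than the child weights θi (a common evolution-strategy variant swapped in here), the perturbation is Gaussian with a hypothetical scale `sigma`, the gains are normalized by their standard deviation, and `fitness_fn` stands in for running an episode and observing the revenue.

```python
import random


def es_step(theta_p, fitness_fn, num_children, lr, sigma, rng):
    """One parent update: perturb, evaluate, then move along gain-weighted noise."""
    noises, returns = [], []
    for _ in range(num_children):
        eps = [rng.gauss(0.0, 1.0) for _ in theta_p]
        child = [w + sigma * e for w, e in zip(theta_p, eps)]  # theta_i = theta_p + noise
        noises.append(eps)
        returns.append(fitness_fn(child))  # r_i, the fitness of child i

    mean_r = sum(returns) / num_children
    gains = [r - mean_r for r in returns]  # g_i = r_i - average return
    scale = (sum(g * g for g in gains) / num_children) ** 0.5
    if scale > 0:
        gains = [g / scale for g in gains]  # normalize the gains

    new_theta = list(theta_p)
    for g, eps in zip(gains, noises):
        for j, e in enumerate(eps):
            # Assumption: the update is weighted by the noise, not by theta_i.
            new_theta[j] += lr * g * e / num_children
    return new_theta


def train(fitness_fn, dim, episodes=300, num_children=20, lr=0.05, sigma=0.1, seed=0):
    """Repeat the parent update for a fixed number of episodes."""
    rng = random.Random(seed)
    theta = [0.0] * dim
    for _ in range(episodes):
        theta = es_step(theta, fitness_fn, num_children, lr, sigma, rng)
    return theta
```

    Because only fitness values are needed, no gradient of the revenue with respect to the weights is required, which is why this family of methods tolerates sparse, end-of-episode rewards.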

    6 Proposed Framework

    The overall process of the IoT-CDI framework is shown in Fig. 6. It includes two stages: offline training and online execution. The offline stage generates the multi-layer delay prediction model and completes the training of the ERL algorithm. The online stage dynamically determines the split locations and the task allocation based on the system state, and multiple devices then cooperate to perform the DNN task. Different DNN tasks have different topologies, and the amount of computation and of intermediate data generated by each layer differs; changes in network status directly affect the data transmission delay, and the heterogeneity of device capabilities significantly affects the computation delay. The DNN task splitting and allocation strategy therefore has to be adjusted automatically according to these dynamic factors in order to reduce the inference delay effectively. The IoT-CDI framework determines the split locations of the DNN model and the task assignment of each device according to the current system status, including communication status, device capabilities, and DNN task requirements, and thereby realizes distributed, collaborative DNN inference among heterogeneous devices. A master device (an IoT device or a gateway) is deployed to manage and control the entire process.

    6.1 Offline Training Phase

    In this stage, the multi-layer delay prediction model and the ERL splitting strategy are trained. For the two layer types, convolutional and fully connected, a delay prediction model is built that covers arbitrary numbers of layers with different parameter configurations, which allows the actual execution delay of an inference task to be estimated accurately without executing the DNN task. Because the layer type, the layer parameter configuration, and the number of layers all lead to clear differences in delay, a separate prediction model is built for each layer type (convolutional and fully connected). The number of layers and the parameter configuration of each layer are varied, these parameters are used to determine the computation scale and the data transmission scale, and the impact of different device capabilities on the execution delay is analyzed under identical parameter configurations. Real measurements of parameter configuration, device capability, and execution delay are obtained through experiments, and the prediction models are trained on these data. A variety of common prediction models are analyzed, including regression, k-nearest neighbors, decision trees, ensemble models, and artificial neural networks. The experiments show that the fully connected layer has few parameter types, so prediction is relatively simple and many models achieve accurate results. The convolutional layer has many parameter types and complex configurations, so models with strong generalization and nonlinear fitting capabilities predict more accurately. It is worth noting that, by mapping the model parameters to the computation and transmission scale and analyzing the impact of different device capabilities on the execution delay, the proposed prediction model is independent of the specific DNN model, depends on the device capabilities, and can be adapted to heterogeneous devices. When the DNN model structure or parameters change, an accurate execution delay can be obtained quickly from the prediction model, which avoids additional execution overhead.

    Based on the generated multi-layer delay prediction model, the ERL algorithm is trained to obtain an approximately optimal DNN task splitting and collaborative inference strategy when the DNN model, network status, and device capabilities change dynamically. The state information of the ERL model includes model parameters, number of layers, communication status, and device capabilities. The behavior strategy determines the execution device for each layer of the DNN model. After 2,000 training iterations, at which point the model has converged, the trained ERL model is stored on the master device, which then determines the best split strategy from the input system state.

    Figure 6:Proposed framework

    6.2 Online Execution Phase

    This stage includes three steps: 1) The system profiler obtains the current system status, including the DNN inference task, the current communication status, and the device capabilities. 2) This information is passed to the decision maker, which uses the multi-layer delay prediction model trained offline to evaluate the inference delay of each candidate decision, and uses the ERL split model, also trained in the offline phase, to obtain the optimal split strategy, so as to accelerate DNN inference and make use of device resources across multiple heterogeneous devices. 3) Each device executes its assigned tasks according to the split strategy.

    IoT devices need to communicate with each other to transmit commands and data. So that devices can be identified, each IoT device registers an IP address. Once the DNN task splitting and allocation strategy is known, each device maintains its own IP processing table, which records the inference tasks assigned to it and the predecessor and successor nodes of its task. The master device maintains the overall IP processing table, which records the tasks executed by each device. A change in system status, for example a change in communication rate or a device joining or leaving, triggers an adjustment of the split strategy, and the master node updates its record. The master node then distributes the updated IP processing table information to all devices, and each device modifies its own table accordingly.

    The DNN inference process is executed according to the IP processing table. An IoT device receives the input data required for its computation from the predecessor device and, after completing its assigned task, sends the generated output to the successor device. To realize this process, remote procedure call (RPC) is deployed for the interaction between devices, allowing two devices to communicate and exchange data. Taking the VGG model as an example, suppose that device 1 executes layers 1~5 of the VGG model and that its successor, device 2, executes layers 6~10. After device 1 completes its assigned layers, it sends the generated output to device 2, and the two devices jointly execute the DNN task according to the strategy. When the environment status changes, the split and allocation strategy is adjusted according to the ERL algorithm. If, for example, device 1 now executes layers 1~7 and device 2 executes layers 8~10, the IP processing table of each device must be updated to reflect the new task allocation and the new predecessor and successor nodes.
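
    The bookkeeping behind the IP processing table can be sketched as below. The dictionary layout and the function name `build_ip_table` are hypothetical; the paper does not specify the table format, and real entries would also carry each device's registered IP address.

```python
def build_ip_table(layer_to_device):
    """Derive per-device layers, predecessors and successors from a split.

    layer_to_device[k] is the id of the device that executes layer k + 1.
    """
    table = {}
    for layer, dev in enumerate(layer_to_device, start=1):
        entry = table.setdefault(
            dev, {"layers": [], "predecessors": set(), "successors": set()}
        )
        entry["layers"].append(layer)

    # A hand-off happens wherever two consecutive layers sit on different devices.
    for prev_dev, next_dev in zip(layer_to_device, layer_to_device[1:]):
        if prev_dev != next_dev:
            table[prev_dev]["successors"].add(next_dev)
            table[next_dev]["predecessors"].add(prev_dev)
    return table


# The VGG example from the text: layers 1~5 on device 1, layers 6~10 on device 2.
initial = build_ip_table([1] * 5 + [2] * 5)
# After the strategy is adjusted: layers 1~7 on device 1, layers 8~10 on device 2.
updated = build_ip_table([1] * 7 + [2] * 3)
```

    Rebuilding the table from the full layer-to-device assignment keeps the master's overall table and the per-device views consistent after every strategy change.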

    7 Experimental Verification

    We use real experiments to verify the proposed IoT-CDI framework. First, the proposed multi-layer delay prediction model is shown to be accurate. Then, comparison with the benchmark algorithms shows that the proposed ERL method significantly reduces the inference delay and accelerates inference. In addition, we evaluate the influence of factors such as communication status and the number of devices on performance.

    7.1 Experimental Setup

    1) Device type. Three types of Raspberry Pi devices are used as heterogeneous IoT devices, namely Raspberry Pi 2B, Raspberry Pi 3B and Raspberry Pi 3B+, running the Raspbian GNU/Linux 10 (buster) operating system. Different Raspberry Pi models have different computing capabilities and therefore provide differentiated inference performance. The specifications of the different models are shown in Tab. 5. To perform DNN tasks on the Raspberry Pi, we install the basic software and platforms, including Python 3.7.3, Keras 2.2.4 and TensorFlow 1.13.1.

    Table 5: Configuration of Raspberry Pi

    2) DNN model. Five common DNN models are used, namely AlexNet, DarkNet, NiN, ResNet18 and VGG16. VGG16 represents a long DNN model (with more layers), and AlexNet represents a short DNN model (with fewer layers). The AlexNet and ResNet18 models are less computationally intensive, while the VGG16 and NiN models are more computationally intensive. Although both of the latter involve a relatively large amount of computation, the communication volume of the VGG16 model is relatively small, whereas that of the NiN model is relatively large.

    3) Communication method. The average transmission rate between IoT devices is used to simulate different wireless networks. The experiment sets up three network environments, namely 3G, WiFi and 4G, with transmission rates of 1.1, 18.88 and 5.85 Mbps, respectively.
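
    To give a feel for what these rates mean for intermediate data, the transfer time of a tensor can be estimated as size divided by rate. The sketch below assumes an ideal link (no protocol overhead or latency) and takes 1 Mbps as 10^6 bits per second; the function name and the example payload size are hypothetical.

```python
# Average transmission rates from the experimental setup, in Mbps.
RATES_MBPS = {"3G": 1.1, "WiFi": 18.88, "4G": 5.85}


def transfer_delay_s(num_bytes, rate_mbps):
    """Idealized transfer time in seconds: payload bits divided by link rate."""
    if rate_mbps <= 0:
        raise ValueError("rate must be positive")
    return num_bytes * 8 / (rate_mbps * 1_000_000)


# Estimated time to ship a hypothetical 1 MB intermediate output on each network.
delays = {name: transfer_delay_s(1_000_000, rate) for name, rate in RATES_MBPS.items()}
```

    Under this idealization the same payload takes roughly 17 times longer on the 3G setting than on WiFi, which is the ratio of the two rates.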

    4) Benchmark algorithms. We consider four comparison algorithms. The device-execution (DE) algorithm executes the DNN task only on the local device that generates the task. The maximum-execution (ME) algorithm assigns the DNN task to the device with the greatest computing capability. The equal-execution (EE) algorithm distributes the DNN task equally among all available devices. The short-execution (SE) algorithm uses the classic Dijkstra shortest-path algorithm to obtain the shortest execution delay from the first to the last layer of the DNN model, with a single-layer prediction model determining the weight of each edge. The DE algorithm is used as the baseline, and the proposed ERL algorithm is evaluated relative to it.
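
    The three simplest baselines can be expressed as layer-to-device assignments. This is an illustrative sketch only: the function names are hypothetical, devices are identified by index, and EE is assumed to mean contiguous blocks of layers whose sizes differ by at most one. The SE baseline is omitted because it depends on the single-layer prediction model.

```python
def de_assign(num_layers, local_device):
    """DE: every layer runs on the device that generated the task."""
    return [local_device] * num_layers


def me_assign(num_layers, capabilities):
    """ME: every layer runs on the most capable device."""
    best = max(range(len(capabilities)), key=lambda d: capabilities[d])
    return [best] * num_layers


def ee_assign(num_layers, num_devices):
    """EE: contiguous blocks of layers, as equal in size as possible."""
    return [layer * num_devices // num_layers for layer in range(num_layers)]
```

    For example, `ee_assign(10, 3)` places four layers on device 0 and three each on devices 1 and 2, regardless of how fast each device or the network is.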

    7.2 Forecast Model Accuracy

    Delay prediction data set. For the two layer types, convolutional and fully connected, different parameter ranges are set. The layer parameters and their ranges are shown in Tab. 2. Parameter configurations are generated by random combination and then converted into interpretable variables such as floating-point operations, memory footprint, and parameter scale. The execution delay on the three Raspberry Pi models is then measured under each parameter setting, which yields a real measured data set of parameter settings and execution delays for the convolutional layer and the fully connected layer. For multi-layer delay prediction, multi-layer parameter configurations are generated according to the practical design principles of DNN models, with the number of layers ranging from 1 to 40. The parameter configuration of each layer is converted into explanatory variables, the accumulated explanatory variables are obtained by summing layer by layer, and the multi-layer inference delays are obtained by executing the configurations on the Raspberry Pi devices. Based on the measured data set of multi-layer parameter configurations and inference delays, a variety of common prediction models are trained separately for the convolutional layer and the fully connected layer. The prediction performance of the different models is shown in Tabs. 3 and 4.
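
    The conversion from layer parameters to interpretable variables, followed by layer-by-layer accumulation, might look like the sketch below. The exact variables used in the paper are not listed in this section, so the feature set here (floating-point operations, parameter count, output elements) and all function names are assumptions. The formulas are the standard ones for a dense 2-D convolution and a fully connected layer, counting one multiply-accumulate as two operations.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_features(in_h, in_w, in_ch, out_ch, kernel, stride=1, padding=0):
    """Interpretable variables for one convolutional layer."""
    out_h = conv_output_size(in_h, kernel, stride, padding)
    out_w = conv_output_size(in_w, kernel, stride, padding)
    macs = kernel * kernel * in_ch * out_ch * out_h * out_w
    return {
        "flops": 2 * macs,
        "params": kernel * kernel * in_ch * out_ch + out_ch,  # weights + biases
        "output_elems": out_h * out_w * out_ch,
    }


def fc_features(in_features, out_features):
    """Interpretable variables for one fully connected layer."""
    return {
        "flops": 2 * in_features * out_features,
        "params": in_features * out_features + out_features,
        "output_elems": out_features,
    }


def cumulative_features(per_layer):
    """Running sums of each variable, one row per prefix of the layer list."""
    totals = {"flops": 0, "params": 0, "output_elems": 0}
    rows = []
    for feat in per_layer:
        for key in totals:
            totals[key] += feat[key]
        rows.append(dict(totals))
    return rows
```

    Row $k$ of `cumulative_features` describes the first $k$ layers together, which is the kind of input a multi-layer delay regressor would be trained on.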

    The accuracy of the multi-layer delay prediction model for the convolutional layer is verified next, taking VGG16 and AlexNet as examples, as shown in Figs. 7 and 8, respectively. The histogram represents the measured execution delay. For example, an abscissa of 7 indicates that executing the first seven layers of the VGG16 model takes 6.08 s. The line graph shows the prediction performance of the different prediction models, with the mean absolute percentage error (MAPE) used as the evaluation index. Figs. 7 and 8 show that the RF, GBRT and ANN prediction models accurately predict the inference delay for any number of layers, with an average percentage error below 4% for all three models at any layer count. The main reason for this accuracy is that the model parameters that affect the inference delay are described accurately and mapped into explanatory variables such as computation scale and communication scale. These three prediction models have good hierarchical fitting and generalization capabilities and can capture the complex nonlinear relationship between the feature variables and the delay. In addition, the impact of device capabilities on the inference delay is taken into account: a real data set of parameter configurations and execution delays is collected for each type of device, and a prediction model is trained for each device, so that the inference delay of the various devices can be predicted accurately under different parameter settings.
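
    For reference, MAPE can be computed as in the short sketch below. The function name is hypothetical, and the definition used is the standard one: the mean of the absolute relative errors, expressed as a percentage.

```python
def mape(actual, predicted):
    """Mean absolute percentage error between measured and predicted delays."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = (abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * sum(errors) / len(actual)
```

    A MAPE below 4% for a measured delay of 6.08 s would correspond to an average absolute error of less than about 0.24 s.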

    Figure 7:Comparison of latency and accuracy using VGG16

    Figure 8:Comparison of latency and accuracy using AlexNet

    7.3 Performance Comparison

    1) DNN split. Fig. 9 shows the splitting strategies for three typical DNN models. The DNN splitting strategy varies with the DNN model and the number of devices. The VGG16 model involves a large amount of computation and a small amount of data transmission, so it tends to use more IoT devices to obtain better performance. The NiN model involves a large amount of both computation and data transmission, and excessive communication overhead degrades performance; it therefore tends to use fewer cooperating devices to reduce communication overhead. The ResNet18 model involves a small amount of computation, so it must be considered whether the computational savings of collaborative inference can offset the added communication overhead; its split strategy has to weigh computational gain against communication overhead. It can be concluded that the DNN splitting strategy needs to be adjusted adaptively according to the characteristics of the DNN model and the state of the environment.

    2) Delay acceleration. We compare the delay acceleration of the five algorithms for different DNN models, with the number of devices set to three and the communication mode set to WiFi. Fig. 10 shows that the proposed ERL algorithm improves on the DE, ME, EE and SE algorithms to varying degrees.

    Figure 9:Partitioning comparison of algorithms

    Figure 10:Comparison of latency of the proposed and existing algorithms for different neural networks models

    As the computing demand increases, the performance improvement becomes more pronounced. For example, with the VGG16 model the delay acceleration of the ERL algorithm is about twice that of the DE algorithm. The main reason is that IoT devices have limited resources, so standalone execution performs poorly when the amount of computation is large, and the need for DNN task splitting is greater. However, when the amount of data transmission is large, the higher communication delay caused by DNN task splitting seriously reduces the advantage of cooperative execution, so the delay acceleration is not pronounced for the NiN model.

    Since a single IoT device cannot bear the heavy computational burden, the performance of the DE and ME algorithms is not ideal. Although the EE algorithm benefits from collaborative inference, an equal division is not the optimal split strategy. The SE algorithm also performs poorly because of the inaccuracy of single-layer prediction. The proposed ERL algorithm effectively balances computation and communication costs, makes full use of the heterogeneous capabilities of the devices, and achieves better DNN inference acceleration.
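
    The trade-off between computation and communication can be made concrete with a toy delay model. The sketch assumes strictly sequential execution (no pipelining), an idealized link, and per-layer compute delays that are already known, for instance from a delay prediction model; the function name and argument layout are hypothetical.

```python
def inference_delay(assignment, compute_delay, output_bytes, rate_mbps):
    """End-to-end delay of one inference under a layer-to-device assignment.

    assignment[k]       device index that executes layer k
    compute_delay[d][k] seconds for device d to execute layer k
    output_bytes[k]     size of the output of layer k
    """
    total = 0.0
    for k, dev in enumerate(assignment):
        total += compute_delay[dev][k]
        # Pay a transfer cost only when the next layer runs on another device.
        if k + 1 < len(assignment) and assignment[k + 1] != dev:
            total += output_bytes[k] * 8 / (rate_mbps * 1_000_000)
    return total
```

    In this model a split only pays off when the compute time saved on the later layers exceeds the time needed to move the intermediate output, which matches the contrast drawn above between VGG16 and NiN.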

    7.4 Adaptability to Environmental Conditions

    1) Influence of communication status. This experiment evaluates the influence of communication status on delay acceleration. The performance of the five algorithms under 3G, 4G and WiFi conditions is compared using the VGG16 model as an example, as shown in Fig. 11. As the communication rate increases, the performance of the proposed ERL algorithm improves more than that of the benchmark algorithms. With a 3G network, communication conditions are poor, and the computational gain of cooperative execution can hardly offset the communication cost of data transmission; as a result, the EE algorithm performs worse than the DE algorithm for the VGG16 model. With a 4G network, the delay acceleration of the ERL algorithm is 2.07 times that of the DE algorithm. With WiFi, the delay acceleration rises to 2.36 times. The main reason is that, under good communication conditions, the data transmission delay required for DNN splitting is reduced, so the advantage of cooperative execution is more pronounced.

    Figure 11:Comparison of latency of various communication networks of the schemes

    To further verify that the proposed ERL algorithm adapts to various communication states, the communication rate is varied from 1 to 20 Mbps, and the delay acceleration of the different algorithms is compared using the VGG16 model as an example. The experiments show that the proposed ERL algorithm performs best at every communication rate and reduces the inference delay by more than a factor of two. Fig. 12 shows that the delay acceleration becomes more pronounced as the communication rate increases, because a higher rate reduces the communication cost caused by splitting and thus the overall execution delay. The DE, ME and EE algorithms cannot adjust the split strategy according to the network status, so their delay acceleration does not improve noticeably as the communication rate increases. The proposed ERL algorithm and the SE algorithm balance communication and computation overhead according to the current network state, and therefore achieve significant inference acceleration as the communication rate increases, effectively reducing the inference delay of the DNN task.

    Figure 12:Latency vs.data rate of comparison of the proposed and existing schemes

    2) Influence of the number of devices. We deploy different numbers of IoT devices to evaluate the performance of the five algorithms. Taking the NiN model as an example, Fig. 13 shows that the proposed ERL algorithm achieves the best delay acceleration. When the number of devices is 2, 3, and 4, the delay acceleration of the proposed algorithm is 1.81, 1.98, and 5.28 times that of the EE algorithm, respectively. Because the communication cost of the NiN model cannot be ignored and the EE algorithm cannot adjust the split strategy flexibly, EE struggles to balance computation and communication costs. The proposed ERL algorithm determines the splitting strategy intelligently and obtains approximately optimal performance.

    7.5 Complexity Analysis

    Fig. 14 compares the computational complexity of the proposed and existing algorithms. The complexity of all the algorithms increases with the number of neurons in the layer. However, the complexity of the proposed algorithm is lower than that of the other algorithms, which makes it practical and effective for IoT-DNN deployment. As a result, the proposed algorithm outperforms the typical neural network in terms of complexity.

    Figure 13:Comparison of latency vs.number of IoT devices

    8 Discussion

    1) Device heterogeneity. The experiments use different models of Raspberry Pi to reflect device heterogeneity. The performance differences of the three Raspberry Pi models are shown in Tab. 4. Five DNN models, including AlexNet and DarkNet, are run on the three Raspberry Pi models, and the measured data are shown in Fig. 2. The experiments show clear performance differences between the three types of device, which reflects their heterogeneity. In follow-up work, we will consider a wider range of devices, such as Raspberry Pi boards, mobile phones, and wearable devices, analyze the performance differences between device types, model device capabilities at a fine granularity, and on this basis study cooperative inference among multiple types of devices.

    2) The number of devices. The capacity of a single IoT device is insufficient, and cooperative execution across multiple devices can effectively reduce the inference delay. However, increasing the number of cooperating devices reduces the computation delay while increasing the communication delay. To prevent communication bottlenecks, the number of devices that cooperatively execute a DNN model should not be too large. Fig. 9 shows that DNN models with a large amount of data transmission (such as the NiN model) tend to use only a few devices even when more are available. Even for a DNN model with a small amount of data transmission and a large amount of computation (such as the VGG16 model), the number of cooperating devices does not grow very large.

    3) The practicality of the IoT-CDI framework. The IoT-CDI framework mainly solves two problems: i) Because the error of existing single-layer prediction methods cannot be ignored, a fine-grained multi-layer prediction method is designed that accurately evaluates the inference delay of a DNN task with any number of layers. ii) To handle heterogeneous conditions and dynamic changes in device capabilities, DNN task characteristics, and network status, an intelligent decision-making algorithm based on reinforcement learning is adopted. To overcome problems such as sparse returns and convergence difficulties, evolutionary reinforcement learning is used to obtain splitting strategies quickly. The proposed IoT-CDI framework uses a data-driven approach to achieve accurate predictive analysis and real-time intelligent decision-making. However, compared with traditional methods, it has more noticeable limitations in system overhead (the models must be stored), scalability, and online adjustment. Future work will focus on the practicality of the framework, addressing the problems that arise in actual deployment to improve its feasibility.

    9 Conclusion

    This paper proposes a novel IoT-CDI framework that enables IoT devices to collaborate in performing DNN tasks. Taking into account a variety of factors, such as DNN task requirements, device capabilities, power, and network status, the framework realizes real-time adaptive collaborative DNN inference among heterogeneous IoT devices. Specifically, a multi-layer delay prediction model covering different layer types, parameter configurations, and device capabilities is proposed, which accurately predicts the inference delay of DNN tasks under different splits. In addition, an intelligent DNN task splitting and collaborative inference algorithm based on evolutionary reinforcement learning is proposed, which obtains an approximately optimal strategy when device capabilities, network status, and task requirements are heterogeneous and change dynamically. The experimental results show that the proposed algorithm effectively balances communication and computation delay, makes full use of the computing power of the devices, and significantly reduces the DNN inference delay. In the future, we will further study the optimal number of cooperating devices required by different DNN models, so that, when enough devices are available, the number of cooperating devices can be adjusted adaptively according to the DNN task requirements to achieve the best inference acceleration. Future work will also address challenges such as scenarios in which the number of IoT devices is very large, the inference delay is variable, and latency accumulates.

    Acknowledgement: The authors would like to thank the editors and reviewers for their review and recommendations.

    Funding Statement:The authors received no specific funding for this study.

    Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.
