
    Machine Learning-based Optimal Framework for Internet of Things Networks

Computers, Materials & Continua, 2022, Issue 6

Moath Alsafasfeh, Zaid A. Arida and Omar A. Saraereh

1 Department of Computer Engineering, College of Engineering, Al-Hussein Bin Talal University, Ma'an, Jordan

2 Abdul Aziz Ghurair School of Advanced Computing (ASAC), LTUC, Amman, P11118, Jordan

3 Department of Electrical Engineering, Engineering Faculty, The Hashemite University, Zarqa, 13133, Jordan

Abstract: Deep neural networks (DNNs) are widely employed in intelligent applications such as image and video recognition. However, because of the enormous amount of computation that DNNs require, performing DNN inference tasks locally is problematic for resource-constrained Internet of Things (IoT) devices. Existing cloud approaches are sensitive to problems like erratic communication delays and unreliable remote server performance. Using IoT device collaboration to create distributed and scalable DNN task inference is a very promising strategy. The existing research, however, considers only static split methods in scenarios with homogeneous IoT devices. There is therefore a pressing need to investigate how to divide DNN tasks adaptively among IoT devices with varying capabilities and resource constraints, and to execute task inference cooperatively. Two major obstacles confront this research problem: 1) in a heterogeneous, dynamic multi-device environment, it is difficult to estimate the multi-layer inference delay of DNN tasks; 2) it is difficult to adapt the collaborative inference approach intelligently in real time. Accordingly, a multi-layer delay prediction model with fine-grained interpretability is proposed first. Furthermore, evolutionary reinforcement learning (ERL) is employed to adaptively discover an approximately optimal split strategy for DNN inference tasks. Experiments show that, in a heterogeneous dynamic environment, the proposed framework provides considerable DNN inference acceleration. When the number of devices is 2, 3 and 4, the delay acceleration of the proposed algorithm is 1.81 times, 1.98 times and 5.28 times that of the equal-execution (EE) algorithm, respectively.

    Keywords: IoT; distributed computing; neural networks; reinforcement learning

    1 Introduction

In recent years, Internet of Things (IoT) devices have become increasingly common. According to Gartner, the number of IoT devices was expected to reach 25 billion by 2021 [1–5]. As a typical representative of "Internet+", the IoT extends traditional information communication to the wider physical world, greatly expanding the coverage and composition of the Internet [6–9]. The wireless sensor network (WSN) is composed of a large number of sensor nodes with limited computing, communication and energy resources, organized in a multi-hop, self-organizing manner [10]; it is the core support of the perception layer of the IoT. Since the WSN was proposed in the 1990s, it has received extensive attention worldwide, especially in developed countries and regions such as the United States, Europe, Japan and South Korea [11–13], where related exploratory research has continued. It is generally believed that WSN technology can extend existing network functions and improve people's perception of the world. The IoT based on WSN has great potential, and its constantly emerging innovative applications will have a transformative impact on human life and social progress [14].

At present, deep neural networks (DNNs) are developing rapidly and have been widely used in various intelligent tasks, such as computer vision, video recognition and machine translation. IoT devices are expected to perform DNN inference tasks to achieve real-time data processing and analysis. For example, in a smart home, a camera can perform video recognition and speech translation tasks based on a DNN model [15]. However, because IoT devices have limited resources while DNN tasks require substantial computing resources and memory, it is difficult for IoT devices to perform DNN inference locally. To overcome this challenge, reference [16] proposed splitting the DNN model between a single IoT device and a cloud server to accelerate task inference. However, limited by factors such as the large amount of transmitted data and unpredictable network communication delay, cloud-assisted DNN task inference struggles to guarantee data-processing efficiency and increases the dependence on cloud services.

Aggregating the computing power of multiple IoT devices to perform DNN tasks together is an effective solution. This approach reduces dependence on cloud services, protects the privacy of IoT devices, and enables distributed collaborative computing. Reference [17] was the first to use multiple resource-constrained IoT devices to collaboratively perform DNN tasks such as voice and video recognition. Reference [18] proposed the DeepThings framework, which partitions the convolutional layers to reduce the overall execution delay and memory usage. However, the existing research work considers only homogeneous IoT devices and cannot achieve real-time dynamic DNN task splitting. How to efficiently split DNN tasks and perform collaborative inference in dynamic heterogeneous scenarios is a key issue that urgently needs to be solved.

The above research problem faces two important challenges. First, different parameter configurations (layer type, number of layers, convolution kernel size, input feature size, etc.) and heterogeneous device capabilities lead to significant differences in inference delay. It is impractical to execute DNN inference tasks on demand to obtain the inference delay under every system setting and task-splitting strategy. Therefore, it is necessary to predict in advance the inference delay induced by the current system state and the split collaboration strategy. Existing DNN delay prediction models are based on single-layer prediction, and the multi-layer delay is obtained by summing the single-layer predictions. However, reference [19] found experimentally that the difference between the sum of the per-layer delays and the overall execution delay grows with the number of convolutional layers, so existing DNN delay prediction models cannot evaluate and predict the inference delay within an acceptable error range. Moreover, existing delay prediction models consider only specific parameter configurations and ignore the impact of device capabilities on DNN inference delay. Therefore, it is of great significance to study an accurate multi-layer delay prediction model under multiple parameter configurations and heterogeneous devices.

DNN task splitting distributes the computation but generates communication overhead. Although increasing the number of devices that cooperatively execute a DNN task reduces the computation delay of each single device, it also increases the communication delay between devices. Therefore, the collaborative splitting strategy must efficiently trade off computation and communication delays. Because the DNN structure, network status and device capabilities are dynamic and highly heterogeneous, the DNN task splitting and collaborative inference strategy must be adjusted dynamically: based on the current system state, it must decide how many devices execute the task, where the DNN task is split, and which computing tasks are assigned to each device, in order to obtain optimal DNN inference acceleration and make full use of the computing power of the IoT devices [20]. For this problem, traditional optimization methods have high computational complexity and long solution times, making them difficult to apply. Data-driven artificial intelligence methods can build automated decision-making models through data processing, analysis, training and learning, and make decisions directly with the learned model when the system state changes, thereby achieving adaptive, intelligent, real-time decision-making. This paper uses a data-driven learning algorithm to develop real-time, intelligent DNN task splitting and collaborative inference strategies under diverse device capabilities, network states and DNN tasks.

This paper proposes a novel framework for collaborative execution of DNN task inference by IoT devices (IoT-CDI). Based on factors such as the DNN structure, device capabilities and network status, it adaptively adjusts the DNN splitting and task allocation strategies under resource constraints. It realizes collaborative DNN inference among heterogeneous IoT devices and makes full use of their computing power to minimize the inference delay of DNN tasks. The main contributions of this paper are threefold:

1) Fine-grained characterization of DNN layer types, parameter configurations and device capabilities, mining the complex mapping between these features and execution delay to generate an interpretable multi-layer delay prediction model. A variety of common prediction models are evaluated through extensive experiments to obtain an accurate model suitable for multi-layer delay prediction.

2) The original DNN splitting and collaborative inference problem is converted into a shortest-path discovery problem and shown to be NP-hard. An adaptive DNN splitting and collaborative inference algorithm based on evolutionary reinforcement learning (ERL) is proposed to realize real-time, intelligent DNN inference acceleration among heterogeneous devices.

3) Verification through real experiments. Five common DNN models and several models of Raspberry Pi devices are selected to verify the effectiveness of the proposed IoT-CDI framework. The experimental results show that IoT-CDI significantly improves the inference speed and outperforms the benchmark algorithms.

The remainder of the paper is organized as follows. Section 2 reviews the literature. Section 3 elaborates the background and research motivation. Section 4 explains the proposed IoT-CDI model. Section 5 discusses the DNN task splitting mechanism. Section 6 presents the proposed framework. Section 7 describes the experimental results. Section 8 discusses the numerical results, and Section 9 concludes the article.

    2 Literature Review

    2.1 Research on End-Cloud Collaboration Inference

Limited by the memory and computing resource constraints of IoT devices, existing work is mainly devoted to DNN task collaborative inference strategies between IoT devices and cloud servers. Reference [21] proposed a DNN inference delay prediction algorithm based on a tree regression model. In [22], the authors designed a flexible and efficient two-step pruning algorithm: according to factors such as per-layer data transmission and computation delay, tolerable accuracy loss, wireless channel and device computing power, the pruning model and the optimal DNN split position are determined. While reducing the computation and communication load, it also satisfies the inference accuracy requirements of DNN tasks. The authors in [23] designed an adaptive DNN splitting algorithm, which can find the optimal splitting strategy under dynamic, time-varying network load conditions.

Although collaborative inference between IoT devices and cloud servers can use the computing power of cloud servers to reduce inference delay, problems remain, such as heavy dependence on cloud servers, poor scalability, long communication delays, and device privacy concerns.

    2.2 IoT Device Collaboration Inference

Because cloud-assisted DNN task inference faces the above problems, an emerging research trend is to aggregate the computing capabilities of resource-constrained IoT devices so that multiple IoT devices collaborate to perform DNN inference tasks. Reference [24] used multiple IoT devices to perform DNN inference for the first time, achieving inference acceleration by reducing the computational cost and memory usage of each single device. However, the existing research work does not consider the heterogeneous capabilities of IoT devices or dynamic changes in environmental conditions, and it is difficult to achieve real-time adaptive decision-making given the diversity of environment configurations and the high computational complexity of problem solving. It is worth noting that this line of work is orthogonal to compression and acceleration methods that use weight pruning [25,26], quantization [27,28] and low-precision inference [29,30] to reduce the computational cost of DNN models; the two kinds of techniques can be combined to accelerate DNN inference. Reference [31] proposes a novel system energy consumption model that considers the runtime, switching and processing energy consumption of all involved servers (cloud and edge) and IoT devices. Then, utilizing a Self-adaptive Particle Swarm Optimization algorithm with Genetic Algorithm operators (SPSO-GA), a novel energy-efficient offloading approach is developed. With layer-partition procedures, this technique can efficiently make offloading decisions for DNN layers, reducing the encoding dimension and improving SPSO-GA execution time. The authors in [32] provide a technology framework that supports fault-tolerant, low-latency AI predictions by combining the Edge-Cloud architectural concept with BranchyNet advantages. The deployment and evaluation of this architecture make it possible to assess the benefits of running distributed DNNs (DDNN) in the Cloud-to-Things continuum. Reference [33] proposes a new convolutional neural network structure, BBNet, that speeds up collaborative inference on two levels: (1) channel pruning, which reduces the number of calculations and parameters in the original network; and (2) compressing the feature map at the split point, which further reduces the size of the transmitted data.

Tab. 1 compares the related works with the proposed method.

    Table 1: Comparison of the related works and proposed work

    3 Background Introduction and Research Motivation

This section first introduces the types and characteristics of DNN layers, and then presents the research motivation of this article based on real experimental analysis.

    3.1 DNN Layer Type

DNN tasks include multiple layer types, such as convolutional layers (conv), fully connected layers (fc), pooling layers, activation layers and Softmax layers. Among them, the convolutional and fully connected layers dominate the computational cost and memory usage; the fully connected layers account for more than 87% of the memory overhead. Therefore, this article focuses only on the convolutional and fully connected layers of the DNN model.

    3.2 Real Problem

1) Model prediction. Current research work considers only single-layer delay prediction models for different layer types under different configuration parameters. However, evaluating multi-layer delay by accumulating single-layer delays introduces obvious prediction errors. We conduct real experiments to comprehensively analyze the multi-layer delay prediction problem and reveal the true relationship between the sum of each layer's individually measured delay and the actual delay of the entire multi-layer execution, on DNN models with different channel types. As the number of distinct channel types increases, the similarity of the DNN model's layers decreases. In Fig. 1, the abscissa represents the number of distinct channel types and the ordinate represents the reduction ratio of the overall execution delay relative to the sum of the individual execution delays. When all convolutional layers share the same channel type, the overall execution delay is about 50% lower than the sum of the individually measured delays. When the number of distinct channel types is large, the convolutional layers have low similarity, and the sum of the individual delays is approximately equal to the overall execution delay. This experiment motivates the development of a multi-layer delay prediction model to better guide DNN task splitting and collaborative inference.

Figure 1: Comparison of latency

2) Device heterogeneity. We measure the inference delay of five common DNN models on three variants of Raspberry Pi (Raspberry Pi 2B, Raspberry Pi 3B and Raspberry Pi 3B+), executing each of the five DNN models on each model of Raspberry Pi. The experimental results are shown in Fig. 2. The bar graph represents the inference latency, and the line graph represents the ratio of the execution latencies of different devices. For example, the AlexNet model requires an inference delay of 1.66 s on the Raspberry Pi 2B, while the execution delay on the Raspberry Pi 3B drops to 1.06 s, only 64% of the Raspberry Pi 2B's delay. Clearly, differences in device capability significantly affect the inference delay of DNN tasks. Moreover, as the computational load of the DNN model grows, the delay differences between devices become more pronounced: the inference delays of the VGG16 model on the Raspberry Pi 2B and 3B are 11.68 and 5.24 s, respectively, a speedup of about 2.23 times. This experiment shows that DNN splitting should account for heterogeneous device capabilities and make full use of each device's computing resources to achieve near-optimal inference acceleration. For this reason, it is necessary to design an accurate model to analyze the impact of heterogeneous device capabilities on DNN inference delay.

Figure 2: Comparison of latency of various deep neural network models

    4 IoT-CDI Model

    4.1 System Model

The schematic diagram of the IoT-CDI scenario is shown in Fig. 3. It is assumed that there is a group of IoT devices with heterogeneous capabilities N = {1, 2, ..., N}. Each device dev_i generates a DNN inference task m with a certain probability. DNN task inference is carried out layer by layer: the output of the previous layer is the input of the next layer, and the task terminates when all layers have been executed. Suppose a DNN inference task m contains K layers, and each layer is considered a subtask. For a DNN inference task such as video recognition, a series of data frames is usually fed continuously to the DNN model for inference; the sampling rate is assumed to be Q frames/second.

Figure 3: Proposed system model

Given the number of available devices N and the number of DNN subtasks (layers) K, the goal is to find the split positions of the DNN task and the optimal task allocation across these devices. For each subtask k, an IoT device dev_i is chosen to execute it. After each IoT device dev_i executes its assigned computing task (some layers of the DNN task), the output data it generates is transmitted to the device that performs the next layer, until the DNN task inference is complete. The research goal is to minimize the overall execution delay of the DNN task. If all subtasks are executed on one IoT device, the limited resources of that single device will cause a long computation delay. However, if tasks are distributed across many IoT devices, the communication delay increases significantly. Therefore, it is necessary to split and allocate the DNN task reasonably, effectively trading off communication and computation delays, and minimize the overall inference delay of the DNN task.

    4.2 Problem Description

Figure 4: Schematic flow of the IoT-CDI

The IoT-CDI problem can be transformed into an optimal path problem from the first layer to the K-th layer. The problem is expressed as:

Eq. (1) indicates that if the (k+1)-th layer is allocated to IoT device dev_j, an edge starting from dev_j must be selected. Eq. (2) represents the memory limit of each device. Eq. (3) ensures that each layer is executed by exactly one device. In addition, DNN inference is usually driven by multiple input data streams, so the optimization goal must be stream-oriented. Once the DNN splitting strategy is determined, each frame is processed in order according to the strategy. We introduce the concept of pipeline processing, shown in Fig. 5. Specifically, for two consecutive data frames, IoT device dev_i first completes the task assigned for data frame 1; when data frame 2 arrives, dev_i immediately executes its task for data frame 2. Clearly, the bottleneck of pipeline processing is the maximum of T_{i,j,k}, i.e., the device with the longest processing time for a single frame. This fact is verified experimentally: the VGG16 model is divided into three parts executed on different devices; the times for each device to process one frame are 2.374, 7.768 and 1.456 s, so the maximum single-frame inference delay across the three devices is 7.768 s, and the measured total execution delay for 100 frames is approximately 100 × 7.768 s. To enable the DNN split and task allocation strategy to support pipeline processing, the delay calculation formula is modified to the maximum of the individual execution delays of the IoT devices. The multi-device cooperative execution method proposed in this paper aggregates the computing power of multiple devices and makes full use of their concurrent processing capabilities, effectively improving the overall throughput. It does so by adaptively splitting DNN tasks among multiple IoT devices in real time, with the goal of minimizing the total inference delay for processing all data frames.
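Since the equation block itself did not survive extraction, the following LaTeX sketch restates constraints (1)–(3) and the pipeline objective exactly as the prose above describes them; the binary assignment variables x_{j,k}, the per-layer memory m_k and the device memory budget M_j are our notation, not necessarily the authors' original symbols.

% Hedged reconstruction of the IoT-CDI constraints described in the text.
% x_{j,k} = 1 iff layer k is assigned to device dev_j (assumed notation).
\begin{align}
x_{j,k+1} = 1 &\;\Rightarrow\; \text{an edge leaving } dev_j \text{ is selected for layer } k+1 && (1)\\
\sum_{k=1}^{K} x_{j,k}\, m_k &\le M_j \quad \forall j \in \{1,\dots,N\} && (2)\\
\sum_{j=1}^{N} x_{j,k} &= 1 \quad \forall k \in \{1,\dots,K\} && (3)
\end{align}
% Pipeline objective for a stream of F frames, where T_i is device i's
% per-frame processing time: the slowest device dominates, so
\[
T_{\text{stream}} \approx F \cdot \max_{i} T_i ,
\]
% consistent with the measured 100 x 7.768 s for VGG16 split across three devices.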

Figure 5: Processing illustration of deep neural network model

    4.3 Problem Solution

First, it is proved that the IoT-CDI problem is NP-hard by reduction from a known NP-hard problem, the generalized assignment problem (GAP) [34,35]. The GAP assumes that there are M items and N boxes; putting item i into box j yields income M_{i,j}. The goal is to pack each item into an appropriate box and maximize the overall income under the capacity constraint of each box. Through parameter mapping and conversion, the GAP can be reduced to an instance of the IoT-CDI problem, which proves that the problem is NP-hard.
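For reference, a textbook formulation of the GAP used in the reduction (the symbols y_{i,j}, c_{i,j} and B_j are generic, not the paper's):

\[
\max \sum_{i=1}^{M}\sum_{j=1}^{N} M_{i,j}\, y_{i,j}
\quad \text{s.t.}\quad
\sum_{i=1}^{M} c_{i,j}\, y_{i,j} \le B_j \;\; \forall j, \qquad
\sum_{j=1}^{N} y_{i,j} = 1 \;\; \forall i, \qquad
y_{i,j} \in \{0,1\},
\]

where y_{i,j} indicates that item i is placed in box j, c_{i,j} is its cost and B_j is box j's capacity.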

Since the IoT-CDI problem is NP-hard, it is difficult to obtain the optimal DNN splitting and collaborative inference strategy in polynomial time. Therefore, exact algorithms such as enumeration are not suitable for this problem. In addition, because of the diversity of DNN model structures, heterogeneous device capabilities and dynamically changing communication conditions, the collaborative inference strategy must be adjusted in real time. To this end, we adopt a data-driven artificial intelligence method, which can make real-time automated decisions based on environmental information. Reinforcement learning (RL) is an effective data-driven method that continuously learns and guides behavior by interacting with the environment and collecting rewards so as to maximize the cumulative benefit. In this paper, a reinforcement learning algorithm is used to determine the optimal DNN splitting strategy and perform collaborative inference between heterogeneous devices to achieve inference acceleration.

    5 DNN Task Split Strategy

In this section, we first elaborate and analyze the proposed accurate multi-layer delay prediction model through specific parameter configurations and a variety of typical prediction models. On this basis, the ERL algorithm is used to intelligently and adaptively determine the cooperative inference strategy between heterogeneous devices.

    5.1 Parameter Configuration of Convolutional Layer and Fully Connected Layer

The convolutional layer's configuration includes the input feature dimensions (input height in_height, input width in_width), convolution kernel size (kernel_height, kernel_width), channel sizes (in_channel, out_channel), stride and padding. The parameter configuration of the fully connected layer includes the input feature dimension (in_dim) and the output feature dimension (out_dim). The parameter ranges are shown in Tab. 2. The configurable parameters of each layer are generated by random combination, and the execution delay Y of each parameter combination is measured. Similar to [36], the interpretable parameter vector X is determined from the above model parameters and includes floating-point operations (FLOPs), memory footprint and parameter scale. Specifically, X = (FLOPs, mem, param_size), where mem = mem_in + mem_out + mem_inter; mem_in represents the memory occupied by input data, mem_out the memory occupied by output data, and mem_inter the memory occupied by temporary data. Detailed definitions of the memory and parameter features can be found in [37]. CPU operations and memory operations affect the execution time of a program to a certain extent; in a DNN model, they are reflected in the floating-point operations, memory footprint and parameter scale. A large number of [X, Y] data pairs are obtained through various parameter configuration combinations for delay model training and prediction.
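To illustrate how the interpretable vector X = (FLOPs, mem, param_size) can be derived from the raw layer parameters above, the sketch below uses common conventions (two operations per multiply-accumulate, float32 tensors, and the parameter buffer as a proxy for temporary memory); the paper's exact definitions follow [37] and may differ.

import numpy as np

def conv_features(in_h, in_w, in_c, out_c, k_h, k_w, stride, padding, dtype_bytes=4):
    """Interpretable features X = (FLOPs, mem, param_size) for a conv layer.

    A sketch using common conventions (2 ops per multiply-accumulate,
    float32 tensors); the paper's exact definitions may differ.
    """
    out_h = (in_h + 2 * padding - k_h) // stride + 1
    out_w = (in_w + 2 * padding - k_w) // stride + 1
    flops = 2 * out_h * out_w * out_c * (k_h * k_w * in_c)   # MACs x 2
    param_size = (k_h * k_w * in_c + 1) * out_c              # weights + bias
    mem_in = in_h * in_w * in_c * dtype_bytes
    mem_out = out_h * out_w * out_c * dtype_bytes
    mem_inter = param_size * dtype_bytes                     # proxy for temporaries
    mem = mem_in + mem_out + mem_inter
    return np.array([flops, mem, param_size], dtype=np.float64)

def fc_features(in_dim, out_dim, dtype_bytes=4):
    """X for a fully connected layer under the same conventions."""
    flops = 2 * in_dim * out_dim
    param_size = (in_dim + 1) * out_dim
    mem = (in_dim + out_dim) * dtype_bytes + param_size * dtype_bytes
    return np.array([flops, mem, param_size], dtype=np.float64)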

    Table 2: Layers parameters

    5.2 Multi-Layer Delay Prediction Model

In this section, we conduct a comprehensive study of the multi-layer delay prediction model for the convolutional and fully connected layers. The interpretable parameter vector X of the multi-layer delay prediction model includes the number of layers and the sums of floating-point operations, memory footprints and parameter scales. To perform multi-layer predictive analysis, we first generate DNN models with arbitrary numbers of layers and corresponding feature parameter combinations, then execute them on IoT devices with different computing capabilities to obtain the execution delay Y for any number of layers and any parameter configuration. After obtaining the [X, Y] data pairs, we establish the correlation model among device capabilities, task characteristics and execution delay, study a variety of common prediction models to fit the multi-layer input data to execution delay, and mine the mapping relationship between the various feature parameters and execution delay. The coefficient of determination R², mean squared error (MSE) and mean absolute percentage error (MAPE) are used as the accuracy metrics for the prediction models. We study linear regression (LR), RANdom SAmple Consensus regression (RANSAC), kernel ridge regression (KRR), k-nearest neighbors (KNN), decision tree (DT), support vector machine (SVM), random forest (RF), AdaBoost (ADA), gradient boosted regression trees (GBRT) and artificial neural network (ANN) models.
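A minimal sketch of the evaluation pipeline described above, using scikit-learn; only RF and GBRT are shown, and the hyperparameters are library defaults rather than the paper's settings.

from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_percentage_error

def evaluate_predictors(X, Y):
    """Compare multi-layer delay predictors with R^2, MSE and MAPE.

    X: accumulated interpretable features per sample (n_samples, n_features);
    Y: measured multi-layer execution delays. Default hyperparameters are
    used; the paper's settings are not specified here.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
    models = {
        "RF": RandomForestRegressor(n_estimators=100, random_state=0),
        "GBRT": GradientBoostingRegressor(random_state=0),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        print(f"{name}: R2={r2_score(y_te, pred):.3f} "
              f"MSE={mean_squared_error(y_te, pred):.4f} "
              f"MAPE={mean_absolute_percentage_error(y_te, pred):.3%}")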

Compared with the convolutional layer, the fully connected layer has a shorter execution time, fewer parameters and a small number of layers. For example, the AlexNet model contains only three fully connected layers, and the ResNet model contains only one. We show experimentally that the error between the overall execution delay and the sum of the individual execution delays of the fully connected layers is less than 2%. Therefore, we study only the single-layer prediction model for fully connected layers executed on different devices, and compare the prediction performance of different models on the fully connected layer. From Tab. 3, it can be seen that a variety of prediction models can accurately predict the execution delay of the fully connected layer. For the convolutional layer, the many types of input feature parameters, the wider configuration ranges, the varying numbers of execution layers and the complex coupling among feature parameters make delay prediction more difficult. We therefore add the ANN prediction model, because a neural network can effectively capture nonlinear relationships, has strong generalization and fitting ability, and can approximate the actual model without assuming a specific mapping between the feature variables and the result.

    Table 3: Comparative performance of different algorithms for single-layer

Taking the Raspberry Pi 3B as an example, Tab. 4 compares the performance of different multi-layer delay prediction models for the convolutional layer. It can be seen from Tab. 4 that the RF, GBRT and ANN prediction models perform better than the others. For example, compared with the RANSAC model and the ADA model, the MAPE of the ANN model is reduced by 43% and 81%, respectively. The experiments in Section 7 further verify the accuracy of these three multi-layer prediction models.

    Table 4: Performance comparison of the algorithms for conv multi-layer


    5.3 DNN Task Splitting Strategy Based on Evolutionary Reinforcement Learning

    1) Description

Reinforcement learning (RL) is an effective machine learning algorithm for decision making. An agent observes the state of the environment and learns which behaviors obtain better returns. At each time step t, the agent observes the current environment state s_t and chooses a behavior a_t according to the strategy π(a_t|s_t). The instantaneous reward r_t is obtained after the behavior is executed, the state transitions according to the environment's state transition probability, and the state becomes s_{t+1}. The goal of the agent is to obtain the optimal strategy that maximizes the cumulative discounted return R_t = Σ_{k=0}^{∞} γ^k r_{t+k}, where γ is the discount factor. Strategy learning is based on the behavior value function, which is defined as the expected cumulative discounted return obtainable from each state-behavior pair and is calculated as:

Q^π(s_t, a_t) = E_π[ Σ_{k=0}^{∞} γ^k r_{t+k} | s_t, a_t ]

The goal of reinforcement learning is to find the optimal strategy π* that maximizes the behavior value, which can be expressed as:

π* = arg max_π Q^π(s, a)

Deep reinforcement learning (DRL) [38] was proposed to overcome the curse of dimensionality. DRL uses a DNN to approximate the Q function, Q(s_t, a_t) ≈ Q(s_t, a_t | θ), where θ represents the model parameters of the neural network. Deep Q-network (DQN) is a typical DRL method [39]. DQN stores experience tuples in an experience pool; each time, a batch of samples is randomly drawn from the pool for training, and the parameters θ are updated to minimize the loss function. However, the back-propagation-based DQN method struggles with long-horizon optimization and has difficulty learning optimal behavior when rewards are sparse (a long series of behaviors is needed before any benefit is obtained). In addition, efficient exploration of high-dimensional action and state spaces remains a key open challenge, and convergence is difficult in this setting. In summary, DQN is a traditional DRL algorithm that faces important challenges such as sparse rewards, lack of effective exploration, and difficult convergence. Therefore, traditional DRL algorithms (such as DQN) cannot be directly applied to the IoT-CDI problem: the problem's behavior decomposes into sequences of sub-behaviors, rewards are sparse, the behavior-state space is huge, and convergence is very difficult. For this reason, the evolutionary reinforcement learning (ERL) algorithm [40] is adopted to realize DNN splitting and collaborative inference among heterogeneous devices.

2) DNN Task Splitting Strategy Based on ERL

From the DRL perspective, the device that determines the DNN splitting strategy is modeled as an agent. To reduce the dimensionality of the state and behavior spaces, the DNN splitting task is decomposed into a hierarchical sequence of subtasks, with each layer treated as a subtask. At each decision step, only the appropriate execution device for the current layer of the model needs to be selected. The per-layer behaviors are combined into an overall behavior set, and DNN task splitting and collaborative inference are performed according to this set. The DNN task execution delay serves as the return that measures the performance of the behavior set. We first define the basic elements of the problem: state, behavior and return.

1) State. At each time t, the state s_t contains 5 parts:

i) f_t represents the current layer number;

ii) com_t represents the current network status, i.e., the communication rate;

iii) c_t = {c_{1,t}, c_{2,t}, ..., c_{N,t}} represents the capability of each IoT device;

iv) l_t = {l_{1,t}, l_{2,t}, ..., l_{N,t}} represents the cumulative delay required for each IoT device to complete its pre-allocated subtasks;

v) e_t = {e_{1,t}, e_{2,t}, ..., e_{N,t}} represents the inference delay incurred if the current subtask is assigned to each IoT device. From the above, s_t = (f_t, com_t, c_t, l_t, e_t), and the state dimension is 3N + 2 (a sketch of this state encoding follows the list below).

2) Behavior. a_t selects one device from the N IoT devices to perform the current subtask.

3) Return. If the current subtask is the last one, the return is the overall inference delay of the DNN task (in the data-stream case, the return is the maximum delay required by any IoT device to perform its own tasks); otherwise the return is zero.
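To make the state encoding concrete, here is a minimal sketch of how s_t can be assembled into a (3N + 2)-dimensional vector; the numeric values in the usage example are hypothetical.

import numpy as np

def build_state(f_t, com_t, c, l, e):
    """Assemble s_t = (f_t, com_t, c_t, l_t, e_t) as a flat vector.

    c, l, e are length-N arrays (device capability, accumulated delay,
    and delay of the current subtask per device), so the state has
    3N + 2 entries, matching the dimension stated above. Illustrative only.
    """
    return np.concatenate(([f_t, com_t], c, l, e)).astype(np.float32)

# Hypothetical example with N = 3 devices at layer f_t = 4:
s = build_state(4, 18.88, c=[1.0, 1.4, 1.6], l=[0.9, 0.4, 0.0], e=[0.21, 0.15, 0.13])
assert s.shape == (3 * 3 + 2,)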

The back-propagation-based DRL algorithm has difficulty obtaining the optimal strategy for this problem because of challenges such as sparse rewards and difficult exploration. Compared with traditional DRL methods, ERL incorporates the population-based approach of natural evolution strategies, which enables diversified exploration and uses fitness indicators to learn and produce better offspring, so that multiple strategies can be explored effectively and continuously evolve toward high returns.

The ERL process is as follows. Evolution is applied to a population of candidate samples, and new offspring are continuously generated by adding random perturbations. Through the selection operation, offspring with higher fitness values have a greater chance of being retained and producing new offspring. The higher the fitness value, the better the performance, so selection yields a better-performing next generation. In this article, each sample represents a set of neural network parameters, and the random perturbation applied to an offspring is a random disturbance of the neural network's weights.

    The overall algorithm flow is shown in Algorithm 1.

Algorithm 1: ERL DNN algorithm
Input: random weights θ of the behavior value function Q, parent weights θ_p, number of children C, learning rate η
Output: parent weights θ_p
1: for episode = 1, 2, ..., E do
2:   Initialize state s
3:   for i in range C do
4:     θ_i = θ_p + noise_i
5:     Select the behaviors using θ_i and observe the return r_i
6:   end for
7:   Compute the average return r̄ and each offspring's gain g_i = r_i − r̄
8:   θ_p = θ_p + η × Σ_{i=1}^{C} g_i × θ_i
9: end for

In Algorithm 1, the parameters are initialized at the beginning; the rest describes how the neural network is updated during training. Specifically, the parent neural network generates C child networks by perturbing its parameters, and in each iteration evaluates the return obtained by each child, i.e., its fitness value. A child with a higher fitness value is selected with higher probability to produce offspring. The gain of each child is calculated by normalizing the difference between the child's return and the average return of all children. The parameters of the parent network are then updated according to the gains g of the C children (steps 3–9).
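As an illustration of Algorithm 1, the following Python sketch implements the evolution-strategy update it describes; the environment interface evaluate(), the noise scale sigma and all hyperparameter values are placeholders rather than the paper's settings. Because the gains g_i are mean-centred and sum to zero, updating with Σ g_i·θ_i (step 8) is equivalent to updating with Σ g_i·noise_i, as done below.

import numpy as np

def erl_train(evaluate, dim, episodes=2000, children=16, lr=0.05, sigma=0.1, seed=0):
    """Evolution-strategy update of Algorithm 1 (illustrative sketch).

    evaluate(theta) must run one episode of per-layer device selections with
    a policy parameterised by theta and return its total return (e.g., the
    negative overall inference delay); it stands in for the real environment.
    """
    rng = np.random.default_rng(seed)
    theta_p = rng.normal(scale=0.1, size=dim)                       # parent weights
    for _ in range(episodes):                                       # step 1
        noise = rng.normal(scale=sigma, size=(children, dim))
        theta_children = theta_p + noise                            # step 4
        returns = np.array([evaluate(t) for t in theta_children])   # step 5
        gains = returns - returns.mean()                            # step 7
        theta_p = theta_p + lr * (gains @ noise) / (children * sigma)  # step 8
    return theta_p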

    6 Proposed Framework

The overall process of the IoT-CDI framework is shown in Fig. 6 and includes two stages: offline training and online execution. The offline stage generates the multi-layer delay prediction model and completes the training of the ERL algorithm. The online stage dynamically determines the split positions and task allocation based on the system state, and multiple devices cooperate to execute the DNN task. Different DNN tasks have different topologies; the computation of each layer and the amount of intermediate data it produces differ; network status changes directly affect the data transmission delay; and heterogeneous device capabilities significantly affect the computation delay. The framework must therefore automatically adjust the DNN task splitting and allocation strategy based on these dynamic factors to effectively reduce inference delay. The IoT-CDI framework determines the split positions of the DNN model and the task assignment of each device according to the current system status, including communication status, device capabilities and DNN task requirements, realizing distributed, collaborative DNN task inference among heterogeneous devices. A master device (an IoT device or a gateway) is deployed to manage and control the entire process.

    6.1 Offline Training Phase

In this stage, the multi-layer delay prediction model and the ERL split strategy are trained. For the convolutional and fully connected layer types, a delay prediction model is built for arbitrary numbers of layers and parameter configurations, which allows the actual execution delay of an inference task to be evaluated accurately without executing the DNN task. Because different layer types, layer parameter configurations and numbers of layers lead to markedly different delays, separate prediction models are built for the convolutional and fully connected layer types. The number of layers and the parameter configuration of each layer are varied; these parameters determine the computation scale and data transmission scale; and the impact of different device capabilities on execution delay is analyzed under identical parameter configurations. Real measurement data relating parameter configuration, device capability and execution delay are obtained through experiments, and the prediction models are trained on these data. A variety of common prediction models are analyzed, covering regression, k-nearest neighbors, decision tree, ensemble and artificial neural network models. Experiments show that the fully connected layer has few parameter types, its prediction is relatively simple, and many models achieve accurate prediction. The convolutional layer has many parameter types and complex configurations, so prediction models with strong generalization and nonlinear fitting capability are more accurate. It is worth noting that, by mapping the model parameters to computation and transmission scales and analyzing the impact of device capabilities on execution delay, the proposed prediction model is independent of the DNN model, depends only on device capability, and can be adapted to heterogeneous devices. When the DNN model structure or parameters change, accurate execution delays can be obtained quickly from the prediction model, avoiding additional execution overhead.

Based on the generated multi-layer delay prediction model, the ERL algorithm is trained to obtain an approximately optimal DNN task splitting and collaborative inference strategy under dynamically changing DNN models, network status and device capabilities. The state information of the ERL model includes the model parameters, number of layers, communication status and device capabilities; the behavior strategy determines the execution device of each layer of the DNN model. After training for 2,000 episodes to reach convergence, the trained ERL model is stored on the master device, which then determines the best split strategy from the input system state.

Figure 6: Proposed framework

    6.2 Online Execution Phase

This stage includes three steps: 1) the system profiler obtains the current system status, including the DNN inference task, current communication status and device capabilities; 2) this information is fed to the decision maker, which uses the multi-layer delay prediction model trained offline to evaluate the inference delay of each candidate decision, and uses the ERL split model, also trained offline, to obtain the optimal split strategy, achieving DNN inference acceleration and full use of device resources among heterogeneous devices; 3) each device executes its assigned tasks according to the split strategy. A sketch of this loop follows.
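To make the three steps concrete, here is a schematic sketch of the online loop; profiler, delay_model, erl_policy and device_table are illustrative stand-ins for the framework's components, not APIs defined by the paper.

def online_phase(profiler, delay_model, erl_policy, device_table):
    """One pass of the online phase described above: profile -> decide -> dispatch.

    All objects are hypothetical stand-ins for the framework's components.
    """
    # Step 1: profile the current system state (task, bandwidth, capabilities).
    state = profiler.snapshot()

    # Step 2: the decision maker picks the split proposed by the offline-trained
    # ERL model and scores it with the offline-trained delay predictor.
    split = erl_policy.decide(state)                 # {device_id: [layer indices]}
    expected_delay = delay_model.estimate(state, split)

    # Step 3: push the assignment to every device's IP processing table.
    for device_id, layers in split.items():
        device_table[device_id].assign(layers)
    return split, expected_delay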

IoT devices need to communicate with each other to transmit commands and data. To identify devices effectively, each IoT device registers an IP address. Once the DNN task splitting and allocation strategy is known, each device maintains its own IP processing table, which records the inference tasks assigned to it and the predecessor and successor nodes of its tasks. The master device maintains the overall IP processing table, which records the tasks of every device. Once the system status changes, for example when the communication rate changes or a device joins or exits, the adjustment of the split strategy is triggered and the master node updates its records. The master node then distributes the updated IP processing table information to all devices, and each device modifies its own table accordingly.

The DNN inference process is executed according to the IP processing tables. An IoT device receives the input data required for its computation from its predecessor device and, after completing its assigned task, sends the generated output to its successor device. To realize this process, remote procedure calls (RPC) are deployed for inter-device interaction, allowing two devices to communicate and transmit data. Taking the VGG model as an example, suppose device 1 executes layers 1–5 and its successor, device 2, executes layers 6–10. After device 1 completes its assigned layers, it sends the generated output to device 2, and the two devices jointly execute the DNN task according to the strategy. When the environment status changes, the split and allocation strategy is adjusted by the ERL algorithm; for example, if device 1 now executes layers 1–7 and device 2 executes layers 8–10, the IP processing table of each device must be updated to reflect the modified task allocation and the new predecessor and successor nodes.
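As an illustration of the RPC-based handoff, the sketch below uses Python's standard xmlrpc module; the paper does not specify its RPC stack or message format, so the function names, port, IP address and pickle serialization here are all assumptions.

import pickle
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy, Binary

# --- Device 2 (successor): serves layers 6-10 of the split model ---
def run_assigned_layers(blob):
    """Deserialize device 1's intermediate feature map and continue inference."""
    features = pickle.loads(blob.data)
    # ... feed `features` through this device's assigned layers (6-10) ...
    return True

def serve_on_device2(host="0.0.0.0", port=8000):
    server = SimpleXMLRPCServer((host, port), allow_none=True)
    server.register_function(run_assigned_layers)
    server.serve_forever()

# --- Device 1 (predecessor): executes layers 1-5, then hands off ---
def hand_off(intermediate, successor_ip="192.168.1.12", port=8000):
    proxy = ServerProxy(f"http://{successor_ip}:{port}/")
    proxy.run_assigned_layers(Binary(pickle.dumps(intermediate)))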

    7 Experimental Verification

We verify the proposed IoT-CDI framework with real experiments. First, the proposed multi-layer delay prediction model is shown to be accurate. Then, compared with the benchmark algorithms, the proposed ERL method significantly reduces the inference delay and achieves inference acceleration. In addition, we evaluate the influence of factors such as communication status and the number of devices on performance.

    7.1 Experimental Setup

1) Device types. Three models of Raspberry Pi are used as heterogeneous IoT devices: Raspberry Pi 2B, Raspberry Pi 3B and Raspberry Pi 3B+, running the Raspbian GNU/Linux 10 (buster) operating system. Different Raspberry Pi models have different computing capabilities and thus provide differentiated inference performance. Their specifications are shown in Tab. 5. To execute DNN tasks on the Raspberry Pi, we install basic software and platforms such as Python 3.7.3, Keras 2.2.4 and TensorFlow 1.13.1.

    Table 5: Configuration of Raspberry Pi

2) DNN models. Five common DNN models are used: AlexNet, DarkNet, NiN, ResNet18 and VGG16. VGG16 represents a long DNN model (more layers), and AlexNet represents a short DNN model (fewer layers). The AlexNet and ResNet18 models are less computationally intensive, while the VGG16 and NiN models are more computationally intensive; however, the communication volume of the VGG16 model is relatively small, whereas that of the NiN model is relatively large.

3) Communication methods. The average transmission rate between IoT devices is used to simulate different wireless networks. The experiments use three network environments: a 3G network, WiFi and a 4G network, with transmission rates of 1.1, 18.88 and 5.85 Mbps, respectively.

4) Benchmark algorithms. We consider four comparison algorithms. The device-execution (DE) algorithm executes DNN tasks only on the local device that generates the task. The maximum-execution (ME) algorithm assigns DNN tasks to the most computationally capable device. The equal-execution (EE) algorithm distributes DNN tasks equally among all available devices. The short-execution (SE) algorithm applies the classic shortest-path Dijkstra algorithm to obtain the shortest execution delay from the first to the last layer of the DNN model, using a single-layer prediction model to determine the weight of each edge. The DE algorithm is used as the baseline against which the proposed ERL algorithm is evaluated.

7.2 Prediction Model Accuracy

Delay prediction data set: For the convolutional and fully connected layer types, different parameter ranges are set; the layer parameters and their ranges are shown in Tab. 2. Various configurable parameter sets are generated by random combination, and the configuration parameters are converted into interpretable variables such as floating-point operations, memory footprint and parameter scale. The execution delays of the three models of Raspberry Pi under different parameter settings are then measured, yielding a real delay measurement data set of parameter settings and execution delays for the convolutional and fully connected layers. For multi-layer delay prediction, multi-layer parameter configurations are generated following the design principles of real DNN models, with the number of layers ranging from 1 to 40. The parameter configuration of each layer is converted into interpretable variables, the accumulated interpretable variables are obtained by summing layer by layer, and multi-layer inference delays are obtained by execution on the three models of Raspberry Pi. Based on the multi-layer parameter configurations and measured inference delays, a variety of common prediction models are trained, with the convolutional and fully connected layers predicted separately. The prediction performance of the different models is shown in Tabs. 3 and 4.

We now verify the accuracy of the multi-layer delay prediction model for the convolutional layer, taking VGG16 and AlexNet as examples (Figs. 7 and 8, respectively). The histogram represents the actual execution delay measured experimentally; for example, an abscissa of 7 means that executing the first seven layers of the VGG16 model requires 6.08 s. The line graph shows the prediction performance of the different models, with MAPE as the evaluation metric. It can be seen from Figs. 7 and 8 that the RF, GBRT and ANN prediction models accurately predict the inference delay for any number of layers, with an average percentage error of less than 4% at every layer. The main reason for the accurate prediction is the precise description of the model parameters that affect inference delay and their mapping into explanatory variables such as computation scale and communication scale. These three prediction models have good fitting and generalization capabilities and can effectively capture the complex nonlinear relationship between the feature variables and the delay. In addition, the impact of device capability on inference delay is considered: a real data set of parameter configurations and execution delays is obtained for each device type, and a prediction model is trained per device, so that the inference delay of each device under different parameter settings can be accurately predicted.

Figure 7: Comparison of latency and accuracy using VGG16

Figure 8: Comparison of latency and accuracy using AlexNet

    7.3 Performance Comparison

1) DNN splitting. Fig. 9 shows the splitting strategies of three typical DNN models. It can be seen that the DNN splitting strategy varies with the DNN model and the number of devices. The VGG16 model involves a large amount of computation and a small amount of data transmission, so it tends to use more IoT devices to obtain better performance. The NiN model involves large amounts of both computation and data transmission, and excessive communication overhead degrades performance; therefore, the NiN model tends to use fewer cooperating devices to reduce communication overhead. The ResNet18 model involves a small amount of computation, so it must be considered whether the computational savings from collaborative inference can offset the added communication overhead; the split strategy of the ResNet18 model therefore has to weigh computational gain against communication cost. It follows that the DNN splitting strategy needs to be adjusted adaptively according to the characteristics of the DNN model and the environmental state.

2) Delay acceleration. We compare the delay acceleration of the five algorithms on different DNN models, with the number of devices set to three and the communication mode set to WiFi. It can be seen from Fig. 10 that the proposed ERL algorithm improves on the DE, ME, EE and SE algorithms to varying degrees.

Figure 9: Partitioning comparison of algorithms

Figure 10: Comparison of latency of the proposed and existing algorithms for different neural network models

As the computing demand increases, the performance improvement becomes more obvious. For example, with the VGG16 model, the delay acceleration of the ERL algorithm is about twice that of the DE algorithm. This is mainly because, given the limited resources of IoT devices, standalone execution performs poorly when the computational load is large, so the need for DNN task splitting is stronger. However, when the amount of data transmission is large, the higher communication delay caused by DNN task splitting seriously reduces the advantage of cooperative execution, so the delay acceleration is less pronounced for the NiN model.

Since a single IoT device cannot bear the heavy computational burden, the DE and ME algorithms perform poorly. Although the EE algorithm benefits from collaborative inference, its equal division is not the optimal split strategy. Because of the inaccuracy of single-layer prediction, the SE algorithm's performance is also not ideal. The proposed ERL algorithm effectively balances the computation and communication costs, makes full use of the heterogeneous capabilities of the devices, and achieves better DNN inference acceleration.

    7.4 Adaptability to Environmental Conditions

1) Influence of communication status. This experiment evaluates the influence of communication status on delay acceleration. The performance of the five algorithms under 3G, 4G and WiFi communication conditions is compared using the VGG16 model, as shown in Fig. 11. It is worth noting that as the communication rate increases, the performance of the proposed ERL algorithm improves more significantly than that of the benchmark algorithms. Under a 3G network, the communication conditions are poor and the computational gain from cooperative execution can hardly offset the communication cost of data transmission; therefore, for the VGG16 model, the EE algorithm performs worse than the DE algorithm. Under a 4G network, the delay acceleration of the ERL algorithm is 2.07 times that of the DE algorithm, and under WiFi it increases to 2.36 times. The main reason is that when communication conditions are good, the data transmission delay required by DNN splitting decreases, making the advantage of cooperative execution more pronounced.

Figure 11: Comparison of latency of the schemes under various communication networks

To further verify that the proposed ERL algorithm can adapt to various communication states, the communication rate is varied from 1 to 20 Mbps, and the delay acceleration of the different algorithms is compared using the VGG16 model. The experiments show that the proposed ERL algorithm performs best at every communication rate, reducing the inference delay by more than a factor of two. It can be seen from Fig. 12 that the delay acceleration becomes more pronounced as the communication rate increases, because a higher communication rate reduces the communication cost caused by splitting and thereby the overall execution delay. The DE, ME and EE algorithms cannot adjust the split strategy according to the network status, so their delay acceleration improves little as the communication rate increases. The proposed ERL algorithm and the SE algorithm can balance the communication and computation overhead according to the current network state, achieving significant inference acceleration as the communication rate increases and effectively reducing the inference delay of the DNN task.

Figure 12: Latency vs. data rate for the proposed and existing schemes

2) Influence of the number of devices. We deploy different numbers of IoT devices to evaluate the performance of the five algorithms. Taking the NiN model as an example, Fig. 13 shows that the proposed ERL algorithm achieves the best delay acceleration. When the number of devices is 2, 3 and 4, the delay acceleration of the proposed algorithm is 1.81 times, 1.98 times and 5.28 times that of the EE algorithm, respectively. Since the communication cost of the NiN model cannot be ignored, the EE algorithm cannot flexibly adjust the split strategy and struggles to weigh computation against communication cost. The proposed ERL algorithm intelligently determines the splitting strategy and obtains approximately optimal performance.

    7.5 Complexity Analysis

Fig. 14 compares the computational complexity of the proposed and existing algorithms. It can be seen from Fig. 14 that the complexity of all algorithms increases with the number of neurons per layer. However, the complexity of the proposed algorithm is lower than that of all the other algorithms, which makes it practical and effective for IoT-DNN deployment. As a result, the proposed algorithm outperforms the typical neural network approaches in terms of complexity.

Figure 13: Comparison of latency vs. number of IoT devices

    8 Discussion

1) Device heterogeneity. The experiments use different models of Raspberry Pi to reflect device heterogeneity. The configurations of the three Raspberry Pi models are shown in Tab. 5. Five DNN models, including AlexNet and DarkNet, are run on the three Raspberry Pi models, and the measured data are shown in Fig. 2. The experiments show obvious performance differences among the three device types, which reflect device heterogeneity. In follow-up work, we will consider additional device types such as Raspberry Pis, mobile phones and wearable devices, analyze the performance differences among them, model device capabilities at a fine granularity, and, on this basis, study cooperative inference among multiple types of devices.

    2) The number of devices. The capacity of a single IoT device is insufficient, and cooperative execution across multiple devices can effectively reduce the inference delay. However, increasing the number of cooperating devices reduces the computation delay while increasing the communication overhead. To prevent communication bottlenecks, the number of devices cooperatively executing a DNN model should therefore not be too large. Fig. 9 shows that for DNN models with a large data-transmission volume (such as the NiN model), only a few devices tend to be used even when more are available. Even for a DNN model with a small transmission volume and a large computation volume (such as the VGG16 model), the number of cooperating devices remains moderate.
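    This saturation effect can be reproduced with a toy delay model in which computation parallelizes across n devices while the transfer cost grows with n. The constants below are hypothetical, chosen only to contrast a transfer-heavy task with a compute-heavy one.

```python
# Toy model: total delay when n devices cooperate on one DNN task.
def total_delay(n: int, compute_s: float, xfer_mbit: float,
                rate_mbps: float) -> float:
    compute = compute_s / n                  # work divided across n devices
    comm = (n - 1) * xfer_mbit / rate_mbps   # transfer to each extra device
    return compute + comm

# Illustrative tasks: "NiN-like" = heavy transfer, "VGG16-like" = heavy compute.
for label, compute_s, xfer in [("NiN-like", 2.0, 8.0),
                               ("VGG16-like", 6.0, 2.5)]:
    best = min(range(1, 9),
               key=lambda n: total_delay(n, compute_s, xfer, 10.0))
    print(f"{label}: best number of devices = {best}")
```

    With these numbers, the transfer-heavy task prefers two devices while the compute-heavy task settles around five, consistent with the behavior discussed above.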

    3) The practicality of the IoT-CDI framework. The IoT-CDI framework mainly solves two problems: i) Because the error of existing single-layer prediction methods cannot be ignored, a fine-grained multi-layer prediction method is designed that can accurately estimate the inference delay of any DNN layer. ii) To cope with heterogeneous device capabilities, DNN task characteristics and dynamically changing network status, an intelligent decision-making algorithm based on reinforcement learning is adopted; to overcome problems such as sparse rewards and slow convergence, evolutionary reinforcement learning is used to obtain splitting strategies quickly. The proposed IoT-CDI framework uses a data-driven approach to achieve accurate predictive analysis and real-time intelligent decision-making. However, compared with traditional methods, it incurs noticeable system overhead (the prediction models must be stored) and faces scalability and online-adjustment challenges. Future work will focus on the practicality of the framework and on solving the problems that arise in actual deployment, so as to improve its feasibility.
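    For intuition only, the following simplified sketch mimics the evolutionary reinforcement learning idea: a population of candidate split strategies is scored by a simulated inference delay, the fittest half survives, and mutated copies refill the population. The delay function and mutation scheme are stand-ins, not the paper's actual ERL algorithm; in the full framework the fitness would come from the multi-layer delay predictor.

```python
import random

NUM_LAYERS = 10

def simulated_delay(split: int) -> float:
    """Stand-in fitness: a convex delay curve whose minimum is at layer 6."""
    return 0.2 + 0.01 * (split - 6) ** 2

def evolve(pop_size: int = 20, generations: int = 30) -> int:
    # Population of candidate split points (the "policies").
    pop = [random.randrange(NUM_LAYERS) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=simulated_delay)        # lower delay = fitter
        elite = pop[: pop_size // 2]         # keep the best half
        # Refill with mutated copies of elite strategies (+/- one layer).
        children = [min(NUM_LAYERS - 1, max(0, p + random.choice((-1, 0, 1))))
                    for p in random.choices(elite, k=pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=simulated_delay)

print("approximate best split point:", evolve())
```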

    9 Conclusion

    This paper proposes a novel IoT-CDI framework that enables IoT devices to execute DNN tasks collaboratively. Taking into account DNN task requirements, device capabilities and a variety of other factors such as power and network status, the framework realizes real-time adaptive collaborative inference of DNN tasks among heterogeneous IoT devices. Specifically, a multi-layer delay prediction model covering different layer types, parameter configurations and device capabilities is proposed, which can accurately predict the inference delay of DNN tasks under different splits. In addition, an intelligent DNN task splitting and collaborative inference algorithm based on evolutionary reinforcement learning is proposed, which obtains an approximately optimal strategy under heterogeneous and dynamically changing device capabilities, network status and task requirements. The experimental results show that the proposed algorithm effectively balances communication and computation delay, makes full use of the devices' computing power, and significantly reduces the DNN inference delay. In the future, we will further study the optimal number of cooperating devices required by different DNN models; when enough devices are available, the number of cooperating devices can be adjusted adaptively according to the DNN task requirements to achieve optimal inference acceleration. Future work will also address challenges such as scenarios in which the number of IoT devices is enormous and the inference delay is variable and accumulates.

    Acknowledgement: The authors would like to thank the editors and reviewers for their review and recommendations.

    Funding Statement:The authors received no specific funding for this study.

    Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.
