
    Transformer-Aided Deep Double Dueling Spatial-Temporal Q-Network for Spatial Crowdsourcing Analysis


Yu Li, Mingxiao Li, Dongyang Ou, Junjie Guo and Fangyuan Pan

Department of Computing, Hangzhou Dianzi University, Hangzhou, 310018, China

ABSTRACT With the rapid development of the mobile Internet, spatial crowdsourcing has become increasingly popular. Spatial crowdsourcing comprises many different types of applications, such as spatial crowd-sensing services. In terms of spatial crowd-sensing, it collects and analyzes traffic sensing data from clients such as vehicles and traffic lights to construct intelligent traffic prediction models. Besides collecting sensing data, spatial crowdsourcing also includes spatial delivery services such as DiDi and Uber. Appropriate task assignment and worker selection dominate the service quality of spatial crowdsourcing applications. Previous research conducted task assignment via traditional matching approaches or simple network models. However, advanced mining methods are lacking to explore the relationships between workers, task publishers, and the spatio-temporal attributes of tasks. Therefore, in this paper, we propose a Deep Double Dueling Spatial-temporal Q Network (D3SQN) to adaptively learn the spatial-temporal relationships between tasks, task publishers, and workers in a dynamic environment to achieve optimal allocation. Specifically, D3SQN is revised through reinforcement learning by adding a spatial-temporal transformer that can estimate the expected state values and action advantages so as to improve the accuracy of task assignments. Extensive experiments are conducted over real data collected from DiDi and ELM, and the simulation results verify the effectiveness of our proposed models.

KEYWORDS Historical behavior analysis; spatial crowdsourcing; deep double dueling Q-networks

    1 Introduction

With the development of intelligent transportation, spatial crowdsourcing has attracted more and more attention. Various applications of spatial crowdsourcing have appeared, such as spatial crowd-sensing, which collects and analyzes traffic data from edge clients (e.g., vehicles, traffic lights) to help predict and resolve traffic jams. Besides, spatial crowdsourcing platforms like DiDi and Uber also utilize edge vehicles to provide intelligent services. For these spatial crowdsourcing applications, choosing the appropriate crowdsourcing worker is an essential task. Therefore, in this paper, we study how to assign appropriate workers (i.e., spatial vehicles and drivers) to spatial delivery tasks on spatial crowdsourcing platforms by analyzing workers' historical behavior data.

Similar to traditional crowdsourcing platforms, spatial crowdsourcing platforms comprise three components: the task/request, the worker, and the platform. However, the tasks released on spatial crowdsourcing platforms are spatial tasks with spatio-temporal attributes. Spatial delivery tasks have spatio-temporal constraints, such as start and target locations, start times, and deadlines. A spatial delivery task is completed only if the requester is picked up at the source location within the requested time and successfully delivered to the target location before the deadline. On spatial crowdsourcing platforms, the positions of workers and requesters may change dynamically; in particular, the spatio-temporal attributes of workers are always changing while tasks are being completed. This paper focuses on common spatial delivery tasks in daily real-time ride-hailing services, such as Uber and DiDi Chuxing.

In terms of spatial delivery task assignment on spatial crowdsourcing platforms, the key is to recommend a suitable task list to workers in a dynamic environment to maximize the benefits of workers, requesters, and the platform. Traditional task assignment approaches for spatial crowdsourcing platforms mainly utilize matching approaches. However, with the explosive growth of vehicles in intelligent transportation, task allocation should take into account not only location matching information but also workers' preferences. The preferences of workers and requesters have a great impact on task completion and can be critical to the user experience in real applications. For instance, DiDi Chuxing needs to serve 25 million ride requests a day and has more than 21 million registered drivers (i.e., workers). Some requesters may choose a female driver for safety, while others may choose a male driver for speed. Some drivers may prefer delivery orders through downtown to get more future orders, while other drivers may prefer nearby orders to avoid traffic jams. In addition, we observed that the preferences of requesters and workers may change over time, making the presetting of preferences impractical in real-world applications [1-3].

To deal with the complex preferences of requesters and workers, neural networks have been utilized to find appropriate task assignments. Shan et al. [4] presented a state-of-the-art study that proposed a deep reinforcement learning model to solve the task scheduling problem on traditional non-spatial crowdsourcing platforms. However, task assignment of spatio-temporal delivery tasks is more complex since it is closely related to the spatio-temporal attributes of requesters and workers, and the constraints of task completion are also complex. The deep Q-network in [4] cannot deal well with the input types of spatio-temporal tasks and workers; moreover, it cannot extract the interrelation between workers, tasks, and the workers'/requesters' preferences. As a result, it cannot recommend appropriate spatial delivery tasks to spatial crowdsourcing workers, resulting in poor revenue for the entire platform.

To apply reinforcement learning models to task assignment on spatial crowdsourcing platforms, we refine the deep Q-network model in [4] by revising the architecture of the deep Q-Network, including a State Transformer to process spatio-temporal input information. We propose a Double Dueling Deep Spatial Q Network (D3SQN) based on a deep reinforcement learning framework specifically for spatio-temporal crowdsourcing task scheduling. In detail, we model the interaction between the spatio-temporal crowdsourcing environment (workers and requests) and the agent (platform) as a new Markov decision process (MDP), taking into account the longitude and latitude information of tasks and workers. We apply our proposed D3SQN network to estimate the reward for recommending each task to an upcoming worker. D3SQN takes into account both current and future returns in the online environment, as well as workers' and requesters' preferences and how these preferences interrelate with the spatio-temporal properties of tasks. Applying reinforcement learning models to utilize user preferences on spatial crowdsourcing platforms has not been studied in previous literature, but this information is very important, as the completion degree and satisfaction of the next task in the spatio-temporal crowdsourcing scenario are strongly related to it. Our contributions can be summarized as follows:

· As far as we know, we are the first to apply deep reinforcement learning to assign tasks on spatial crowdsourcing platforms, and our proposed D3SQN can handle both current and future rewards to achieve long-term optimal allocation. In addition, the Q-value is quantified using the user's historical behavior, so that the model can reflect the user's preference adaptively.

· In addition to using the structure of Double DQN, we also modify the loss function and include a state transformer, which further improves the overall performance.

· We use real datasets to demonstrate the effectiveness and efficiency of our framework.

    2 Related Work

In this section, the most related research is discussed.

    2.1 Deep Q Networks and Transformer

Deep Q Network (DQN) [5] is a combination of deep learning and reinforcement learning; that is, a neural network is used to replace the Q table in Q-learning. In addition to being widely used in many application scenarios, more and more improved models have been proposed to address the defects of DQN [5]. For example, in DDQN [6], a target network is added on top of DQN, which can reduce overestimation to some extent. D3QN uses the Dueling Network [7] architecture on top of DDQN. It uses the network to express two estimators, namely the state value function and the state-dependent action advantage function. This factorization generalizes learning across actions. Rainbow [8] combines six extended improvements to the DQN [5] algorithm, including DDQN [6], Dueling DQN [7], Prioritized Experience Replay [9], Multi-step Learning [10], Distributional RL [11], and Noisy Net [12].
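As a brief illustration of the double-estimator idea behind DDQN, the following minimal sketch (assuming a generic PyTorch online/target network pair; this is our illustration, not any of the cited implementations) shows how the online network selects the greedy next action while the target network evaluates it:

```python
import torch

def double_dqn_target(online_net, target_net, reward, next_state, done, gamma=0.99):
    """Compute the Double DQN target: action selection by the online network,
    action evaluation by the target network, to reduce overestimation."""
    with torch.no_grad():
        next_action = online_net(next_state).argmax(dim=1, keepdim=True)  # select
        next_q = target_net(next_state).gather(1, next_action).squeeze(1)  # evaluate
        return reward + gamma * (1.0 - done) * next_q
```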

EndToEnd [4] proposed a deep reinforcement learning (RL) framework for task scheduling, which is a key issue for the success of crowdsourcing platforms. However, the original DQN [5] model is still used in EndToEnd [4], and the original DQN [5] and its variants are not well adapted to the spatio-temporal crowdsourcing problem.

The transformer [13] has excelled in a wide variety of areas, including language modeling [14], summarization [15], question answering [16], and machine translation [17]. The transformer made a significant breakthrough in reinforcement learning (RL) with the work of Parisotto et al. [18]. The GTrXL architecture proposed in that work can learn dependencies beyond a fixed length without breaking temporal coherence. This allows it to make predictions using current input trajectories plus past trajectories. Moreover, Transformer-XL introduces a new relative position encoding scheme that not only learns longer-term dependencies but also addresses context fragmentation. However, GTrXL cannot cope well with the input in the spatio-temporal crowdsourcing environment because of the permutation invariance of the input.

2.2 Spatial Crowdsourcing

Spatio-temporal crowdsourcing is a process in which a group of crowdsourcing tasks with spatio-temporal attributes is given to a group of workers. The spatio-temporal attributes of the workers must meet the spatio-temporal constraints of the tasks before they can perform the corresponding tasks. Research on spatial crowdsourcing has been very popular in recent years. Hassan et al. [19] defined an online spatial task allocation framework based on a formalization of the multi-armed bandit problem to solve the online spatial task allocation problem. Wang et al. [20] designed a new adaptive batch-based framework with a constant competitive ratio to solve the dynamic bipartite graph matching (DBGM) problem. Liu et al. [21] proposed a two-stage solution to the on-demand food delivery problem in FooDNet. Cheng et al. [22] proposed two optimization methods, task-priority greedy (TPG) and game theoretic (GT), to solve the cooperation-aware spatial crowdsourcing (CA-SC) problem. Shan et al. [4] proposed a deep reinforcement learning (RL) task assignment framework and used DQN for task recommendation on traditional non-spatial crowdsourcing platforms. However, their model cannot work well for spatial delivery tasks directly.

In summary, none of the existing models works well for spatial delivery task assignment. Inspired by the existing literature, we propose a variation of D3QN that applies the transformer to reinforcement learning to solve spatio-temporal crowdsourcing task assignment problems.

    3 Problem

    3.1 Problem Summary

The goal of the task scheduling system in spatio-temporal crowdsourcing is to recommend a sorted list of tasks to an arriving valid worker. As the platform's profit model is to complete tasks on commission, the system should satisfy both workers and requesters.

For each worker, as many suitable tasks as possible should be found (within the acceptable time and space of the worker). Requesters expect their tasks to be served as efficiently and comfortably as possible. That is, the task is picked up at the designated location before the expected start time of the task, and the task is completed at the destination as early as possible before the specified time.

In addition, since tasks and workers change dynamically, the system should cope with dynamically changing sets of workers and requesters, as well as their various and changeable preferences, globally and in real time.

    3.2 Problem Definition

In this section, we formally define our spatio-temporal crowdsourcing task allocation problem.

On the spatio-temporal crowdsourcing platform, a worker is represented by wi, where i is the moment when the worker goes online, together with the worker's coordinate at that moment, the maximum time the worker can wait to be allocated a task, and the maximum distance radius between the worker's coordinate and the starting coordinate of any task the worker accepts. The worker is marked as invalid while performing a task and is reset to valid when the task is complete.

Unlike commercial crowdsourcing, a spatio-temporal crowdsourcing task is considered successful even if the worker picks up the order after the estimated start time or completes it after the estimated completion time, as long as the worker completes both the pick-up and delivery actions. However, both late pickup and late delivery affect the quality of task completion, reducing both requesters' satisfaction and workers' reward. Therefore, this paper studies the case of pickup before the expected time. Thus, a task is only completed if the worker picks up the request at its starting coordinate before its expected start time and successfully delivers it to the destination coordinate before its deadline.

In summary, tasks that conform to the following constraints are optional tasks for wi (a small filtering sketch follows this list):

· The task is within the acceptable range of the worker, i.e., the distance between the worker's coordinate and the task's starting coordinate does not exceed the worker's maximum radius.

· The task appears before the worker leaves, i.e., its release time is no later than i plus the worker's maximum waiting time.

    · The worker can reach the task's starting point before the expected start time of the task, i.e., i plus the travel cost time from the worker's coordinate to the starting point does not exceed the expected start time.
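To make these constraints concrete, the following is a minimal filtering sketch; the field names (location, radius, max_wait, start_location, release_time, expected_start) and the travel_time callable are our assumptions, not the paper's notation:

```python
import math

def available_tasks(worker, tasks, now, travel_time):
    """Return the tasks a worker arriving at time `now` may be assigned,
    following the three constraints of Section 3.2 (assumed field names)."""
    selected = []
    for task in tasks:
        within_radius = math.dist(worker.location, task.start_location) <= worker.radius
        released_in_time = task.release_time <= now + worker.max_wait
        reachable_in_time = now + travel_time(worker.location, task.start_location) <= task.expected_start
        if within_radius and released_in_time and reachable_in_time:
            selected.append(task)
    return selected
```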

    4 System Overview

We model the task scheduling problem in spatio-temporal crowdsourcing as a reinforcement learning problem. When the spatio-temporal crowdsourcing platform (the agent) interacts with requesters and workers (the environment), requesters influence the pool of available tasks by setting the start and end times and start and end coordinates of tasks, and they obtain the results after each task is completed. The agent recommends tasks to arriving workers, and workers influence the agent through their completion of the tasks. Following the end-to-end MDP setting, since workers and requesters have different optimization goals, we use two Markov decision processes (MDPs) to optimize for workers and requesters separately, and finally combine them for simultaneous optimization.

For an arriving worker wi at timestamp i, its available tasks Ti are obtained. The features of wi and all tasks in Ti constitute the state feature si. With si, Q-Network (W) and Q-Network (R) compute the Q-values, which are aggregated to determine the action ai. ai recommends a sorted list of tasks, and the worker selects one accordingly. The action's feedback as well as the corresponding reward ri are passed to the predictor for the future state estimation si+1. Finally, ri, ai, si and si+1 are stored in the Memory for the Learner to train the Q-Networks continuously.

Fig. 1 illustrates the system framework of the overall process. At timestamp i, requesters release tasks on the platform and a driver wi comes online (Step 1). We select from the task pool a set of available tasks T* for this driver, whose length is less than or equal to maxT, according to the constraints above (Step 2). The features of driver wi and the location information of each task in T* are encoded by geohash with one-hot coding or word2vec embeddings; the resulting task feature vectors and the driver's feature vector are concatenated and padded to form si, the state vector of the platform (Step 3). Then, we input si into two D3SQN networks, namely D3SQN-Network (W) and D3SQN-Network (R), which consider the worker's benefit Qw(si, ai) and the requester's benefit Qr(si, ai), respectively, and predict the Q-value of each possible action ai in state si, where ai represents which task the user chooses. Each network outputs a vector of size maxT × 1, where each value represents the Q-value of the corresponding task in T* (Step 4). An aggregator then combines the two network outputs Qw(si, ai) and Qr(si, ai) to produce the final task list σwi = {Tj1, Tj2, Tj3, ...}; the aggregator performs a weighted combination of the two output values to balance workers and requesters, with a default weight of 0.5, and the tasks are sorted in descending order of the aggregated scores Q(si, ai) to form the recommended task list for the driver (Step 5). When wi sees the sorted task list, we assume that the driver follows a cascade model to view the list and completes the first task of interest, i.e., the task with the largest Q-value becomes the driver's action ai at this moment. The feedback consists of the completed task and the unfinished tasks suggested to wi (Step 6). The feedback is then quantified into the reward ri (Step 7). The predictor postulates that the outcome of D3SQN determines the ultimate action of the present worker; it then constructs the following state, assuming the current sequence is chosen, by aggregating the set of available tasks for the next incoming worker and the relevant details of that worker, forming the resultant future state si+1 (Step 8). Then, the tuple (si, ai, ri, si+1), where ai is the completed task, is stored in the memory pool, which is used for storing training data (Step 9). Each time an extra tuple is stored in the memory pool, we use learners to update the parameters of the two D3SQN Networks, obtain good estimates of Qw(si, ai) and Qr(si, ai), and derive the optimal strategy (Step 10).
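Steps 4-10 amount to a fairly standard aggregation-and-replay loop; the sketch below is our simplified illustration, not the authors' code, and it assumes the two Q-networks are callables returning a 1-D score tensor and that a separate learner object performs the parameter update:

```python
import random
from collections import deque

import torch

class Scheduler:
    """Simplified Steps 4-10: score tasks with two Q-networks, aggregate,
    rank, store the transition, and trigger training."""
    def __init__(self, q_net_w, q_net_r, alpha=0.5, buffer_size=1200, batch_size=128):
        self.q_net_w, self.q_net_r, self.alpha = q_net_w, q_net_r, alpha
        self.memory = deque(maxlen=buffer_size)   # memory pool (Step 9)
        self.batch_size = batch_size

    def recommend(self, state):
        q_w = self.q_net_w(state)                             # Step 4: worker-side Q-values
        q_r = self.q_net_r(state)                             #         requester-side Q-values
        scores = self.alpha * q_w + (1 - self.alpha) * q_r    # Step 5: weighted aggregation
        return torch.argsort(scores, descending=True)         # ranked task indices

    def observe(self, state, action, reward, next_state, learner):
        self.memory.append((state, action, reward, next_state))        # Step 9
        if len(self.memory) >= self.batch_size:                         # Step 10
            learner.update(random.sample(self.memory, self.batch_size))
```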

In the next section, we introduce the feature construction of these parts of the system in detail. The model is introduced in Section 6.

    5 Feature Construction

where γ1 ∈ [0, 1] is the attenuation factor. In order to capture the long-term and short-term preferences of workers, we attenuate the historical task information so that short-term interests account for more while the long-term preferences of workers can still be reflected.
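One common way to realize such attenuation is an exponential decay by γ1 over the age of each historical task; the sketch below is written under that assumption and is only illustrative of the weighting idea, not the paper's exact formula:

```python
def attenuated_history(task_vectors, gamma1=0.9):
    """Weight each historical task feature vector by gamma1 ** age so that
    recent tasks (short-term interest) dominate while older tasks still
    contribute. `task_vectors` is assumed ordered from oldest to newest."""
    n = len(task_vectors)
    weighted = []
    for age, vec in zip(range(n - 1, -1, -1), task_vectors):
        weighted.append([gamma1 ** age * x for x in vec])
    return weighted
```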

When wi comes online and seeks task orders at time i, the system selects the available task list for wi according to the constraints in Section 3.2. Then, the feature vectors of the tasks in the available list and the feature vector of the worker are concatenated to obtain the feature representation vector si of the state at time i. Since the number of available tasks varies at different timestamps, we set a maximum number of available tasks maxT and use zero padding, that is, appending zeros to the end of si so that each si has a fixed length, where σ(wi) is the available task set of wi.
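A minimal sketch of this fixed-length state construction, assuming each task and the worker are already encoded as feature vectors of dimension d (names are ours):

```python
import numpy as np

def build_state(worker_vec, task_vecs, max_t, d):
    """Concatenate the worker feature vector with up to max_t task feature
    vectors and zero-pad to a fixed length, as described in Section 5."""
    padded = np.zeros((max_t, d), dtype=np.float32)
    for j, vec in enumerate(task_vecs[:max_t]):
        padded[j] = vec
    return np.concatenate([np.asarray(worker_vec, dtype=np.float32), padded.reshape(-1)])
```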

    6 Double Dueling Deep Spatial Q Network(D3SQN)

In this section, we introduce our proposed Q-learning-based deep reinforcement learning network model designed specifically for spatio-temporal crowdsourcing.

As shown in Fig. 2, based on the system state vector si at time i, the optimal allocation of spatio-temporal tasks is achieved through D3SQN. Specifically, this model uses the architecture of the Dueling Network and two Spatial State Transformers to respectively predict the state Q-value Qsi and the advantage feature vector Va composed of the advantages of each action. Finally, Qsi and Va are combined into the final action value vector.


Figure 2: Double dueling deep spatial Q network (D3SQN) framework and details. (a) shows the framework of D3SQN. The network architecture is divided into two parts: the upper spatial state transformer is used to predict the action advantage value Va, and the lower spatial state transformer is used to predict the state value Qsi. (b) shows the specific details of the encoder and decoder of the spatial state transformer

The states that need to be processed by our proposed D3SQN under spatio-temporal crowdsourcing are significantly different from those handled by traditional DQN networks.

Most traditional DQNs, including their variants, are suited to inputs of temporally or sequentially related feature vectors. For example, in a game scenario, the state may be represented by a matrix of consecutive frames, and the ordering of these frames and the distribution of pixels inside them are fixed. Once the distribution of these pixels changes, the prediction results of the network are affected. However, in our spatial crowdsourcing environment, the state is the concatenation of an available task list and worker feature vectors, which is a set feature: the ordering of the available tasks does not affect the final predicted action. It is difficult for traditional DQN and its variants to mine and learn the complex and dynamic spatio-temporal correlations in such permutation-invariant inputs. Because this permutation-invariant set feature is a global feature, the tendency of CNNs and traditional neural network architectures to extract only local structural features limits their ability to extract such global high-dimensional features. The Spatial State Transformer is better able to handle high-dimensional and global information, mainly because the structure of the Transformer itself is better suited to extracting this information. Therefore, this paper proposes the Spatial State Transformer, a new deep reinforcement learning network model for spatial-temporal crowdsourcing. Its prediction results are not affected even when it faces a state vector whose element order has changed.

We use the structure of Double DQN and the Dueling Network to alleviate the overestimation problem in Q-learning. Next, we introduce the details and functions of the Dueling Network and the Spatial State Transformer, respectively.

    6.1 Dueling Network

When we recommend a task set to a worker, the tasks in the optional task set are all tasks that the worker may take next, and each task represents an action. When several actions have redundant or approximately equal Q-values, the original network prefers to choose the first task; but in fact, different actions lead to different states, and we may fail to escape a suboptimal action in the current state. We therefore adopt the structure of the Dueling Network: instead of relying solely on the value of the action, the state value is predicted separately, which improves the performance of the network model.

Using the structure of the Dueling Network, the original direct prediction of action values is divided into a prediction of the state value and of the advantage value of each action. This effectively mitigates the overestimation problem of Q-learning. The traditional DQN approximates Q*(s, a) = maxπ Qπ(s, a) with a network, which represents the value function of the optimal action a under policy π in state s. The architecture of the Dueling Network separates state from action to a certain extent, which is formalized as follows:

    Q(s, a) = V(s) + A(s, a) − (1/|A|) Σa′ A(s, a′)

As can be seen in Fig. 2a, the Dueling Network in our proposed D3SQN consists of two Spatial State Transformers, which are formalized as follows:
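As an informal complement to this formalization, a minimal sketch of the standard dueling combination of a state-value stream and an advantage stream, with the two spatial state transformers stubbed out as generic modules (our illustration, not the authors' code), could look as follows:

```python
import torch
import torch.nn as nn

class DuelingHead(nn.Module):
    """Combine a state-value stream V(s) and an advantage stream A(s, a)
    into Q(s, a), following the dueling architecture of [7]."""
    def __init__(self, value_net: nn.Module, advantage_net: nn.Module):
        super().__init__()
        self.value_net = value_net          # e.g., one spatial state transformer
        self.advantage_net = advantage_net  # e.g., the other spatial state transformer

    def forward(self, state):
        v = self.value_net(state)                      # shape: (batch, 1)
        a = self.advantage_net(state)                  # shape: (batch, maxT)
        return v + a - a.mean(dim=1, keepdim=True)     # Q(s, a)
```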

    6.2 Spatial State Transformer

In this section, we introduce the Spatial State Transformer: an attention-based neural network for processing state vectors under spatial crowdsourcing.

As shown in Fig. 2a, the Spatial State Transformer consists of an Encoder and a Decoder. Our motivation for designing the Spatial State Transformer is that self-attention allows the extraction of key global associations in the set data, as well as spatio-temporal information associations. This makes the Spatial State Transformer a more effective and efficient way to encode the entire set simultaneously. At the same time, our model needs to extract high-level feature representations in order to use the self-attention mechanism to extract associations between information, which is why our model is designed with an Encoder and Decoder architecture. The Encoder and Decoder are attention-based spatial set operation modules defined by us. The main purpose of the Encoder is to represent the permutation-invariant state input, independently encoding a set of available tasks of variable size and each element within it. The Decoder aggregates these encoded features and uses the self-attention mechanism to extract associations between high-level representations. All the blocks described here are neural network blocks with their own parameters, not fixed functions.

We made improvements by referring to the structure of the SetTransformer, without retaining its original pooling operation, and adapted the Encoder and Decoder modules for reinforcement learning. We move the LayerNorm layer to after the feature input, instead of before it as the Transformer usually does, and replace the usual residual connection with a GatingLayer. In this way, the learning and training of the Transformer structure in RL can be stabilized. We use the structure of the SetTransformer but do not retain the pooling operation, because the pooling operation in the SetTransformer aims to extract correlations between features across multiple dimensions. However, the state representation in spatio-temporal crowdsourcing is different from 3D data, which is three-dimensional set data strongly correlated across multiple dimensions. The previous SetTransformer architecture mainly targets such low-dimensional permutation-invariant set data. What we need to mine is the relationship between the features of each element in each list (including task and worker address information), so trying to increase the dimensionality for learning is counterproductive. We only need to use the feature representation produced by the Encoder and use the Decoder to aggregate it to finally obtain the prediction result.

In the remainder of this section, we describe the details of the individual modules.

Decoder: As seen in Fig. 2b, the Decoder module is designed to extract various aspects of the spatio-temporal features obtained from the Encoder through self-attention. By stacking multiple Decoders, we can extract deeper spatio-temporal interaction information from the input set. The Decoder structure includes layer normalization, multi-head self-attention, and a gating layer. We also use an identity map reordering technique inspired by GTrXL [18] to facilitate policy optimization; it initializes the agent in a manner similar to a Markov policy/value function.

Identity Map Reordering: We applied a modification called Identity Map Reordering to the Spatial Temporal Transformer, inspired by the GTrXL [18] design. This involves rearranging the order of LayerNorm and applying a ReLU activation to the output of each submodule before joining it with the residual connection. This reordering enables an identity mapping from the input of the Transformer at the first layer to the output at the last layer, which facilitates policy optimization in reinforcement learning. Specifically, this allows the agent to learn reactive actions first before focusing on memory-based behavior. In the spatio-temporal crowdsourcing environment, this also allows the agent to learn to select actions before using memory, even if the experience has not yet entered the memory. This modification improves the efficiency of learning and enables the agent to effectively utilize memory to improve performance.

Gated-Recurrent-Unit-Type Gating: We use an explicit initialization of the GRU gating mechanism to approach the identity map. The gated recurrent unit (GRU) is a recurrent network that behaves like an LSTM but with fewer parameters. The formalization is as follows:

where bg is the bias in the applicable gating layer. Initially setting bg > 0 can greatly improve the learning speed.
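A hedged sketch of a GRU-type gating layer with a learnable bias bg initialized above zero, in the style of the GTrXL gating the paper adapts (the exact parameterization here is an assumption, not the paper's equation):

```python
import torch
import torch.nn as nn

class GRUGate(nn.Module):
    """GRU-style gating between the residual stream x and a sublayer output y.
    The bias bg > 0 biases the gate toward the identity path early in training."""
    def __init__(self, dim, bg=2.0):
        super().__init__()
        self.wr, self.ur = nn.Linear(dim, dim, bias=False), nn.Linear(dim, dim, bias=False)
        self.wz, self.uz = nn.Linear(dim, dim, bias=False), nn.Linear(dim, dim, bias=False)
        self.wg, self.ug = nn.Linear(dim, dim, bias=False), nn.Linear(dim, dim, bias=False)
        self.bg = nn.Parameter(torch.full((dim,), bg))

    def forward(self, x, y):
        r = torch.sigmoid(self.wr(y) + self.ur(x))          # reset gate
        z = torch.sigmoid(self.wz(y) + self.uz(x) - self.bg)  # update gate, shifted by bg
        h = torch.tanh(self.wg(y) + self.ug(r * x))          # candidate state
        return (1.0 - z) * x + z * h                          # gated residual output
```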

Encoder: The Decoder architecture in the Spatial State Transformer eliminates position encoding, enabling the network to extract mutual relationships between set elements in the state embedding. However, using the Decoder directly results in quadratic time complexity, which is not practical for large-scale set-structured datasets. To address this issue, we introduce trainable inducing points I into the Encoder architecture, which reduces the time complexity. An Encoder with m inducing points I is defined as follows:

The Encoder first transforms I through multi-head attention over the input set, then processes the result together with the input set X through the GRU gating, and finally generates a set containing N elements.

The Encoder's objective is to extract a high-dimensional representation of the set state while mitigating the computational complexity resulting from the set's size. In this study, the Encoder is responsible for extracting an attention-based spatio-temporal state embedding representation, which serves as a basis for discerning the spatio-temporal relationships between Workers and Requesters. Attention is computed between sets of size m and n, so the Encoder's time complexity is O(mn), a substantial improvement over the quadratic complexity of the Decoder. Both set operations (Encoder and Decoder) are permutation invariant.
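To illustrate how m trainable inducing points reduce the attention cost from O(n²) to O(mn), here is a simplified sketch using standard multi-head attention; the paper's actual Encoder additionally uses the GRU gating described above rather than this plain form, so this is only an assumption-laden approximation:

```python
import torch
import torch.nn as nn

class InducedEncoder(nn.Module):
    """Set encoder with m trainable inducing points I: the n set elements
    attend to m learned summaries instead of to each other, giving O(mn)."""
    def __init__(self, dim, num_inducing=16, num_heads=4):
        super().__init__()
        self.inducing = nn.Parameter(torch.randn(num_inducing, dim))
        self.attn_i = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_x = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                        # x: (batch, n, dim)
        i = self.inducing.expand(x.size(0), -1, -1)
        h, _ = self.attn_i(i, x, x)              # inducing points summarize the set
        out, _ = self.attn_x(x, h, h)            # set elements read the summaries
        return out                                # (batch, n, dim), permutation invariant
```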

Loss Function: We adopt a recently developed loss function, DQNReg, which is inspired by earlier research [23-27]. The formula for DQNReg is presented below:

The first term of the loss function introduces regularization by multiplying Q-values by an adjustable weight, which helps to alleviate the overestimation of Q-values. The second term encourages Q-values to approach the target Q-values. DQNReg effectively prevents overestimation by directly regularizing Q-values with a weighted term that is always active. The adjustable weight in the loss function is set to 0.1 in the experiments.
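Under this description, a minimal sketch of such a loss (a weighted Q-value penalty plus a squared TD term, with the weight k = 0.1 stated above; the exact form used by the authors may differ) is:

```python
import torch

def dqnreg_loss(q_sa, target_q, k=0.1):
    """DQNReg-style loss: the first term regularizes Q-values with weight k,
    the second pulls Q-values toward the (detached) target values."""
    td_error = q_sa - target_q.detach()
    return (k * q_sa + td_error.pow(2)).mean()
```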

    7 Experiment

    7.1 Experimental Setup

Dataset: We conducted experiments on two real-world datasets: the DiDi dataset collected by DiDi Chuxing in Xi'an, China and released through its GAIA initiative, and the ELM dataset released by Tianchi. Each record in DiDi's dataset contains the driver ID, task ID, timestamp, longitude, and latitude. From these data, we can obtain information about each task and worker. For our experiments, we utilized a sample of 80,000 orders that were recorded continuously throughout October 2016 in the DiDi dataset [28]. We obtained the ELM dataset [29], which contains the same kinds of information, and used the same methodology to acquire an equivalent amount of data. The ELM and DiDi datasets contain different numbers of workers because the average completion time per order on the ELM platform, which is a food delivery service, is much smaller than that of taxis. Therefore, the difference in the number of workers required for the same number of orders is significant. As shown in Table 1, the ELM dataset of 80,000 orders has only 857 workers, while DiDi has 15,367 workers.

Table 1: Differences in the number of workers in the datasets

Settings: Over time, we replay the process of worker arrival, task creation, and task expiration. The collected dataset records each timestamp i at which a worker wi starts a task; we assume that each start corresponds to a worker arrival and that the tasks completed by workers in the dataset are the ones they find interesting. Since we do not know the available task list that the platform assigned to wi when it arrived, we cannot use the tasks completed by wi in the original dataset as our target for predicting success. Therefore, when wi arrives, we use the completed orders of that worker in the real data as the worker's preference and match a target task ttargetw from the currently available task pool based on this preference. Considering the interests of workers, if the recommended task is ttargetw, then the reward/label (for reinforcement/supervised learning) is 1. For the benefit of requesters, the reward/label is the quality gain of tj. The final reward is a weighted sum of the worker and requester parts; in our experiment, this weight is split equally between workers and requesters.

Evaluation Measures: Considering that we recommend a task or a list of tasks, we use CR and nDCG-CR as the metrics (a short computation sketch follows the list):

· CR: Worker Completion Rate (CR). When workeri arrives at timestamp i, the agent recommends a task. If the task is the same as the one actually selected by workeri, Rj is 1, otherwise 0.

· nDCG-CR: Normalized Discounted Cumulative Gain. Instead of one task, the agent recommends a list of tasks. nDCG-CR is more suitable for evaluating the recommended list σwi = {Tj1, Tj2, Tj3, ...}. r is the rank position of a task in the list, and N is the number of available tasks when workeri arrives. If the task at rank r(Tjn) is the same as the one actually selected by workeri, yjr is 1, otherwise 0.
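Since exactly one task per worker is treated as relevant, a hedged sketch of the two metrics (assuming the standard logarithmic rank discount for nDCG, which the paper does not spell out) is:

```python
import math

def completion_rate(recommended_task_id, chosen_task_id):
    """CR: 1 if the single recommended task matches the worker's actual choice."""
    return 1.0 if recommended_task_id == chosen_task_id else 0.0

def ndcg_cr(ranked_task_ids, chosen_task_id):
    """nDCG-CR with binary relevance: only the task the worker actually
    selected counts, discounted by its rank in the recommended list."""
    for rank, task_id in enumerate(ranked_task_ids, start=1):
        if task_id == chosen_task_id:
            return 1.0 / math.log2(rank + 1)
    return 0.0
```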

Competitors: We compare our method with three reinforcement learning alternatives, namely DDQN, D3QN, and EndToEnd. All these methods are trained on the real datasets, and the characteristics of workers and tasks are updated in real time. By aggregating the Q-values of the predicted tasks with respect to the worker and the requester, an available task is selected or the available tasks are ranked according to the aggregated Q-values. The model updates its parameters in real time after each recommendation.

· DDQN [6]: Double DQN is the deep learning realization of Double Q-learning and uses a target network to reduce the overestimation problem in Q-learning. Action selection for the target is based on the parameters of the current Q-network, rather than on the parameters of the target Q-network as in DQN. In this way, overestimation is reduced to a certain extent, so that the Q-value is closer to the real value.

· D3QN [7]: Dueling Double DQN uses an advantage function to further address the overestimation problem. Dueling DQN can estimate Q-values more accurately and select more appropriate actions after collecting data for only one discrete action. Double DQN decouples the selection of the target action from the evaluation of its Q-value, thereby alleviating the problem of overestimating the Q-value. D3QN combines the advantages of Dueling DQN and Double DQN.

· EndToEnd [4]: proposes a DQN network model with the permutation-invariant input property, which is suitable for a commercial crowdsourcing environment, and relies on the previous task allocation model in an end-to-end crowdsourcing reinforcement learning framework.

    7.2 Experimental Results

In this section, we present the experimental results of D3SQN in terms of both effectiveness and efficiency.

Implementation Details: Our model consists of two D3SQNs, and after experimental evaluation, the number of neurons in each layer of D3SQN is set to 128. For the other hyper-parameters of D3SQN, we set the target Q update frequency to 50, the learning rate to 0.001, the buffer size to 1200, the discount factor gamma to 0.35, and the batch size to 128. We used PyTorch to implement the entire algorithm, and the code ran on a GeForce RTX 2080 Ti GPU. We use the definitions in Sections 4 and 7.1 to construct the environment, action, state, and reward necessary for reinforcement learning, and our model learns spatio-temporal crowdsourced task allocation strategies within this framework. The number of encoder and decoder layers is the same as in EndToEnd [4], and the complexity of our model is on the same order of magnitude as that of EndToEnd in terms of neuron parameters.
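For reference, the hyper-parameters listed above can be collected into a single configuration; the values are taken directly from the text, while the key names are our own illustrative choices:

```python
# Hyper-parameters reported in Section 7.2 (key names are illustrative)
D3SQN_CONFIG = {
    "hidden_units_per_layer": 128,
    "target_update_frequency": 50,
    "learning_rate": 1e-3,
    "replay_buffer_size": 1200,
    "gamma": 0.35,
    "batch_size": 128,
}
```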

Effectiveness: In Section 4, we constructed two D3SQNs, namely D3SQN (R) and D3SQN (W), for requesters and workers, respectively, and evaluated their benefits along with the aggregated balance in our experiments. We present the CR, CR(W), CR(R), nDCG, nDCG(W), and nDCG(R) measures for each method and dataset in Fig. 3. CR(W) and CR(R) denote the proportion of maximum Q-values chosen by the two sub-networks of D3SQN that align with the final target. Similarly, nDCG(W) and nDCG(R) have the same interpretation. To account for both workers and requesters, we employed a ratio of 1:1 to aggregate the Q-values computed by the two sub-networks. This approach ensures that the workers' and requesters' contributions to the overall reward are equally weighted.


Figure 3: Benefits of workers & requesters

The results, as depicted in Fig. 3, demonstrate that D3SQN outperforms all competitors in terms of aggregated CR and aggregated nDCG-CR on both datasets. In contrast, DDQN performs poorly because it ignores the preferences of requesters and workers. Additionally, neither DDQN nor D3QN can handle permutation-invariant feature inputs, which leads to suboptimal performance.

The D3SQN algorithm demonstrates superior performance in CR(W) and nDCG-CR(W) compared to the End-to-End method due to its consideration of spatio-temporal constraints and features, and our model outperforms all other competitors on both datasets. However, in terms of CR(R) and nDCG-CR(R), our model surpasses all competitors on the ELM dataset, performs better than most competitors on the DiDi dataset, and slightly lags behind the End-to-End method there. This is attributed to our model's attention to spatio-temporal attributes, which gives more weight to workers with respect to each task's spatio-temporal attributes. On datasets with relatively less user data, our model prioritizes the weights associated with learning workers, resulting in the best aggregation effect. Moreover, our model exhibits exceptional efficiency, achieving the fastest convergence rate on all metrics across both datasets.

    7.3 Ablation Study

In this section, we test the effect of each module: we ablate the SetTransformer architecture, the GatingLayer & Identity Map Reordering, and the Dueling Network architecture in D3SQN.

As shown in Fig. 4, removing each improvement from the model reduces its prediction performance. In particular, when the SetTransformer structure is removed, the prediction performance of the model decreases most obviously. This is because the SetTransformer structure learns the worker preferences of all the training samples and is not affected by the variable task order.

Removing the GatingLayer & Identity Map Reordering also significantly decreases the performance of the model, as the absence of the identity mapping makes it harder for the transformer architecture to learn memory-based policies over complex feature representations in the RL environment. In addition, removing the Dueling structure slows down the training process, since the Dueling network architecture allows feature sharing between different actions, which accelerates learning.

    7.4 Limitations

Our experiments still have some limitations. For instance, there are too few open-source spatio-temporal crowdsourcing datasets for us to verify the generalization of our model on more datasets, which may limit the generalizability of our proposed model to other platforms. However, we believe that the data from these two platforms (DiDi and ELM) are representative of a significant portion of spatial crowdsourcing platforms, as they are widely used in the transportation industry. Secondly, our datasets come from desensitized data of certain crowdsourcing platforms, which lack further user information and order feedback. The influence of this missing real information on the verification of the model also adds to the limitations of our experiments.

Figure 4: Ablation experiments with different components removed

    8 Conclusion

This study proposed D3SQN, a deep reinforcement learning framework for spatial delivery task assignment on spatial edge intelligence platforms. It considers the critical role of spatio-temporal attributes in task assignment, as well as the spatio-temporal preferences of workers and requesters with respect to published tasks, to facilitate better task assignment. The spatial-temporal transformer proposed herein can effectively process the permutation-invariant spatial-temporal state vector and mine the complex deep relationships between spatial-temporal attributes, workers, and requesters in each task. This enables D3SQN to achieve optimal assignment in dynamic spatio-temporal crowdsourcing scenarios. The experimental results show that D3SQN provides effective and efficient assignment performance.

Acknowledgement: We would like to thank Dr. Caihua Shan for sharing her code and data with us.

Funding Statement: This work was supported in part by the Pioneer and Leading Goose R&D Program of Zhejiang Province under Grant 2022C01083 (Dr. Yu Li, https://zjnsf.kjt.zj.gov.cn/) and the Pioneer and Leading Goose R&D Program of Zhejiang Province under Grant 2023C01217 (Dr. Yu Li, https://zjnsf.kjt.zj.gov.cn/).

Author Contributions: Study conception and design: Yu Li, Mingxiao Li; data collection: Yu Li, Mingxiao Li; analysis and interpretation of results: Yu Li, Mingxiao Li, Dongyang Ou; draft manuscript preparation: Mingxiao Li, Junjie Guo, Fangyuan Pan. All authors reviewed the results and approved the final version of the manuscript.

Availability of Data and Materials: Our data access links are given in the citations, and we accessed the data from those links; we are not the direct data publisher.

    Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.
