
    Interactive Transformer for Small Object Detection

    Computers, Materials & Continua, 2023, Issue 11

    Jian Wei, Qinzhao Wang and Zixu Zhao

    Department of Weaponry and Control, Army Academy of Armored Forces, Beijing, 100071, China

    ABSTRACT The detection of large-scale objects has achieved high accuracy, but the detection of small objects does not enjoy similar success, owing to their low peak signal-to-noise ratio (PSNR), fewer distinguishing features, and ease of being occluded by the surroundings. To solve this problem, this paper proposes an attention mechanism based on cross-Key values. Building on the traditional transformer, this paper first improves the feature processing with a convolution module, which effectively maintains the local semantic context in the middle layers and significantly reduces the number of model parameters. Then, to enhance the effectiveness of the attention mask, two Key values are computed simultaneously along Query and Value through dual-branch parallel processing, which strengthens the attention acquisition mode and improves the coupling of key information. Finally, focusing on the feature maps of different channels, the multi-head attention mechanism is applied to the channel attention mask to improve the feature utilization of the middle layers. Comparisons on three small object datasets show that the proposed plug-and-play interactive transformer (IT-transformer) module effectively improves the detection results of the baseline.

    KEYWORDS Small object detection; attention; transformer; plug-and-play

    1 Introduction

    The object detection model has achieved fruitful research results and has been widely used in production, daily life, and other fields, significantly improving efficiency. However, these detection models still face challenges from small object detection tasks. As shown in Fig. 1, models produce more false detections and missed detections on small objects. There are three main reasons for this: first, small objects lack distinguishable, significant features; second, small objects are easily obliterated by the surrounding environment; third, in deep neural networks, pooling, normalization, label matching, and other modules gradually attenuate the features of small objects layer by layer, leaving the detection head short of relevant information [1,2]. The combined effect of these factors leads to the poor performance of traditional models on small objects.

    Figure 1: Small object detection. Traditional first-stage and second-stage object detection models cannot effectively deal with unfavorable factors such as object occlusion, environmental interference, and small object size, which makes misdetection and missed detection likely. The improved model with the IT-transformer added effectively overcomes these challenges

    To solve this problem, models such as multi-scale pyramids [3-5] and feature pyramids [6-8] are used to process object features at different scales, that is, to improve the detection accuracy of small objects through hierarchical processing and late fusion. Another approach is to use larger feature maps [1,9]; for example, [1] adds P2-level features, which lose less feature information, to the neck module, effectively enriching the features available for small objects. On the other hand, larger feature maps slow down inference. Focus [10] proposed a slicing method that retains as many small object features as possible without compressing the size of the input image. The you only look once (YOLO) [11,12] models add data augmentation strategies such as mosaic to diversify images over a wider range and improve the contribution of small objects to the training loss. In [13,14], deformable convolution is used to change the positions of the convolution kernel and guide it to extract features from more accurate positions. Other studies add an attention mechanism [15-17]: an attention mask representing the importance of each region improves the model's attention to different regions during processing and effectively suppresses noise from irrelevant regions. At present, attention models represented by the transformer [18] shine in many image processing tasks [19-21] and, with their unique feature processing methods, have received more and more attention.

    In summary, given the actual task requirements, and building on the transformer attention mechanism to fully construct the global and local semantic context of the image, we propose an IT-transformer attention mechanism to solve the small object detection problem. Specifically, the traditional transformer computes with fully connected layers, which results in a heavy parameter count, extremely high hardware requirements, and, due to the serialized data processing mode, insufficient local semantics. Second, in the multi-head attention mechanism, the query (Q), key (K), and value (V) are obtained separately; that is, the relationship between Q and K is not deeply explored, and this weakens the effectiveness of the attention mask. To solve these two problems, we design an interactive transformer module that is plug-and-play. In detail, building on previous research, we first replace the fully connected layer with a 2D convolution module, use its weight sharing to reduce the overall parameter count of the model and realize a lightweight design, and at the same time improve the local context between associated regions with the help of the convolution module's local field of view. Then, to further enhance the feature representation ability of the middle layer and improve the accuracy of the attention mask, a feature processing method based on cross-fused K is proposed: the coupling relationships within the middle-layer features are highlighted by fusing the K of the Q and V branches, improving the model's attention to detailed information. Finally, unlike a fully connected layer that computes the interaction between every pair of pixels, we focus on the features of different channels, which maintains the consistency of the global spatial position relationships of the features, and effectively improve the feature representation of objects at each scale by applying channel-level multi-head attention to the middle-layer features.

    In summary, our main contributions are:

    1. The object detection model based on the IT-transformer is proposed. To improve the utilization efficiency of middle-layer features, a dual-branch model extracts the key values of the features and, through cross-fusion, provides more effective comparison features for the attention module. At the same time, to suppress interference from noisy channels, the multi-head attention mechanism is applied to the generation and optimization of channel attention masks, which significantly improves the distinguishability of the middle-layer features.

    2. A new small object detection dataset was collected and organized. In existing small object detection datasets, the object types are mostly common objects, and the acquisition angles and scenes are simple. At the same time, to expand the application of intelligent detection algorithms in the military field, we collect and organize an Armored Vehicle dataset with diverse viewing angles, variable distances, and complex scenes through web collection and unmanned aerial vehicle (UAV) shooting, and conduct small object detection experiments on it.

    3. Extensive experimental comparisons and ablation experiments were carried out to verify the effectiveness of the module. The results show that the proposed IT-transformer is plug-and-play in first-stage and second-stage detection models and effectively improves the detection accuracy of the baseline model. On the three datasets Armored Vehicle, Guangdong University of Technology-Hardhat Wearing Detection (GDUT-HWD), and Visdrone-2019, the mAP improved over the baseline by 2.7, 1.1, and 1.3, respectively.

    2 Structure

    2.1 Object Detection

    Object detection models based on deep learning have been fully developed, and they mainly fall into four branches. First are the first-stage detection models, such as YOLO [11,12,22,23], the single shot multibox detector (SSD) [24], and Retina [25]; they integrate region of interest (ROI) generation and final result prediction, giving faster image inference. Then there are the second-stage detection models, represented by the faster region-based convolutional network method (Faster RCNN) [26], Cascade RCNN [27], etc. Their main feature is a separate module for more accurate ROI extraction, and the addition of ROI alignment significantly improves detection accuracy. Third are the transformer-based detection models, such as the vision transformer (ViT) [19], the detection transformer (DETR) [21], and DETR with improved denoising anchor boxes (DINO) [28]; they integrate transformers into object detection tasks, breaking the previous dominance of convolutional modules in the image field, and with the transformer's unique attention mechanism their detection accuracy quickly catches up with a series of traditional state-of-the-art (SOTA) models. The fourth is the detection architecture based on the diffusion model [29-31]; these regard object localization as an iterative diffusion process from a random noise vector to the true value and complete the detection task through cascade alignment. In this paper, we first take the second-stage detection model Cascade RCNN as the benchmark to make full use of its staged model structure. At the same time, to further improve performance, we also integrate the transformer attention mechanism to achieve an organic integration of the two. Guided by the plug-and-play idea, we design an interactive attention module that can adapt to existing first-stage and second-stage detection models and effectively improve the detection performance of the baseline model.

    2.2 Small Object Detection

    Small object detection is an important part of computer vision. According to the definition of the COCO dataset, when an object is smaller than 32×32 pixels, it can provide only very limited feature information, which makes detection harder. To solve this problem, there are currently four main ideas. The first is to increase the size of the input image [9,10] so that the features remain relatively stable, but an overly large input size significantly slows inference, which is unsuitable for scenarios with high real-time requirements. The second is the data augmentation strategy [32,33], represented by mosaic and the generative adversarial network (GAN). With mosaic in the data preprocessing stage, controllable parameter adjustment increases the proportion of small objects among all training instances and corrects the parameter update process previously dominated by large objects; in [34], GAN-based synthesis and diversification of small objects increases the number of positive samples during training. The third is the multi-scale training and testing strategy [6,35,36], which improves the model's consistent detection of objects at every scale by varying the input image size over a large range. The fourth is to add an attention mechanism [2,17,37], which improves the model's attention to specific regions and objects through an additionally computed attention mask that indicates the importance of pixels. Starting from improving the model's attention, this paper proposes an interactive attention mechanism: with the help of the IT-transformer, the model can effectively represent the importance of features under a single-scale training strategy and thus improve accuracy on small objects.

    2.3 Transformer

    Transformer [18] was originally used to process serialized text data and made its mark in the natural language processing (NLP) field. ViT [19] converted image data into serialized data composed of multiple pixel blocks for the first time and then performed image classification and detection tasks in the transformer fashion, opening the way for transformers to expand into the image field. Based on the transformer architecture, many excellent models have emerged, such as DETR [21] and the Swin transformer [20]. The main feature of the transformer is feature processing based on mutual attention between tokens, which brings global semantic information into every single position and greatly improves the accuracy of inference results. However, under a single-scale setting, although the transformer controls the number of model parameters by dividing the input into a specified number of tokens, it still produces significantly more parameters than a convolution module. Moreover, because the image is serialized, the semantic relationship between adjacent tokens is broken. Experiments show that when the dataset is small, transformer-based models struggle to learn an effective interrelationship matrix, resulting in low performance. This paper uses the attention mechanism in the transformer and improves it with a cross-K value obtained by fusing middle-layer features. Furthermore, by integrating the convolution module, we strengthen the semantic correlation between tokens to improve the model's performance on smaller datasets.

    3 Method

    In this part, we first briefly introduce the relevant content of the traditional transformer and then describe the structure and optimization criteria of the IT-transformer in detail.

    3.1 Revisiting Transformer

    The transformer is a deep neural network model based on an encoder and decoder, and its core is the construction of the attention mechanism. Thanks to globally encoded tokens, the transformer uses fully connected modules to give each token a broad field of view and a full range of connection relationships, which ensures good performance on advanced visual tasks such as object detection and segmentation. The transformer attention mechanism is a matrix computation: specifically, Q, K, and V are calculated from the middle-layer features and then transposed and multiplied with each other. This yields the interrelationship matrix reflecting the importance of each token, that is, the attention mask. The structure of the traditional transformer is shown in Fig. 2.

    Figure 2: The traditional transformer

    Suppose the input features are $X \in \mathbb{R}^{C\times H\times W}$. In the traditional transformer, X is first flattened into a two-dimensional matrix $X' \in \mathbb{R}^{C\times HW}$ and normalized, and then multiplied with three weight matrices $(W_Q, W_K, W_V)$, representing fully connected operations, to obtain Q, K, and V, where Q is calculated by:

    $$Q = X' W_Q$$

    K and V are calculated similarly. In particular, to ensure the unbiased nature of the extracted features, the bias coefficient is set to zero in the fully connected layers.

    Then, Q is multiplied with the transpose of K to obtain the correlation matrix between the two. Finally, the softmax activation function normalizes it to (0, 1), yielding the spatial attention mask that reflects the importance of each token. The calculation process is:

    $$\mathrm{Attention}(Q,K) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)$$

    where $d_k$ is the feature dimension of K.

    Under multi-head attention, several such attention matrices are calculated at the same time. Next, these matrices are integrated by splicing and merging. At last, a skip connection performs a weighted fusion with the input features to obtain the feature map optimized by the attention mask, which is sent to the subsequent detection module. Its calculation formula is:

    $$Y = X + \mathrm{Concat}(\mathrm{head}_1,\dots,\mathrm{head}_N)\,W_O, \qquad \mathrm{head}_i = \mathrm{Attention}(Q_i,K_i)\,V_i$$
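
    Below is a minimal PyTorch sketch of the traditional attention just described, assuming a token layout of (batch, tokens, dim); the module and variable names are illustrative, not the authors' implementation.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VanillaAttention(nn.Module):
        """Single-head transformer attention with a skip connection."""
        def __init__(self, dim):
            super().__init__()
            # bias=False keeps the extracted features unbiased, as noted above.
            self.to_q = nn.Linear(dim, dim, bias=False)
            self.to_k = nn.Linear(dim, dim, bias=False)
            self.to_v = nn.Linear(dim, dim, bias=False)

        def forward(self, x):                  # x: (B, N_tokens, dim)
            q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
            # Token-to-token correlation matrix, normalized to (0, 1) by softmax.
            attn = F.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
            return x + attn @ v                # skip connection
    ```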

    In this process, since Q, K, and V are computed with fully connected layers, the parameter count is $(CHW)^2$. The increased parameters lead to lower training efficiency, higher energy consumption, and other problems. Therefore, many works face the problem of controlling the overall parameter count when designing and deploying transformer models.

    In addition, it is worth noting that the calculation of Q, K, and V is the core of the transformer and directly determines the effectiveness of the attention mask. However, traditional transformers process them through only three separate fully connected layers. Since Q, K, and V are the basis for calculating attention masks, their processing deserves deeper exploration to improve the accuracy of the attention mask.

    3.2 IT-Transformer

    The overall structure of the IT-transformer is shown in Fig. 3. Research shows that the transformer structure differs from traditional convolution-based models: lacking a local field of view, a transformer-based architecture cannot acquire local semantic context during feature processing, which significantly hurts detection performance when the training dataset is small. In addition, as introduced earlier, transformers widely use fully connected layers to compute the middle-layer features, resulting in a large number of parameters. Referring to existing research that combines convolutions and transformers, and to balance the parameter count against the needs of the attention mechanism, we design Q, K, and V calculations based on convolutional modules. Through weight sharing, the convolution module can effectively use the local correlated semantic context between adjacent pixels, that is, the local field of view of the convolution kernel; at the same time, it significantly reduces the parameters.

    Figure 3: The IT-transformer

    Taking the calculation of Q as an example, with the traditional transformer's middle-layer features denoted as X and C, H, and W equal to 1024, 64, and 64, respectively, the parameter count is $\mathrm{Param}(Q_{transformer}) = (1024\times 64\times 64)^2$. In the IT-transformer, when a 3×3 convolution module obtains Q, K, and V, the parameter count is $\mathrm{Param}(Q_{IT\text{-}transformer}) = (1024\times 9)^2$. Through this lightweight design, the parameter count of the IT-transformer module is independent of the size of the middle-layer features and is reduced by a factor of $(64\times 64/9)^2$ compared with the fully connected method in Fig. 2. The greatly compressed parameter count helps improve training efficiency and reduces the hardware requirements of the model.
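
    The reduction factor can be checked with simple arithmetic, following the paper's own counting convention:

    ```python
    C, H, W = 1024, 64, 64
    fc_params = (C * H * W) ** 2    # fully connected: (CHW)^2
    conv_params = (C * 3 * 3) ** 2  # 3x3 kernel, per the paper's count
    ratio = fc_params / conv_params
    print(f"ratio = {ratio:.3e}")   # (64*64/9)^2 ≈ 2.071e+05
    ```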

    At the same time, to further strengthen the connection between Q, K, and V, we compute them synchronously. As can be seen from Fig. 3, Q and $K_1$ are calculated by one convolution module and $K_2$ and V by another; then, through channel splitting, we obtain Q, K, and V with tighter coupling.

    By setting up the dual branch, we obtain keys rooted in Q and V; the extracted features $K_1$ and $K_2$ contain more explicit cross-coupling information, which provides richer sampling information for the attention calculation. When calculating attention, according to the unified requirements of the transformer architecture, we obtain the crossed key feature expression, namely the fusion of $K_1$ and $K_2$.
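
    A hedged sketch of the dual-branch computation follows: one convolution produces Q and $K_1$, a parallel convolution produces $K_2$ and V, and the two keys are then fused. The fusion operator is not fully specified in the text; the element-wise product below is our assumption.

    ```python
    import torch.nn as nn

    class DualBranchQKV(nn.Module):
        def __init__(self, channels, kernel_size=3):
            super().__init__()
            pad = kernel_size // 2
            # Each branch emits 2C channels, later split into two C-channel maps,
            # so Q/K1 (and K2/V) share one set of convolution weights.
            self.branch_qk = nn.Conv2d(channels, 2 * channels, kernel_size,
                                       padding=pad, bias=False)
            self.branch_kv = nn.Conv2d(channels, 2 * channels, kernel_size,
                                       padding=pad, bias=False)

        def forward(self, x):                   # x: (B, C, H, W)
            q, k1 = self.branch_qk(x).chunk(2, dim=1)
            k2, v = self.branch_kv(x).chunk(2, dim=1)
            k = k1 * k2                         # assumed cross-fusion of the keys
            return q, k, v
    ```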

    We also use multi-head attention to complete the analysis from multiple different dimensions and fully use the middle-layer features, so as to improve the effectiveness of the attention mask. First of all, we know that the contributions of different channel feature maps differ: some feature maps accurately capture decisive features, while other channels may introduce noise. If the features of each channel are given the same weight, the final judgment of the model will inevitably suffer. Therefore, given that the convolution module has already extracted the local semantic context of the feature map, we pay more attention to which channel features occupy a more important position in multi-head attention. Thus, unlike the transformer module, which focuses more on spatial attention, the IT-transformer focuses on the different channels. Under the multi-head attention mechanism, we divide Q, K, and V into subsets according to the number of heads $N$, so that each subset contains $C/N$ channel feature maps.

    The computational focus of attention thus becomes obtaining a channel-level attention mask, which is calculated as follows:

    $$\mathrm{Attention}_{channel}^{(i)} = \mathrm{softmax}\left(Q_i K_i^{\top}\right)$$

    where $Q_i, K_i \in \mathbb{R}^{(C/N)\times HW}$ are the subsets assigned to the $i$-th head.

    We obtain the mask reflecting the attention of each group of channels by parallel computing, and then splice the per-head masks to obtain the attention mask covering all middle-layer features, where $\mathrm{Attention}_{channel} \in \mathbb{R}^{C\times C}$. Finally, the module input is added back through a skip connection, further strengthening the middle-layer feature representation and producing the enhanced feature map.
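
    A minimal sketch of the channel-level multi-head attention described above is given below; the reshaping and the use of a scaled softmax are our choices, consistent with the $C \times C$ mask form in the text.

    ```python
    import torch
    import torch.nn.functional as F

    def channel_attention(q, k, v, heads=8):
        """q, k, v: (B, C, H, W); returns attended features of the same shape."""
        B, C, H, W = q.shape
        # Split channels into `heads` groups and flatten the spatial dimensions.
        q = q.reshape(B, heads, C // heads, H * W)
        k = k.reshape(B, heads, C // heads, H * W)
        v = v.reshape(B, heads, C // heads, H * W)
        # (C/N x HW) @ (HW x C/N) -> a channel-to-channel mask per head.
        attn = F.softmax(q @ k.transpose(-2, -1) / (H * W) ** 0.5, dim=-1)
        out = attn @ v                      # reweight the channel features
        return out.reshape(B, C, H, W)      # splice the heads back together

    x = torch.randn(2, 64, 16, 16)
    y = x + channel_attention(x, x, x)      # skip connection, as in the text
    ```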

    3.3 Loss Function

    We have detailed the structure and working process of the IT-transformer. In fact, in small object detection, to improve the overall detection accuracy of the model, we add the P2-level feature map, referring to [1,9], and detect small objects in the large-size feature map. Here, using Cascade RCNN as the baseline, we design an IT-transformer-enhanced small object detection model, whose overall structure is shown in Fig. 4.

    Figure 4: The improved Cascade RCNN with IT-transformer

    It can be seen that the IT-transformer can be plugged directly into the back end of the feature pyramid network (FPN), which also means that the IT-transformer achieves a plug-and-play effect. We verify this experimentally in Section 4.6, showing the wide utility and effectiveness of the IT-transformer.
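
    The plug-and-play placement can be sketched as a thin wrapper that applies the module to every FPN output level before the detection head; the wrapper below is our illustration (not mmdetection API), and it assumes the IT block preserves the channel count.

    ```python
    import torch.nn as nn

    class FPNWithIT(nn.Module):
        """Apply an IT-transformer-style block to each pyramid level."""
        def __init__(self, fpn, it_block_factory, num_levels, channels):
            super().__init__()
            self.fpn = fpn
            self.it_blocks = nn.ModuleList(
                [it_block_factory(channels) for _ in range(num_levels)])

        def forward(self, feats):
            outs = self.fpn(feats)          # tuple of pyramid feature maps
            return tuple(blk(f) for blk, f in zip(self.it_blocks, outs))
    ```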

    As shown in Fig. 4, this paper selects Cascade RCNN [27] as the baseline model and builds the object detection model by inserting the IT-transformer into it, so its loss function is mainly composed of 2 parts, calculated as:

    $$L = L_{RPN} + L_{head}$$

    Among them, $L_{RPN}$ makes the initial judgment of object presence and position on the feature map; it is composed of a binary classification loss $L_{object}$ and a location regression loss $L_{loc}$, specifically:

    $$L_{RPN} = \frac{1}{N_{cls}}\sum_i L_{object}(p_i, p'_i) + \lambda\,\frac{1}{N_{reg}}\sum_i p'_i\, L_{loc}(b_i, b'_i)$$

    where $i$ is the index of the anchor, $p_i$ is the probability that the $i$-th anchor contains an object, $p'_i$ is the label assigned to the $i$-th anchor (1 when it contains an object, otherwise 0), $N_{reg}$ is the total number of valid object boxes currently predicted by the model, $b_i$ denotes the coordinates of the object position predicted by the $i$-th anchor, $b'_i$ similarly denotes the ground-truth coordinates assigned to the $i$-th anchor containing an object, and $\lambda$ is the loss adjustment coefficient, set to 1.0 by default following mmdetection (https://github.com/open-mmlab/mmdetection).
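
    A sketch of this RPN loss in PyTorch, with smooth L1 standing in for the regression term as in the standard Faster RCNN setup; tensor names mirror the symbols above.

    ```python
    import torch
    import torch.nn.functional as F

    def rpn_loss(p, p_star, b, b_star, lam=1.0):
        """p: predicted objectness probabilities (N,); p_star: labels in {0, 1} (N,);
        b, b_star: predicted / assigned box coordinates (N, 4)."""
        l_object = F.binary_cross_entropy(p, p_star)
        pos = p_star > 0                           # only positives regress boxes
        n_reg = pos.sum().clamp(min=1)
        l_loc = F.smooth_l1_loss(b[pos], b_star[pos], reduction='sum') / n_reg
        return l_object + lam * l_loc
    ```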

    So far, we obtain a series of ROIs. Then, the L1 loss is used to fine-tune the object location boxes, calculated as:

    $$L_{loc} = \left\| f(x_i, b_i) - g_i \right\|_1$$

    where $f(x_i, b_i)$ is the positional regression function, used to regress the candidate bounding box $b_i$ to the object bounding box $g_i$. In fact, because the fine-tuned regression uses a cascaded $f$, it consists of staged, progressive functions, specifically:

    $$f(x, b) = f_T \circ f_{T-1} \circ \cdots \circ f_1(x, b)$$

    In this paper, $T = 3$, so the regression position is fine-tuned in three stages.
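
    The cascaded refinement can be summarized in a few lines; `stage_regressors` is a placeholder for the three per-stage heads.

    ```python
    def cascade_regress(x, boxes, stage_regressors):
        """Progressively refine boxes: f = f_3 ∘ f_2 ∘ f_1 (T = 3 here)."""
        for f_t in stage_regressors:
            boxes = f_t(x, boxes)   # each stage regresses toward the target g
        return boxes
    ```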

    Further, in the $t$-th stage, the L1-based position loss is calculated as follows:

    $$L^{t} = L_{cls}(c_i) + \lambda\, L_{loc}\left(f_t(x_i, b_i^{t}),\, g_i\right)$$

    where $c_i$ represents the object class vector predicted by the anchor when the intersection over union (IOU) exceeds the threshold.

    Finally, we use the cross-entropy loss function to calculate the category loss of the object, and the total loss function is:

    $$L = L_{RPN} + \sum_{t=1}^{T}\left(L_{cls}^{t} + L_{loc}^{t}\right)$$

    4 Experiments

    For small object detection tasks, GDUT-HWD (https://github.com/wujixiu/helmet-detection/tree/master/hardhatwearing-detection) and Visdrone-2019 (https://github.com/VisDrone/VisDrone-Dataset) are available public benchmark datasets. To fully verify the effectiveness of the IT-transformer, we compare 8 typical algorithms on these two datasets. In addition, we built our own dataset of ground objects in the military field and conducted comparative experiments on it. The distribution of object scales in the three datasets is shown in Fig. 5; the Armored Vehicle dataset we collected and organized has an instance distribution similar to the other two, being composed of small and medium-sized objects, which makes detection difficult.

    Figure 5: The distribution of the three datasets used

    4.1 Datasets

    Armored Vehicle: We collected, organized, and annotated 4975 images through online searches and local shooting. The dataset contains 10250 labeled boxes; we use 3920 images as the training set, which contains 8022 instances, and the remaining 1057 as the validation set, containing 2210 instances. There is only one object class in the dataset, and its size distribution is shown in Fig. 5. We label the data in COCO format to ensure that it can be used directly by multiple training architectures. What sets it apart is that in the Armored Vehicle dataset, the objects' viewing angles, distances, environments, scales, weather, and other characteristics are more complex, making small objects harder to detect.

    GDUT-HWD [38]: This is a common hard hat detection dataset for industrial scenarios, containing 3174 training images with 5 classes of labeled boxes; it is a lightweight benchmark for small object detection.

    Visdrone-2019 [9]: This is a small object dataset of large scenes from an aerial perspective, consisting of 10209 images and 2.6 million annotation boxes. It can be used to test both a model's detection performance on small objects and its inference efficiency. Due to the large image size, we divide each image into 4 non-overlapping sub-images, referring to [39].
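
    The tiling step can be sketched as a 2×2 crop of each frame (our illustration of the preprocessing described above):

    ```python
    def split_into_four(img):
        """img: array of shape (H, W, C) -> four non-overlapping quadrants."""
        h, w = img.shape[0] // 2, img.shape[1] // 2
        return [img[:h, :w], img[:h, w:], img[h:, :w], img[h:, w:]]
    ```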

    4.2 Metrics

    We select mean average precision (mAP), APs, APm, and APl, commonly used in object detection tasks, as evaluation metrics; precision and recall are the basis for calculating each value. AP is the area enclosed by the precision-recall (P-R) curve and the coordinate axes, calculated as:

    $$AP = \int_{0}^{1} P(R)\,dR$$

    For datasets with multiple object classes, mAP is the average of AP over all classes, expressed as:

    $$mAP = \frac{1}{N_{cls}}\sum_{i=1}^{N_{cls}} AP_i$$

    APs refers to objects smaller than 32×32 pixels; similarly, APm and APl correspond to 96×96 and 128×128, respectively. During the experiments, we also calculate mAP50, following the practice of [1,9], which is the mAP computed at IOU = 0.5.
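
    A minimal sketch of AP as the area under the P-R curve and mAP as the class-wise mean; real evaluators (e.g., the COCO toolkit) use interpolated precision, which we approximate here with a monotone envelope and trapezoidal integration.

    ```python
    import numpy as np

    def average_precision(recall, precision):
        """recall, precision: arrays sorted by ascending recall."""
        # Monotone precision envelope, then area under the curve.
        precision = np.maximum.accumulate(precision[::-1])[::-1]
        return float(np.trapz(precision, recall))

    def mean_ap(ap_per_class):
        return float(np.mean(list(ap_per_class.values())))

    ap = average_precision(np.array([0.0, 0.5, 1.0]),
                           np.array([1.0, 0.8, 0.5]))
    ```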

    4.3 Settings

    All experiments in this paper are based on the mmdetection framework, which ensures fairness and reproducibility. We adopt a single-scale training strategy: the input image size is uniformly limited to 640×640 (1280×960 for the Visdrone-2019 dataset), and only random flipping is used for data augmentation in the preprocessing stage. The learning rate and the number of attention heads are determined by grid search, described in Section 4.6; in the following experiments, the learning rate (lr) is 4E-2 and $N$ is 8. Other parameters follow the default settings of mmdetection.
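
    For reference, the settings above map onto an mmdetection-style config fragment like the following; the values come from the text, while the exact keys follow mmdetection 2.x conventions and may differ across versions.

    ```python
    optimizer = dict(type='SGD', lr=4e-2, momentum=0.9, weight_decay=1e-4)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(type='Resize', img_scale=(640, 640), keep_ratio=True),  # single scale
        dict(type='RandomFlip', flip_ratio=0.5),  # the only augmentation used
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
    ]
    ```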

    4.4 Results in the Armored Vehicle Dataset

    The experimental results are shown in Table 1, from which it can be seen that the improved IT-Cascade-RCNN model achieves higher detection accuracy. The comparison shows that IT-Cascade-RCNN improves on the typical first-stage detection model YOLOx by 14.8 mAP and on the typical second-stage model Sparse by 12.8 mAP. In particular, IT-Cascade-RCNN also outperforms DINO and the diffusion-based DiffusionDET, exceeding them by 4.5 and 4.4 mAP. Furthermore, the IT-transformer also surpasses another attention-based model, adaptive training sample selection (ATSS) [40], by 6 mAP. It is worth noting that although DINO and DiffusionDET achieve higher results under AP50, their performance does not extend well to other threshold conditions, and they fail to balance early-warning restrictions, object detection accuracy, and false alarm rate. In contrast, IT-Cascade-RCNN provides better results at all IOU thresholds. Further, we find that the accuracy of the IT-transformer on large objects decreases: because we fuse global and local features in the middle features of the IT-transformer, some environmental interference is introduced by the local semantic features, resulting in a smaller APl.

    Table 1: The metrics on the Armored Vehicle dataset

    Fig. 6 shows the visualized detection results of each model on the Armored Vehicle dataset and clearly contrasts the five models. In the first row of images, Retina, YOLOx, and DINO have serious false alarm problems, identifying non-existent areas as objects, while Faster RCNN fails to detect the objects at all; the improved model with the cross-attention mechanism detects the objects correctly. In the second row, Retina, Faster RCNN, and YOLOx also miss detections; although DINO detects all objects, its accuracy is not as high as the improved model's. Similarly, when the object in the third row is partially occluded, the first three models localize the object correctly but with lower accuracy, and unfortunately DINO misses an object. The fourth row shows the difference between the models most vividly: when the object is obscured by smoke and dust, so that its features are disturbed, Retina, YOLOx, and DINO fail to detect it, and Faster RCNN obtains less accurate results, while the improved Cascade RCNN model shows accurate results.

    4.5 Results in the GDUT-HWD Dataset

    We also conduct experiments on the lightweight GDUT-HWD dataset to test the IT-transformer's ability to handle small object detection in industrial scenarios; the results are shown in Table 2. IT-Cascade-RCNN again shows a clear performance advantage, improving by 14.1 mAP over the typical first-stage detection model YOLOx, 16.9 mAP over the second-stage detection model represented by Sparse, and 13.3 mAP over DINO. Among the more challenging small-object results, IT-Cascade-RCNN also achieves the highest detection accuracy of 34.3, which is 2.1 higher than the baseline Cascade-RCNN. In summary, the results show that the IT-transformer effectively improves the detection performance of the model.

    Table 2: The metrics on the GDUT-HWD dataset

    Figure 6: Detection results (green circles indicate missed detections and yellow circles indicate false detections)

    Fig. 7 visualizes some of the detection results. Retina, Faster RCNN, YOLOx, and DINO all have serious missed detection problems; none of them detects the object marked by the green circle in Fig. 7. At the same time, Retina and Faster RCNN also produce false detections, misjudging the object category marked by the yellow circle. Finally, Faster RCNN suffers from duplicate detection: the object marked by the blue circle is detected repeatedly. Among the detected objects, the improved Cascade RCNN model shows higher confidence. On the whole, the model improved by the cross-transformer performs better and effectively improves the detection accuracy for small objects.

    Figure 7: Detection results (green circles indicate missed detections, yellow circles indicate false detections, and blue circles indicate duplicate detections)

    4.6 Ablation Experiments

    The IT-transformer we design is mainly affected by factors such as the learning rate, the normalization layer, and the number of heads. To test their impact on detection accuracy more reliably, we carry out ablation experiments on each of them in this part.

    4.6.1 The Impact of lr

    We use grid search to test the influence of different learning rates on the detection accuracy of the model. We sampled 15 learning rates from 1E-3 to 5E-2 and experimented on the GDUT-HWD dataset; the results are shown in Fig. 8.

    The experimental results in Fig. 8 first confirm that the learning rate indeed has a large impact on the detection accuracy of the model; for example, as the learning rate increases, the detection accuracy trends upward. Furthermore, when lr is set to 4E-2, the model achieves its best result of 48.9 mAP. Therefore, throughout the paper we set lr to 4E-2.

    Figure 8: The impact of lr

    4.6.2 The Impact of Head Number

    The multi-head attention mechanism determines from how many angles the interrelationships between features are extracted, and we know that more attention heads are not always better, nor fewer. Our ablation experiments confirm this. As shown in Table 3, when the number of heads is too small, an effective attention mask cannot be generated, and the interaction between feature maps cannot provide the detection head with more effective feature information; when the number of heads is too large, too much redundant information is introduced, which also weakens the expressive ability of the features. From the experimental results, the model performs best when the head number is 8.

    Table 3: The results of different numbers of heads

    As shown in Table 4, we also experiment with the normalization layer in the transformer. The results show that the model performs better without the normalization layer. We believe the possible reason is that normalization affects the representation of the middle-layer features: when normalization is applied, the features are forcibly concentrated on certain priors, which weakens the model's ability to induce effective bias on its own, drowns out the middle-layer features that directly influence the detection results, and causes detection accuracy to decline. Conversely, by reducing the constraints of prior knowledge on the learning process and relying more on self-guided learning, the model can more effectively learn the universal characteristics of different object features and thus detect more accurately.

    Table 4: The impact of the normalization layer

    4.6.3 The Impact of Kernel Size

    The IT-transformer integrates the convolutional module's ability to obtain local semantic features. In fact, local semantic features can provide more environmental and reference information for identifying small objects, helping to achieve accurate classification and positioning. To determine a suitable field of view, in this part we conduct a comparative experiment on the convolution kernel size on the Armored Vehicle dataset; the results are shown in Table 5.

    Table 5: The impact of kernel size (ensuring that the size of the output feature map remains unchanged)

    Table 5 shows that the convolution kernel size has a significant impact on the IT-transformer. As the kernel size grows, the receptive field of the intermediate feature fusion also grows, improving the validity of the middle-layer features, which is reflected in a steady improvement of the metrics: with a kernel size of 7, the mAP reaches a maximum of 57.4, and APs reaches 41. However, as the kernel size increases further, e.g., to 9, too many environmental features are blended into the middle-layer features, which interferes with their utilization and causes detection accuracy to trend downward. At the same time, a larger kernel size increases the parameter count and the computational cost, but the gain in detection accuracy makes the effort worthwhile.

    4.6.4 The Result of Plug-and-Play

    As mentioned earlier, the IT-transformer is plug-and-play and can significantly improve accuracy. We therefore selected typical first-stage and second-stage detection models such as Retina, Faster RCNN, and Cascade RCNN and ran experiments on GDUT-HWD, Armored Vehicle, and Visdrone-2019. The results are shown in Table 6. After adding the IT-transformer, the baseline models achieve significant performance improvements; for example, on GDUT-HWD, the mAP of Faster RCNN and Cascade RCNN increases by 8.8 and 1.1, respectively. Meanwhile, on the Armored Vehicle dataset, the accuracy of Retina improves by 4.1 mAP, with small-object accuracy up by 20.56%, compared with 6.18% for APm and 0.95% for APl. The IT-transformer's effect on model performance is also reflected on Visdrone-2019.

    Table 6: The results of plug-and-play

    The experimental results show that the IT-transformer designed in this paper is indeed plug-and-play and can be used directly in many types of baseline models. Fig. 9 shows test results on the Visdrone-2019 dataset, where we compare the detection results of the Cascade RCNN model before and after adding the cross-transformer. The addition of the cross-transformer markedly reduces Cascade RCNN's false detections and missed detections.

    Figure 9: The results on Visdrone-2019 (yellow circles indicate false detections, green circles indicate missed detections)

    5 Conclusion

    For the challenging small object detection task, we first analyze and sort out the existing solution ideas and summarize them into four basic approaches. Then, combining them with the current mainstream attention mechanism and starting from the traditional transformer model, from the perspective of compressing the number of model parameters and strengthening the coupling of middle-layer features, we design a cross-K-value transformer model with a dual-branch structure and, at the same time, apply the idea of multi-head attention to the processing of the channel attention mask. Through experiments on the self-built Armored Vehicle dataset and 2 additional benchmarks, the improved Cascade RCNN model based on the cross-transformer was verified and achieved a higher detection level. Finally, by combining the cross-transformer with existing first-stage and second-stage detection models, the ablation experiments confirm that the cross-transformer has good plug-and-play performance and can effectively improve the detection results of each baseline. In addition, we collected and collated an Armored Vehicle dataset containing one class of military ground objects to provide data support for related research.

    Acknowledgement: None.

    Funding Statement: The authors received no specific funding for this study.

    Author Contributions: The authors confirm their contributions to the paper as follows: study conception and design: Qinzhao Wang; data collection, analysis, and interpretation of results: Jian Wei; draft manuscript preparation: Zixu Zhao. All authors reviewed the results and approved the final version of the manuscript.

    Availability of Data and Materials: Data will be made available on request.

    Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
