
    Activation Redistribution Based Hybrid Asymmetric Quantization Method of Neural Networks

2024-02-20 12:03:30

Lu Wei, Zhong Ma and Chaojie Yang

R&D Innovation Center, Xi'an Microelectronics Technology Institute, Xi'an, 710065, China

    ABSTRACT

The demand for adopting neural networks in resource-constrained embedded devices is continuously increasing. Quantization is one of the most promising solutions to reduce computational cost and memory storage on embedded devices. In order to reduce the complexity and overhead of deploying neural networks on Integer-only hardware, most current quantization methods use a symmetric quantization mapping strategy to quantize a floating-point neural network into an integer network. However, although symmetric quantization has the advantage of easier implementation, it is sub-optimal for cases where the range could be skewed and not symmetric. This often comes at the cost of lower accuracy. This paper proposes an activation redistribution-based hybrid asymmetric quantization method for neural networks. The proposed method takes the data distribution into consideration and can resolve the contradiction between quantization accuracy and ease of implementation, balance the trade-off between clipping range and quantization resolution, and thus improve the accuracy of the quantized neural network. The experimental results indicate that the accuracy of the proposed method is 2.02% and 5.52% higher than the traditional symmetric quantization method for classification and detection tasks, respectively. The proposed method paves the way for computationally intensive neural network models to be deployed on devices with limited computing resources. Codes will be available on https://github.com/ycjcy/Hybrid-Asymmetric-Quantization.

    KEYWORDS

Quantization; neural network; hybrid asymmetric; accuracy

    1 Introduction

Artificial intelligence with deep convolutional neural networks has made significant breakthroughs in many fields, and it is expected to be widely used in the aerospace field, for example in situational awareness [1], intelligent obstacle avoidance [2], and in-orbit remote sensing image detection [3]. The biggest challenge for applying artificial intelligence in the aerospace field is that these algorithms based on deep convolutional neural networks require a lot of memory and computational cost. In order to efficiently deploy neural networks on embedded devices, several model compression methods have been widely explored. Quantization is an essential technique for adopting deep neural networks in energy- and memory-constrained devices.

This paper focuses on Integer-only quantization for inference. Quantization maps the high-precision parameters of a neural network to low-precision values in a finite set, thereby speeding up computation. High-precision parameters have a more extensive dynamic range, so the 32-bit floating-point data type is usually used in training. After training, in order to reduce the size of the neural network, the 32-bit floating-point network is quantized to an 8-bit or even lower-bit integer network.

Quantizing a floating-point network to an integer network requires designing a proper mapping method. Quantization usually results in a loss of accuracy due to the information lost. How to improve the accuracy of the quantized neural network while considering hardware efficiency is the key problem that needs to be solved. A good quantization mapping method should resolve the two following questions to improve deployment performance.

The first question is the trade-off between the accuracy of the quantized neural network and the difficulty of deployment and implementation. The simpler the mapping strategy is, the easier and faster the deployment on embedded devices will be, but the loss of accuracy will increase. The more complex the mapping strategy is, the lower the loss of accuracy will be; however, deployment on embedded devices will be more difficult and will result in enormous computational overhead. The commonly used quantization method is symmetric quantization, because it is easy to implement on embedded devices. This method works well only for symmetric distributions, but most distributions in neural networks are asymmetric.

The second question is the trade-off between range and quantization resolution, which significantly influences the computation of the quantization parameters. The larger the clipping range is, the lower the data clipping loss will be, but the quantization resolution will be lower. The smaller the clipping range is, the higher the quantization resolution will be, but the data clipping loss will be greater. Range and quantization resolution affect each other, and there is no suitable method to guide how to balance them.

We propose an activation redistribution-based hybrid asymmetric quantization mapping method for Integer-only inference to resolve these two questions. Our contributions can be listed as follows:

Firstly, we propose a hardware-friendly hybrid asymmetric quantization method for Integer-only inference of neural networks, in which the activations use asymmetric quantization and the weights use symmetric quantization. The proposed method can avoid additional data-dependent computation, achieve higher accuracy without any computational overhead on embedded accelerators, and resolve the contradiction between the accuracy of the quantized neural network and the ease of deployment and implementation.

Secondly, we introduce an activation redistribution method to compute the quantization parameters, achieving lower quantization error. This method has no restrictions on the data distribution and can strike a balance between range and quantization resolution.

    2 Related Works

Most of the existing quantization approaches adopt either asymmetric quantization or symmetric quantization [4]. The asymmetric quantization function is as follows:

$$Q = f(r) = \mathrm{round}\left(\frac{r - D}{s}\right) \quad (1)$$

$$r = f^{-1}(Q) = s \cdot Q + D \quad (2)$$

where $f$ and $f^{-1}$ are the quantization mapping function and its inverse, round is the rounding operation, $r$ is the floating-point real value, $Q$ is the integer value after quantization, and $s$ and $D$ are the quantization parameters: $s$ is the scaling factor and $D$ is the zero point, chosen such that the real value 0 maps exactly to a quantized value.

Symmetric quantization is a simplified version of the general asymmetric case [5]. The symmetric quantizer restricts the quantization parameter $D$ to 0 [6].
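As an illustration of Eqs. (1) and (2), the sketch below (our own example, not code from the paper) applies the asymmetric mapping and its symmetric special case to a tensor; the way the scale and zero point are derived from the clipping range (s = (r_max - r_min)/(2^bits - 1), D = r_min) is an assumption made only for the example.

```python
import numpy as np

def asymmetric_quantize(r, r_min, r_max, bits=8):
    """Eq. (1): Q = round((r - D) / s), with s and D derived from the clipping range."""
    s = (r_max - r_min) / (2 ** bits - 1)      # scaling factor (assumed derivation)
    D = r_min                                   # zero point (real-valued offset, assumed)
    q = np.round((np.clip(r, r_min, r_max) - D) / s)
    return q.astype(np.int32), s, D

def dequantize(q, s, D):
    """Eq. (2): r ~= s * Q + D."""
    return s * q + D

def symmetric_quantize(r, threshold, bits=8):
    """Symmetric special case: D is restricted to 0 and the range is [-threshold, threshold]."""
    s = threshold / (2 ** (bits - 1) - 1)
    q = np.round(np.clip(r, -threshold, threshold) / s)
    return q.astype(np.int32), s

# Example on a skewed (asymmetric) activation sample.
r = np.random.exponential(scale=1.0, size=1000).astype(np.float32)
q, s, D = asymmetric_quantize(r, float(r.min()), float(r.max()))
print("max reconstruction error:", np.abs(dequantize(q, s, D) - r).max())
```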

On the one hand, different quantization mapping functions are applicable to different data distributions, and the data distributions of the layers in a neural network are not the same. Figs. 1 and 2 illustrate the activation distributions for each convolutional layer in the Yolo-v3 tiny model. We divide the data distributions into two categories: one is approximately symmetric, as shown in Fig. 2, and the other is asymmetric, as shown in Fig. 1. Symmetric quantization is much simpler and hardware-friendly, but it is only effective for symmetric distributions. Asymmetric quantization does not require the data distribution to be symmetric around zero and is more expressive because of the extra quantization parameter D, but this comes with additional computational overhead. The activation distributions of twelve convolutional layers (layer 3, layer 5, layer 7, layer 9, layer 11, layer 13, layer 14, layer 15, layer 16, layer 21-1, layer 21-2, layer 23) are asymmetric. Only two activation distributions of convolutional layers (layer 1 and layer 19) are approximately symmetric. Most activation distributions of the Yolo-v3 tiny model for detection are asymmetric, so the traditional symmetric quantization method suffers from a considerable loss of accuracy on small target detection tasks.

On the other hand, the quantization parameters are very important for both asymmetric and symmetric quantization and affect the performance of the quantized neural network. The quantization parameters depend on the clipping range, and the scaling factor s divides the given range of real values into a number of partitions. Usually, a series of calibration samples are used as the input of a neural network to compute the typical range of activations [7,8]. A straightforward choice is to use the min/max of the data as the clipping range [7], which may unnecessarily increase the range and reduce the quantization resolution. One approach is to use the i-th largest/smallest value instead of the min/max value as the clipping range [9]. Another approach is to select the clipping range by some measure of information loss between the original real values and the quantized values [10,11], including KL divergence [12,13], Mean Squared Error (MSE) [14–17], or entropy [18]. There are other methods that learn the clipping range during training, including PACT [19], LQ-Nets [20], LSQ [21], and LSQ+ [22]. When computing the data clipping range by KL, MSE, or other measures between the original real values and the quantized values, the absolute value of the data is taken first. Therefore, the data distribution over the positive and negative ranges cannot be effectively measured, and part of the dynamic range of the data is wasted. At the same time, simply taking the maximum and minimum values as the clipping thresholds does not reflect the data distribution. So for the hybrid asymmetric quantization mapping strategy, there is no suitable method to compute the clipping range.
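To make the difference between the simple range-selection strategies concrete, here is a small sketch (our own illustration, with made-up calibration data) of the min/max choice [7] versus the i-th largest/smallest choice [9]:

```python
import numpy as np

def minmax_range(x):
    """Straightforward clipping range: the min/max of the calibration data [7]."""
    return float(x.min()), float(x.max())

def ith_extreme_range(x, i=10):
    """Clipping range from the i-th smallest / i-th largest value [9]:
    a few outliers are discarded, preserving quantization resolution."""
    s = np.sort(x.ravel())
    return float(s[i - 1]), float(s[-i])

calib = np.concatenate([np.random.normal(0, 1, 10000), np.array([25.0])])  # one outlier
print(minmax_range(calib))       # range blown up by the single outlier
print(ith_extreme_range(calib))  # tighter range, higher resolution
```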

Therefore, the data distribution is not taken into consideration in the current one-size-fits-all quantization methods, and there is no guiding principle on how to choose the most suitable method to compute the clipping range. As a result, the current quantization methods cannot adapt to different neural network structures and perform poorly for tasks with higher accuracy requirements.

Figure 1: The activation distributions of four representative convolutional layers (layer 5, layer 9, layer 16, layer 23) of the Yolo-v3 tiny model for detection. These activation distributions are asymmetric. The horizontal axis is the activation value, and the vertical axis is the activation density

Figure 2: The activation distributions of two convolutional layers (layer 1 and layer 19) of the Yolo-v3 tiny model for detection. These activation distributions are approximately symmetric. The horizontal axis is the activation value, and the vertical axis is the activation density

    3 Design

    3.1 Overall Design Scheme

We propose an activation redistribution-based hybrid asymmetric quantization method for Integer-only inference of neural networks, with a simple and efficient implementation on hardware. The activations use asymmetric quantization and the weights use symmetric quantization, which avoids additional data-dependent computation. A neural network usually consists of various layers, including the convolutional layer, the relu layer, the leaky-relu layer, the relu6 layer, the sigmoid layer, the tanh layer, the FC layer, etc. We propose a hybrid asymmetric quantization method for neural networks and the corresponding method to compute the quantization parameters. For the computationally expensive layers, including the convolutional layer and the FC layer, we propose how to effectively quantize these layers according to the hybrid quantization parameters. For the non-linear layers, such as the relu layer, the leaky-relu layer, the relu6 layer, the sigmoid layer, etc., we propose a quantization template. All the non-linear layers can be quantized according to this template.

    3.2 The Hybrid Asymmetric Integer-Only Quantization Method

In order to take into account the inference speed, accuracy, and convenience of deployment for a quantized neural network, we propose a hybrid quantization method with asymmetric activation quantization and symmetric weight quantization. The quantization mapping functions of the activations and weights are:

$$\mathrm{input}^q_{i,j,k} = \mathrm{round}\left(\frac{\mathrm{input}^f_{i,j,k} - D_{in}}{s_{in}}\right) \quad (3)$$

$$w^q_{k,n} = \mathrm{round}\left(\frac{w^f_{k,n}}{s_w}\right) \quad (4)$$

where $\mathrm{input}^f_{i,j,k}$ is the activation of the neural network, $w^f_{k,n}$ represents the weight of the k-th input channel and the n-th output channel, $\mathrm{input}^q_{i,j,k}$ is the quantized input and $w^q_{k,n}$ is the quantized weight, $s_w$ is the quantization parameter of the weights, and $s_{in}$ and $D_{in}$ are the quantization parameters of the convolution input.
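A minimal sketch of the hybrid mapping of Eqs. (3) and (4) is given below (our own illustration); the derivation of s_in and D_in from the clipping range and the per-output-channel handling of th_w are assumptions here, since the corresponding parameter equations only appear later in Section 3.3.

```python
import numpy as np

def quantize_activation(x, min_in, max_in, bits=8):
    """Asymmetric activation quantization, Eq. (3): q = round((x - D_in) / s_in)."""
    s_in = (max_in - min_in) / (2 ** bits - 1)   # assumed derivation of the scale
    D_in = min_in                                 # assumed derivation of the zero point
    q = np.round((np.clip(x, min_in, max_in) - D_in) / s_in)
    return q.astype(np.int32), s_in, D_in

def quantize_weights(w, bits=8):
    """Symmetric per-output-channel weight quantization, Eq. (4): q = round(w / s_w).
    w has shape (in_channels, out_channels); th_w is the per-channel max |w|."""
    th_w = np.abs(w).max(axis=0)                  # one threshold per output channel
    s_w = th_w / (2 ** (bits - 1) - 1)
    return np.round(w / s_w).astype(np.int32), s_w

x = np.random.rand(4, 4, 16).astype(np.float32)   # H x W x in_channels activation
w = np.random.randn(16, 32).astype(np.float32)    # in_channels x out_channels weights
xq, s_in, D_in = quantize_activation(x, float(x.min()), float(x.max()))
wq, s_w = quantize_weights(w)
```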

For the computationally expensive layers, including the convolutional layer and the FC layer, we propose how to effectively quantize these layers according to the hybrid quantization parameters. The quantization of the FC layer is the same as that of the convolutional layer.

For the non-linear layers, such as the relu layer, the leaky-relu layer, the relu6 layer, the sigmoid layer, etc., we propose a quantization template. All the non-linear layers can be quantized according to this template. The proposed method can achieve higher accuracy without any execution time overhead on embedded accelerators.

3.2.1 The Method to Quantize the Convolutional Layer

How to quantize the convolutional layer needs to be inferred from the computational principles of the convolutional layer. The computation principle of the convolutional layer is:

$$\mathrm{output}^f_{l,m,n} = \sum \mathrm{input}^f_{i,j,k} \cdot w^f_{k,n} + \mathrm{bias}^f_k \quad (5)$$

where $\mathrm{bias}^f_k$ is the k-th bias of the convolutional layer, and $\mathrm{output}^f_{l,m,n}$ is the output of the convolutional layer. All the above data types are floating-point.

According to the computation principle of the convolutional layer and the proposed hybrid asymmetric quantization strategy, how to quantize the convolutional layer can be inferred. The activations of the convolutional layer (including the input and output) adopt asymmetric quantization mapping, and the weights of the convolutional layer adopt symmetric quantization mapping. The computation principle of the quantization for the convolutional layer is:

where $s_{out}$ and $D_{out}$ are the quantization parameters of the convolution output, and S is the shift parameter for the inference process of the convolutional layer.

The method to quantize the convolutional layer can be divided into 5 steps according to Eq. (6), as shown in Algorithm 1. Algorithm 1 is based on Eq. (6), and Eq. (6) illustrates how to get the integer output of the convolutional layer from the integer input and the integer weights. In Eq. (6), the integer output is represented as $\mathrm{output}^q_{l,m,n}$, the integer input is represented as $\mathrm{input}^q_{i,j,k}$, and the integer weight is represented as $w^q_{k,n}$. The integer output should be on the left side of Eq. (6), the integer input and weights should be on the right side, and in order to make both sides of Eq. (6) equal while involving only integer arithmetic, a factor of $2^{-S}$ is introduced on the basis of Eq. (5). The first step to quantize the convolutional layer is to compute the hybrid asymmetric quantization parameters, including $s_{in}$, $D_{in}$, $s_w$, $s_{out}$ and $D_{out}$. The second step is to quantize the floating-point activations and weights into integers with the hybrid asymmetric quantization parameters according to Eqs. (7)–(10). The third step is to execute the multiplication and accumulation operations on the integer activations and weights, which is represented as conv in Fig. 3. The fourth step is to compute the dequantization parameters, including the shift parameter S, the multiplication parameter MUL, and the addition parameter ADD. Dequantization is the procedure proposed in this paper to get the integer convolutional output from the results of the third step; it consists of multiplication, addition and shift. The last step is to apply the dequantization to the results of the third step in the order of multiplying, adding and shifting. The multiplier is MUL. Adding uses the parameter ADD, which means adding ADD. Shifting uses the parameter S, which means multiplying by $2^{-S}$, so the factor $2^{-S}$ can be implemented with an efficient bit-shift. The procedure for quantizing the convolutional layer is shown in Fig. 3.

Algorithm 1: Quantizing a convolutional layer
Inputs: the convolutional activations and weights
Outputs: the convolutional output
1. Compute the quantization parameters. How to get the hybrid asymmetric quantization parameters is shown in Section 3.3.
2. Quantize the floating-point inputs and weights of the convolutional layer to integers:
$$\mathrm{input}^f_{i,j,k} = \mathrm{MAX}\left(\mathrm{MIN}\left(\mathrm{input}^f_{i,j,k}, \mathrm{max}_{in}\right), \mathrm{min}_{in}\right) \quad (7)$$
$$w^f_{k,n} = \mathrm{MAX}\left(\mathrm{MIN}\left(w^f_{k,n}, th_w\right), -th_w\right) \quad (8)$$
$$\mathrm{input}^q_{i,j,k} = \mathrm{round}\left(\frac{\mathrm{input}^f_{i,j,k} - D_{in}}{s_{in}}\right) \quad (9)$$
$$w^q_{k,n} = \mathrm{round}\left(\frac{w^f_{k,n}}{s_w}\right) \quad (10)$$
3. Execute the multiplication and accumulation operations on the integer activations and weights.
4. Compute the dequantization parameters, including the shift parameter S, the multiplication parameter MUL, and the addition parameter ADD:
$$S = \mathrm{floor}\left(-\log_2\left(\frac{s_{in} \cdot s_w}{s_{out}}\right)\right) + (8 - 1) \quad (11)$$
$$\mathrm{MUL} = \mathrm{round}\left(\frac{s_{in} \cdot s_w}{s_{out}} \cdot 2^{S}\right) \quad (12)$$
$$\mathrm{bias\_new}^f_k = \mathrm{bias}^f_k + D_{in} \cdot \sum w^f_{k,n} \quad (13)$$
$$\mathrm{ADD} = \mathrm{round}\left(\frac{\mathrm{bias\_new}^f_k - D_{out}}{s_{out}} \cdot 2^{S}\right) \quad (14)$$
5. Execute the dequantization to get the convolutional output.

Figure 3: The procedure for quantizing the convolutional layer

In step 2, $\mathrm{input}^f_{i,j,k}$ and $w^f_{k,n}$ are clipped according to their respective thresholds, as shown in Eqs. (7) and (8), and then they are quantized to integers according to Eqs. (9) and (10). MAX in Eqs. (7) and (8) takes the maximum value and MIN in Eqs. (7) and (8) takes the minimum value.

In step 4, the shift parameter S, the multiplication parameter MUL, and the addition parameter ADD are computed according to Eqs. (11)–(14). A convolutional layer has several groups of dequantization parameters; the number of groups is the same as the number of output channels. The floating-point biases of a convolutional layer have to be converted into addition parameters. In order to simplify the inference process, we fold the floating-point biases $\mathrm{bias}^f_k$ into $\mathrm{bias\_new}^f_k$, thereby accelerating the inference of the convolutional layer. The term $\sum w^f_{k,n}$ is the sum of the weights per output channel, that is, as many such sums are computed as there are output channels.
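The following sketch (our reading of Eqs. (11)–(14) and of step 5, not the authors' code) shows how the per-output-channel dequantization parameters can be computed offline and how the integer accumulator is turned into the integer output with only a multiply, an add, and a bit-shift:

```python
import numpy as np

def dequant_params(s_in, s_w, s_out, bias, D_in, D_out, w_float, bits=8):
    """Per-output-channel dequantization parameters (Eqs. (11)-(14), as read here).
    s_w and bias have shape (out_channels,); w_float has shape (in_channels, out_channels)."""
    ratio = s_in * s_w / s_out
    S = (np.floor(-np.log2(ratio)) + (bits - 1)).astype(np.int64)           # shift, Eq. (11)
    MUL = np.round(ratio * 2.0 ** S).astype(np.int64)                        # multiplier, Eq. (12)
    bias_new = bias + D_in * w_float.sum(axis=0)                             # folded bias, Eq. (13)
    ADD = np.round((bias_new - D_out) / s_out * 2.0 ** S).astype(np.int64)   # addend, Eq. (14)
    return S, MUL, ADD

def dequantize_accumulator(acc, S, MUL, ADD):
    """Step 5: multiply by MUL, add ADD, then shift right by S (i.e. multiply by 2^-S),
    so the integer output is produced with integer arithmetic only."""
    return (acc.astype(np.int64) * MUL + ADD) >> S

# Hypothetical per-layer values, just to exercise the functions.
S, MUL, ADD = dequant_params(s_in=0.02, s_w=np.array([0.004, 0.006]), s_out=0.05,
                             bias=np.array([0.3, -0.1]), D_in=-1.2, D_out=0.0,
                             w_float=np.random.randn(16, 2) * 0.05)
acc = np.random.randint(-2000, 2000, size=(8, 8, 2))   # integer accumulator from step 3
out_q = dequantize_accumulator(acc, S, MUL, ADD)
```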

3.2.2 The Method to Quantize the Non-Linear Layers

This section introduces how to quantize the non-linear layers. We propose a quantization template for the non-linear layers. All the non-linear layers can be quantized according to this template. We introduce how to quantize the relu layer, the leaky-relu layer, the relu6 layer, the sigmoid layer, and the tanh layer according to the proposed quantization template.

The computation principle of the non-linear layers can be expressed as:

$$\mathrm{output}^f_{l,m,n} = F\left(\mathrm{input}^f_{i,j,k}, x\right) \quad (15)$$

where F is the function of a non-linear layer and x is a fixed floating-point real value. For the relu layer and the leaky-relu layer, x is 0. For the relu6 layer, x is 6. The floating-point input is represented by $\mathrm{input}^f_{i,j,k}$, and the floating-point output is represented by $\mathrm{output}^f_{l,m,n}$.

The quantization method for the non-linear layers is based on a lookup table. The proposed quantization template to compute the lookup table for a non-linear layer is:

$$\mathrm{output}^q_{l,m,n} = f\left(F\left(f^{-1}\left(\mathrm{input}^q_{i,j,k}\right), x\right)\right) \quad (16)$$

where $f$ is the quantization mapping function as in Eq. (1), $f^{-1}$ is the inverse mapping function as in Eq. (2), the integer input is represented by $\mathrm{input}^q_{i,j,k}$, and the integer output is represented by $\mathrm{output}^q_{l,m,n}$.

How to use the proposed quantization template to compute the lookup tables for the relu layer, the leaky-relu layer, the relu6 layer, the sigmoid layer and the tanh layer is shown in Table 1. Negative_slope is the parameter of the leaky-relu layer.
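As a sketch of the lookup-table template in Eq. (16) (our own example; the scale and offset values are hypothetical), every possible integer input is dequantized, passed through the floating-point non-linearity, and re-quantized once, offline:

```python
import numpy as np

def build_lut(F, s_in, D_in, s_out, D_out, bits=8):
    """Quantization template for a non-linear layer, Eq. (16), as read here:
    for each possible integer input q, dequantize, apply F, and re-quantize."""
    q_in = np.arange(2 ** bits)                    # all possible integer inputs
    x = s_in * q_in + D_in                         # f^-1: integer -> real   (Eq. (2))
    y = F(x)                                       # the floating-point non-linearity
    q_out = np.round((y - D_out) / s_out)          # f: real -> integer      (Eq. (1))
    return np.clip(q_out, 0, 2 ** bits - 1).astype(np.uint8)

# Hypothetical quantization parameters, only to show the usage.
sigmoid_lut = build_lut(lambda x: 1.0 / (1.0 + np.exp(-x)),
                        s_in=0.05, D_in=-6.0, s_out=1.0 / 255.0, D_out=0.0)
# At inference time the sigmoid layer is a single table lookup per element:
# output_q = sigmoid_lut[input_q]
```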

Table 1: Quantization method for non-linear layers

    3.3 The Method to Compute Quantization Parameters

This section introduces how to compute the hybrid asymmetric quantization parameters. Several pictures are selected as the calibration set to compute the quantization parameters of the neural network. The method to compute the quantization parameters is divided into two steps. The first step is to get the clipping thresholds, and the second step is to compute the quantization parameters according to the clipping thresholds. The clipping thresholds significantly influence the computation of the quantization parameters.

The method to compute the clipping thresholds should balance the trade-off between range and quantization resolution. Whether the data clipping thresholds are determined by KL, MSE or other measures between the original real values and the quantized values, part of the dynamic range of the data is wasted, because these methods assume that the data distribution is symmetric, whereas most activation distributions are asymmetric. These methods first take the absolute value of the data when computing the clipping range, and then select the data thresholds by a certain measurement. The operation of taking the absolute value makes these methods unable to truly reflect the data distribution in both the positive and negative ranges. The quantization of non-negative activations may be less effective in this case, because the clipping range includes values that never appear in the input.

In order to handle asymmetric activation distributions and balance the trade-off between range and quantization resolution, we propose an activation redistribution method to compute the clipping thresholds that achieves lower quantization error, because it takes the data distribution into consideration. The optimal clipping range for the input is [min_in, max_in], the optimal clipping range for the output is [min_out, max_out], and the threshold of the weights is th_w. The procedure for computing these clipping thresholds is shown in Algorithm 2 and Fig. 4.

Algorithm 2: Computing the optimal clipping thresholds
Inputs: the data distributions of the activations and weights
Outputs: the clipping thresholds [min_in, max_in], [min_out, max_out] and th_w
1. Transform the input activation distribution into a gaussian-like distribution by Box-Cox [23], as shown in Fig. 4. λ determines the specific type of transformation, such as the square root transformation, the reciprocal transformation, etc. Different distributions should choose different λ:
$$y = BC(x) = \begin{cases} \dfrac{(x+c)^{\lambda} - 1}{\lambda}, & \text{if } \lambda \neq 0 \\ \log(x+c), & \text{if } \lambda = 0 \end{cases} \quad (17)$$
2. Compute the clipping thresholds by KL divergence to get [-max_KL, max_KL] [10].
3. Get [min_in, max_in] by the inverse transformation, as shown in Fig. 4.
4. The method to get the output clipping range [min_out, max_out] is the same as for [min_in, max_in].
5. The weight clipping threshold th_w is the maximum absolute value of the weights of a channel of the convolutional layer. A convolutional layer has several th_w; the number of th_w is the same as the number of output channels.

Fig. 4 shows how to get the clipping thresholds [min_in, max_in] of the input. c is a constant that ensures the input is positive: c is added to all of the inputs, and then the input is transformed into a gaussian-like distribution by Box-Cox, which is represented as BC in Fig. 4. d is a constant that ensures the transformed gaussian-like distribution is symmetric. We use KL divergence to balance the trade-off between range and quantization resolution: the clipping thresholds [-max_KL, max_KL] are obtained by KL divergence [10] on the symmetric gaussian-like distribution. In this way, the data in both the positive and negative ranges is taken into consideration. At last, [min_in, max_in] is obtained by the inverse transformation of Box-Cox, which is represented as BC^{-1}.

As can be seen from Fig. 4, when the data distribution is not symmetric around 0, for example when the negative values are small, the data thresholds determined directly by KL divergence are not suitable, because the threshold selected for the negative value range is affected by the positive values and cannot match the actual data distribution of the negative values. The proposed method transforms an asymmetric, skewed activation distribution into a gaussian-like distribution, then gets the clipping thresholds by KL divergence, and finally obtains the final clipping range by the inverse transformation.
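The sketch below is our interpretation of Algorithm 2 for the input thresholds; the choice of λ, the use of the median for the centring constant d, and the simple coverage rule standing in for the KL-divergence search of [10] are all assumptions made only to keep the example short.

```python
import numpy as np
from scipy import stats

def redistributed_thresholds(activations, lam=0.5, coverage=0.999):
    """Activation redistribution (our interpretation of Algorithm 2): shift positive,
    Box-Cox transform (Eq. (17)), centre, pick a symmetric threshold, then invert."""
    c = 1e-3 - min(float(activations.min()), 0.0)   # constant making every input positive
    y = stats.boxcox(activations + c, lmbda=lam)    # Eq. (17) with the chosen lambda
    d = float(np.median(y))                         # centring constant (assumed: the median)
    # Stand-in for the KL-divergence threshold search on the symmetrized distribution.
    max_kl = float(np.quantile(np.abs(y - d), coverage))

    def inverse_box_cox(v):
        # inverse of Eq. (17); only defined where lam * v + 1 > 0, hence the clamp
        if lam == 0:
            return float(np.exp(v)) - c
        return max(lam * v + 1.0, 0.0) ** (1.0 / lam) - c

    return inverse_box_cox(d - max_kl), inverse_box_cox(d + max_kl)

acts = np.random.exponential(1.0, 100000)           # a skewed, asymmetric activation sample
min_in, max_in = redistributed_thresholds(acts)
print(min_in, max_in)
```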

Figure 4: The activation redistribution method to compute the optimal clipping thresholds

How to compute the quantization parameters according to the clipping thresholds is as follows. The quantization parameters $s_{in}$, $D_{in}$, $s_{out}$, $D_{out}$ and $s_w$ of a layer are computed according to Eqs. (18)–(22).

where $bw_{in}$, $bw_{out}$ and $bw_w$ are the bit widths of the input, output and weights, for which 8 is commonly used.
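Since Eqs. (18)–(22) are not reproduced above, the sketch below only shows one plausible way the parameters could be derived from the clipping thresholds (a uniform affine scale over 2^bw − 1 steps for the activations, and a symmetric per-channel scale for the weights); treat the exact formulas as assumptions.

```python
import numpy as np

def activation_qparams(min_t, max_t, bw=8):
    """Assumed form of the activation parameters: the scale divides the clipping
    range into 2^bw - 1 steps and the zero point is the lower clipping bound."""
    s = (max_t - min_t) / (2 ** bw - 1)
    D = min_t
    return s, D

def weight_qparams(th_w, bw=8):
    """Assumed symmetric weight scale from the per-channel threshold th_w."""
    return th_w / (2 ** (bw - 1) - 1)

s_in, D_in = activation_qparams(-0.12, 5.8)          # hypothetical input clipping range
s_out, D_out = activation_qparams(0.0, 11.3)          # hypothetical output clipping range
s_w = weight_qparams(np.array([0.31, 0.27, 0.45]))    # hypothetical per-channel thresholds
```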

    4 Implementation and Experimental Results

    The purpose of the experiments is to verify the effectiveness of the proposed hybrid asymmetric Integer-only quantization method.

    4.1 Experimental Setting

The neural networks adopted in the experiments are image classification models and a small target detection model. All of the neural networks are quantized to INT8.

Firstly, the experiments are implemented on the TIANJI NPU3.0 neural network accelerator proposed by Xi'an Microelectronics Technology Institute [24] and on the Cambricon MLU220 [25]. The TIANJI NPU3.0 accelerator is implemented on a Xilinx ZCU102 FPGA, with a self-controllable IP and application development tool chain [26]. The FPGA is a Zynq UltraScale+ XCZU9EG with 2520 DSP slices, and the DDR4 in the programmable logic is 4 Gb. The MLU220 is based on the Cambricon MLUv02 architecture; its theoretical peak performance is 8 TOPS and its power consumption is 8.25 W. These two accelerators can be widely used in edge computing scenarios to support diverse AI applications. The image classification models and the small target detection model are deployed on the neural network accelerators, and the speed and accuracy are verified. The purpose of these experiments is to verify the effectiveness of the proposed hybrid asymmetric Integer-only quantization method on embedded devices. The expected experimental result is that the proposed quantization method improves the accuracy without affecting the speed on embedded devices compared with the traditional symmetric quantization method adopted by most embedded neural network accelerators.

Secondly, we compare the proposed method with PyTorch and NNI [27] on the image classification models and the small target detection model, and the accuracy is verified in software. The CPU is an Intel(R) Core(TM) i7-8700K at 3.70 GHz, and the GPU is an NVIDIA GeForce GTX 1070. In order to obtain the experimental results conveniently, software with fake-quantization [28] modules is used to simulate the accuracy on the neural network accelerator. Fake-quantization models the quantization errors in the forward passes; it is applied to quickly simulate the effects of quantization using simulated quantization operations. Codes will be available on https://github.com/ycjcy/Hybrid-Asymmetric-Quantization, from which the experimental results of the proposed method, PyTorch, and NNI can be obtained. The purpose of these experiments is to compare the effectiveness of the proposed hybrid asymmetric quantization method with the state of the art. The expected experimental result is that the proposed hybrid asymmetric quantization method improves the accuracy compared with PyTorch and NNI.
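For reference, fake-quantization [28] in the sense used here can be sketched as a quantize-then-dequantize operation inserted into the floating-point forward pass (our own minimal example, reusing the mapping of Eqs. (1) and (2) with hypothetical parameters):

```python
import torch

def fake_quantize(x, s, D, bits=8):
    """Apply Eq. (1) and then Eq. (2), so the tensor stays in floating point
    but carries the INT8 rounding and clipping error of the real accelerator."""
    q = torch.clamp(torch.round((x - D) / s), 0, 2 ** bits - 1)
    return s * q + D

x = torch.randn(1, 16, 32, 32)
x_fq = fake_quantize(x, s=0.05, D=-6.0)
print(float((x - x_fq).abs().max()))   # the simulated per-element quantization error
```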

    4.2 Dataset

The dataset for the image classification application is ImageNet. ImageNet is an image database organized according to the WordNet hierarchy, in which each node of the hierarchy is depicted by hundreds and thousands of images. The dataset has been instrumental in advancing computer vision and deep learning research.

The dataset for the small target detection application is HRSID. HRSID is a dataset for ship detection, semantic segmentation, and instance segmentation tasks in high-resolution SAR images. The dataset contains 5604 SAR images with resolutions of 0.5, 1, and 3 m.

    4.3 Evaluation Metrics

In order to verify the accuracy of the quantization methods extensively, we evaluate two aspects of quantization error. The first is the quantization error of a particular layer. The second is the overall quantization error of a model.

For the first aspect of quantization error, there is no ideal metric that can perfectly measure the quantization error; different metrics reflect the quantization error from different points of view. We adopt the following three metrics to measure the quantization error: Manhattan distance, Euclidean distance, and Signal to Noise Ratio. The range of the Manhattan distance and the Euclidean distance is 0 to +∞, and the range of the Signal to Noise Ratio is -∞ to +∞. The smaller the Manhattan distance and the Euclidean distance are, the lower the error will be. The higher the Signal to Noise Ratio is, the lower the error will be.

· Manhattan distance (the sum of the absolute values of the difference between the original real values and the corresponding floating-point values after quantization):

$$d_{M} = \sum_{i=1}^{t} \left| r_i - q_i \right|$$

· Euclidean distance (the square root of the sum of the squares of the difference between the original real values and the corresponding floating-point values after quantization):

$$d_{E} = \sqrt{\sum_{i=1}^{t} \left( r_i - q_i \right)^2}$$

· Signal to Noise Ratio

where $r_i$ is the original floating-point real value, $q_i$ is the corresponding real value after quantization, and t is the size of the input or output data.
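A small sketch (ours) of the three per-layer metrics follows; the dB form of the Signal to Noise Ratio is an assumption, since the exact formula is not reproduced above.

```python
import numpy as np

def quantization_error_metrics(r, q):
    """r: original floating-point values; q: the corresponding floating-point
    values after quantization and dequantization. Both are flattened tensors of size t."""
    diff = r - q
    manhattan = float(np.abs(diff).sum())
    euclidean = float(np.sqrt((diff ** 2).sum()))
    snr_db = float(10.0 * np.log10((r ** 2).sum() / (diff ** 2).sum()))  # assumed dB definition
    return manhattan, euclidean, snr_db

r = np.random.randn(4096)
q = np.round(r / 0.05) * 0.05                 # toy symmetric quantization with scale 0.05
print(quantization_error_metrics(r, q))
```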

For the second aspect of quantization error, the evaluation metrics are the accuracy metrics of the model. For the image classification application, we use Top-1 Accuracy (the class with the highest probability must be exactly the expected answer). For the small target detection application, we use mAP (Mean Average Precision). The calculation of mAP is the same as in the internationally renowned PASCAL VOC Challenge object detection competition.

    4.4 Baseline

Firstly, we use a traditional symmetric quantization method as a baseline. This method adopts symmetric quantization for both activations and weights, with the clipping range determined by KL divergence. This method is adopted by most embedded neural network accelerators, such as Nvidia's TensorRT [12], TVM [13], etc.

Secondly, as a baseline, we compare the proposed method with PyTorch and NNI on a PC. PyTorch supports INT8 quantization, allowing for a 4x reduction in model size and a 4x reduction in memory bandwidth requirements compared with typical FP32 models, and hardware support for INT8 computation is typically 2 to 4 times faster than FP32 computation. PyTorch supports multiple approaches to quantize a deep learning model; in most cases, the model is trained in FP32 and then converted to INT8. In addition, the torch.quantization.observer module in PyTorch integrates three calibration strategies to compute the clipping range, including MinMaxObserver, MovingAverageMinMaxObserver, and HistogramObserver, and there is no guide on how to choose the most suitable strategy. The easiest way (and the default option in PyTorch) is to directly take the minimum and maximum values with the MinMaxObserver. The method in NNI to compute the clipping range is also to take the minimum and maximum values.
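The three PyTorch calibration strategies mentioned above can be exercised as follows (a minimal sketch; defaults and exact module paths can vary across PyTorch versions):

```python
import torch
from torch.quantization.observer import (HistogramObserver,
                                         MinMaxObserver,
                                         MovingAverageMinMaxObserver)

calib = torch.randn(10000) * 2.0 + 1.0        # stand-in calibration activations
for observer_cls in (MinMaxObserver, MovingAverageMinMaxObserver, HistogramObserver):
    observer = observer_cls()                  # default affine quint8 configuration
    observer(calib)                            # feed calibration data through the observer
    scale, zero_point = observer.calculate_qparams()
    print(observer_cls.__name__, float(scale), int(zero_point))
```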

    4.5 Results

    4.5.1ResultsonFPGA

    · Classification Application

The models used for image classification are GoogleNet, MobileNetV2, and VGG16. For fair comparison and ease of reproducibility, we use well-trained models on the ImageNet dataset. For the image classification application, we test Top-1 Accuracy and FPS (how many frames can be processed per second).

The TIANJI NPU3.0 accelerator runs at a frequency of 200 MHz. The resource consumption of TIANJI NPU3.0 on the FPGA is shown in Table 2, including LUTs, flip-flops, Block RAMs and DSPs.

Table 2: Resource consumption on FPGA of TIANJI NPU3.0

The results for the classification application on TIANJI NPU3.0 are shown in Table 3. A basic requirement of inference on TIANJI NPU3.0 is that all arithmetic must be implemented using only integer operations, so it is a big challenge for the quantization method to reduce the accuracy loss. The proposed method ensures that all the layers of the neural network are executed with integer arithmetic. For the three classification models, the FPS of the proposed hybrid asymmetric quantization method is the same as the FPS of the traditional symmetric quantization method. So the proposed hybrid asymmetric quantization method can improve the classification accuracy by an average of 2.02% without affecting the speed on the FPGA. It can meet the accuracy requirements of image classification tasks.

Table 3: Experimental results of the proposed hybrid asymmetric quantization method and the traditional symmetric quantization method for the classification application on TIANJI NPU3.0

The results for the classification application on MLU220 are shown in Table 4. MLU220 only supports symmetric quantization, and only convolutional layers and FC layers can be quantized to INT8; all other types of layers are executed in FP32. The traditional method on MLU220 to compute the clipping range is to take the minimum and maximum values. We compare the proposed activation redistribution method with this traditional method on MLU220. The methods proposed in this paper are all computed offline and do not increase the computational overhead of the embedded devices, which is the same as for other traditional methods. The approach is valid for different embedded devices, because it computes the quantization parameters in a new way from a mathematical point of view, which has no effect on the embedded devices themselves.

Table 4: Experimental results of the proposed hybrid asymmetric quantization method and the traditional symmetric quantization method for the classification application on MLU220

Table 5: Comparison of the quantization error of the proposed hybrid asymmetric quantization method and the traditional symmetric quantization method for convolutional layers on TIANJI NPU3.0

Table 6: Comparison of the quantization error of the proposed hybrid asymmetric quantization method and the traditional symmetric quantization method for relu layers on TIANJI NPU3.0

    · Small Target Detection Application

The small target detection task is very challenging because the accuracy is very sensitive to quantization. The model we choose for this small target detection task is Yolo-v3 tiny, a typical object detection model that has been widely adopted.

The experimental results measuring the quantization error of the convolutional layers and relu layers for the small target detection application on TIANJI NPU3.0 are shown in Tables 5 and 6. It can be seen that the Manhattan distance and the Euclidean distance of the proposed hybrid asymmetric method are lower, and the signal-to-noise ratio (SQNR) of the proposed hybrid asymmetric method is higher.

The speed and accuracy results for the small target detection application on TIANJI NPU3.0 are shown in Fig. 5 and Table 7. For the small target detection model, the FPS of the proposed hybrid asymmetric quantization method is the same as the FPS of the traditional symmetric quantization method. So the proposed hybrid asymmetric quantization method can improve the detection accuracy by 5.52% without affecting the speed on embedded devices. It can meet the accuracy requirements of small object detection tasks.

Table 7: Experimental results of the proposed hybrid asymmetric quantization method and the traditional symmetric quantization method for the small target detection application on TIANJI NPU3.0

Figure 5: Small object detection on the HRSID dataset

The results for the small target detection application on MLU220 are shown in Table 8. The proposed activation redistribution method can improve the detection accuracy from 82.67% to 82.83%.

Table 8: Experimental results of the proposed hybrid asymmetric quantization method and the traditional symmetric quantization method for the small target detection application on MLU220

    4.5.2ResultsonPC

    · Classification Application

The models used for image classification to compare with PyTorch and NNI are the same as in Section 4.5.1. The evaluation metric is Top-1 Accuracy. There are three ways in PyTorch to compute the clipping range, including MinMax, MovingAverage, and Histogram. We compare the proposed hybrid asymmetric quantization method with PyTorch and NNI, and compare the proposed activation redistribution method with MinMax, MovingAverage, and Histogram in PyTorch on fake-quantization software.

The results for the classification application are shown in Table 9 and Fig. 6. For the three classification models, the accuracy of the proposed hybrid asymmetric quantization method (66.85%, 67.38%, 66.26%) is the highest compared with PyTorch (66.50%, 66.82%, 65.22%) and NNI (66.74%, 66.50%, 65.20%). At the same time, there are three ways in PyTorch and NNI to compute the clipping range, and the best strategy differs between models. The proposed activation redistribution method outperforms the three strategies of PyTorch and NNI.

Table 9: Experimental results of the proposed hybrid asymmetric quantization method, PyTorch and NNI for the classification application on fake-quantization software

Figure 6: Comparison of the proposed method and PyTorch for the image classification application. The vertical axis is the Top-1 Accuracy

    · Small Target Detection Application

The model used for small target detection to compare with PyTorch is the same as in Section 4.5.1. The evaluation metric is mAP. We compare the proposed hybrid asymmetric quantization method with PyTorch, and compare the proposed activation redistribution method with MinMax, MovingAverage, and Histogram in PyTorch on fake-quantization software.

The results for the small target detection application are shown in Table 10. The proposed hybrid asymmetric quantization method can improve the detection accuracy compared with PyTorch.

Table 10: Experimental results of the proposed hybrid asymmetric quantization method, PyTorch and NNI for the small target detection application on fake-quantization software

    5 Conclusion and Future Directions

We propose an activation redistribution-based hybrid asymmetric quantization method for Integer-only inference of neural networks. This method is suitable for both symmetric and asymmetric distributions. When the proposed hybrid asymmetric Integer-only quantization method is applied to classification models, we achieve an average accuracy improvement of 2.02% compared with the traditional symmetric quantization method. When it is applied to the Yolo-v3 tiny model for detection, the accuracy improvement is 5.52% compared with the traditional symmetric quantization method. Thus, our method allows neural networks to be quickly and easily deployed on resource-constrained embedded devices.

For further work, we believe that making the distribution more friendly to quantization is a promising research direction to further improve the quantization performance.

Acknowledgement: The authors acknowledge the support received from the Qian Xuesen Youth Innovation Foundation of China Aerospace Science and Technology Corporation under Grant 2022JY51.

Funding Statement: The Qian Xuesen Youth Innovation Foundation of China Aerospace Science and Technology Corporation (Grant Number 2022JY51).

Author Contributions: The authors confirm their contributions to the paper as follows: study conception and design: Lu Wei, Zhong Ma; data collection and experiments: Chaojie Yang; analysis and interpretation of results: Lu Wei, Chaojie Yang; draft manuscript preparation: Lu Wei, Zhong Ma. All authors reviewed the results and approved the final version of the manuscript.

Availability of Data and Materials: The data that support the findings of this study are available from the accessible website https://github.com/ycjcy/Hybrid-Asymmetric-Quantization.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
