
    Compiler IR-Based Program Encoding Method for Software Defect Prediction

    2022-11-11 10:47:20
    Computers, Materials & Continua, 2022, Issue 9

    Yong Chen, Chao Xu*, Jing Selena He, Sheng Xiao and Fanfan Shen

    1 School of Information Engineering, Nanjing Audit University, Nanjing, 211815, China

    2 Department of Computer Science, Kennesaw State University, Kennesaw, 30144-5588, USA

    3 Information Science and Engineering Department, Hunan First Normal University, Changsha, 410205, China

    Abstract: With the continuous expansion of software applications, people’s requirements for software quality are increasing. Software defect prediction is an important technology for improving software quality. It typically encodes software into a set of features and applies machine learning to build defect prediction classifiers, which estimate whether a software area is clean or buggy. However, current encoding methods are mainly based on traditional manual features or on the AST of the source code. Traditional manual features struggle to reflect the deep semantics of programs, and the AST contains a lot of noise, which affects the expression of semantic features. To overcome these deficiencies, we combine the compiler Intermediate Representation (IR) with Convolutional Neural Networks (CNN) and propose a novel compiler IR-based program encoding method for software defect prediction (CIR-CNN). Specifically, our program encoding method is based on the compiler IR, which eliminates a large amount of noise in the syntax structure of the source code and facilitates the acquisition of more accurate semantic information. Secondly, with the help of data flow analysis, a Data Dependency Graph (DDG) is constructed on the compiler IR, which helps to capture deeper semantic information of the program. Finally, we use the widely used CNN model to build the software defect prediction model, which increases the adaptive ability of the method. To evaluate the performance of CIR-CNN, we use seven projects from the PROMISE datasets to set up comparative experiments. The experimental results show that, in WPDP, our CIR-CNN method improves the prediction accuracy by 12% over the AST-encoded CNN-based model and by 20.9% over the traditional features-based LR model. In CPDP, it improves on the AST-encoded DBN-based model by 9.1% and on the traditional features-based TCA+ model by 19.2%.

    Keywords: Compiler IR; CNN; data dependency graph; defect prediction

    1 Introduction

    With the continuous expansion of software applications, people’s requirements for software quality are increasing. People hope to eliminate as many software defects as possible before release. Nevertheless, as software grows larger and more complex, it is difficult to accurately locate the defects of a program at the semantic level. Software defect prediction is a helpful technology for detecting semantic defects. It typically encodes the source code into a set of software features and applies machine learning to build defect prediction classifiers, which estimate whether a software area is clean or buggy [1-5]. However, the modeling process of defect prediction faces some common challenges, such as how to encode the program, how to extract features from high-dimensional defect datasets, and how to select suitable defect training models. In this paper, we focus on how to encode the program to extract features for defect prediction.

    Software features are the basis of defect prediction. Researchers have designed various defect features along different dimensions by analyzing defect-related factors. Examples include code size, code complexity (e.g., Halstead features based on the number of operators and operands, McCabe features based on dependencies, CK features for object-oriented programs), and code churn features. However, those features are traditionally handcrafted from shallow representations of the programs’ source code or development process. They do not capture the deep semantic information of the program, which is an important factor for software defect prediction.

    To mine the semantic information of the program for building accurate software defect prediction models, some approaches leverage a powerful representation learning technique, namely deep learning, to capture the semantic representation of programs automatically. They often use the Abstract Syntax Trees (ASTs) of the source code as a basis and transform the AST nodes of the program into token vectors. Word embedding techniques [6] are then applied to encode the token vectors as numerical vectors, which serve as inputs to deep learning models (e.g., Deep Belief Network (DBN) [7], Convolutional Neural Networks (CNN) [8], and Recurrent Neural Networks (RNN) [9]) that automatically extract the semantic features of the program. Programs have well-defined syntax and rich semantics hidden in their ASTs, which can help build a more accurate software defect prediction model. However, capturing program semantics based on ASTs still has some deficiencies.

    Firstly, Phan et al. [10] show that code with the same semantics, such as File1.c and File2.c in Fig. 1, can have ASTs with different structures. This affects the performance of defect prediction because the weight matrix for each node is determined by its position in the AST.

    Secondly, ASTs are not suitable for deep semantic analysis such as data flow analysis, which makes software defect features less prominent. Figs. 2a and 2b show two code snippets extracted from the commit information of the Redis project on GitHub. The only difference between the buggy code and the clean code is the flag constant on line 11. The AST structures of the two snippets are the same; the only differing node is the constant node, whose value is ignored by current AST-based methods in order to limit the number of tokens. Therefore, current AST-based methods fail to capture the defect features in Fig. 2a.

    Furthermore, most current AST-based deep learning defect prediction models seldom consider the type information of variables, which is also an important part of program semantics. Figs. 3a and 3b show two code snippets. Both define a function delay10 that consumes the time of ten integer additions, which is often used in embedded systems to satisfy precedence constraints. However, Fig. 3a has a defect: because the variable i is used only in the function delay10 and has no side effect, the body of delay10 will be optimized away, which would break the precedence constraints at the call sites. Since the only difference between the two code snippets is the type of i, their ASTs are almost the same. It is therefore difficult for AST-based defect prediction methods to capture the defect in Fig. 3a.

    Figure 1: Example of the same semantics with different loop statements [10]

    As we all know, the compiler is an essential tool for program transformation, and the AST is the representation used by the compiler front end. For better program analysis and optimization, compilers usually design a well-structured internal representation, called the Intermediate Representation (IR). Peng et al. [11] show that IR is more suitable for learning code representations than high-level programming languages. Motivated by the expressive power of IR and its wide use in program analysis, we propose a novel Compiler IR-based program encoding method for defect prediction with a CNN model (CIR-CNN), aiming to improve the performance of software defect prediction on seven PROMISE datasets. The main contributions of the paper can be summarized as follows:

    • Unlike AST-based program feature extraction methods, this paper encodes the semantic defect features of the program based on the compiler IR, which is expected to yield more accurate program semantic features for software defect prediction.

    • Based on LLVM IR, we designed a token representation that retains type information in order to capture type-related defect features.

    • Using data dependencies, we built the DDG as the basis of program analysis, which helps to extract more accurate data dependency features.

    • To preserve the semantic and structural information of the graph, we redesigned the weighted adjacency matrix to represent the DDG of the program and used a two-dimensional CNN to train and build the defect prediction model.

    Figure 2: A buggy example from Redis

    Figure 3: Motivating example of a type-related defect

    The outline of this paper is as follows. In the next section, we briefly introduce the related work and background materials used in our work. Section 3 describes our proposed CIR-CNN approach, and the experiments are set up and evaluated in Section 4. Section 5 identifies some limitations of this research work. We conclude the paper and highlight future directions in the last section.

    2 Related Work and Background

    2.1 Software Defect Prediction

    Software Defect Prediction (SDP) has long been a research hotspot in software engineering, and researchers have carried out extensive work in this field [12-14]. Many machine learning methods have been designed for building defect prediction models [15,16]. Ji et al. [17] proposed an improved Naive Bayes (NB) approach using kernel density estimation. They compared their method against four well-known classification algorithms on 34 software releases from 10 open-source projects in the PROMISE repository. Li et al. [18] examined C4.5, a Decision Tree (DT) algorithm, in defect prediction. Nam et al. [19] proposed TCA+, which adopts a state-of-the-art technique called Transfer Component Analysis (TCA) and optimizes TCA’s normalization process to improve cross-project defect prediction. Xia et al. [20] proposed HYDRA, which leverages a genetic algorithm and ensemble learning (EL) to improve cross-project defect prediction. However, HYDRA requires massive training data to build and train the prediction models. Tabassum et al. [21] investigated when and to what extent cross-project data are useful for Just-In-Time Software Defect Prediction in a realistic online learning scenario. Zain et al. [22] proposed 1D-CNN, a deep learning architecture that extracts useful knowledge, identifies and models the knowledge in the data sequence, reduces overfitting, and finally predicts whether units of code are defect-prone. However, these methods are based on traditional handcrafted features, which are shallow representations of the programs’ source code or development process and do not capture the deep semantic information of the program. They are affected by people’s experience and have weak adaptive ability.

    Recently, with the rapid development of deep learning and the increasing demand for semantic-based software defect prediction, many researchers have explored the application of deep learning methods to software defect prediction. They use deep learning to automatically extract the semantic features of programs for building the defect prediction model. Wang et al. [7] leveraged DBN for software defect prediction. They used selected AST sequences taken from source code as input to the DBN model, which generates new expressive features, and used machine learning models for classification. Li et al. [8] proposed a CNN-based defect prediction model, which leveraged word embedding and a CNN model for defect prediction. Their experimental results show that the defect prediction performance of the CNN model is better than that of Wang’s DBN [7]. Pan et al. [23] improved Li’s CNN for within-project defect prediction (WPDP). The experimental results show that their CNN model was comparable to Li’s CNN model and significantly outperformed state-of-the-art machine learning models. Hoa et al. [9] leveraged tree-based LSTM models to predict defects. However, their results were not as good as those of Li’s CNN model [8]. Sun et al. [24] proposed an unsupervised domain adaptation approach based on discriminative subspace learning (DSL) for CPDP. However, these methods are based on the AST of the source code, which may be affected by how the program is implemented. At the same time, they are not suitable for deep semantic analysis and are insensitive to type-related defects, as shown in Figs. 1-3.

    There has also been research on deep defect prediction targeting assembly code [10], which leveraged a CNN model to learn from assembly instructions. However, assembly code is architecture-dependent and difficult to port to other platforms.

    2.2 Compiler IR

    The Intermediate Representation (IR) is the foundation on which a compiler realizes cross-language analysis and optimization. Based on the IR, the compiler analyzes the semantics of the source code and executes a variety of optimization passes to eliminate the useless code of the source program, which usually carries noisy semantic information. Therefore, compiler optimization preserves normalized and meaningful semantic information in the final IR, and we can obtain salient semantic features from the IR. Different compilers have their own IRs. In this paper, we use the IR of the LLVM compiler, called LLVM IR, for the following reasons.

    • Unlike GCC, which has multiple IRs such as GENERIC, GIMPLE, and RTL, LLVM has a single IR. It is well defined and more suitable for processing and transformation.

    • The LLVM IR aims to be lightweight and low-level while being expressive, typed, and extensible at the same time. It is convenient for extracting type information to help defect prediction.

    • There are many program conversion and analysis tools for LLVM IR, such as JLang and RetDec. We can easily transform source code in different high-level programming languages, and even binary code, to LLVM IR.

    From the perspective of LLVM IR, the semantic information of the program is more prominent. For example, in Fig. 1, although File1.c and File2.c have different loop structures, their LLVM IRs are the same, as shown in Fig. 1c.

    In Fig. 2, rioGetWriteError and rioGetReadError are inline functions. The buggy code in Fig. 2a defines the two functions identically. The conditions on lines 15 and 17 are therefore considered the same, and lines 17 and 18 are safely removed during compiler optimization. As a result, the final LLVM IRs of the buggy code and the clean code differ, as shown in Figs. 2c and 2d. For the two code snippets in Figs. 3a and 3b, the LLVM IRs are shown in Figs. 3c and 3d, respectively. The differences between these two IR snippets are obvious. For the code in Fig. 3a, all of the code is optimized away and the function becomes empty (see Fig. 3c). We can easily distinguish the semantic features from these IR differences.

    At the same time, compiler IR lends itself to control flow analysis and data flow analysis, which represent the relationships between instructions and data and help with extracting semantic features. Therefore, we expect that applying a deep-learning-based feature extraction method to the compiler IR would yield more accurate semantic features of programs. It should be noted that, although ASTs are also one form of compiler IR, the compiler IR mentioned in this paper is the IR obtained after compiler optimization, which is used directly as the input to the compiler’s code generation.

    2.3 CFG and DDG

    A CFG is a directed graph G = (V, E), where V is the set of vertices {v1, v2, ..., vn} and E is the set of directed edges {<vi, vj>, <vk, vl>, ...}. In the CFG, each vertex represents a basic block, that is, a linear sequence of IRs with one entry point (the first IR executed) and one exit point (the last IR executed). The directed edges show the control flow paths. A CFG can display the relationships between basic blocks, the dynamic execution status, and the statement table corresponding to each basic block in a procedure. However, it cannot deal well with the relationships between instructions within a basic block. For example, in Fig. 4, Fig. 4b is the LLVM IR for Fig. 4a. The CFG has only one node because there is no branch in the source code. If we analyze the semantics of the source code based on this CFG, we obtain the order dependency shown in Fig. 4c. This ordering separates instruction 2 from instruction 7, splitting the most critical defect feature between them. Therefore, we further construct the DDG based on the CFG.

    Figure 4:The motivation example for DDG

    A DDG is also a directed graph G = (V, E), where V is the set of vertices {v1, v2, ..., vn} and E is the set of directed edges {<vi, vj>, <vk, vl>, ...}. However, in the DDG, each vertex represents an IR, and the directed edges show the data dependencies. We use IRi to represent the i-th IR in the program. If IRi must execute before IRj, there is a directed edge from IRi to IRj. For example, the DDG of Fig. 4b is Fig. 4d. Instruction 7 and instruction 2 both use the same storage space “%3”, and instruction 2 is a function call that may have side effects. Therefore, instruction 2 must execute before instruction 7, and a directed edge from instruction 2 to instruction 7 is added, which makes the defect features more prominent.

    2.4 CNN

    A Convolutional Neural Network (CNN) is a feedforward neural network whose structure is built around convolution operations [25]. It has been successfully applied in many practical fields, including image classification, speech recognition, and natural language processing [26-35].

    A CNN includes a feature extractor composed of convolution layers and pooling layers. A convolution layer usually contains several feature maps. Each feature map is composed of neurons arranged in a rectangle. Neurons in the same feature map are connected only to some adjacent neurons and share weights. These shared weights are the convolution kernels. A convolution kernel is generally initialized as a matrix of small random values. During network training, the convolution kernel learns reasonable weights. The direct benefit of the convolution kernel is to reduce the connections between network layers and to reduce the risk of overfitting. The pooling layer follows the convolution layer and is also composed of multiple feature maps. Each feature map of the pooling layer corresponds to exactly one feature map of the preceding layer, and max-pooling is often used.

    In recent years, some researchers [7-9] have explored the effect of CNN in building software defect prediction models and reached positive conclusions. However, software defect prediction currently focuses mainly on one-dimensional CNN, whereas CNN performs best in two-dimensional settings such as image recognition. Therefore, in our work, we leverage a two-dimensional CNN, trained on the adjacency matrix of the program, for effective feature generation from LLVM IR.

    3 CIR-CNN

    3.1 Overall Framework

    Fig. 5 shows the steps of our compiler IR-based program embedding method for defect prediction with CNN: a) transform the program to compiler IR; b) generate the DDG from the IR; c) extract and encode tokens for the DDG nodes; d) encode the program as a weighted adjacency matrix; e) use the weighted adjacency matrices as inputs to train and build the CNN model for software defect prediction; f) when a program needs to be checked for defects, first obtain its weighted adjacency matrix and then feed it into the built model, which predicts whether the program is clean or buggy.

    Figure 5:The framework of the CIR-CNN

    3.2 Transform the Program to Compiler IR

    The compiler IR is a normalized representation of the program that preserves the program’s semantics. The first step of our method is to transform the program to LLVM IR. Specifically, the transformation to LLVM IR falls into two cases.

    • The input is source code. We use the corresponding compiler to complete the transformation. For example, we can use Clang to transform C and C++ source code to LLVM IR and use JLang to transform Java source code to LLVM IR.

    • The input is binary code. We use the RetDec tool to decompile it to LLVM IR.

    3.3 Generate the DDG from IR

    In order to obtain more accurate program semantic information to assist software defect prediction, we first extract the CFG from the IR and then construct the DDG of the program from the CFG.

    Fig. 6 shows an example of transforming a piece of IR (Fig. 6a) to a CFG (Fig. 6b) and then generating the DDG (Fig. 6c). In Fig. 6a, the first line is the definition of the function, which contains the IRs that follow.

    Figure 6:The example of IR to DDG transformation

    For the CFG construction, the primary work is to analyze the branch IRs. In Fig. 6a, the first seven IRs are load/store and comparison instructions, and they are executed sequentially without branching. At the eighth instruction, “br”, the program jumps to different positions according to the comparison result in “%7”. Therefore, the first eight statements are in the same basic block and can be organized as one node in the CFG, called B0 in Fig. 6b. Similarly, the IRs in lines 9-14, 15-19 and 20-23 are also basic blocks, which correspond to the nodes B1, B2 and B3 in Fig. 6b, respectively. Then, we analyze the last IR of each node in the CFG and extract the control flow information to form the edges of the CFG. For example, the last IR of B0 shows that execution after B0 continues at “label 8” or “label 12”. “label 8” corresponds to the entry of the B1 node, and “label 12” corresponds to the entry of the B2 node. Hence, the converted CFG has an edge from B0 to B1 and an edge from B0 to B2. Similarly, we obtain the edges from B1 to B3 and from B2 to B3, and form the CFG shown in Fig. 6b.

    Once we have the CFG of the program, the DDG can be generated by Algorithm 1, whose symbols are defined in Tab. 1.
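    The block partitioning and edge extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it works on textual LLVM IR lines, treats a small assumed set of opcodes as block terminators, and reads branch targets from `label %name` operands. The function names `split_basic_blocks` and `cfg_edges` are hypothetical.

```python
import re

# Opcodes that end a basic block (an assumed subset of LLVM terminators).
TERMINATORS = {"br", "ret", "switch", "unreachable"}


def split_basic_blocks(ir_lines, entry_name="entry"):
    """Group instruction lines into basic blocks keyed by label name."""
    order, blocks = [], {}
    name, current = entry_name, []
    for raw in ir_lines:
        line = raw.strip()
        if not line or line.startswith(";"):   # skip blanks and comments
            continue
        label = re.fullmatch(r"([\w.]+):.*", line)
        if label:                              # e.g. "8:" names the next block
            name = label.group(1)
            continue
        current.append(line)
        if line.split()[0] in TERMINATORS:     # terminator closes the block
            blocks[name] = current
            order.append(name)
            current = []
    if current:                                # trailing lines without terminator
        blocks[name] = current
        order.append(name)
    return order, blocks


def cfg_edges(order, blocks):
    """Read the last IR of each block and emit (source, target) edges."""
    edges = []
    for name in order:
        last = blocks[name][-1]
        if last.split()[0] == "br":
            for target in re.findall(r"label %([\w.]+)", last):
                edges.append((name, target))
    return edges


if __name__ == "__main__":
    example = [
        "%3 = alloca i32",
        "store i32 %0, i32* %3",
        "%5 = load i32, i32* %3",
        "%7 = icmp sgt i32 %5, 0",
        "br i1 %7, label %8, label %12",
        "8:",
        "store i32 1, i32* %3",
        "br label %16",
        "12:",
        "store i32 2, i32* %3",
        "br label %16",
        "16:",
        "ret void",
    ]
    order, blocks = split_basic_blocks(example)
    print(order)
    print(cfg_edges(order, blocks))
```

    The example IR is made up for illustration and is not the content of Fig. 6a, but it has the same shape: one entry block that branches to two blocks, both of which jump to a common exit block.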

    Table 1: The description of the symbols in Algorithm 1

    In the algorithm, for each node in the CFG, each IR is traversed in turn and encapsulated as a DDG node by the trans function. The relationship between each IR and its DDG node is saved in H (lines 2-8). Then we traverse every node of the CFG again and analyze the attribute of each symbol s that IRi defines or uses. If s is defined by IRi, we establish the mapping between s and H(IRi) and save it in R (lines 22-24). For example, for the first IR in Fig. 6, the operand “%3” indicates that a 32-bit space is defined, so the mapping from “%3” to H(IR1) is established and saved to R. If the symbol s is used in IRi, we look up its definition def_s in R. If def_s and IRi are in the same basic block (line 14), an edge from H(def_s) to H(IRi) is added to E′. For example, in the basic block B0, IR3 uses the “%3” space defined by IR1, so an edge from H(IR1) to H(IR3) is added to E′. If def_s does not exist, or def_s and IRi belong to different basic blocks, we get all parent blocks pb of the block containing IRi and add an edge from the exit IR of each pb to H(IRi) (lines 12-21). For example, IR9 belongs to B1 and uses the data “%4” that is defined by IR2 in B0. Therefore, we get the parent blocks of B1, of which B0 is the only one, and add an edge from the exit IR of B0, H(IR8), to H(IR9). When all nodes have been analyzed, we create an empty root node vroot. For every node in V′ that has no incoming edge, we add an edge from vroot to it (lines 29-33). Finally, we add vroot to V′ and return the DDG (lines 34-35). The DDG of the example in Fig. 6b is shown in Fig. 6c.
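    Since Algorithm 1 itself is not reproduced here, the following Python sketch follows only the prose description above. The input encoding is an assumption made for illustration: each IR is given as a pair (defined symbol or None, list of used symbols), blocks map to ordered IR identifiers, and `parents` lists the CFG predecessors of each block. The names `build_ddg`, `defs` and `"root"` are hypothetical stand-ins for the paper's R and vroot, and the node map H is left implicit by using the IR identifiers directly as DDG nodes.

```python
def build_ddg(blocks, parents, instrs):
    """Build DDG nodes and edges from a CFG.

    blocks:  {block_name: [ir_id, ...]} in program order
    parents: {block_name: [parent_block_name, ...]}
    instrs:  {ir_id: (defined_symbol_or_None, [used_symbols])}
    """
    block_of = {ir: b for b, irs in blocks.items() for ir in irs}
    nodes = set(instrs)
    edges = set()
    defs = {}  # symbol -> IR that defines it (plays the role of R)

    for block, irs in blocks.items():
        for ir in irs:
            defined, used = instrs[ir]
            for symbol in used:
                d = defs.get(symbol)
                if d is not None and block_of[d] == block:
                    # Definition in the same block: direct dependency edge.
                    edges.add((d, ir))
                else:
                    # Otherwise depend on the exit IR of every parent block.
                    for pb in parents.get(block, []):
                        edges.add((blocks[pb][-1], ir))
            if defined is not None:
                defs[defined] = ir

    # Connect an artificial root to every node without an incoming edge.
    has_incoming = {dst for _, dst in edges}
    for node in sorted(nodes - has_incoming):
        edges.add(("root", node))
    nodes.add("root")
    return nodes, edges


if __name__ == "__main__":
    blocks = {"B0": [1, 2, 3], "B1": [4]}
    parents = {"B1": ["B0"]}
    instrs = {
        1: ("%3", []),            # defines %3
        2: ("%4", []),            # defines %4
        3: (None, ["%0", "%3"]),  # uses %3 from the same block
        4: ("%9", ["%4"]),        # uses %4 from the parent block
    }
    nodes, edges = build_ddg(blocks, parents, instrs)
    print(sorted(edges, key=str))
```

    In this toy input, IR 3 depends directly on IR 1 because both are in B0, while IR 4 depends on the exit IR of B0 (IR 3) because its operand is defined in a different block, mirroring the IR9 example in the text.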

    3.4 Extracting and Encoding Tokens for the DDG Nodes

    To let the CNN-based deep learning model automatically extract software features from the DDG for defect prediction, we encode each node of the DDG as a numerical vector. As in existing mainstream methods, the encoding process includes two stages: a) extracting the tokens from the nodes; b) transforming the tokens into numerical vectors.

    For the first stage, each DDG node encapsulates a compiler IR, which is commonly divided into an operator and operands, so we design the token of a DDG node to contain two parts as well: an operator string and an operands string. For the operator string, if the instruction is a “call”, we set the token string according to the specific callee. When the callee is a system library function such as “printf”, the callee name is used as the operator string. Otherwise, the string “call” is used as the operator string. For example, the operator string of “%3 = call i8* @malloc(i64 512)” is “malloc”, and the operator string of “%3 = tail call i8* @calScore(i64 512)” is “call”, where “calScore” is a user-defined function. For an operator other than “call”, the operator name in LLVM IR is used as its operator string. For example, the operator string of “store i32 %0, i32* %3” is “store”. The operands string is represented by the types of the operands in the IR, with the following three cases.

    • If the type of the operand is a basic system type in LLVM, such as “int32” or “int8”, we use the corresponding LLVM string to represent it, such as “i32” or “i8”.

    • If the type of the operand is user-defined, we use “mytype” to represent it.

    • If the type is a pointer type, we keep “*” after the type.

    After obtaining the operator string and the operands string, we use “_” to join them into the token of the DDG node. For example, the token of the node “store i32 %0, i32* %3” is “store_i32_i32*”, and the token of the node “%3 = call i8* @malloc(i64 512)” is “malloc_i8*”.
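    The tokenization rules above can be illustrated with a small Python sketch. It is a regex-based approximation under several assumptions rather than the authors' tool: the set `SYSTEM_CALLS` is a made-up stand-in for the list of system library functions, named types starting with “%” are treated as user-defined, and for calls only the return type is used, as in the “malloc_i8*” example. The function names are hypothetical.

```python
import re

# Assumed stand-in for the list of system library functions.
SYSTEM_CALLS = {"malloc", "free", "printf"}

# A type followed by an operand: basic LLVM types or named (user) types,
# optionally followed by pointer stars.
TYPE_RE = re.compile(
    r"(?<![\w.%])((?:%[\w.]+|i\d+|float|double)\**)\s+[%@\d-]"
)


def normalize_type(type_str):
    """Map a user-defined type to 'mytype' and keep pointer stars."""
    base = type_str.rstrip("*")
    stars = type_str[len(base):]
    return ("mytype" if base.startswith("%") else base) + stars


def node_token(ir_line):
    """Build the 'operator_operands' token for one IR instruction."""
    rhs = ir_line.split(" = ", 1)[-1].strip()   # drop "%x = " if present
    words = rhs.split()
    if words[0] in ("tail", "musttail", "notail"):
        words = words[1:]
    opcode = words[0]

    if opcode == "call":
        match = re.search(r"call\s+(\S+)\s+@([\w.]+)\(", rhs)
        if match is None:                        # e.g. indirect call
            return "call"
        return_type, callee = match.groups()
        operator = callee if callee in SYSTEM_CALLS else "call"
        return "_".join([operator, normalize_type(return_type)])

    operand_types = [normalize_type(t) for t in TYPE_RE.findall(rhs)]
    return "_".join([opcode] + operand_types)


if __name__ == "__main__":
    for line in (
        "store i32 %0, i32* %3",
        "%3 = call i8* @malloc(i64 512)",
        "%3 = tail call i8* @calScore(i64 512)",
        "store %struct.Node* %1, %struct.Node** %2",
    ):
        print(line, "->", node_token(line))
```

    A production version would more likely obtain opcodes and operand types from the LLVM API than from regular expressions; the sketch is only meant to make the three type rules and the call special case concrete.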

    Once the DDG nodes are converted into tokens, a method similar to Wang’s DBN [7] is applied. We first build a mapping between integers and tokens, in which each token is associated with a unique integer identifier ranging from 1 to the total number of tokens. Then, word embedding is used to map each DDG token to a numerical vector, which is trained with respect to the context of each token. However, unlike the one-dimensional word embedding used in NLP [6], we extract the association information of tokens from the graph structure. Although we could obtain a one-dimensional token sequence through graph traversal, doing so would destroy the graph structure, affecting the word embedding and, in turn, defect prediction. To maintain the information in the graph structure, we design a graph-based word embedding method based on CBOW [6]. We select the parent nodes and child nodes of the central DDG node as the context for training the distributed representation of that node, which preserves the graph structure to the greatest extent. For example, to evaluate node 7 in Fig. 6, we use its parents, nodes 5 and 6, and its child, node 8, as its context. Eq. (1) describes how we capture the context for the central word n and calculate the projection value (P denotes the parents of word n and C denotes all the children of word n). After the CBOW-based graph word embedding, tokens appearing in similar contexts tend to have similar vector representations that are close in the feature space, which helps the CNN learn the program semantics in certain contexts.
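    The context selection described above (parents and children of the central node) can be sketched as follows. The sketch covers only how training pairs might be assembled from the graph; it does not reproduce Eq. (1) or the CBOW training itself. The function name `graph_contexts` and the token strings in the example are hypothetical.

```python
from collections import defaultdict


def graph_contexts(edges, tokens):
    """Return (context_tokens, center_token) pairs for graph-based CBOW.

    edges:  iterable of (parent_node, child_node) DDG edges
    tokens: {node: token_string}
    """
    parents, children = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        children[src].append(dst)
        parents[dst].append(src)

    pairs = []
    for node in sorted(tokens):
        neighbours = parents[node] + children[node]
        # Nodes without a token (for example an artificial root) are skipped.
        context = [tokens[n] for n in neighbours if n in tokens]
        if context:
            pairs.append((context, tokens[node]))
    return pairs


if __name__ == "__main__":
    # Node 7 has parents 5 and 6 and child 8, as in the example in the text.
    edges = [(5, 7), (6, 7), (7, 8)]
    tokens = {5: "load_i32*", 6: "load_i32*", 7: "icmp_i32", 8: "br_i1"}
    for context, center in graph_contexts(edges, tokens):
        print(center, "<-", context)
```

    The resulting pairs could then be fed to any CBOW-style trainer, which predicts the center token from the averaged context vectors.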

    3.5 Program Encoded by Weighted Adjacency Matrix

    At present, the most successful applications of CNN are mainly in image recognition, where the input is two-dimensional. We reasoned that transforming the DDG into a two-dimensional representation would help the CNN model achieve better classification. Since the adjacency matrix is a widely used two-dimensional graph representation, we use it for our purposes. Once the DDG nodes are transformed into tokens and encoded as numerical vectors, the program can be expressed as an N×N weighted adjacency matrix M, where N is the number of tokens. To meet the CNN model’s constraint of a fixed input shape, we arrange the nodes of the DDG in descending order of occurrence frequency and take the first N nodes as the observation nodes to construct the adjacency matrix. We use mij to represent the weight in row i and column j of the adjacency matrix M, which is calculated by Eq. (2). In Eq. (2), nij denotes the number of edges from token i to token j in the DDG, tix denotes the x-th value in the numerical vector of token i, k is the length of the numerical vector, and ε is a very small positive number that prevents the denominator from being zero.

    The weight calculation is critical to the prediction model. The basic principle of software defect prediction is to detect whether the program has the characteristics of defects at the semantic level. In our method, we have normalized the semantics of the program into the DDG, so the weight should reflect the characteristics of the DDG. Since the DDG is a graph, its characteristics can be measured by the structural information of the graph, which is expressed by its nodes and edges. We therefore calculate the weight from two dimensions. The first concerns the nodes: the more similar two nodes are, the greater the value. Here, we take the reciprocal of the Euclidean distance between the node vectors, so the distance appears in the denominator of Eq. (2). The second is the dependency strength between nodes, which is expressed by the number of edges, i.e., the numerator of Eq. (2).
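    Based on the description above, the weight appears to take the form m_ij = n_ij / (sqrt(sum over x = 1..k of (t_ix - t_jx)^2) + ε). The Python sketch below implements this reading; it is a reconstruction from the prose, not a copy of Eq. (2), and the function names, the default ε, and the decision to leave entries without edges at zero are assumptions.

```python
import math
from collections import Counter


def top_tokens(node_tokens, n):
    """Pick the n most frequent tokens, in descending order of frequency."""
    return [token for token, _ in Counter(node_tokens).most_common(n)]


def weighted_adjacency(token_ids, edge_counts, vectors, eps=1e-8):
    """Build the N x N weighted adjacency matrix.

    token_ids:   list of the N selected tokens (row/column order)
    edge_counts: {(token_i, token_j): number of DDG edges from i to j}
    vectors:     {token: embedding vector of length k}
    """
    n = len(token_ids)
    matrix = [[0.0] * n for _ in range(n)]
    for i, ti in enumerate(token_ids):
        for j, tj in enumerate(token_ids):
            n_ij = edge_counts.get((ti, tj), 0)
            if n_ij:
                # Euclidean distance between embeddings in the denominator,
                # edge count (dependency strength) in the numerator.
                distance = math.dist(vectors[ti], vectors[tj])
                matrix[i][j] = n_ij / (distance + eps)
    return matrix


if __name__ == "__main__":
    tokens = top_tokens(["a", "b", "a", "c", "b", "a"], 2)
    vectors = {"a": [0.0, 0.0], "b": [3.0, 4.0]}
    edge_counts = {("a", "b"): 2}
    print(tokens)
    print(weighted_adjacency(tokens, edge_counts, vectors))
```

    Note that under this reading an edge between two nodes with the same token has distance zero, so its weight is dominated by 1/ε; how such entries should be scaled is not stated in the text.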

    3.6 Generate the CNN Model

    In this step, we take advantage of CNN’s powerful feature generation capability to capture the semantic and local structural information of the program represented by the weighted adjacency matrix. Because we focus on the impact of the IR-based encoding method on defect prediction and use CNN only as an application, we adopt an architecture and parameters similar to those of Li’s CNN [8]. In particular, our CIR-CNN consists of a two-dimensional convolutional layer and a max-pooling layer to extract global patterns, a flattening layer, one dense layer, and finally a logistic regression classifier that predicts whether an LLVM IR file is buggy. Our CIR-CNN framework is built with Keras, using TensorFlow as the backend. We take minibatch Stochastic Gradient Descent (SGD) as the optimization strategy and use the Adam algorithm as the optimizer to adjust the learning rate. The details of our CNN model are shown in Tab. 2. Because our method uses a two-dimensional CNN, the kernel size of the convolution layer and the pool size of the pooling layer cannot be taken from Li’s CNN [8]. Therefore, we determine their values through experiments; see Section 4 for details.

    Table 2: The parameters of CNN model

    3.7 Defect Prediction

    We use Logistic Regression as the final classifier. We process each file in both the training set and the test set following the above steps and obtain the weighted adjacency matrix of each source file. After we train our model using the training files with their corresponding labels, both the weights and the biases in our CNN and Logistic Regression are fixed. Then, for each file in the test set, we feed it into our defect prediction model, and the final classifier outputs a value indicating the probability that the file is buggy.

    4 Experimental Setup and Analysis

    In this section, we compare the performance of our proposed method with that of existing methods. In particular, our experiments were based on the following questions:

    • RQ1: How should the hyperparameters of the two-dimensional CNN be set?

    • RQ2: Does our proposed CIR-CNN method improve the performance of within-project defect prediction (WPDP)?

    • RQ3: Does our proposed CIR-CNN method improve the performance of cross-project defect prediction (CPDP)?

    All of our experiments were run on a Linux server with one Intel(R)Xeon(R)Gold 5218 CPU and one GeForce RTX 2080 Ti GPU.

    4.1 Dataset

    To facilitate the replication and verification of our experiments, we collected Java projects from the PROMISE data repository, which provides the version numbers, the class name of each file, and, most importantly, the defect label for each source file. In total, 7 Java projects were collected, and we selected two versions of each project as our dataset. Tab. 3 shows the details of these projects, including the project description, versions, the total number of files, the number of buggy files, the buggy rate, and the reduction in buggy rate caused by file deletion during our data preprocessing.

    Table 3: Evaluated projects for defect prediction

    Since our CIR-CNN method is based on LLVM IR, we downloaded the corresponding versions of the projects from open-source repositories rather than using the existing traditional features. To translate source files into LLVM IR, we used a tool called JLang, which translates Java 7 source code into LLVM IR except for some advanced reflection features, primarily related to generic types. Because of JLang’s limited functionality, several Java source files could not be parsed correctly, which hampered data preprocessing. We adopted the following four strategies to solve the problem.

    • Correct the source file grammar so that JLang can parse it, for example by renaming a variable called “enum” to “enum1”.

    • Delete the part of the source code that cannot be parsed.

    • Delete the file directly and add the corresponding class files to JLang’s dependency library.

    • Delete the project if most of its files fail to parse.

    4.2 Evaluation Metrics

    To measure the performance of defect prediction, we computed the F1 score, which combines Precision and Recall and is widely used for evaluating software defect prediction [7,8]. We estimated Precision, Recall, and the F1 score from four statistics: True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN). They are defined as follows: if a file is classified as defective and it is truly defective, the classification is a TP. If a file is classified as defective but it is clean, the classification is an FP. If a file is classified as clean but it is defective, the classification is an FN. Finally, if a file is classified as clean and it is in fact clean, the classification is a TN. We use these statistics to estimate Precision, Recall, and the F1 score by Eqs. (3)-(5), respectively.

    Precision = TP / (TP + FP)    (3)

    Recall = TP / (TP + FN)    (4)

    F1 = 2 × Precision × Recall / (Precision + Recall)    (5)

    Both Precision and Recall reflect the effectiveness of our prediction model. Precision is the ratio of the number of true positives to the number of files that our model predicts as defective. Recall, on the other hand, is the ratio of the number of true positives to the total number of truly defective files. Importantly, there is usually an inverse relationship between Precision and Recall, where higher Precision may come with lower Recall and vice versa. Thus, the F1 score, which is the harmonic mean of Precision and Recall, is used to combine the two metrics into a single summary measure.
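
    As a minimal sketch of how these metrics are computed from file-level predictions, the following Python snippet counts the four statistics and derives Precision, Recall, and the F1 score. The label vectors are made-up toy values (1 = defective, 0 = clean), and returning 0.0 for an undefined ratio is a convention assumed here, not something stated in the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = defective, 0 = clean)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision, Recall, and their harmonic mean; 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy labels for eight files (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn, precision, recall, f1)
```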

    4.3 Baseline Methods

    To evaluate the performance of our proposed CIR-CNN method, we conducted comparative experiments in two settings: WPDP and CPDP.

    In WPDP, we compare CIR-CNN with the Traditional LR, DBN, and CNN. The Traditional LR is a Logistic Regression classifier built on 20 traditional code features [36]. These features have been widely used in previous work to build effective defect prediction models [19]. The DBN is a state-of-the-art method that leverages a deep belief network (DBN) to automatically learn semantic features from token vectors extracted from the programs’ ASTs. For the defect prediction performance of the Traditional LR and DBN, we directly cite the experimental results in [7]. The CNN method is a variant of DP-CNN that feeds the CNN-learned features directly to the final classifier without combining them with traditional features. It is also based on ASTs but uses a CNN for automated feature generation from source code. We implement the CNN method in Keras with the same network architecture and parameter settings as Li’s CNN [8].

    In CPDP, we take DBN-CP and TCA+ [19] as our baseline methods, both of which are also evaluated in Wang’s paper [7]. As in WPDP, we use the performance results reported in Wang’s paper [7] for ease of comparison.

    4.4 Performance of CIR-CNN under Different Hyperparameters (RQ1)

    Because our model is a two-dimensional CNN, its kernel size and pool size are also two-dimensional, so the corresponding parameters of Li’s CNN [8] cannot be reused. Therefore, we use lucene, poi, xalan, and xerces from PROMISE as the dataset and tune the hyperparameters under WPDP evaluation. The WPDP evaluation method is described in Section 4.5. The kernel size and pool size are varied within the range {2×2, 3×3, 4×4, 5×5}, and the remaining parameters are set according to Tab. 2. Fig. 7 shows the per-project and average performance of the four projects in WPDP under different kernel sizes and pool sizes. The x-axis gives the values of the two hyperparameters, and the y-axis gives the F1 score of each project and the average F1 score under the corresponding hyperparameter setting.

    Figure 7: The performance of different hyperparameters

    From Fig. 7, we can see that different hyperparameter settings affect the prediction performance of different projects differently. For example, the F1 scores of xalan and xerces fluctuate greatly with the hyperparameters, while those of lucene and poi fluctuate less. This may be due to different defect characteristics. Therefore, to maximize the performance of software defect prediction, we select the setting with the largest mean F1 score, that is, both the kernel size and the pool size are set to 4×4.
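
    The selection rule used here (pick the setting whose mean F1 score over the tuning projects is largest) can be sketched as follows. The function name, the project names, and all F1 values are hypothetical placeholders chosen only to exercise the code; they are not the measurements plotted in Fig. 7.

```python
from statistics import mean


def select_by_mean_f1(f1_by_setting):
    """Return the setting with the largest mean F1 across projects, plus all means.

    f1_by_setting maps a hyperparameter setting to {project name: F1 score}.
    """
    means = {
        setting: mean(list(scores.values()))
        for setting, scores in f1_by_setting.items()
    }
    best = max(means, key=means.get)
    return best, means


# Placeholder scores (NOT results from the paper).
placeholder_f1 = {
    "2x2": {"project_a": 0.50, "project_b": 0.70},
    "3x3": {"project_a": 0.60, "project_b": 0.72},
    "4x4": {"project_a": 0.58, "project_b": 0.71},
}
best_setting, mean_f1 = select_by_mean_f1(placeholder_f1)
print(best_setting, mean_f1)
```

    Selecting by the mean trades per-project optimality for a single setting that works reasonably well across projects whose scores react differently to the hyperparameters.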

    4.5 Performance of CIR-CNN in WPDP (RQ2)

    To evaluate the performance of CIR-CNN in WPDP, we carried out comparative experiments on the seven projects listed in Tab. 3. We use the older version of each project to train the prediction models and the newer version as the test set to evaluate the trained models. The F1 score of the four competing methods on each project is shown in Tab. 4, with the highest F1 score in bold. For example, in the xalan project, we use xalan 2.4 as the training set and xalan 2.5 as the test set, and the F1 scores are 0.627, 0.681, 0.678, and 0.782 for Traditional LR, DBN, CNN, and CIR-CNN, respectively; the best result is 0.782 for CIR-CNN. From the experimental results, we can see that CIR-CNN outperforms the Traditional LR method in all cases. On average, the F1 score of CIR-CNN is 0.105 higher than that of the Traditional LR method, an improvement of 20.9%. This indicates that the features obtained from the compiler IR are more promising than traditional features for defect prediction. Comparing the F1 score of CIR-CNN with those of DBN and CNN, we find that in most cases the F1 score of CIR-CNN is competitive with that of DBN. Although DBN is 0.038 higher than CIR-CNN on average, this difference is mainly contributed by the ant project. If the ant project is excluded, the F1 score of CIR-CNN is 0.017 higher than that of DBN on average. More significantly, CNN, whose network structure and parameters are the same as those of CIR-CNN, obtains a lower F1 score than CIR-CNN in most cases. On average, the F1 score of CIR-CNN is 0.065 higher than that of CNN, an improvement of 12%. These results show that compiler IR-based features can achieve better performance than AST-based features in WPDP.

    Table 4: Performance comparison of different defect prediction methods in WPDP
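
    To make the arithmetic behind such comparisons concrete, the sketch below computes the absolute F1 gain and the relative improvement of CIR-CNN over each baseline for the xalan example quoted above (0.627, 0.681, 0.678, and 0.782). It assumes that an improvement percentage is the F1 gain divided by the baseline F1; the paper does not spell out this formula, and the reported 20.9% and 12% are averages over all seven projects, so the single-project values printed here differ from them.

```python
def relative_improvement(new_f1, baseline_f1):
    """Relative gain of new_f1 over baseline_f1, as a fraction of the baseline."""
    if baseline_f1 <= 0:
        raise ValueError("baseline F1 must be positive")
    return (new_f1 - baseline_f1) / baseline_f1


# F1 scores for xalan (train on 2.4, test on 2.5) as quoted in the text.
xalan_f1 = {"Traditional LR": 0.627, "DBN": 0.681, "CNN": 0.678, "CIR-CNN": 0.782}

gains = {}
for method, score in xalan_f1.items():
    if method == "CIR-CNN":
        continue
    absolute = xalan_f1["CIR-CNN"] - score
    relative = relative_improvement(xalan_f1["CIR-CNN"], score)
    gains[method] = (absolute, relative)
    print(f"CIR-CNN vs {method}: +{absolute:.3f} F1 ({relative:.1%})")
```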

    4.6 Performance of CIR-CNN in CPDP (RQ3)

    To evaluate the performance of CIR-CNN in CPDP, we collect a set of 10 cross-project test pairs. Each experiment takes two versions from two different projects, one used as the training set and the other as the test set. The F1 score of the three competing methods on each project pair is shown in Tab. 5, with the highest F1 score in bold. From the experimental results, we can see that CIR-CNN is still competitive in CPDP. In half of the cases, CIR-CNN obtains the highest F1 score. On average, the F1 score of CIR-CNN is 0.095 higher than that of TCA+ and 0.049 higher than that of DBN, which are improvements of 19.2% and 9.1%, respectively. These results indicate that compiler IR-based features are a better choice for CPDP.

    Table 5: Performance comparison of different defect prediction methods in CPDP

    4.7 Discussion

    From the above experimental results, we can see that our method performs better than the baseline methods on many projects, and we think the main reasons are as follows. Firstly, LR and TCA+ are based on traditional features. These features are designed from human experience and may not adapt to different programming patterns [7,8]. Li’s CNN and Wang’s DBN are based on the AST of the source program. Because the same semantics may be implemented with different grammatical structures, the extracted features are not distinctive, as described in Section 1. In contrast, our method is based on the compiler IR. It eliminates the syntactic differences of the source program and can therefore obtain more accurate semantic features, as shown in Section 2.2. Furthermore, we incorporate type information to extract more type-related features. Therefore, our method can use more, and more accurate, information to train the defect prediction model and achieve better performance.

    5 Limitations

    5.1 Implementation of CNN

    For the comparative analysis, we compare our CIR-CNN method with CNN, which is a state-of-the-art within-project defect prediction technique. Since the original implementation of CNN has not been released, we reimplemented it in Keras. Although we strictly followed the procedures described in the original work, our implementation may not reflect all of its implementation details. However, we tested our implementation on the data provided in that work, and the results show that our version achieves results very similar to the original ones. Hence, we are confident that our implementation reflects the performance of the original CNN.

    5.2 Dataset Selection

    We conducted our experiments using seven open-source projects in the PROMISE dataset, and they might not be representative of all software projects. Besides, we only evaluated CIR-CNN on projects written in Java. For projects that are not among these seven or that are written in other programming languages (e.g., C++ or Python), our proposed method might produce better or worse results. To make CIR-CNN more generalizable, in the future we will conduct experiments on a variety of projects, including open-source and closed-source projects, and extend our method to other programming languages.

    5.3 Dataset Preprocess

    When we convert the source programs of the dataset to compiler IR, we delete a few source files because of JLang’s limited support for the syntax of the Java programming language. However, according to the statistical results, the deleted files did not change the buggy rate of the dataset significantly. On average, the buggy rate only increased by 1.02% compared with the original PROMISE repository. Therefore, we believe that deleting these files does not substantially affect the validity of our results.
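
    A check of this kind can be sketched as below: compute the buggy rate before and after removing the unparsable files and report the shift. The file names and labels are hypothetical, and the shift is expressed in percentage points, which is an assumption since the text does not say whether the 1.02% figure is absolute or relative.

```python
def buggy_rate(labels_by_file):
    """Fraction of files labelled buggy; labels_by_file maps file name -> bool."""
    if not labels_by_file:
        raise ValueError("no files given")
    return sum(1 for buggy in labels_by_file.values() if buggy) / len(labels_by_file)


def buggy_rate_shift(labels_by_file, deleted_files):
    """Buggy rate before and after deletion, and the shift in percentage points."""
    deleted = set(deleted_files)
    kept = {name: buggy for name, buggy in labels_by_file.items() if name not in deleted}
    before = buggy_rate(labels_by_file)
    after = buggy_rate(kept)
    return before, after, (after - before) * 100.0


# Hypothetical project: ten files, the first four labelled buggy.
labels = {f"File{i}.java": i < 4 for i in range(10)}
# Suppose two clean files could not be translated and were dropped.
before, after, shift = buggy_rate_shift(labels, ["File8.java", "File9.java"])
print(f"before={before:.2f} after={after:.2f} shift={shift:+.2f} points")
```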

    6 Conclusion and Future Work

    To improve the ability of software defect prediction at the semantic level, we propose a novel compiler IR-based program encoding method for defect prediction with the CNN model (CIR-CNN). Specifically, we first transform the source code and binary code into the compiler IR using compiler and decompiler tools, respectively. Then, the data dependency graph (DDG) is constructed over the CFG by data flow analysis. Next, we encode the DDG into a weighted adjacency matrix with the help of word embedding technology. Finally, we use the weighted adjacency matrix as the input to an existing, mature CNN network structure to train and build the defect prediction model.

    Based on the compiler IR, our method eliminates the noise information at the syntax level of the source program and obtains more essential program semantic information for software defect prediction. Therefore, the extracted defect features are more accurate. At the same time, the token representation with types improves the ability to detect type-related defects. Therefore, our method can achieve good results in both WPDP and CPDP. We examined the performance of the features automatically extracted by the compiler IR-based program encoding method on two file-level defect prediction tasks, i.e., within-project defect prediction (WPDP) and cross-project defect prediction (CPDP). In WPDP, our experiments on seven open-source projects show that, on average, CIR-CNN improves on the AST-based CNN and the traditional feature-based method by 12% and 20.9%, respectively, in terms of F1 score, and that CIR-CNN is competitive with the state-of-the-art DBN-based method. In CPDP, our experiments on ten pairs of open-source projects show that, on average, CIR-CNN improves on the AST-based DBN and the traditional feature-based TCA+ method by 9.1% and 19.2%, respectively.

    The novelty of this paper is that we combine a compiler IR-based program encoding method with a two-dimensional CNN for software defect prediction, which improves performance on seven projects from the PROMISE dataset.

    In future work, we plan to make our method more generalizable and effective. Specifically, we will conduct experiments on more projects and more deep learning models, combine CIR-based features with other features, extend our method to other programming languages and to binary code, and integrate different programming languages for coordinated prediction.

    Funding Statement: This work was supported by the Universities Natural Science Research Project of Jiangsu Province under Grants 20KJB520026 and 20KJA520002; the Foundation for Young Teachers of Nanjing Auditing University under Grant 19QNPY018; and the National Nature Science Foundation of China under Grants 71972102 and 61902189.

    Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
