• <tr id="yyy80"></tr>
  • <sup id="yyy80"></sup>
  • <tfoot id="yyy80"><noscript id="yyy80"></noscript></tfoot>
  • 99热精品在线国产_美女午夜性视频免费_国产精品国产高清国产av_av欧美777_自拍偷自拍亚洲精品老妇_亚洲熟女精品中文字幕_www日本黄色视频网_国产精品野战在线观看 ?

    A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce

    2022-10-26 12:24:06LunHuShichengYangXinLuoSeniorHuaqiangYuanKhaledSedraouiandMengChuZhou
    IEEE/CAA Journal of Automatica Sinica 2022年1期

    Lun Hu,, Shicheng Yang, Xin Luo, Senior,Huaqiang Yuan, Khaled Sedraoui, and MengChu Zhou,

    Abstract—Protein-protein interactions are of great significance for human to understand the functional mechanisms of proteins.With the rapid development of high-throughput genomic technologies, massive protein-protein interaction (PPI) data have been generated, making it very difficult to analyze them efficiently. To address this problem, this paper presents a distributed framework by reimplementing one of state-of-the-art algorithms, i.e., CoFex, using MapReduce. To do so, an in-depth analysis of its limitations is conducted from the perspectives of efficiency and memory consumption when applying it for largescale PPI data analysis and prediction. Respective solutions are then devised to overcome these limitations. In particular, we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge sequence information of proteins. After that, its procedure is modified by following the MapReduce framework to take the prediction task distributively.A series of extensive experiments have been conducted to evaluate the performance of our framework in terms of both efficiency and accuracy. Experimental results well demonstrate that the proposed framework can considerably improve its computational efficiency by more than two orders of magnitude while retaining the same high accuracy.

    I. INTRODUCTION

    PROTEINS are crucial to provide valuable insight into understanding the mechanisms of cellular functions for a variety of living organisms. However, proteins have to interact with each other in order to function well [1]. In this regard,protein-protein interactions (PPIs) are of interest in biology because of their ability of regulating a variety of cellular processes, including but not limited to metabolic cycles, cell signaling and DNA transcription. Due to the successful applications of machine learning techniques in bioinformatics[2], [3], a considerable number of computational techniques have been developed in recent years to complete the task of PPI prediction.

    When performing PPI prediction, most of existing algorithms require additional information of proteins to be known in advance [4]. Some of them take into account genomic information, such as the conversion of gene-order[5], [6] and the priors of genomic features among interacting proteins [7]. Motivated by the observation that coevolving proteins are more likely to interact with each other, several techniques address the problem from an alternative viewpoint by making use of evolutionary information among proteins,such as phylogenetic trees [8], [9] and protein domain information [10]–[12]. Though efficient, these two kinds of algorithms commonly suffer from the disadvantage that both of genomic and coevolutionary information required by them is not always available especially for newly discovered proteins. To overcome this disadvantage, some attempts have been made by using the sequence information of proteins to predict PPIs [13]–[17], as such sequence information can be easily obtained by high-throughput technologies. Since the performance of sequence-based algorithms heavily relies on the evidence collected from protein sequences, different strategies are adopted to extract appropriate patterns from protein sequences for more accurate PPI prediction.

    In [18], a novel feature extraction algorithm, namely CoFex,has been proposed to make use of variable-lengthk-mers.Moreover, unlike existing sequence-based algorithms that intend to concatenate the feature vectors of two proteins,CoFex explicitly composes feature vectors for pairs of proteins based on the presence ofk-mers that provide certain evidence to support the existence of interactions. It can then discover the relationship between the sequence information of proteins and their interactions in a more intuitive manner. To evaluate its performance, a support vector machine (SVM)classifier with sigmoid kernel is trained based on its extracted feature vectors and the results show that such SVM beats the state-of-the-art algorithms in prediction accuracy.

    With the rapid development of high-throughput genomic technology, PPI data become so massive that most of existing prediction algorithms are no longer favorable. In order to fulfill the demanding requirements for large-scale PPI prediction, it is essential for us to design a scalable prediction algorithm that not only improves the computational efficiency but also retains high accuracy. Since existing prediction algorithms have been proved with a promising performance when applied to predict PPIs, we adopt a rather intuitive strategy by modifying an existing prediction algorithm instead of developing completely new algorithms that are capable of handling such huge data. To this end, we develop a distributed framework in this work by reimplementing CoFex under the MapReduce framework.

    To do so, we first conduct an in-depth analysis to identify the limitations of CoFex in terms of memory consumption and efficiency. There are several points worth noting. First of all,the sequence information of proteins has to be traversed multiple times when CoFex extracts coevolutionary patterns that arek-mers frequently observed in the interacting proteins with statistical significance. Such traversal mechanism could take much time for a large number of proteins. Secondly,CoFex requires the massive data of protein sequence information to be stored in the memory, thus leading to high memory consumption for large-scale PPI prediction. Lastly,its efficiency is constrained by the integration with SVM,which is not applicable to proceed the training process when the number of feature vectors is huge. In this work, respective solutions are developed to overcome these limitations and a distributed framework modified from CoFex, namely CoFex+,is proposed for significantly improved efficiency when applied to large-scale PPI prediction. The experimental results show that CoFex+achieves more than two orders of magnitude improvement in running time with only a negligible loss in accuracy.

    The rest of this paper is organized as follows. In Section II,a detailed literature review about related work of PPI prediction is presented. In Section III, preliminaries about the mathematical symbols used are introduced. The details of CoFex+are described in Section IV. Experimental results are discussed in Section V, following which we present an indepth discussion about the effectiveness and efficiency of CoFex+in Section VI. The paper ends with a conclusion in Section VII.

    II. RELATED WORK

    In this section, a detailed literature review about PPI prediction is presented. Existing algorithms can be classified by following different criteria. To distinguish them, we summarize the representative work by following the biological information used to construct the feature vectors of proteins or PPIs. The main sources of such information include genomic information, evolutionary profiles, and protein sequences, thus leading to three categories of algorithms. Moreover, we introduce recent attempts to conduct large-scale PPI prediction.

    A. Prediction Algorithms Using Genomic Information

    Genes located in genome sequences can hint at the existence of interaction between pairwise proteins at a comprehensive level due to the availability of whole genomic sequencing data. Dandekaret al. [5] find that proteins encoded by conserved gene pairs are more likely to interact with each other and such conserved gene pairs are within a low level conservation of gene-order. Although this algorithm is useful to predict PPIs by using the conservation of gene-order, it is incapable of predicting the interaction of proteins that are not conserved for gene sequences, such as those encoded by distantly located genes.

    In [7], different genomic features, such as messenger RNA coexpression, coessentiality and colocalization, are used to quantify the associations between them and PPIs. Based on these quantified associations, Bayesian networks are constructed to predict PPIs when the genomic features of query proteins are provided.

    Another motivation about the use of genomic information for PPI prediction is that interacting proteins are found to be homologous in another genome where they are fused into a single protein [19]. Furthermore, Enrightet al. [20] have developed a computational algorithm to identify such fusion events in different genomes to predict the PPIs of proteins involved in these events. However, its disadvantage is that it is unable to work with proteins whose fusion events are uncovered through the analysis of genomic sequencing.

    B. Prediction Algorithms Using Evolutionary Profiles

    Evolutionary information of proteins reveals the procedure of how proteins evolve across different species. Since proteins that co-evolve are more likely to interact with each other, the similarity in evolutionary information is of potential relevance to the prediction of PPIs as it indicates to what extent the two proteins co-evolve.

    Pazos and Valencia [8] make use of phylogenetic trees of proteins to indicate PPIs. They propose a distance measure to compute the similarity among the phylogenetic trees of proteins, thus determining whether there is a possible interaction among them. Another source of evolutionary information is the domain knowledge of proteins. It is believed that proteins interact as a result of their interacting domains, and hence there are many computational algorithms proposed to predict PPIs based on domain knowledge. A maximum likelihood estimation algorithm is applied to identify interacting domains that infer curated PPIs. Then with such inferred interacting domains the interactions between proteins can be predicted [21]. Similarly, random forest of decision trees is trained by taking into consideration all the proteins domains; and then used to predict PPIs [22].Maetschkeet al. [23] introduce the concept of inducers to make use of gene ontology (GO) information more effectively. Given two GO terms, their induced term set is composed of all GO terms along the shortest path between the two terms. After that, GO2PPI is proposed by integrating all induced GO terms with popular classification techniques to predict PPIs.

    C. Prediction Algorithms Using Protein Sequences

    Protein sequences, composed of amino acids, are the primary structures of proteins and the motivation behind the use of protein sequences for predicting PPIs derives from the hypothesis that sequence information may contribute to mediate PPIs.

    When extracting patterns from protein sequences, sequencebased algorithms normally takek-mers into account to compose feature vectors of proteins. Thesek-mers are amino acid sequence segments with length equal tok. Taking [13] as an example, the value of each element in the feature vector of a protein is the number of occurrences of a particular 3-mers in the sequence of that protein. Once all the feature vectors are obtained for all the proteins in the training dataset, these sequence-based algorithms apply them to traditional classifiers, such as SVM and random forest, to distinguish between interacting and non-interacting proteins. Similar to[13], an SVM-based prediction algorithm is proposed by combining a kernel function and a conjoint triad feature for the description of amino acids [14].

    In addition to SVM-based algorithms, Pitreet al. [24]propose PIPE to tackle the problem of predicting PPIs from a different perspective. Their idea is to measure how often pairs of subsequences in the two query proteins co-occur in pairs of proteins that are known to interact. In [25], PPIevo is proposed to extract the feature vectors from Position-Specific Scoring Matrix for each of proteins based on sequence information. It adopts the random forest classifier to predict PPIs based on the feature vectors of all protein pairs involved.

    Instead of constructing feature vectors for individual proteins, CoFex [18] intends to extract coevolutionary features from pairs of protein sequences for predicting PPIs. DNN-PPI[26] develop a novel deep neural network framework to automatically extract features, including including semantic associations between amino acids, position-related sequence segments, and their long- and short-term dependencies, from protein primary sequences. As a recent attempt, a novel computational model has been developed to predict PPIs effectively [17]. To do so, it first combines the F-vector,composition and transition to map each protein sequence onto numeric feature vectors, and then employs a principal component analysis to reconstruct the most discriminative feature subspaces as the input for weighted sparse representation classification.

    D. Prediction Algorithms Using PPI Network

    Since PPI networks have rich connectivity patterns [1],network-based prediction algorithms have been developed to make use of these patterns to characterize known PPIs in a given PPI network. Defining a degree-normalized score based on network paths with length three, L3 [27] is able to determine whether two proteins are interact or not if one is similar to the interacting partners of the other. To alleviate the restriction on the length of network paths, Wanget al. [28]develop a novel prediction algorithm based on a mixed membership stochastic block model. The algorithm simulates the generative process of a PPI network according to the likelihoods of pairwise proteins being grouped in the same protein complex. Recently, an improved graph representation learning method, namely S-VGAE, is proposed to predict PPIs based on both sequence information and network structure[29].

    E. Prediction Algorithms for Large-scale PPI Data

    At present, PPIs that have been identified take less than 20% of the whole interactome [30]. With the development of high-throughput technologies, the size and complexity of protein interaction data have increased significantly. A new challenge is thus raised for large-scale PPI prediction.Recently, several attempts have been made in this field.

    In order to achieve the purpose of effectively and accurately predicting large-scale PPIs, Youet al. [31] propose a parallel SVM model by only requiring the use of protein sequence information for large-scale PPI prediction. Huet al. [32]propose a large-scale protein interaction prediction algorithm based on MapReduce. It first extracts amino acid fragments from sequences of proteins for statistical analysis, and then constructs the corresponding feature vectors of proteins to train SVM. Motivated by the idea of distributed computing,Chenet al. [33] propose a Multi-source Learning based Protein Community Detection algorithm by integrating gene expression data and implement its parallel version on the Apache Spark platform to enhance computational efficiency.

    F. Noise Reduction in PPI Prediction

    Due to the limitation of high-throughput genomic experiments, existing PPI data suffers the disadvantage of high false-positive and false-negative rates, which could indicate the existence of noise in PPI data. In this regard,several attempts have been made to reduce noise and can also be applied to noise reduction in PPI prediction. Biet al. [34]examine a variety of sequence smoothing methods to eliminate the interference of noise points in the original sequence, and identify Savitzky-Golay (SG) filter as the most effective method among them. Similarly, SGW-SCN(Savitzky-Golay and wavelet-supported stochastic configuration networks) [35] adopts the SG filter method to remove possible outliers and noises in non-stationary workload time series. Kritikoset al. [36] integrates several well established PPI weighting methods to assign weights that can indicate the confidence level that a given PPI is a true-positive one. Luoet al. [37] propose a collaborative filtering-enhanced topology-based method to compute an inter-neighbourhood similarity for assessing PPIs.

    III. PRELIMINARIES

    A. Mathematical Notations

    Given an alphabet set Γ={τ1,τ2,...,τnΓ} consisting of totalnΓamino acids, a protein sequenceSis represented asS=(st)1≤t≤ns, wherest∈Γ andnsis the length ofS.Therefore, ak-mer segment starting from the positiontinSis denoted asSt,k=(st,st+1,...,st+k?1), where 1 ≤t≤ns?k+1.

    A pair of coevolving positions with lengthkmeans that there arek?2 do not-care positions between them. For the sake of convenience, we use (τi,τj)k, where τi,τj∈Γ to represent a possible candidate of coevolutionary patterns. IfScontains (τi,τj)k, it means that ?St,k:st=τiandst+k?1=τj.is the set of coevolutionary patterns with lengthk. Assuming thatkmaxis the maximum value ofk, the set of all coevolutionary patterns is denoted as

    B. Efficiency Bottlenecks of CoFex

    As one of state-of-the-art algorithms proposed for PPI prediction, CoFex yields better accuracy than several popular sequence-based algorithms. In the rest of this section, we briefly introduce it and then discuss its efficiency bottlenecks when deployed to perform large-scale PPI prediction.

    CoFex is composed of four steps. First, it targets to identify all coevolutionary patterns so as to compose V with some statistical knowledge. Next, it makes use of mutual information theory to quantify the amount of evidence to support or oppose the existence of interaction for each coevolutionary pattern and obtains a weighted version of V denoted as. Thirdly, for each pair of proteins, it constructs a feature vector based on theLastly, it trains an SVM classifier is trained by CoFex to predict PPIs.

    Although a series of extensive experiments have demonstrated its promising accuracy, its efficiency could not fulfill the demanding requirements for large-scale PPI prediction. According to Fig. 1(b), we note that its running time is exponentially increased with the number of interacting proteins. Moreover, CoFex is unable to perform its task when the number of interacting proteins is larger than 800 000.

    To identify its efficiency bottlenecks, we conduct an indepth analysis into its procedure and identify several key issues. First of all, when extracting coevolutionary patterns to compose V, it has to traverse the sequence information of all proteins for each candidate, and such traversal is very inefficient when the number of proteins increases. Secondly,the heavy computation in its second and third steps further constraints its efficiency. Lastly, due to the memory limitation in a single machine, SVM adopted by CoFex is not feasible for training and testing in the context of big data. Hence, in the following section, we intend to address these bottlenecks of CoFex in a distributed manner so as to well conduct largescale PPI prediction.

    C. MapReduce Framework

    As a programming model, MapReduce is convenient to allow programs running in a high-performance parallel platform consisting of a huge number of computing nodes[38]. Engineers only need to provide high-level parallelism information and design map and reduce functions to be executed in parallel across multiple nodes. Because of its convenience and efficiency, it has become one of the most popular frameworks for large-scale date analysis [31],[39]–[41]. It has two parallel processing phases, i.e.,MapandReduce. Each phase adopts a user-defined function to complete its task. The function used inMaptakes key-value pairs as input and generates another series of key-value pairs as intermediate output to write to local disk. The function used inReduceintends to aggregate all the values with the same key obtained from the output ofMap.

    As for the implementation of CoFex+, a main concern lies in the specialized environments and configuration requirements when using MapReduce. In order to make it easy to use and simplify the construction of complex environments, we choose a distributed integration platform called Hadoop [42]as the base of our experiments. The reason is that Hadoop is an open source cloud computing platform. It implements the MapReduce framework that have been successfully evaluated by a large number of practical applications in many fields,such as machine learning and computational biology.Moreover, we have accumulated solid experiences in implementing distributed algorithms in Hadoop. In addition to Hadoop, it is also possible for us to implement CoFex+by using other distributed platforms, such as spark [43], [44], if we follow the details of CoFex+and design the corresponding MapReduce-like functions provided by these platforms.

    IV. DETAILS OF COFEX+

    In this section, we describe the procedure of CoFex+. It addresses the efficiency bottlenecks of CoFex by following a MapReduce framework. As shown in Fig. 2, it is divided into four steps including V identification, V weighting, feature vector construction, and large-scale PPI prediction. In fact, the structure of CoFex+is similar to that of CoFex, which also has these four steps. However, the structure of CoFex is not compatible with popular distributed integration platforms, and thus it is impossible to explicitly allow CoFex to execute in a distributed manner. Hence, for each step of CoFex+, we follow the MapReduce framework to redesign its process, and divide it into theMapandReducephases. By doing so, we are able to deploy CoFex+ in a distributed integration platform, thus completing the task of large-scale PPI prediction. The two algorithms mentioned in Fig. 2 are independent with each other, and thus they do not have relationships in terms of input and output parameters. Note that all algorithms and procedures of CoFex+are presented in the supplementary material.

    Fig. 1. Efficiency and ROC curves of CoFex+. (a) The efficiency of CoFex+ on the datasets composed of proteins at different magnitudes; (b) the efficiency of CoFex+ on the datasets composed of pairwise proteins at different magnitudes; (c) the ROC curves of CoFex+ on the datasets composed of proteins at different magnitudes.

    Fig. 2. The complete procedure of CoFex+.

    A. Identification

    When verifying each candidate of coevolutionary patterns,CoFex has to traverse the sequence information of all proteins,which is very inefficient for large-scale PPI datasets. To avoid unnecessary traversal, CoFex+adopts a tree-based data structure, namely CF-tree [45], to record the occurrences of all candidates by only traversing the input dataset once. In particular, CF-tree is designed as a multiway tree where each node can take more than two children. For each node in the tree, we define two items, i.e.,AminoAcidandProteinSet. The former is the amino acid assigned to this node, while the latter is a set of proteins where the coevolutionary pattern defined by the root and this node is observed. Each coevolutionary pattern in V is identified by CoFex+with two phases, i.e.,MapandReduce.

    1) Map:In this phase, CoFex+aims to construct a CF-tree from the dataset of protein sequences assigned to an arbitrary computing node running a map task. Since the largest length of coevolutionary patterns iskmax, it first splits each sequence, i.e.,S, into a set of segments denoted as{St,kmax}(1 ≤t≤ns?kmax+1). After that, several CF-trees are built by following Algorithm 1 and each CF-tree has a unique amino acid assigned to its root node. For each segment inSt,kmax, we first compose a branch for it and then select the CFtree whose root node is the amino acid at the first position of this segment. Finally, we append the branch to a CF-tree by following Procedure 1. Hence, for each amino acid in this segment, its previous amino acid is the parent node while its next one is the child node. In addition to the growth of the CFtree, the protein owning this segment is added toProteinSetof each node in the corresponding branch.

    Algorithm 1 Construction of CF-trees{S} kmax Input: protein sequences , {Tτi}Output: CF-trees Tτi AminoAcid τi 1: initialize each by setting the value of as for its root node{S}2: for S in do St,kmax {St,kmax}3: for in do Grow(Tst,St,kmax)4: call 5: end for 6: end for{Tτi}7: return

    Procedure 1 Grow(Tst,St,kmax)Input: , TstSt,kmax Output: Tst Node(St,k)AminoAcid st+k?1 St,kmax Tst 1: assume that returns the kth node whose value of is along the branch of in Node(S t,1)2: add the protein of S to i ∈[1,kmax ?1]3: for each do Node(S t,i+1)4: if does not exist then 5: initialize a node by setting the value of as Node(S t,i)AminoAcid st+i 6: set the parent of this node as 7: end if ProteinS et Node(S t,i+1)8: add the protein of S to the of 9: end for 10: return Tst

    The output ofMapare the key-value pairs by following the format 〈τi,Tτi〉, whererepresents the CF-tree whose root node has the value τiin itsAminoAcid.

    2) Reduce:CoFex+tends to merge all the CF-trees whose root nodes have the same value ofAminoAcid, thus identifying the coevolutionary patterns to compose V. Since the input ofReduceis the key-value pairs with format 〈τi,{Tτi}〉, each CFtree inis processed by the same computing node according to a MapReduce framework. The identification of V on each computing node is described by Algorithm 2. In particular, givenCoFex+first merges all CF-trees in it into a single CF-tree, as their root nodes have the same value,i.e.,forAminoAcid. For convenience,used afterward r epresents the merged CF-tree.

    Algorithm 2 Identification V Input: , V τi{Tτi}Output: V=?1: set 2: merge all CF-trees in to compose τj Γ{Tτi} Tτi 3: for in do k=2 →kmax 4: for do f((τi,τ j)k)5: obtain with (1) and defined by Procedure 2 f((τi,τ j)k)≥1.96 6: if then Count(Tτi,τ j,k)7: add to 8: end if 9: end for 10: end for V(τi,τ j)k V 11: return

    Procedure 2 Count(Tτi,τ j,k)Input: , , k Tτiτj Output: the number of occurrences of L={}(τi,τ j)k 1: initialize 2: assume that return all nodes at the depth k of node Nodes(Tτi,k)Nodes(Tτi,k) Tτi 3: for each in do AminoAcid node τj 4: if the of is then L=L∪ProteinS et node 5: of 6: end if 7: end for 8: remove duplicate proteins in L 9: return the size of L

    The next step verifies whether a candidate (τi,τj)kcan be considered as a coevolutionary pattern with some statistical knowledge. Assuming that there are totalnunique proteins in the dataset, we would like to confirm that the value ofCount(,τj,k) is sufficient enough to reach the conclusion t hat the occurrences of (τi,τj)kare frequently observed among the sequences of all proteins. Following the statistical analysis in [18], we can further rewrite the definition off((τi,τj)k) as

    where

    Given (1), we reckon that (τi,τj)kis frequently found am(ong all) protein sequences at a confidence level of 95% iff(τi,τj)k≥1.96, and it can be considered as a coevolutionary pattern. V is obtained once we examine all the candidates by following the same procedure, and hence the output of this phase follows the format of 〈 (τi,τj)k,null〉.

    B. Weighting

    Weighting coevolutionary patterns is another step requiring multiple traversal and heavy computation. In particular, for each pattern, CoFex has to obtain its occurrences in the interacting and non-interacting proteins by traversing the training dataset and then calculate its weight according to mutual information theory. To accelerate such computation,CoFex+decomposes the weighting procedure intoMapandReducephases. InMap, the occurrence information of coevolutionary patterns in the interacting and non-interacting proteins is obtained while the weights of coevolutionary patterns are computed inReduce.

    1) Map:The input dataset is composed of interacting and non-interacting proteins. For each computing node, CoFex+targets to count the occurrences of evolutionary patterns from a part of the input dataset assigned to this node. To do so, it first obtains all the CF-trees and coevolutionary patterns generated in the first step. Since similar counting procedures are applied to all coevolutionary patterns, we take an arbitrary pattern, i.e., (τi,τj)k, as an example to illustrate it. For a particular computing node taking the map task, we assume that I is the dataset of interacting proteins andis the dataset of non-interacting proteins, and both of them are allocated to that computing node. Furthermore, we introduceandto denote the numbers of occurrences of (τi,τj)kin I andrespectively. Verifying (τi,τj)kin the sequences of two proteins is done via Procedure 3.

    Assuming thatpxandpyare two interacting proteins in I,CoFex+examines CF-treeto verify the occurrences of(τi,τj)kin the sequences ofpxandpy. It first obtains all the nodes from the CF-tree at the depth ofkand only disregards those whose labels are not τjas indicated by Line 4 in Procedure 3. For the remaining nodes, CoFex+further checks whetherpxandpyare found in the values of theirProteinSetas indicated by Lines 5–11. If variablesmatchXandmatchYare both true, we reckon that coevolutionary pattern (τi,τj)koccurs in the sequences ofpxandpy. In this regard, the value ofis increased by 1. After examining all pairs of interacting proteins, we obtain the final value of. Similarly,we could obtainfrom.

    ?

    Input: , , , true false Tτi(τi,τ j)k px py Output: or result ←false 1: matchX ←false 2: matchY ←false 3: 4: for each in do AminoAcid node τj node Nodes(Tτi,k)5: if the value of of is then px ∈ProteinS et node 6: if of then matchX=true 7: 8: end if py ∈ProteinS et node 9: if of then matchY=true 10: 11: end if result=matchX&&matchY 12: 13: end if

    14: end for result 15: return

    C. Constructing Feature Vectors

    This step targets to construct a feature vector for each pair of proteins, thus preparing for the training of distributed SVM classifiers in the last step.

    1) Map:CoFex+makes use of CF-trees, i.e., {} and V to construct the feature vectors for pairwise proteins.Accordingly, the input is the pairs of proteins inDintand.CoFex+splits them into many data blocks, which are then processed by the computing nodes inMap.

    Algorithm 3 constructs feature vectors. Taking two proteinspxandpyas an example, it first initializes a feature vector denoted asVxyfor them. Each element inVxyrepresents a unique coevolutionary pattern in V and accordingly the length ofVxyis equal to |V|. For each (τi,τj)k∈V, it selects CF-treeand traverses this tree to verify whether (τi,τj)koccurs in the sequences ofpxandpyas indicated by Line 4 in Algorithm 3. If so, the weight of (τi,τj)k, i.e.,is assigned to the corresponding element inVxyas indicated by Line 5.The output of this phase follows the format of 〈 {px,py},Vxy〉.

    2) Reduce:All the feature vectors generated byMapare saved to the distributed file system.

    D. Performing Large-scale PPI Prediction

    In the final step of CoFex+, we adopt a divide-and-conquer strategy to perform large-scale PPI prediction. To prediction PPIs, we use SVM as the predictor, as the experimental results presented in [18] indicate that the combination of CoFex and SVM has been proved to yield the best accuracy. However,SVMs are unable to handle large-scale training data. To overcome this problem, we first split the training dataset into several data blocks such that SVM could be trained for each block. Hence, the output ofMapfollows the formatwherepxandpyare a pair of query proteins in the testing dataset, andis the prediction score yielded by SVM in them-th computing node. ForReduce, the input of a particular computing node isThus, the final prediction score forpxandpyis the average value of, and can be determined as

    wherenxyis the number of items in {}.

    It is clearly seen that each of efficiency bottlenecks of CoFex is specifically addressed by CoFex+. In particular, the inefficient traversal of CoFex is overcome by CoFex+by using a CF-tree data structure where a training dataset is only traversed once. The heavy computation caused by the second and third steps of CoFex is decomposed into many subtasks by following the MapReduce framework and these subtasks are able to well run in a parallel manner. Instead of training a single SVM, a divide-and-conquer strategy is adopted by CoFex+such that a set of SVMs are trained on different data blocks of the training data. With all these particularly designed solutions, CoFex+is capable of performing the prediction task for large-scale PPIs, to be shown next.

    E. Computational Complexity Analysis

    V. EXPERIMENTAL RESULTS

    To evaluate the accuracy and efficiency of CoFex+, we have conducted a series of experiments and the results are discussed in this section. When analyzing its efficiency, we have also investigated the influences made by the change in the number of computing nodes used inMapandReducephases. Each computing node is equipped with Intel SkyLake processors at 3.0 GHz and 16 GB of RAM.

    A. Benchmarking Datasets

    To obtain convincing results, two kinds of benchmarking datasets are used. One is composed of three small datasets collected from the species of arabidopsis thaliana (AT),escherichia coli (EC) and schizosaccharomyces pombe (SP)and the other involves a huge dataset obtained from the human species. The three small datasets are from STRING [46],which is a database of known PPIs stemmed from knowledge transfer between organisms, and from interactions aggregated from other databases. The latter is from the human protein reference database (HPRD) [15] where all the information has been manually extracted from the literature by expert biologists who read, interpret and analyze the published data.These four datasets have been widely adopted for performance evaluation on PPI prediction [18], [23], [26], [28], [29], and their numbers of proteins and pairs of interacting proteins and non-interacting proteins are presented in Table I.

    TABLE I DESCRIPTION OF BENCHMARKING DATASETS

    Since PPIs generated by high-throughput technologies suffer from the disadvantage of high false-positive and falsenegative rates, such noisy PPI data obviously could negatively influence the accuracy of PPI prediction. To avoid this problem, we only selected PPIs with confidence scores not less than 0.9 from the STRING database for AT, EC and SP datasets. This operation is not applied to the human dataset as all PPIs are manually curated to avoid errors as indicated by[15]. Regarding the generation of non-interacting proteins, we randomly paired up proteins whose interactions were not reported previously.

    B. Results

    To verify whether CoFex+is applicable to large-scale PPI prediction while maintaining a promising accuracy comparable to CoFex, the experiments are divided into two parts. The fist part of experiments focuses on evaluating the accuracy of CoFex+by comparing it with several state-of-theart sequence-based algorithms. Since all the algorithms except CoFex+are incapable of performing their prediction tasks on the Human dataset, another three small benchmarking datasets, i.e., AT, EC and SP, are used for accuracy evaluation. The second part of experiments concentrated on verifying its efficiency for large-scale PPI prediction.

    a) Accuracy

    1) Experimental Setup:In the experiments, we compare the accuracy of CoFex+with CoFex, Ben-Hur and Nobel [13],Shenet al. [14], FCTP-WSRC [17] and S-VGAE [29]. All algorithms are evaluated under the scheme of five-fold crossvalidation to each small dataset. We adopt the receiver operating characteristic (ROC) analysis and use the area under ROC curve (AUC) to compare algorithms’ accuracy. AUC scores are within the range from 0 to 1, and a higher AUC score indicates a better performance in terms of accuracy. In addition, we adopt several evaluation indicators to evaluate them from different perspectives, i.e., i) sensitivity (Sn) that is the percentage of correctly identified PPIs; ii) specificity (Sp)that is the percentage of correctly identified non-interacting proteins; iii) accuracy (Acc) that is the percentage of correctly identified PPIs and non-interacting proteins; iv) matthew’s correlation coefficient (Mcc) that is a more strict evaluation standard considering both under and over predictions.

    Regarding the parameter settings, we set the maximum valuekmaxas 10 in the experiments. For a particular PPI dataset, we verify it from 2 to 10 with a step size of 1 and the one that achieves the best performance is taken as its value.The reason why we set the maximum value ofkmaxas 10 is that after our study we find that the size of Vkis rather small whenkis larger than 10. For the distributed configuration of CoFex+, total 10 computing nodes are used in bothMapandReduce. For the other algorithms we use in the experiments,parameters that have to be determined for them to work properly are set to the values recommended by the authors of their references and we list them in Table II, whereCandγused by all algorithms except FCTP-SWRC and S-VGAE arethe parameters of SVM.

    TABLE II PARAMETER SETTINGS OF ALGORITHMS

    2) Experiment Results:Regarding the AUC scores presented in Table III, CoFex is only better by 1.3% than CoFex+.Moreover, we also perform an independent samples t-test and the results reveals that the difference in the all metrics between CoFex and CoFex+is not significant (p<0.05). The reason for such a slight difference is mainly ascribed to the divide-and-conquer strategy adopted by CoFex+. In particular,a set of SVMs are trained by CoFex+on different data blocks of training data whereas CoFex performs its prediction task on a single SVM. When comparing CoFex+with other sequencebased algorithms, we note the strong performance of CoFex+,as it yields a better average score for each evaluation indicator. We also note that CoFex+performs better than the two state-of-the-art algorithms, i.e., FCTP-SWRC and SVGAE, as the average scores of AUC, Acc and Mcc obtained by CoFex+are larger by 30.2%, 23.6% and 15.1%respectively than those of FCTP-SWRC, and by 7.1%, 7.1%and 7.7% respectively than those of S-VGAE, for the three small datasets. Hence, according to the performance of CoFex+in the three small datasets, we reason that as the reimplementation of CoFex using MapReduce, CoFex+still maintains promising accuracy comparable to CoFex.

    TABLE III PERFORMANCE COMPARISON OF SEQUENCE-BASED ALGORITHMS IN THE AT, E C AND SP DATASETS

    b) Efficiency

    1) Experimental Setup:We apply CoFex and CoFex+to two categories of PPI datasets generated from the Human dataset.The first category includes three datasets composed of proteins at magnitudes of 102, 103, 104and All respectively,and the second category involves the datasets composed of pairwise proteins at magnitudes of 102, 103, 1 04, 105, 106and All respectively. Note that since we concentrate analyzing their efficiency, all the interacting and non-interacting proteins in each dataset are used only for training, and we record their running time. For a fair comparison, CoFex is executed in the machine with the same hardware configuration as for CoFex+.

    Regarding the parameter settings, we set the maximum valuekmaxas 5 and take it as an example to demonstrate their efficiency. For the distributed configuration of CoFex+, total 10 computing nodes are used in theMapandReducephases.

    2) Experimental Results:From Fig. 1, both of them take more running time when either proteins or PPIs increase.Concerning their efficiency comparison, we have the following findings. First, CoFex+is not as efficient as CoFex in all small datasets. A possible reason for that phenomenon is that the time used for transferring data among different computing nodes is considerably larger than that for computation. Second, the advantage of CoFex+is evident in larger datasets, as the blue curves of CoFex+are always below the red ones of CoFex after the intersection points in Fig. 1(a)and Fig. 1(b). Although the increase in the size of training data requires a heavier computation with more time and resources, CoFex+is able to divide such computation into many tiny tasks and assign them to the computing nodes. In this regard, computing nodes only handle small parts of the original computation task in a parallel manner, thus reducing the running time of the entire procedure. Thirdly, from Fig. 1(b), the training task of CoFex is unable to complete when the amount of interacting proteins exceeds 800 000.Although we could upgrade the hardware configuration of our machine, it is not the best solution as the efficiency bottlenecks of CoFex are still unsolved. That is the reason why we must develop CoFex+for large-scale PPI prediction.Lastly, for two datasets from different categories but at the same magnitude, CoFex+takes more time for the dataset in the first category. In other words, for CoFex+, the majority of computation takes place in the second step where coevolutionary patterns are weighted. Hence, we suggest that CoFex+is preferred for large-scale PPI prediction while CoFex for small datasets.

    In order to evaluate whether CoFex+is still effective as the scale of data increases, we apply a five-fold cross-validation to the datasets in the first category, and present corresponding ROC curves in Fig. 1(c). We observe the promising performance of CoFex+when the number of proteins increases. In other words, with the consideration of more protein sequences, CoFex+is able to better identify coevolutionary patterns that are useful for PPI prediction.

    In Fig. 3, we present the speedup performance of CoFex+in the datasets of PPIs at different magnitudes. The baseline to calculate the speedup scores is the running time of CoFex given in Fig. 1(b). It is worth noting that its speedup performance is generally improved when PPI dataset size increases. When the size of PPI dataset exceeds 1 05, it is more obvious, as it can considerably improve the efficiency of CoFex by achieving more than two orders of magnitude improvement in computational efficiency as indicated by the red bar in Fig. 3.

    Fig. 3. The speedup performance of CoFex+ in the datasets composed of PPIs at different magnitudes.

    When compared with CoFex, CoFex+adopts a tree-based data structure, i.e., CF-tree, to accelerate the procedure of identifying coevolutionary patterns. To verify such acceleration effect, we compare the running time of CoFex+using a local mode with that of CoFex on the Human dataset where we have to identify coevolutionary patterns from the sequences of total 13 730 proteins. The reason why we use the local mode for CoFex+in Hadoop is to avoid the bias benefited from distributed computing. The comparison in running time taken by CoFex+and CoFex in the identification of coevolutionary patterns is presented in Fig. 4, where we note that the blue curve representing CoFex+is always below the red one of CoFex. In this regard, we reason that the identification of coevolutionary patterns cost less time with CoFex+than CoFex.

    Fig. 4. The comparison in running time taken by CoFex+ and CoFex in the identification of coevolutionary patterns on the Human dataset.

    In summary, although the increase in the size of training data requires more computation and thus more time and resources, CoFex+is able to divide such computation into many tiny tasks and assign them to the computing nodes.According to the results and discussions presented above,CoFex+yields more promising performance for large datasets in terms of efficiency and is verified to have the ability to complete the task of large-scale PPI prediction.

    Memory Consumption

    1) Experimental Setup:Since the change in memory consumption of CoFex+is similar when we increase the number of computing nodes in eitherMaporReduce, we take the results ofMapas an example to discuss how the memory is consumed by CoFex+when training it by using the entire Human dataset.

    2) Experimental Results:From Table IV, we note that the amount of memory consumed by CoFex+gradually becomes larger when the number of computing nodes inMapincreases.Obviously, CoFex+requires much less memory than CoFex,as CoFex is unable to handle the entire Human dataset by using the machine whose memory is 16 GB. The major reason for this phenomenon is due to the integration with the CF-tree data structure, which saves a lot of memory space by avoiding storing massive protein sequence information. In the step of V Identification, CoFex+constructs at mostnΓCF-trees in each computing node ofMap. That is to say, when we deploy more computing nodes performing the map tasks of CoFex+, the increase in memory consumption is mainly from newly generated CF-trees.

    TABLE IV MEMORY CONSUMPTION OF COFEX+ GIVEN DIFFERENT NUMBER OF COMPUTING NODES IN MAP

    VI. DISCUSSIONS

    When using CoFex+in reality, the only assumption is the possible integration with popular distributed computing platforms, such as Hadoop and Spark. Given that CoFex+is designed by following the MapReduce framework, this assumption can be easily satisfied in reality, as both Hadoop and Spark provide MapReduce-like functions for the integration. In this regard, it is possible to implement CoFex+with Spark by using its MapReduce-like functions. The major reason for us to choose Hadoop is that we have accumulated solid experiences to implement algorithms in Hadoop environment.

    Regardless of the parameters we set for SVM, the only parameter we need to specify in advance iskmax, which is the maximum value ofkand also is the depth of CF-trees constructed in the first step of CoFex+. Although a larger value ofkallows CoFex+exploit more information in protein sequences, the improvement in accuracy is limited as the number of coevolutionary patterns extracted by using a largerkbecomes smaller. Thus, these patterns have less significance,as they are not sufficient enough for SVM to find an optimum hypersurface. Taking the five-fold cross validation on the EC dataset as an example, the number of coevolutionary patterns extracted whenk=2 is 207 while that number is only 39 whenk=10. In other words, for two amino acids located far away from each other in a protein sequence, their ability in terms of providing evidence supporting or refusing the existence of interaction is not as strong as those located closely.

    According to our computational complexity analysis, there are several parameters affecting the efficiency of CoFex+, i.e.,n,,n1,n2, |I| a ndIn particular,nandn? plays a critical role in the first step of CoFex+, as we need to identify coevolutionary patterns from the sequences of all proteins.The last four parameters decides the efficiency of its second and third steps. Specifically, the processes of weighting V and constructing feature vectors are applied to each pair of proteins in |I| andby traversing CF-trees. Moreover, there are also relationships among these parameters. For example,the values ofn1andn2are determined by that ofn, as the sizes of CF-trees will become larger when more protein sequences are considered. Hence, the number of proteins, i.e.,n, and that of pairwise proteins, i.e.,are the key factors impacting the efficiency of CoFex+.

    However, the upper limit of CoFex+’s efficiency exists, as it is impossible to reduce its running time by simply increasing the number of computing nodes. In fact, the runtime overhead of CoFex+is composed of two parts, one is the time used for computation and the other is the time spent to transfer data among different computing nodes. When the number of computing nodes exceeds some threshold, the process of data transfer takes more time than the computation, and thus constrains the further improvement of efficiency.

    VII. CONCLUSION

    In this paper, we for the first time propose a distributed framework, namely CoFex+, to complete a challenging prediction task for large-scale PPIs. Modified from a wellestablished algorithm CoFex, it overcomes CoFex’s efficiency bottlenecks from two perspectives. First, it adopts a tree-based data structure to avoid the heavy memory consumption caused by the huge sequence information of proteins. Second, its implementation is integrated with the MapReduce framework such that the task can be completed in a distributed manner. A series of extensive experiments have been conducted and the results demonstrate that CoFex+can considerably improve the efficiency of CoFex by achieving more than two orders of magnitude improvement in computational efficiency while retaining a comparable level of accuracy for large-scale PPI prediction. Moreover, regarding the distributed configuration of CoFex+, we conclude that the efficiency of CoFex+could gradually be improved till a certain level when the number of computing nodes used in eitherMaporReducephase increases. Our next work intends to integrate the noise reduction methods with CoFex+to further improve its accuracy in PPI prediction. We are also interested in exploring the possibility of applying deep neural network [47]–[49],clustering methods [50]–[54] to further improve the effectiveness of CoFex+.

    欧美一区二区精品小视频在线| 午夜激情福利司机影院| 国产日本99.免费观看| 午夜精品一区二区三区免费看| 老熟妇乱子伦视频在线观看| 久久婷婷成人综合色麻豆| 丁香欧美五月| 天天添夜夜摸| 精品熟女少妇八av免费久了| 久久 成人 亚洲| 黑人巨大精品欧美一区二区mp4| 国产亚洲精品第一综合不卡| 免费观看人在逋| 午夜精品久久久久久毛片777| 日韩欧美国产在线观看| 在线观看免费视频日本深夜| 国产精品久久久人人做人人爽| 精品国产乱码久久久久久男人| 亚洲av第一区精品v没综合| 99在线人妻在线中文字幕| 久久精品人妻少妇| av天堂在线播放| 国产熟女xx| xxx96com| 精品午夜福利视频在线观看一区| svipshipincom国产片| 欧美日韩福利视频一区二区| 丝袜人妻中文字幕| 无人区码免费观看不卡| 美女 人体艺术 gogo| 夜夜夜夜夜久久久久| 91麻豆av在线| 老司机午夜福利在线观看视频| 亚洲九九香蕉| 国产成人精品无人区| 两个人免费观看高清视频| 熟女电影av网| 成人高潮视频无遮挡免费网站| 亚洲国产欧美网| 搡老熟女国产l中国老女人| 亚洲va日本ⅴa欧美va伊人久久| 午夜两性在线视频| 手机成人av网站| 看免费av毛片| 国产伦在线观看视频一区| 国产精品乱码一区二三区的特点| 老司机在亚洲福利影院| 国语自产精品视频在线第100页| 激情在线观看视频在线高清| 日韩欧美在线乱码| 欧美丝袜亚洲另类 | 波多野结衣高清作品| 亚洲成人精品中文字幕电影| 免费观看精品视频网站| 成人国语在线视频| 亚洲人成电影免费在线| 久久香蕉激情| 97超级碰碰碰精品色视频在线观看| 亚洲av成人av| 91av网站免费观看| av片东京热男人的天堂| 成人高潮视频无遮挡免费网站| 精华霜和精华液先用哪个| 我要搜黄色片| 99久久综合精品五月天人人| 亚洲欧美精品综合一区二区三区| 午夜免费观看网址| 后天国语完整版免费观看| 欧美日韩亚洲国产一区二区在线观看| 成年人黄色毛片网站| 国产高清视频在线播放一区| 淫妇啪啪啪对白视频| 三级毛片av免费| 一夜夜www| 午夜老司机福利片| 91在线观看av| 欧美av亚洲av综合av国产av| 色精品久久人妻99蜜桃| 国产99久久九九免费精品| 我的老师免费观看完整版| 在线永久观看黄色视频| 久久中文字幕一级| 久久精品影院6| 欧美国产日韩亚洲一区| 极品教师在线免费播放| 国产一区二区激情短视频| 午夜精品在线福利| 精品高清国产在线一区| 午夜福利欧美成人| 两性夫妻黄色片| 精品久久久久久久久久免费视频| 好男人电影高清在线观看| 国产高清视频在线观看网站| av在线天堂中文字幕| 久久亚洲精品不卡| 丁香欧美五月| 亚洲全国av大片| 亚洲国产日韩欧美精品在线观看 | 香蕉国产在线看| 免费看日本二区| 黄片大片在线免费观看| 黄色视频,在线免费观看| 免费在线观看完整版高清| 女同久久另类99精品国产91| 手机成人av网站| 午夜a级毛片| 夜夜爽天天搞| 夜夜躁狠狠躁天天躁| 不卡一级毛片| 九九热线精品视视频播放| 亚洲成a人片在线一区二区| 好看av亚洲va欧美ⅴa在| 亚洲成人精品中文字幕电影| 亚洲精品国产精品久久久不卡| 久久久久久久午夜电影| 亚洲最大成人中文| 日韩欧美在线二视频| 成人18禁高潮啪啪吃奶动态图| 热99re8久久精品国产| 亚洲国产欧美一区二区综合| 久久久久久亚洲精品国产蜜桃av| 超碰成人久久| 国产精品久久久久久精品电影| 国产精品 国内视频| 欧美3d第一页| 久久久久精品国产欧美久久久| 两人在一起打扑克的视频| 黄片大片在线免费观看| 久久久久久久久久黄片| 天堂√8在线中文| 99在线视频只有这里精品首页| 啦啦啦韩国在线观看视频| 91字幕亚洲| 亚洲九九香蕉| 亚洲第一电影网av| 不卡av一区二区三区| 91在线观看av| 黄色丝袜av网址大全| 欧美3d第一页| 丁香欧美五月| 少妇被粗大的猛进出69影院| 久久久国产成人免费| 免费一级毛片在线播放高清视频| 99re在线观看精品视频| 亚洲欧美日韩高清专用| 久久热在线av| 超碰成人久久| 日韩欧美在线乱码| 777久久人妻少妇嫩草av网站| 欧洲精品卡2卡3卡4卡5卡区| 精品久久蜜臀av无| 国产av一区在线观看免费| 国产熟女午夜一区二区三区| 国产成人精品无人区| 国产精品美女特级片免费视频播放器 | 亚洲性夜色夜夜综合| 两人在一起打扑克的视频| 激情在线观看视频在线高清| 亚洲精品国产精品久久久不卡| 又爽又黄无遮挡网站| 三级男女做爰猛烈吃奶摸视频| 美女高潮喷水抽搐中文字幕| 久久久国产成人精品二区| 国语自产精品视频在线第100页| 欧美黑人欧美精品刺激| 久久久国产成人精品二区| 久久国产精品影院| 免费看美女性在线毛片视频| e午夜精品久久久久久久| 狂野欧美激情性xxxx| 婷婷丁香在线五月| 伦理电影免费视频| 午夜激情福利司机影院| 国模一区二区三区四区视频 | 亚洲在线自拍视频| 在线观看www视频免费| 99热只有精品国产| 精品国产乱码久久久久久男人| 亚洲av日韩精品久久久久久密| 久久久久久久午夜电影| 两个人免费观看高清视频| 国产成人系列免费观看| 国产黄a三级三级三级人| 亚洲国产精品成人综合色| 毛片女人毛片| 久久人妻福利社区极品人妻图片| 亚洲精品中文字幕一二三四区| 亚洲aⅴ乱码一区二区在线播放 | 观看免费一级毛片| 好看av亚洲va欧美ⅴa在| 国产在线观看jvid| 最近在线观看免费完整版| 91在线观看av| 亚洲国产欧洲综合997久久,| 国产69精品久久久久777片 | 一二三四社区在线视频社区8| xxxwww97欧美| 人妻夜夜爽99麻豆av| 听说在线观看完整版免费高清| 日本在线视频免费播放| 亚洲真实伦在线观看| 淫秽高清视频在线观看| 琪琪午夜伦伦电影理论片6080| 精品免费久久久久久久清纯| 一本一本综合久久| 欧美最黄视频在线播放免费| 老司机深夜福利视频在线观看| 亚洲成人免费电影在线观看| 成人国产一区最新在线观看| 日韩欧美一区二区三区在线观看| 亚洲精品中文字幕一二三四区| 国产av又大| aaaaa片日本免费| 国产高清视频在线观看网站| 久久久精品欧美日韩精品| 久久久久九九精品影院| 国产又黄又爽又无遮挡在线| 中文字幕高清在线视频| 日本黄色视频三级网站网址| 人人妻,人人澡人人爽秒播| 在线观看一区二区三区| 亚洲,欧美精品.| 中文在线观看免费www的网站 | 一区二区三区激情视频| 全区人妻精品视频| 黄色 视频免费看| www.自偷自拍.com| 日韩欧美国产在线观看| 岛国在线观看网站| av福利片在线观看| 91字幕亚洲| 中文字幕人成人乱码亚洲影| 国产欧美日韩一区二区三| 精品久久久久久成人av| 亚洲人成网站高清观看| 又紧又爽又黄一区二区| 亚洲片人在线观看| 免费观看人在逋| 狂野欧美激情性xxxx| 久久人妻av系列| 在线a可以看的网站| 午夜福利在线在线| 久久久水蜜桃国产精品网| 国产aⅴ精品一区二区三区波| 久久精品国产99精品国产亚洲性色| 日本在线视频免费播放| 又黄又粗又硬又大视频| 一级毛片精品| 欧美日韩精品网址| 啦啦啦免费观看视频1| 性色av乱码一区二区三区2| 精品无人区乱码1区二区| 99热这里只有是精品50| 欧美性长视频在线观看| 色噜噜av男人的天堂激情| av有码第一页| 午夜福利视频1000在线观看| 99热这里只有是精品50| 男女做爰动态图高潮gif福利片| 国产成人系列免费观看| 亚洲七黄色美女视频| 黄色片一级片一级黄色片| svipshipincom国产片| 国产午夜精品论理片| 久久久久久九九精品二区国产 | 午夜免费观看网址| 国内精品一区二区在线观看| 日韩中文字幕欧美一区二区| 亚洲熟妇熟女久久| 久久亚洲真实| 久久久久性生活片| 淫妇啪啪啪对白视频| 国产精品免费视频内射| 国产精品av视频在线免费观看| 午夜激情av网站| 亚洲av片天天在线观看| 精品一区二区三区视频在线观看免费| 男人舔奶头视频| 午夜亚洲福利在线播放| 在线观看免费午夜福利视频| 天堂影院成人在线观看| 亚洲国产日韩欧美精品在线观看 | 少妇的丰满在线观看| 一本精品99久久精品77| 岛国在线观看网站| 久久久久久免费高清国产稀缺| 男人的好看免费观看在线视频 | 亚洲成a人片在线一区二区| 窝窝影院91人妻| 色哟哟哟哟哟哟| 变态另类丝袜制服| 午夜福利免费观看在线| 久久久久久久久久黄片| 制服诱惑二区| 国产精品永久免费网站| 曰老女人黄片| 美女午夜性视频免费| 国产一区二区三区在线臀色熟女| 中文字幕精品亚洲无线码一区| 人妻丰满熟妇av一区二区三区| 亚洲色图av天堂| 人人妻人人澡欧美一区二区| 亚洲av成人av| 欧美成人午夜精品| 一a级毛片在线观看| 亚洲七黄色美女视频| 国产成人欧美在线观看| 亚洲av片天天在线观看| 嫩草影视91久久| videosex国产| 制服诱惑二区| 亚洲在线自拍视频| 国产av一区在线观看免费| 日韩大码丰满熟妇| 高潮久久久久久久久久久不卡| 亚洲国产欧美网| 男女午夜视频在线观看| 变态另类丝袜制服| 日韩有码中文字幕| 久久久精品大字幕| 在线a可以看的网站| 欧洲精品卡2卡3卡4卡5卡区| 啦啦啦韩国在线观看视频| 国产精品免费视频内射| 搡老岳熟女国产| 日韩av在线大香蕉| 99热6这里只有精品| 亚洲av五月六月丁香网| 午夜福利免费观看在线| 熟女少妇亚洲综合色aaa.| 一边摸一边做爽爽视频免费| 亚洲真实伦在线观看| 国产成人精品无人区| 又爽又黄无遮挡网站| 日韩精品中文字幕看吧| 国产在线观看jvid| 久久精品成人免费网站| 最新美女视频免费是黄的| 天天躁夜夜躁狠狠躁躁| 免费看美女性在线毛片视频| 精品久久蜜臀av无| av免费在线观看网站| 久久久国产欧美日韩av| 性色av乱码一区二区三区2| 国产日本99.免费观看| 日韩欧美免费精品| 国产真实乱freesex| 在线国产一区二区在线| 午夜福利在线在线| 天堂动漫精品| 欧美日韩中文字幕国产精品一区二区三区| 99热6这里只有精品| 老司机深夜福利视频在线观看| 亚洲精品色激情综合| 校园春色视频在线观看| 久久精品夜夜夜夜夜久久蜜豆 | 久久久久免费精品人妻一区二区| 久久精品综合一区二区三区| 黄色丝袜av网址大全| 精品少妇一区二区三区视频日本电影| 精品第一国产精品| 又紧又爽又黄一区二区| 免费在线观看影片大全网站| 亚洲国产精品999在线| 国内毛片毛片毛片毛片毛片| 午夜成年电影在线免费观看| 不卡一级毛片| 国产欧美日韩精品亚洲av| 国产成人av教育| 日本一本二区三区精品| 成年女人毛片免费观看观看9| 精品国内亚洲2022精品成人| 国产精品久久久av美女十八| 99久久无色码亚洲精品果冻| www国产在线视频色| 在线观看66精品国产| 国产一区二区激情短视频| 亚洲人与动物交配视频| 99热这里只有是精品50| 精品国产亚洲在线| 床上黄色一级片| 午夜精品在线福利| 88av欧美| 99riav亚洲国产免费| 老熟妇乱子伦视频在线观看| 国产精品九九99| 国产片内射在线| 久久久久亚洲av毛片大全| 国产成人影院久久av| 老司机在亚洲福利影院| 久久中文字幕人妻熟女| 欧美大码av| 嫩草影视91久久| 久久精品91无色码中文字幕| 亚洲国产精品合色在线| 叶爱在线成人免费视频播放| 啦啦啦观看免费观看视频高清| 久久这里只有精品中国| 99在线视频只有这里精品首页| 国产伦在线观看视频一区| 精品国产美女av久久久久小说| 久久久国产成人精品二区| 在线观看66精品国产| 中文字幕熟女人妻在线| 亚洲欧美精品综合久久99| 亚洲专区字幕在线| 五月玫瑰六月丁香| 精品欧美一区二区三区在线| 在线观看www视频免费| 欧美一区二区国产精品久久精品 | 老司机午夜十八禁免费视频| 亚洲成av人片在线播放无| 久久久国产欧美日韩av| 亚洲七黄色美女视频| 久久香蕉国产精品| 国产欧美日韩一区二区三| 男人的好看免费观看在线视频 | 日本黄大片高清| 亚洲精华国产精华精| 国产激情偷乱视频一区二区| 免费看a级黄色片| 精品欧美国产一区二区三| 一级黄色大片毛片| 亚洲av熟女| 99精品在免费线老司机午夜| 色在线成人网| 男女视频在线观看网站免费 | 黄色片一级片一级黄色片| 亚洲人成网站高清观看| 欧美一区二区精品小视频在线| 变态另类丝袜制服| 国产精品久久久久久精品电影| 精品无人区乱码1区二区| 变态另类成人亚洲欧美熟女| 宅男免费午夜| 国内精品久久久久久久电影| 精品少妇一区二区三区视频日本电影| 久久久久久大精品| 国产精品久久视频播放| 制服诱惑二区| 黑人操中国人逼视频| 色综合亚洲欧美另类图片| 亚洲人成伊人成综合网2020| 国产精品亚洲一级av第二区| 极品教师在线免费播放| 国产aⅴ精品一区二区三区波| 亚洲美女黄片视频| 国产成人影院久久av| 无限看片的www在线观看| 欧美黑人欧美精品刺激| 欧美丝袜亚洲另类 | 久久久精品欧美日韩精品| 亚洲欧美精品综合一区二区三区| 色av中文字幕| 又粗又爽又猛毛片免费看| 亚洲欧美日韩高清专用| 欧美国产日韩亚洲一区| 黑人操中国人逼视频| 一级a爱片免费观看的视频| x7x7x7水蜜桃| 一区二区三区激情视频| 欧美在线一区亚洲| 午夜日韩欧美国产| 18禁国产床啪视频网站| 国产精品美女特级片免费视频播放器 | 欧美性猛交黑人性爽| 国产视频内射| 亚洲国产精品久久男人天堂| 男女午夜视频在线观看| 亚洲成a人片在线一区二区| 国产视频一区二区在线看| av视频在线观看入口| 国产伦在线观看视频一区| 叶爱在线成人免费视频播放| 久久久久久久久中文| 淫秽高清视频在线观看| 一级毛片高清免费大全| 国产三级黄色录像| 特级一级黄色大片| 久久精品国产清高在天天线| 中出人妻视频一区二区| 国产又色又爽无遮挡免费看| 91麻豆av在线| avwww免费| 色av中文字幕| 男人舔女人下体高潮全视频| 国产成人欧美在线观看| 老司机在亚洲福利影院| 国产一区二区在线av高清观看| 亚洲第一电影网av| av在线播放免费不卡| 一本久久中文字幕| 女同久久另类99精品国产91| 精品久久久久久久毛片微露脸| 久久久国产精品麻豆| 日韩欧美国产在线观看| 国产av一区二区精品久久| 国产69精品久久久久777片 | 国产激情偷乱视频一区二区| 久久精品影院6| 国产aⅴ精品一区二区三区波| 他把我摸到了高潮在线观看| 国产精品一区二区三区四区久久| 久久人妻福利社区极品人妻图片| 精品久久久久久久久久免费视频| 香蕉av资源在线| 中文在线观看免费www的网站 | 午夜两性在线视频| 欧美性猛交╳xxx乱大交人| 50天的宝宝边吃奶边哭怎么回事| 久久精品aⅴ一区二区三区四区| 非洲黑人性xxxx精品又粗又长| 国产一区二区三区视频了| 18美女黄网站色大片免费观看| 特级一级黄色大片| 亚洲精品美女久久av网站| 搡老妇女老女人老熟妇| 99国产精品99久久久久| 中文字幕人妻丝袜一区二区| 欧美极品一区二区三区四区| 巨乳人妻的诱惑在线观看| 国产真实乱freesex| 久久久久国产精品人妻aⅴ院| 久久久精品欧美日韩精品| 真人做人爱边吃奶动态| 麻豆国产97在线/欧美 | 高潮久久久久久久久久久不卡| 免费av毛片视频| 精品免费久久久久久久清纯| 两人在一起打扑克的视频| www.自偷自拍.com| 亚洲欧美日韩高清专用| 麻豆成人av在线观看| 国产精品久久久久久人妻精品电影| 亚洲 欧美一区二区三区| 一进一出好大好爽视频| 中文在线观看免费www的网站 | 国产一级毛片七仙女欲春2| 欧美在线一区亚洲| 亚洲av五月六月丁香网| 大型av网站在线播放| 不卡av一区二区三区| av在线天堂中文字幕| 脱女人内裤的视频| 免费在线观看完整版高清| 亚洲av电影不卡..在线观看| 美女大奶头视频| 亚洲人成伊人成综合网2020| 日本五十路高清| 麻豆成人午夜福利视频| 两个人的视频大全免费| 欧美av亚洲av综合av国产av| 国产伦人伦偷精品视频| 国产激情偷乱视频一区二区| 亚洲精品美女久久久久99蜜臀| 中文在线观看免费www的网站 | 国产精品自产拍在线观看55亚洲| 一级毛片精品| 99在线人妻在线中文字幕| 午夜日韩欧美国产| 国产伦人伦偷精品视频| av福利片在线观看| a在线观看视频网站| 一区二区三区高清视频在线| 中文字幕精品亚洲无线码一区| 色在线成人网| 亚洲欧美激情综合另类| 村上凉子中文字幕在线| 蜜桃久久精品国产亚洲av| 欧美在线黄色| 精品久久久久久久久久久久久| 99re在线观看精品视频| 黄色丝袜av网址大全| 免费av毛片视频| 日日夜夜操网爽| 琪琪午夜伦伦电影理论片6080| 中文字幕av在线有码专区| 亚洲精品国产精品久久久不卡| 色综合婷婷激情| 狠狠狠狠99中文字幕| 国产精品乱码一区二三区的特点| 国产精品日韩av在线免费观看| 天堂影院成人在线观看| 他把我摸到了高潮在线观看| 久久久久精品国产欧美久久久| 日本 av在线| 久久人妻av系列| 国产高清视频在线观看网站| 看片在线看免费视频| 无遮挡黄片免费观看| 深夜精品福利| 成人一区二区视频在线观看| 国产三级黄色录像| 久久精品亚洲精品国产色婷小说| cao死你这个sao货| 手机成人av网站| 一边摸一边抽搐一进一小说| 亚洲欧美精品综合一区二区三区| 又大又爽又粗| 亚洲 欧美 日韩 在线 免费| 亚洲一卡2卡3卡4卡5卡精品中文| 国产私拍福利视频在线观看| 中亚洲国语对白在线视频| 久久精品国产综合久久久| 久久久精品欧美日韩精品| 在线十欧美十亚洲十日本专区| 国产男靠女视频免费网站| 欧美成人性av电影在线观看| 免费无遮挡裸体视频| 青草久久国产| 99国产极品粉嫩在线观看| 成人三级做爰电影| 久久精品影院6| 无限看片的www在线观看| 日本一二三区视频观看|