
    A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce

    2022-10-26 12:24:06
    IEEE/CAA Journal of Automatica Sinica, 2022, Issue 1

    Lun Hu, Shicheng Yang, Xin Luo, Huaqiang Yuan, Khaled Sedraoui, and MengChu Zhou

    Abstract—Protein-protein interactions are of great significance for understanding the functional mechanisms of proteins. With the rapid development of high-throughput genomic technologies, massive protein-protein interaction (PPI) data have been generated, making it very difficult to analyze them efficiently. To address this problem, this paper presents a distributed framework by reimplementing one of the state-of-the-art algorithms, i.e., CoFex, using MapReduce. To do so, an in-depth analysis of its limitations is conducted from the perspectives of efficiency and memory consumption when applying it to large-scale PPI data analysis and prediction. Respective solutions are then devised to overcome these limitations. In particular, we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge sequence information of proteins. After that, its procedure is modified by following the MapReduce framework to perform the prediction task in a distributed manner. A series of extensive experiments have been conducted to evaluate the performance of our framework in terms of both efficiency and accuracy. Experimental results demonstrate that the proposed framework improves computational efficiency by more than two orders of magnitude while retaining the same high accuracy.

    I. INTRODUCTION

    PROTEINS are crucial to understanding the mechanisms of cellular functions in a variety of living organisms. However, proteins have to interact with each other in order to function well [1]. In this regard, protein-protein interactions (PPIs) are of interest in biology because of their ability to regulate a variety of cellular processes, including but not limited to metabolic cycles, cell signaling and DNA transcription. Due to the successful applications of machine learning techniques in bioinformatics [2], [3], a considerable number of computational techniques have been developed in recent years to complete the task of PPI prediction.

    When performing PPI prediction, most existing algorithms require additional information about proteins to be known in advance [4]. Some of them take into account genomic information, such as the conservation of gene order [5], [6] and the priors of genomic features among interacting proteins [7]. Motivated by the observation that coevolving proteins are more likely to interact with each other, several techniques address the problem from an alternative viewpoint by making use of evolutionary information among proteins, such as phylogenetic trees [8], [9] and protein domain information [10]–[12]. Though efficient, these two kinds of algorithms commonly suffer from the disadvantage that the genomic and coevolutionary information they require is not always available, especially for newly discovered proteins. To overcome this disadvantage, some attempts have been made to predict PPIs from the sequence information of proteins [13]–[17], as such sequence information can be easily obtained by high-throughput technologies. Since the performance of sequence-based algorithms heavily relies on the evidence collected from protein sequences, different strategies are adopted to extract appropriate patterns from protein sequences for more accurate PPI prediction.

    In [18], a novel feature extraction algorithm, namely CoFex, has been proposed to make use of variable-length k-mers. Moreover, unlike existing sequence-based algorithms that concatenate the feature vectors of two proteins, CoFex explicitly composes feature vectors for pairs of proteins based on the presence of k-mers that provide certain evidence to support the existence of interactions. It can then discover the relationship between the sequence information of proteins and their interactions in a more intuitive manner. To evaluate its performance, a support vector machine (SVM) classifier with a sigmoid kernel is trained on its extracted feature vectors, and the results show that this SVM beats the state-of-the-art algorithms in prediction accuracy.

    With the rapid development of high-throughput genomic technology, PPI data have become so massive that most existing prediction algorithms are no longer practical. In order to fulfill the demanding requirements of large-scale PPI prediction, it is essential to design a scalable prediction algorithm that not only improves computational efficiency but also retains high accuracy. Since existing prediction algorithms have already shown promising performance when applied to predict PPIs, we adopt a rather intuitive strategy: we modify an existing prediction algorithm instead of developing a completely new one capable of handling such huge data. To this end, we develop a distributed framework in this work by reimplementing CoFex under the MapReduce framework.

    To do so, we first conduct an in-depth analysis to identify the limitations of CoFex in terms of memory consumption and efficiency. There are several points worth noting. First of all, the sequence information of proteins has to be traversed multiple times when CoFex extracts coevolutionary patterns, which are k-mers frequently observed in the interacting proteins with statistical significance. Such a traversal mechanism could take much time for a large number of proteins. Secondly, CoFex requires the massive data of protein sequence information to be stored in memory, thus leading to high memory consumption for large-scale PPI prediction. Lastly, its efficiency is constrained by the integration with SVM, whose training process becomes impractical when the number of feature vectors is huge. In this work, respective solutions are developed to overcome these limitations, and a distributed framework modified from CoFex, namely CoFex+, is proposed for significantly improved efficiency when applied to large-scale PPI prediction. The experimental results show that CoFex+ achieves more than two orders of magnitude improvement in running time with only a negligible loss in accuracy.

    The rest of this paper is organized as follows. In Section II, a detailed literature review of related work on PPI prediction is presented. In Section III, preliminaries about the mathematical symbols used are introduced. The details of CoFex+ are described in Section IV. Experimental results are discussed in Section V, following which we present an in-depth discussion about the effectiveness and efficiency of CoFex+ in Section VI. The paper ends with a conclusion in Section VII.

    II. RELATED WORK

    In this section, a detailed literature review about PPI prediction is presented. Existing algorithms can be classified by following different criteria. To distinguish them, we summarize the representative work by following the biological information used to construct the feature vectors of proteins or PPIs. The main sources of such information include genomic information, evolutionary profiles, and protein sequences, thus leading to three categories of algorithms. Moreover, we introduce recent attempts to conduct large-scale PPI prediction.

    A. Prediction Algorithms Using Genomic Information

    Genes located in genome sequences can hint at the existence of interaction between pairwise proteins at a comprehensive level due to the availability of whole genomic sequencing data. Dandekar et al. [5] find that proteins encoded by conserved gene pairs are more likely to interact with each other and that such conserved gene pairs are within a low level of conservation of gene order. Although this algorithm is useful for predicting PPIs by using the conservation of gene order, it is incapable of predicting the interaction of proteins whose gene order is not conserved, such as those encoded by distantly located genes.

    In [7], different genomic features, such as messenger RNA coexpression, coessentiality and colocalization, are used to quantify the associations between them and PPIs. Based on these quantified associations, Bayesian networks are constructed to predict PPIs when the genomic features of query proteins are provided.

    Another motivation for the use of genomic information in PPI prediction is that interacting proteins are found to be homologous in another genome where they are fused into a single protein [19]. Furthermore, Enright et al. [20] have developed a computational algorithm to identify such fusion events in different genomes to predict the PPIs of proteins involved in these events. However, its disadvantage is that it is unable to work with proteins whose fusion events have not been detected through the analysis of genomic sequences.

    B. Prediction Algorithms Using Evolutionary Profiles

    Evolutionary information of proteins reveals the procedure of how proteins evolve across different species. Since proteins that co-evolve are more likely to interact with each other, the similarity in evolutionary information is of potential relevance to the prediction of PPIs as it indicates to what extent the two proteins co-evolve.

    Pazos and Valencia [8] make use of phylogenetic trees of proteins to indicate PPIs. They propose a distance measure to compute the similarity among the phylogenetic trees of proteins, thus determining whether there is a possible interaction among them. Another source of evolutionary information is the domain knowledge of proteins. It is believed that proteins interact as a result of their interacting domains, and hence many computational algorithms have been proposed to predict PPIs based on domain knowledge. A maximum likelihood estimation algorithm is applied to identify interacting domains that infer curated PPIs. Then, with such inferred interacting domains, the interactions between proteins can be predicted [21]. Similarly, a random forest of decision trees is trained by taking into consideration all the protein domains and is then used to predict PPIs [22]. Maetschke et al. [23] introduce the concept of inducers to make use of gene ontology (GO) information more effectively. Given two GO terms, their induced term set is composed of all GO terms along the shortest path between the two terms. After that, GO2PPI is proposed by integrating all induced GO terms with popular classification techniques to predict PPIs.

    C. Prediction Algorithms Using Protein Sequences

    Protein sequences, composed of amino acids, are the primary structures of proteins and the motivation behind the use of protein sequences for predicting PPIs derives from the hypothesis that sequence information may contribute to mediate PPIs.

    When extracting patterns from protein sequences, sequence-based algorithms normally take k-mers into account to compose feature vectors of proteins. These k-mers are amino acid sequence segments with length equal to k. Taking [13] as an example, the value of each element in the feature vector of a protein is the number of occurrences of a particular 3-mer in the sequence of that protein. Once the feature vectors are obtained for all the proteins in the training dataset, these sequence-based algorithms apply them to traditional classifiers, such as SVM and random forest, to distinguish between interacting and non-interacting proteins. Similar to [13], an SVM-based prediction algorithm is proposed by combining a kernel function and a conjoint triad feature for the description of amino acids [14].
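    The k-mer counting described above can be sketched as follows. This is only an illustration, not the implementation used in [13]; the function name `kmer_counts` and the toy sequence are assumptions made for the example.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in an amino acid sequence.

    Hypothetical helper: slides a window of length k over the sequence.
    """
    return Counter(sequence[t:t + k] for t in range(len(sequence) - k + 1))


# Toy sequence (not taken from any dataset in the paper).
counts = kmer_counts("MKVMKV", k=3)
```

    Each distinct 3-mer then becomes one element of the feature vector of the protein, with its count as the value.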

    In addition to SVM-based algorithms, Pitre et al. [24] propose PIPE to tackle the problem of predicting PPIs from a different perspective. Their idea is to measure how often pairs of subsequences in the two query proteins co-occur in pairs of proteins that are known to interact. In [25], PPIevo is proposed to extract the feature vectors from the Position-Specific Scoring Matrix of each protein based on sequence information. It adopts the random forest classifier to predict PPIs based on the feature vectors of all protein pairs involved.

    Instead of constructing feature vectors for individual proteins, CoFex [18] extracts coevolutionary features from pairs of protein sequences for predicting PPIs. DNN-PPI [26] is a deep neural network framework that automatically extracts features, including semantic associations between amino acids, position-related sequence segments, and their long- and short-term dependencies, from protein primary sequences. As a recent attempt, a novel computational model has been developed to predict PPIs effectively [17]. To do so, it first combines the F-vector, composition and transition to map each protein sequence onto numeric feature vectors, and then employs a principal component analysis to reconstruct the most discriminative feature subspaces as the input for weighted sparse representation classification.

    D. Prediction Algorithms Using PPI Network

    Since PPI networks have rich connectivity patterns [1], network-based prediction algorithms have been developed to make use of these patterns to characterize known PPIs in a given PPI network. Defining a degree-normalized score based on network paths of length three, L3 [27] is able to determine whether two proteins interact if one is similar to the interacting partners of the other. To alleviate the restriction on the length of network paths, Wang et al. [28] develop a novel prediction algorithm based on a mixed membership stochastic block model. The algorithm simulates the generative process of a PPI network according to the likelihoods of pairwise proteins being grouped in the same protein complex. Recently, an improved graph representation learning method, namely S-VGAE, has been proposed to predict PPIs based on both sequence information and network structure [29].

    E. Prediction Algorithms for Large-scale PPI Data

    At present, the PPIs that have been identified account for less than 20% of the whole interactome [30]. With the development of high-throughput technologies, the size and complexity of protein interaction data have increased significantly. A new challenge is thus raised for large-scale PPI prediction. Recently, several attempts have been made in this field.

    In order to predict large-scale PPIs effectively and accurately, You et al. [31] propose a parallel SVM model that requires only protein sequence information. Hu et al. [32] propose a large-scale protein interaction prediction algorithm based on MapReduce. It first extracts amino acid fragments from sequences of proteins for statistical analysis, and then constructs the corresponding feature vectors of proteins to train SVM. Motivated by the idea of distributed computing, Chen et al. [33] propose a Multi-source Learning based Protein Community Detection algorithm by integrating gene expression data and implement its parallel version on the Apache Spark platform to enhance computational efficiency.

    F. Noise Reduction in PPI Prediction

    Due to the limitations of high-throughput genomic experiments, existing PPI data suffer from high false-positive and false-negative rates, which indicates the existence of noise in PPI data. In this regard, several attempts have been made to reduce noise, and they can also be applied to noise reduction in PPI prediction. Bi et al. [34] examine a variety of sequence smoothing methods to eliminate the interference of noise points in the original sequence, and identify the Savitzky-Golay (SG) filter as the most effective method among them. Similarly, SGW-SCN (Savitzky-Golay and wavelet-supported stochastic configuration networks) [35] adopts the SG filter method to remove possible outliers and noise in non-stationary workload time series. Kritikos et al. [36] integrate several well-established PPI weighting methods to assign weights that indicate the confidence level that a given PPI is a true positive. Luo et al. [37] propose a collaborative filtering-enhanced topology-based method to compute an inter-neighbourhood similarity for assessing PPIs.

    III. PRELIMINARIES

    A. Mathematical Notations

    Given an alphabet set $\Gamma=\{\tau_1,\tau_2,\ldots,\tau_{n_\Gamma}\}$ consisting of a total of $n_\Gamma$ amino acids, a protein sequence $S$ is represented as $S=(s_t)_{1\le t\le n_s}$, where $s_t\in\Gamma$ and $n_s$ is the length of $S$. Therefore, a $k$-mer segment starting from the position $t$ in $S$ is denoted as $S_{t,k}=(s_t,s_{t+1},\ldots,s_{t+k-1})$, where $1\le t\le n_s-k+1$.

    A pair of coevolving positions with length $k$ means that there are $k-2$ don't-care positions between them. For the sake of convenience, we use $(\tau_i,\tau_j)_k$, where $\tau_i,\tau_j\in\Gamma$, to represent a possible candidate of coevolutionary patterns. If $S$ contains $(\tau_i,\tau_j)_k$, it means that $\exists S_{t,k}: s_t=\tau_i$ and $s_{t+k-1}=\tau_j$. The coevolutionary patterns with the same length $k$ form one set and, assuming that $k_{max}$ is the maximum value of $k$, the set of all coevolutionary patterns over all lengths is denoted as V.
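    As a minimal sketch of this notation, the following Python fragment enumerates the segments of length k and checks whether a sequence contains a candidate pattern. The function names `segments` and `contains_pattern` are assumptions introduced for illustration, and positions are 0-based in the code while the text counts from 1.

```python
from typing import List


def segments(seq: str, k: int) -> List[str]:
    """All length-k segments of seq, one per valid start position."""
    return [seq[t:t + k] for t in range(len(seq) - k + 1)]


def contains_pattern(seq: str, tau_i: str, tau_j: str, k: int) -> bool:
    """True if some length-k segment starts with tau_i and ends with tau_j.

    The k - 2 positions in between are don't-care positions.
    """
    return any(s[0] == tau_i and s[-1] == tau_j for s in segments(seq, k))


# Toy sequence: "CDEF" starts with C and ends with F, so the pattern (C, F)_4 is present.
found = contains_pattern("ACDEFG", "C", "F", 4)
```

    Scanning every sequence this way for every candidate is exactly the repeated traversal that becomes costly on large datasets.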

    B. Efficiency Bottlenecks of CoFex

    As one of the state-of-the-art algorithms proposed for PPI prediction, CoFex yields better accuracy than several popular sequence-based algorithms. In the rest of this section, we briefly introduce it and then discuss its efficiency bottlenecks when deployed to perform large-scale PPI prediction.

    CoFex is composed of four steps. First, it identifies all coevolutionary patterns so as to compose V with some statistical knowledge. Next, it makes use of mutual information theory to quantify the amount of evidence that supports or opposes the existence of interaction for each coevolutionary pattern and obtains a weighted version of V. Thirdly, for each pair of proteins, it constructs a feature vector based on the weighted patterns. Lastly, it trains an SVM classifier to predict PPIs.

    Although a series of extensive experiments have demonstrated its promising accuracy, its efficiency cannot fulfill the demanding requirements of large-scale PPI prediction. According to Fig. 1(b), we note that its running time increases exponentially with the number of interacting proteins. Moreover, CoFex is unable to perform its task when the number of interacting proteins is larger than 800 000.

    To identify its efficiency bottlenecks, we conduct an in-depth analysis of its procedure and identify several key issues. First of all, when extracting coevolutionary patterns to compose V, it has to traverse the sequence information of all proteins for each candidate, and such traversal is very inefficient when the number of proteins increases. Secondly, the heavy computation in its second and third steps further constrains its efficiency. Lastly, due to the memory limitation of a single machine, the SVM adopted by CoFex is not feasible for training and testing in the context of big data. Hence, in the following section, we address these bottlenecks of CoFex in a distributed manner so as to conduct large-scale PPI prediction.

    C. MapReduce Framework

    As a programming model, MapReduce makes it convenient to run programs on a high-performance parallel platform consisting of a huge number of computing nodes [38]. Engineers only need to provide high-level parallelism information and design map and reduce functions to be executed in parallel across multiple nodes. Because of its convenience and efficiency, it has become one of the most popular frameworks for large-scale data analysis [31], [39]–[41]. It has two parallel processing phases, i.e., Map and Reduce. Each phase adopts a user-defined function to complete its task. The function used in Map takes key-value pairs as input and generates another series of key-value pairs as intermediate output, which is written to local disk. The function used in Reduce aggregates all the values with the same key obtained from the output of Map.
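    The two phases can be illustrated with a single-process sketch in Python. It only imitates the data flow (map, group by key, reduce) in memory and is not Hadoop code; `run_mapreduce`, `map_residues` and `reduce_sum` are hypothetical names chosen for this example.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Imitate the Map, shuffle and Reduce data flow inside one process."""
    groups = defaultdict(list)
    for record in records:
        # Map: each input record yields intermediate (key, value) pairs.
        for key, value in map_fn(record):
            groups[key].append(value)  # shuffle: collect values by key
    # Reduce: aggregate all values that share the same key.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def map_residues(sequence):
    """Emit (amino acid, 1) for every residue of one protein sequence."""
    for residue in sequence:
        yield residue, 1


def reduce_sum(key, values):
    """Sum the counts emitted for one amino acid."""
    return sum(values)


# Toy input: two short sequences standing in for a protein dataset.
totals = run_mapreduce(["MKV", "KKA"], map_residues, reduce_sum)
```

    In a real deployment the map calls run on different computing nodes and the grouping step is handled by the platform, so only the two user-defined functions need to be written.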

    As for the implementation of CoFex+, a main concern lies in the specialized environments and configuration requirements of MapReduce. In order to make it easy to use and to simplify the construction of complex environments, we choose a distributed integration platform called Hadoop [42] as the base of our experiments. The reason is that Hadoop is an open source cloud computing platform. It implements the MapReduce framework, which has been successfully evaluated by a large number of practical applications in many fields, such as machine learning and computational biology. Moreover, we have accumulated solid experience in implementing distributed algorithms in Hadoop. In addition to Hadoop, it is also possible to implement CoFex+ on other distributed platforms, such as Spark [43], [44], if we follow the details of CoFex+ and design the corresponding MapReduce-like functions provided by these platforms.

    IV. DETAILS OF COFEX+

    In this section, we describe the procedure of CoFex+. It addresses the efficiency bottlenecks of CoFex by following the MapReduce framework. As shown in Fig. 2, it is divided into four steps: V identification, V weighting, feature vector construction, and large-scale PPI prediction. In fact, the structure of CoFex+ is similar to that of CoFex, which also has these four steps. However, the structure of CoFex is not compatible with popular distributed integration platforms, and thus CoFex cannot be executed directly in a distributed manner. Hence, for each step of CoFex+, we follow the MapReduce framework to redesign its process and divide it into the Map and Reduce phases. By doing so, we are able to deploy CoFex+ on a distributed integration platform, thus completing the task of large-scale PPI prediction. The two algorithms mentioned in Fig. 2 are independent of each other, and thus they share no input or output parameters. Note that all algorithms and procedures of CoFex+ are presented in the supplementary material.

    Fig. 1. Efficiency and ROC curves of CoFex+. (a) The efficiency of CoFex+ on the datasets composed of proteins at different magnitudes; (b) the efficiency of CoFex+ on the datasets composed of pairwise proteins at different magnitudes; (c) the ROC curves of CoFex+ on the datasets composed of proteins at different magnitudes.

    Fig. 2. The complete procedure of CoFex+.

    A. Identification

    When verifying each candidate of coevolutionary patterns, CoFex has to traverse the sequence information of all proteins, which is very inefficient for large-scale PPI datasets. To avoid unnecessary traversal, CoFex+ adopts a tree-based data structure, namely CF-tree [45], to record the occurrences of all candidates by traversing the input dataset only once. In particular, CF-tree is designed as a multiway tree where each node can have more than two children. For each node in the tree, we define two items, i.e., AminoAcid and ProteinSet. The former is the amino acid assigned to this node, while the latter is the set of proteins in which the coevolutionary pattern defined by the root and this node is observed. Each coevolutionary pattern in V is identified by CoFex+ in two phases, i.e., Map and Reduce.

    1) Map: In this phase, CoFex+ constructs a CF-tree from the dataset of protein sequences assigned to an arbitrary computing node running a map task. Since the largest length of coevolutionary patterns is $k_{max}$, it first splits each sequence, i.e., $S$, into a set of segments denoted as $\{S_{t,k_{max}}\}$ $(1\le t\le n_s-k_{max}+1)$. After that, several CF-trees are built by following Algorithm 1, and each CF-tree has a unique amino acid assigned to its root node. For each segment in $\{S_{t,k_{max}}\}$, we first compose a branch for it and then select the CF-tree whose root node is the amino acid at the first position of this segment. Finally, we append the branch to that CF-tree by following Procedure 1. Hence, for each amino acid in this segment, its previous amino acid is the parent node while its next one is the child node. In addition to the growth of the CF-tree, the protein owning this segment is added to the ProteinSet of each node in the corresponding branch.

    Algorithm 1 Construction of CF-trees
    Input: protein sequences $\{S\}$, $k_{max}$
    Output: CF-trees $\{T_{\tau_i}\}$
    1: initialize each $T_{\tau_i}$ by setting the value of AminoAcid as $\tau_i$ for its root node
    2: for $S$ in $\{S\}$ do
    3:   for $S_{t,k_{max}}$ in $\{S_{t,k_{max}}\}$ do
    4:     call Grow($T_{s_t}$, $S_{t,k_{max}}$)
    5:   end for
    6: end for
    7: return $\{T_{\tau_i}\}$

    Procedure 1 Grow($T_{s_t}$, $S_{t,k_{max}}$)
    Input: $T_{s_t}$, $S_{t,k_{max}}$
    Output: $T_{s_t}$
    1: assume that Node($S_{t,k}$) returns the $k$th node whose value of AminoAcid is $s_{t+k-1}$ along the branch of $S_{t,k_{max}}$ in $T_{s_t}$
    2: add the protein of $S$ to Node($S_{t,1}$)
    3: for each $i\in[1,k_{max}-1]$ do
    4:   if Node($S_{t,i+1}$) does not exist then
    5:     initialize a node by setting the value of AminoAcid as $s_{t+i}$
    6:     set the parent of this node as Node($S_{t,i}$)
    7:   end if
    8:   add the protein of $S$ to the ProteinSet of Node($S_{t,i+1}$)
    9: end for
    10: return $T_{s_t}$
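    The construction in Algorithm 1 and Procedure 1 can be sketched in Python as below. This is an illustrative reading of the pseudocode rather than the authors' implementation: the class name `CFNode`, the functions `grow` and `build_cf_trees`, and the choice to create root nodes lazily (instead of initializing one tree per amino acid up front) are assumptions.

```python
from typing import Dict, Set


class CFNode:
    """One CF-tree node: an amino acid plus the proteins whose segments pass through it."""

    def __init__(self, amino_acid: str) -> None:
        self.amino_acid = amino_acid
        self.protein_set: Set[str] = set()
        self.children: Dict[str, "CFNode"] = {}


def grow(root: CFNode, segment: str, protein_id: str) -> None:
    """Append one segment as a branch below root (in the spirit of Procedure 1)."""
    node = root
    node.protein_set.add(protein_id)
    for residue in segment[1:]:
        if residue not in node.children:  # create the child only when it is missing
            node.children[residue] = CFNode(residue)
        node = node.children[residue]
        node.protein_set.add(protein_id)


def build_cf_trees(proteins: Dict[str, str], k_max: int) -> Dict[str, CFNode]:
    """Single pass over all sequences (in the spirit of Algorithm 1).

    Assumption: roots are created on first use rather than for every amino acid.
    """
    trees: Dict[str, CFNode] = {}
    for protein_id, sequence in proteins.items():
        for t in range(len(sequence) - k_max + 1):
            segment = sequence[t:t + k_max]
            if segment[0] not in trees:
                trees[segment[0]] = CFNode(segment[0])
            grow(trees[segment[0]], segment, protein_id)
    return trees


# Toy input: two made-up proteins and k_max = 3.
trees = build_cf_trees({"p1": "ACDA", "p2": "ACE"}, k_max=3)
```

    In this toy run, both proteins share the path A to C, so the ProteinSet of that node holds both identifiers, while the deeper nodes D and E each record a single protein.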

    The output of Map is a set of key-value pairs in the format $\langle\tau_i,T_{\tau_i}\rangle$, where $T_{\tau_i}$ represents the CF-tree whose root node has the value $\tau_i$ in its AminoAcid.

    2) Reduce: CoFex+ merges all the CF-trees whose root nodes have the same value of AminoAcid, thus identifying the coevolutionary patterns to compose V. Since the input of Reduce is the key-value pairs with format $\langle\tau_i,\{T_{\tau_i}\}\rangle$, each CF-tree in $\{T_{\tau_i}\}$ is processed by the same computing node according to the MapReduce framework. The identification of V on each computing node is described by Algorithm 2. In particular, given $\{T_{\tau_i}\}$, CoFex+ first merges all CF-trees in it into a single CF-tree, as their root nodes have the same value, i.e., $\tau_i$, for AminoAcid. For convenience, $T_{\tau_i}$ used afterward represents the merged CF-tree.

    Algorithm 2 Identification of V
    Input: $\tau_i$, $\{T_{\tau_i}\}$
    Output: V
    1: set V $=\emptyset$
    2: merge all CF-trees in $\{T_{\tau_i}\}$ to compose $T_{\tau_i}$
    3: for $\tau_j$ in $\Gamma$ do
    4:   for $k=2\to k_{max}$ do
    5:     obtain $f((\tau_i,\tau_j)_k)$ with (1) and Count($T_{\tau_i},\tau_j,k$) defined by Procedure 2
    6:     if $f((\tau_i,\tau_j)_k)\ge 1.96$ then
    7:       add $(\tau_i,\tau_j)_k$ to V
    8:     end if
    9:   end for
    10: end for
    11: return V

    Procedure 2 Count($T_{\tau_i},\tau_j,k$)
    Input: $T_{\tau_i}$, $\tau_j$, $k$
    Output: the number of occurrences of $(\tau_i,\tau_j)_k$
    1: initialize $L=\{\}$
    2: assume that Nodes($T_{\tau_i},k$) returns all nodes at the depth $k$ of $T_{\tau_i}$
    3: for each node in Nodes($T_{\tau_i},k$) do
    4:   if the AminoAcid of node is $\tau_j$ then
    5:     $L=L\cup$ ProteinSet of node
    6:   end if
    7: end for
    8: remove duplicate proteins in L
    9: return the size of L
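    The merge step of Algorithm 2 and the counting in Procedure 2 might look as follows in Python. This is a sketch under stated assumptions: `CFNode`, `add_branch`, `merge`, `nodes_at_depth` and `count` are hypothetical names, the root is treated as depth 1 so that depth k corresponds to position t+k-1 of a segment, and the statistical test of (1) is deliberately left out.

```python
class CFNode:
    """CF-tree node holding an amino acid, a protein set and child nodes."""

    def __init__(self, amino_acid):
        self.amino_acid = amino_acid
        self.protein_set = set()
        self.children = {}


def add_branch(root, segment, protein_id):
    """Helper for the example: record one segment of one protein below root."""
    node = root
    node.protein_set.add(protein_id)
    for residue in segment[1:]:
        node = node.children.setdefault(residue, CFNode(residue))
        node.protein_set.add(protein_id)


def merge(target, other):
    """Merge other into target; both roots are assumed to carry the same amino acid."""
    target.protein_set |= other.protein_set
    for residue, child in other.children.items():
        if residue not in target.children:
            target.children[residue] = CFNode(residue)
        merge(target.children[residue], child)
    return target


def nodes_at_depth(root, k):
    """All nodes at depth k, counting the root as depth 1 (an assumption)."""
    level = [root]
    for _ in range(k - 1):
        level = [child for node in level for child in node.children.values()]
    return level


def count(root, tau_j, k):
    """Number of distinct proteins containing the pattern (root amino acid, tau_j)_k."""
    proteins = set()  # a set removes duplicate proteins automatically
    for node in nodes_at_depth(root, k):
        if node.amino_acid == tau_j:
            proteins |= node.protein_set
    return len(proteins)


# Two partial trees, as if produced by two different map tasks.
t1, t2 = CFNode("A"), CFNode("A")
add_branch(t1, "ACD", "p1")
add_branch(t2, "AED", "p2")
add_branch(t2, "ACD", "p3")
merged = merge(t1, t2)
```

    Note that several nodes at the same depth can carry the same amino acid (here D is reached through both C and E), which is why the protein sets are unioned before the size is taken.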

    The next step verifies whether a candidate $(\tau_i,\tau_j)_k$ can be considered as a coevolutionary pattern with some statistical knowledge. Assuming that there are a total of $n$ unique proteins in the dataset, we would like to confirm that the value of Count($T_{\tau_i},\tau_j,k$) is large enough to conclude that the occurrences of $(\tau_i,\tau_j)_k$ are frequently observed among the sequences of all proteins. Following the statistical analysis in [18], we can further rewrite the definition of $f((\tau_i,\tau_j)_k)$ as

    where

    Given (1), we reckon that $(\tau_i,\tau_j)_k$ is frequently found among all protein sequences at a confidence level of 95% if $f((\tau_i,\tau_j)_k)\ge 1.96$, and it can be considered as a coevolutionary pattern. V is obtained once we examine all the candidates by following the same procedure, and hence the output of this phase follows the format of $\langle(\tau_i,\tau_j)_k,\text{null}\rangle$.

    B. Weighting

    Weighting coevolutionary patterns is another step requiring multiple traversals and heavy computation. In particular, for each pattern, CoFex has to obtain its occurrences in the interacting and non-interacting proteins by traversing the training dataset and then calculate its weight according to mutual information theory. To accelerate such computation, CoFex+ decomposes the weighting procedure into Map and Reduce phases. In Map, the occurrence information of coevolutionary patterns in the interacting and non-interacting proteins is obtained, while the weights of coevolutionary patterns are computed in Reduce.

    1) Map: The input dataset is composed of interacting and non-interacting proteins. For each computing node, CoFex+ counts the occurrences of coevolutionary patterns in the part of the input dataset assigned to this node. To do so, it first obtains all the CF-trees and coevolutionary patterns generated in the first step. Since similar counting procedures are applied to all coevolutionary patterns, we take an arbitrary pattern, i.e., $(\tau_i,\tau_j)_k$, as an example to illustrate it. For a particular computing node taking the map task, we assume that I is the dataset of interacting proteins and that a second dataset holds the non-interacting proteins, and both of them are allocated to that computing node. Furthermore, we introduce two counters to denote the numbers of occurrences of $(\tau_i,\tau_j)_k$ in I and in the non-interacting dataset, respectively. Verifying $(\tau_i,\tau_j)_k$ in the sequences of two proteins is done via Procedure 3.

    Assuming that $p_x$ and $p_y$ are two interacting proteins in I, CoFex+ examines the CF-tree $T_{\tau_i}$ to verify the occurrences of $(\tau_i,\tau_j)_k$ in the sequences of $p_x$ and $p_y$. It first obtains all the nodes from the CF-tree at the depth of $k$ and disregards those whose labels are not $\tau_j$, as indicated by Line 4 in Procedure 3. For the remaining nodes, CoFex+ further checks whether $p_x$ and $p_y$ are found in the values of their ProteinSet, as indicated by Lines 5–11. If variables matchX and matchY are both true, we reckon that the coevolutionary pattern $(\tau_i,\tau_j)_k$ occurs in the sequences of $p_x$ and $p_y$. In this regard, the counter for I is increased by 1. After examining all pairs of interacting proteins, we obtain the final value of this counter. Similarly, we can obtain the counter for the non-interacting dataset.

    Procedure 3
    Input: $T_{\tau_i}$, $(\tau_i,\tau_j)_k$, $p_x$, $p_y$
    Output: true or false
    1: result ← false
    2: matchX ← false
    3: matchY ← false
    4: for each node in Nodes($T_{\tau_i},k$) do
    5:   if the value of AminoAcid of node is $\tau_j$ then
    6:     if $p_x\in$ ProteinSet of node then
    7:       matchX = true
    8:     end if
    9:     if $p_y\in$ ProteinSet of node then
    10:      matchY = true
    11:    end if
    12:    result = matchX && matchY
    13:  end if
    14: end for
    15: return result
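    A possible Python rendering of Procedure 3 is given below as a sketch. The names `CFNode`, `add_branch`, `nodes_at_depth` and `pattern_in_pair` are assumptions introduced for illustration, and the root is again treated as depth 1.

```python
class CFNode:
    """CF-tree node holding an amino acid, a protein set and child nodes."""

    def __init__(self, amino_acid):
        self.amino_acid = amino_acid
        self.protein_set = set()
        self.children = {}


def add_branch(root, segment, protein_id):
    """Helper for the example: record one segment of one protein below root."""
    node = root
    node.protein_set.add(protein_id)
    for residue in segment[1:]:
        node = node.children.setdefault(residue, CFNode(residue))
        node.protein_set.add(protein_id)


def nodes_at_depth(root, k):
    """All nodes at depth k, counting the root as depth 1 (an assumption)."""
    level = [root]
    for _ in range(k - 1):
        level = [child for node in level for child in node.children.values()]
    return level


def pattern_in_pair(root, tau_j, k, px, py):
    """True if the pattern (root amino acid, tau_j)_k occurs in both px and py."""
    match_x = match_y = False
    for node in nodes_at_depth(root, k):
        if node.amino_acid != tau_j:
            continue  # nodes with other labels are disregarded
        if px in node.protein_set:
            match_x = True
        if py in node.protein_set:
            match_y = True
    return match_x and match_y


# Toy tree rooted at A: p1 contributes "ACD", p2 contributes "AED".
root = CFNode("A")
add_branch(root, "ACD", "p1")
add_branch(root, "AED", "p2")
both_have_a_d_3 = pattern_in_pair(root, "D", 3, "p1", "p2")
```

    The two flags are kept across nodes because the two proteins may reach the same pattern through different intermediate residues, as p1 and p2 do here.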

    C. Constructing Feature Vectors

    This step constructs a feature vector for each pair of proteins, thus preparing for the training of distributed SVM classifiers in the last step.

    1) Map: CoFex+ makes use of the CF-trees, i.e., $\{T_{\tau_i}\}$, and V to construct the feature vectors for pairwise proteins. Accordingly, the input consists of the interacting protein pairs in $D_{int}$ and the non-interacting protein pairs. CoFex+ splits them into many data blocks, which are then processed by the computing nodes in Map.

    Algorithm 3 constructs feature vectors. Taking two proteins $p_x$ and $p_y$ as an example, it first initializes a feature vector denoted as $V_{xy}$ for them. Each element in $V_{xy}$ represents a unique coevolutionary pattern in V, and accordingly the length of $V_{xy}$ is equal to |V|. For each $(\tau_i,\tau_j)_k\in$ V, it selects the CF-tree $T_{\tau_i}$ and traverses this tree to verify whether $(\tau_i,\tau_j)_k$ occurs in the sequences of $p_x$ and $p_y$, as indicated by Line 4 in Algorithm 3. If so, the weight of $(\tau_i,\tau_j)_k$ is assigned to the corresponding element in $V_{xy}$, as indicated by Line 5. The output of this phase follows the format of $\langle\{p_x,p_y\},V_{xy}\rangle$.
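    The construction of one feature vector can be sketched as follows. For brevity this example checks pattern occurrence directly on the sequences instead of traversing CF-trees, which is a simplification and not the method described above. The names `occurs` and `feature_vector`, the toy weights, and the default value 0.0 for absent patterns are all assumptions.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, str, int]  # (tau_i, tau_j, k)


def occurs(seq: str, pattern: Pattern) -> bool:
    """True if seq has a length-k segment starting with tau_i and ending with tau_j."""
    tau_i, tau_j, k = pattern
    return any(
        seq[t] == tau_i and seq[t + k - 1] == tau_j
        for t in range(len(seq) - k + 1)
    )


def feature_vector(seq_x: str, seq_y: str, weights: Dict[Pattern, float]) -> List[float]:
    """One element per pattern: its weight if both proteins contain it, else 0.0.

    Assumption: 0.0 marks an absent pattern; the element order follows the
    insertion order of the weights dictionary.
    """
    return [
        weight if occurs(seq_x, pattern) and occurs(seq_y, pattern) else 0.0
        for pattern, weight in weights.items()
    ]


# Toy weights (not produced by the mutual information step of the paper).
weights = {("A", "D", 3): 0.7, ("C", "E", 2): -0.2}
v_xy = feature_vector("ACD", "AEDCE", weights)
```

    Because every pair yields a vector of the same length |V|, the vectors of all pairs can be passed to a classifier without further alignment.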

    2) Reduce: All the feature vectors generated by Map are saved to the distributed file system.

    D. Performing Large-scale PPI Prediction

    In the final step of CoFex+, we adopt a divide-and-conquer strategy to perform large-scale PPI prediction. To predict PPIs, we use SVM as the predictor, as the experimental results presented in [18] indicate that the combination of CoFex and SVM yields the best accuracy. However, SVMs are unable to handle large-scale training data. To overcome this problem, we first split the training dataset into several data blocks such that an SVM can be trained for each block. Hence, each output record of Map associates a pair of query proteins px and py in the testing dataset with the prediction score yielded by the SVM in the m-th computing node. For Reduce, the input of a particular computing node is the set of such scores collected for px and py. Thus, the final prediction score for px and py is the average value of these scores, i.e., their sum divided by nxy, where nxy is the number of scores in that set.
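    The averaging performed in Reduce can be illustrated with a short sketch. It assumes that each Map record is a (protein pair, score) tuple; the function name `reduce_scores` and the record layout are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def reduce_scores(records: Iterable[Tuple[Pair, float]]) -> Dict[Pair, float]:
    """Average the per-block SVM scores emitted for each query pair."""
    grouped: Dict[Pair, List[float]] = defaultdict(list)
    for pair, score in records:
        grouped[pair].append(score)  # collect the scores from the different nodes
    # n_xy is len(scores); the final score is the arithmetic mean of the scores
    return {pair: fmean(scores) for pair, scores in grouped.items()}


if __name__ == "__main__":
    toy_records = [
        (("px", "py"), 0.25),  # score from one SVM
        (("px", "py"), 0.75),  # score from another SVM
        (("pa", "pb"), 1.0),
    ]
    print(reduce_scores(toy_records))
```

    Averaging gives every block-level SVM the same vote, which matches the description above; weighting the blocks differently would be a separate design choice not discussed in this section.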

    Each of the efficiency bottlenecks of CoFex is thus specifically addressed by CoFex+. In particular, the inefficient traversal of CoFex is overcome by using the CF-tree data structure, with which the training dataset is traversed only once. The heavy computation caused by the second and third steps of CoFex is decomposed into many subtasks by following the MapReduce framework, and these subtasks run in parallel. Instead of training a single SVM, CoFex+ adopts a divide-and-conquer strategy such that a set of SVMs is trained on different data blocks of the training data. With these solutions, CoFex+ is capable of performing the prediction task for large-scale PPIs, as shown next.

    E. Computational Complexity Analysis

    V. EXPERIMENTAL RESULTS

    To evaluate the accuracy and efficiency of CoFex+, we have conducted a series of experiments, and the results are discussed in this section. When analyzing its efficiency, we have also investigated the influence of the number of computing nodes used in the Map and Reduce phases. Each computing node is equipped with Intel Skylake processors at 3.0 GHz and 16 GB of RAM.

    A. Benchmarking Datasets

    To obtain convincing results, two kinds of benchmarking datasets are used. One is composed of three small datasets collected from the species Arabidopsis thaliana (AT), Escherichia coli (EC) and Schizosaccharomyces pombe (SP), and the other is a huge dataset obtained from the human species. The three small datasets are from STRING [46], which is a database of known PPIs derived from knowledge transfer between organisms and from interactions aggregated from other databases. The human dataset is from the human protein reference database (HPRD) [15], where all the information has been manually extracted from the literature by expert biologists who read, interpret and analyze the published data. These four datasets have been widely adopted for performance evaluation on PPI prediction [18], [23], [26], [28], [29], and their numbers of proteins and of pairs of interacting and non-interacting proteins are presented in Table I.

    TABLE I DESCRIPTION OF BENCHMARKING DATASETS

    Since PPIs generated by high-throughput technologies suffer from high false-positive and false-negative rates, such noisy PPI data could negatively influence the accuracy of PPI prediction. To avoid this problem, we only selected PPIs with confidence scores not less than 0.9 from the STRING database for the AT, EC and SP datasets. This operation is not applied to the human dataset, as all of its PPIs are manually curated to avoid errors, as indicated by [15]. Regarding the generation of non-interacting proteins, we randomly paired up proteins whose interactions were not reported previously.
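    The two preprocessing rules above, i.e., keeping PPIs with confidence scores not less than 0.9 and randomly pairing proteins without reported interactions, can be sketched as follows. The sketch assumes that scores are given on a 0 to 1 scale and that interactions arrive as (protein, protein, score) triples; the function names and the fixed random seed are hypothetical.

```python
import random
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple

Pair = FrozenSet[str]  # unordered pair of protein identifiers


def filter_confident(
    interactions: Iterable[Tuple[str, str, float]], threshold: float = 0.9
) -> Set[Pair]:
    """Keep interactions whose confidence score is not less than the threshold."""
    return {frozenset((a, b)) for a, b, score in interactions if score >= threshold}


def sample_negatives(
    proteins: Iterable[str], reported: Set[Pair], n: int, seed: int = 0
) -> List[Pair]:
    """Randomly pair up proteins whose interaction has not been reported.

    `reported` should contain every known interaction, not only the
    high-confidence ones. Enumerating all pairs is quadratic in the number
    of proteins, so this is suitable for small illustrations only.
    """
    rng = random.Random(seed)
    candidates = [
        frozenset(pair)
        for pair in combinations(sorted(set(proteins)), 2)
        if frozenset(pair) not in reported
    ]
    return rng.sample(candidates, min(n, len(candidates)))


if __name__ == "__main__":
    toy = [("a", "b", 0.95), ("a", "c", 0.50), ("b", "c", 0.90)]
    positives = filter_confident(toy)
    all_reported = {frozenset((a, b)) for a, b, _ in toy}
    negatives = sample_negatives(["a", "b", "c", "d"], all_reported, n=2)
    print(len(positives), len(negatives))
```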

    B. Results

    To verify whether CoFex+ is applicable to large-scale PPI prediction while maintaining a promising accuracy comparable to CoFex, the experiments are divided into two parts. The first part focuses on evaluating the accuracy of CoFex+ by comparing it with several state-of-the-art sequence-based algorithms. Since all the algorithms except CoFex+ are incapable of performing their prediction tasks on the Human dataset, the three small benchmarking datasets, i.e., AT, EC and SP, are used for accuracy evaluation. The second part concentrates on verifying its efficiency for large-scale PPI prediction.

    a) Accuracy

    1) Experimental Setup: In the experiments, we compare the accuracy of CoFex+ with CoFex, Ben-Hur and Noble [13], Shen et al. [14], FCTP-WSRC [17] and S-VGAE [29]. All algorithms are evaluated under five-fold cross-validation on each small dataset. We adopt the receiver operating characteristic (ROC) analysis and use the area under the ROC curve (AUC) to compare the algorithms' accuracy. AUC scores are within the range from 0 to 1, and a higher AUC score indicates a better performance in terms of accuracy. In addition, we adopt several evaluation indicators to evaluate them from different perspectives, i.e., i) sensitivity (Sn), which is the percentage of correctly identified PPIs; ii) specificity (Sp), which is the percentage of correctly identified non-interacting proteins; iii) accuracy (Acc), which is the percentage of correctly identified PPIs and non-interacting proteins; iv) Matthews correlation coefficient (Mcc), which is a stricter evaluation standard considering both under- and over-predictions.
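    For reference, the four indicators can be computed from the entries of a confusion matrix using their standard definitions, as in the sketch below. The function name `confusion_metrics` is hypothetical, and returning 0 when a denominator vanishes is a convention assumed here rather than something stated in the paper.

```python
import math
from typing import Dict


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute Sn, Sp, Acc and Mcc from confusion-matrix counts.

    tp/fn count interacting pairs predicted correctly/incorrectly,
    tn/fp count non-interacting pairs predicted correctly/incorrectly.
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0  # assumed convention for empty classes

    sn = ratio(tp, tp + fn)                    # sensitivity
    sp = ratio(tn, tn + fp)                    # specificity
    acc = ratio(tp + tn, tp + fp + tn + fn)    # accuracy
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)    # Matthews correlation coefficient
    return {"Sn": sn, "Sp": sp, "Acc": acc, "Mcc": mcc}


if __name__ == "__main__":
    # Toy counts for illustration only, not results from the paper.
    print(confusion_metrics(tp=40, fp=10, tn=40, fn=10))
```

    Unlike Acc, Mcc uses all four counts in both numerator and denominator, which is why it penalizes a classifier that over-predicts or under-predicts one class.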

    Regarding the parameter settings, we set the maximum value kmax as 10 in the experiments. For a particular PPI dataset, we test values from 2 to 10 with a step size of 1, and the one that achieves the best performance is adopted. The reason why we cap kmax at 10 is that, in our study, the size of Vk is found to be rather small when k is larger than 10. For the distributed configuration of CoFex+, a total of 10 computing nodes are used in both Map and Reduce. For the other algorithms used in the experiments, the parameters that have to be determined for them to work properly are set to the values recommended by the authors of their references, and we list them in Table II, where C and γ, used by all algorithms except FCTP-WSRC and S-VGAE, are the parameters of SVM.

    TABLE II PARAMETER SETTINGS OF ALGORITHMS

    2) Experimental Results: Regarding the AUC scores presented in Table III, CoFex is better than CoFex+ by only 1.3%. Moreover, we also perform an independent-samples t-test, and the results reveal that the difference in all metrics between CoFex and CoFex+ is not significant at the 0.05 level. The reason for such a slight difference is mainly ascribed to the divide-and-conquer strategy adopted by CoFex+. In particular, a set of SVMs is trained by CoFex+ on different data blocks of the training data, whereas CoFex performs its prediction task with a single SVM. When comparing CoFex+ with the other sequence-based algorithms, we note the strong performance of CoFex+, as it yields a better average score for each evaluation indicator. We also note that CoFex+ performs better than the two state-of-the-art algorithms, i.e., FCTP-WSRC and S-VGAE, as the average scores of AUC, Acc and Mcc obtained by CoFex+ are larger by 30.2%, 23.6% and 15.1%, respectively, than those of FCTP-WSRC, and by 7.1%, 7.1% and 7.7%, respectively, than those of S-VGAE, for the three small datasets. Hence, according to the performance of CoFex+ on the three small datasets, we reason that, as the reimplementation of CoFex using MapReduce, CoFex+ still maintains promising accuracy comparable to CoFex.

    TABLE III PERFORMANCE COMPARISON OF SEQUENCE-BASED ALGORITHMS IN THE AT, EC AND SP DATASETS

    b) Efficiency

    1) Experimental Setup: We apply CoFex and CoFex+ to two categories of PPI datasets generated from the Human dataset. The first category includes datasets composed of proteins at magnitudes of 10^2, 10^3, 10^4 and All, respectively, and the second category involves datasets composed of pairwise proteins at magnitudes of 10^2, 10^3, 10^4, 10^5, 10^6 and All, respectively. Note that since we concentrate on analyzing their efficiency, all the interacting and non-interacting proteins in each dataset are used only for training, and we record their running time. For a fair comparison, CoFex is executed on a machine with the same hardware configuration as for CoFex+.

    Regarding the parameter settings, we set the maximum value kmax as 5 and take it as an example to demonstrate their efficiency. For the distributed configuration of CoFex+, a total of 10 computing nodes are used in the Map and Reduce phases.

    2) Experimental Results: From Fig. 1, both algorithms take more running time when either the number of proteins or that of PPIs increases. Concerning their efficiency comparison, we have the following findings. First, CoFex+ is less efficient than CoFex on the small datasets. A possible reason is that the time used for transferring data among different computing nodes is considerably larger than that for computation. Second, the advantage of CoFex+ is evident on larger datasets, as the blue curves of CoFex+ are always below the red ones of CoFex after the intersection points in Fig. 1(a) and Fig. 1(b). Although the increase in the size of training data requires heavier computation with more time and resources, CoFex+ is able to divide such computation into many tiny tasks and assign them to the computing nodes. In this regard, each computing node handles only a small part of the original computation task in parallel, thus reducing the running time of the entire procedure. Third, from Fig. 1(b), the training task of CoFex is unable to complete when the number of interacting proteins exceeds 800 000. Although we could upgrade the hardware configuration of our machine, it is not the best solution, as the efficiency bottlenecks of CoFex would remain unsolved. That is the reason why we must develop CoFex+ for large-scale PPI prediction. Lastly, for two datasets from different categories but at the same magnitude, CoFex+ takes more time for the dataset in the first category. In other words, for CoFex+, the majority of computation takes place in the second step where coevolutionary patterns are weighted. Hence, we suggest that CoFex+ is preferred for large-scale PPI prediction while CoFex is preferred for small datasets.

    In order to evaluate whether CoFex+ is still effective as the scale of data increases, we apply five-fold cross-validation to the datasets in the first category, and present the corresponding ROC curves in Fig. 1(c). We observe the promising performance of CoFex+ when the number of proteins increases. In other words, with the consideration of more protein sequences, CoFex+ is able to better identify coevolutionary patterns that are useful for PPI prediction.

    In Fig. 3, we present the speedup performance of CoFex+ on the datasets of PPIs at different magnitudes. The baseline used to calculate the speedup scores is the running time of CoFex given in Fig. 1(b). It is worth noting that the speedup generally improves when the size of the PPI dataset increases. When the size of the PPI dataset exceeds 10^5, the improvement is more obvious, as CoFex+ improves the computational efficiency of CoFex by more than two orders of magnitude, as indicated by the red bar in Fig. 3.

    Fig. 3. The speedup performance of CoFex+ in the datasets composed of PPIs at different magnitudes.
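    As a small worked illustration of how a speedup score and its order of magnitude relate, the sketch below divides a baseline running time by a distributed running time. The function names are hypothetical, and the running times in the usage example are made-up placeholders, not measurements from the paper.

```python
import math


def speedup(baseline_seconds: float, distributed_seconds: float) -> float:
    """Speedup of the distributed run relative to the baseline run."""
    if baseline_seconds <= 0 or distributed_seconds <= 0:
        raise ValueError("running times must be positive")
    return baseline_seconds / distributed_seconds


def orders_of_magnitude(ratio: float) -> float:
    """Base-10 logarithm of a speedup ratio; 2.0 means a 100-fold speedup."""
    return math.log10(ratio)


if __name__ == "__main__":
    # Placeholder running times for illustration only.
    s = speedup(baseline_seconds=5000.0, distributed_seconds=40.0)
    print(s, orders_of_magnitude(s) > 2.0)
```

    Under this definition, "more than two orders of magnitude" corresponds to a speedup score above 100.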

    Compared with CoFex, CoFex+ adopts a tree-based data structure, i.e., CF-tree, to accelerate the procedure of identifying coevolutionary patterns. To verify this acceleration effect, we compare the running time of CoFex+ in a local mode with that of CoFex on the Human dataset, where we have to identify coevolutionary patterns from the sequences of a total of 13 730 proteins. The reason why we use the local mode for CoFex+ in Hadoop is to exclude the benefit of distributed computing from the comparison. The comparison in running time taken by CoFex+ and CoFex in the identification of coevolutionary patterns is presented in Fig. 4, where we note that the blue curve representing CoFex+ is always below the red one of CoFex. In this regard, we reason that the identification of coevolutionary patterns costs less time with CoFex+ than with CoFex.

    Fig. 4. The comparison in running time taken by CoFex+ and CoFex in the identification of coevolutionary patterns on the Human dataset.

    In summary, although the increase in the size of training data requires more computation and thus more time and resources, CoFex+ is able to divide such computation into many tiny tasks and assign them to the computing nodes. According to the results and discussions presented above, CoFex+ yields more promising performance for large datasets in terms of efficiency and is verified to have the ability to complete the task of large-scale PPI prediction.

    c) Memory Consumption

    1) Experimental Setup: Since the change in memory consumption of CoFex+ is similar when we increase the number of computing nodes in either Map or Reduce, we take the results of Map as an example to discuss how memory is consumed by CoFex+ when training it on the entire Human dataset.

    2) Experimental Results: From Table IV, we note that the amount of memory consumed by CoFex+ gradually becomes larger when the number of computing nodes in Map increases. CoFex+ requires much less memory than CoFex, as CoFex is unable to handle the entire Human dataset on a machine whose memory is 16 GB. The major reason is the integration of the CF-tree data structure, which saves a lot of memory space by avoiding the storage of massive protein sequence information. In the step of V Identification, CoFex+ constructs at most nΓ CF-trees in each computing node of Map. That is to say, when we deploy more computing nodes to perform the map tasks of CoFex+, the increase in memory consumption mainly comes from newly generated CF-trees.

    TABLE IV MEMORY CONSUMPTION OF COFEX+ GIVEN DIFFERENT NUMBER OF COMPUTING NODES IN MAP

    VI. DISCUSSIONS

    When using CoFex+ in practice, the only assumption is that it can be integrated with popular distributed computing platforms, such as Hadoop and Spark. Given that CoFex+ is designed by following the MapReduce framework, this assumption can be easily satisfied, as both Hadoop and Spark provide MapReduce-like functions for the integration. In this regard, it is possible to implement CoFex+ with Spark by using its MapReduce-like functions. The major reason for us to choose Hadoop is that we have accumulated solid experience in implementing algorithms in the Hadoop environment.

    Apart from the parameters we set for SVM, the only parameter we need to specify in advance is kmax, which is the maximum value of k and also the depth of the CF-trees constructed in the first step of CoFex+. Although a larger value of k allows CoFex+ to exploit more information in protein sequences, the improvement in accuracy is limited, as the number of coevolutionary patterns extracted by using a larger k becomes smaller. Thus, these patterns have less significance, as they are not sufficient for SVM to find an optimum hypersurface. Taking the five-fold cross-validation on the EC dataset as an example, the number of coevolutionary patterns extracted when k=2 is 207, while that number is only 39 when k=10. In other words, for two amino acids located far away from each other in a protein sequence, their ability to provide evidence supporting or refuting the existence of an interaction is not as strong as that of amino acids located closely.

    According to our computational complexity analysis, there are several parameters affecting the efficiency of CoFex+, i.e., n, , n1, n2, |I| and . In particular, n and n? play a critical role in the first step of CoFex+, as we need to identify coevolutionary patterns from the sequences of all proteins. The last four parameters decide the efficiency of its second and third steps. Specifically, the processes of weighting V and constructing feature vectors are applied to each pair of proteins in |I| and by traversing CF-trees. Moreover, there are also relationships among these parameters. For example, the values of n1 and n2 are determined by that of n, as the sizes of CF-trees become larger when more protein sequences are considered. Hence, the number of proteins, i.e., n, and that of pairwise proteins are the key factors impacting the efficiency of CoFex+.

    However, there is an upper limit to the efficiency of CoFex+, as its running time cannot be reduced indefinitely by simply increasing the number of computing nodes. In fact, the runtime overhead of CoFex+ is composed of two parts: the time used for computation and the time spent transferring data among different computing nodes. When the number of computing nodes exceeds some threshold, the process of data transfer takes more time than the computation, and thus constrains the further improvement of efficiency.
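    The trade-off described above can be illustrated with a toy cost model. The model is purely an assumption made for this sketch and is not taken from the paper: computation time is taken to shrink as work / n with n computing nodes, while data-transfer time is taken to grow linearly as transfer_per_node * n. All names and numbers are hypothetical.

```python
import math


def runtime(n_nodes: int, work: float, transfer_per_node: float) -> float:
    """Total running time under the assumed toy model."""
    compute = work / n_nodes                # assumed: computation is split evenly
    transfer = transfer_per_node * n_nodes  # assumed: transfer grows with node count
    return compute + transfer


def crossover_nodes(work: float, transfer_per_node: float) -> float:
    """Node count at which transfer time equals computation time in this model.

    Solving work / n = transfer_per_node * n gives n = sqrt(work / transfer_per_node).
    Beyond this point, adding nodes increases the total running time.
    """
    return math.sqrt(work / transfer_per_node)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for n in (5, 10, 20):
        print(n, runtime(n, work=1000.0, transfer_per_node=10.0))
    print(crossover_nodes(work=1000.0, transfer_per_node=10.0))
```

    In this toy model, the threshold mentioned in the text corresponds to the crossover point; the real threshold would depend on the cluster and the dataset.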

    VII. CONCLUSION

    In this paper, we propose, for the first time, a distributed framework, namely CoFex+, to complete a challenging prediction task for large-scale PPIs. Modified from a well-established algorithm, CoFex, it overcomes CoFex's efficiency bottlenecks from two perspectives. First, it adopts a tree-based data structure to avoid the heavy memory consumption caused by the huge sequence information of proteins. Second, its implementation is integrated with the MapReduce framework such that the task can be completed in a distributed manner. A series of extensive experiments have been conducted, and the results demonstrate that CoFex+ improves the computational efficiency of CoFex by more than two orders of magnitude while retaining a comparable level of accuracy for large-scale PPI prediction. Moreover, regarding the distributed configuration of CoFex+, we conclude that the efficiency of CoFex+ can be gradually improved up to a certain level when the number of computing nodes used in either the Map or Reduce phase increases. In future work, we intend to integrate noise reduction methods with CoFex+ to further improve its accuracy in PPI prediction. We are also interested in exploring the possibility of applying deep neural networks [47]–[49] and clustering methods [50]–[54] to further improve the effectiveness of CoFex+.
