• <tr id="yyy80"></tr>
  • <sup id="yyy80"></sup>
  • <tfoot id="yyy80"><noscript id="yyy80"></noscript></tfoot>
  • 99热精品在线国产_美女午夜性视频免费_国产精品国产高清国产av_av欧美777_自拍偷自拍亚洲精品老妇_亚洲熟女精品中文字幕_www日本黄色视频网_国产精品野战在线观看 ?

    A Distributed Framework for Large-scale Protein-protein Interaction Data Analysis and Prediction Using MapReduce

    2022-10-26 12:24:06LunHuShichengYangXinLuoSeniorHuaqiangYuanKhaledSedraouiandMengChuZhou
    IEEE/CAA Journal of Automatica Sinica 2022年1期

    Lun Hu,, Shicheng Yang, Xin Luo, Senior,Huaqiang Yuan, Khaled Sedraoui, and MengChu Zhou,

    Abstract—Protein-protein interactions are of great significance for human to understand the functional mechanisms of proteins.With the rapid development of high-throughput genomic technologies, massive protein-protein interaction (PPI) data have been generated, making it very difficult to analyze them efficiently. To address this problem, this paper presents a distributed framework by reimplementing one of state-of-the-art algorithms, i.e., CoFex, using MapReduce. To do so, an in-depth analysis of its limitations is conducted from the perspectives of efficiency and memory consumption when applying it for largescale PPI data analysis and prediction. Respective solutions are then devised to overcome these limitations. In particular, we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge sequence information of proteins. After that, its procedure is modified by following the MapReduce framework to take the prediction task distributively.A series of extensive experiments have been conducted to evaluate the performance of our framework in terms of both efficiency and accuracy. Experimental results well demonstrate that the proposed framework can considerably improve its computational efficiency by more than two orders of magnitude while retaining the same high accuracy.

    I. INTRODUCTION

    PROTEINS are crucial to provide valuable insight into understanding the mechanisms of cellular functions for a variety of living organisms. However, proteins have to interact with each other in order to function well [1]. In this regard,protein-protein interactions (PPIs) are of interest in biology because of their ability of regulating a variety of cellular processes, including but not limited to metabolic cycles, cell signaling and DNA transcription. Due to the successful applications of machine learning techniques in bioinformatics[2], [3], a considerable number of computational techniques have been developed in recent years to complete the task of PPI prediction.

    When performing PPI prediction, most of existing algorithms require additional information of proteins to be known in advance [4]. Some of them take into account genomic information, such as the conversion of gene-order[5], [6] and the priors of genomic features among interacting proteins [7]. Motivated by the observation that coevolving proteins are more likely to interact with each other, several techniques address the problem from an alternative viewpoint by making use of evolutionary information among proteins,such as phylogenetic trees [8], [9] and protein domain information [10]–[12]. Though efficient, these two kinds of algorithms commonly suffer from the disadvantage that both of genomic and coevolutionary information required by them is not always available especially for newly discovered proteins. To overcome this disadvantage, some attempts have been made by using the sequence information of proteins to predict PPIs [13]–[17], as such sequence information can be easily obtained by high-throughput technologies. Since the performance of sequence-based algorithms heavily relies on the evidence collected from protein sequences, different strategies are adopted to extract appropriate patterns from protein sequences for more accurate PPI prediction.

    In [18], a novel feature extraction algorithm, namely CoFex,has been proposed to make use of variable-lengthk-mers.Moreover, unlike existing sequence-based algorithms that intend to concatenate the feature vectors of two proteins,CoFex explicitly composes feature vectors for pairs of proteins based on the presence ofk-mers that provide certain evidence to support the existence of interactions. It can then discover the relationship between the sequence information of proteins and their interactions in a more intuitive manner. To evaluate its performance, a support vector machine (SVM)classifier with sigmoid kernel is trained based on its extracted feature vectors and the results show that such SVM beats the state-of-the-art algorithms in prediction accuracy.

    With the rapid development of high-throughput genomic technology, PPI data become so massive that most of existing prediction algorithms are no longer favorable. In order to fulfill the demanding requirements for large-scale PPI prediction, it is essential for us to design a scalable prediction algorithm that not only improves the computational efficiency but also retains high accuracy. Since existing prediction algorithms have been proved with a promising performance when applied to predict PPIs, we adopt a rather intuitive strategy by modifying an existing prediction algorithm instead of developing completely new algorithms that are capable of handling such huge data. To this end, we develop a distributed framework in this work by reimplementing CoFex under the MapReduce framework.

    To do so, we first conduct an in-depth analysis to identify the limitations of CoFex in terms of memory consumption and efficiency. There are several points worth noting. First of all,the sequence information of proteins has to be traversed multiple times when CoFex extracts coevolutionary patterns that arek-mers frequently observed in the interacting proteins with statistical significance. Such traversal mechanism could take much time for a large number of proteins. Secondly,CoFex requires the massive data of protein sequence information to be stored in the memory, thus leading to high memory consumption for large-scale PPI prediction. Lastly,its efficiency is constrained by the integration with SVM,which is not applicable to proceed the training process when the number of feature vectors is huge. In this work, respective solutions are developed to overcome these limitations and a distributed framework modified from CoFex, namely CoFex+,is proposed for significantly improved efficiency when applied to large-scale PPI prediction. The experimental results show that CoFex+achieves more than two orders of magnitude improvement in running time with only a negligible loss in accuracy.

    The rest of this paper is organized as follows. In Section II,a detailed literature review about related work of PPI prediction is presented. In Section III, preliminaries about the mathematical symbols used are introduced. The details of CoFex+are described in Section IV. Experimental results are discussed in Section V, following which we present an indepth discussion about the effectiveness and efficiency of CoFex+in Section VI. The paper ends with a conclusion in Section VII.

    II. RELATED WORK

    In this section, a detailed literature review about PPI prediction is presented. Existing algorithms can be classified by following different criteria. To distinguish them, we summarize the representative work by following the biological information used to construct the feature vectors of proteins or PPIs. The main sources of such information include genomic information, evolutionary profiles, and protein sequences, thus leading to three categories of algorithms. Moreover, we introduce recent attempts to conduct large-scale PPI prediction.

    A. Prediction Algorithms Using Genomic Information

    Genes located in genome sequences can hint at the existence of interaction between pairwise proteins at a comprehensive level due to the availability of whole genomic sequencing data. Dandekaret al. [5] find that proteins encoded by conserved gene pairs are more likely to interact with each other and such conserved gene pairs are within a low level conservation of gene-order. Although this algorithm is useful to predict PPIs by using the conservation of gene-order, it is incapable of predicting the interaction of proteins that are not conserved for gene sequences, such as those encoded by distantly located genes.

    In [7], different genomic features, such as messenger RNA coexpression, coessentiality and colocalization, are used to quantify the associations between them and PPIs. Based on these quantified associations, Bayesian networks are constructed to predict PPIs when the genomic features of query proteins are provided.

    Another motivation about the use of genomic information for PPI prediction is that interacting proteins are found to be homologous in another genome where they are fused into a single protein [19]. Furthermore, Enrightet al. [20] have developed a computational algorithm to identify such fusion events in different genomes to predict the PPIs of proteins involved in these events. However, its disadvantage is that it is unable to work with proteins whose fusion events are uncovered through the analysis of genomic sequencing.

    B. Prediction Algorithms Using Evolutionary Profiles

    Evolutionary information of proteins reveals the procedure of how proteins evolve across different species. Since proteins that co-evolve are more likely to interact with each other, the similarity in evolutionary information is of potential relevance to the prediction of PPIs as it indicates to what extent the two proteins co-evolve.

    Pazos and Valencia [8] make use of phylogenetic trees of proteins to indicate PPIs. They propose a distance measure to compute the similarity among the phylogenetic trees of proteins, thus determining whether there is a possible interaction among them. Another source of evolutionary information is the domain knowledge of proteins. It is believed that proteins interact as a result of their interacting domains, and hence there are many computational algorithms proposed to predict PPIs based on domain knowledge. A maximum likelihood estimation algorithm is applied to identify interacting domains that infer curated PPIs. Then with such inferred interacting domains the interactions between proteins can be predicted [21]. Similarly, random forest of decision trees is trained by taking into consideration all the proteins domains; and then used to predict PPIs [22].Maetschkeet al. [23] introduce the concept of inducers to make use of gene ontology (GO) information more effectively. Given two GO terms, their induced term set is composed of all GO terms along the shortest path between the two terms. After that, GO2PPI is proposed by integrating all induced GO terms with popular classification techniques to predict PPIs.

    C. Prediction Algorithms Using Protein Sequences

    Protein sequences, composed of amino acids, are the primary structures of proteins and the motivation behind the use of protein sequences for predicting PPIs derives from the hypothesis that sequence information may contribute to mediate PPIs.

    When extracting patterns from protein sequences, sequencebased algorithms normally takek-mers into account to compose feature vectors of proteins. Thesek-mers are amino acid sequence segments with length equal tok. Taking [13] as an example, the value of each element in the feature vector of a protein is the number of occurrences of a particular 3-mers in the sequence of that protein. Once all the feature vectors are obtained for all the proteins in the training dataset, these sequence-based algorithms apply them to traditional classifiers, such as SVM and random forest, to distinguish between interacting and non-interacting proteins. Similar to[13], an SVM-based prediction algorithm is proposed by combining a kernel function and a conjoint triad feature for the description of amino acids [14].

    In addition to SVM-based algorithms, Pitreet al. [24]propose PIPE to tackle the problem of predicting PPIs from a different perspective. Their idea is to measure how often pairs of subsequences in the two query proteins co-occur in pairs of proteins that are known to interact. In [25], PPIevo is proposed to extract the feature vectors from Position-Specific Scoring Matrix for each of proteins based on sequence information. It adopts the random forest classifier to predict PPIs based on the feature vectors of all protein pairs involved.

    Instead of constructing feature vectors for individual proteins, CoFex [18] intends to extract coevolutionary features from pairs of protein sequences for predicting PPIs. DNN-PPI[26] develop a novel deep neural network framework to automatically extract features, including including semantic associations between amino acids, position-related sequence segments, and their long- and short-term dependencies, from protein primary sequences. As a recent attempt, a novel computational model has been developed to predict PPIs effectively [17]. To do so, it first combines the F-vector,composition and transition to map each protein sequence onto numeric feature vectors, and then employs a principal component analysis to reconstruct the most discriminative feature subspaces as the input for weighted sparse representation classification.

    D. Prediction Algorithms Using PPI Network

    Since PPI networks have rich connectivity patterns [1],network-based prediction algorithms have been developed to make use of these patterns to characterize known PPIs in a given PPI network. Defining a degree-normalized score based on network paths with length three, L3 [27] is able to determine whether two proteins are interact or not if one is similar to the interacting partners of the other. To alleviate the restriction on the length of network paths, Wanget al. [28]develop a novel prediction algorithm based on a mixed membership stochastic block model. The algorithm simulates the generative process of a PPI network according to the likelihoods of pairwise proteins being grouped in the same protein complex. Recently, an improved graph representation learning method, namely S-VGAE, is proposed to predict PPIs based on both sequence information and network structure[29].

    E. Prediction Algorithms for Large-scale PPI Data

    At present, PPIs that have been identified take less than 20% of the whole interactome [30]. With the development of high-throughput technologies, the size and complexity of protein interaction data have increased significantly. A new challenge is thus raised for large-scale PPI prediction.Recently, several attempts have been made in this field.

    In order to achieve the purpose of effectively and accurately predicting large-scale PPIs, Youet al. [31] propose a parallel SVM model by only requiring the use of protein sequence information for large-scale PPI prediction. Huet al. [32]propose a large-scale protein interaction prediction algorithm based on MapReduce. It first extracts amino acid fragments from sequences of proteins for statistical analysis, and then constructs the corresponding feature vectors of proteins to train SVM. Motivated by the idea of distributed computing,Chenet al. [33] propose a Multi-source Learning based Protein Community Detection algorithm by integrating gene expression data and implement its parallel version on the Apache Spark platform to enhance computational efficiency.

    F. Noise Reduction in PPI Prediction

    Due to the limitation of high-throughput genomic experiments, existing PPI data suffers the disadvantage of high false-positive and false-negative rates, which could indicate the existence of noise in PPI data. In this regard,several attempts have been made to reduce noise and can also be applied to noise reduction in PPI prediction. Biet al. [34]examine a variety of sequence smoothing methods to eliminate the interference of noise points in the original sequence, and identify Savitzky-Golay (SG) filter as the most effective method among them. Similarly, SGW-SCN(Savitzky-Golay and wavelet-supported stochastic configuration networks) [35] adopts the SG filter method to remove possible outliers and noises in non-stationary workload time series. Kritikoset al. [36] integrates several well established PPI weighting methods to assign weights that can indicate the confidence level that a given PPI is a true-positive one. Luoet al. [37] propose a collaborative filtering-enhanced topology-based method to compute an inter-neighbourhood similarity for assessing PPIs.

    III. PRELIMINARIES

    A. Mathematical Notations

    Given an alphabet set Γ={τ1,τ2,...,τnΓ} consisting of totalnΓamino acids, a protein sequenceSis represented asS=(st)1≤t≤ns, wherest∈Γ andnsis the length ofS.Therefore, ak-mer segment starting from the positiontinSis denoted asSt,k=(st,st+1,...,st+k?1), where 1 ≤t≤ns?k+1.

    A pair of coevolving positions with lengthkmeans that there arek?2 do not-care positions between them. For the sake of convenience, we use (τi,τj)k, where τi,τj∈Γ to represent a possible candidate of coevolutionary patterns. IfScontains (τi,τj)k, it means that ?St,k:st=τiandst+k?1=τj.is the set of coevolutionary patterns with lengthk. Assuming thatkmaxis the maximum value ofk, the set of all coevolutionary patterns is denoted as

    B. Efficiency Bottlenecks of CoFex

    As one of state-of-the-art algorithms proposed for PPI prediction, CoFex yields better accuracy than several popular sequence-based algorithms. In the rest of this section, we briefly introduce it and then discuss its efficiency bottlenecks when deployed to perform large-scale PPI prediction.

    CoFex is composed of four steps. First, it targets to identify all coevolutionary patterns so as to compose V with some statistical knowledge. Next, it makes use of mutual information theory to quantify the amount of evidence to support or oppose the existence of interaction for each coevolutionary pattern and obtains a weighted version of V denoted as. Thirdly, for each pair of proteins, it constructs a feature vector based on theLastly, it trains an SVM classifier is trained by CoFex to predict PPIs.

    Although a series of extensive experiments have demonstrated its promising accuracy, its efficiency could not fulfill the demanding requirements for large-scale PPI prediction. According to Fig. 1(b), we note that its running time is exponentially increased with the number of interacting proteins. Moreover, CoFex is unable to perform its task when the number of interacting proteins is larger than 800 000.

    To identify its efficiency bottlenecks, we conduct an indepth analysis into its procedure and identify several key issues. First of all, when extracting coevolutionary patterns to compose V, it has to traverse the sequence information of all proteins for each candidate, and such traversal is very inefficient when the number of proteins increases. Secondly,the heavy computation in its second and third steps further constraints its efficiency. Lastly, due to the memory limitation in a single machine, SVM adopted by CoFex is not feasible for training and testing in the context of big data. Hence, in the following section, we intend to address these bottlenecks of CoFex in a distributed manner so as to well conduct largescale PPI prediction.

    C. MapReduce Framework

    As a programming model, MapReduce is convenient to allow programs running in a high-performance parallel platform consisting of a huge number of computing nodes[38]. Engineers only need to provide high-level parallelism information and design map and reduce functions to be executed in parallel across multiple nodes. Because of its convenience and efficiency, it has become one of the most popular frameworks for large-scale date analysis [31],[39]–[41]. It has two parallel processing phases, i.e.,MapandReduce. Each phase adopts a user-defined function to complete its task. The function used inMaptakes key-value pairs as input and generates another series of key-value pairs as intermediate output to write to local disk. The function used inReduceintends to aggregate all the values with the same key obtained from the output ofMap.

    As for the implementation of CoFex+, a main concern lies in the specialized environments and configuration requirements when using MapReduce. In order to make it easy to use and simplify the construction of complex environments, we choose a distributed integration platform called Hadoop [42]as the base of our experiments. The reason is that Hadoop is an open source cloud computing platform. It implements the MapReduce framework that have been successfully evaluated by a large number of practical applications in many fields,such as machine learning and computational biology.Moreover, we have accumulated solid experiences in implementing distributed algorithms in Hadoop. In addition to Hadoop, it is also possible for us to implement CoFex+by using other distributed platforms, such as spark [43], [44], if we follow the details of CoFex+and design the corresponding MapReduce-like functions provided by these platforms.

    IV. DETAILS OF COFEX+

    In this section, we describe the procedure of CoFex+. It addresses the efficiency bottlenecks of CoFex by following a MapReduce framework. As shown in Fig. 2, it is divided into four steps including V identification, V weighting, feature vector construction, and large-scale PPI prediction. In fact, the structure of CoFex+is similar to that of CoFex, which also has these four steps. However, the structure of CoFex is not compatible with popular distributed integration platforms, and thus it is impossible to explicitly allow CoFex to execute in a distributed manner. Hence, for each step of CoFex+, we follow the MapReduce framework to redesign its process, and divide it into theMapandReducephases. By doing so, we are able to deploy CoFex+ in a distributed integration platform, thus completing the task of large-scale PPI prediction. The two algorithms mentioned in Fig. 2 are independent with each other, and thus they do not have relationships in terms of input and output parameters. Note that all algorithms and procedures of CoFex+are presented in the supplementary material.

    Fig. 1. Efficiency and ROC curves of CoFex+. (a) The efficiency of CoFex+ on the datasets composed of proteins at different magnitudes; (b) the efficiency of CoFex+ on the datasets composed of pairwise proteins at different magnitudes; (c) the ROC curves of CoFex+ on the datasets composed of proteins at different magnitudes.

    Fig. 2. The complete procedure of CoFex+.

    A. Identification

    When verifying each candidate of coevolutionary patterns,CoFex has to traverse the sequence information of all proteins,which is very inefficient for large-scale PPI datasets. To avoid unnecessary traversal, CoFex+adopts a tree-based data structure, namely CF-tree [45], to record the occurrences of all candidates by only traversing the input dataset once. In particular, CF-tree is designed as a multiway tree where each node can take more than two children. For each node in the tree, we define two items, i.e.,AminoAcidandProteinSet. The former is the amino acid assigned to this node, while the latter is a set of proteins where the coevolutionary pattern defined by the root and this node is observed. Each coevolutionary pattern in V is identified by CoFex+with two phases, i.e.,MapandReduce.

    1) Map:In this phase, CoFex+aims to construct a CF-tree from the dataset of protein sequences assigned to an arbitrary computing node running a map task. Since the largest length of coevolutionary patterns iskmax, it first splits each sequence, i.e.,S, into a set of segments denoted as{St,kmax}(1 ≤t≤ns?kmax+1). After that, several CF-trees are built by following Algorithm 1 and each CF-tree has a unique amino acid assigned to its root node. For each segment inSt,kmax, we first compose a branch for it and then select the CFtree whose root node is the amino acid at the first position of this segment. Finally, we append the branch to a CF-tree by following Procedure 1. Hence, for each amino acid in this segment, its previous amino acid is the parent node while its next one is the child node. In addition to the growth of the CFtree, the protein owning this segment is added toProteinSetof each node in the corresponding branch.

    Algorithm 1 Construction of CF-trees{S} kmax Input: protein sequences , {Tτi}Output: CF-trees Tτi AminoAcid τi 1: initialize each by setting the value of as for its root node{S}2: for S in do St,kmax {St,kmax}3: for in do Grow(Tst,St,kmax)4: call 5: end for 6: end for{Tτi}7: return

    Procedure 1 Grow(Tst,St,kmax)Input: , TstSt,kmax Output: Tst Node(St,k)AminoAcid st+k?1 St,kmax Tst 1: assume that returns the kth node whose value of is along the branch of in Node(S t,1)2: add the protein of S to i ∈[1,kmax ?1]3: for each do Node(S t,i+1)4: if does not exist then 5: initialize a node by setting the value of as Node(S t,i)AminoAcid st+i 6: set the parent of this node as 7: end if ProteinS et Node(S t,i+1)8: add the protein of S to the of 9: end for 10: return Tst

    The output ofMapare the key-value pairs by following the format 〈τi,Tτi〉, whererepresents the CF-tree whose root node has the value τiin itsAminoAcid.

    2) Reduce:CoFex+tends to merge all the CF-trees whose root nodes have the same value ofAminoAcid, thus identifying the coevolutionary patterns to compose V. Since the input ofReduceis the key-value pairs with format 〈τi,{Tτi}〉, each CFtree inis processed by the same computing node according to a MapReduce framework. The identification of V on each computing node is described by Algorithm 2. In particular, givenCoFex+first merges all CF-trees in it into a single CF-tree, as their root nodes have the same value,i.e.,forAminoAcid. For convenience,used afterward r epresents the merged CF-tree.

    Algorithm 2 Identification V Input: , V τi{Tτi}Output: V=?1: set 2: merge all CF-trees in to compose τj Γ{Tτi} Tτi 3: for in do k=2 →kmax 4: for do f((τi,τ j)k)5: obtain with (1) and defined by Procedure 2 f((τi,τ j)k)≥1.96 6: if then Count(Tτi,τ j,k)7: add to 8: end if 9: end for 10: end for V(τi,τ j)k V 11: return

    Procedure 2 Count(Tτi,τ j,k)Input: , , k Tτiτj Output: the number of occurrences of L={}(τi,τ j)k 1: initialize 2: assume that return all nodes at the depth k of node Nodes(Tτi,k)Nodes(Tτi,k) Tτi 3: for each in do AminoAcid node τj 4: if the of is then L=L∪ProteinS et node 5: of 6: end if 7: end for 8: remove duplicate proteins in L 9: return the size of L

    The next step verifies whether a candidate (τi,τj)kcan be considered as a coevolutionary pattern with some statistical knowledge. Assuming that there are totalnunique proteins in the dataset, we would like to confirm that the value ofCount(,τj,k) is sufficient enough to reach the conclusion t hat the occurrences of (τi,τj)kare frequently observed among the sequences of all proteins. Following the statistical analysis in [18], we can further rewrite the definition off((τi,τj)k) as

    where

    Given (1), we reckon that (τi,τj)kis frequently found am(ong all) protein sequences at a confidence level of 95% iff(τi,τj)k≥1.96, and it can be considered as a coevolutionary pattern. V is obtained once we examine all the candidates by following the same procedure, and hence the output of this phase follows the format of 〈 (τi,τj)k,null〉.

    B. Weighting

    Weighting coevolutionary patterns is another step requiring multiple traversal and heavy computation. In particular, for each pattern, CoFex has to obtain its occurrences in the interacting and non-interacting proteins by traversing the training dataset and then calculate its weight according to mutual information theory. To accelerate such computation,CoFex+decomposes the weighting procedure intoMapandReducephases. InMap, the occurrence information of coevolutionary patterns in the interacting and non-interacting proteins is obtained while the weights of coevolutionary patterns are computed inReduce.

    1) Map:The input dataset is composed of interacting and non-interacting proteins. For each computing node, CoFex+targets to count the occurrences of evolutionary patterns from a part of the input dataset assigned to this node. To do so, it first obtains all the CF-trees and coevolutionary patterns generated in the first step. Since similar counting procedures are applied to all coevolutionary patterns, we take an arbitrary pattern, i.e., (τi,τj)k, as an example to illustrate it. For a particular computing node taking the map task, we assume that I is the dataset of interacting proteins andis the dataset of non-interacting proteins, and both of them are allocated to that computing node. Furthermore, we introduceandto denote the numbers of occurrences of (τi,τj)kin I andrespectively. Verifying (τi,τj)kin the sequences of two proteins is done via Procedure 3.

    Assuming thatpxandpyare two interacting proteins in I,CoFex+examines CF-treeto verify the occurrences of(τi,τj)kin the sequences ofpxandpy. It first obtains all the nodes from the CF-tree at the depth ofkand only disregards those whose labels are not τjas indicated by Line 4 in Procedure 3. For the remaining nodes, CoFex+further checks whetherpxandpyare found in the values of theirProteinSetas indicated by Lines 5–11. If variablesmatchXandmatchYare both true, we reckon that coevolutionary pattern (τi,τj)koccurs in the sequences ofpxandpy. In this regard, the value ofis increased by 1. After examining all pairs of interacting proteins, we obtain the final value of. Similarly,we could obtainfrom.

    ?

    Input: , , , true false Tτi(τi,τ j)k px py Output: or result ←false 1: matchX ←false 2: matchY ←false 3: 4: for each in do AminoAcid node τj node Nodes(Tτi,k)5: if the value of of is then px ∈ProteinS et node 6: if of then matchX=true 7: 8: end if py ∈ProteinS et node 9: if of then matchY=true 10: 11: end if result=matchX&&matchY 12: 13: end if

    14: end for result 15: return

    C. Constructing Feature Vectors

    This step targets to construct a feature vector for each pair of proteins, thus preparing for the training of distributed SVM classifiers in the last step.

    1) Map:CoFex+makes use of CF-trees, i.e., {} and V to construct the feature vectors for pairwise proteins.Accordingly, the input is the pairs of proteins inDintand.CoFex+splits them into many data blocks, which are then processed by the computing nodes inMap.

    Algorithm 3 constructs feature vectors. Taking two proteinspxandpyas an example, it first initializes a feature vector denoted asVxyfor them. Each element inVxyrepresents a unique coevolutionary pattern in V and accordingly the length ofVxyis equal to |V|. For each (τi,τj)k∈V, it selects CF-treeand traverses this tree to verify whether (τi,τj)koccurs in the sequences ofpxandpyas indicated by Line 4 in Algorithm 3. If so, the weight of (τi,τj)k, i.e.,is assigned to the corresponding element inVxyas indicated by Line 5.The output of this phase follows the format of 〈 {px,py},Vxy〉.

    2) Reduce:All the feature vectors generated byMapare saved to the distributed file system.

    D. Performing Large-scale PPI Prediction

    In the final step of CoFex+, we adopt a divide-and-conquer strategy to perform large-scale PPI prediction. To prediction PPIs, we use SVM as the predictor, as the experimental results presented in [18] indicate that the combination of CoFex and SVM has been proved to yield the best accuracy. However,SVMs are unable to handle large-scale training data. To overcome this problem, we first split the training dataset into several data blocks such that SVM could be trained for each block. Hence, the output ofMapfollows the formatwherepxandpyare a pair of query proteins in the testing dataset, andis the prediction score yielded by SVM in them-th computing node. ForReduce, the input of a particular computing node isThus, the final prediction score forpxandpyis the average value of, and can be determined as

    wherenxyis the number of items in {}.

    It is clearly seen that each of efficiency bottlenecks of CoFex is specifically addressed by CoFex+. In particular, the inefficient traversal of CoFex is overcome by CoFex+by using a CF-tree data structure where a training dataset is only traversed once. The heavy computation caused by the second and third steps of CoFex is decomposed into many subtasks by following the MapReduce framework and these subtasks are able to well run in a parallel manner. Instead of training a single SVM, a divide-and-conquer strategy is adopted by CoFex+such that a set of SVMs are trained on different data blocks of the training data. With all these particularly designed solutions, CoFex+is capable of performing the prediction task for large-scale PPIs, to be shown next.

    E. Computational Complexity Analysis

    V. EXPERIMENTAL RESULTS

    To evaluate the accuracy and efficiency of CoFex+, we have conducted a series of experiments and the results are discussed in this section. When analyzing its efficiency, we have also investigated the influences made by the change in the number of computing nodes used inMapandReducephases. Each computing node is equipped with Intel SkyLake processors at 3.0 GHz and 16 GB of RAM.

    A. Benchmarking Datasets

    To obtain convincing results, two kinds of benchmarking datasets are used. One is composed of three small datasets collected from the species of arabidopsis thaliana (AT),escherichia coli (EC) and schizosaccharomyces pombe (SP)and the other involves a huge dataset obtained from the human species. The three small datasets are from STRING [46],which is a database of known PPIs stemmed from knowledge transfer between organisms, and from interactions aggregated from other databases. The latter is from the human protein reference database (HPRD) [15] where all the information has been manually extracted from the literature by expert biologists who read, interpret and analyze the published data.These four datasets have been widely adopted for performance evaluation on PPI prediction [18], [23], [26], [28], [29], and their numbers of proteins and pairs of interacting proteins and non-interacting proteins are presented in Table I.

    TABLE I DESCRIPTION OF BENCHMARKING DATASETS

    Since PPIs generated by high-throughput technologies suffer from the disadvantage of high false-positive and falsenegative rates, such noisy PPI data obviously could negatively influence the accuracy of PPI prediction. To avoid this problem, we only selected PPIs with confidence scores not less than 0.9 from the STRING database for AT, EC and SP datasets. This operation is not applied to the human dataset as all PPIs are manually curated to avoid errors as indicated by[15]. Regarding the generation of non-interacting proteins, we randomly paired up proteins whose interactions were not reported previously.

    B. Results

    To verify whether CoFex+is applicable to large-scale PPI prediction while maintaining a promising accuracy comparable to CoFex, the experiments are divided into two parts. The fist part of experiments focuses on evaluating the accuracy of CoFex+by comparing it with several state-of-theart sequence-based algorithms. Since all the algorithms except CoFex+are incapable of performing their prediction tasks on the Human dataset, another three small benchmarking datasets, i.e., AT, EC and SP, are used for accuracy evaluation. The second part of experiments concentrated on verifying its efficiency for large-scale PPI prediction.

    a) Accuracy

    1) Experimental Setup:In the experiments, we compare the accuracy of CoFex+with CoFex, Ben-Hur and Nobel [13],Shenet al. [14], FCTP-WSRC [17] and S-VGAE [29]. All algorithms are evaluated under the scheme of five-fold crossvalidation to each small dataset. We adopt the receiver operating characteristic (ROC) analysis and use the area under ROC curve (AUC) to compare algorithms’ accuracy. AUC scores are within the range from 0 to 1, and a higher AUC score indicates a better performance in terms of accuracy. In addition, we adopt several evaluation indicators to evaluate them from different perspectives, i.e., i) sensitivity (Sn) that is the percentage of correctly identified PPIs; ii) specificity (Sp)that is the percentage of correctly identified non-interacting proteins; iii) accuracy (Acc) that is the percentage of correctly identified PPIs and non-interacting proteins; iv) matthew’s correlation coefficient (Mcc) that is a more strict evaluation standard considering both under and over predictions.

    Regarding the parameter settings, we set the maximum valuekmaxas 10 in the experiments. For a particular PPI dataset, we verify it from 2 to 10 with a step size of 1 and the one that achieves the best performance is taken as its value.The reason why we set the maximum value ofkmaxas 10 is that after our study we find that the size of Vkis rather small whenkis larger than 10. For the distributed configuration of CoFex+, total 10 computing nodes are used in bothMapandReduce. For the other algorithms we use in the experiments,parameters that have to be determined for them to work properly are set to the values recommended by the authors of their references and we list them in Table II, whereCandγused by all algorithms except FCTP-SWRC and S-VGAE arethe parameters of SVM.

    TABLE II PARAMETER SETTINGS OF ALGORITHMS

    2) Experiment Results:Regarding the AUC scores presented in Table III, CoFex is only better by 1.3% than CoFex+.Moreover, we also perform an independent samples t-test and the results reveals that the difference in the all metrics between CoFex and CoFex+is not significant (p<0.05). The reason for such a slight difference is mainly ascribed to the divide-and-conquer strategy adopted by CoFex+. In particular,a set of SVMs are trained by CoFex+on different data blocks of training data whereas CoFex performs its prediction task on a single SVM. When comparing CoFex+with other sequencebased algorithms, we note the strong performance of CoFex+,as it yields a better average score for each evaluation indicator. We also note that CoFex+performs better than the two state-of-the-art algorithms, i.e., FCTP-SWRC and SVGAE, as the average scores of AUC, Acc and Mcc obtained by CoFex+are larger by 30.2%, 23.6% and 15.1%respectively than those of FCTP-SWRC, and by 7.1%, 7.1%and 7.7% respectively than those of S-VGAE, for the three small datasets. Hence, according to the performance of CoFex+in the three small datasets, we reason that as the reimplementation of CoFex using MapReduce, CoFex+still maintains promising accuracy comparable to CoFex.

    TABLE III PERFORMANCE COMPARISON OF SEQUENCE-BASED ALGORITHMS IN THE AT, E C AND SP DATASETS

    b) Efficiency

    1) Experimental Setup:We apply CoFex and CoFex+to two categories of PPI datasets generated from the Human dataset.The first category includes three datasets composed of proteins at magnitudes of 102, 103, 104and All respectively,and the second category involves the datasets composed of pairwise proteins at magnitudes of 102, 103, 1 04, 105, 106and All respectively. Note that since we concentrate analyzing their efficiency, all the interacting and non-interacting proteins in each dataset are used only for training, and we record their running time. For a fair comparison, CoFex is executed in the machine with the same hardware configuration as for CoFex+.

    Regarding the parameter settings, we set the maximum valuekmaxas 5 and take it as an example to demonstrate their efficiency. For the distributed configuration of CoFex+, total 10 computing nodes are used in theMapandReducephases.

    2) Experimental Results:From Fig. 1, both of them take more running time when either proteins or PPIs increase.Concerning their efficiency comparison, we have the following findings. First, CoFex+is not as efficient as CoFex in all small datasets. A possible reason for that phenomenon is that the time used for transferring data among different computing nodes is considerably larger than that for computation. Second, the advantage of CoFex+is evident in larger datasets, as the blue curves of CoFex+are always below the red ones of CoFex after the intersection points in Fig. 1(a)and Fig. 1(b). Although the increase in the size of training data requires a heavier computation with more time and resources, CoFex+is able to divide such computation into many tiny tasks and assign them to the computing nodes. In this regard, computing nodes only handle small parts of the original computation task in a parallel manner, thus reducing the running time of the entire procedure. Thirdly, from Fig. 1(b), the training task of CoFex is unable to complete when the amount of interacting proteins exceeds 800 000.Although we could upgrade the hardware configuration of our machine, it is not the best solution as the efficiency bottlenecks of CoFex are still unsolved. That is the reason why we must develop CoFex+for large-scale PPI prediction.Lastly, for two datasets from different categories but at the same magnitude, CoFex+takes more time for the dataset in the first category. In other words, for CoFex+, the majority of computation takes place in the second step where coevolutionary patterns are weighted. Hence, we suggest that CoFex+is preferred for large-scale PPI prediction while CoFex for small datasets.

    In order to evaluate whether CoFex+is still effective as the scale of data increases, we apply a five-fold cross-validation to the datasets in the first category, and present corresponding ROC curves in Fig. 1(c). We observe the promising performance of CoFex+when the number of proteins increases. In other words, with the consideration of more protein sequences, CoFex+is able to better identify coevolutionary patterns that are useful for PPI prediction.

    In Fig. 3, we present the speedup performance of CoFex+in the datasets of PPIs at different magnitudes. The baseline to calculate the speedup scores is the running time of CoFex given in Fig. 1(b). It is worth noting that its speedup performance is generally improved when PPI dataset size increases. When the size of PPI dataset exceeds 1 05, it is more obvious, as it can considerably improve the efficiency of CoFex by achieving more than two orders of magnitude improvement in computational efficiency as indicated by the red bar in Fig. 3.

    Fig. 3. The speedup performance of CoFex+ in the datasets composed of PPIs at different magnitudes.

    When compared with CoFex, CoFex+adopts a tree-based data structure, i.e., CF-tree, to accelerate the procedure of identifying coevolutionary patterns. To verify such acceleration effect, we compare the running time of CoFex+using a local mode with that of CoFex on the Human dataset where we have to identify coevolutionary patterns from the sequences of total 13 730 proteins. The reason why we use the local mode for CoFex+in Hadoop is to avoid the bias benefited from distributed computing. The comparison in running time taken by CoFex+and CoFex in the identification of coevolutionary patterns is presented in Fig. 4, where we note that the blue curve representing CoFex+is always below the red one of CoFex. In this regard, we reason that the identification of coevolutionary patterns cost less time with CoFex+than CoFex.

    Fig. 4. The comparison in running time taken by CoFex+ and CoFex in the identification of coevolutionary patterns on the Human dataset.

    In summary, although the increase in the size of training data requires more computation and thus more time and resources, CoFex+is able to divide such computation into many tiny tasks and assign them to the computing nodes.According to the results and discussions presented above,CoFex+yields more promising performance for large datasets in terms of efficiency and is verified to have the ability to complete the task of large-scale PPI prediction.

    Memory Consumption

    1) Experimental Setup:Since the change in memory consumption of CoFex+is similar when we increase the number of computing nodes in eitherMaporReduce, we take the results ofMapas an example to discuss how the memory is consumed by CoFex+when training it by using the entire Human dataset.

    2) Experimental Results:From Table IV, we note that the amount of memory consumed by CoFex+gradually becomes larger when the number of computing nodes inMapincreases.Obviously, CoFex+requires much less memory than CoFex,as CoFex is unable to handle the entire Human dataset by using the machine whose memory is 16 GB. The major reason for this phenomenon is due to the integration with the CF-tree data structure, which saves a lot of memory space by avoiding storing massive protein sequence information. In the step of V Identification, CoFex+constructs at mostnΓCF-trees in each computing node ofMap. That is to say, when we deploy more computing nodes performing the map tasks of CoFex+, the increase in memory consumption is mainly from newly generated CF-trees.

    TABLE IV MEMORY CONSUMPTION OF COFEX+ GIVEN DIFFERENT NUMBER OF COMPUTING NODES IN MAP

    VI. DISCUSSIONS

    When using CoFex+in reality, the only assumption is the possible integration with popular distributed computing platforms, such as Hadoop and Spark. Given that CoFex+is designed by following the MapReduce framework, this assumption can be easily satisfied in reality, as both Hadoop and Spark provide MapReduce-like functions for the integration. In this regard, it is possible to implement CoFex+with Spark by using its MapReduce-like functions. The major reason for us to choose Hadoop is that we have accumulated solid experiences to implement algorithms in Hadoop environment.

    Regardless of the parameters we set for SVM, the only parameter we need to specify in advance iskmax, which is the maximum value ofkand also is the depth of CF-trees constructed in the first step of CoFex+. Although a larger value ofkallows CoFex+exploit more information in protein sequences, the improvement in accuracy is limited as the number of coevolutionary patterns extracted by using a largerkbecomes smaller. Thus, these patterns have less significance,as they are not sufficient enough for SVM to find an optimum hypersurface. Taking the five-fold cross validation on the EC dataset as an example, the number of coevolutionary patterns extracted whenk=2 is 207 while that number is only 39 whenk=10. In other words, for two amino acids located far away from each other in a protein sequence, their ability in terms of providing evidence supporting or refusing the existence of interaction is not as strong as those located closely.

    According to our computational complexity analysis, there are several parameters affecting the efficiency of CoFex+, i.e.,n,,n1,n2, |I| a ndIn particular,nandn? plays a critical role in the first step of CoFex+, as we need to identify coevolutionary patterns from the sequences of all proteins.The last four parameters decides the efficiency of its second and third steps. Specifically, the processes of weighting V and constructing feature vectors are applied to each pair of proteins in |I| andby traversing CF-trees. Moreover, there are also relationships among these parameters. For example,the values ofn1andn2are determined by that ofn, as the sizes of CF-trees will become larger when more protein sequences are considered. Hence, the number of proteins, i.e.,n, and that of pairwise proteins, i.e.,are the key factors impacting the efficiency of CoFex+.

    However, the upper limit of CoFex+’s efficiency exists, as it is impossible to reduce its running time by simply increasing the number of computing nodes. In fact, the runtime overhead of CoFex+is composed of two parts, one is the time used for computation and the other is the time spent to transfer data among different computing nodes. When the number of computing nodes exceeds some threshold, the process of data transfer takes more time than the computation, and thus constrains the further improvement of efficiency.

    VII. CONCLUSION

    In this paper, we for the first time propose a distributed framework, namely CoFex+, to complete a challenging prediction task for large-scale PPIs. Modified from a wellestablished algorithm CoFex, it overcomes CoFex’s efficiency bottlenecks from two perspectives. First, it adopts a tree-based data structure to avoid the heavy memory consumption caused by the huge sequence information of proteins. Second, its implementation is integrated with the MapReduce framework such that the task can be completed in a distributed manner. A series of extensive experiments have been conducted and the results demonstrate that CoFex+can considerably improve the efficiency of CoFex by achieving more than two orders of magnitude improvement in computational efficiency while retaining a comparable level of accuracy for large-scale PPI prediction. Moreover, regarding the distributed configuration of CoFex+, we conclude that the efficiency of CoFex+could gradually be improved till a certain level when the number of computing nodes used in eitherMaporReducephase increases. Our next work intends to integrate the noise reduction methods with CoFex+to further improve its accuracy in PPI prediction. We are also interested in exploring the possibility of applying deep neural network [47]–[49],clustering methods [50]–[54] to further improve the effectiveness of CoFex+.

    黄色配什么色好看| 亚洲人成77777在线视频| 三上悠亚av全集在线观看| 国语对白做爰xxxⅹ性视频网站| 亚洲内射少妇av| 国产 一区精品| 日韩av不卡免费在线播放| 90打野战视频偷拍视频| 亚洲国产精品999| 久久久国产欧美日韩av| 激情视频va一区二区三区| 精品久久久精品久久久| xxxhd国产人妻xxx| 自拍欧美九色日韩亚洲蝌蚪91| 亚洲天堂av无毛| 视频在线观看一区二区三区| 国产一区二区三区综合在线观看| 午夜免费男女啪啪视频观看| 久久久久久久大尺度免费视频| 国产片特级美女逼逼视频| 久久99热这里只频精品6学生| 欧美中文综合在线视频| 搡女人真爽免费视频火全软件| 亚洲美女黄色视频免费看| 亚洲精品久久久久久婷婷小说| 母亲3免费完整高清在线观看 | 国产成人一区二区在线| 欧美变态另类bdsm刘玥| 男人添女人高潮全过程视频| 新久久久久国产一级毛片| 成人二区视频| av女优亚洲男人天堂| 啦啦啦中文免费视频观看日本| 国产精品三级大全| 亚洲av男天堂| 69精品国产乱码久久久| 精品人妻偷拍中文字幕| 国产免费福利视频在线观看| 波野结衣二区三区在线| 久久久国产精品麻豆| 亚洲经典国产精华液单| 国产黄色免费在线视频| 中文字幕另类日韩欧美亚洲嫩草| 老司机影院毛片| 免费大片黄手机在线观看| 人人妻人人爽人人添夜夜欢视频| 亚洲国产精品999| 80岁老熟妇乱子伦牲交| 在线观看人妻少妇| 女的被弄到高潮叫床怎么办| 国产精品久久久久久久久免| 久久久久久久大尺度免费视频| 18禁动态无遮挡网站| 亚洲国产欧美日韩在线播放| 国产精品免费视频内射| a 毛片基地| 久久久久久免费高清国产稀缺| 超碰97精品在线观看| 国产成人欧美| 飞空精品影院首页| 久久久久久人人人人人| 欧美亚洲日本最大视频资源| 欧美日韩精品网址| 在线 av 中文字幕| 国产免费一区二区三区四区乱码| 91精品国产国语对白视频| 午夜久久久在线观看| 国产精品人妻久久久影院| 国产一区二区三区av在线| 一区二区三区精品91| 亚洲精品日韩在线中文字幕| 精品国产一区二区三区久久久樱花| 久久精品人人爽人人爽视色| 黄色配什么色好看| 90打野战视频偷拍视频| 成人亚洲精品一区在线观看| 成人亚洲精品一区在线观看| 黄色一级大片看看| 美女高潮到喷水免费观看| 国产精品二区激情视频| 成年人免费黄色播放视频| 免费看不卡的av| 美女午夜性视频免费| 国产色婷婷99| 97在线人人人人妻| 国产日韩欧美亚洲二区| 波多野结衣av一区二区av| 日本91视频免费播放| 十八禁高潮呻吟视频| 国产亚洲精品第一综合不卡| 精品亚洲成国产av| 一级爰片在线观看| 久久精品熟女亚洲av麻豆精品| 久久久精品免费免费高清| 国产精品不卡视频一区二区| 18+在线观看网站| 久久毛片免费看一区二区三区| 在线观看一区二区三区激情| 免费黄频网站在线观看国产| 日韩精品免费视频一区二区三区| 亚洲三级黄色毛片| 久久久国产欧美日韩av| 日韩视频在线欧美| av在线app专区| 国产xxxxx性猛交| 国产成人精品一,二区| 午夜免费观看性视频| 秋霞伦理黄片| 精品一品国产午夜福利视频| 国产不卡av网站在线观看| 在线天堂中文资源库| 日韩,欧美,国产一区二区三区| 日本黄色日本黄色录像| 黄色一级大片看看| 欧美激情极品国产一区二区三区| 亚洲男人天堂网一区| 亚洲精品一二三| 午夜福利,免费看| 黄色配什么色好看| 伦理电影大哥的女人| 中文字幕人妻丝袜制服| 欧美日韩精品成人综合77777| 国产免费一区二区三区四区乱码| 午夜91福利影院| 中文字幕精品免费在线观看视频| 一级a爱视频在线免费观看| 婷婷成人精品国产| 老女人水多毛片| 老女人水多毛片| 亚洲人成网站在线观看播放| 亚洲欧美成人精品一区二区| 夫妻性生交免费视频一级片| 永久免费av网站大全| 日本午夜av视频| 少妇的丰满在线观看| 国产高清国产精品国产三级| 男男h啪啪无遮挡| 热99国产精品久久久久久7| 最近的中文字幕免费完整| 婷婷成人精品国产| 伊人久久国产一区二区| 久久毛片免费看一区二区三区| 九草在线视频观看| 美女脱内裤让男人舔精品视频| 亚洲精品aⅴ在线观看| 国产黄频视频在线观看| 日韩不卡一区二区三区视频在线| 大片电影免费在线观看免费| av天堂久久9| 少妇猛男粗大的猛烈进出视频| 超碰成人久久| 国产黄色视频一区二区在线观看| 欧美日韩av久久| 免费黄色在线免费观看| 韩国av在线不卡| 国产在线一区二区三区精| 国产日韩欧美在线精品| 亚洲天堂av无毛| 十分钟在线观看高清视频www| 欧美精品国产亚洲| av一本久久久久| 美女福利国产在线| 日本欧美国产在线视频| 老司机亚洲免费影院| 麻豆乱淫一区二区| 蜜桃国产av成人99| 十八禁高潮呻吟视频| 亚洲人成网站在线观看播放| 女人久久www免费人成看片| 色网站视频免费| 这个男人来自地球电影免费观看 | 97人妻天天添夜夜摸| 亚洲五月色婷婷综合| 秋霞在线观看毛片| 黑人巨大精品欧美一区二区蜜桃| 亚洲欧美色中文字幕在线| 久久毛片免费看一区二区三区| 国产一区二区三区综合在线观看| 亚洲人成77777在线视频| 亚洲精品aⅴ在线观看| 成人黄色视频免费在线看| 亚洲美女视频黄频| 亚洲内射少妇av| 人人澡人人妻人| 免费观看性生交大片5| 男人爽女人下面视频在线观看| 青春草国产在线视频| 国产一级毛片在线| 青春草视频在线免费观看| 欧美激情极品国产一区二区三区| 男女免费视频国产| 香蕉国产在线看| 精品久久久久久电影网| 日本猛色少妇xxxxx猛交久久| 精品少妇内射三级| 亚洲五月色婷婷综合| 涩涩av久久男人的天堂| 亚洲男人天堂网一区| 亚洲一码二码三码区别大吗| 看非洲黑人一级黄片| 多毛熟女@视频| 久久精品国产综合久久久| 国产亚洲一区二区精品| 久久久久久免费高清国产稀缺| 亚洲精品一区蜜桃| 亚洲精品一二三| 日本-黄色视频高清免费观看| 久久免费观看电影| 精品久久久久久电影网| 成人漫画全彩无遮挡| av卡一久久| 秋霞在线观看毛片| 午夜福利视频在线观看免费| 伦精品一区二区三区| 在线看a的网站| 亚洲国产av新网站| 成人漫画全彩无遮挡| 日本欧美国产在线视频| 久久精品久久久久久久性| 日本91视频免费播放| 交换朋友夫妻互换小说| 久久精品国产亚洲av涩爱| 熟女电影av网| 男人操女人黄网站| 免费不卡的大黄色大毛片视频在线观看| 精品国产国语对白av| 如何舔出高潮| 欧美人与善性xxx| 国产精品国产三级专区第一集| 女人精品久久久久毛片| 国精品久久久久久国模美| 三上悠亚av全集在线观看| 国产一区二区三区综合在线观看| 一级片免费观看大全| 极品人妻少妇av视频| 黄色怎么调成土黄色| 国产精品久久久久成人av| 啦啦啦啦在线视频资源| 巨乳人妻的诱惑在线观看| 岛国毛片在线播放| 精品国产乱码久久久久久男人| 中文字幕人妻熟女乱码| 国产激情久久老熟女| 最近手机中文字幕大全| 国产日韩欧美视频二区| 美女中出高潮动态图| 久久久久久久久久人人人人人人| 欧美亚洲 丝袜 人妻 在线| av又黄又爽大尺度在线免费看| 久久人人爽av亚洲精品天堂| 国产精品熟女久久久久浪| av线在线观看网站| 午夜福利一区二区在线看| 午夜福利在线观看免费完整高清在| 日韩精品有码人妻一区| 午夜免费鲁丝| 国产精品99久久99久久久不卡 | 欧美日韩亚洲国产一区二区在线观看 | 777米奇影视久久| 卡戴珊不雅视频在线播放| 春色校园在线视频观看| 九九爱精品视频在线观看| 亚洲欧美成人精品一区二区| 纵有疾风起免费观看全集完整版| 最新中文字幕久久久久| 中国国产av一级| 五月天丁香电影| videos熟女内射| 国产欧美亚洲国产| 亚洲精品日韩在线中文字幕| 欧美日韩一级在线毛片| 成人漫画全彩无遮挡| 美女国产高潮福利片在线看| 免费黄频网站在线观看国产| 亚洲成人av在线免费| 国产麻豆69| 久久精品久久精品一区二区三区| 夜夜骑夜夜射夜夜干| 男人舔女人的私密视频| 免费在线观看完整版高清| 高清黄色对白视频在线免费看| 亚洲国产av影院在线观看| 综合色丁香网| 青春草国产在线视频| 日韩精品免费视频一区二区三区| 成人亚洲精品一区在线观看| 26uuu在线亚洲综合色| 国产视频首页在线观看| 热99国产精品久久久久久7| 欧美精品高潮呻吟av久久| 精品国产乱码久久久久久男人| 精品一区二区免费观看| 最近中文字幕高清免费大全6| 在线亚洲精品国产二区图片欧美| 欧美日本中文国产一区发布| av在线观看视频网站免费| 免费不卡的大黄色大毛片视频在线观看| av片东京热男人的天堂| 人人妻人人添人人爽欧美一区卜| 免费看av在线观看网站| 久久婷婷青草| 制服丝袜香蕉在线| 国产熟女午夜一区二区三区| 国产一区二区 视频在线| 人妻少妇偷人精品九色| 精品人妻一区二区三区麻豆| 国产在线免费精品| 在线观看免费日韩欧美大片| 91久久精品国产一区二区三区| 亚洲欧美精品综合一区二区三区 | 高清视频免费观看一区二区| 亚洲欧美精品自产自拍| 一本—道久久a久久精品蜜桃钙片| 精品国产国语对白av| 老熟女久久久| 亚洲av电影在线进入| 只有这里有精品99| 青春草亚洲视频在线观看| 少妇人妻精品综合一区二区| 亚洲欧美色中文字幕在线| 美女脱内裤让男人舔精品视频| av免费观看日本| 亚洲 欧美一区二区三区| 热re99久久精品国产66热6| 伦精品一区二区三区| 国产成人精品无人区| 中文字幕色久视频| 我要看黄色一级片免费的| 久热久热在线精品观看| videosex国产| 国产精品秋霞免费鲁丝片| 王馨瑶露胸无遮挡在线观看| av在线app专区| 国产成人精品在线电影| 亚洲欧洲精品一区二区精品久久久 | 日本-黄色视频高清免费观看| 熟女少妇亚洲综合色aaa.| 超碰97精品在线观看| 女性生殖器流出的白浆| 久久久久视频综合| 一级a爱视频在线免费观看| 国产一区二区三区综合在线观看| 女人精品久久久久毛片| 美女国产高潮福利片在线看| 99re6热这里在线精品视频| 国产成人91sexporn| 女性被躁到高潮视频| 久久亚洲国产成人精品v| 精品国产一区二区久久| 日本欧美国产在线视频| 日韩在线高清观看一区二区三区| 街头女战士在线观看网站| av国产精品久久久久影院| 日日啪夜夜爽| 伊人亚洲综合成人网| 中文字幕最新亚洲高清| 男人舔女人的私密视频| 成人毛片60女人毛片免费| 久久热在线av| 97在线人人人人妻| 亚洲av欧美aⅴ国产| 久久久久久久精品精品| 欧美成人午夜免费资源| 国产福利在线免费观看视频| 中文字幕最新亚洲高清| 亚洲图色成人| 人成视频在线观看免费观看| av天堂久久9| 久久精品国产鲁丝片午夜精品| 精品福利永久在线观看| 国产精品女同一区二区软件| 午夜福利视频在线观看免费| 国产精品国产三级专区第一集| 精品国产国语对白av| 成年动漫av网址| 香蕉国产在线看| 亚洲综合精品二区| 午夜免费观看性视频| 香蕉国产在线看| 一级毛片我不卡| 午夜福利乱码中文字幕| 91aial.com中文字幕在线观看| 日韩欧美精品免费久久| 成人午夜精彩视频在线观看| 观看av在线不卡| 亚洲精品国产av成人精品| 久久久久精品人妻al黑| 免费不卡的大黄色大毛片视频在线观看| 一区二区三区四区激情视频| 巨乳人妻的诱惑在线观看| 2022亚洲国产成人精品| 亚洲国产欧美网| kizo精华| av片东京热男人的天堂| 国产精品免费视频内射| 欧美激情极品国产一区二区三区| av片东京热男人的天堂| 欧美精品亚洲一区二区| www.av在线官网国产| 久久久国产一区二区| 999久久久国产精品视频| 男的添女的下面高潮视频| 亚洲情色 制服丝袜| 国产精品麻豆人妻色哟哟久久| 午夜日本视频在线| 午夜福利视频精品| 精品人妻在线不人妻| 日本午夜av视频| 久久久久国产精品人妻一区二区| 90打野战视频偷拍视频| 国产日韩一区二区三区精品不卡| 在线观看人妻少妇| 亚洲精品自拍成人| 色播在线永久视频| 国产精品国产av在线观看| 如日韩欧美国产精品一区二区三区| 国语对白做爰xxxⅹ性视频网站| 18禁国产床啪视频网站| 日韩一卡2卡3卡4卡2021年| 免费黄色在线免费观看| 90打野战视频偷拍视频| 亚洲精品国产色婷婷电影| 精品人妻一区二区三区麻豆| 欧美亚洲 丝袜 人妻 在线| 精品少妇久久久久久888优播| 久久综合国产亚洲精品| 男男h啪啪无遮挡| 少妇人妻精品综合一区二区| 亚洲精品国产av蜜桃| 丁香六月天网| 精品一区二区三卡| 欧美97在线视频| 午夜福利在线观看免费完整高清在| 一边亲一边摸免费视频| 国产日韩欧美视频二区| 嫩草影院入口| 观看美女的网站| h视频一区二区三区| 久久这里只有精品19| 丝袜脚勾引网站| 成人亚洲精品一区在线观看| 男女国产视频网站| 飞空精品影院首页| 天天躁夜夜躁狠狠久久av| 99久久综合免费| 国产成人一区二区在线| 国产男女内射视频| av线在线观看网站| 哪个播放器可以免费观看大片| 最近2019中文字幕mv第一页| 亚洲成人av在线免费| 看免费av毛片| 亚洲精品一二三| 国产激情久久老熟女| 亚洲av在线观看美女高潮| 亚洲天堂av无毛| 亚洲欧美精品综合一区二区三区 | 少妇被粗大猛烈的视频| 欧美日韩精品网址| 我的亚洲天堂| 久久久精品免费免费高清| 一级毛片 在线播放| 久久久久久久国产电影| 国产精品偷伦视频观看了| 亚洲欧美中文字幕日韩二区| 精品国产乱码久久久久久男人| 日韩成人av中文字幕在线观看| 在线观看免费视频网站a站| 一本色道久久久久久精品综合| 中文字幕制服av| av网站免费在线观看视频| 99九九在线精品视频| 久久精品国产亚洲av涩爱| 男女边吃奶边做爰视频| av在线老鸭窝| 久久久精品国产亚洲av高清涩受| 少妇人妻精品综合一区二区| 国产有黄有色有爽视频| 99久久综合免费| 亚洲精品美女久久av网站| 国产亚洲最大av| 精品一品国产午夜福利视频| 少妇人妻久久综合中文| 亚洲精品国产av蜜桃| 午夜av观看不卡| 777久久人妻少妇嫩草av网站| 肉色欧美久久久久久久蜜桃| 一区二区三区四区激情视频| 18禁国产床啪视频网站| www.精华液| 亚洲国产毛片av蜜桃av| 国产成人91sexporn| 伊人久久大香线蕉亚洲五| 亚洲美女黄色视频免费看| 麻豆乱淫一区二区| 国产女主播在线喷水免费视频网站| videossex国产| 亚洲成色77777| 好男人视频免费观看在线| 人人妻人人澡人人爽人人夜夜| 久久青草综合色| 老汉色av国产亚洲站长工具| 久久久久人妻精品一区果冻| 久久精品aⅴ一区二区三区四区 | 我要看黄色一级片免费的| 成人亚洲欧美一区二区av| 久久国产亚洲av麻豆专区| 久久鲁丝午夜福利片| 国产av国产精品国产| 人人妻人人爽人人添夜夜欢视频| 免费av中文字幕在线| 久久久国产欧美日韩av| 国产国语露脸激情在线看| 自线自在国产av| 最新中文字幕久久久久| 伊人久久大香线蕉亚洲五| 日韩精品有码人妻一区| 久久人人97超碰香蕉20202| 哪个播放器可以免费观看大片| 亚洲国产欧美日韩在线播放| 欧美精品一区二区免费开放| 亚洲精品第二区| 最新的欧美精品一区二区| 丝袜脚勾引网站| 男女啪啪激烈高潮av片| 少妇人妻精品综合一区二区| 男女边摸边吃奶| 曰老女人黄片| 亚洲精品aⅴ在线观看| 在现免费观看毛片| 国产福利在线免费观看视频| 又黄又粗又硬又大视频| av.在线天堂| 2021少妇久久久久久久久久久| 丝袜美腿诱惑在线| 午夜老司机福利剧场| 精品国产一区二区三区久久久樱花| 亚洲国产欧美网| 90打野战视频偷拍视频| av卡一久久| 成人亚洲精品一区在线观看| 中文字幕最新亚洲高清| 免费看不卡的av| 国产亚洲最大av| 欧美老熟妇乱子伦牲交| 中文字幕最新亚洲高清| kizo精华| 色视频在线一区二区三区| 亚洲精品美女久久久久99蜜臀 | 久久亚洲国产成人精品v| 国产精品女同一区二区软件| 亚洲欧美成人精品一区二区| 美女xxoo啪啪120秒动态图| 亚洲第一青青草原| 9色porny在线观看| 精品少妇黑人巨大在线播放| 高清不卡的av网站| 久久女婷五月综合色啪小说| videossex国产| 寂寞人妻少妇视频99o| 午夜激情av网站| 亚洲av中文av极速乱| 亚洲三区欧美一区| 久久99精品国语久久久| 狂野欧美激情性bbbbbb| 国产黄频视频在线观看| 一区二区三区激情视频| 波多野结衣一区麻豆| 亚洲国产成人一精品久久久| 亚洲欧洲国产日韩| 丁香六月天网| 两个人看的免费小视频| 国产成人一区二区在线| 欧美日韩视频高清一区二区三区二| 捣出白浆h1v1| 毛片一级片免费看久久久久| 99精国产麻豆久久婷婷| 欧美国产精品va在线观看不卡| av网站免费在线观看视频| 韩国av在线不卡| 午夜av观看不卡| www.自偷自拍.com| av免费在线看不卡| 精品视频人人做人人爽| 国产片内射在线| 亚洲欧美成人精品一区二区| 校园人妻丝袜中文字幕| av有码第一页| 日本爱情动作片www.在线观看| 在线天堂中文资源库| 91精品伊人久久大香线蕉| 夫妻性生交免费视频一级片| 9191精品国产免费久久| 制服丝袜香蕉在线| 国产精品免费视频内射| 国产精品国产三级专区第一集| 国产成人欧美| 亚洲国产av新网站| 老女人水多毛片| 日韩三级伦理在线观看| 精品一品国产午夜福利视频| 国产片内射在线| 91精品三级在线观看| 91精品国产国语对白视频| 最近中文字幕2019免费版| 欧美日韩精品网址| 在线观看免费日韩欧美大片| 免费黄网站久久成人精品| av免费在线看不卡| 在线观看免费视频网站a站| 亚洲精品久久午夜乱码| 久久久久久久久久人人人人人人| 老司机影院成人| 国产黄色免费在线视频| 亚洲精品中文字幕在线视频|