    Knowledge Learning With Crowdsourcing: A Brief Review and Systematic Perspective

    2022-05-23 03:02:38 Jing Zhang
    IEEE/CAA Journal of Automatica Sinica, 2022, Issue 5

    Jing Zhang,

    Abstract: Big data are characterized by enormous volume, high velocity, diversity, value-sparsity, and uncertainty, which make knowledge learning from them full of challenges. With the emergence of crowdsourcing, versatile information can be obtained on demand, so that the wisdom of crowds is easily involved to facilitate the knowledge learning process. During the past thirteen years, researchers in the AI community have made great efforts to remove the obstacles in the field of learning from crowds. This concentrated survey paper comprehensively reviews the technical progress in crowdsourcing learning from a systematic perspective that includes three dimensions: data, models, and learning processes. In addition to reviewing important existing work, the paper places particular emphasis on providing promising blueprints for each dimension and on discussing the lessons learned from our past research, which will light the way for new researchers and encourage them to pursue new contributions.

    I. INTRODUCTION

    In today’s era of big data, acquiring massive raw data is no longer difficult, but exploring and exploiting knowledge from these data is still full of challenges. Facing the enormous volume, high velocity, diversity, value-sparsity, and uncertainty of big data, the knowledge learning process has never been entirely automated, although full automation remains a beautiful vision in the artificial intelligence (AI) research community. For example, many deep learning models that have achieved great success in recent years still rely heavily on large datasets with good annotations provided by skilled humans. Therefore, current knowledge discovery and learning processes are still inseparable from the investment of a large amount of human labor and wisdom. The emergence of crowdsourcing has provided a viable solution to this difficulty. Crowdsourcing is defined as the practice of obtaining information or input into a task or project by enlisting the services of a large number of people, either paid or unpaid, typically via the Internet [1]. Instead of seeking domain experts to perform data pre-processing, requesters look for workers from across the Internet to handle raw data, among whom there are professionals, ordinary people, spammers, and even some adversaries. Compared with the traditional way of hiring experts, resorting to crowdsourcing is faster, cheaper, and more creative, but it also carries non-negligible uncertainty. Although imperfect, learning with crowdsourcing has produced many successful cases, ranging from natural language processing [2], computer vision [3], and bioinformatics [4] to medical diagnosis [5]. Crowdsourcing has created many opportunities for AI-related disciplines [6].

    The AI community first foresaw the opportunities that crowdsourcing brings to knowledge discovery and machine learning around 2008, when two milestone studies appeared [7], [8]. Sheng et al. [7] investigated the repeated labeling and majority voting scheme, where requesters ask multiple crowd workers to label the same objects and then determine the (integrated) labels of the objects by voting over the different judgments. They also investigated the impact of integrated labels on the learned prediction models. Whereas [7] focused on general learning problems through simulation, Snow et al. [8] targeted five natural language processing (NLP) tasks, collecting annotations from the Amazon Mechanical Turk (MTurk) crowdsourcing platform. They used the classical Dawid & Skene model [9] to integrate multiple noisy labels and showed that the quality of integrated labels can meet NLP requirements if good label aggregation algorithms are used. This work showed the usability of real crowdsourcing annotations in knowledge learning tasks for the first time. Since then, over the past thirteen years, researchers in the AI community have developed many techniques to tackle the defects of using crowdsourcing in machine learning. A large number of studies have focused on general-purpose technologies for learning from crowdsourced annotated data [10], including statistical truth inference [11], predictive model training with noisy labels [7], [12]–[14], optimization of the cost-effectiveness trade-off [15]–[17], etc.

    Several articles have reviewed the technical progress in this field from different angles. For example, Daniel et al. [18] reviewed the literature from the perspective of quality control for crowdsourcing. Besides quality control, Chittilappilly et al. [19] comprehensively reviewed a wider body of work including incentive design and task assignment. Another previous survey [10] focused on both truth inference and predictive model learning, while article [11] focused only on truth inference. However, several years have passed since these reviews were published, and many new technologies have emerged recently. Compared with [20], which reviews the studies from the machine learning viewpoint and also serves as a basis of this study, this paper proposes a larger systematic knowledge learning framework and derives many interesting and meaningful research topics that existing studies have not touched or deeply investigated.

    Fig. 1. Our systematic perspective of a knowledge-learning-with-crowdsourcing framework. The gray parts have not yet been well studied (only a few or no studies can be found). The interactive protocols between crowd workers and knowledge learning systems are beyond the scope of this paper.

    The objectives of this paper are as follows: 1) It briefly reviews the typical general-purpose technology in this field from the perspective of knowledge learning in the knowledge discovery from data (KDD) process, which will help readers quickly understand the main scientific issues in the field of knowledge learning with crowdsourcing, the development context of mainstream technologies, and the current state of the art; 2) It also provides our viewpoints on the development direction of this field, which may guide young researchers who want to enter it. As a concentrated review paper with our own perspectives, we do not intend to include every research work with trivial contributions. Instead, we particularly emphasize forecasting the development trend of techniques and constructing a larger systematic blueprint that encompasses these techniques. More precisely, what distinguishes this paper from previous survey papers, and what constitutes its contributions, lies in three points:

    1) We integrate existing techniques into a systematic framework from the perspective of knowledge learning. In contrast to the previous machine learning-oriented view, we believe that improving the knowledge learning process can reflect the advantages of crowdsourcing in terms of diversity. Besides, a new trend in machine learning is to add domain knowledge into learned models. Therefore, a knowledge learning-oriented framework is more consistent with the technical development trend.

    2) We do not intend to comprehensively review all existing studies. Instead, we emphasize our own thoughts while reviewing the typical progress. We have accumulated experience and lessons over the past ten years of research in this field, which motivates us to propose a knowledge-learning framework to deal with future challenges.

    3) We present many future research topics in different dimensions of our knowledge-learning framework, discuss the technical roadmaps to realize them, and provide some preliminary ideas for solutions, which makes this paper not only a mirror reflecting the past but also a guide to the future.

    Fig. 1 shows our proposed systematic perspective of a knowledge-learning framework for crowdsourcing. We categorize the techniques into three dimensions: data, model, and systemic. (Footnote 1: Although Fig. 1 appears to have a hierarchical structure, we use the term dimension rather than layer because these techniques are independent of one another and have no strict interfaces between them. Thus, they do not form a hierarchical structure.) Each dimension has its own research contents and objectives. The lower-layer techniques can support the upper layers or be used directly by them. The data dimension focuses on data fusion from multiple heterogeneous crowdsourced sources (workers and raw data). It provides various kinds of data of different quality for the model dimension. The model dimension uses these (possibly noisy) data to train robust and complex predictive (knowledge) models. The systemic dimension seeks techniques that can optimize the knowledge learning process, including reducing the cost, enhancing the capability of workers, and improving the availability of knowledge learning systems. Together they form an overall solution for knowledge learning with crowdsourcing. We now discuss the techniques in each dimension and their development trends.

    II. DATA FUSION FOR CROWDSOURCING

    Data acquisition is one of the basic goals of using crowdsourcing. Since the data are provided by crowd workers with different characteristics, together with the raw data published on the platforms, this process can be viewed as a kind of data fusion from multiple heterogeneous sources. The machine learning and data mining community first realized the opportunity that crowdsourcing brought to supervised learning, i.e., obtaining class labels for training sets. To improve the quality of labels, both Sheng et al. [7] and Snow et al. [8] proposed a repeated-labeling scheme in 2008, which lets multiple crowd workers label the same objects and infers the true labels of the objects from these multiple noisy labels. With the repeated-labeling scheme, truth inference became one of the fundamental topics in knowledge learning with crowdsourcing. In crowdsourcing learning, truth inference is defined as a process that infers (or discovers) true values for unknown and latent variables (such as labels of instances, the community of workers, etc.) and parameters (such as difficulties of instances, reliability of workers, etc.) of crowdsourced annotation systems from the observed data, including the original instances and the noisy annotations provided by the crowd workers.

    TABLE I TAXONOMY FOR AGNOSTIC TRUTH INFERENCE METHODS FOR CROWDSOURCED ANNOTATION

    A. Truth Inference

    The general-purpose truth inference methods for crowdsourced annotation systems have been well studied in the past. The current mainstream statistical methods often work in an agnostic manner, where only the observed noisy labels and the original instances are used for inference. The core function of this process is to estimate true labels for instances from their multiple noisy labels. Because the inferred labels for the instances aggregate the judgments of different crowd workers, the process is called label aggregation or label integration. Meanwhile, depending on the model, other information such as the reliability, dedication, and intention of workers and the difficulty of instances might be inferred simultaneously. Thus, truth inference provides essential knowledge about the crowdsourced annotation environment. Table I summarizes the agnostic truth inference methods for different crowdsourced annotation tasks in this paper.

    1) Agnostic Probabilistic Methods: A large number of agnostic inference methods are based on probability and statistics. These methods can be divided into two main categories according to their differences in probabilistic modeling: probabilistic generative approaches and nonprobabilistic generative (discriminative) approaches. Generative approaches use some basic probability distributions to represent the generation process of the crowdsourced labels as probabilistic graphical models, while discriminative approaches do not exactly rely on probabilistic graphical models, although they can also use probability modeling.

    Probabilistic generative approaches originated from the classic Dawid & Skene (DS) model [9], which uses confusion matrices to model the capability of workers. DS is based only on the categorical distribution. An element in a confusion matrix represents the probability of worker j classifying an instance as class l given that its true class is k. Generally, DS is simple and robust, and it gives a good explanation of the capability of a worker on each class. Raykar et al. [12] developed a Bayesian version of this model (namely, RY), which concentrates on modeling the workers’ biases towards the positive and negative classes in binary classification tasks using two parameters, sensitivity and specificity. RY performs well in binary-class inference. When it is extended to multi-class tasks, the interpretation of sensitivity and specificity becomes ambiguous and the model’s performance deteriorates. However, RY has wide applicability, and it can also be extended to numeric annotations. Welinder and Perona [15] modeled the probability of a worker providing correct labels. Various aspects of an annotation process can be modeled by a generative model as long as it has a sound probabilistic explanation and generation mode. In addition to the characteristics of workers, the difficulty of instances is the aspect most commonly modeled [21], [25], [29], [33], [41], [44]. For example, Whitehill et al. [21] introduced a parameter to model the difficulty of tasks in their GLAD method. GLAD uses a logistic regression model, which makes its performance on real datasets unsatisfactory, because noisy labels rarely obey a specific distribution. Welinder et al. [25] proposed a more complicated multi-dimensional model, where noise in the instance features is also considered. More elaborate worker models can be found in [14] and [22], where the dedication and intention of workers are added to the truth inference models. Exploring and exploiting the relationships among workers brings another opportunity to improve the accuracy of inference. Based on the Bayesian classifier combination model (BCC) [26], cBCC [30] explores the community structure hidden behind workers by analyzing their crowdsourced labels, where workers in the same community have similar labeling results, and EBCC [32] exploits worker correlation to improve label aggregation. However, these models usually include many variables and parameters, so they can hardly be applied in big data environments.
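
    The confusion-matrix idea behind DS can be illustrated with a small EM loop. The following is a minimal sketch under stated assumptions, not the reference implementation of [9]: the function name `dawid_skene`, the triple-based input format, the vote-fraction initialization, and the `smooth` pseudo-count are all choices made here for illustration.

```python
def dawid_skene(labels, n_classes, n_iter=50, smooth=0.01):
    """EM sketch for a Dawid & Skene style model (illustrative only).

    labels: list of (instance, worker, label) triples, labels in 0..n_classes-1.
    Returns (post, conf) where post[i][k] approximates P(true class of i = k)
    and conf[j][k][l] approximates P(worker j answers l | true class is k).
    """
    instances = sorted({i for i, _, _ in labels})
    workers = sorted({j for _, j, _ in labels})

    # Initialise the posteriors with per-instance vote fractions.
    post = {i: [0.0] * n_classes for i in instances}
    for i, _, l in labels:
        post[i][l] += 1.0
    for i in instances:
        total = sum(post[i])
        post[i] = [v / total for v in post[i]]

    conf = {}
    for _ in range(n_iter):
        # M-step: class prior and smoothed, row-normalised confusion matrices.
        prior = [sum(post[i][k] for i in instances) / len(instances)
                 for k in range(n_classes)]
        conf = {j: [[smooth] * n_classes for _ in range(n_classes)]
                for j in workers}
        for i, j, l in labels:
            for k in range(n_classes):
                conf[j][k][l] += post[i][k]
        for j in workers:
            for k in range(n_classes):
                row_sum = sum(conf[j][k])
                conf[j][k] = [v / row_sum for v in conf[j][k]]

        # E-step: posterior over the true class of every instance.
        new_post = {i: list(prior) for i in instances}
        for i, j, l in labels:
            for k in range(n_classes):
                new_post[i][k] *= conf[j][k][l]
        for i in instances:
            total = sum(new_post[i])
            new_post[i] = [v / total for v in new_post[i]]
        post = new_post
    return post, conf
```

    Each row k of `conf[j]` is a categorical distribution over the answers of worker j when the true class is k, which matches the element-wise reading of the confusion matrix given above. Multiplying raw probabilities is adequate for a handful of labels per instance; a practical version would work in log space.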

    Probabilistic generative models usually assume some prior distributions of variables. They have a good theoretical basis and can be solved by standard techniques for probabilistic graphical models, such as Markov chain Monte Carlo (or Gibbs) sampling, EM algorithms, convex optimization, and variational inference. However, their drawback is that if the actual distributions of variables do not obey the assumptions, the inference accuracy deteriorates. For example, the more complicated DS [9] and RY [12] models, with their predetermined assumptions, occasionally perform worse than the simpler ZenCrowd model [27] (where the reliability of a worker is modeled by a binary variable) on quite a few real-world datasets [52]. The optimization methods used for the objective functions of probabilistic models also affect their performance. Thus, some studies [28], [31] focus on the optimization of the models. For example, SpectralDS [31] uses a spectral method to obtain initial estimates of the parameters, resulting in a better outcome of its EM procedure.

    Discriminative approaches require neither that the variables of the models obey specific probability distributions nor that the final results be derived from a series of probabilistic inferences. Different mathematical methods such as matrix factorization and convex optimization can be used to obtain the results. The simplest discriminative method is majority voting (MV), which is also called plurality voting (PV) in multi-class cases. Although MV is simple, it is very effective. Thus, researchers are still keen to study its variants [39], [45], [46]. For example, Tao et al. [46] proposed four strategies to model the similarity of crowdsourced labels. Their method gives workers different label quality weights for different samples and finally integrates labels through weighted MV. The KOS method [40] combines singular value decomposition (SVD) of a low-rank matrix with a belief propagation-like procedure to achieve inference. KOS works well when the noisy label matrix is full (each worker labels all instances). Dalvi et al. [42] proposed a similar SVD-based method, which relaxes the prerequisite that the label matrix must be full. Liu et al. [23] unified MV and KOS under a Bayesian framework (considering prior probabilities) and solved them via variational inference. Inspired by multi-class support vector machines (SVMs), Tian and Zhu [49] proposed a max-margin majority voting method that directly finds the most likely labels for instances by maximizing margins. Zhou and He [50] proposed two structured methods based on tensor augmentation and completion. The two methods use a tensor representation for the labeled data, augment it with a ground truth layer, and estimate the true labels via low-rank tensor completion. Jiang et al. [51] proposed a multiple noisy label distribution propagation method (MNLDP), which considers the relationship between multiple noisy label sets. MNLDP first estimates the distribution of noisy labels for each instance and propagates it to the instance’s nearest neighbors. In label aggregation, each instance considers the noisy label distribution of itself and of its nearest neighbors. Some discriminative methods such as CATD [47] and PM [48] can be extended to numeric labels. Discriminative approaches are usually faster than the probabilistic generative ones.
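
    As a concrete reference point for MV and its weighted variants, the sketch below implements plain plurality voting and a generic weighted vote. It is illustrative only: the function names, the dictionary-based input format, and the fixed per-worker weights are assumptions made here, and the sketch does not reproduce the per-sample weighting strategies of [46].

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """labels: dict mapping instance -> list of (worker, label) pairs.

    Returns the most frequent label per instance; ties go to the label
    that was seen first for that instance.
    """
    return {i: Counter(l for _, l in votes).most_common(1)[0][0]
            for i, votes in labels.items()}


def weighted_majority_vote(labels, weights, default_weight=1.0):
    """Same input as majority_vote, plus a dict worker -> weight.

    Each vote contributes its worker's weight instead of a count of one.
    """
    result = {}
    for i, votes in labels.items():
        score = defaultdict(float)
        for worker, label in votes:
            score[label] += weights.get(worker, default_weight)
        result[i] = max(score, key=score.get)
    return result
```

    With equal weights the two functions agree; the weighted form only changes the outcome when some workers are trusted more than others, which is exactly the quantity that the more elaborate methods above try to estimate.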

    2) Difficulties in Agnostic Methods: Both probabilistic generative and discriminative methods work well when label noise is regularly distributed across the categories. In reality, this premise does not always hold. As early as 2013, we found that workers’ labeling qualities on the two classes in binary-labeling tasks differ significantly, which is the so-called biased labeling phenomenon and degrades most truth inference algorithms [58]. This cognitive bias was further confirmed by successive studies [59], [60]. To deal with the bias, we proposed the PLAT algorithm [43], which can automatically adjust the decision threshold between the inferred positive and negative instances. This topic has continued to attract the attention of researchers [61], [62]. Recently, Gemalmaz and Yin [24] studied a specific cognitive bias, i.e., confirmation bias, which is people’s tendency to favor information that confirms their existing beliefs and values. They proposed a probabilistic graphical model that uses a parameter to model the probability of a worker being subject to confirmation bias. Another difficulty is that if there are spammers or adversarial workers in the system, agnostic methods seldom obtain good results. Some researchers therefore inject prior information to identify low-quality workers so that their weights can be reduced during inference. For example, ELICE [63] optimized truth inference by injecting expert labels. Not only true labels but also information about workers can be injected. Bonald and Combes [64] showed that if the reliability of a small portion of workers is known, the reliability of all workers can be accurately inferred, and the lower bound on the minimax estimation error can be calculated. Oyama et al. [65] required workers to provide their confidence levels when performing tasks and utilized the confidence scores during inference. Liu et al. [66] proposed a method that selects the most informative instances and maximizes the influence of the injected expert labels. The method develops a complete uncertainty assessment for instance selection, and the expert labels are propagated to similar instances via regularized Bayesian inference. However, these methods partially break the agnostic nature of truth inference, requiring more information and human intervention.
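
    The effect of injecting a few expert labels can be seen in a toy binary setting. The sketch below is a generic illustration and not the ELICE algorithm [63] or any other cited method: it assumes binary labels, estimates each worker's accuracy on the expert-labeled instances with add-one smoothing, and converts the accuracy into a log-odds vote weight. All function names and the input format are hypothetical.

```python
import math


def worker_accuracy_from_gold(labels, gold, pseudo=1.0):
    """Estimate worker accuracy on instances that carry an expert label.

    labels: dict instance -> list of (worker, label); gold: dict instance -> label.
    `pseudo` is an add-one style smoothing count (an assumption of this sketch)
    that keeps every estimate strictly between 0 and 1.
    """
    hits, total = {}, {}
    for i, votes in labels.items():
        if i not in gold:
            continue
        for worker, label in votes:
            total[worker] = total.get(worker, 0) + 1
            hits[worker] = hits.get(worker, 0) + (1 if label == gold[i] else 0)
    return {w: (hits[w] + pseudo) / (total[w] + 2 * pseudo) for w in total}


def log_odds_weights(accuracy):
    """Accuracy above 0.5 gives a positive weight, below 0.5 a negative one."""
    return {w: math.log(a / (1.0 - a)) for w, a in accuracy.items()}


def infer_binary(labels, gold):
    """Keep expert labels where available; elsewhere take a log-odds weighted vote."""
    weights = log_odds_weights(worker_accuracy_from_gold(labels, gold))
    result = dict(gold)
    for i, votes in labels.items():
        if i in gold:
            continue
        # Workers never seen on gold instances get weight 0 (no evidence either way).
        score = sum(weights.get(w, 0.0) * (1 if l == 1 else -1) for w, l in votes)
        result[i] = 1 if score > 0 else 0
    return result
```

    Because an adversarial worker receives a negative weight, its votes are effectively flipped rather than merely discounted. This is one reason why a small amount of trusted information can help where purely agnostic voting fails, at the price of the extra human effort noted above.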

    B. Improving Aggregation With Learning Models

    It may be unwise for the above agnostic truth inference methods to ignore the instances themselves completely during inference. Instance features should help obtain better results, but utilizing them in a domain-independent way is challenging. Some past research attempted to achieve this goal by using learning models in both unsupervised and supervised manners.

    Zhang et al. [52] proposed an agnostic inference algorithm, GTIC, which generates conceptual features for instances from their crowdsourced labels and uses a K-means algorithm to cluster all the instances into K classes. GTIC fuzzifies the biases that are difficult to describe in multi-class classification and uses clustering to discover these biases. Building on GTIC, a subsequent method [67] also runs a clustering algorithm on the physical features of instances and uses the clustering results to correct potential errors in the results inferred by GTIC. In contrast, the AVNC method [68] first builds a predictive model using a subset of the inferred data from which the instances with highly probable wrong labels have been filtered out, and then uses this learned predictive model to correct the errors in the inferred labels. Wang et al. [69] used a small portion of high-quality instances to build models that classify the difficulty of unlabeled instances. Liu et al. [70] used predicted labels to improve the performance of label aggregation. Their method captures the characteristics of workers and questions through neural networks, predicts the answers of different workers to the questions, and expands the label set to enhance performance on sparse data. Interestingly, even transfer learning can be used to improve truth inference [71], [72]. These studies show that both supervised and unsupervised learning can be used to improve truth inference.
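
    The general idea of clustering instances on label-derived features can be sketched as follows. This is only loosely inspired by the description above and is not the GTIC algorithm of [52]: the features used here are plain per-class vote fractions, and the K-means centroids are initialized at the one-hot class vectors so that cluster k can be read as class k. Both choices, like the function names, are assumptions of this sketch.

```python
def vote_fractions(votes, n_classes):
    """Turn a list of integer labels into a vector of per-class vote fractions."""
    counts = [0.0] * n_classes
    for label in votes:
        counts[label] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def cluster_on_votes(label_lists, n_classes, n_iter=20):
    """K-means with K = n_classes on vote-fraction features.

    label_lists: dict instance -> list of integer labels from the workers.
    Returns dict instance -> cluster index, read here as the inferred class.
    """
    feats = {i: vote_fractions(v, n_classes) for i, v in label_lists.items()}
    # Centroid k starts at the one-hot vector of class k.
    centroids = [[1.0 if a == k else 0.0 for a in range(n_classes)]
                 for k in range(n_classes)]
    assign = {}
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every instance.
        for i, f in feats.items():
            assign[i] = min(range(n_classes),
                            key=lambda k: squared_distance(f, centroids[k]))
        # Update step: mean of the members; empty clusters keep their centroid.
        for k in range(n_classes):
            members = [feats[i] for i in feats if assign[i] == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign
```

    On clean data this behaves much like plurality voting; the clustering only makes a difference when the centroids drift away from the one-hot corners, which is how systematic labeling biases would show up in this simplified setting.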

    In some early probabilistic graphical model-based inference algorithms [14], [25], [73]–[76], the instance features are retained in the models and participate in truth inference together with noisy labels. However, the role of the feature structures of instances in these inference models is still unclear. That is, how to select instance features and how to make them contribute substantially to the accuracy of truth inference requires further study. In recent years, another line of thought for utilizing instance features in label aggregation resorts to deep learning. As early as 2016, Gaunt et al. [53] began to train deep neural networks with two building blocks, namely DeepAgg, for label aggregation. Yin et al. [54] proposed label-aware autoencoders (LAA) to aggregate crowd wisdom. In a more sophisticated approach, Rodrigues and Pereira [55] proposed CrowdLayer, a model that trains deep neural networks to realize end-to-end learning from crowds (including label aggregation). Chen et al. [56] proposed SpeeLFC, which extends CrowdLayer with interpretable parameters and strengthens the correlation between workers and classes. GCN-Clean [77] uses graph convolution networks (GCNs) to learn the relations between classes, and the learned GCN model is used to clean wrong crowdsourced labels. Cao et al. [78] and Li et al. [79] simultaneously aggregate the crowdsourced labels and learn an accurate classifier via multi-view learning. Yin et al. [80] proposed a clustering-based label-aware autoencoder for label aggregation. The method uses clustering to aggregate instances with similar features and constructs a deep generation process to infer the true labels. Li et al. [34] proposed a fully Bayesian deep generative crowdsourcing model (BayesDGC), which combines the automatic representation learning of deep neural networks with the interpretable probabilistic structure encoding of probabilistic graphical models. It is worth noting that many deep learning models, such as [53], [55], [56], [77], perform better when they have a small number of training examples with ground truth, which violates the agnostic character of truth inference.

    In summary, instance features can potentially improve the accuracy of truth inference, but developing more domain-independent methods requires further investigation.

    C. Perspective

    The objective of the data dimension is to provide high-quality data for knowledge model learning. The first direction is to enrich the labels from different aspects. This has given rise to the recent active research on multi-label truth inference, whose core issue is to explore and exploit label correlations, which not only improves the inference accuracy but also reduces the number of labels required. Bragg et al. [35] proposed the multi-label naive Bayes (MLNB) model. For each label in the model, MLNB constructs a star graph with directed edges from that label to all other labels, which is used to calculate label correlations. Duan et al. [36] studied how to extend the DS model to multi-label settings, proposing the P-DS and ND-DS models. The P-DS model groups candidate labels into pairs and then separately estimates the states of each pair, thereby preventing interference from uncorrelated labels. The ND-DS model depicts the conditional independence properties of the joint distribution over candidate labels as a Bayesian network and approximates the underlying joint distribution by the product of the conditional distributions of candidate labels. Zhang and Wu [37] proposed a multi-class multi-label dependency (MCMLD) model. MCMLD introduces a mixture of multiple independent Multinoulli (also called Categorical) distributions to capture the correlation among the labels, together with a set of confusion matrices modeling the reliability of the workers. However, this model is very time-consuming. A variant of this model, MCMLD-OC [38], limits each label to binary values, thereby reducing the complexity of the model. In addition to these generative methods, Tu et al. [57] proposed a multi-label aggregation model, MLCC, based on joint matrix decomposition. MLCC decomposes the instance-label matrix into the product of two low-rank matrices and uses them to model the worker similarity and the correlation between labels. Because graph neural networks have strong correlation modeling capabilities, their use in crowdsourced multi-label truth inference is a promising research direction.

    However, focusing only on labels limits the benefits of using crowdsourcing as a means of data collection to a narrow scope. Crowdsourcing should provide a greater variety of data for upper-level model training. Since crowd workers can provide class labels, they can also provide descriptive data for the instances themselves from various facets, namely the Multi-Faceted Feature Description in Fig. 1. Some attempts have been made in this direction. Deng et al. [81] proposed an online game that reveals discriminative features of images, and the human-annotated features are used to train classification models. Similar work can be found in [82]. However, in these studies, the human feature selection process does not introduce additional data. It is of great practical value for crowd workers to provide multi-faceted feature descriptions. For example, in a health record classification task, different doctors, nurses, and dieticians may provide different answers reflecting their different concerns and perspectives. Therefore, in addition to the final class label, the reasons for their judgments must also be collected for model training. Currently, there are no studies focusing on integrating crowdsourced multi-faceted feature descriptions. In data collection and integration, we will probably need the help of domain knowledge. An external knowledge graph may serve as a reliable knowledge source. Therefore, how to introduce knowledge graphs into crowdsourced data aggregation has become another topic. There are at least two ways to use a knowledge graph to improve label quality. In the data acquisition phase, domain knowledge can be extracted and pushed to workers according to their current tasks so that they obtain hints that help them complete the tasks better. In the truth inference phase, domain knowledge can also be integrated into the inference process to obtain better results.

    Using topic models as a bridge to connect the conceptual inference model, the multi-faceted feature integration model, and the external knowledge graph is a potentially feasible solution. The topic model is a common tool for knowledge representation, which can extract the core knowledge concepts in the data and ignore minor details. The probabilistic graphical representation of topic models makes them technically compatible with most crowdsourcing truth inference models, so they can be uniformly integrated into larger graphical models. For each crowd worker, we can build a quality vector over the topics. This topic quality vector is also inferred from existing data and describes the reliability of a worker on different topics at a fine-grained level.

    III. ROBUST AND COMPLEX MODEL LEARNING

    After the crowdsourced data are fused, we can use the collected data to build predictive models. Model learning is highly relevant to application domains. This paper only focuses on those domain-independent techniques.

    A. Predictive Model Learning

    Most current work focuses on supervised learning from crowdsourced labeled data. Compared with truth inference in crowdsourcing, which has already achieved fruitful results, research on general-purpose predictive model learning is still at an early stage.

    1) Standard Weakly Supervised Learning:Weak supervision is defined as a particular supervised learning setting, where limited, or imprecise sources are used to provide supervision signals for labeling a large number of training data. Since the training instances have their integrated labels (though imperfect) after truth inference, it is straightforward to train learning models using any suitable learning algorithms, such as decision tree, SVM, etc. For example, in the early work [7], a random forest is built after performing majority voting, and in [12], [14], logistic regression models are built after labels aggregated. These studies have demonstrated a basic fact that truth inference is an effective means to improve the quality of labels of training samples. Although imperfect, the noisy labeled training sets still can be used to train realistically available models. The performance of learned models not only depends on the quality of labels but also depends on the quality of instance features and learning algorithms. However, we should know that for some particular domains, label noises do deteriorate the learned models. This is the so-calledtwo-stagelearning paradigm, i.e., inferencepluslearning. There are also some other methods that directly build learning models from the raw data with repeated noisy labels. Kajinoet al.[13], [83]proposed two methods, Personal Classifier and Clustered Personal Classifier, to learn logistic regression models with convex optimization. Both methods treat crowd workers as independent classifiers, each of which only uses the labels provided by a particular worker. All classifiers are modeled by a multi-task learning model with an objective function that can be globally optimized. Due to the limited effect of regression models in modeling high-dimensional sparse data,these methods can only be applied in a few fields. 
Donmez and Carbonell [84] proposed a proactive learning method that does not include inference, but it only works in the scenario where each instance is labeled by two workers. In [85], the author proposed a pairwise training strategy, where each instance has two weighted copies with negative and positive labels. We prefer two-stage methods for three reasons. First, we have not observed any substantial performance improvement for the bundled model (without inference). Second, when knowledge learning systems encounter problems, two-stage methods make it easier to judge which part is out of order, whereas the bundled model blurs the boundaries between inference and training. Third, if we have plenty of crowdsourced labeled data, the two-stage scheme makes it possible to build predictive models using a portion of the data and use them to correct incorrect labels [68].
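    The two-stage paradigm (inference plus learning) can be illustrated with a minimal sketch. It assumes majority voting as the inference step and, purely as a stand-in for any suitable learner, a nearest-centroid classifier; the function names and the toy data are hypothetical and are not taken from the cited works.

```python
from collections import Counter


def majority_vote(noisy_labels):
    """Return the most frequent label; ties go to the smallest label."""
    counts = Counter(noisy_labels)
    best = max(counts.values())
    return min(label for label, c in counts.items() if c == best)


def infer_integrated_labels(crowd_labels):
    """Stage 1 (inference): map each instance id to one integrated label."""
    return {inst: majority_vote(labels) for inst, labels in crowd_labels.items()}


def train_centroid_classifier(features, integrated):
    """Stage 2 (learning): fit a nearest-centroid model on integrated labels."""
    sums, counts = {}, Counter()
    for inst, label in integrated.items():
        x = features[inst]
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] += 1
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}

    def predict(x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, centroids[c]))
        return min(sorted(centroids), key=dist)

    return predict


# Toy data: three noisy binary labels per instance (hypothetical).
crowd_labels = {"a": [1, 1, 0], "b": [0, 0, 1], "c": [1, 0, 1], "d": [0, 0, 0]}
features = {"a": [0.9, 0.8], "b": [0.1, 0.2], "c": [0.8, 0.9], "d": [0.2, 0.1]}

integrated = infer_integrated_labels(crowd_labels)
predict = train_centroid_classifier(features, integrated)
```

    Because the two stages are separate functions, a failure can be traced either to the integrated labels or to the learner, which is the diagnostic advantage argued for above.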

    2) Other Learning Settings: Although using integrated labels to train models is the standard form, some information may be lost during inference. Noisy labels reflect the workers’ judgments of the instances, which may improve the generalization performance of learning models. To take full advantage of the noisy labels, Sheng [85] proposed five label utilization strategies for weakly supervised learning, which exploit the fact that some learning algorithms, such as cost-sensitive decision trees [86] and neural networks, can accept weights for training instances. Thus, weights are generated for instances from the repeated labels, calculated using both frequency and the tail of a Beta distribution.
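    As an illustration of frequency-based weighting combined with the pairwise idea of two weighted copies per instance, the following sketch handles the binary case. It is a simplified assumption of how such weights could be generated, not the exact strategies of [85]; the Beta-tail variant is omitted, and all names are hypothetical.

```python
def paired_frequency_copies(noisy_labels):
    """Return [(label, weight), (label, weight)] for one binary instance.

    The positive copy is weighted by the fraction of positive votes and
    the negative copy by the fraction of negative votes.
    """
    total = len(noisy_labels)
    if total == 0:
        raise ValueError("an instance needs at least one noisy label")
    positives = sum(1 for y in noisy_labels if y == 1)
    return [(1, positives / total), (0, (total - positives) / total)]


def build_weighted_training_set(features, crowd_labels):
    """Expand every instance into weighted rows (x, label, weight).

    Copies with zero weight are dropped, so a unanimously labeled
    instance contributes a single row.
    """
    rows = []
    for inst, labels in crowd_labels.items():
        for label, weight in paired_frequency_copies(labels):
            if weight > 0:
                rows.append((features[inst], label, weight))
    return rows


rows = build_weighted_training_set(
    {"a": [0.9, 0.8], "b": [0.1, 0.2]},
    {"a": [1, 1, 0, 1], "b": [0, 0, 0]},
)
```

    The resulting rows can be passed to any learner that accepts per-instance weights.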

    Some studies focused on learning methods for particular cases. Zhang et al. proposed a method, PLAT [43], that can automatically adjust the decision threshold between the inferred positive and negative instances, which can solve the imbalanced learning issue resulting from biased labeling. This work showed that biased labeling would exacerbate the imbalance of data distribution. Therefore, in the dual context of imbalanced underlying data and biased labeling, the objective of truth inference is not simply to maximize accuracy but to maximize the imbalanced learning performance (for example, to maximize the AUC). Rodrigues and Pereira [55] proposed a deep learning method, CrowdLayer, for crowdsourcing. Their method uses an EM algorithm to jointly learn the parameters of networks as well as the reliability of workers, and then introduces a crowd layer that directly trains end-to-end deep neural networks from the noisy labels using back-propagation. This model adds a crowd layer after the traditional convolutional neural network (CNN). The crowd layer can be used for label aggregation during training. After the model training is completed, the crowd layer can be removed, and the remaining part is the standard CNN prediction model. Atarashi et al. [87] addressed semi-supervised learning using deep neural networks. They presented a generative deep learning model, which leverages unlabeled data effectively by introducing latent features and data distribution. Shi et al. [88] proposed a deep generative model for more complicated multi-label semi-supervised learning, which incorporates latent variables to describe the labeled/unlabeled data as well as the labeling process of crowdsourcing. Wang et al. [89] believed that the inconsistency of crowdsourced labels not only stems from malicious workers or errors made by normal workers but also indicates semantic information (such as ambiguity or difficulty) of instances. Their method measures label inconsistency and assigns different weights to labels with different inconsistencies, which helps the training of neural networks and improves the learning performance. As the application of deep learning becomes more widespread, crowdsourcing learning will be more closely integrated with it.

    Fig. 2. A blueprint to build robust and complex learning models. HEPM: Heterogeneous ensemble predictive models.

    B. Perspective

    Due to the uncertainty of crowdsourcing, there are inevitably errors in the fused data. The target of the model dimension is to build complicated learning models that are robust to these errors. We have four types of data sources, i.e., conceptual annotation, multi-faceted feature description, instance features, and external knowledge graph, which allow the model dimension to adopt rich learning paradigms.

    Fig. 2 illustrates a blueprint for forecasting future technical development in this dimension. Usually, performing instance and feature selections on the dataset before model learning can effectively reduce noises, balance data distribution, and improve learning performance. The quantitative relationship between the various factors in statistical inference models for crowdsourcing, such as the reliability of workers, quality of topics, difficulty of instances, distribution of classes, quality of integrated labels, etc., can be established by probably approximately correct (PAC) learnable or statistical-query learnable theories [90]. This quantitative relationship, together with the external knowledge graph, will provide some guidance information for the instance and feature selection processes.

    Since both multi-source heterogeneous crowdsourced data and the external knowledge graph describe different facets of instance features, we can obtain a set of training datasets through the above instance and feature selection. From the perspective of the model training process, co-training has provided a potential solution to use different feature sets that provide different, complementary information about the instance. As a semi-supervised method, it learns a separate classifier for each view and predicts unlabeled or imperfectly labeled data by iteratively constructing additional labeled training sets. From the perspective of decision making, ensemble learning can aggregate the outputs of multiple classifiers to form a more accurate prediction. Since the base models can be built with different learning algorithms from different views of features, heterogeneous ensemble predictive models can be created. According to the latest study [91], ensemble models are more robust to the noise in the data. In the ensemble model of [91], each instance is duplicated with different weights according to the distribution and class memberships of its multiple noisy labels, and the final classifier is obtained from the aggregation of multiple base learners using the maximum a posteriori probability estimate, which shows a smaller upper bound on the error rate than that of the voting method. This ensemble model is homogeneous: the same learning algorithm runs on training sets extracted from the same data pool. If training sets from different facets could be used, heterogeneous ensemble learning may achieve better performance.

    From the perspective of specific learning algorithms, there are also some research topics that have never been well studied. Although some recent studies [55], [87], [88] have pushed crowdsourcing learning into a deep learning stage, neither how to optimize the deep learning model to cope with crowdsourcing noise nor how to embed an external knowledge graph to enhance deep learning models [92] has been well studied. We believe that crowdsourcing learning based on graph neural networks [93] is a promising research topic. For example, we can let the network nodes represent the crowdsourced workers, and let the weight of a network edge represent the similarity of the two workers it connects, so as to obtain the adjacency matrix. Then, we define the degree matrix of the worker network, which only has values on the diagonal, indicating the number of workers adjacent to a node. All the noisy labels of all workers on a certain sample form a label matrix. In this way, the three types of matrices required by the graph convolutional network (GCN) [94] are obtained. GCN has two convolution operations, and the integrated labels of the samples can be obtained by averaging all rows of the output label matrix of the second convolution operation. After obtaining the integrated labels, we connect GCN with a traditional CNN to train predictive models. On GCN, the idea of K-nearest neighbors can be used to complete missing labels. We can also use the predicted labels obtained from the CNN fully connected layer and the integrated label obtained from GCN to calculate the cross-entropy loss to train the parameters of the convolutional layers, max-pooling layers, and fully connected layers of the CNN.
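    The matrix construction described above can be sketched as follows. This is only a hedged illustration of the idea, not a trained GCN: the learnable weights and nonlinearities are omitted, the degree is taken as the weighted degree with self-loops (the common renormalization used with graph convolutions), missing labels are not handled, and the similarity values are hypothetical.

```python
import math


def normalized_adjacency(similarity):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric, non-negative
    worker-similarity matrix A with a zero diagonal."""
    n = len(similarity)
    a_hat = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]  # diagonal of the degree matrix
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def integrate_label(similarity, worker_labels, steps=2):
    """Propagate one-hot worker labels over the worker graph `steps` times,
    average the rows, and return (argmax class, averaged class scores)."""
    a_norm = normalized_adjacency(similarity)
    h = worker_labels
    for _ in range(steps):
        h = matmul(a_norm, h)
    scores = [sum(row[k] for row in h) / len(h) for k in range(len(h[0]))]
    return scores.index(max(scores)), scores


# Three workers: workers 0 and 1 are similar, worker 2 is an outlier.
similarity = [[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.1],
              [0.1, 0.1, 0.0]]
# One-hot noisy labels of the three workers on one sample (two classes).
worker_labels = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

label, scores = integrate_label(similarity, worker_labels)
```

    In a full model, each propagation step would be followed by a learned weight matrix and an activation, and the averaged output would serve as the integrated label fed to the CNN loss.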

    Multi-view and multi-kernel learning is another interesting direction, which can build learning models directly on different feature subspaces or heterogeneous feature spaces. Zhou and He [95] made the first attempt on this topic; however, we have not seen much further research. Some weakly supervised learning techniques, such as graph matching in semi-supervised learning, self-taught learning [96], data programming [97], positive-unlabeled learning [98], and noise filtering and correction [99], can be used to build noise-robust models.

    IV. LEARNING PROCESS AND STRATEGIES

    In real-world applications, the ideal state is that the knowledge learning system maintains the ability of continuous learning at low cost and is readily available. Therefore, the focus of the systemic dimension is the optimization of the knowledge learning system, which includes a trade-off between cost and performance, dynamic modeling of the system state, and the guarantee of the high availability of the system. All these functionalities need the support of a knowledge base, which, although necessary, belongs to the research area of data management [100] and is out of the scope of this paper. During the past years, active learning with crowdsourcing has attracted much attention from researchers because it can significantly reduce the cost of crowdsourced annotation while maintaining high learning performance. Active learning is defined as a special learning paradigm, where a learning algorithm can interactively query a user (or some other information source, i.e., a teacher or an oracle) to label new data points with the desired outputs. There is a natural connection between crowdsourcing learning and active learning. Since crowdsourcing labeling can significantly reduce labeling costs compared to expert labeling, the pursuit of optimization of labeling costs naturally becomes one of the goals of crowdsourcing learning. The realization of this goal requires the help of active learning. Therefore, this section puts forward our viewpoints on future technical development after reviewing the progress in active learning.

    A. Active Learning and its Strategies

    Active learning optimizes the learning process through the design of learning strategies. The early work [7] first proposed three uncertainty-based instance selection strategies for crowdsourcing, which consider the noisy label distributions on instances, the prediction of current models, and both of them, respectively. In this work, the label uncertainty is calculated from the distribution of the two types of labels in the sample’s noisy crowdsourced label set, which reflects the inconsistency of crowdsourced workers’ judgments on sample types. Model uncertainty is calculated based on the probability that the sample belongs to each class as calculated by the current model; that is, the greater the entropy, the more uncertain. The hybrid uncertainty is the geometric mean of the label uncertainty and model uncertainty, which has the best overall performance. However, it can be seen from the experiments of this work that label uncertainty is very sensitive to worker errors, and a small number of labeling errors will cause fluctuation of the model’s performance. This will increase the difficulty of judging the convergence of the model in practice. Zhang et al. [101] extended these strategies to a biased labeling scenario by considering the level of labeling bias obtained from their PLAT algorithm, which partially solves the imbalance issue caused by biased labeling. Yan et al. [74] believed that selecting crowd workers is also necessary to improve the quality of labels and should be treated as a strategy in active learning. They designed a strategy that picks the workers that are most beneficial to the performance improvement of the current learning model during each iteration of active learning. However, this work did not answer the question of what impact choosing a sub-optimal worker will have on the model if the optimal worker is unavailable (for example, temporarily withdrawing from the task). Long et al. [102] proposed Bayesian active learning, which introduces a sorting strategy based on information entropy when selecting instances and workers. Even the more complex multi-label active learning follows a similar research idea, but more correlation modeling is added by evaluating the instance-label pairs [35], [103]. As a mode of data acquisition, the active learning paradigm can also be used for label aggregation [104].
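    The three uncertainty scores can be sketched for the binary case as follows. As a simplifying assumption, label uncertainty is measured here by the entropy of the positive-vote proportion, which stands in for the measure actually used in [7]; only the geometric-mean combination is taken directly from the description above. All names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli distribution with parameter p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def label_uncertainty(noisy_labels):
    """Uncertainty of the noisy label set (requires at least one label)."""
    positives = sum(1 for y in noisy_labels if y == 1)
    return binary_entropy(positives / len(noisy_labels))


def model_uncertainty(prob_positive):
    """Uncertainty of the current model's prediction for the instance."""
    return binary_entropy(prob_positive)


def hybrid_uncertainty(noisy_labels, prob_positive):
    """Geometric mean of label uncertainty and model uncertainty."""
    return math.sqrt(label_uncertainty(noisy_labels)
                     * model_uncertainty(prob_positive))


def select_instance(pool):
    """Pick the instance id with the largest hybrid uncertainty.

    pool maps instance id -> (noisy_labels, prob_positive).
    """
    return max(sorted(pool), key=lambda i: hybrid_uncertainty(*pool[i]))


pool = {
    "x1": ([1, 1, 1, 1], 0.95),  # workers and model agree
    "x2": ([1, 0, 1, 0], 0.55),  # workers split, model unsure
    "x3": ([1, 1, 1, 0], 0.90),
}
chosen = select_instance(pool)
```

    Note that the geometric mean is zero whenever either component is zero, so an instance with unanimous labels is never selected by the hybrid score, which is one way to see why a few labeling errors can change the ranking noticeably.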

    In addition to the above standard forms of active learning, some studies have designed more diverse active learning strategies. Self-taught active learning [96] first classifies crowd workers into two categories, i.e., weak workers and reliable workers, and then occasionally selects weak workers but uses the labels provided by reliable workers to compensate for their working outputs. This strategy not only learns reliable knowledge but also provides opportunities for weak workers and helps them improve their capability, which prevents the system from becoming unavailable after the reliable workers leave. However, the judgment of who is a reliable worker in this work seems to be derived from prior knowledge. In addition, the work did not consider the strategy of label expansion for an unreliable worker, that is, which reliable workers will be used for the expansion of the unreliable worker. Lin et al. [105] proposed the concept of reactive learning, which introduces two sampling strategies: uncertainty sampling and impact sampling. It makes a trade-off between acquiring more labels to lower the noise level and enlarging the size of the training set with more instances. It is common sense that although some workers exhibit low quality, they can still provide correct answers to some specific instances. The work of Lin et al. is very impressive, because most other algorithms are guided by the designed active learning strategy to automatically decide whether to expand the crowdsourced label set of a sample or to select a new sample, whereas this work explicitly considered the issue before the active learning strategy was designed. Unfortunately, we did not find more in-depth investigations of this issue. Based on the fact that low-quality workers can still answer some instances correctly, Huang et al. [17] proposed an active strategy that evaluates the cost-effectiveness of instance-worker pairs. The strategy selects an instance that is beneficial for the performance improvement of the current model and a worker that has a high probability of providing a correct answer to this instance at a relatively low cost.

    Fig. 3. A blueprint of the technical development trend and their relations in the systemic dimension. A-P: Annotation process; K-T: Knowledge transfer;D-T: Domain topic.

    Many active learning strategies are heuristic. They are intuitively rational, but it is difficult to conduct a theoretical analysis. It is impossible to know the performance boundaries of these strategies. Therefore, theoretical research in this direction needs to be strengthened in the future.

    B. Perspective

    Due to the uncertainty of crowdsourcing, the functionalities of the systemic dimension are extremely critical to crowdsourced knowledge learning systems. There are several examples. Sometimes our active learning strategy does choose the most appropriate worker to perform the task. However, when the task is pushed to the worker, the worker has left the system, or the accuracy of the worker begins to decline due to long hours of work, which makes active learning fail and results in the instability of the knowledge learning process. In some other cases, a requester publishes a set of new human intelligence tasks, but workers in the system are not familiar with this application domain, resulting in low-quality answers and poorly learned models. All these issues need to be addressed with new techniques in the systemic dimension.

    Fig. 3 illustrates a blueprint of the technical development trend and their relations. The entire technical roadmap adds a variety of modeling processes, strategy design, and a global knowledge base into the crowdsourcing active learning framework. To ensure the stability of the knowledge learning process, we need to dynamically model the crowdsourcing annotation process, including system state modeling and time series modeling. There has been some work addressing the dynamic modeling of crowdsourced annotation. Rodrigues et al. [106] introduced Gaussian process classification to model multiple annotators with different levels of expertise. Jung et al. [107] explored temporal behavioral patterns of underlying crowd work and proposed a time-series label predictive model to capture past worker behaviors. Modeling the state and time series of a crowdsourcing system relies on the results of the statistical inference and aggregation of multi-source data. For example, we may use hidden Markov models (HMMs), state space models, or latent autoregressive models to model the changes in the reliability of workers over time. Compared with statistical inference, dynamic time series modeling can reduce the influence of early historical data on current state prediction. Accurate time series modeling will increase the effectiveness of instance and worker selection.
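    To make the HMM suggestion concrete, the following sketch filters a worker’s hidden state (reliable or unreliable) from a sequence of observed outcomes. It is a minimal two-state forward filter under assumed parameters: the transition probability `stay`, the per-state accuracies `p_correct`, and the way correctness is observed (for example, agreement with the integrated label) are all hypothetical choices, not values from the cited works.

```python
def filter_reliability(observations, stay=0.9, p_correct=(0.9, 0.55), prior=0.5):
    """Return P(worker is reliable) after each observed answer.

    observations: iterable of booleans, True if the answer was judged correct.
    stay: probability that the hidden state does not change between answers.
    p_correct: (accuracy when reliable, accuracy when unreliable).
    prior: initial probability of the reliable state.
    """
    belief = prior
    history = []
    for correct in observations:
        # Prediction step: the state may switch before the next answer.
        predicted = belief * stay + (1 - belief) * (1 - stay)
        # Update step: weigh each state by how well it explains the outcome.
        like_rel = p_correct[0] if correct else 1 - p_correct[0]
        like_unrel = p_correct[1] if correct else 1 - p_correct[1]
        numer = predicted * like_rel
        belief = numer / (numer + (1 - predicted) * like_unrel)
        history.append(belief)
    return history


# A worker who starts well and then deteriorates (e.g., due to fatigue).
history = filter_reliability([True] * 5 + [False] * 5)
```

    Because of the transition step, old observations are gradually discounted, so the estimate reacts to the recent run of errors instead of averaging over the whole history, which matches the advantage of dynamic modeling noted above.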

    Another interesting research direction is knowledge transfer modeling in the systemic dimension. Knowledge transfer can improve the usability and stability of a knowledge learning system. In particular, when the system is in its cold-start stage, we can use previously learned knowledge of related domains to improve the quality of the outputs of crowd workers. Researchers have already noticed the effect of transfer learning on crowdsourcing inference. Mo et al. [108] first introduced transfer learning into crowdsourcing and proposed cross-task crowdsourcing, which can share knowledge across different domains and alleviate the knowledge sparsity of a particular domain. Fang et al. [109], [110] proposed an active learning framework with knowledge transfer, where workers’ expertise is modeled from the historical annotations in a source domain and used in a target domain for instance and worker selection. Zhao et al. [111] transferred knowledge from categorized Yahoo! Answers datasets to learn user expertise in tasks on Twitter. The learning strategy of the knowledge transfer model solves the cold-start problem in the process of new topic expansion in crowdsourced annotation. We need to embed the knowledge transfer process in active learning and start this process under certain conditions. Knowledge transfer is a way of sharing knowledge between the source domain and the target domain. It can use the model of the source domain to determine or predict the attributes of samples in the target domain. The source domain is composed of samples that have been well labeled and have formed a good prediction model for the specific field. The target domain consists of samples that are ready to be crowdsourced for annotation (description). In the iterative process of active learning, the existing sample selection strategy is first used to select samples that need crowdsourced labels, and then we evaluate whether these samples need knowledge transfer. We evaluate the similarity between each topic in the source domain and the topics of these samples. If the topic of the samples is different from the topics of the source domain, the knowledge transfer process needs to be initiated; otherwise, the original active learning strategy will be used. In addition, how to use the already labeled samples to update the source domain model also needs further consideration.

    We have noticed that, compared with traditional active learning, active learning from crowds can hardly maintain a smoothly rising learning curve. This may be partially because workers make mistakes repeatedly due to a lack of expertise or the exhaustion caused by repetitive tasks. Thus, machine teaching [112], the inverse problem of machine learning, can be embedded in the active learning procedure, where the selected teaching examples are pushed to the workers when they perform annotation tasks. The latest study [113] attempted to enhance the ability of workers with the Generalized Context Model from cognitive psychology. Therefore, we need to introduce the concept of interactive learning, which can feed knowledge back to crowdsourcing workers in the learning process to maintain and improve their working ability. The core part of interactive learning is the online machine teaching process. The difficulty lies in the generation of teaching cases. After selecting the samples to be labeled in the next round, interactive learning needs to generate some teaching cases to help the crowdsourced workers complete the tasks with high quality. The number of teaching cases should not be large, and the teaching needs to cover at least both positive and negative aspects. Too many teaching cases will not only increase the learning burden of workers but also distract their attention when performing tasks. Providing positive and negative teaching cases simultaneously can enable workers to quickly grasp concepts through comparison. Therefore, we need to find in the knowledge base those samples that are most similar to the samples to be labeled and those that look similar but belong to different categories, and use them as teaching cases. In addition, we have to consider how to update the teaching knowledge base.
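    The selection of one positive and one negative teaching case can be sketched as a nearest-neighbor lookup in the knowledge base. This is only an illustration under assumptions: samples are represented by numeric feature vectors, similarity is measured by squared Euclidean distance, and the provisional category of the sample to be labeled comes from the current model. All names and data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pick_teaching_cases(query, predicted_label, knowledge_base):
    """Return (positive_case, negative_case) for one sample to be labeled.

    knowledge_base: list of (features, label) pairs with trusted labels.
    positive_case: the most similar sample with the predicted label.
    negative_case: the most similar sample with a different label, i.e.,
    one that looks similar but belongs to another category.
    Either element is None if no such sample exists.
    """
    def nearest(cases):
        if not cases:
            return None
        return min(cases, key=lambda c: squared_distance(query, c[0]))

    same = [c for c in knowledge_base if c[1] == predicted_label]
    other = [c for c in knowledge_base if c[1] != predicted_label]
    return nearest(same), nearest(other)


knowledge_base = [
    ([0.0, 0.0], "cat"),
    ([1.0, 1.0], "cat"),
    ([1.2, 1.0], "dog"),
    ([5.0, 5.0], "dog"),
]
positive, negative = pick_teaching_cases([1.0, 0.9], "cat", knowledge_base)
```

    Returning exactly two cases per sample keeps the teaching burden small while still allowing workers to learn by contrast.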

    To sum up, under this framework, active learning strategies include instance selection, worker selection, transfer-model selection, and even machine teaching methods. The transfer-model selection strategy aims to determine whether the current task requires a transfer model and which transfer model to use. Once selected, the transfer model should be considered in both instance and worker selection. Finally, the human intelligence tasks adopt a two-part graph structure. When workers accept tasks, they also receive some recommended references, which may improve their skills through machine teaching, finally achieving positive-feedback knowledge learning.

    V. RESEARCH TOOLS

    Making open-source tools and datasets publicly available is a trend in recent data-driven research. There have been some open-source research tools for crowdsourcing learning. Nguyen et al. [114] proposed a visual tool, BATC, for label aggregation research, which implements several truth inference algorithms (MV, DS, RY, KOS, and GLAD) but uses synthetic datasets. The advantage of this tool is that it provides an easy-to-use graphical user interface for both evaluation and simulation. Sheshadri and Lease [115] proposed another tool, SQUARE, for truth inference. It does not provide a graphical user interface but provides a set of APIs that make it easy for users to integrate its functions. SQUARE implements and integrates the algorithms MV, DS, RY, GLAD, and ZenCrowd. Moreover, it collects ten real-world crowdsourcing datasets. In [11], the authors provided dozens of implementations of truth inference algorithms together with several real-world datasets. Zhang et al. [116] proposed a tool, CEKA, which involves model learning processes as well as a large number of ground truth inference algorithms. It follows an object-oriented design and is fully compatible with the well-known machine learning tool WEKA [117]. CEKA has a more open architecture, which makes it easy to integrate new algorithms in the future. Table II shows an evaluation example of CEKA. We evaluated seven different truth inference methods implemented and integrated in CEKA on eight real-world crowdsourcing annotation datasets. The experimental results show that GTIC and CrowdLayer have good performance in general. Some traditional inference methods such as MV and DS are not necessarily bad. Venanzi et al. [118] presented an open-source toolkit that allows the easy comparison of the performance of active crowdsourcing learning methods over a series of datasets. Users can construct new strategies by combining aggregation models, task selection methods, and worker selection methods. However, this tool only implemented a few existing algorithms and collected a small number of real-world crowdsourcing datasets.

    In the future, it is worth studying how to integrate open-source research tools with real crowdsourcing platforms to support the entire knowledge learning process. The development of open-source tools still faces some challenges. First, the tools need to simulate the complex behaviors of crowdsourced workers to support in-depth analysis of the performance of different algorithms in different environments. Although quite a few truth inference methods have been proposed in many studies, we still do not have a very clear understanding of the applicability of these algorithms. The behaviors of crowdsourced workers are often not independent, and they may influence each other in large tasks. Second, the tools need to have interfaces that support mainstream crowdsourcing platforms. These interfaces make it easy for researchers to connect to a real crowdsourcing platform and make self-made back-end systems work together with it. Third, the tools need to support mainstream programming languages and frameworks. For example, in the field of deep learning, researchers are used to using the Python language and working with the PyTorch or TensorFlow deep learning frameworks. In addition, in the field of crowdsourcing learning, there is an urgent need for a comprehensive comparison of deep learning methods with traditional learning methods. Finally, visualization is a very necessary but difficult traditional issue.

    TABLE II ACCURACY OF SEVEN TRUTH INFERENCE METHODS ON EIGHT REAL-WORLD DATASETS (IN PERCENTAGE)

    VI. CONCLUSION

    Knowledge learning with crowdsourcing has opened up a broad picture for researchers in AI-related disciplines. This paper summarized the progress in the field from a systematic perspective, including three dimensions of techniques to promote the performance of knowledge learning. More importantly, drawing on our many years of research experience, this paper comprehensively discusses the future research directions in this field from the perspective of knowledge learning for the first time. In the data dimension, we emphasize maximizing the heterogeneous data collection ability of crowdsourcing and aggregating various types of crowdsourcing data. In the model dimension, we emphasize the use of different learning paradigms, training methods, and the characteristics of the learning model to make full use of data. In the systemic dimension, we emphasize the optimization of model training costs, system reliability, and human-machine collaboration.
