
    Restricted Boltzmann machine: Recent advances and mean-field theory*

Chinese Physics B, 2021, No. 4

Aurélien Decelle and Cyril Furtlehner

1 Departamento de Física Teórica I, Universidad Complutense, 28040 Madrid, Spain

2 TAU team, INRIA Saclay & LISN, Université Paris Saclay, 91405 Orsay, France

Keywords: restricted Boltzmann machine (RBM), machine learning, statistical physics

    1. Introduction

During the last decade, machine learning has experienced a rapid development, both in everyday life, with the remarkable success of image recognition used in various applications, and in research,[1,2] where many different communities are now involved. This common effort addresses fundamental questions, such as why these methods work and how to build new architectures, while at the same time searching for new applications of machine learning in other fields, for instance improving biomedical image segmentation[3] or automatically detecting phase transitions in physical systems.[4] Classical machine learning tasks fall into at least two broad categories: supervised and unsupervised learning (putting aside reinforcement learning and the more recently introduced approach of self-supervised learning). Supervised learning consists in learning a specific task, for instance recognizing an object in an image or a word in speech, by giving the machine a set of samples together with the correct answers and correcting the machine's predictions by minimizing a well-designed, easily computable loss function. Unsupervised learning consists in learning a representation of the data via an explicit or implicit probability distribution, hence adjusting a likelihood function on the data. In this latter case, no label is assigned to the data, and the result thus depends solely on the structure of the considered model and of the dataset.

In this review, we are interested in a particular model: the restricted Boltzmann machine (RBM). Originally called Harmonium[5] or product of experts,[6] RBMs were designed[7] to perform unsupervised tasks, even though they can also, in some sense, be used to accomplish supervised learning. RBMs belong to the class of generative models, which aim to learn a latent representation of the data that can later be used to generate new data statistically similar to, but different from, those of the training set. They are Markov random fields (or Ising models for physicists) that were designed as a way to automatically interpret an image using a parallel architecture, including a direct encoding of the probability of each "hypothesis" (latent description of a small portion of an image). Later on, RBMs started to play an important role in the machine learning (ML) community, when a simple learning algorithm introduced by Hinton et al.,[6] contrastive divergence (CD), managed to learn a non-trivial dataset such as MNIST.[8] It was in the same period that RBMs became very popular in the ML community for their capability to pretrain deep neural networks (for instance deep auto-encoders) in a layer-wise fashion. It was then shown that RBMs are universal approximators[9] of discrete distributions, that is, an arbitrarily large RBM can approximate arbitrarily well any discrete distribution (which led to many rigorous results about the modeling mechanism of RBMs[10]). In addition, RBMs can be stacked to form a multi-layer generative model known as a deep Boltzmann machine (DBM).[11] In more recent years, RBMs have continued to attract scientific interest. Firstly, because they can be used on continuous or discrete variables very easily.[12–15] Secondly, because the possible interpretations of the hidden nodes can be very useful.[16,17] Interestingly, in some cases, more elaborate methods such as GANs[18] do not work better.[19] Finally, they can be used for other tasks as well, such as classification or representation learning.[20] Despite all these positive aspects, the learning process itself of the RBM remains poorly understood. The reasons are twofold: firstly, the gradient can only be computed in an approximate way, as we will see; secondly, simple changes may have a dramatic impact on the learning, or interfere completely with the other meta-parameters. For instance, a naive change of variables in the MNIST dataset[21,22] can strongly affect the training performance (in MNIST, it is usual to describe the dataset with binary variables {0,1}; naively taking {±1} instead will dramatically affect the learning of the RBM). In another example, varying the number of hidden nodes, while keeping the other meta-parameters fixed, will affect not only the representational power of the RBM but also the learning dynamics itself.

The statistical physics community, on its side, has a long tradition of studying inference and learning processes with its own tools. Using idealized inference problems, it has managed in the past to shed light on the learning process of many ML models. For instance, in the Hopfield model,[23–26] a retrieval phase was characterized in which the maximum number of patterns that can be retrieved is expressed as a function of the temperature. Another example is the computation of the storage capacity of the perceptron[27] on synthetic datasets.[28,29] In these approaches, the formalism of statistical physics explains the macroscopic behavior of the model in terms of its position on a phase diagram in the large size limit.

From a purely technical point of view, the RBM can be seen by a physicist as a disordered Ising model on a bipartite graph. Yet, the difference with respect to the usual models studied in statistical physics is that the phase diagram of a trained RBM involves a highly non-trivial coupling matrix whose components are correlated as a result of the learning process. These dependencies make it non-trivial to adapt classical tools from statistical mechanics, such as replica theory.[30] We will illustrate in this article how methods from statistical physics have nonetheless helped to characterize the equilibrium phase of an idealized RBM whose coupling matrix has a structured spectrum, and how the learning dynamics can be analyzed in some specific regimes, both results being obtained with traditional mean-field approaches.

The paper is organized as follows. We first give the definition of the RBM and review the typical learning algorithm used to train the model in Section 2. Then, in Section 3, we review different types of RBMs obtained by changing the prior on their variables and show explicit links with other models. In Section 4, we review two approaches that characterize the phase diagram of the RBM, and in particular its compositional phase, based on two different hypotheses on the structure of the parameters of the model. Finally, in Section 5, we present some theoretical developments that help to understand the formation of patterns inside the machine and show how mean-field or TAP equations can be used to learn the model.

    2. Definition of the model and learning equations

    2.1. Definition of the RBM

The RBM is a Markov random field on a bipartite graph: a visible layer σ = {σ_i, i = 1, ..., N_v} carrying the data and a hidden layer τ = {τ_a, a = 1, ..., N_h} carrying the latent representation, coupled through the weight matrix w and subject to the local biases η and θ. Its Hamiltonian reads

$$\mathcal{H}[\sigma,\tau] = -\sum_{i,a}\sigma_i w_{ia}\tau_a - \sum_i \eta_i\sigma_i - \sum_a \theta_a\tau_a, \qquad (1)$$

from which we define a Boltzmann distribution

$$p(\sigma,\tau) = \frac{e^{-\mathcal{H}[\sigma,\tau]}}{Z},$$

where Z is given by the partition function

$$Z = \sum_{\sigma,\tau} e^{-\mathcal{H}[\sigma,\tau]}.$$

The structure of the RBM is presented in Fig. 1, where the visible nodes are represented by black dots, the hidden nodes by red dots, and the weight matrix by blue dotted lines.

    Fig.1. Bipartite structure of the RBM.

Historically, the RBM was first defined with binary {0,1} variables for both the visible and the hidden nodes, in line with the sigmoid activation function of the perceptron, hence being directly interpretable as a spin-glass model of statistical mechanics. A more general definition is considered here by introducing a prior distribution function for both the visible and hidden variables, allowing us to consider discrete or continuous variables. This generalization will allow us to see the links between RBMs and other well-known models of machine learning. From now on, we will write all the equations for the generic case, using the notations q_v(σ) and q_h(τ) to indicate an arbitrary choice of "prior" distribution. Averaging over the RBM measure corresponding to Hamiltonian (1) will then be denoted by

$$\langle f(\sigma,\tau)\rangle_{\mathcal{H}} = \frac{1}{Z}\sum_{\sigma,\tau} q_v(\sigma)\,q_h(\tau)\, f(\sigma,\tau)\, e^{-\mathcal{H}[\sigma,\tau]},$$

where Σ can represent both discrete sums and integrals, and with the RBM distribution defined from now on as

$$p(\sigma,\tau) = \frac{q_v(\sigma)\,q_h(\tau)\,e^{-\mathcal{H}[\sigma,\tau]}}{Z}. \qquad (3)$$

The gradient with respect to (w.r.t.) the different parameters then takes a simple form. Let us detail the computation of the gradient w.r.t. the weight matrix. By deriving the log-likelihood w.r.t. the weight matrix, we get

$$\frac{\partial \mathcal{L}}{\partial w_{ia}} = \langle \sigma_i\tau_a\rangle_{\rm data} - \langle \sigma_i\tau_a\rangle_{\mathcal{H}}, \qquad (4)$$

where we have used the following notation: ⟨·⟩_data denotes the average over the training samples, with the hidden nodes drawn from their conditional distribution given each sample, while ⟨·⟩_H denotes the average over the model distribution. The gradients for the biases (or magnetic fields) are

$$\frac{\partial \mathcal{L}}{\partial \eta_i} = \langle \sigma_i\rangle_{\rm data} - \langle \sigma_i\rangle_{\mathcal{H}}, \qquad \frac{\partial \mathcal{L}}{\partial \theta_a} = \langle \tau_a\rangle_{\rm data} - \langle \tau_a\rangle_{\mathcal{H}}.$$

It is interesting to note that the gradient in expression (4) is very similar to the one obtained in the traditional inverse Ising problem, with the difference that in the inverse Ising problem the first term (sometimes coined the "positive term") depends only on the data, while for the RBM we have a dependence on the model (yet simple to compute). Once the gradient is computed, the parameters of the model are updated in the following way:

$$w_{ia}^{t+1} = w_{ia}^{t} + \gamma\,\frac{\partial \mathcal{L}}{\partial w_{ia}},$$

where γ, called the learning rate, tunes the speed at which the parameters are updated in a given direction, the superscript t being the index of iteration. A continuous limit of the learning process can be formally defined by considering t real, replacing t+1 by t+dt and γ by γ dt, and letting dt → 0.
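For concreteness, the update above can be sketched in a few lines of numpy. This is an illustrative sketch with names of our own choosing, assuming binary {0,1} units; the estimation of the negative term from model samples is discussed in the next subsection:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gradient_step(w, eta, theta, v_data, v_model, gamma=0.01):
    """One ascent step on the log-likelihood of a {0,1} RBM.

    v_data:  (M, Nv) mini-batch of training samples.
    v_model: (M, Nv) samples from the model (e.g., from CD-k or PCD chains).
    """
    # Hidden conditional means given visible configurations.
    h_data = sigmoid(v_data @ w + theta)    # positive term
    h_model = sigmoid(v_model @ w + theta)  # negative term

    # Gradient of the log-likelihood, Eq. (4), and the bias gradients.
    dw = (v_data.T @ h_data - v_model.T @ h_model) / len(v_data)
    deta = v_data.mean(axis=0) - v_model.mean(axis=0)
    dtheta = h_data.mean(axis=0) - h_model.mean(axis=0)

    # Parameter update with learning rate gamma.
    return w + gamma * dw, eta + gamma * deta, theta + gamma * dtheta
```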

    2.2. Stochastic gradient descent

In Ref. [31], it is argued that this procedure is roughly equivalent to minimizing the following difference of KL divergences:

$$\mathrm{CD}_k = D_{\mathrm{KL}}(p_0\,\|\,p) - D_{\mathrm{KL}}(p_k\,\|\,p),$$

up to an extra term considered to be small, without much theoretical guarantee. The major drawback of this method is that the phase space of the learned RBM is never explored, since we limit ourselves to k MC steps around the data configurations; it can therefore lead to a very poor estimate of the probability distribution for configurations that lie "far away" from the dataset. A simple modification has been proposed to deal with this issue in Ref. [32]. The new algorithm is called persistent-CD (pCD) and again consists of a set of parallel MC chains, but instead of using the dataset as initial condition, the chains are first initialized from random initial conditions, and their state is then saved from one update of the parameters to the next. In other words, the chains are initialized once at the beginning of the learning and are then constantly updated, a few MC steps at a time, at each update of the parameters. In that case, it is no longer necessary to have as many chains as samples in the mini-batch, even though, in order to keep the statistical error comparable between the positive and the negative terms, the two numbers should be of the same order. More details can be found in Ref. [32] about pCD and in Ref. [33] for a more general introduction to the learning behavior using MC. In Section 5, we will attempt to understand some theoretical and numerical aspects of the RBM learning process.
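A minimal sketch of the persistent chains of pCD may look as follows. Again, this is illustrative, with hypothetical names; gradient_step is the update sketched in Subsection 2.1, and the stand-in mini-batches would in practice come from the training set:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, w, eta, theta, rng):
    """One block-Gibbs sweep: sample hidden given visible, then visible given hidden."""
    ph = sigmoid(v @ w + theta)
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = sigmoid(h @ w.T + eta)
    return (rng.random(pv.shape) < pv).astype(float)

rng = np.random.default_rng(0)
Nv, Nh, n_chains, k = 100, 50, 64, 10
w = 0.01 * rng.standard_normal((Nv, Nh))
eta, theta = np.zeros(Nv), np.zeros(Nh)

# pCD: the chains are initialized once, at random, and persist across
# parameter updates instead of being re-initialized on the data (CD).
chains = (rng.random((n_chains, Nv)) < 0.5).astype(float)

# Stand-in mini-batches; in practice these come from the training set.
minibatches = [(rng.random((64, Nv)) < 0.5).astype(float) for _ in range(100)]

for batch in minibatches:
    for _ in range(k):          # only k MC steps per parameter update
        chains = gibbs_step(chains, w, eta, theta, rng)
    # gradient_step is the update sketched in Subsection 2.1.
    w, eta, theta = gradient_step(w, eta, theta, batch, chains)
```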

    3. Overview of various RBM settings

Before investigating the learning behavior of RBMs, let us have a glimpse at various RBM settings and their relations to other models, by looking at common priors used for the visible and hidden nodes.

    3.1. Gaussian–Gaussian RBM

    The most elementary setting is the linear RBM, where both visible and hidden nodes have Gaussian priors:

    Writing the distribution in this new basis we obtain

    and we obtain a solution of the form

    with

For δ_α ≪ 1, we get a sigmoid-type behavior

    with


    so that finally we get a dynamical system of the form

Note that, at fixed x_1 and x_2, the dynamics of θ corresponds to the motion of a pendulum w.r.t. the variable θ′ = 4θ, as shown in Fig. 2.

Fig. 2. Angle between the reference basis given by the data and the moving one given by the RBM, shown in the top left panel. The equivalence with the motion of a pendulum is indicated in the bottom left panel. Solution of Eqs. (18)–(20) for two coupled modes in the linear RBM (right panel).

    3.2. Gaussian-spherical

To conclude this section, let us also mention that the finite-size regime is amenable to an exact analysis when the weight-matrix spectrum is restricted to be doubly degenerate (see Ref. [40] for details).

    3.3. Gaussian-softmax

The Gaussian mixture model, though rarely viewed this way, actually fits the RBM architecture perfectly. Consider here the case of Gaussian visible nodes and a set of discrete {0,1} hidden variables with a constraint corresponding to the softmax activation function[42]

With this formulation, we indeed see that the conditional probability of activating a hidden node is a softmax function

$$p(\tau_a = 1\mid\sigma) = \frac{e^{\sum_i w_{ia}\sigma_i + \theta_a}}{\sum_b e^{\sum_i w_{ib}\sigma_i + \theta_b}}.$$

If we impose that the gradient be zero, now performing the "maximization" step, we obtain

Interestingly, the threshold does not depend on the value of γ in that case, meaning that the instability is a generic property of the learning dynamics. The only change is the speed at which the instabilities develop.
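The correspondence with a Gaussian mixture can be made concrete with a small sketch (our own illustrative code, not the paper's): the softmax responsibilities of the one-hot hidden layer given Gaussian visible data play the role of the E-step of EM.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hidden_responsibilities(v, w, theta):
    """P(tau_a = 1 | v) for a one-hot hidden layer: a softmax over the
    fields w^T v + theta, i.e., the E-step of a Gaussian mixture whose
    component means are (up to scaling) the columns of w."""
    return softmax(v @ w + theta, axis=1)

# Toy usage: 3 "components", 2d Gaussian data.
rng = np.random.default_rng(1)
v = rng.standard_normal((5, 2))
w = rng.standard_normal((2, 3))
theta = np.zeros(3)
print(hidden_responsibilities(v, w, theta))  # rows sum to 1
```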

    3.4. Bernoulli–Gaussian RBM

The next case is the Bernoulli–Gaussian RBM, where we consider the following prior:

Again, a Gaussian prior implies that the activation function is Gaussian. It is interesting to consider this version of the RBM through its relation with the Hopfield model,[23] which was established in Ref. [48]. Since the hidden variables are Gaussian, they can be integrated out, which leads to a simple analytical form for the marginal over the visible variables. In some recent works, the opposite route has been taken, starting from a Hopfield model and expressing it as an RBM using the Hubbard–Stratonovich (HS) transformation (expressing the exponential of a square as a Gaussian integral) to decouple the interactions between spins.[49,50] After integrating over the hidden nodes in Eq. (3), we end up with the following distribution (for centered, unit-variance hidden priors):

$$p(\sigma) \propto q_v(\sigma)\,\exp\Big(\sum_i \eta_i\sigma_i + \frac{1}{2}\sum_{i,j}\Big(\sum_a w_{ia}w_{ja}\Big)\sigma_i\sigma_j\Big).$$
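For completeness, the Gaussian integration behind this marginal is a sketch of the standard identity (written here for centered, unit-variance hidden priors; a hidden bias θ_a would simply shift the field):

$$\int \frac{\mathrm{d}\tau_a}{\sqrt{2\pi}}\, e^{-\tau_a^2/2}\, e^{\tau_a\sum_i w_{ia}\sigma_i} = \exp\Big[\frac{1}{2}\Big(\sum_i w_{ia}\sigma_i\Big)^2\Big],$$

and taking the product over a yields the pairwise coupling appearing just below.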

We recognize a Hopfield model where the patterns are given by the weights w_ia of the RBM and the effective coupling between two variables i and j is J_ij = Σ_a w_ia w_ja. We can also consider the variances of the hidden nodes to be related to the temperature of the model.

With this machine, it is also possible to impose a maximum rank in order to reduce the number of parameters needed to describe the dataset, allowing a trade-off between a good description of the dataset and the number of parameters. This property has been used in Ref. [50] to find global patterns in protein folding, using the RBM version of the Hopfield model with q discrete states.

    3.5. Gaussian–Bernoulli RBM

At this point, we focus on models where the hidden layer has a stronger impact. The integration over the hidden layer no longer yields a simple analytical form, which makes it difficult to understand the effect of the features and to characterize the learning dynamics properly. We first mention the Gaussian–Bernoulli case, dealing with the following priors:

When using the discrete {0,1} variables, we obtain the sigmoid activation function for the hidden nodes

$$p(\tau_a = 1\mid\sigma) = \mathrm{sig}\Big(\sum_i w_{ia}\sigma_i + \theta_a\Big).$$

    3.6. Bernoulli–Bernoulli RBM

The last model here is traditionally the one implied when speaking of an RBM. In this case, both the visible and hidden nodes are in {0,1}, with the following priors:

The activation functions are sigmoid functions for both the hidden and visible nodes

In that case, the prior distribution has the advantage of not having any free parameter to be determined. In practice, this model is used when dealing with a discrete dataset, while the Gaussian–Bernoulli one is used for continuous data. This model can also be generalized to the case where the hidden nodes take more than two states; see Ref. [53] for more details on this approach.

where we have defined the sigmoid function sig(x) = (1 + exp(−x))^{−1}. The r.h.s. of Eq. (29) is very close to the RELU activation function RELU(x) = max(0, x), hence showing that having all these replicas gives an activation function similar to RELU. In practice, it is not very efficient to have a large number of sigmoids in the training algorithm. An approximation is found by using the truncated Gaussian distribution. The average number of activated replicas is then given by

where now τ_a is a RELU hidden node and σ_a is the variance associated with the number of activated replicas for the hidden node a. Equation (30) can now be seen as an approximation of the truncated-Gaussian prior for the hidden nodes
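As a quick numerical illustration of this point (our own snippet, following the classic observation that a sum of binary replicas with shifted biases approximates a softplus, itself close to RELU):

```python
import numpy as np

def sig(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
# N binary replicas sharing the same weights, with biases offset by -(i - 0.5):
replica_sum = sum(sig(x - i + 0.5) for i in range(1, 101))
softplus = np.log1p(np.exp(x))          # smooth RELU
print(np.c_[x, replica_sum, softplus, np.maximum(0.0, x)])
```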

In the following section, we will focus mainly on the Bernoulli–Bernoulli setting, its equilibrium phase diagram, and its learning dynamics in the mean-field regime.

    4. Phase diagram of the Bernoulli–Bernoulli RBM

4.1. Mean-field approach, the random-RBM

This MF approach to the macroscopic behavior of the RBM is based on statistical ensembles with iid elements of the weight matrix. Here, a random ensemble for the weight matrix is defined as follows: the weight matrix is constructed from a set of patterns, and each pattern element is selected to be

Using this definition, the degree of sparsity of the system is p = Σ_i p_i/N_v. The term random-RBM was coined by Tubiana et al.,[58] but Agliari et al.[63,64] worked on a similar model, although with a different theoretical approach. In particular, they computed the phase diagram in Ref. [57]. We start by reproducing here the argument of Agliari, developed for an RBM with a finite number of patterns, before switching to the replica computation done in Tubiana's thesis.[60]

recovering the Hopfield model with a squared inverse temperature, where we define the magnetization along pattern a as m_a = (1/N_v) Σ_i w_ia σ_i. In Ref. [63], the authors considered a weight dilution, as in Eq. (32), applied to the above binary–binary RBM, or equivalently to a Hopfield model with a rescaled temperature. It is important to mention that this is a different procedure from diluting the network itself; see Ref. [65] for more details on that case. Having sparse patterns allows the network to retrieve more than one pattern at a time. In particular, global minima of the free energy can have an overlap with many patterns, and locally stable states can be composed of a complex mixture of patterns. We reproduce below, in Fig. 3, the plot from Ref. [63] showing the overlap with three and six patterns in the (almost) zero-temperature limit. We observe in the left panel that one pattern is fully retrieved when the dilution is low. Then, when increasing p_i, more and more patterns are retrieved together, until the system enters a paramagnetic phase at high dilution.

Fig. 3. From Ref. [63]. Overlap with different patterns when varying the dilution factor p (named d in the figure) at low temperature. Left: a case with 3 patterns, where we can observe that at small dilution only one pattern is fully retrieved, while the second and third ones appear at larger dilution. Right: a case with 6 patterns, where the figure is zoomed into the high-dilution region, where the branching phenomenon occurs and all the overlaps converge toward the same value.

Replica approach of the random-RBM. We will now follow the approach of Tubiana et al.[60] and give more details on the derivation. This approach is based on a Bernoulli-RELU architecture, giving the possibility of continuous positive values for the hidden variables.

The characterization of the phase diagram is based on the determination of the free energy in the thermodynamic limit. Given the weight ensemble (see Eq. (32)), the weight matrix is now made of independent and sparse elements. In this context, the replica analysis can be used to perform the quenched average. The replicated interaction term can be computed easily and gives

for the interaction term (ia). The interaction between the visible and hidden nodes can be decoupled using the HS transformation

introducing the spin-glass order parameter over the replicas (we denote by p, q, ... the replica indices)

• For p < 1, a ferromagnetic transition is found when imposing L = 1, where one pattern is recalled at a time.

• For p < 1, when all the hidden nodes are weakly activated, a SG phase is found.

In this analysis, it is demonstrated that among the possible equilibrium behaviors of the random-RBM there is an interesting phase mixing many patterns, which characterizes in some way the efficient operating regime of a learned RBM. It is of course a simplified case, where the patterns are {±1} with a certain dilution factor. Now, the fact that there exists a family of weights for which this phase exists is quite different from showing that the learning dynamics converges toward such a phase, and how. In Tubiana's thesis, a stability analysis of the different phases is performed, showing that for a range of parameters of the RBM the compositional phase is indeed the dominant one. Then, a number of numerical results are provided on the MNIST dataset which tend to confirm that the behavior of the learned RBM looks similar to a "compositional phase". It would therefore be of great interest to characterize the learning curve theoretically in order to understand how this phase is reached. It is also interesting to mention a recent work investigating the role of the diluted weights[66] during learning in an RBM with one hidden node. In this article, it is shown that the proportion of diluted weights tends to vanish during the learning procedure. This might be a signal that, when the number of hidden features is very low, the RBM automatically adjusts itself to the ferromagnetic phase described above, learning a global pattern of the dataset.

    4.2. Mean-field approach using rank K weight matrix

The difficulty with the RBM is to study the phase diagram of the model without discarding the fact that, during the learning, the weights w_ia become correlated with each other: starting from independently distributed w_ia, we can observe how the spectrum of the weight matrix is modified during the learning (see Fig. 11 for instance). Classical approaches in statistical mechanics consider a set of independent, identically distributed weights before computing the quenched free energy of the system by the replica trick (in a few words: considering the quantity Z^n for a given integer n, where Z is the partition function, we can expand Z^n ≈ 1 + n log(Z) for small n. The key point here is that it is generally possible to compute the quenched average of Z^n and then make a small-n expansion). In the present case, the hypothesis of independent weights cannot hold, as can be seen by comparing the spectrum of the weight matrix at the beginning of the learning with the one a few iterations later. The absorption of information by the machine prompts the development of strong correlations. This phenomenon is illustrated in Subsection 5.3 and in Fig. 11. In order to understand how these eigenvalues affect the phase diagram of the system, it is reasonable to assume a particular statistical ensemble for the weight matrix, of the form

    Then, the form of the weight matrix, Eq. (33), leads to the following change of variable:

    we obtain two additional order parameters

where E indicates an average over the variables in subscript and κ = N_h/N_v. The quantities A and B are given by

    where

A first look at the equations for the magnetization over mode α tells us that they correspond to the usual mean-field equations of the Sherrington–Kirkpatrick model[67] projected onto the SVD decomposition of the weight matrix. The same is true for the overlap, with the difference that we have an overlap parameter for each layer. Analyzing these equations, we can distinguish three phases.

Fig. 4. Left: the phase diagram of the model. The y-axis corresponds to the variance of the noise matrix, the x-axis to the value of the strongest mode of the weight matrix. We see that the ferromagnetic phase is characterized by strong mode eigenvalues. In this phase, the system can behave either by recalling one eigenmode of the weight matrix or by composing many modes together (compositional phase). For the sake of completeness, we indicate the AT region where the replica-symmetric solution is unstable, but for practical purposes we are not interested in this phase. Right: an example of a learning trajectory on the MNIST dataset (in red) and on a synthetic dataset (in blue). It shows that, starting from the paramagnetic phase, the learning dynamics brings the system toward the ferromagnetic phase by learning a few strong modes.

• γ = 0, e.g., the Gaussian distribution. In this case, only the strongest mode is stable, and the weaker ones are unstable w.r.t. the strongest one. Here, the system will condense along the strongest mode only.

• γ < 0, e.g., the uniform or the Bernoulli distribution. Here the weaker modes can be metastable if they are not "too far away" from the strongest one. However, the system will condense along only one mode.

• γ > 0, e.g., a sparse Bernoulli or the Laplace distribution. In this case, the strongest mode is unstable w.r.t. weaker ones, leaving the possibility of condensation over many modes at the same time. This corresponds to a dual compositional phase, by reference to the terminology introduced in Ref. [58], which corresponds to a combination of features instead of modes.

    5. Learning RBM

    5.1. Learning dynamics for exactly solvable RBMs

Fig. 5. On this artificial dataset, we observe that the eigenvalues are learned one after the other and reach the threshold indicated by Eq. (15). Inset: the alignment of the first four principal directions of the SVD of the weight matrix with those of the dataset. In red, we observe that the likelihood function increases each time a new mode emerges.

Fig. 6. Left: the learning curves for the modes w_α using an RBM with (N_v, N_h) = (100, 100) learned on a synthetic dataset distributed in the neighborhood of a 20d ellipsoid embedded in a 100d space. Here the modes interact: the weaker modes push the stronger ones higher, and they all accumulate at the top of the spectrum, as explained in Subsection 3.2. Right: a scatter plot projected on the first two SVD modes of the training data (blue) and of data sampled from the learned RBM (red), for a problem in dimension N_v = 50 with two condensed modes. We can see that the learned matrix captures the relevant directions and that the RBM generates data very similar to those of the training set.

    5.2. Pattern formation in the 1D Ising chain

In a recent work,[69] the formation of features is studied analytically in an RBM with one hidden node. The training dataset is generated from a 1D Ising chain with a uniform coupling constant and periodic boundary conditions. The model used for generating the data has a translational symmetry, which is exploited to solve the learning dynamics exactly. There is indeed a closed-form expression available for the correlation function, which, thanks to the translation invariance, depends only on the relative distance between the variables. Numerically, it is found that

• Using more hidden nodes (but still few), it is observed that each feature is peaked at a different place, the features repelling each other so as to encode the correlation patterns of the data. Again, the positions of the peaks diffuse with time, even though some repulsive interaction seems to forbid them from crossing. See Fig. 7, taken from Ref. [69], which illustrates this phenomenon.

Now, in Ref. [69], the author computed the gradient of a system with one hidden node

This expression can be expanded up to fourth order in w (and β), giving, in the case of the 1D Ising chain,

It is easy to identify in the first two terms the 1D discrete spatial diffusion. In the continuous time limit, this equation can be cast into a spatial diffusion equation with an additional term (see Ref. [69] for more details). From this small-coupling expansion, it is also possible to study the stationary solution in the one-hidden-node case and to show that it is consistent with experimental results: it describes a peaked function decreasing rapidly as the distance from the center increases. An approximate weak-coupling equation can also be derived in the case of two hidden units. In this case, an effective coupling between the two feature vectors w_1 and w_2 is present and is responsible for a repulsive interaction between the two peaks.

This illustrates nicely how the features learned by the RBM tend to describe local correlations between variables. In addition, these features diffuse over the whole system during the learning to restore the translational symmetry, without crossing, thanks to a repulsive interaction between them. In the next section, we will focus on the learning behavior on the MNIST dataset and see that, in that case, the learned features similarly describe local correlations.

Fig. 7. Left: figure from Ref. [69], showing the value of w_i for each visible site of an RBM with 3 hidden nodes trained on the dataset of the 1D homogeneous Ising model with periodic boundary conditions. We see three similar peak-shaped potentials of comparable magnitude. Each peak intends to reproduce the correlation pattern around a central node, and therefore cannot reproduce the translational symmetry of the problem. Right: figure from Ref. [69], showing the position of the three peaks as a function of the number of training epochs. We observe that the peaks diffuse while repelling each other. The diffusion aims at reproducing the correlation patterns of the translational symmetry, while the repelling interaction ensures that two peaks will not overlap.

    5.3. Pattern formation in MNIST:from SVD to ICA?

The pattern formation mechanism can be studied numerically on the MNIST dataset. MNIST[8] is one of the most used real datasets in machine learning; it contains 60000 images of black-and-white handwritten digits of 28×28 pixels, ranging from 0 to 9. The digits are all about the same size and are centered in the image. They are illustrated in Fig. 8.

    Fig.8. A subset of the MNIST dataset.

Additionally, at this stage of the learning, the MC samples obtained from the RBM are typically prototypes: each sample is almost identical to (or has a large overlap with) a learned feature. In fact, during the training, if we monitor samples at each epoch (keeping a low learning rate), we can see that the samples have a high overlap with one mode at the beginning, and later on with combinations of modes. To be more precise, we can distinguish different stages of the learning by inspecting the features, the produced samples, and the distances between the discretized features (taking the sign of each feature and computing the overlap). We illustrate these different stages in Fig. 10.
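As a concrete illustration (our own snippet, not from the original paper), the inter-feature distances of Fig. 10 can be monitored by binarizing the features and computing their pairwise overlaps:

```python
import numpy as np

def feature_overlaps(w):
    """Pairwise overlaps q_ab = (1/Nv) sum_i sign(w_ia) sign(w_ib)
    between binarized features (the columns of w)."""
    s = np.sign(w)                           # W_{+-1} = sign(W)
    q = (s.T @ s) / w.shape[0]               # (Nh, Nh) overlap matrix
    return q[np.triu_indices_from(q, k=1)]   # off-diagonal entries for a histogram

w = np.random.default_rng(2).standard_normal((784, 100))  # stand-in for learned weights
print(feature_overlaps(w).mean())  # near 0 for random features; grows as features align
```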

    To summarize we have identified the following stages:

• Stage 2: the RBM enters the ferromagnetic phase; the first, strongest mode of the SVD is learned by all features, giving a high positive or negative overlap in the inter-feature distances, while the generated samples have a high overlap with the learned features.

• Stage 3: many modes have emerged, but the learned features remain global and close to the modes of the dataset. The histogram of distances becomes much broader, but the generated samples basically correspond to the learned features, with little variety. The RBM is in a pure Mattis phase, analogous to the recall phase of the Hopfield model.

• Stage 4: finally, after a much longer period, we observe that the learned features look much like an ICA decomposition, while the distances between features are still centered at zero but with a much smaller variance. The generated samples now look very similar to the provided dataset. The RBM is in a compositional phase, both regarding the features and the modes (the dual one).

Fig. 9. Left: the first 10 modes of the MNIST dataset (top) and of the RBM (bottom) at the beginning of the learning. The similarity between most of them is clearly visible. Right: 100 random features of the RBM at the same moment of the learning. Comparing with the top-left panel, we can see that most features correspond to a mode of the dataset.

Fig. 10. The columns represent, respectively, (i) the first hundred learned features, (ii) the histogram of distances between the binarized features W_{±1} = sign(W), and (iii) 100 samples generated from the learned RBM. The first row corresponds to the beginning of the learning, when only one feature is learned. Looking at the histogram, we see that most of the features have a high overlap. Also, the MC samples are all similar to the learned features. In the second row, the RBM has learned many features, and therefore the histogram is wider, but still centered at zero. The MC sampling, however, is only capable of reproducing one of the learned features. In the last row, the learning is much more advanced. The features tend to be very localized, and the samples now correspond to digits.

Fig. 11. (a) Singular-value distribution of the initial random matrix compared to the Marchenko–Pastur law. (b) As the training proceeds, we observe singular values passing above the threshold set by the Marchenko–Pastur law. (c) Distribution of the singular values after a long training: the Marchenko–Pastur distribution has disappeared and been replaced by a fat-tailed distribution of singular values spreading mainly above the threshold, together with a peak of below-threshold singular values near zero. The distribution does not come close to any standard random-matrix ensemble spectrum.
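The initial condition of panel (a) is easy to check numerically (our own sketch; for an N_v × N_h matrix with iid entries of variance σ²/N_v, the Marchenko–Pastur upper edge for the singular values is σ(1 + √κ), with κ = N_h/N_v):

```python
import numpy as np

rng = np.random.default_rng(3)
Nv, Nh, sigma = 1000, 500, 1.0
kappa = Nh / Nv

# iid weights with variance sigma^2 / Nv, as at initialization
w = rng.normal(0.0, sigma / np.sqrt(Nv), size=(Nv, Nh))
svals = np.linalg.svd(w, compute_uv=False)

mp_edge = sigma * (1.0 + np.sqrt(kappa))   # Marchenko-Pastur upper edge
print(svals.max(), mp_edge)                # close for large Nv
print((svals > mp_edge).sum())             # ~0 modes above threshold before training
```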

In future works, it could be interesting to understand the mechanism leading to the localization of the features, in particular whether it is related to some specific tail of the weight-matrix spectrum. An aspect of RBMs completely absent from the previous description of the learning process is the behavior of the biases associated with the hidden nodes. These are very important, since they determine the threshold above which the features are activated, and their learning dynamics is quite intertwined with the mode dynamics. This aspect of the learning would be worth studying, especially to improve present learning algorithms.

    5.4. Learning RBM using TAP equations

As already said, the difficulty of learning an RBM comes from the negative term, which requires computing the thermal average of correlations between visible and hidden nodes. In particular, when the machine starts to learn many modes, it becomes more and more difficult to estimate this term correctly using Monte Carlo methods, due to the eventually large relaxation time. In addition, to get a precise measurement, it is necessary to obtain many statistically independent samples in order to reduce the statistical error.

In this section, we derive the mean-field self-consistent equations that can be used to approximate the negative term, using a high-temperature expansion of the Boltzmann measure. We illustrate the method by showing the results of Gabrié et al.,[71] where an RBM has been trained using the TAP equations. An interesting derivation using a variational approach in the Gaussian–Bernoulli case has also been done in Ref. [72].

High-temperature (Plefka) expansion. We review here a famous approach using a high-temperature expansion of the system in order to compute the mean-field magnetizations. This method is both very simple to implement and provides a way to approximate the free energy of the system in the weak-coupling regime. Recent successful approaches[73,74] showed how it is possible to train an RBM using these mean-field equations. In this subsection, we will use {±1} binary variables for simplicity.

For the Ising model, it is well known that the (naive) mean-field (nMF) approximation can be written as a set of self-consistent equations on the magnetizations,

$$m_i = \tanh\Big(h_i + \sum_j J_{ij} m_j\Big),$$

and the associated approximation of the free energy can be computed as a function of these magnetizations.

These equations can be translated directly to the case of the RBM; one only needs to specify clearly which variables are the visible ones and which are the hidden ones. One gets the following:

$$m_i^v = \tanh\Big(\eta_i + \sum_a w_{ia} m_a^h\Big), \qquad m_a^h = \tanh\Big(\theta_a + \sum_i w_{ia} m_i^v\Big).$$

Using this expression, we can follow Ref. [77] and compute the magnetizations from the high-temperature expansion of the following free energy:

With our Hamiltonian, we can easily compute the first and second orders,

    where we have used the following identities for the second order computation:

As shown in Ref. [73], the expansion can easily be extended to the third order without a large computational cost, thanks to the particular topology of the RBM. Deriving the free energy obtained at this order w.r.t. the magnetizations, we obtain the self-consistent set of equations defining the TAP equations for the RBM

Hence, a solution of the TAP equations should satisfy Eqs. (41) and (42), and gives us at the same time the approximate free energy associated with this solution

We can now use these mean-field equations to learn the RBM. First, we need to take into account the fact that many solutions of Eqs. (41) and (42) exist, each one with a given value of the free energy. Hence, the partition function can be approximated by

where the sum runs over all the possible solutions of the mean-field equations (41) and (42), weighted by the free energy given in Eq. (43). Using this approximation in the computation of the likelihood, we obtain the following gradient:

    where

corresponds to the model average over all the solutions of the mean-field equations. We can see here a notable difference with the approach developed in Ref. [73]. In that work, Gabrié et al. run the sums over all the fixed points obtained from the mean-field equations, divided simply by the number of fixed points. The risk is that, if the mean-field equations converge toward a fixed point that is suboptimal (having a high free energy) or even spurious (being a maximum of the free energy), the estimation of the negative term will be polluted by such fixed points. More details on the Plefka expansion for the bipartite Ising model can be found in Ref. [78]. As a final remark, let us insist on the fact that, even if the convergence of the TAP equations is not guaranteed, convergence problems are practically never met in the ferromagnetic phase. On the contrary, such problems occur quite often in the spin-glass phase, which we wish to avoid in the context of learning the RBM.
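To make the fixed-point search concrete, here is a minimal sketch (our own code, with {±1} variables and hypothetical names) of a damped iteration of second-order TAP equations of the above form; the subtracted terms are the Onsager reaction corrections:

```python
import numpy as np

def tap_fixed_point(w, eta, theta, n_iter=200, damping=0.5, tol=1e-8, seed=0):
    """Damped iteration of second-order TAP equations for a {+-1} RBM.

    m_v and m_h are the visible and hidden magnetizations; the last term
    in each field is the Onsager reaction correction.
    """
    rng = np.random.default_rng(seed)
    m_v = 2.0 * rng.random(w.shape[0]) - 1.0   # random init in (-1, 1)
    m_h = 2.0 * rng.random(w.shape[1]) - 1.0
    w2 = w ** 2
    for _ in range(n_iter):
        new_h = np.tanh(theta + w.T @ m_v - m_h * (w2.T @ (1.0 - m_v**2)))
        m_h = damping * m_h + (1.0 - damping) * new_h
        new_v = np.tanh(eta + w @ m_h - m_v * (w2 @ (1.0 - m_h**2)))
        if np.max(np.abs(new_v - m_v)) < tol:   # converged fixed point
            m_v = new_v
            break
        m_v = damping * m_v + (1.0 - damping) * new_v
    return m_v, m_h
```

In practice, iterating from several random initial conditions collects the multiple fixed points entering the sum above.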

Experiment with TAP learning. We show here some results obtained on MNIST using the same parameters as above, but with the mean-field approximation taken from Ref. [73]. Here, the comparison is done using the persistent-chain algorithm, where a set of MC chains is maintained all along the learning, whether using CD, nMF, or the TAP approximation (in the case of nMF or TAP, the chains are updated using the corresponding self-consistent equations); see Fig. 12.

Fig. 12. Top: figure taken from Ref. [73], showing samples taken from the permanent chain at the end of the training of the RBM. The first two lines correspond to samples generated using PCD, the next two lines to samples obtained using the P-nMF approximation, and the last two to samples obtained using P-TAP. Bottom: 100 features obtained after the training; we can see that they are qualitatively very similar to the ones obtained when training the RBM with P-TAP.

First, we see that the samples generated by the three methods are qualitatively similar. Second, the learned features are also qualitatively similar to those of the PCD case. Therefore, on the MNIST dataset, the two machines can hardly be distinguished just by looking at the generated samples and learned features, indicating that the MF/TAP approximation works very well. It is also important to point out that the advantage of the mean-field approximation here does not lie in any speed-up of the learning procedure. More importantly, it provides complementary tools, such as the fixed points as local maxima of the free energy together with their associated free energies. For instance, in Ref. [79] an RBM is used as a prior distribution in the context of compressed sensing, where the mean-field equations are used to infer equilibrium values of the variables. In Ref. [80], the RBM is used to reconstruct images from partial observations, again using the mean-field formulation to infer the state of the missing information.

    5.5. Mean-field learning: ensemble average

In the approach developed in Ref. [36], using the statistical ensemble defined in Subsection 4.2, it is possible to obtain a mean-field estimate of the response functions involved in the gradient of the log-likelihood. For the response term on the data, we get

Fig. 13. Top panel: results for an RBM of size (N_v, N_h) = (1000, 500) learned on a synthetic dataset of 10^4 samples having 20 clusters randomly located in a sub-manifold of dimension d = 15. The learning curves for the eigenmodes w_α (left) and the associated likelihood function (right, in red), together with the number of fixed points obtained at each epoch. We can see that, before the first eigenvalue is learned, there is a single fixed point; then, as modes are learned, the number of fixed points increases. Bottom panel: results for an RBM of size (N_v, N_h) = (100, 50) learned on a synthetic dataset of 10^4 samples having 11 clusters randomly located in a sub-manifold of dimension d = 5. On the left, the scatter plot of the training data together with the position of the fixed points, projected on the first two directions of the SVD of the weight matrix.

We see again the different eigenvalues emerging one by one, and each newly learned eigenvalue triggers a jump of the likelihood together with a jump in the number of fixed points. At the end of the learning, the obtained mean-field fixed points are located at the center of each cluster of the dataset, as can be seen on the scatter plots. In Ref. [36], it is also shown that the behavior is qualitatively similar to what is routinely obtained when performing a standard learning based on PCD.

    5.6. Other mean-field approach

    6. Conclusion

With this review, we strive to show that the RBM is not only part of a hectic field of study, but also an intriguing puzzle with missing pieces that need to be found before we can understand how these models can, or could, assimilate complex information. While the black-box nature of the learning process is starting to fade away, very slowly, there are still many key aspects that we do not understand or master, even for such simple models. We list below some interesting leads for the future.

Learning quality. Despite the fact that we are maximizing a likelihood function (which cannot be computed exactly), it is very hard to obtain a good indicator for comparing two learned RBMs. Even if many methods exist to compute the likelihood approximately,[87,88] the obtained scores are in general not discussed in the light of a robust statistical analysis. If, for very hard cases of image generation, it is easy to compare results by eye inspection, there is no general method that manages to assess the quality of the samples in terms of how well the learned distribution reproduces the dataset distribution. Some recent work[89] introduced the notions of "resemblance" and "privacy", which test the geometric repartition of the true data against the generated samples. This could be a first step toward defining scores according to different criteria (actually, this problem is not specific to the RBM, but concerns most unsupervised learning models: GANs, VAEs, ...).

The number of hidden nodes. It is striking that we are still unable to decide in a principled manner how many hidden nodes are necessary to learn datasets that are not too complex. For instance, on MNIST, it is possible to learn a machine with only 50 hidden nodes, and it somehow manages to produce decent samples. An understanding of how many hidden nodes are necessary to reach a given sample quality is completely missing. In addition, the number of hidden nodes strongly influences the learning behavior of the machine, again in a way that is not fully understood.

The landscape of free energy. When using statistical mechanics to understand RBMs, the natural question that comes to mind concerns the free-energy landscape of the learned machine. It is easy to observe the mean-field fixed points obtained in the ferromagnetic phase and to check that they do correspond to prototypes of the dataset. Still, we do not know how these many fixed points are organized: are there low free-energy paths relating them to one another? Do these paths define a network structure, or instead separate clusters of low free energy?

The landscape of learned RBMs. This is a generic question in machine learning: what is the landscape of "good" learned machines in parameter space (here, the weight matrix)? For supervised tasks, some consensus seems to describe a space which is globally flat, where all the good models are next to one another. However, this holds for deep models; in the case of the RBM, apart from the permutation symmetry of the hidden nodes, we have no clue what this landscape looks like.

Link between the dataset and the learned features. We have seen that in the Gaussian–Gaussian case there is a direct link between the eigendecomposition of the dataset and the learned features. However, for the non-linear models, we do not understand how the modes of the weight matrix are linked to the dataset, nor to the associated rotation matrices.
