Kai Zhang, Hai-Qun Yu, Xiao-Peng Ma *, Jin-Ding Zhang, Jian Wang, Chuan-Jin Yao, Yong-Fei Yang, Hai Sun, Jun Yao, Jian Wang
a School of Petroleum Engineering,China University of Petroleum (East China),Qingdao,Shandong,266580,China
b School of Science,China University of Petroleum (East China),Qingdao,Shandong,266580,China
c Shengli Oil Field Exploration and Development Research Institute,Dongying,Shandong,266071,China
Keywords: Multi-source information; Automatic history matching; Deep learning; Data assimilation; Generative model
ABSTRACT For reservoirs with complex non-Gaussian geological characteristics, such as carbonate reservoirs or reservoirs with a sedimentary facies distribution, it is difficult to implement history matching directly, especially with ensemble-based data assimilation methods. In this paper, we propose a multi-source information fused generative adversarial network (MSIGAN) model for parameterizing complex geologies. In MSIGAN, various sources of information, such as facies distribution, microseismic data, and inter-well connectivity, can be integrated to learn the geological features. Two major generative models in deep learning, the variational autoencoder (VAE) and the generative adversarial network (GAN), are combined in our model. The proposed MSIGAN model is then integrated into the ensemble smoother with multiple data assimilation (ES-MDA) method to conduct history matching. We tested the proposed method on two reservoir models with fluvial facies. The experimental results show that the proposed MSIGAN model can effectively learn the complex geological features, which improves the accuracy of history matching.
The efficient development of oil and gas requires a reliable numerical reservoir model. Automatic history matching is one of the most effective means of achieving reliable reservoir modeling (Oliver and Chen, 2011). Its main task is to adjust the initial reservoir models, built from logging, core analysis, and other static data, so that they honor the dynamic production data. Ensemble-based data assimilation methods are now among the most successful and effective history matching techniques in the oil and gas industry. However, these methods are difficult to apply directly to reservoir models with non-Gaussian parameters.
In recent decades, a variety of parameterization methods have been developed for non-Gaussian parameter fields. Kernel principal component analysis (K-PCA) maps the original parameters to a high-dimensional space through a kernel function and then uses the PCA algorithm to reduce the dimensionality (Sarma et al., 2008). The discrete cosine transform (DCT) converts original image information blocks into coefficient sets representing different frequency components, thereby achieving lossy compression of signals and images (Zhao et al., 2016). Although these methods can transform non-Gaussian parameters into a low-dimensional Gaussian space, they do not preserve the original geological characteristics accurately enough.
Recently, the rise of generative neural networks in the deep learning community has offered a different solution to the parameterization problem in history matching (Canchumuni et al., 2021). Generative models in deep learning are mainly used for feature extraction and image generation (Salakhutdinov, 2015), which is similar to the dimension reduction and reconstruction of uncertain parameters in history matching. Therefore, several studies have investigated generative models for parameterizing the geological model (Chan and Elsheikh, 2017; Laloy et al., 2017; Mosser et al., 2017). Canchumuni and Emerick (2019) used a convolutional variational autoencoder (CVAE) model for history matching with complex geologies. Liu and Durlofsky (2020) proposed a CNN-PCA model, which combines PCA and a convolutional neural network (CNN) to parameterize complex geological facies. However, existing studies use only the uncertain parameters themselves for parameterization, and the model parameters generated by these methods struggle to remain consistent with the geological features.
In fact, reservoir modeling involves a variety of data, such as the distribution of sedimentary facies, the permeability distribution, and complex fault distributions. Traditional decomposition-based methods such as PCA, SVD, and DCT cannot easily make comprehensive use of these data, whereas deep learning methods make comprehensive modeling and dimensionality reduction possible. In this work, we propose a multi-source information fused generative adversarial network (MSIGAN) model. This model parameterizes complex geological features comprehensively by mapping the different inputs to a shared latent space and integrating them to reconstruct the geological parameters, thus maintaining the consistency of the geological features throughout parameterization and history matching. Our inspiration comes from multi-view learning (Zhao et al., 2017; Li et al., 2019; Yao et al., 2020). The idea is to build a multi-input neural network that outputs a model integrating multiple sources of information.
In MSIGAN, various sources of information, such as facies distribution, microseismic data, and inter-well connectivity, can be integrated to learn the geological features. Two major generative models, the variational autoencoder (VAE) (Kingma and Welling, 2014) and the generative adversarial network (GAN) (Goodfellow et al., 2014; Radford et al., 2015), are combined in our model. VAE and GAN are two popular generative models in the deep learning community. On the one hand, a VAE can learn latent features through variational inference and representation learning, but the images it generates have few details and low resolution. On the other hand, images generated by a GAN have richer details but may miss some features (Lai et al., 2019). Combining the VAE and GAN (Bao et al., 2017) can address both problems. The proposed MSIGAN model is integrated into the ensemble smoother with multiple data assimilation (ES-MDA) method (Emerick and Reynolds, 2013; Emerick, 2017; Evensen, 2018) to conduct history matching. We tested the proposed method on two reservoir models with complex fluvial facies. The numerical results show that our MSIGAN model can preserve the facies distribution features by integrating the boundary and permeability information. Previous studies (Ma et al., 2020, 2021; Zhang et al., 2016, 2017, 2019) have shown that maintaining geological features can effectively alleviate the non-uniqueness (multiple solutions) of history matching.
We arrange the rest of this paper as follows.In the next two sections,we briefly introduce the two major generative models in deep learning,including VAE and GAN,as well as our proposed MSIGAN model.After that,we introduce the combination of the MSIGAN model with ES-MDA for history matching in section 4.In section 5,we apply our model in three test cases to show the effectiveness of our proposed method in history matching compared with the existing parameterization methods.The last part is a summary of our conclusions.
VAE is a deep generative model with two main parts: an encoder and a decoder. The two parts cooperate to model the prior data distribution. The encoder first maps the high-level features of the data distribution to low-level representations of the data, which are the feature vectors. The decoder then takes these low-level representations and outputs high-level representations of the same data (Doersch, 2016). Unlike a plain autoencoder, which seeks a single-valued mapping z = f(x), a VAE seeks a mapping between distributions, p(x) → p(z). Fig. 1 shows the basic structure of the VAE model.
Assuming that m-dimensional data are input, the encoder outputs two n-dimensional parameter vectors, (μ_1, μ_2, ..., μ_n) and (σ_1, σ_2, ..., σ_n). At the same time, an n-dimensional vector (e_1, e_2, ..., e_n) is sampled from the standard normal distribution N(0, 1), and the feature vector (z_1, z_2, ..., z_n) is generated by the operation z_i = exp(σ_i) × e_i + μ_i. Finally, (z_1, z_2, ..., z_n) is input into the decoder network to obtain the m-dimensional output data X̂. The loss function of the whole network is as follows:
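In the usual VAE formulation (Kingma and Welling, 2014), a loss of this kind is commonly written as below. This is a sketch under two assumptions that the text does not confirm: the reconstruction term is taken as a squared error, and σ_i is read as the log standard deviation so that it agrees with z_i = exp(σ_i) × e_i + μ_i. The exact form used by the authors may differ.

```latex
\mathcal{L}(X) = \mathbb{E}_{q(z|x)}\!\left[\lVert X - \hat{X} \rVert^{2}\right]
               + \mathrm{KL}\!\left(q(z|x)\,\Vert\,P(z)\right),
\qquad
\mathrm{KL}\!\left(q(z|x)\,\Vert\,P(z)\right)
  = -\frac{1}{2}\sum_{i=1}^{n}\left(1 + 2\sigma_i - \mu_i^{2} - e^{2\sigma_i}\right)
```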
The first term on the right side of the equation is the reconstruction loss, that is, the loss of the entire process X → z → X̂. The second term is the regularization term, where q(z|x) is the posterior distribution of z derived from x, and P(z) is the prior distribution of z.
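The sampling step z_i = exp(σ_i) × e_i + μ_i and the regularization term can be illustrated with a minimal NumPy sketch. It assumes that σ_i denotes the log standard deviation (which is what the exponential in the sampling formula suggests) and that P(z) is the standard normal distribution; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def reparameterize(mu, log_sigma, rng):
    """Draw z = exp(log_sigma) * e + mu with e ~ N(0, 1)."""
    e = rng.standard_normal(mu.shape)
    return np.exp(log_sigma) * e + mu

def kl_to_standard_normal(mu, log_sigma):
    """Closed-form KL(N(mu, exp(log_sigma)^2) || N(0, 1)), summed over dimensions."""
    return -0.5 * np.sum(1.0 + 2.0 * log_sigma - mu**2 - np.exp(2.0 * log_sigma))

rng = np.random.default_rng(0)
mu = np.zeros(4)          # encoder output (mu_1, ..., mu_n), here n = 4
log_sigma = np.zeros(4)   # encoder output (sigma_1, ..., sigma_n), read as log std
z = reparameterize(mu, log_sigma, rng)
kl_zero = kl_to_standard_normal(mu, log_sigma)  # vanishes when q(z|x) equals the prior
```

Because the noise e is sampled outside the network, gradients can flow through μ and σ during training, which is the reason for writing the sampling step in this form.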
Fig.1.Structure diagram of the variational autoencoder.
Fig.2.Structure diagram of generative adversarial network.
Fig.3.Structure diagram of MSIGAN.
GAN is also a deep generative model, and it provides an adversarial way of training neural networks. Its main inspiration is the zero-sum game: a generator network G and a discriminator network D play against each other continuously, so that G learns the distribution of the data. Fig. 2 shows the structure of the network.
The optimization loss function of the whole process is:
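For reference, the original GAN objective of Goodfellow et al. (2014) has the following form, which is consistent with the two terms discussed next. The symbol p_z for the distribution of the latent input z is an assumed notation.

```latex
\min_{G}\max_{D} V(D,G)
  = \mathbb{E}_{x\sim p_{\mathrm{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z\sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]
```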
The training process first keeps the generator G fixed and trains the discriminator D. For the max_D part, the training goal of D is to distinguish correctly between real and generated samples. Since the sigmoid activation function is used for this binary classification problem, the output D(x) is a probability value in the range [0, 1]. In the first term, x ~ p_data means that x is sampled from the real data distribution. We expect D(x) to be close to 1, and since D(x) ∈ [0, 1], a larger log D(x) is better. The second term involves the generated data sampled from G. We expect D(G(z)) to approach 0, which again makes the second term larger. In summary, training D aims to increase the sum of the first and second terms. Through the iterative optimization of G and D, the final generator produces samples that are realistic enough to pass as real data.
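The behavior of the two terms can be checked numerically with a small sketch. It evaluates the standard GAN value function from discriminator outputs only; the function name and the clipping constant are illustrative assumptions.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Mean log D(x) on real samples plus mean log(1 - D(G(z))) on generated samples."""
    d_real = np.clip(d_real, eps, 1.0 - eps)  # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A discriminator that separates real from generated samples well ...
good_d = gan_value(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
# ... versus one that cannot tell them apart and outputs 0.5 everywhere.
fooled_d = gan_value(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

The value is larger for the well-trained discriminator, which is what D tries to achieve, while G is trained to push the value back down.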
Fig.4.Workflow of MSIGAN that integrates constraint information for parameterization.
The MSIGAN model fuses a dual-encoder network based on VAE with a GAN to accept two inputs: permeability and facies constraint information. In this work, we use Keras (Manaswi and Kumar, 2018) to create the model. Fig. 3 shows the overall architecture of the MSIGAN. As shown in Fig. 3, S represents the uncertain parameter such as permeability, S′ represents the reconstructed model, and C represents the constraint information such as the facies boundary.
The network maps S and the constraint information C to a shared latent space via two encoders. In the shared latent space, the feature vectors z_1 and z_2 output by the two encoders are combined into z = (z_1 + z_2)/2. The decoder then reconstructs the input z into the uncertain parameter space, and the discriminator of the GAN judges whether the result is real or generated and feeds this information back to the decoder and the discriminator. After continuous optimization of the decoder and the discriminator, the Nash equilibrium is finally reached and the optimal reconstruction model is obtained.
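The fusion step in the shared latent space is a simple element-wise average, sketched below. The function name is illustrative; the two inputs stand for the feature vectors produced by the permeability encoder and the constraint encoder.

```python
import numpy as np

def fuse_latents(z1, z2):
    """Combine two encoder outputs into the shared latent vector z = (z1 + z2) / 2."""
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    if z1.shape != z2.shape:
        raise ValueError("both encoders must output latent vectors of the same shape")
    return 0.5 * (z1 + z2)

z_shared = fuse_latents([1.0, 2.0], [3.0, 6.0])
```

Averaging requires both encoders to share the same latent dimension, which is consistent with the identical encoder architectures described for the test cases.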
The workflow of the entire MSIGAN architecture is shown in Fig.4.We use g to represent the reconstruction function of uncertain parameters:
And we introduce the reconstruction loss L1:
Besides minimizing the reconstruction loss L1, the VAE also regularizes the encoder by imposing the prior p(z) = N(0, 1) on the latent distribution of z. Therefore, we add the KL regularization loss L_KL:
where z, σ, and μ represent (z_1 + z_2)/2, (σ_1 + σ_2)/2, and (μ_1 + μ_2)/2. Finally, the GAN alternately trains the discriminator D and the generator G by maximizing the loss function L_dis and minimizing the loss function L_gen:
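One plausible reading of these four loss terms is sketched below. Several choices are assumptions that the text does not confirm: the reconstruction loss is taken as a mean absolute error, σ is read as a log standard deviation, and L_dis and L_gen follow the standard GAN objective. The function and key names are hypothetical.

```python
import numpy as np

def msigan_losses(s, s_rec, mu1, mu2, log_sigma1, log_sigma2, d_real, d_fake, eps=1e-12):
    """Return illustrative reconstruction, KL, discriminator, and generator losses."""
    mu = 0.5 * (mu1 + mu2)                       # averaged encoder means
    log_sigma = 0.5 * (log_sigma1 + log_sigma2)  # averaged encoder log std (assumed meaning)
    d_real = np.clip(d_real, eps, 1.0 - eps)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return {
        # Reconstruction loss between S and S' (mean absolute error is an assumption).
        "rec": float(np.mean(np.abs(s - s_rec))),
        # KL divergence of N(mu, exp(log_sigma)^2) from the N(0, 1) prior.
        "kl": float(-0.5 * np.sum(1.0 + 2.0 * log_sigma - mu**2 - np.exp(2.0 * log_sigma))),
        # Maximized when training the discriminator.
        "dis": float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))),
        # Minimized when training the generator (decoder).
        "gen": float(np.mean(np.log(1.0 - d_fake))),
    }

s = np.full((3, 3), 500.0)
zeros = np.zeros(4)
losses = msigan_losses(s, s, zeros, zeros, zeros, zeros,
                       d_real=np.array([0.5]), d_fake=np.array([0.5]))
```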
ES-MDA shows unique advantages over other algorithms in history matching problems. The basic idea of the ES-MDA method is to assimilate the same data multiple times, each time multiplying the error covariance matrix C_d of the observation data by an inflation factor α. The update of uncertain parameters in the ES-MDA method is as follows:
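In the standard ES-MDA formulation (Emerick and Reynolds, 2013), the update takes the form below. It is written here under the assumption that ε_j is drawn from N(0, C_d); the authors' own typesetting may differ in detail.

```latex
m_j^{t+1} = m_j^{t}
  + C_{MD}^{t}\left(C_{DD}^{t} + \alpha_t C_{d}\right)^{-1}
    \left(d_{\mathrm{obs}} + \sqrt{\alpha_t}\,\varepsilon_j - g\!\left(m_j^{t}\right)\right),
\qquad j = 1,\ldots,N_e
```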
Fig.5.The workflow of automatic history matching.
Fig.6.Real permeability field and first 16 models in initial reservoir data set(mD).Circles represent oil production wells and triangles represent water injection wells.Test case 1.
Fig.7.Eight templates of Kirsch operator.
Fig.8.First 16 samples in extracted edge model data set.Test case 1.
Fig.9.Comparison of reconstructed permeability by different parameterization methods with reference permeability.Test case 1.
where j = 1, ..., N_e indexes the ensemble members; C_MD is the cross-covariance matrix between the model parameter vector and the prediction data vector; C_DD is the autocovariance matrix of the prediction data; d_obs is the observed dynamic response of reservoir development and production; ε_j is the observation error of the production dynamic response; g(·) is the numerical reservoir simulator; m is the reservoir model parameter vector; and t indexes the data assimilation step. The inflation factor α_t is the only auxiliary parameter that needs to be determined in ES-MDA, and it has a significant impact on the solution. Several studies have examined how to choose the inflation factor (Le et al., 2020; Emerick, 2016). In this paper, we use the standard ES-MDA setting with a constant inflation factor, α_t = N_a, which satisfies the requirement that the reciprocals 1/α_t sum to one over the N_a assimilation steps; N_a usually takes a value of 4-10.
The integrated history matching workflow combining ES-MDA and MSIGAN is shown in Fig. 5. First, the initial latent vectors z, randomly sampled from the normal distribution, are sent to the generator network to generate realizations of the uncertain parameters, which are used in the reservoir simulation to calculate the production data. The ES-MDA algorithm then updates the latent variables according to the simulated data and the observed data. Afterward, the updated latent variables are sent to the generator network and the next iteration starts.
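The loop described in this workflow can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: a toy linear map stands in for the chain "generator network followed by reservoir simulator", observation errors are assumed uncorrelated, and the constant inflation factor α_t = N_a of the standard ES-MDA is assumed. All function and variable names are hypothetical.

```python
import numpy as np

def esmda_update(Z, D, d_obs, c_d, alpha, rng):
    """One ES-MDA step. Z: (n_z, n_e) latent ensemble, D: (n_d, n_e) simulated data,
    c_d: (n_d,) observation-error variances (diagonal C_d assumed)."""
    n_e = Z.shape[1]
    dZ = Z - Z.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    c_md = dZ @ dD.T / (n_e - 1)   # cross-covariance between latent variables and data
    c_dd = dD @ dD.T / (n_e - 1)   # autocovariance of the simulated data
    noise = rng.standard_normal(D.shape) * np.sqrt(c_d)[:, None]
    d_pert = d_obs[:, None] + np.sqrt(alpha) * noise  # perturbed observations
    gain = c_md @ np.linalg.inv(c_dd + alpha * np.diag(c_d))
    return Z + gain @ (d_pert - D)

def run_esmda(z0, forward, d_obs, c_d, n_a, rng):
    """Assimilate the same data n_a times with constant inflation alpha = n_a."""
    Z = z0.copy()
    for _ in range(n_a):
        D = np.column_stack([forward(Z[:, j]) for j in range(Z.shape[1])])
        Z = esmda_update(Z, D, d_obs, c_d, alpha=float(n_a), rng=rng)
    return Z

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))

def forward(z):
    # Toy stand-in for "generator network + reservoir simulator" (assumption).
    return A @ z

z_true = np.array([1.0, -0.5, 0.3])
d_obs = forward(z_true)
c_d = np.full(6, 1e-2)
z0 = rng.standard_normal((3, 200))  # prior latent ensemble, N_e = 200
z_post = run_esmda(z0, forward, d_obs, c_d, n_a=4, rng=rng)
err_prior = np.linalg.norm(z0.mean(axis=1) - z_true)
err_post = np.linalg.norm(z_post.mean(axis=1) - z_true)
```

In the actual workflow, `forward` would decode each latent vector into a permeability field and run the simulator on it, so the update acts only on the low-dimensional latent variables.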
In this case, we carried out a history matching study of a two-dimensional fluvial reservoir to adjust the permeability in each gridblock. The data set is the same as that used by Canchumuni et al. (2019), where more details can be found. Fig. 6(a) shows the true permeability field. The model includes two facies: the permeability of the high-permeability fluvial facies is 5000 mD, and the permeability of the background facies is 500 mD. The model has 45 × 45 gridblocks and contains four production wells and three water injection wells. There are 20,000 random models in the data set, of which we select 18,000 for training and the remaining 2,000 for testing the MSIGAN model. Fig. 6(b) shows the first 16 models in the data set.
5.1.1. Parameterization of permeability
Before training the MSIGAN model, we first build the edge data set using the Kirsch operator. The Kirsch operator is an edge detection algorithm proposed by Kirsch (1971), which uses 8 templates, one per compass direction, to determine the magnitude and direction of the gradient, as shown in Fig. 7.
Each of these 3 × 3 templates is convolved with the image and responds to a specific edge direction; the maximum response over the eight templates is taken as the edge of the image. The Kirsch operator preserves image details well and is robust to noise. The gradient magnitude of the Kirsch operator is:
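A common way to write this magnitude is given below, where K_k denotes the k-th template, I the image, and ∗ the two-dimensional convolution. These symbols are assumed notation, and some formulations omit the absolute value.

```latex
G(x,y) = \max_{k=1,\ldots,8}\,\left|\left(K_k * I\right)(x,y)\right|
```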
We use the Kirsch operator to extract the edges of the fluvial facies models, and Fig. 8 shows the edges of the first 16 initial models.
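The edge extraction can be sketched with NumPy alone. The eight templates are generated by rotating the border of the 3 × 3 "north" template (three entries of 5, five entries of -3, centre 0), which is the usual Kirsch definition; the edge-replicating padding at the image border and the function names are assumptions.

```python
import numpy as np

def kirsch_kernels():
    """Build the eight 3x3 Kirsch templates by rotating the border entries."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]  # clockwise
    base = [5, 5, 5, -3, -3, -3, -3, -3]
    kernels = []
    for shift in range(8):
        k = np.zeros((3, 3))
        for idx, (r, c) in enumerate(ring):
            k[r, c] = base[(idx - shift) % 8]
        kernels.append(k)
    return kernels

def kirsch_edges(image):
    """Maximum absolute template response at every pixel."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    responses = []
    for k in kirsch_kernels():
        resp = np.zeros_like(img)
        # Sliding-window correlation; the template set is closed under 180-degree
        # rotation, so the maximum equals that obtained with true convolution.
        for dr in range(3):
            for dc in range(3):
                resp += k[dr, dc] * padded[dr:dr + h, dc:dc + w]
        responses.append(resp)
    return np.max(np.abs(np.array(responses)), axis=0)

# Toy two-facies field: background 500 mD on the left, channel 5000 mD on the right.
facies = np.full((6, 6), 500.0)
facies[:, 3:] = 5000.0
edges = kirsch_edges(facies)
```

Since every template sums to zero, the response vanishes inside each facies and is non-zero only in the columns adjacent to the facies boundary.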
In this case, the dual encoders E_1 and E_2 in the MSIGAN each contain 3 convolution layers followed by 2 fully-connected layers (the convolution layers have 64, 32, and 16 channels with a filter size of 3 × 3 and strides of 2, 2, and 1; the fully-connected layers have 128 and 100 neurons, respectively). The generator G consists of a fully-connected layer with 128 neurons followed by three convolution layers (16, 32, and 64 channels with a filter size of 3 × 3 and strides of 1, 2, and 2). The discriminator D includes three convolution layers and two fully-connected layers (the convolution layers have 64, 32, and 16 channels with a filter size of 3 × 3 and strides of 2, 2, and 1; the fully-connected layers have 128 and 1 neurons, the last of which outputs a probability value).
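To make the encoder dimensions concrete, the following sketch computes the spatial size of the feature maps after the three convolution layers for a 45 × 45 input. The padding mode is not stated in the text, so both common conventions are shown as assumptions; the helper names are illustrative.

```python
import math

def conv_out(n, kernel, stride, padding):
    """Spatial output size of one convolution layer along one axis."""
    if padding == "same":
        return math.ceil(n / stride)
    if padding == "valid":
        return (n - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

def encoder_sizes(n, strides=(2, 2, 1), kernel=3, padding="same"):
    """Feature-map size after each of the stacked convolution layers."""
    sizes = []
    for s in strides:
        n = conv_out(n, kernel, s, padding)
        sizes.append(n)
    return sizes

same_sizes = encoder_sizes(45)                     # [23, 12, 12]
valid_sizes = encoder_sizes(45, padding="valid")   # [22, 10, 8]
flat_features = same_sizes[-1] ** 2 * 16           # 2304 inputs to the first dense layer
```

Under "same" padding, the last convolution layer with 16 channels would pass 12 × 12 × 16 = 2304 features to the 128-neuron fully-connected layer.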
In this section, the MSIGAN is compared with the CVAE model. Canchumuni et al. (2019) used two channels to represent the permeability field and then post-processed the reconstructed permeability. This post-processing compares the CVAE reconstruction on the two channels and returns the channel number of the larger element, i.e., 0 or 1. The output of this post-processing is therefore not the raw CVAE reconstruction but a cleaned-up version: it converts the continuous CVAE output to binary values and artificially removes blurred boundaries. We trained the network models in a cluster consisting of 816 core CPU computing nodes and 24 core GPU computing nodes. CVAE took 25 min to train. Because a GAN trains the generator and the discriminator against each other to push the performance of the network, it produces a sharper model at the cost of a longer training time. An important direction of current GAN research is how to improve training stability and shorten training time (Yazıcı et al., 2018). The MSIGAN took 116 min to train. After training, we randomly select a permeability field from the initial reservoir test data set to analyze the reconstruction results, as shown in Fig. 9(a). Fig. 9(b) shows the permeability field reconstructed by CVAE. Fig. 9(c) shows the CVAE reconstruction after post-processing. Fig. 9(d) shows the reconstruction result of MSIGAN. The results show that the facies boundary reconstructed by the MSIGAN is closest to the reference model.
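The post-processing step described for the two-channel CVAE output amounts to a per-cell argmax, as in the sketch below. The array layout (channels last), the mapping of channel 1 to the high-permeability facies, and the function names are assumptions made for illustration.

```python
import numpy as np

def binarize_two_channel(recon):
    """recon: (H, W, 2) continuous output; returns the index (0 or 1) of the larger channel."""
    return np.argmax(recon, axis=-1)

def to_permeability(facies, k_low=500.0, k_high=5000.0):
    """Map facies indicators to permeability in mD (channel 1 assumed to be the channel facies)."""
    return np.where(facies == 1, k_high, k_low)

recon = np.array([[[0.9, 0.1], [0.4, 0.6]],
                  [[0.55, 0.45], [0.2, 0.8]]])
facies = binarize_two_channel(recon)
perm = to_permeability(facies)
```

The cell with values 0.55 and 0.45 shows why the result looks sharper than the raw output: an uncertain, blurred prediction is forced to one of the two facies.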
Fig.10.Comparison of permeability inversion results of the first five prior models.Test case 1.
Fig.11.Observed data history-matched results of ES-MDA combined MSIGAN method.Red dots represent the observed data points,gray lines represent the numerical simulation prediction results of the initial reservoir model set,green lines represent the numerical simulation prediction results of the history matching updated model set.Test case 1.
Fig.12.Real permeability and randomly selected reservoir model data set (mD).Circles represent oil production wells and triangles represent water injection wells.Test case 2.
Fig.13.Comparison of reconstructed permeability by different parameterization methods with reference permeability.Test case 2.
Fig.14.Frequency distribution histograms of reconstructed permeability for different models.Test case 2.
Fig.15.Comparison of permeability inversion results of the first five prior models.Test case 2.
Fig.16.Observed data history-matched results of ES-MDA combined MSIGAN method.Red dots represent the observed data points,gray lines represent the numerical simulation prediction results of the initial reservoir model set,green lines represent the numerical simulation prediction results of the history matching updated model set.Test case 2.
Fig.17.Real permeability (mD).Circles represent oil production wells and triangles represent water injection wells.Test case 3.
Table 1 evaluates the accuracy of reconstruction of different methods.The evaluation parameters include signal-to-noise ratio(SNR) (Sim and Kamel,2010),peak signal-to-noise ratio (PSNR)(Horé and Ziou,2010),structural similarity index (SSIM) (Brunet et al.,2011),hash similarity (Masci et al.,2014) and root mean square error(RMSE).The larger the value of each parameter except RMSE,the more accurate the reconstruction of the permeability field.
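Three of these indicators have simple closed forms, sketched below with NumPy. The definitions follow common image-quality conventions and are assumptions about what the paper computes: SNR uses the signal energy of the reference, and PSNR uses the maximum of the reference as the peak value unless one is given.

```python
import numpy as np

def rmse(ref, rec):
    """Root mean square error between reference and reconstruction."""
    ref, rec = np.asarray(ref, float), np.asarray(rec, float)
    return float(np.sqrt(np.mean((ref - rec) ** 2)))

def snr_db(ref, rec):
    """Signal-to-noise ratio in dB: reference energy over error energy."""
    ref, rec = np.asarray(ref, float), np.asarray(rec, float)
    return float(10.0 * np.log10(np.sum(ref ** 2) / np.sum((ref - rec) ** 2)))

def psnr_db(ref, rec, peak=None):
    """Peak signal-to-noise ratio in dB: squared peak value over mean squared error."""
    ref, rec = np.asarray(ref, float), np.asarray(rec, float)
    peak = float(np.max(ref)) if peak is None else float(peak)
    return float(10.0 * np.log10(peak ** 2 / np.mean((ref - rec) ** 2)))

ref = np.array([0.0, 10.0])
rec = np.array([1.0, 9.0])
scores = {"rmse": rmse(ref, rec), "snr": snr_db(ref, rec), "psnr": psnr_db(ref, rec)}
```

All three functions divide by the reconstruction error, so SNR and PSNR are undefined for a perfect reconstruction; a production version would need to handle that case.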
Table 1. Evaluation of reconstruction results of different parameterization methods. Test case 1.
Compared with the CVAE model, SNR, PSNR, and SSIM increased and the hash similarity decreased after adding post-processing. This result shows that although the post-processing method makes the boundary of the permeability field clear, it reduces the accuracy of the permeability. MSIGAN achieved the best results in the four evaluation indicators, indicating that the method can reconstruct the facies boundary more clearly and accurately.
5.1.2. History matching results
This section tests the automatic history matching of the proposed ES-MDA combined with the MSIGAN method to assimilate production observation data and compares it with the CVAE method.Fig.10(a) shows the prediction model of CVAE combined with the ES-MDA method for history matching.The results show that although the CVAE method can roughly capture the distribution of fluvial facies,it cannot accurately restore the permeability distribution at the boundary.As shown in Fig.10(b),the ES-MDA combined with MSIGAN can better predict the distribution and shape of the high permeability facies,and the boundary is clear.
Fig.11 shows the history matching results of daily oil production of 2 production wells.The red dots represent the observed data points,the gray lines represent the numerical simulation results of the initial reservoir models,and the green lines represent the numerical simulation results of the history matched models.Compared with the initial reservoir models,the reservoir models updated by the ES-MDA combined with the MSIGAN method can well reflect the changes of observation data.
Fig.18.Comparison of permeability inversion results of each layer of the reference model.Test case 3.
The model used in the second test case is the same as that used by Emerick (2017). This model has 100 × 100 gridblocks. As shown in Fig. 12(a), there are 5 production wells and 2 water injection wells. The model consists of a low-permeability facies with 500 mD and a high-permeability facies with 5,000 mD. For this case, the neural network architecture is basically the same as in test case 1, with only some hyperparameter modifications to adapt to the different size of the permeability field. In this case, 18,000 models are used for training and the remaining 2,000 models are used for testing. The permeability field and extracted edges of the first 16 samples are shown in Fig. 12(b) and (c).
We set N_e = 100 and N_a = 10 in the ES-MDA to conduct the history matching. Because test case 1 showed that the CVAE + post-processing reconstruction is a cleaned-up result rather than the raw output of CVAE, we compare our MSIGAN only with CVAE in test case 2. The network training was carried out in the same cluster as test case 1. CVAE took 43 min and MSIGAN took 128 min. Fig. 13 shows the reconstruction result of a randomly selected model from the test data set. The RMSE values of CVAE and MSIGAN were 39.1918 and 36.5097, respectively. Fig. 14 compares the frequency distribution histograms of the reconstructed permeability. As shown in Fig. 14(a), the gridblock permeabilities of the initial model are concentrated at the high-permeability and low-permeability facies values, whereas the reconstructed models contain intermediate transition values. In Fig. 14(b), the CVAE reconstruction has more gridblock permeabilities distributed between the high-permeability and low-permeability values. In Fig. 14(c), the gridblock permeabilities of the MSIGAN reconstruction are more concentrated at the high-permeability and low-permeability ends. Fig. 15 shows the history matching result of the permeability field. The test results show that, compared with CVAE, MSIGAN reconstructs the facies boundary clearly. Fig. 16 shows the history matching results of oil production for 2 production wells. It can be seen that the MSIGAN combined with ES-MDA fits the observation data well.
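The histogram comparison can be condensed into a single number: the fraction of gridblocks whose permeability lies away from both facies values. The sketch below is one simple way to compute it; the tolerance of 10% of the permeability range and the function name are arbitrary illustrative choices, not quantities from the paper.

```python
import numpy as np

def intermediate_fraction(perm, k_low=500.0, k_high=5000.0, tol=0.1):
    """Fraction of cells farther than tol * (k_high - k_low) from both facies values."""
    perm = np.asarray(perm, dtype=float)
    band = tol * (k_high - k_low)
    near_low = np.abs(perm - k_low) <= band
    near_high = np.abs(perm - k_high) <= band
    return float(1.0 - np.mean(near_low | near_high))

# Toy reconstruction: three cells close to a facies value, one blurred transition cell.
frac = intermediate_fraction([500.0, 5000.0, 2750.0, 600.0])
```

A sharply bimodal reconstruction gives a value near zero, while a blurred reconstruction with many transition cells gives a larger value.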
In this test case, we used the SNESIM algorithm (single normal equation simulation), a sequential simulation method based on multiple-point statistics (Strebelle, 2002), to generate a 3D fluvial facies model data set describing the permeability distribution. The reservoir model has four production wells and two injection wells, and the permeability distribution is shown in Fig. 17. The model has 60 × 60 × 5 gridblocks, which are composed of a 500 mD low-permeability facies and a 5,000 mD high-permeability facies. In this test case, we also trained the network with 18,000 models and tested it with 2,000 models. In the same cluster as the previous test cases, the CVAE training took 186 min and the MSIGAN training took 206 min.
Fig.19.Observed data history-matched results of ES-MDA combined MSIGAN method.Red dots represent the observed data points,gray lines represent the numerical simulation prediction results of the initial reservoir model set,green lines represent the numerical simulation prediction results of the history matching updated model set.Test case 3.
We also set N_e = 100 and N_a = 10 in the ES-MDA to conduct the history matching. Fig. 18 shows the inversion results for each layer of the reference permeability field. The results show that for 3D reservoirs, our MSIGAN model can still recover clear facies in the history matching process. Fig. 19 shows the history matching results of oil production for 2 production wells. It can be seen that the combination of MSIGAN and ES-MDA also fits the observed data well for 3D reservoirs.
For ensemble-based data assimilation methods, localization can effectively decrease the sampling error and reduce the negative impact of limited degrees of freedom. However, as with CVAE, our MSIGAN model faces the problem that distance-based localization (Houtekamer and Mitchell, 2001) cannot be applied to the update of the latent vector z. None of the known localization methods for parameterizations based on deep generative models is as effective as the distance-based approach (Canchumuni et al., 2019, 2021). For this reason, we have not used any type of localization in the test cases. Localization is an issue worthy of further study, and we will continue to follow and explore the latest solutions to this problem.
In this paper, we propose a multi-source information fused generative adversarial network model (MSIGAN) to parameterize complex geological features in history matching, and we combine it with ES-MDA for dynamic inversion modeling. In MSIGAN, various sources of information, such as facies distribution, microseismic data, and inter-well connectivity, can be integrated to learn and parameterize the geological features. We tested the proposed parameterization method on two history matching problems and compared it with other deep learning methods. The numerical results show that the MSIGAN model combines the advantages of the two generative models, VAE and GAN, and, by integrating facies distribution information with the permeability distribution, better maintains the geological characteristics during parameterization and history matching.
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grants 51722406, 52074340, and 51874335; the Shandong Provincial Natural Science Foundation under Grant JQ201808; the Fundamental Research Funds for the Central Universities under Grant 18CX02097A; the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-008; the Science and Technology Support Plan for Youth Innovation of University in Shandong Province under Grant 2019KJH002; the National Research Council of Science and Technology Major Project of China under Grant 2016ZX05025001-006; the 111 Project under Grant B08028; and the Sinopec Science and Technology Project under Grant P20050-1.