Predictive model of mixed oil length for sequential transportation of multi-product pipeline by combining mechanism and Gaussian mixture regression algorithm
YUAN Ziyun1,2, LIU Gang1,2, CHEN Lei1,2, SHAO Weiming2, ZHANG Yuhan3
(1. College of Pipeline and Civil Engineering in China University of Petroleum (East China), Qingdao 266580, China; 2. Shandong Provincial Key Laboratory of Oil & Gas Storage and Transportation Safety, Qingdao 266580, China; 3. Qingdao Operation Area, Shandong Branch, PipeChina, Qingdao 266400, China)
Abstract: Mixed oil forms during sequential transportation in multi-product pipelines, and accurate prediction of the mixed oil length is of great significance for cutting oil batches. Mechanism models of the mixed oil length suffer from low accuracy and heavy numerical computation. Current global predictive models built with machine learning algorithms ignore the multi-mode characteristics of actual operating conditions, which limits their predictive accuracy, and a Gaussian mixture regression algorithm introduced directly to identify the data modes cannot accurately characterize the complex nonlinear relationships among variables. A local modeling algorithm that integrates mechanism knowledge is therefore built from an existing mechanism equation and the Gaussian mixture regression algorithm, and the predictions of different models are compared on a real data set of mixed oil lengths from sequentially operated multi-product pipelines. The results show that combining mechanism knowledge with local modeling effectively characterizes the functional relationship among variables, and that the new model has a clear advantage in predictive accuracy.
Keywords: multi-product pipeline; mixed oil length; local modeling; Gaussian mixture regression; mechanism-data
CLC number: TP 181 Document code: A
Citation: YUAN Ziyun, LIU Gang, CHEN Lei, et al. Predictive model of mixed oil length for sequential transportation of multi-product pipeline by combining mechanism and Gaussian mixture regression algorithm[J]. Journal of China University of Petroleum (Edition of Natural Science), 2023,47(2):123-128.
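Multi-product pipelines are usually operated by sequential transportation, and mixed oil inevitably forms between two adjacent batches[1]. The mixed oil length is key data for cutting oil batches during sequential transportation[2], so predicting it accurately matters greatly for real-time monitoring of the process and for batch cutting[3-4]. Existing mechanism models for calculating the mixed oil length can be divided into one-dimensional and two-dimensional models. One-dimensional models such as the Austin-Palfrey formula[5] are easy to apply, but their accuracy needs improvement[6]. Two-dimensional models[7-8] describe the formation and growth of mixed oil more accurately, but they are complex and hard to solve, which makes them difficult to apply to long-distance multi-product pipelines[6,9]. Data-driven modeling methods have good nonlinear fitting ability[10-11], but they aim to fit the available samples as closely as possible, so their generalization is hard to guarantee[12-13]. Chen et al.[14-15] therefore combined the Austin-Palfrey formula with data-driven modeling algorithms. However, all existing predictive models of mixed oil length rely on a single global model to complete the regression task.
In practice, the physical flow space and the fluid flow mechanism differ from one pipeline to another, so the data set shows clear multi-mode characteristics[16-17]. Multi-mode problems are usually handled with a "divide and conquer" strategy, in which a local predictive model is built for each query sample to mine the data relationship precisely[18]. Gaussian mixture regression (GMR) is currently the mainstream method for identifying the multi-mode characteristics of data[19], but it assumes a simple linear relationship among variables, which casts doubt on its predictive accuracy. In this work, the GMR algorithm is combined with the Austin-Palfrey formula to predict the mixed oil length, and model performance is analyzed on a real data set of mixed oil lengths from multi-product pipelines. The mechanism-integrated GMR algorithm shows a clear advantage in predictive accuracy, which verifies its applicability to predicting the mixed oil length of multi-product pipelines.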
1 Principles and methods
1.3 GMR-M model combining the mechanism formula with the GMR algorithm
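The GMR model assumes that, within the k-th mode, the input and output variables are related by a simple linear function. Sequential transportation in a multi-product pipeline is affected by many coupled factors and the flow state is complex and variable, so the relationship between the input and output variables should be complex and nonlinear, and a predictive model built purely on the GMR algorithm can hardly describe how the mixed oil length develops. The GMR algorithm is therefore combined with an existing mechanism formula: the formula reduces the complex nonlinear relationship between the multi-dimensional input variables and the output variable to a linear one, the GMR algorithm then identifies the multi-mode relationship hidden in the data, and a local predictive model is built within each mode, which effectively improves the predictive accuracy. The GMR-M modeling procedure is as follows, and the corresponding schematic is shown in Fig. 2.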
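(1) Based on Eq. (5), the pipeline inner diameter d, the transport distance L and the operating Reynolds number Re are combined into the variable C_AP.
(2) The GMR algorithm is used to find the functional relationship between the input variable C_AP and the output variable, the mixed oil length C, under each mode.
(3) Given a query sample x_q, the corresponding mixed oil length is predicted.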
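The three steps can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation. `austin_palfrey_length` assumes that Eq. (5) is the commonly quoted turbulent-flow form of the Austin-Palfrey formula, C_AP = 11.75 d^0.5 L^0.5 Re^-0.1, with d and L in metres. The mode parameters in `EXAMPLE_MODES` are hypothetical placeholders for values that would normally be estimated from training data, for example with the EM algorithm. `gmr_predict` evaluates the usual GMR conditional mean for one scalar input, weighting the linear prediction of each mode by the posterior probability of that mode.

```python
import math


def austin_palfrey_length(d, L, Re):
    """Step 1: fold d, L and Re into one variable C_AP.

    ASSUMPTION: turbulent-branch Austin-Palfrey form,
    C_AP = 11.75 * sqrt(d * L) * Re**-0.1 (d and L in metres).
    """
    return 11.75 * math.sqrt(d * L) * Re ** -0.1


def normal_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmr_predict(x, modes):
    """Steps 2-3: conditional mean E[y | x] of a Gaussian mixture over (x, y)."""
    # Unnormalised posterior probability of each mode given the query x.
    scores = [m["weight"] * normal_pdf(x, m["mean_x"], m["var_x"]) for m in modes]
    total = sum(scores)
    if total == 0.0:
        # Query lies far from every mode: fall back to the prior mixing weights.
        scores = [m["weight"] for m in modes]
        total = sum(scores)
    prediction = 0.0
    for score, m in zip(scores, modes):
        # Local linear model of this mode: mean_y + (cov_xy / var_x) * (x - mean_x).
        local = m["mean_y"] + m["cov_xy"] / m["var_x"] * (x - m["mean_x"])
        prediction += score / total * local
    return prediction


# HYPOTHETICAL fitted parameters for three modes (x = C_AP, y = measured length C).
EXAMPLE_MODES = [
    {"weight": 0.40, "mean_x": 300.0, "var_x": 6400.0, "mean_y": 350.0, "cov_xy": 7040.0},
    {"weight": 0.35, "mean_x": 600.0, "var_x": 14400.0, "mean_y": 750.0, "cov_xy": 18720.0},
    {"weight": 0.25, "mean_x": 1000.0, "var_x": 40000.0, "mean_y": 1400.0, "cov_xy": 60000.0},
]

# Illustrative query sample x_q: d = 0.25 m, L = 40 km, Re = 1e5.
c_ap = austin_palfrey_length(d=0.25, L=40_000.0, Re=1.0e5)
c_hat = gmr_predict(c_ap, EXAMPLE_MODES)
```

Because the input is the single variable C_AP, each mode only needs a scalar mean, variance and covariance, which is what keeps the local models linear.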
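Artificial neural networks (ANN) are among the mainstream algorithms in data analysis[21-22] and fit complex nonlinear functions well, so an ANN is chosen as the comparison method for verifying the superiority of the GMR-M model in predicting the mixed oil length. In addition, to show the importance of both integrating the mechanism formula and accounting for the multi-mode nature of the data, an ANN-M model is built by recombining the input variables. Unlike the GMR-M model, the ANN-M model does not identify multi-mode information in the data. The predictive models built purely on GMR and ANN take L, d and Re as input variables, whereas GMR-M and ANN-M take C_AP as the input variable.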
2 Case study
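A real data set of mixed oil lengths is used to verify the applicability of the GMR-M model. The predictions of the Austin-Palfrey formula, the two existing predictive models, the models built purely on the GMR and ANN algorithms, and the ANN-M model serve as baselines against which the predictive accuracy of the GMR-M model is compared. The number of modes is set to 3 for both GMR-M and GMR, and the number of hidden-layer neurons is set to 10 for both ANN and ANN-M. The root mean square error (RMSE), the max absolute error (MAE) and the coefficient of determination R² are used as the metrics of predictive performance, defined respectively as
$$E_{RMS} = \sqrt{\frac{1}{Q}\sum_{q=1}^{Q}\left(y_q - \hat{y}_q\right)^2}$$
$$E_{MA} = \max_{1 \le q \le Q}\left|y_q - \hat{y}_q\right|$$
$$R^2 = 1 - \frac{\sum_{q=1}^{Q}\left(y_q - \hat{y}_q\right)^2}{\sum_{q=1}^{Q}\left(y_q - \bar{y}\right)^2}$$
where $E_{RMS}$ and $E_{MA}$ are the root mean square error and the max absolute error; $y_q$, $\hat{y}_q$ and $\bar{y}$ are the actual value of a sample, its predicted value and the mean of the actual values, respectively; and $Q$ is the number of test samples.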
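The three metrics can be computed directly from paired lists of actual and predicted values. The sketch below uses the standard definitions of RMSE, max absolute error and R²; the function names are illustrative and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error over Q paired samples."""
    q = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / q)


def max_absolute_error(actual, predicted):
    """Largest absolute deviation of any single prediction."""
    return max(abs(y - p) for y, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot
```

The max absolute error reacts to a single badly predicted sample, whereas RMSE and R² summarize the whole test set, so a model can score well on the latter two and still show a large value of the former.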
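A larger R² indicates closer agreement between the predicted and actual values, whereas larger RMSE and MAE indicate that the predictions deviate further from the actual values. The samples come from production and operation data of three multi-product pipelines in southern China, collected by the SCADA (supervisory control and data acquisition) system. Basic information on some of the samples is given in Table 1.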
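A total of 1 948 samples from the first two pipelines are used to train the models, and 528 samples from the third pipeline form the test data set for evaluating the generalization performance of the different predictive models. The prediction metrics of each model are listed in Table 2.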
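Holding out an entire pipeline, rather than a random subset of samples, tests whether a model transfers to a pipeline it has never seen. A minimal sketch of such a split follows; the record layout and the `pipeline` key are hypothetical, and the values are toy data, not samples from the paper.

```python
def split_by_pipeline(samples, test_pipeline):
    """Put every sample of one pipeline in the test set and the rest in training."""
    train = [s for s in samples if s["pipeline"] != test_pipeline]
    test = [s for s in samples if s["pipeline"] == test_pipeline]
    return train, test


# HYPOTHETICAL toy records: inner diameter d (m), distance L (m), Reynolds number Re.
toy_samples = [
    {"pipeline": 1, "d": 0.25, "L": 40_000.0, "Re": 1.0e5},
    {"pipeline": 2, "d": 0.30, "L": 90_000.0, "Re": 2.0e5},
    {"pipeline": 3, "d": 0.35, "L": 60_000.0, "Re": 1.5e5},
]
train_set, test_set = split_by_pipeline(toy_samples, test_pipeline=3)
```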
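Table 2 shows that, among the existing predictive models, the RMSE of the Chen model exceeds that of the Austin-Palfrey formula, which indicates that its predictions fit the actual values poorly. The Yuan model performs relatively better, but ignoring the multi-mode characteristics of the data still distorts its predictions, and it shows no clear advantage over the existing mechanism formula. Because neural networks can fit complex nonlinear relationships, the ANN and ANN-M models predict the mixed oil length fairly well. Overall, the coefficient of determination R² of both exceeds 0.94, and their predictions are close to the actual values. Compared with the two existing predictive models, the models built on the ANN algorithm have a certain advantage. Moreover, compared with the ANN model, the ANN-M model, whose input variables are recombined with the existing formula, captures the functional mapping among variables more precisely and predicts somewhat better. However, the MAE of both is about 900 and exceeds that of the two existing predictive models, which shows that the neural-network-based models deviate severely on individual samples and that their predictive ability still needs improvement. These models ignore the fact that samples in the data set may come from different modes, between which the functional relationship among variables differs. As a result, even when an existing mechanism formula is integrated and a data analysis algorithm capable of nonlinear fitting is used, a global modeling method remains poorly suited to predicting samples that belong to different modes, and its predictions deviate from the actual situation.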
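The GMR model, by contrast, is not coupled with an existing mechanism formula, so it cannot effectively characterize the complex nonlinear relationship between the input and output variables, and its predictions are unsatisfactory. This is reflected in GMR having the highest RMSE and MAE, which shows that introducing the GMR algorithm directly can hardly solve the problem of predicting the mixed oil length. In contrast, the predictions of the GMR-M model, which integrates the mechanism expression and adopts local modeling, are closer to the real situation. The RMSE and R² of the GMR-M model are both clearly better than those of the other models. Specifically, GMR-M is the only model with an RMSE below 200, and it also has the lowest MAE, which shows that GMR-M has good predictive accuracy. These results verify the importance of combining a mechanism formula with local modeling for accurately predicting the mixed oil length in sequential transportation of multi-product pipelines.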
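Fig. 3 shows how the estimates of each model fit the actual values of the test samples. As Fig. 3 shows, the predictions of the Austin-Palfrey formula are lower than the actual mixed oil lengths. Lacking a mode identification step, the Chen and Yuan models show obvious prediction deviations. The predictive model built purely on the GMR algorithm is not coupled with the mechanism formula, so it overfits and its predictive accuracy is unsatisfactory. The predictions of ANN and ANN-M are fairly close to the actual values, but the GMR-M model, which integrates both the mechanism formula and multi-mode identification, shows the highest predictive accuracy and the best fit to the actual values, which indicates that it accurately captures the functional mapping between the input and output variables.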
3 Conclusions
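To overcome the shortcomings of existing methods for predicting the mixed oil length in sequential transportation of multi-product pipelines, a predictive model named GMR-M, which combines mechanism knowledge with the GMR algorithm, is proposed. Compared with the existing predictive models and the ANN model, the GMR-M model accounts for the multi-mode characteristics within the data and builds several local predictive models to complete the regression task, so it effectively improves the predictive accuracy of the mixed oil length. Compared with the existing mechanism formula and the predictive model built purely on the GMR algorithm, the GMR-M model couples the key information carried by field data with existing mechanism knowledge, so it characterizes the complex functional relationship between the input and output variables more effectively, and its predictions are closer to the real situation.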
References:
[1]HE Guoxi, LIN Mohan, WANG Baoying, et al. Experimental and numerical research on the axial and radial concentration distribution feature of miscible fluid interfacial mixing process in products pipeline for industrial applications[J]. International Journal of Heat & Mass Transfer, 2018,127:728-745.
[2]SHAHANDEH H, LI Z. Modeling and optimization of the upgrading and blending operations of oil sands Bitumen[J]. Energy & Fuels, 2016,30(JUL.SPEC.):5202-5213.
[3]MORADI S, MIRHASSANI S A. Robust scheduling for multi-product pipelines under demand uncertainty[J]. The International Journal of Advanced Manufacturing Technology, 2016,87(9):2541-2549.
[4]CAFARO V G, CAFARO D C, MÉNDEZ C A. Optimization model for the detailed scheduling of multi-source pipelines[J]. Computers & Industrial Engineering, 2015,88:395-409.
[5]AUSTIN J E, PALFREY J R. Mixing of miscible but dissimilar liquids in serial flow in a pipeline[J]. P Mech Eng B-J Eng, 1963,178:377-389.
[6]SUN Jianfei, LIANG Yongtu. Research progress on the mixed oil models for the batch transportation in products pipeline[J]. Oil & Gas Storage and Transportation, 2019,38(5):496-502. (in Chinese)
[7]XIA Zengyan, LIU Qingquan. Numerical simulation of the contamination between batches in multi-product pipeline transport[J]. Mechanics in Engineering, 2010,32(6):13-17. (in Chinese)
[8]MA Gang, BAI Rui. Study on the two-dimensional mixed oil theory numerical analysis for product oil pipelines[J]. Oil-Gas Field Surface Engineering, 2018,37(7):54-59. (in Chinese)
[9]WU Yuguo. Research on technology of the batch transportation of cold and hot crude oils[D]. Qingdao: China University of Petroleum (East China), 2010. (in Chinese)
[10]HE Yurong, SONG Zhichao, ZHANG Yanming, et al. Review on application of machine learning in hydraulic fracturing[J]. Journal of China University of Petroleum (Edition of Natural Science), 2021,45(6):127-135. (in Chinese)
[11]WANG Yansong, ZHAO Xing, LI Qiang, et al. Medium and long term power load prediction of offshore oil field based on oil and gas exploitation[J]. Journal of China University of Petroleum (Edition of Natural Science), 2021,45(2):127-133. (in Chinese)
[12]ZHANG Liming, CHEN Xinsheng, LI Guoxin, et al. An automatic history matching method based on ensemble and neural architecture search[J]. Journal of China University of Petroleum (Edition of Natural Science), 2022,46(2):127-136. (in Chinese)
[13]PAN Shaowei, WANG Chaoyang, ZHANG Yun, et al. Lithology identification based on LSTM neural networks completing log and hybrid optimized XGBoost[J]. Journal of China University of Petroleum (Edition of Natural Science), 2022,46(3):62-71. (in Chinese)
[14]CHEN L, YUAN Z Y, LIU G, et al. A novel predictive model of mixed oil length of products pipeline driven by traditional model and data[J]. Journal of Petroleum Science and Engineering,2021,205:108787.
[15]YUAN Z Y, CHEN L, SHAO W M, et al. A robust hybrid predictive model of mixed oil length with deep integration of mechanism and data[J]. Journal of Pipeline Science and Engineering, 2021,1(4):459-467.
[16]SOUZA F, RUI A. Mixture of partial least squares experts and application in prediction settings with multiple operating modes[J]. Chemometrics and Intelligent Laboratory Systems, 2013,130(2):192-202.
[17]SHAO W, GE Z, SONG Z. Soft-sensor development for processes with multiple operating modes based on semisupervised Gaussian mixture regression[J]. IEEE Transactions on Control Systems Technology, 2019,27(5):2169-2181.
[18]SHAO W, GE Z, SONG Z, et al. Data-driven predictive model based on locally weighted Bayesian Gaussian Regression:2019 IEEE 8th Data Driven Control and Learning Systems Conference (DDCLS)[C]. Dali: IEEE, 2019.
[19]WANG J, SHAO W, SONG Z. Bayesian regularized Gaussian mixture regression with application to soft sensor modeling for multi-mode industrial processes:2018 IEEE 7th Data Driven Control and Learning Systems Conference (DDCLS)[C]. Enshi: IEEE, 2018.
[20]SHAO W, XIAO C, WANG J, et al. Real-time estimation of quality-related variable for dynamic and non-Gaussian process based on semisupervised Bayesian HMM[J]. Journal of Process Control, 2022,111:59-74.
[21]SONG Xianzhi, ZHU Shuo, LI Gensheng, et al. Prediction of hook load and rotary drive torque during well-drilling using a BP-LSTM network[J]. Journal of China University of Petroleum (Edition of Natural Science), 2022,46(2):76-84. (in Chinese)
[22]ZHENG Qiumei, SHANG Zhenhao, WANG Fenghua, et al. Prediction model of submarine pipeline hydrate formation based on deep neural network and support vector machines[J]. Journal of China University of Petroleum (Edition of Natural Science), 2020,44(5):46-51. (in Chinese)
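(Edited by SHEN Yuying)
Received date: 2022-08-12
Funding: National Key Research and Development Program of China (2021YFA1000104); National Natural Science Foundation of China (52174068); Independent Innovation Fund for Central Universities (22CX01001A-5)
First author: YUAN Ziyun (1998-), male, PhD candidate, whose research interest is big data analysis of oil and gas pipeline networks. E-mail: yuanziyun@s.upc.edu.cn.
Corresponding author: LIU Gang (1975-), male, professor, PhD, doctoral supervisor, whose research interest is the application of data mining and intelligent decision-making to oil and gas pipeline systems. E-mail: liugang@upc.edu.cn.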