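Forecast of NO2 concentrations based on coupled air quality model simulations and monitoring data using machine learning method
HUANG Yong-xi1, ZHU Yun1*, XIE Yang-hong2, LI Hai-xian2, ZHANG Zhi-cheng1, LI Jie1, LI Jin-ying1, YUAN Ying-zhi1
(1. Guangdong Provincial Key Laboratory of Atmospheric Environment and Pollution Control, College of Environment and Energy, South China University of Technology, Guangzhou 510006, China; 2. Shunde Branch, Foshan Ecology and Environment Bureau, Foshan 528300, China)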
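Building on air quality model forecast data, a feedforward neural network (FNN) was combined, via the Lasso algorithm, with a long short-term memory network (LSTM) trained on real-time observations of pollutant concentrations and meteorology, yielding a simulation and observation machine learning (SOML) forecast model. The model was used to forecast daily NO2 concentrations for the next 3 d at 10 town and subdistrict air quality monitoring sites in Shunde District, Foshan. The results show that: SOML was more accurate than WRF-CMAQ and the other single models on all 3 d, with a first-day mean absolute error (MAE) of 4.99 μg/m3, an improvement of 66.18%; SOML was applicable in all seasons, with forecasts in the four seasons clearly better than those of WRF-CMAQ (MAE reduced by 42.18%, 42.89%, 61.04% and 50.91%, respectively) and the largest gains in autumn and winter; compared with WRF-CMAQ, SOML forecasts better reflected the actual spatial distribution and magnitude of NO2 concentrations across the sites in Shunde, effectively improving forecast accuracy.
NO2 concentration forecast; machine learning; forecast model; WRF-CMAQ model; air quality monitoring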
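Nitrogen dioxide (NO2) is one of the six criteria air quality pollutants and contributes substantially to air pollution. It is a major precursor of ozone and particulate matter and leads to photochemical pollution and haze [1-2]. Epidemiological studies indicate that long-term exposure to excessive NO2 may raise the risk of illness and death from respiratory and cardiovascular disease [3]. Improving the accuracy of NO2 concentration forecasts is therefore a key step in anticipating NO2 pollution, planning enhanced emission reductions in advance, and reducing adverse effects on the atmospheric environment and human health. However, NO2 takes part in many atmospheric chemical reactions that depend on emission intensity, meteorological conditions and other reactants, so near-surface NO2 concentrations vary in a markedly nonlinear way [4-5] and are hard to forecast accurately.
Common NO2 forecasting methods fall into two classes: statistical forecasting and model-based (numerical) forecasting. Statistical forecasting builds mathematical-statistical models from historical pollutant concentration and meteorological data [6-8]; it is easy to implement and widely applicable [9-11], but it does not represent changes in key factors such as pollutant emissions and atmospheric physicochemical processes [12-13]. Model-based forecasting, by contrast, accounts for meteorology, emissions and atmospheric physicochemical mechanisms and is a more advanced approach [14-16], but because of large uncertainties in reaction mechanisms, input inventories and meteorological simulation, its output still deviates considerably from the true values [13,17].
To correct model forecast bias, many studies post-process model output with machine learning methods such as neural networks, random forests and extremely randomized trees. These methods can capture the complex nonlinear relationships between simulated pollutant and meteorological values and the observed target pollutant, improving forecast accuracy [18-20]. At present, post-processing corrections are mostly used to improve ozone and particulate matter forecasts [21], and studies targeting NO2 are relatively few. Existing work shows that correction performance depends strongly on the performance of the underlying model [22]: when the simulated trend departs too far from reality, the accuracy of the post-processed forecast drops markedly. Combining simulation correction with forecast modelling based on real-time observations brings the actual trends of observed pollutant concentrations and meteorology into the forecast and further improves existing correction models [23]; it also addresses problems such as the low interpretability of statistical models built on observations alone [24-25]. Accordingly, this paper proposes an NO2 concentration forecasting method based on machine learning with air quality simulations and observations. Starting from model forecast output, a feedforward neural network is selected through multi-model comparison to post-process the model forecast; a long short-term memory network is built on observed pollutant concentrations and meteorology; and the two models are combined with the Lasso method to obtain the simulation and observation machine learning forecast model. Taking Shunde District as an example, the model's performance in forecasting daily NO2 concentrations for the next 3 d at air quality monitoring sites is verified.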
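The technical route of this study is shown in Figure 1. First, the WRF-CMAQ air quality forecasting system is set up to obtain simulated meteorology and concentrations of the six criteria pollutants. Then, with daily observed NO2 as the target, a greedy feature selection method is used to choose feature variable sets ①~④ from the simulated values, which are fed into four models: feedforward neural network (FNN), random forest (RF), support vector regression (SVR) and extreme gradient boosting (XGBoost); the model with the best forecast performance is retained. In parallel, real-time observations of NO2, other pollutants and meteorology at the target site, after feature selection and determination of the input sequence length, are used to train a long short-term memory network (LSTM). The trained LSTM is then combined with the selected post-processing model using the Lasso method to obtain the simulation and observation machine learning (SOML) model for forecasting daily NO2 concentrations at the target site for the next 3 d. Model performance is evaluated by comparing forecast accuracy before and after combination.
Figure 1 Technical route of the study
Feature variables ①~⑤ are detailed in Table 1 and Table 3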
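This study obtains simulated meteorology and pollutant concentrations from a WRF-CMAQ forecasting system that couples the mesoscale meteorological model WRF v3.9.1 with the air quality prediction and assessment system CMAQ v5.3.2. The simulation domain uses four nested grids with resolutions of 27, 9, 3 and 1 km for domains 1~4; the outermost domain covers East Asia, and the innermost (Figure 2) covers all of Shunde District. WRF provides simulated values of 14 meteorological variables such as wind direction and wind speed (Table 1) as well as the meteorological fields that drive CMAQ. Its initial meteorological data come from the Global Forecast System (GFS) dataset of the US National Oceanic and Atmospheric Administration (NOAA) [26], with 34 vertical layers. CMAQ provides simulated concentrations of the six criteria pollutants NO2, SO2, PM10, PM2.5, O3 and CO. The national air pollutant emission inventory provided by Tsinghua University, with a base year of 2017, is used as input for domains 1 and 2 [27]; domains 3 and 4 use the 2019 emission inventories for Guangdong Province and Shunde District compiled by the authors' group [28]. Other WRF-CMAQ settings follow reference [29]. The model was run for 1 January 2019 to 29 July 2022 with a forecast lead time of 3 d.
Figure 2 Innermost WRF-CMAQ simulation domain
Numbers 1~10 denote the Shiling Park, Sugang, Lunjiao, Beijiao, Chencun, Shijilian, Longjiang, Leliu, Xingtan and Jun'an sites, respectively. Shiling Park and Sugang are national control sites, marked with stars; the other sites are municipal control sites, marked with triangles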
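Because observed pollutant concentrations and meteorology up to 07:00 on the forecast day are available in operational forecasting [30], this study collected hourly observations of the six criteria pollutants and concurrent wind speed, wind direction, temperature, humidity and pressure at the 10 town and subdistrict monitoring sites in Shunde from 00:00 on 1 January 2019 to 07:00 on 29 July 2022 for observation-based modelling, and compiled daily NO2 monitoring data at each site from 1 January 2019 to 31 July 2022 as the model target. All data come from the atmospheric environment monitoring platform of the Shunde District Environmental Monitoring Station, Foshan [31]. The simulated and observed values were then Z-score standardized and missing values were treated, to remove the effect of differing units on modelling and improve forecast accuracy [32-33].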
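As a minimal sketch of the preprocessing step, the snippet below fills gaps in one variable and then applies Z-score standardization. The paper does not state how missing values were treated; linear interpolation between neighbouring values is an assumption made here for illustration, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev

def fill_missing(values):
    """Fill None gaps by linear interpolation; edge gaps copy the nearest value.
    (Interpolation is an assumed strategy, not the paper's stated method.)"""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled

def zscore(values):
    """Return (standardized values, mean, std) so the scaling can be reused."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("constant series cannot be standardized")
    return [(v - mu) / sigma for v in values], mu, sigma

no2 = fill_missing([30.0, None, 50.0, 40.0])   # hypothetical hourly NO2, ug/m3
scaled, mu, sigma = zscore(no2)
```

In practice the mean and standard deviation would usually be computed on the training period only and reused for the validation and test periods, so that no information from later data leaks into training.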
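1.3.1 Model selection  To choose the most suitable post-processing model for simulated NO2, four models were built and compared: FNN, RF, SVR and XGBoost. The data were split into a training set (January 2019 to July 2020), a validation set (August 2020 to July 2021) and a test set (August 2021 to July 2022); the training and validation sets were used to set model parameters and the test set to evaluate forecast performance. The meteorological and pollutant values simulated by WRF-CMAQ are the main model inputs. Because there are many simulated variables, selecting those with the greatest influence on the output reduces redundant information and improves modelling efficiency and accuracy [34]. Feature selection for the four models was therefore carried out with a greedy approach [35]: each feature is first fed into the model on its own, and the one giving the lowest mean absolute error is added to the feature set; each remaining feature is then tried in turn alongside those already selected, and the best is added; this is repeated until the error no longer decreases, giving the final feature variable sets (Table 1).
Table 1 Initial simulated variables and feature variables of each post-processing model
Note: feature variable sets ①~④ are the inputs to XGBoost, SVR, RF and FNN, respectively.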
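The greedy feature selection described in Section 1.3.1 can be sketched as a forward-selection loop. This is an illustrative outline only: the `score` callback stands in for "train the model on these features and return its validation MAE", and the feature names and the toy scoring function are hypothetical rather than taken from Table 1.

```python
def greedy_forward_selection(features, score):
    """Greedy forward selection.

    features: candidate feature names.
    score: callable taking a list of names and returning a validation error
           (e.g. MAE) for a model trained on those features; lower is better.
    """
    selected, best_err = [], float("inf")
    remaining = list(features)
    while remaining:
        # Try adding each remaining feature to the current set.
        trials = [(score(selected + [f]), f) for f in remaining]
        err, feat = min(trials)
        if err >= best_err:          # no candidate lowers the error: stop
            break
        selected.append(feat)
        remaining.remove(feat)
        best_err = err
    return selected, best_err

# Toy stand-in for "train the model and return its MAE" (purely hypothetical).
def toy_mae(names):
    gain = {"NO2_sim": 5.0, "wind_speed": 2.0, "temperature": 0.5}
    return 15.0 - sum(gain.get(n, -0.3) for n in names)

chosen, err = greedy_forward_selection(
    ["NO2_sim", "wind_speed", "temperature", "noise"], toy_mae)
```

Because the search never revisits earlier choices, it needs far fewer model fits than an exhaustive search over all subsets, at the cost of possibly missing a better combination.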
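The four post-processing models compared are briefly described as follows. FNN is a classic artificial neural network composed of neurons arranged in layers; it approximates the target function by finding a suitable network structure and weights [36-38]. RF is an extension of Bagging: it builds multiple decision trees by repeatedly sampling the input set with replacement and then combines the outputs of the trees by a weighted sum to obtain the final forecast [39-41]. SVR is derived from the support vector machine (SVM); it seeks a regression plane such that the distances from all data points in the set to the plane are minimized [42-43]. XGBoost is an improved Boosting method that adds subtrees to a single tree model in order to correct prediction residuals, and finally weights all trees to obtain the forecast [44-46]. The models were built in Python 3.8, and the best parameters were determined by combining randomized grid search with manual tuning; the main parameter settings are listed in Table 2. The better-performing model was then selected according to the evaluation metrics.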
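1.3.2 LSTM modelling  LSTM is a variant of the recurrent neural network (RNN) that is widely used in time series forecasting [47]. It replaces the neurons in the RNN hidden layer with memory cells that retain long- and short-term information and adds gate structures to update them; compared with the plain RNN, which is also common in time series forecasting, it is better at combining historical and current information to predict the future [48]. LSTM was therefore chosen for observation-based modelling. Its equations and training procedure are given in reference [49]. The same greedy feature selection was used to choose the LSTM input variables (Table 3). To improve forecast performance, 32 experiments with input sequence lengths of 1~32 h were compared, where 32 h is the sum of 00:00~07:00 on the forecast day and 00:00~23:00 on the previous day. After the optimal input length was determined to be 24 h (see Figure 3 for the comparison and analysis), the other LSTM parameters were tuned; the final settings were 1250 training iterations, a hidden layer dimension of 20 and an output layer dimension of 10.
Table 2 Parameter settings of each model
Table 3 Initial observed variables and feature variables of the LSTM model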
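To make the input sequence length concrete, the sketch below slices the hourly records that end at 07:00 on the forecast day, as described in Section 1.3.2. The data layout (a dictionary keyed by timestamp) and the function name are assumptions for illustration; the paper does not describe its data structures.

```python
from datetime import datetime, timedelta

def input_window(hourly, forecast_day, length=24, last_hour=7):
    """Return `length` hourly records ending at `last_hour`:00 on forecast_day.

    hourly: dict mapping datetime -> feature vector (list of floats).
    With length=24 the window runs from 08:00 on the previous day
    to 07:00 on the forecast day.
    """
    end = datetime(forecast_day.year, forecast_day.month, forecast_day.day, last_hour)
    times = [end - timedelta(hours=h) for h in range(length - 1, -1, -1)]
    return times, [hourly[t] for t in times]

# Hypothetical data: two days of hourly [NO2, wind speed] records.
start = datetime(2022, 7, 28)
hourly = {start + timedelta(hours=h): [20.0 + h, 2.0] for h in range(48)}
times, window = input_window(hourly, datetime(2022, 7, 29))
```

With `length=32` the same function reaches back to 00:00 on the previous day, which corresponds to the longest of the 32 experiments.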
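1.3.3 SOML modelling  After the post-processing model was selected and the observation-based forecast model was built, the two models were combined into the SOML model to pool their strengths and obtain more accurate forecasts [50-51]. Because the single models involved all have strong learning capacity, a highly complex combination method could aggravate overfitting [52-53]; the Lasso method was therefore used. It constructs a penalty term that shrinks the regression coefficients and thereby reduces model complexity. The model is generally written as:
$$y_i = \sum_{j=1}^{p} \beta_j x_{ij} + \beta_0$$
where $y_i$ is the $i$-th predicted value, here the forecast NO2 concentration on day $i$ ($i$ = 1, 2, 3); $x_{ij}$ is the forecast output of model $j$ on day $i$, so $p$ = 2 here; $\beta_j$ are the regression coefficients; and $\beta_0$ is the offset.
To make the regression coefficients $\beta = \{\beta_1, \beta_2, \ldots, \beta_p\}$ solvable, the Lasso objective function is set as:
$$\min_{\beta_0,\,\beta} \; \sum_{i} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} \left| \beta_j \right|$$
where, during fitting, $y_i$ is taken as the observed NO2 concentration and $\lambda \geq 0$ controls the strength of the penalty.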
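As a minimal sketch of how two forecasts can be combined with Lasso, the snippet below fits the coefficients by coordinate descent in plain Python. It is not the authors' implementation: it minimizes the common scaled form (1/2n) × squared error + λ × L1 penalty, so its `lam` differs from an unscaled λ by a constant factor, and all forecast and observation values are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_fit(X, y, lam, n_iter=500):
    """Minimize (1/2n) * sum (y_i - b0 - sum_j b_j x_ij)^2 + lam * sum_j |b_j|."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]   # centred inputs
    yc = [v - y_mean for v in y]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: leave feature j out of the current fit.
            r = [yc[i] - sum(beta[k] * Xc[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(Xc[i][j] * r[i] for i in range(n)) / n
            norm = sum(Xc[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / norm if norm > 0 else 0.0
    b0 = y_mean - sum(beta[j] * x_mean[j] for j in range(p))
    return beta, b0

def lasso_predict(X, beta, b0):
    """Combined forecast: offset plus weighted sum of the single-model forecasts."""
    return [b0 + sum(b * x for b, x in zip(beta, row)) for row in X]

# Hypothetical forecasts (ug/m3): columns are FNN and LSTM outputs.
X = [[30.0, 28.0], [45.0, 50.0], [25.0, 22.0], [60.0, 55.0], [38.0, 41.0]]
y = [29.0, 48.0, 23.0, 57.0, 40.0]      # hypothetical observed NO2
beta, b0 = lasso_fit(X, y, lam=0.1)
combined = lasso_predict(X, beta, b0)
```

A large penalty drives both coefficients to zero, leaving only the offset, while a small penalty approaches ordinary least squares; with only two inputs the combination stays simple, which is consistent with the stated aim of limiting overfitting.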
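To evaluate the forecast performance of the models quantitatively, the mean absolute error (MAE), root mean square error (RMSE) and correlation coefficient (r) were selected as evaluation metrics [47,54]. MAE directly reflects the size of the forecast error and RMSE emphasizes the dispersion between forecast and target values; both are in μg/m3, and smaller values indicate better forecasts. r reflects how similar the temporal trends of the forecast and target values are; it is dimensionless, and the closer it is to 1, the closer the forecast is to the target.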
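The three metrics can be written out as follows. This is a plain-Python sketch; the paper only says "correlation coefficient", so the use of the Pearson coefficient here is an assumption, and the sample values are hypothetical.

```python
from math import sqrt

def mae(pred, obs):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    """Root mean square error."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def pearson_r(pred, obs):
    """Pearson correlation coefficient between forecasts and observations."""
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / sqrt(var_p * var_o)

pred = [30.0, 42.0, 55.0]   # hypothetical forecasts, ug/m3
obs = [28.0, 45.0, 50.0]    # hypothetical observations, ug/m3
scores = (mae(pred, obs), rmse(pred, obs), pearson_r(pred, obs))
```

RMSE squares each error before averaging, so it penalizes occasional large misses more heavily than MAE does.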
Figure 3 shows the site-averaged MAE of the 32 experiments with different input sequence lengths. MAE generally decreases for input lengths up to 17 h, rises and then falls to its minimum for lengths of 18~24 h, and stays relatively high beyond 24 h. MAE is lowest at an input length of 24 h. This is because hourly NO2 concentrations usually show a bimodal pattern, with peaks at 06:00~08:00 and 18:00~22:00 each day [55-56]; in experiment 24 the LSTM input consists of hourly observations of each feature variable for the 24 h from 08:00 on the previous day to 07:00 on the forecast day, which spans both the morning and evening NO2 peaks, so the model learns more feature information and forecasts better. The LSTM input sequence length was therefore set to 24 h.
Figure 3 MAE for different input sequence lengths
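The forecasts of daily NO2 concentrations for the next 3 d from WRF-CMAQ and the four post-processing models (FNN, RF, SVR and XGBoost) were compared first; the site-averaged evaluation metrics are given in Table 4. All four post-processing models clearly improve on WRF-CMAQ, and FNN performs best: its 3 d mean MAE, RMSE and r are 8.61 μg/m3, 11.52 μg/m3 and 0.82, improvements of 41.24%, 39.86% and 27.73% over WRF-CMAQ. This may be because FNN has multiple nonlinear layers and can better capture the marked nonlinear correlations among the feature variables [57]. FNN was therefore chosen as the post-processing model.
Table 4 Site-averaged evaluation metrics of WRF-CMAQ and the post-processing models
Note: DAY1, DAY2, DAY3 and AVE denote forecast day 1, day 2, day 3 and the 3 d average, respectively.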
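After the post-processing model was built and selected, the LSTM model was built on real-time observations. As Figure 4 shows, LSTM forecasts best on the first day, with MAE, RMSE and r of 5.13 μg/m3, 7.10 μg/m3 and 0.93, all better than FNN on the same day. This is because, with real-time observations as input, the closer the forecast is to its start time, the closer the LSTM forecast is to the actual pollutant level. As the forecast horizon lengthens, however, the influence of real-time observations on the model weakens [49], and LSTM performance on days 2 and 3 drops markedly relative to day 1: MAE on day 2 is 99.90% higher than on day 1, and on day 3 it is 118.17% higher. Overall, LSTM performance differs considerably across the 3 d, and its metrics are less stable than those of FNN and WRF-CMAQ.
Figure 4 Site-averaged (a) MAE, (b) RMSE and (c) r of the WRF-CMAQ, FNN, LSTM and SOML models
Bars and numbers give the site-averaged metric values; error bars show the range between the minimum and maximum values across sites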
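As shown above, LSTM forecasts clearly better than FNN and the other models on the first day because it uses real-time observations, whereas the post-processing model FNN is more stable over the 3 d. To improve forecast performance further, FNN and LSTM were combined with the Lasso method to obtain the SOML model. As Figure 4 shows, SOML outperforms FNN, LSTM and the other models on every metric on all 3 d. Owing to the LSTM component, SOML forecasts best on the first day, with MAE, RMSE and r of 4.99 μg/m3, 6.78 μg/m3 and 0.94, improvements of 66.18%, 64.78% and 46.35% over WRF-CMAQ. Performance on days 2 and 3 declines somewhat relative to day 1, but SOML is more stable than LSTM across the 3 d (MAE on day 2 is 64.44% higher than on day 1 and on day 3 it is 69.49% higher, both smaller changes than for LSTM), thanks to the FNN component. In summary, SOML combines the strengths of the different models and thereby improves NO2 forecast performance.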
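The percentages quoted for SOML and LSTM can be turned back into approximate absolute values. The sketch below assumes that "improvement" means the relative reduction against WRF-CMAQ and that the day 2 and day 3 figures are relative increases over day 1; the derived numbers are implied by the rounded values in the text, not reported by the authors, so they are only approximate.

```python
def baseline_from_improvement(value, pct):
    """Error of the reference model implied by `value` being `pct` % lower."""
    return value / (1.0 - pct / 100.0)

def after_increase(value, pct):
    """Value after a `pct` % increase."""
    return value * (1.0 + pct / 100.0)

# MAE values in ug/m3, derived from the figures quoted in the text.
wrf_day1_mae = baseline_from_improvement(4.99, 66.18)   # implied WRF-CMAQ day 1
soml_day2_mae = after_increase(4.99, 64.44)             # implied SOML day 2
soml_day3_mae = after_increase(4.99, 69.49)             # implied SOML day 3
lstm_day2_mae = after_increase(5.13, 99.90)             # implied LSTM day 2
lstm_day3_mae = after_increase(5.13, 118.17)            # implied LSTM day 3
```

Under these assumptions the implied SOML MAE stays below the implied LSTM MAE on days 2 and 3, which is consistent with the statement that SOML is the more stable of the two.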
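2.3.1 Seasonal applicability  To test the accuracy and applicability of SOML in different seasons [58-60], the SOML and WRF-CMAQ forecasts were divided into spring (March~May), summer (June~August), autumn (September~November) and winter (December, January and February), and the 3 d mean of each metric was calculated for both models in each season (Figure 5). For MAE, SOML and WRF-CMAQ have similar seasonal patterns, with higher values in winter and the lowest in summer. SOML improves on the WRF-CMAQ MAE in all four seasons, by 42.18%, 42.89%, 61.04% and 50.91% for spring, summer, autumn and winter, respectively; the gains are larger in autumn and winter. This is because, compared with spring and summer, the number of days on which NO2 in Shunde reaches the good or lightly polluted grade rises markedly in autumn and winter; under polluted conditions NO2 changes are more strongly linked to pollutant concentrations and meteorology in the preceding hours, and SOML, which uses observations, learns these links better than WRF-CMAQ [59-61]. RMSE behaves similarly to MAE. For r, SOML is higher in autumn and winter and lower in spring and summer, and the gain over WRF-CMAQ is most marked in winter (from 0.41 to 0.85), again reflecting better learning by SOML in winter. Overall, SOML forecasts better than WRF-CMAQ and is better suited to forecasting in every season.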
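The seasonal evaluation amounts to grouping forecast errors by month and comparing the grouped MAE between models. A small sketch under that reading follows; the record layout, function names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

SEASONS = {3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "autumn", 10: "autumn", 11: "autumn",
           12: "winter", 1: "winter", 2: "winter"}

def seasonal_mae(records):
    """records: iterable of (date, forecast, observation) -> {season: MAE}."""
    errors = defaultdict(list)
    for day, pred, obs in records:
        errors[SEASONS[day.month]].append(abs(pred - obs))
    return {season: sum(e) / len(e) for season, e in errors.items()}

def improvement_pct(model_err, reference_err):
    """Relative error reduction of a model against a reference, in percent."""
    return 100.0 * (reference_err - model_err) / reference_err

records = [  # hypothetical (date, forecast, observed) triples, ug/m3
    (date(2021, 12, 5), 52.0, 60.0),
    (date(2022, 1, 10), 47.0, 45.0),
    (date(2022, 7, 2), 18.0, 21.0),
]
by_season = seasonal_mae(records)
```

Applying `improvement_pct` to the seasonal MAE of two models gives figures of the same kind as the 42.18% to 61.04% reductions quoted above.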
Figure 5 Site-averaged (a) MAE, (b) RMSE and (c) r of the WRF-CMAQ and SOML models in different seasons
Figure 6 Site-averaged MAE of the WRF-CMAQ and SOML models at different pollution grades
The applicability of SOML at different pollution grades was examined further. Following the individual air quality index (IAQI) values and corresponding pollutant concentration limits in "Individual air quality index (IAQI) and corresponding concentration limits of pollutants", the daily mean NO2 concentration at each site was classified as excellent, good or lightly polluted, and the MAE between forecast and true values was calculated for SOML and WRF-CMAQ at each grade over the test period (Figure 6). SOML improves on the WRF-CMAQ MAE at every grade: by 36.79% at the excellent grade, and by the larger margins of 64.81% and 59.31% at the good and lightly polluted grades. This again indicates that SOML, by using observations, better learns the links between NO2 changes and preceding pollutant concentrations and meteorology during pollution-prone periods, and so forecasts better.
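The grading step can be sketched as a simple threshold lookup followed by a grouped MAE. The breakpoints used below (40, 80 and 180 μg/m3 for the 24 h mean NO2, corresponding to IAQI 50, 100 and 150) are quoted from memory of the Chinese AQI technical regulation and should be checked against the standard; the paper itself does not list them, and the sample pairs are hypothetical.

```python
def no2_grade(daily_mean):
    """Grade a daily mean NO2 concentration (ug/m3) by its IAQI band.
    Thresholds are assumed values; verify against the official standard."""
    if daily_mean <= 40:
        return "excellent"         # IAQI 0-50
    if daily_mean <= 80:
        return "good"              # IAQI 51-100
    if daily_mean <= 180:
        return "lightly polluted"  # IAQI 101-150
    return "above lightly polluted"

def mae_by_grade(pairs):
    """pairs: iterable of (forecast, observation); grade is set by the observation."""
    sums, counts = {}, {}
    for pred, obs in pairs:
        g = no2_grade(obs)
        sums[g] = sums.get(g, 0.0) + abs(pred - obs)
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

pairs = [(35.0, 30.0), (25.0, 38.0), (70.0, 62.0), (95.0, 110.0)]  # hypothetical
result = mae_by_grade(pairs)
```

Grading by the observed value rather than the forecast keeps the groups identical for both models, so their MAE values are compared on the same days.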
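2.3.2 Site applicability  The means of observed NO2 and of SOML forecasts for the next 3 d were calculated for each site over the test period, and NO2 concentration maps of Shunde were drawn by kriging interpolation. As Figure 7 shows, SOML forecasts are higher at northern sites (such as Chencun, Beijiao, Shijilian, Longjiang and Lunjiao) and lower in the south, whereas WRF-CMAQ forecasts, influenced by the input inventory [62], are higher in the east and lower in the west. The SOML forecasts are thus more consistent with the observations in spatial distribution, which also matches the fact that northern Shunde is close to urban Foshan and Guangzhou, with a dense expressway network, many motor vehicles and many industrial clusters, and hence higher NO2 concentrations [63]. In addition, WRF-CMAQ forecasts are lower than the observations overall, while SOML forecasts are closer to them; SOML forecasts at northern sites are slightly higher than observed, but the 3 d mean difference is only 2.07 μg/m3, so SOML generally reflects the actual NO2 levels at each site better. Compared with WRF-CMAQ, whose forecast error is relatively high, SOML therefore forecasts clearly better at every site. As noted above, this is because SOML combines the strengths of post-processing of simulated values and modelling on real-time observations, and can capture the complex nonlinear relationships between the input data and the target. In summary, SOML is better suited to forecasting at different sites.
Figure 7 Distribution of observed NO2 and of SOML and WRF-CMAQ forecasts for the next 3 d at sites in the ten towns and subdistricts of Shunde District, 1 August 2021 to 29 July 2022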
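This study also has the following limitations: 1) forecasts were made only for Shunde, and SOML performance in other regions of China was not verified; 2) only daily NO2 forecasts for the next 3 d were made, without extension to hourly forecasts over 3 d. Future work could apply SOML to larger regions (such as the Pearl River Delta) and to hourly pollutant forecasts, to test its applicability further and refine forecast resolution, providing scientific support for NO2 pollution control and better ambient air quality.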
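3.1 Among the four WRF-CMAQ post-processing models (FNN, RF, SVR and XGBoost), FNN performed best. The LSTM model built on real-time observations forecast best on the first day and poorly on the following 2 d. Combining FNN and LSTM gave the SOML model, which pools the strengths of both and produces daily NO2 forecasts for the next 3 d that are clearly better than those of WRF-CMAQ, FNN and LSTM; the first day is best (MAE of 4.99 μg/m3, 66.18% better than WRF-CMAQ), with a slight decline on days 2 and 3.
3.2 Seasonal evaluation shows that the SOML absolute error is higher in winter and lowest in summer (MAE of 10.90 and 4.71 μg/m3, respectively), while r is higher in autumn and winter and lower in spring and summer. Compared with WRF-CMAQ, SOML forecasts improve clearly in all four seasons, with the largest accuracy gains in autumn and winter (3 d MAE improved by 61.04% and 50.91%, respectively). Overall, SOML is suitable for forecasting daily mean NO2 concentrations throughout the year.
3.3 At the 10 town and subdistrict air quality monitoring sites in Shunde District, SOML forecasts, like the observations, are higher in the north and lower in the south, and they reflect actual NO2 concentration levels better than WRF-CMAQ, effectively improving the accuracy of NO2 concentration forecasts.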
[1] Zhu Y, Zhan Y, Wang B, et al. Spatiotemporally mapping of the relationship between NO2 pollution and urbanization for a megacity in Southwest China during 2005~2016 [J]. Chemosphere, 2019,220:155-162.
[2] Vîrghileanu M, Săvulescu I, Mihai B, et al. Nitrogen Dioxide (NO2) Pollution Monitoring with Sentinel-5P Satellite Imagery over Europe during the Coronavirus Pandemic Outbreak [J]. Remote Sensing, 2020,12(21):3575.
[3] Huang S, Li H, Wang M, et al. Long-term exposure to nitrogen dioxide and mortality: A systematic review and meta-analysis [J]. Science of the Total Environment, 2021,776:145968.
[4] Chen J, Jiang Z, Li R, et al. Large discrepancy between observed and modeled wintertime tropospheric NO2 variabilities due to COVID-19 controls in China [J]. Environmental Research Letters, 2022,17(3):35007.
[5] Chi Y, Fan M, Zhao C, et al. Machine learning-based estimation of ground-level NO2 concentrations over China [J]. Science of the Total Environment, 2022,807:150721.
[6] Lei M, Monjardino J, Mendes L, et al. Statistical forecast applied to two Macao air monitoring stations [J]. IOP Conference Series: Earth and Environmental Science, 2020,489(1):12018.
[7] Navares R, Aznarte J L. Predicting air quality with deep learning LSTM: Towards comprehensive models [J]. Ecological Informatics, 2020,55:101019.
[8] Mao W, Jiao L, Wang W. Long time series ozone prediction in China: A novel dynamic spatiotemporal deep learning approach [J]. Building and Environment, 2022,218:109087.
[9] Wu Q, Lin H. A novel optimal-hybrid model for daily air quality index prediction considering air pollutant factors [J]. Science of The Total Environment, 2019,683:808-821.
[10] Zhao Z, Wu J, Cai F, et al. A statistical learning framework for spatial-temporal feature selection and application to air quality index forecasting [J]. Ecological Indicators, 2022,144:109416.
[11] Li N. Lanzhou city based on EEMD-LSTM-ARIMA air quality prediction study [D]. Lanzhou: Lanzhou University of Finance and Economics, 2022. (in Chinese)
[12] Wang Q, Wu J, Lin Y. Implementation of a dynamic linear regression method on the CMAQ forecast of PM2.5 in Shanghai [J]. Acta Scientiae Circumstantiae, 2015,35(6):1651-1656. (in Chinese)
[13] Yan R, Liao J, Yang J, et al. Multi-hour and multi-site air quality index forecasting in Beijing using CNN, LSTM, CNN-LSTM, and spatiotemporal clustering [J]. Expert Systems with Applications, 2021,169:114513.
[14] Zhang Y, Bocquet M, Mallet V, et al. Real-time air quality forecasting, part I: History, techniques, and current status [J]. Atmospheric Environment, 2012,60:632-655.
[15] Bai L, Wang J, Ma X, et al. Air pollution forecasts: An overview [J]. International Journal of Environmental Research and Public Health, 2018,15(4):780.
[16] Qiao Z, Cui S, Pei C, et al. Regional predictions of air pollution in Guangzhou: Preliminary results and multi-model cross-validations [J]. Atmosphere, 2022,13(10):1527.
[17] Zhang Y, Bocquet M, Mallet V, et al. Real-time air quality forecasting, part II: State of the science, current research needs, and future prospects [J]. Atmospheric Environment, 2012,60:656-676.
[18] Sayeed A, Eslami E, Lops Y, et al. CMAQ-CNN: A new-generation of post-processing techniques for chemical transport models using deep neural networks [J]. Atmospheric Environment, 2022,273:118961.
[19] Meng X, Wang W, Shi S, et al. Evaluating the spatiotemporal ozone characteristics with high-resolution predictions in mainland China, 2013~2019 [J]. Environmental Pollution, 2022,299:118865.
[20] Huang C W, Chen B Z, Ma C Q, et al. WRF-CMAQ-MOS studies based on extremely randomized trees [J]. Acta Meteorologica Sinica, 2018,76(5):779-789. (in Chinese)
[21] Petetin H, Bowdalo D, Bretonnière P, et al. Model output statistics (MOS) applied to Copernicus Atmospheric Monitoring Service (CAMS) O3 forecasts: trade-offs between continuous and categorical skill scores [J]. Atmospheric Chemistry and Physics, 2022,22(17):11603-11630.
[22] Sayeed A, Eslami E, Lops Y, et al. CMAQ-CNN: A new-generation of post-processing techniques for chemical transport models using deep neural networks [J]. Atmospheric Environment, 2022,273:118961.
[23] Catalano M, Galatioto F. Enhanced transport-related air pollution prediction through a novel metamodel approach [J]. Transportation Research Part D: Transport and Environment, 2017,55:262-276.
[24] Zhou H, Zhang F, Du Z, et al. A theory-guided graph networks based PM2.5 forecasting method [J]. Environmental Pollution, 2022,293:118569.
[25] Liu B, Yu X, Chen J, et al. Air pollution concentration forecasting based on wavelet transform and combined weighting forecasting model [J]. Atmospheric Pollution Research, 2021,12(8):101144.
[26] Xiao Y. Research and application of an ensemble forecasting method based on coupled multi-machine learning algorithms [J]. Research of Environmental Sciences, 2022,35(12):2693-2701. (in Chinese)
[27] Li M, Liu H, Geng G, et al. Anthropogenic emission inventories in China: a review [J]. National Science Review, 2017,4(6):834-866.
[28] Chen W, Li H, Zhu Y, et al. Impact Assessment of Energy Transition Policy on Air Quality over a Typical District of the Pearl River Delta Region, China [J]. Aerosol and Air Quality Research, 2022,22(7):220071.
[29] Chen Y, Zhu Y, Lin C, et al. Response surface model based emission source contribution and meteorological pattern analysis in ozone polluted days [J]. Environmental Pollution, 2022,307:119459.
[30] Cheng X H, Diao Z G, Hu J K, et al. Dynamical-statistical forecasting of PM2.5 concentration based on CMAQ model and adapting partial least square regression method in China [J]. Acta Scientiae Circumstantiae, 2016,36(8):2771-2782. (in Chinese)
[31] Xiu C. Characteristics of the PM2.5 pollution process of 2017 in Shunde District of Foshan and the improvement strategy [J]. Guangdong Chemical Industry, 2018,45(16):54-56. (in Chinese)
[32] Ye Y J. Short-term air quality prediction based on ARIMA-LSTM hybrid model [D]. Tianjin: Tianjin University of Commerce, 2022. (in Chinese)
[33] Zhao Q J. Prediction of Shanghai air quality index based on RF-CRNN model [D]. Shanghai: Shanghai Normal University, 2022. (in Chinese)
[34] Ghasemi A, Amanollahi J. Integration of ANFIS model and forward selection method for air quality forecasting [J]. Air Quality, Atmosphere & Health, 2019,12(1):59-72.
[35] Chen Q. Feature selection based on stochastic distributed greedy algorithm [D]. Shanghai: East China Normal University, 2019. (in Chinese)
[36] Ojha V K, Abraham A, Snášel V. Metaheuristic design of feedforward neural networks: A review of two decades of research [J]. Engineering Applications of Artificial Intelligence, 2017,60:97-116.
[37] Sousa S, Martins F, Alvimferraz M, et al. Multiple linear regression and artificial neural networks based on principal components to predict ozone concentrations [J]. Environmental Modelling & Software, 2007,22(1):97-103.
[38] Chen P, Liu W, Zheng W. Prediction method of related meteorological indexes of Kunming climate comfort based on LSTM and FNN [J]. Journal of Computer Applications, 2021,41(S2):113-117. (in Chinese)
[39] Rodriguez-Galiano V, Mendes M P, Garcia-Soldado M J, et al. Predictive modeling of groundwater nitrate pollution using Random Forest and multisource variables related to intrinsic and specific vulnerability: A case study in an agricultural setting (Southern Spain) [J]. Science of The Total Environment, 2014,476-477:189-206.
[40] Chen G, Chen J, Dong G, et al. Improving satellite-based estimation of surface ozone across China during 2008~2019 using iterative random forest model and high-resolution grid meteorological data [J]. Sustainable Cities and Society, 2021,69:102807.
[41] Matin S S, Hower J C, Farahzadi L, et al. Explaining relationships among various coal analyses with coal grindability index by Random Forest [J]. International Journal of Mineral Processing, 2016,155: 140-146.
[42] Cheng Z, Zhang S, Zhang Z. Predictive control for coke oven blowing cooler system based on SVR [C]. IEEE, 2019.
[43] Yin B W, Zhang Y J, Wang X F, et al. Urban PM2.5 forecasting based on support vector regression and LSTM [J]. Journal of Hebei University of Technology, 2022,51(3):1-9. (in Chinese)
[44] Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System [C]. Ithaca: ACM, 2016.
[45] Hu Z Z, Chen C F, Hu B J. Estimating PM2.5 concentrations across China based on space-time XGBoost approach [J]. Acta Scientiae Circumstantiae, 2021,41(10):4228-4237. (in Chinese)
[46] Zhou H Z, Chen H R, Liao P, et al. A study on the revision method of CMAQ ozone prediction results in Lanzhou City: Based on machine learning methods [J]. China Environmental Science, 2022,42(12):5472-5483. (in Chinese)
[47] Kokkinos K, Karayannis V, Nathanail E, et al. A comparative analysis of Statistical and Computational Intelligence methodologies for the prediction of traffic-induced fine particulate matter and NO2 [J]. Journal of Cleaner Production, 2021,328:129500.
[48] Zhao J, Deng F, Cai Y, et al. Long short-term memory - Fully connected (LSTM-FC) neural network for PM2.5 concentration prediction [J]. Chemosphere, 2019,220:486-492.
[49] Mao W, Jiao L, Wang W. Long time series ozone prediction in China: A novel dynamic spatiotemporal deep learning approach [J]. Building and Environment, 2022,218:109087.
[50] Valsecchi C, Grisoni F, Consonni V, et al. Consensus versus Individual QSARs in Classification: Comparison on a Large-Scale Case Study [J]. Journal of Chemical Information and Modeling, 2020,60(3):1215-1223.
[51] Zhang J, Tan Z, Wei Y. An adaptive hybrid model for short term electricity price forecasting [J]. Applied Energy, 2020,258:114087.
[52] Tibshirani R. Regression shrinkage and selection via the lasso: a retrospective [J]. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2011,73(3):273-282.
[53] Wang J Y. Research on integrated model of crude oil price yield forecast based on LASSO [D]. Chengdu: Southwest University of Finance and Economics, 2021. (in Chinese)
[54] Donnelly A, Misstear B, Broderick B. Real time air quality forecasting using integrated parametric and non-parametric regression techniques [J]. Atmospheric Environment, 2015,103:53-65.
[55] Liang G L, Guan Y P. Analysis of the concentration variation characteristic of O3 and NO2 in Nanhai District air [J]. Environmental Science & Technology, 2013,36(S1):110-112. (in Chinese)
[56] Fan J L, Hu Z Y. Dynamics of atmospheric NO2 concentration in a forest eco-system at Yingtan, Jiangxi Province [J]. China Environmental Science, 2006,26(2):171-175. (in Chinese)
[57] Guo Q, He Z, Li S, et al. Air pollution forecasting using artificial and wavelet neural networks with meteorological conditions [J]. Aerosol and Air Quality Research, 2020,20(6):1429-1439.
[58] Chi Y, Fan M, Zhao C, et al. Machine learning-based estimation of ground-level NO2 concentrations over China [J]. Science of The Total Environment, 2022,807:150721.
[59] Xu J, Lindqvist H, Liu Q, et al. Estimating the spatial and temporal variability of the ground-level NO2 concentration in China during 2005~2019 based on satellite remote sensing [J]. Atmospheric Pollution Research, 2021,12(2):57-67.
[60] Xiao Z Y, Xie X Q, Chen Y F, et al. Temporal and spatial characteristics and influencing factors of NO2 pollution over Guangdong-Hong Kong-Macao Greater Bay Area, China [J]. China Environmental Science, 2020,40(5):2010-2017. (in Chinese)
[61] Lu H, Xie M, Wu Z, et al. Adjusting PM2.5 prediction of the numerical air quality forecast model based on machine learning methods in Chengyu region [J]. Acta Scientiae Circumstantiae, 2020,40(12):4419-4431. (in Chinese)
[62] Liao Q X. The research of the optimization of Shunde's Industrial construction [D]. Lanzhou: Lanzhou University, 2010. (in Chinese)
[63] Shunde Branch of Foshan Ecological Environment Bureau. Shunde District's 14th Five Year Plan for Environmental Air Quality Compliance (2021-2025) (Draft for Review) [EB/OL]. http://www.shunde.gov.cn/sdqsthj/tzggjdt/content/post_5447749.html 2022-11-15. (in Chinese)
Forecast of NO2 concentrations based on coupled air quality model simulations and monitoring data using machine learning method.
HUANG Yong-xi1, ZHU Yun1*, XIE Yang-hong2, LI Hai-xian2, ZHANG Zhi-cheng1, LI Jie1, LI Jin-ying1, YUAN Ying-zhi1
(1. Guangdong Provincial Key Laboratory of Atmospheric Environment and Pollution Control, College of Environment and Energy, South China University of Technology, Guangzhou 510006, China; 2. Foshan Ecology and Environment Bureau, Shunde Branch, Foshan 528300, China). China Environmental Science, 2023,43(12):6225~6234
In this study, built upon WRF-CMAQ air quality model simulations, a novel machine learning method based on simulations and observations (SOML) was developed for forecasting NO2 concentrations. It integrates a feedforward neural network (FNN) and a long short-term memory network (LSTM) through the Lasso method, where the LSTM is built on real-time pollutant and meteorological observations. The method was applied to forecast NO2 concentrations for three consecutive days at ten air quality monitoring stations in Shunde, Foshan to evaluate model performance. The results show that: compared with WRF-CMAQ and the other individual models, SOML gave higher accuracy in the three-day forecast of NO2 concentrations, with a first-day mean absolute error (MAE) of 4.99 μg/m3, a decrease of 66.18%; the accuracy of SOML forecasts improved significantly over that of WRF-CMAQ in all seasons (MAE decreased by 42.18%, 42.89%, 61.04% and 50.91%, respectively), particularly in autumn and winter; and compared with WRF-CMAQ, SOML better reproduced the spatial distribution and the levels of NO2 concentrations at each station in Shunde.
NO2 concentration forecast; machine learning; forecast model; WRF-CMAQ model; air quality monitoring
CLC number: X511
Document code: A
Article ID: 1000-6923(2023)12-6225-10
Huang Y X, Zhu Y, Xie Y H, et al. Forecast of NO2 concentrations based on coupled air quality model simulations and monitoring data using machine learning method [J]. China Environmental Science, 2023,43(12):6225-6234. (in Chinese)
Received: 2023-04-23
Funding: High-end Foreign Experts Recruitment Program (G2023163014L)
* Corresponding author, Professor, zhuyun@scut.edu.cn
HUANG Yong-xi (1999-), female, from Shantou, Guangdong; master's student at South China University of Technology; research focuses on air quality simulation and forecasting. 1663795613@qq.com.