Evaluation of deep learning algorithms for crop identification based on GF-6 time series data
Chen Shiyang, Liu Jia※
(Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing 100081, China)
Crop type mapping is an important task in agricultural remote sensing. Using GF-6 time series data, this study qualitatively and quantitatively evaluated six deep learning models, built on convolutional, recurrent, and attention mechanisms, for crop type mapping in Heilongjiang Province. The results show that all models achieved F1-scores of no less than 89%, 84%, and 97% for the three major crops of soybean, maize, and rice, with overall classification accuracies of 93%-95%. After the models were transferred to a different site, overall classification accuracy dropped by 7.2%-41.0%; the convolution- and recurrence-based deep learning models retained strong crop identification ability and outperformed the attention-based deep learning model and the random forest model. In terms of time consumption, the training and inference time of each deep learning model was no more than 6.2 times that of the random forest model. GF-6 time series data combined with deep learning models thus meet the classification accuracy and operating efficiency requirements of high-precision, large-area crop mapping, with better transferability than the traditional model. The results provide a reference for applying deep learning to crop classification from remote sensing data in Heilongjiang.
crops; remote sensing; recognition; deep learning; GF-6; time series; Heilongjiang
Crop type mapping with medium-to-high spatial resolution remote sensing data is one of the most important management tools in agricultural monitoring. Heilongjiang Province is a major grain-producing region of China and plays an important role in national food security; timely and accurate knowledge of the planting distribution and area of its crops is of great significance for crop yield estimation and the formulation of agricultural production policy. In recent years, the GF-6 satellite has been operating stably in orbit, continuously providing high-quality data with a 4-d revisit period. Time series covering more crop growth stages improve crop identification accuracy [1] but also multiply the volume of data to be processed, making efficient, high-performing machine learning classifiers all the more important. Many studies at home and abroad have addressed crop classification from remote sensing, but the traditional classifiers they use, such as minimum distance, support vector machines, and Random Forest (RF), suffer from two problems: first, they cannot extract deep representations, only a single or a small number of shallow, low-level features; second, they apply only to a specific region and period and transfer poorly [2]. Recently, deep learning methods for analyzing time series remote sensing data have proliferated. Deep learning provides effective support for the analysis of complex data; in particular, the Convolutional Neural Network (CNN) [3] and the Recurrent Neural Network (RNN) [4] have been shown to effectively exploit spatial and temporal structure and have been applied to crop identification from remote sensing.
CNNs have been widely applied in remote sensing tasks, including land cover classification of very-high-resolution images [5-6], semantic segmentation [7], object detection [8], data imputation [9], and data fusion [10]. In these applications, CNNs exploit the spatial or temporal structure of the data by applying convolutions along different dimensions. For land cover classification, this includes the 1D-CNN across the spectral or temporal dimension [11-12], the 2D-CNN across the spatial dimensions [13], the 3D-CNN across the spectral and spatial dimensions [14], and the 3D-CNN across the temporal and spatial dimensions [15]. Although the 1D-CNN has long been used in time series classification [16], it was applied to land cover mapping only recently [17]; for example, Pelletier et al. [12] developed a 1D-CNN with convolutions in the temporal domain to qualitatively and quantitatively assess the influence of network architecture on crop mapping. RNNs take sequential data as input, recursing in the forward direction while retaining features from the preceding context. The RNN is the architecture most used in time series classification and has been successfully applied to land cover classification of optical time series [18-20] and multi-temporal synthetic aperture radar data [21]; for example, Campos-Taberner et al. [22] used Sentinel-2 data and an RNN based on a two-layer bidirectional long short-term memory network to reach 98.7% overall accuracy in crop mapping for the province of Valencia, Spain, and applied noise permutation in the temporal and spectral domains to assess the contribution of different spectral and temporal features to classification accuracy. CNNs have also been combined with RNNs for crop classification [23] and remote sensing change detection [24-25]. The Attention Mechanism (AM) [26] differs from the traditional CNN or RNN in consisting only of self-attention and feed-forward layers. In remote sensing image processing, attention has been used to improve very-high-resolution image classification [27-28] and to capture spatial and spectral dependencies [29]. Ru?wurm et al. [23] proposed an encoder with convolutional recurrent layers and introduced attention in a crop classification experiment on Sentinel-2 time series in Bavaria, Germany, qualitatively showing how self-attention extracts classification-relevant features.
Although crop classification algorithms have been studied extensively in recent years, time series from the GF-6 satellite Wide Field of View camera (GF-6/WFV) have seen little use, leaving its high temporal resolution unexploited for crop classification. Moreover, most crop classification studies use deep learning models derived from semantic segmentation in computer vision and neglect models from the time series domain, which can recognize the distinctive temporal signals of crops at different growth stages and use them as key discriminative features. This study therefore uses GF-6/WFV time series data and time series deep learning models to evaluate and compare the performance of convolutional neural networks, recurrent neural networks, the attention mechanism, and a traditional algorithm for crop classification and mapping in Heilongjiang, providing a reference for applying deep learning to crop classification from remote sensing.
Lindian County and Hailun City in Heilongjiang Province were selected as study areas. The main crops in both are rice, maize, and soybean, and their cropping structure is representative of the plains of Heilongjiang, making them suitable for assessing model transferability. The phenological calendar of the main crops is given in Table 1. The study areas lie at the northeastern end of the Songnen Plain in a north-temperate continental monsoon climate, with an annual mean temperature of 4 °C, a frost-free period of about 120 d, and annual precipitation of 400-600 mm. Lindian County covers 3 503 km2 with about 166 000 hm2 of cultivated land; Hailun City, 145 km east of Lindian, covers 4 667 km2 with about 294 000 hm2 of cultivated land. The locations of the study areas within Heilongjiang are shown in Fig.1, and GF-6/WFV false-color composites are shown in Fig.2.
Table 1 Phenological calendar of main crops in the study areas
GF-6/WFV has a spatial resolution of 16 m, an observation swath of 800 km, and a revisit period of 4 d. Compared with other Gaofen satellites it adds red-edge and yellow bands, making it the first domestic high-resolution satellite designed for precision agriculture observation; its spectral response functions are shown in Fig.3.
Note: Green square areas were analyzed visually.
Fig.2 GF-6/WFV false-color (near-infrared, red, green) composite images of the study areas
To extract classification features across the growth stages of the three crops, data were acquired from early April to early November 2020. The time series for the Lindian and Hailun study areas were composed of 41 and 48 GF-6/WFV scenes, respectively; as shown in Fig.4, most of each study area was covered by more than 35 high-quality clear-sky scenes.
Preprocessing included radiometric calibration, computation of top-of-atmosphere reflectance, 6S atmospheric correction, and RPC (Rational Polynomial Coefficient) correction; the preprocessing code is available at https://github.com/GenghisYoung233/Gaofen-Batch. A target time series length was set: for pixels with too few observations, some dates were randomly duplicated or the pixel was discarded entirely, while for pixels with too many, some dates were randomly dropped. This resolves both the incomplete coverage of the study areas by some scenes and the mismatch in series length between the two study areas that would otherwise prevent model transfer. The final time series in both study areas were 35 scenes long.
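The length standardization above can be sketched as follows (a minimal illustration, assuming each pixel's series is stored as a NumPy array of shape (dates, bands); the function and parameter names are hypothetical, not from the paper's repository):

```python
import numpy as np

def standardize_length(series, target_len=35, rng=None):
    """Pad or truncate one pixel's time series to a fixed length.

    series: (T, B) array of T acquisitions x B bands.
    Short series are padded by randomly duplicating existing dates;
    long series are truncated by randomly dropping dates. Sorting the
    chosen indices preserves the temporal order in both cases.
    """
    rng = rng if rng is not None else np.random.default_rng()
    t = series.shape[0]
    if t == target_len:
        return series
    if t < target_len:
        extra = rng.choice(t, size=target_len - t, replace=True)
        idx = np.sort(np.concatenate([np.arange(t), extra]))
    else:
        idx = np.sort(rng.choice(t, size=target_len, replace=False))
    return series[idx]
```

Sorting after sampling is the key detail: duplicated or surviving dates stay in chronological order, so the temporal profile the networks consume remains monotone in time.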
The time series were normalized with global max/min normalization to reduce the shift in hidden-layer data distributions during training, thereby accelerating convergence and improving stability. For the interpolated time series, all pixels of the first band across all dates were pooled, the 2% most extreme values were removed to obtain the maximum and minimum, and all first-band pixels were normalized accordingly; the same procedure was applied in turn to all 8 bands. Unlike Z-score normalization, this preserves the dimensional and magnitude information that is key to crop identification, retains the temporal trend of each series, and avoids the influence of extreme values.
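A minimal sketch of this per-band global normalization (assuming the pixels are stacked into a NumPy array of shape (pixels, dates, bands), and assuming the 2% extreme values are removed symmetrically via percentiles, a detail not spelled out in the text):

```python
import numpy as np

def normalize_bands(cube, clip_pct=2.0):
    """Global min-max normalization, one scale per band.

    cube: (N, T, B) array of N pixels x T dates x B bands. For each
    band, a single min/max is computed over ALL pixels and dates after
    discarding `clip_pct` percent of extreme values, so the relative
    magnitudes between dates (the temporal profile shape) survive.
    """
    out = np.empty(cube.shape, dtype=np.float32)
    for b in range(cube.shape[2]):
        band = cube[:, :, b]
        lo = np.percentile(band, clip_pct / 2)
        hi = np.percentile(band, 100 - clip_pct / 2)
        out[:, :, b] = np.clip((band - lo) / (hi - lo), 0.0, 1.0)
    return out
```

Because the scale is global per band rather than per pixel or per date, a bright pixel stays brighter than a dark one after normalization, which is exactly the magnitude information the text argues Z-score normalization would destroy.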
Ground truth was obtained from field surveys combined with visual interpretation of GF-2 high-resolution imagery. During the field survey, mobile devices running the Ovital Map app recorded the location and land cover type of each sample point; the types included crops such as soybean, rice, maize, and minor grains, and non-crop classes such as water, urban, and grassland. The boundary of the plot containing each sample point was then delineated by visual interpretation of GF-2 data to form sample quadrats. The quadrats were finally grouped into six classes: rice, maize, soybean, water, urban, and other (natural covers such as grassland and forest). Lindian contributed 854 quadrats covering 2 003 629 pixels and Hailun 631 quadrats covering 935 679 pixels, as listed in Table 2.
Table 2 Number of instances of each land cover class at the quadrat and pixel levels
For each class in the Lindian ground data, 70% of the quadrats were randomly selected and all of their pixels, with labels, formed the training set; from the remaining 30% of quadrats, 1 000 pixels per class were randomly drawn with their labels as the validation set, keeping the pixel counts of all classes balanced to avoid biasing the accuracy assessment. This process was repeated five times to generate five mutually independent train/validation splits for the subsequent experiments. The Hailun datasets were generated the same way, but only their validation sets were used in subsequent experiments.
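The quadrat-level split described above can be sketched as follows (a simplified illustration, assuming quadrats are given as (class label, pixel-id list) pairs; names and the data layout are hypothetical):

```python
import random
from collections import defaultdict

def make_split(quadrats, n_val_pixels=1000, train_frac=0.7, seed=0):
    """One train/validation split at the quadrat level.

    quadrats: list of (class_label, [pixel_ids]). Per class, 70% of the
    quadrats contribute all of their pixels to training; from the
    remaining 30%, a fixed number of pixels is drawn so the validation
    classes stay balanced. Splitting by quadrat (not pixel) keeps
    training and validation pixels from the same field apart.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for label, pixels in quadrats:
        by_class[label].append(pixels)
    train, val = [], []
    for label, plots in by_class.items():
        rng.shuffle(plots)
        k = int(len(plots) * train_frac)
        for p in plots[:k]:
            train.extend((label, px) for px in p)
        rest = [px for p in plots[k:] for px in p]
        val.extend((label, px)
                   for px in rng.sample(rest, min(n_val_pixels, len(rest))))
    return train, val
```

Calling this with five different seeds would yield the five independent splits used in the experiments.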
The experiments used six deep learning models based on three mechanisms (convolution, recurrence, and attention) plus a random forest baseline, all of them common or recent models in crop classification from remote sensing. The experiment code is available at https://github.com/GenghisYoung233/DongbeiCrops.
For the convolution-based mechanism, three CNN models were selected: TempCNN (Temporal Convolutional Neural Network) [12], MSResNet (Multi-Scale 1D Residual Network), and OmniscaleCNN (Omniscale Convolutional Neural Network) [30]. TempCNN stacks three convolutional layers with the same filter size, followed by a fully connected layer and a softmax activation layer. MSResNet first applies a convolutional layer and a max-pooling layer and feeds three branches, each passing through successive convolutional filters of six different lengths and a global pooling layer; within each branch, ResNet residual connections are inserted every three convolutional layers to counter vanishing or exploding gradients, and the branch outputs are finally concatenated and passed through a fully connected layer and a softmax activation layer. OmniscaleCNN consists of three convolutional layers, a global pooling layer, and a softmax activation layer.
For the recurrence-based mechanism, two RNN models were selected: LSTM (Long Short-Term Memory) [31] and StarRNN (Star Recurrent Neural Network) [32]. An LSTM cell comprises a memory cell and input, output, and forget gates; the memory cell stores historical information while the three gates control the information flow into and out of the unit. StarRNN requires fewer parameters than LSTM or the Gated Recurrent Unit (GRU) and is optimized against the vanishing gradient problem.
For the attention-based mechanism, the Transformer model [26] was selected. The Transformer is a sequence-to-sequence, encoder-decoder model originally designed for natural language translation; for crop identification from remote sensing, only the encoder is retained in this experiment.
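The core operation of the retained encoder is scaled dot-product self-attention; a single-head NumPy sketch (shapes and names are illustrative only, not the Transformer implementation used in the experiments):

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over one sequence.

    x: (T, d) one pixel's time series of d-dimensional features;
    wq, wk, wv: (d, d) learned projection matrices. Each output step is
    a weighted mix of all T value vectors, so every date can attend to
    every other date regardless of their temporal distance.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(x.shape[1])
    # numerically stable softmax over each row of attention scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v
```

This global, order-agnostic mixing is what distinguishes attention from the strictly sequential recursion of the RNNs above.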
For the traditional baseline, random forest was selected. Random forest draws multiple bootstrap samples from the original data, fits a decision tree to each bootstrap sample, and combines the trees' predictions by voting [33]. It handles the high dimensionality of time series features, is easy to tune [34], and is robust to mislabeled data [35]; together with support vector machines it currently dominates among traditional algorithms.
Model architecture and training depend on various hyperparameters that may vary with the task and dataset. For the deep learning models these include the hidden-vector dimension, number of layers, number of self-attention heads, convolution kernel size, dropout rate, learning rate, and weight decay; several options were designed for each hyperparameter and combined exhaustively for model selection. For TempCNN, the options covered the convolution filter size, the number of hidden layers, and the dropout rate (five values, log-normally distributed); for MSResNet, options of its hyperparameters were combined likewise; for OmniscaleCNN, following the original authors' recommendation [30], no hyperparameters were modified, keeping the recommended value of 1 024; for LSTM and StarRNN, the options also covered the number of cascade layers, with a unidirectional/bidirectional option added for LSTM; for the Transformer, the options included the number of self-attention heads; for RF, since tuning has been found to bring only slight performance gains [34], the standard setting of 500 decision trees with a maximum depth of 30 was used. Combining the options yielded 45 TempCNN, 5 MSResNet, 1 OmniscaleCNN, 120 LSTM, 60 StarRNN, 40 Transformer, and 1 RF configurations. All hyperparameter combinations were first trained and validated on the Lindian datasets, and the best combination of each model was selected by overall classification accuracy.
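The exhaustive combination of hyperparameter options can be sketched with a simple grid enumeration (the example search space is hypothetical, chosen only so that its size matches the 45 TempCNN configurations reported; the paper does not state this factorization):

```python
import itertools

def grid_candidates(grid):
    """Enumerate every hyperparameter combination from a dict mapping
    each hyperparameter name to its list of candidate values."""
    keys = sorted(grid)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*(grid[k] for k in keys))]

# Hypothetical TempCNN search space: 3 * 3 * 5 = 45 combinations
tempcnn_grid = {
    "kernel_size": [3, 5, 7],
    "hidden_layers": [1, 2, 3],
    "dropout": [0.1, 0.18, 0.3, 0.44, 0.5],  # 5 values, roughly log-spread
}
```

Each candidate dict would then be trained on the Lindian training set and the one with the best validation overall accuracy retained.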
With the best hyperparameter combination fixed, each model was first trained and validated on the Lindian training and validation sets, yielding the trained model, the classified Lindian image, and four accuracy metrics: overall accuracy, producer's accuracy, user's accuracy, and F1-score. The Lindian-trained models were then tested for transferability on the Hailun validation set, producing classified Hailun images and the corresponding metrics. To avoid statistical artifacts [36], the procedure was evaluated on the five independent splits and the median taken as the final result. The final results of all models were then compared and analyzed.
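The four accuracy metrics can all be derived from a confusion matrix (a generic sketch; the rows-as-reference, columns-as-prediction convention is an assumption):

```python
import numpy as np

def accuracy_metrics(cm):
    """Compute the four metrics from a square confusion matrix.

    cm: rows = reference (ground truth) classes, cols = predicted
    classes. Returns overall accuracy plus per-class producer's
    accuracy (recall), user's accuracy (precision), and F1-score.
    """
    cm = np.asarray(cm, dtype=float)
    oa = np.trace(cm) / cm.sum()        # overall accuracy
    pa = np.diag(cm) / cm.sum(axis=1)   # producer's accuracy
    ua = np.diag(cm) / cm.sum(axis=0)   # user's accuracy
    f1 = 2 * pa * ua / (pa + ua)        # harmonic mean of PA and UA
    return oa, pa, ua, f1
```

The F1-score reported per crop is the harmonic mean of that crop's producer's and user's accuracies, so it penalizes both omission and commission errors.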
The final hyperparameters, determined by comparing the training results of all configurations, are listed in Table 3.
Table 3 Final hyperparameters of each model
The models trained on the Lindian training and validation sets were used to classify the Lindian image; the classification results of all models within the green-square areas are shown in Fig.5a. The Lindian-trained models were then transferred to Hailun to classify the Hailun image, with accuracy verified on the Hailun validation set; the classification results of all models are shown in Fig.5b. Accuracy metrics are listed in Tables 4 and 5.
Table 4 Accuracy of each model in Lindian
Note: UA, PA, and F1 represent the user's accuracy, the producer's accuracy, and the F1-score. Same below.
In the Lindian study area, the classification results of all models were visually very similar; all achieved overall accuracies of 93%-95%, with F1-scores of no less than 89%, 84%, and 97% for soybean, maize, and rice, showing no large performance differences and reproducing the spatial distribution of the three crops and the other covers well. A common problem was confusion between soybean and maize pixels: intercropping of the two is common, the relatively low spatial resolution of GF-6/WFV makes their pixels easy to confuse, and their similar growth periods and spectral responses make them hard to separate from raw band data, lowering the producer's and user's accuracies for both crops. Higher-resolution remote sensing data or added vegetation index bands would be needed to improve soybean and maize identification further.
Affected by the small sample size, cloud contamination, and changes in natural land cover types and in the temporal composition of the data, the models overfit to varying degrees after transfer, with overall classification accuracy dropping by 7.2%-41.0%. Judging from overall accuracy and F1-scores, MSResNet retained good performance on all three crops among the CNNs; since the time series interpolation gives pixels in different regions different temporal compositions, its good fit may stem from the stronger generalization afforded by its comparatively complex architecture and larger parameter count. OmniScaleCNN, which adapts its convolution filter sizes to the data and thus captures classification features from time series more flexibly, outperformed the fixed-filter TempCNN on all three crops after transfer. Among the RNNs, LSTM and StarRNN use gating units to counter exploding and vanishing gradients during classification, and their overall accuracy dropped by less than 10% after transfer. The attention-based Transformer lost much of its ability to identify the three crops, with evident misclassification, perhaps because its weak inductive bias and lack of conditional computation limit its modeling of very long time series, leading to severe overfitting after transfer; RF, which extracts only a few shallow features during training, suffered the largest accuracy loss. For the other class, dominated by grassland in Lindian but forest in Hailun, all models degraded. In summary, with the spatial location and temporal composition of the data unchanged, all models classified pixels well and reproduced the distribution of the three crops; when both changed, only some of the convolution- and recurrence-based deep learning models could still reproduce the broad distribution of the land cover classes. Since cropping structure and land cover types differ between regions, the applicability of this conclusion elsewhere remains to be verified.
Table 5 Accuracy of each model in Hailun
Regarding computational efficiency, on a graphics workstation with an Intel Xeon 4114 processor and a GeForce RTX 2080Ti GPU, taking model training plus inference over the Hailun study area image (covering about 10 000 km2) as the full pipeline, the run time of each deep learning model was within 6.2 times that of the random forest model, and the full pipeline completed within 1 h, as shown in Table 6. However, acquiring and preprocessing GF-6/WFV time series remains time-consuming and hinders the full use of time series deep learning models with GF-6/WFV data in crop classification; future work will optimize the temporal and band composition to improve computational efficiency.
Table 6 Run time of each model
GF-6 is China's first medium-to-high spatial resolution satellite carrying the red-edge and yellow bands useful for crop identification, combined with high temporal resolution. This study qualitatively and quantitatively evaluated the performance of time series deep learning models for crop identification from GF-6/WFV time series, performing visual analysis and accuracy assessment of six deep learning models based on convolution, recurrence, and attention, plus a random forest model. The conclusions are as follows:
1) With the spatial location unchanged, all models achieved F1-scores of no less than 89%, 84%, and 97% for soybean, maize, and rice, demonstrating strong crop identification ability sufficient for operational high-precision crop mapping. Despite the large differences among the convolution, recurrence, and attention mechanisms, all models reached 93%-95% overall classification accuracy, indicating that when no spatial transfer occurs, i.e., when test and training data come from the same distribution, crop identification accuracy depends little on the classification mechanism.
2) With the spatial location changed, the altered natural land cover types and temporal composition of the data reduced the overall classification accuracy of the models by 7.2%-41.0%. The crop identification ability of the convolution-based MSResNet changed little, the recurrence-based LSTM and StarRNN lost less than 10% of overall accuracy, while the attention-based Transformer and random forest weakened markedly on all three crops. This indicates that once spatial transfer places test and training data in different distributions, the classification mechanism strongly affects crop identification accuracy, with convolution- and recurrence-based models outperforming attention-based and random forest models. This conclusion applies only to the plains of Heilongjiang Province and requires further testing elsewhere.
3) In terms of time consumption, the run time of each deep learning model was within 6.2 times that of the random forest model, completing the full training and inference pipeline in a short time.
A large number of deep learning architectures can be built on the three mechanisms, and the relationship between classification mechanism and crop identification ability requires further exploration. Future work will focus on the interpretability of deep learning models, studying how each classification mechanism selectively extracts the few features relevant to crop identification.
[1] Wang Pengxin, Xun Lan, Li Li, et al. Extraction of planting areas of main crops based on sparse representation of time-series leaf area index[J]. Journal of Remote Sensing, 2018, 24(5): 121-129. (in Chinese with English abstract)
[2] Zhao Hongwei, Chen Zhongxin, Liu Jia. Deep learning for crop classification of remote sensing data: Applications and challenges[J]. Chinese Journal of Agricultural Resources and Regional Planning, 2020, 41(2): 35-49. (in Chinese with English abstract)
[3] LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[4] Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors[J]. Nature, 1986, 323(6088): 533-536.
[5] Maggiori E, Tarabalka Y, Charpiat G, et al. Convolutional neural networks for large-scale remote sensing image classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 55(2):645-657.
[6] Postadjian T, Bris L, Sahbi H, et al. Investigating the potential of deep neural networks for large-scale classification of very high resolution satellite images[J]. Remote Sensing and Spatial Information Sciences, 2017, 31(5): 11-25.
[7] Zhao Fei, Zhang Wenkai, Yan Zhiyuan, et al. Multi-feature map pyramid fusion deep network for semantic segmentation on remote sensing data[J]. Journal of Electronics and Information Technology, 2019, 41(10): 44-50. (in Chinese with English abstract)
[8] Chen Yang, Fan Rongshuang, Wang Jingxue, et al. Cloud detection of ZY-3 satellite remote sensing images based on deep learning[J]. Acta Optica Sinica, 2018, 38(1): 32-42. (in Chinese with English abstract)
[9] Zhang Q, Yuan Q, Zeng C, et al. Missing data reconstruction in remote sensing image with a unified spatial-temporal-spectral deep convolutional neural network[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(8):4274-4288.
[10] Ozcelik F, Alganci U, Sertel E, et al. Rethinking CNN-based pansharpening: Guided colorization of panchromatic images via GANS[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 59(4): 3486-3501.
[11] Zhao Hongwei, Chen Zhongxin, Jiang Hao, et al. Early growing stage crop species identification in southern China based on Sentinel-1A time series imagery and one-dimensional CNN[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(3): 169-177. (in Chinese with English abstract)
[12] Pelletier C, Webb G I, Petitjean F. Temporal convolutional neural network for the classification of satellite image time series[J]. Remote Sensing, 2019, 11(5): 3154-3166.
[13] Heming L, Qi L. Hyperspectral imagery classification using sparse representations of convolutional neural network features[J]. Remote Sensing, 2016, 8(2):99-110.
[14] Ji S, Zhang Z, Zhang C, et al. Learning discriminative spatiotemporal features for precise crop classification from multi-temporal satellite images[J]. International Journal of Remote Sensing, 2020, 41(8): 3162-3174.
[15] Shunping J, Chi Z, Anjian X, et al. 3D convolutional neural networks for crop classification with multi-temporal remote sensing images[J]. Remote Sensing, 2018, 10(2):75-89.
[16] Ismail H, Forestier G, Weber J, et al. Deep learning for time series classification: A review[J]. Data Mining and Knowledge Discovery, 2019, 24(2): 57-69.
[17] Zhong L, Hu L, Zhou H. Deep learning based multi-temporal crop classification[J]. Remote Sensing of Environment, 2018, 221(3): 430-443.
[18] Ru?wurm M, Korner M. Temporal vegetation modelling using long short-term memory networks for crop identification from medium-resolution multi-spectral satellite images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. IEEE Computer Society, 2017: 11-19.
[19] Yang Zeyu, Zhang Hongyan, Ming Jin, et al. Extraction of winter rapeseed from high-resolution remote sensing imagery via deep learning[J]. Bulletin of Surveying and Mapping, 2020, 522(9): 113-116. (in Chinese with English abstract)
[20] Ienco D, Gaetano R, Dupaquier C, et al. Land cover classification via multitemporal spatial data by deep recurrent neural networks[J]. IEEE Geoscience and Remote Sensing Letters, 2017, 14(10): 1685-1689.
[21] Ndikumana E, Minh DHT, Baghdadi N, et al. Deep recurrent neural network for agricultural classification using multitemporal SAR Sentinel-1 for Camargue, France[J]. Remote Sensing, 2018, 10(8): 1217-1230.
[22] Campos-Taberner M, García-Haro F J, Martínez B, et al. Understanding deep learning in land use classification based on Sentinel-2 time series[J]. Scientific Reports, 2020, 10(1): 1-12.
[23] Ru?wurm M, K?rner M. Multi-temporal land cover classification with sequential recurrent encoders[J]. ISPRS International Journal of Geo-Information, 2018, 7(4): 129-143.
[24] Haobo L, Hui L, Lichao M. Learning a transferable change rule from a recurrent neural network for land cover change detection[J]. Remote Sensing, 2016, 8(6):506-513.
[25] Mou L, Bruzzone L, Zhu X X. Learning spectral-spatial- temporal features via a recurrent convolutional neural network for change detection in multispectral imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 57(2): 924-935.
[26] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems. NIPS, 2017: 5998-6008.
[27] Xu R, Tao Y, Lu Z, et al. Attention-mechanism-containing neural networks for high-resolution remote sensing image classification[J]. Remote Sensing, 2018, 10(10): 1602-1611.
[28] Liu R, Cheng Z, Zhang L, et al. Remote sensing image change detection based on information transmission and attention mechanism[J]. IEEE Access, 2019, 7: 156349-156359.
[29] Fu J, Liu J, Tian H, et al. Dual attention network for scene segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3146-3154.
[30] Tang W, Long G, Liu L, et al. Rethinking 1d-cnn for time series classification: A stronger baseline[EB/OL]. (2021-01-12)[2021-04-20]https://arxiv.org/abs/2002.10061.
[31] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural computation, 1997, 9(8): 1735-1780.
[32] Turkoglu M O, D'Aronco S, Wegner J, et al. Gating revisited: Deep multi-layer rnns that can be trained[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 8(12): 2145-2160.
[33] Wu Jianbin, Zhu Jianping, Xie Bangchang. A review of technologies on random forests[J]. Journal of Statistics and Information, 2011, 26(3): 32-38. (in Chinese with English abstract)
[34] Pelletier C, Valero S, Inglada J, et al. Assessing the robustness of Random Forests to map land cover with high resolution satellite image time series over large areas[J]. Remote Sensing of Environment, 2016, 187: 156-168.
[35] Pelletier C, Valero S, Inglada J, et al. Effect of training class label noise on classification performances for land cover mapping with satellite image time series[J]. Remote Sensing, 2017, 9(2): 173-180.
[36] Lyons M B, Keith D A, Phinn S R, et al. A comparison of resampling methods for remote sensing classification and accuracy assessment[J]. Remote Sensing of Environment, 2018, 208: 145-153.
Evaluation of deep learning algorithm for crop identification based on GF-6 time series images
Chen Shiyang, Liu Jia※
(Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing 100081, China)
Crop type mapping with medium and high spatial resolution satellite images is one of the most important tools in modern agricultural monitoring services. Taking Heilongjiang Province of northeast China as the study area, this study evaluated state-of-the-art deep learning for crop type classification. A comparison was made among the Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Attention Mechanism (AM) for crop type classification, with the traditional Random Forest (RF) model as the baseline. The six deep learning models were the Temporal Convolutional Neural Network (TempCNN), Multi-Scale 1D Residual Network (MSResNet), Omniscale Convolutional Neural Network (OmniscaleCNN), Long Short-Term Memory (LSTM), Star Recurrent Neural Network (StarRNN), and Transformer. The procedure was as follows. First, GF-6 wide field of view (WFV) image time series were acquired between April and November over the Lindian and Hailun study areas in northeast China, in order to extract the features of three crop types at different growth stages. The resulting time series for Lindian and Hailun comprised 41 and 48 GF-6 images, respectively. The preprocessing workflow included RPC correction, radiometric calibration, and conversion to top-of-atmosphere and then surface reflectance via 6S atmospheric correction. Image interpolation and global min-max normalization were also applied to fill empty pixels and to improve the convergence speed and stability of the neural networks. Ground truth was manually labelled from a field survey combined with GF-2 high-resolution imagery to generate the training and evaluation datasets, which covered six classes (rice, maize, soybean, water, urban, and rest) over 2 003 629 pixels in Lindian and 935 679 pixels in Hailun. Second, all models were trained and evaluated in Lindian to compare the CNN, RNN, AM, and RF approaches.
All models achieved an overall accuracy of 93%-95%, and F1-scores above 89%, 84%, and 97% for soybean, maize, and rice, respectively, the three major crops in both study areas. Third, the models trained in Lindian were transferred to Hailun, where the overall classification accuracy of each model declined by 7.2% to 41.0%, owing to differences in land cover classes and in the temporal composition of the data. Among the CNNs, the accuracy of MSResNet in recognizing the three crop types barely changed after transfer. Since OmniScaleCNN automatically adjusts the size of its convolution filters, its accuracy after transfer was better than that of TempCNN. Among the RNNs, LSTM and StarRNN use gating to avoid gradient explosion and vanishing during classification, and their overall accuracy declined by less than 10% after transfer. By contrast, the accuracy of the attention-based Transformer and of RF dropped significantly. Visual analysis of the classified images showed that all models reproduced the distribution of the three crop types well while the spatial location and temporal composition of the data remained unchanged, whereas under changed spatial location and temporal composition only some CNN or RNN models could still identify the general distribution of the land cover classes. Furthermore, the run time of each deep learning model was within 1 h and less than 6.2 times that of random forest, with time consumption covering the whole process of model training and inference over the Hailun study area image of about 10 000 km2. Accordingly, GF-6 time series combined with deep learning meet the classification accuracy and operating efficiency requirements of high-precision, large-area crop mapping, with transferability better than that of the traditional model.
crops; remote sensing; recognition; deep learning; GF-6; time series; Heilongjiang
Chen Shiyang, Liu Jia. Evaluation of deep learning algorithm for crop identification based on GF-6 time series images[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(15): 161-168. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2021.15.020 http://www.tcsae.org
Received: 2021-05-08
Revised: 2021-07-13
Funding: High-Resolution Agricultural Remote Sensing Monitoring and Evaluation Demonstration System (Phase II) (09-Y30F01-9001-20/22)
Author: Chen Shiyang, research interest: crop classification from remote sensing based on deep learning. Email: genghisyang@outlook.com
※Corresponding author: Liu Jia, professor, research interest: operational remote sensing monitoring. Email: liujia06@caas.cn
doi: 10.11975/j.issn.1002-6819.2021.15.020
CLC number: S127; Document code: A; Article ID: 1002-6819(2021)-15-0161-08