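Remote sensing monitoring of wheat powdery mildew based on the AdaBoost model and the mRMR algorithm
Ma Huiqin1,2, Huang Wenjiang※2, Jing Yuanshu1, Dong Yingying2, Zhang Jingcheng3, Nie Chenwei2, Tang Cuicui2,4, Zhao Jinling4, Huang Linsheng4
(1. School of Applied Meteorology, Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science & Technology, Nanjing 210044, China; 2. Key Laboratory of Digital Earth Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094, China; 3. College of Life Information Science and Instrument Engineering, Hangzhou Dianzi University, Hangzhou 310018, China; 4. School of Electronic and Information Engineering, Anhui University, Hefei 230039, China)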
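In addition to choosing a suitable modelling method, choosing a suitable feature selection algorithm to optimise the model inputs plays an important role in improving the remote sensing monitoring of crop diseases. Taking wheat powdery mildew in the western Guanzhong Plain of Shaanxi Province as the study object, 18 feature variables were extracted from a Landsat 8 image. Two feature selection algorithms, correlation analysis (CA) and minimum redundancy maximum relevance (mRMR), were used to screen two different groups of feature variables. Each group was fed into three methods, Fisher linear discriminant analysis (FLDA), support vector machine (SVM) and AdaBoost, to build models for monitoring the severity of wheat powdery mildew, and the models were validated and compared. The results showed that, for the CA- and mRMR-selected features respectively, the overall accuracy of the AdaBoost models was 27.9 and 27.9 percentage points higher than that of the FLDA models and 14.0 and 9.3 percentage points higher than that of the SVM models. The overall accuracies of the FLDA, SVM and AdaBoost models built on mRMR-selected features were 7.0, 11.7 and 7.0 percentage points higher than those of the corresponding models built on CA-selected features. The model combining mRMR-selected features with AdaBoost had an overall accuracy of 88.4% and a Kappa coefficient of 0.807, the highest among all models. These results indicate that AdaBoost performs well for remote sensing monitoring of crop diseases and that mRMR has advantages over the commonly used CA for selecting the feature variables of crop disease monitoring models. The results can provide a methodological reference for remote sensing monitoring of other crop diseases.
Keywords: diseases; remote sensing; monitoring; wheat; mRMR algorithm; AdaBoost method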
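Wheat powdery mildew has become one of the most widespread and damaging wheat diseases in China, and yield losses can reach 30% in severe epidemic years[1]. Using modern information technology to improve monitoring of this disease is therefore important for guiding disease control and keeping China's grain production stable. In recent years, advances in remote sensing have helped overcome many drawbacks of traditional pest and disease monitoring and offer an important means of monitoring and forecasting pests and diseases over large areas[2].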
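Several researchers have used remote sensing to monitor and forecast crop diseases and pests. Jiang et al.[3] correlated first-derivative canopy spectra with the disease index of wheat stripe rust and built disease index estimation models using univariate linear and nonlinear regression. Devadas et al.[4] and Mahlein et al.[5] identified multiple diseases of the same crop using features selected by one-way analysis of variance or the Relief F method. Lu et al.[6] screened conventional spectral features and wavelet transform features by correlation analysis and thresholding, and estimated the severity of wheat stripe rust by partial least squares regression (PLSR). Wang et al.[7] examined methods for monitoring wheat stripe rust severity using seven vegetation indices chosen from earlier studies. Because these models built on hyperspectral data are difficult to apply to regional-scale monitoring, researchers have turned to airborne and spaceborne remote sensing data to monitor and forecast crop diseases and pests at the regional scale. López-López et al.[8] selected index features from vegetation indices derived from airborne imagery by forward stepwise discriminant analysis, and built models for detecting the severity levels of almond red leaf blotch using linear discriminant analysis (LDA) and support vector machine (SVM). Yuan et al.[9] selected two vegetation indices from many remote sensing features derived from satellite imagery by t-test and cross-correlation test, and built mapping models of winter wheat powdery mildew with three different methods. Other studies[10-12] combined remote sensing and meteorological features, selected the optimal ones, and used different modelling methods to monitor or forecast the occurrence probability or severity of wheat powdery mildew at the regional scale. Most of these studies focused on the choice of data sources or modelling methods, whereas the method for selecting model input variables received little attention; inputs were usually screened only by simple correlation analysis or t-tests.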
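However, feature selection is also one of the key issues in building remote sensing monitoring models, because some machine learning algorithms are negatively affected by irrelevant or redundant features, which directly reduces classification accuracy and generalisation performance[13-15]. Feature selection algorithms such as minimum redundancy maximum relevance (mRMR) have given good results in image classification, land-use classification, discrimination of biological sample types and disease diagnosis[16-19]. Huang et al.[20] and Mahlein et al.[21] used the Relief F algorithm to find the best combination of a single band and a normalised two-band wavelength difference, and built new spectral indices to monitor and identify three different diseases of winter wheat and of sugar beet, respectively; the new indices detected and identified the diseases with high reliability. Unler et al.[22] performed feature selection with a hybrid algorithm (mr2PSO) combining mRMR with particle swarm optimization (PSO), tested it with an SVM classifier on several benchmark data sets and compared it with two other algorithms; the hybrid algorithm was competitive in both classification accuracy and computational performance. Cheng et al.[23] implemented mRMR feature selection in three ways and classified remote sensing images with the C5.0 and K-nearest neighbor (KNN) classifiers; the efficiency and overall accuracy of both classifiers improved to varying degrees.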
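Conventional methods for monitoring and forecasting pests and diseases suffer from strong subjectivity, poor generality and difficulty of construction, so researchers have begun to explore new methods[7,9,11]. The AdaBoost algorithm can boost weak classifiers that are only slightly better than random guessing into a strong classifier with high accuracy, providing a new idea and a new approach for the design of learning algorithms[24]. At present it is used mainly for recognition and retrieval of handwritten documents, identification of handwritten annotations in printed documents, verification of handwritten signatures and face detection[25-27], and it has not yet been applied to remote sensing monitoring of crop diseases. Taking the western Guanzhong Plain of Shaanxi Province as the study area, this paper derives multiple feature variables from a Landsat 8 image, selects two groups of optimal feature variables by correlation analysis (CA) and by the mRMR algorithm, and feeds them into AdaBoost and two commonly used classification methods, Fisher linear discriminant analysis (FLDA) and SVM, to build six remote sensing models for monitoring the severity of wheat powdery mildew, whose results are then compared.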
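1.1 Overview of the study area
The study area is located in the western Guanzhong Plain of Shaanxi Province (Fig. 1). The area has favourable water and heat conditions and a mild, humid climate, and is the main high-yield farming region of Shaanxi Province[28]. Winter wheat is an important local crop, and the area is also one where wheat powdery mildew occurs frequently in China.
Fig. 1 Geographic location of the study area and spatial distribution of sample points
1.2 Data acquisition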
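The data comprise remote sensing data and ground survey data of wheat powdery mildew. The remote sensing data are one Landsat 8 scene acquired on 11 May 2014. The ground survey was carried out on 10 May 2014 (wheat flowering stage) in the western Guanzhong Plain of Shaanxi Province. At each survey site a 1 m × 1 m quadrat was sampled, and the longitude and latitude of its centre were recorded with a sub-metre handheld GPS. A five-point survey was used: five symmetric points were taken in each quadrat and 20 wheat plants were examined at each point. Disease severity was recorded with a modified "0-9 scale" method, in which each plant was first divided evenly into nine segments, numbered 1 to 9 from bottom to top; Table 1 gives the grading standard. After the disease grade of each plant was recorded, the disease index (DI) was calculated with Eq. (1):

$$DI = \frac{\sum (x \times f)}{n \times \sum f} \times 100 \tag{1}$$

where x is the disease grade, n is the highest grade (9) and f is the number of plants at each disease grade.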
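In total, 43 field survey points were obtained. To avoid the monitoring difficulty caused by too many grades, disease severity was regrouped into three classes for model construction: healthy (grade 0), denoted I; slight (grades 1-3), denoted II; and severe (grades 4-9), denoted III.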
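As a minimal sketch of the survey bookkeeping described above, the following Python computes a disease index and regroups a 0-9 grade into the three severity classes. It assumes the usual form of the index, DI = 100 × Σ(x·f) / (n·Σf); the function names `disease_index` and `severity_class` and the example counts are hypothetical and not taken from the paper.

```python
def disease_index(counts, n_max=9):
    """Disease index on a 0-100 scale.

    counts maps a disease grade x (0..n_max) to the number of plants f
    recorded at that grade; n_max is the highest grade (9 in the survey).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no plants were surveyed")
    weighted = sum(x * f for x, f in counts.items())
    return 100.0 * weighted / (n_max * total)


def severity_class(grade):
    """Regroup a 0-9 grade into class I (healthy), II (slight) or III (severe)."""
    if not 0 <= grade <= 9:
        raise ValueError("grade must lie between 0 and 9")
    if grade == 0:
        return "I"
    return "II" if grade <= 3 else "III"
```

For example, a quadrat with 100 plants of which 50 are at grade 0, 30 at grade 3 and 20 at grade 9 gives 100 × (90 + 180) / (9 × 100), that is a DI of 30.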
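1.3 Data processing
First, the Landsat 8 image was preprocessed by radiometric calibration, atmospheric correction and clipping. Based on the local crop types and their phenological calendar, the wheat-growing area of the study region was extracted by combining optimal segmentation thresholds with maximum likelihood classification[11] (Fig. 1b).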
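Next, disease-related vegetation indices and other feature variables were extracted from the preprocessed image: the reflectance of the Landsat 8 blue, green, red and near-infrared bands, vegetation indices sensitive to the disease, and land surface temperature (LST), which characterises habitat conditions. Table 2 lists only the names and formulas of the three vegetation indices finally used in the study. LST was retrieved from thermal infrared band 11 with the commonly used single-channel algorithm. The greenness vegetation index (Greenness) and the wetness index (Wetness) were obtained by the tasseled cap transformation, as in Eqs. (2)-(3):

$$Wetness = \sum_{i=2}^{7} w_i b_i \tag{2}$$

$$Greenness = \sum_{i=2}^{7} g_i b_i \tag{3}$$

where w_i and g_i (i = 2, …, 7) are the coefficients of each band (Table 3) and b_i (i = 2, …, 7) is the reflectance of each band.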
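The tasseled cap components are plain weighted sums of the reflectance of bands 2 to 7, which the sketch below illustrates. The function name `tasseled_cap_component` is hypothetical, and the band coefficients must be supplied by the caller from Table 3, whose values are not reproduced in this text; the numbers in the usage note are placeholders, not real coefficients.

```python
def tasseled_cap_component(reflectance, coefficients, bands=range(2, 8)):
    """Weighted sum of band reflectances, sum_i coefficient_i * b_i.

    reflectance and coefficients are dicts keyed by Landsat 8 band number;
    by default bands 2..7 are used, as in the text.
    """
    missing = [i for i in bands if i not in reflectance or i not in coefficients]
    if missing:
        raise KeyError(f"missing bands: {missing}")
    return sum(coefficients[i] * reflectance[i] for i in bands)
```

Calling the function once with the Wetness coefficients and once with the Greenness coefficients for the same pixel yields the two features. With placeholder coefficients of 1 for band 2, -1 for band 7 and 0 elsewhere, a pixel with reflectance 0.5 in band 2 and 0.25 in band 7 gives 0.25.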
Table 1 Grading standard of wheat powdery mildew occurrence degree
Table 2 Formulas of the broadband vegetation indices
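Since the body of Table 2 is not reproduced in this text, the sketch below shows only two of the three indices, using their commonly cited definitions: simple ratio SR = NIR / Red and re-normalized difference vegetation index RDVI = (NIR - Red) / sqrt(NIR + Red). Whether the paper used exactly these forms is an assumption, the function names are hypothetical, and SIWSI is left out because its band assignment for Landsat 8 cannot be confirmed from the text.

```python
import math


def simple_ratio(nir, red):
    """SR = NIR / Red (commonly cited form; assumed, not copied from Table 2)."""
    if red == 0:
        raise ZeroDivisionError("red reflectance is zero")
    return nir / red


def rdvi(nir, red):
    """RDVI = (NIR - Red) / sqrt(NIR + Red) (commonly cited form; assumed)."""
    if nir + red <= 0:
        raise ValueError("NIR + Red must be positive")
    return (nir - red) / math.sqrt(nir + red)
```

Both functions work on single reflectance values; applying them per pixel to the red and near-infrared bands gives the index images.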
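1.4 Feature variable selection and model construction
1.4.1 mRMR (minimum redundancy maximum relevance) algorithm
mRMR is a typical feature selection algorithm based on information theory. Its main idea is to find, in the feature space, the m features that have maximum relevance to the target class and minimum redundancy among themselves. Mutual information is used to measure the relevance between features and the class and between pairs of features in the feature subset[32-35].
Table 3 Band coefficients of Wetness and Greenness from the tasseled cap transformation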
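The relevance between the features in the feature set and the class is

$$D = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c) \tag{4}$$

where S is the feature set, c is the target class and I(x_i; c) is the mutual information between feature i and the target class c.
The redundancy among the features in the feature set is

$$R = \frac{1}{|S|^2} \sum_{x_i, x_j \in S} I(x_i; x_j) \tag{5}$$

where I(x_i; x_j) is the mutual information between feature i and feature j.
The mutual information between any two variables x and y is

$$I(x; y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, \mathrm{d}x \, \mathrm{d}y \tag{6}$$

where p(x) is the probability of x, p(y) is the probability of y and p(x, y) is the joint probability of x and y.
Combining Eqs. (4) and (5) gives the mRMR principle, written max Φ(D, R), whose mutual information quotient criterion is

$$\max \Phi(D, R), \quad \Phi = D / R \tag{7}$$

On the basis of the quotient criterion, the optimal feature subset is obtained by incremental search. Given a feature set S_{m-1} of m-1 already selected features, the m-th feature is chosen from the remaining features {X - S_{m-1}} as the one satisfying

$$\max_{x_j \in X - S_{m-1}} \left[ I(x_j; c) \Big/ \frac{1}{m-1} \sum_{x_i \in S_{m-1}} I(x_j; x_i) \right] \tag{8}$$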
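The incremental search under the quotient criterion can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: features are assumed to be already discretised so that mutual information can be estimated from counts (the text defines it with integrals over densities), a small `eps` is added to the redundancy to avoid division by zero, and the names `mutual_information` and `mrmr_miq` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(x; y) in nats for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_miq(features, labels, m, eps=1e-12):
    """Greedy mRMR selection with the mutual information quotient.

    features: dict mapping a feature name to its list of discretised values.
    Returns the names of up to m features in selection order.
    """
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    # The first feature is simply the most relevant one.
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < min(m, len(features)):
        best, best_score = None, -math.inf
        for f in features:
            if f in selected:
                continue
            redundancy = sum(
                mutual_information(features[f], features[s]) for s in selected
            ) / len(selected)
            score = relevance[f] / (redundancy + eps)
            if score > best_score:
                best, best_score = f, score
        selected.append(best)
    return selected
```

In this scheme a feature that duplicates an already selected one scores close to 1, whereas a relevant feature that shares little information with the selected set scores much higher, which is how redundant inputs are pushed down the ranking.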
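1.4.2 Feature variable selection
Building the model with the feature variables that best reflect disease occurrence and development can effectively improve model accuracy. Current studies on pest and disease monitoring have analysed modelling methods extensively but have paid little attention to feature selection methods. In this paper, feature variables for the severity monitoring models were selected in two ways. First, correlation analysis in SPSS was used to compute the correlation coefficient between each feature variable and powdery mildew severity, and the features significant at P<0.01, namely Wetness, LST and SIWSI, formed the first group. Then the mRMR algorithm was used to select the second group: Greenness, Wetness, LST, RDVI and SR.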
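1.4.3 Model construction
AdaBoost was chosen to build the large-scale remote sensing monitoring model of wheat powdery mildew. Proposed by Freund and Schapire, AdaBoost is currently the most widely used and studied ensemble algorithm[36]. Its basic idea is to combine a number of weak classifiers in a certain way into a strong classifier with high classification ability[37-38]. AdaBoost was originally designed for binary classification, but most practical problems are multi-class. One way to apply it to multi-class problems is binary decomposition, which splits the multi-class problem into several binary problems. This decomposition implicitly assumes the same weak-classifier condition as in the binary case and thus avoids having to derive the more complex weak-classifier condition for the multi-class case[39]. Here, the one-versus-rest binary decomposition of AdaBoost[40] was used for the three-class problem (healthy, slight, severe). The procedure is as follows: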
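1) Input the sample set S = {(x_1, y_1), …, (x_m, y_m)}, where x_i ∈ X, X is the training set and m = |X|; y_i ∈ Y, Y is the set of class labels and k = |Y|; (x_i, y_i) are the feature variables of a sample point and its class label.
2) Initialise the weight distribution: D_1(i, l) = 1/(mk) (1 ≤ i ≤ m, 1 ≤ l ≤ k).
3) Training: repeat for T rounds.
① Train the weak learner with the distribution D_t (t = 1, …, T) of the t-th iteration.
② Obtain the weak rule h_t: X × Y → R produced by the weak learner.
③ Choose the weighted voting weight of h_t:

$$\alpha_t = \frac{1}{2} \ln \frac{1 + r_t}{1 - r_t} \tag{9}$$

where r_t measures the correlation between the predictions of the weak rule h_t and the classes y_i:

$$r_t = \sum_{i=1}^{m} \sum_{l=1}^{k} D_t(i, l)\, Y_i[l]\, h_t(x_i, l) \tag{10}$$

where h_t(x_i, l) is the prediction of whether class label l ∈ Y should be assigned to x_i, its value reflecting the confidence of the prediction, and, following [40], Y_i[l] = +1 if l = y_i and -1 otherwise.
④ Update the weights:

$$D_{t+1}(i, l) = \frac{D_t(i, l) \exp\left(-\alpha_t Y_i[l]\, h_t(x_i, l)\right)}{Z_t} \tag{11}$$

where Z_t is the normalisation factor that makes D_{t+1} a distribution.
4) The final combined hypothesis is

$$H(x, l) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x, l)\right) \tag{12}$$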
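The steps above can be sketched in a few lines of Python. This is a minimal illustration in the spirit of the one-versus-rest scheme of [40], not the authors' implementation: the weak learner is assumed to be a single-feature threshold stump that outputs +1 or -1 for every label, r_t is clipped just below 1 to keep the logarithm finite, a single class is predicted by taking the label with the largest summed vote, and the names `train_adaboost_mh` and `predict` are hypothetical.

```python
import math


def train_adaboost_mh(X, y, classes, T=20):
    """X: list of feature vectors, y: list of labels, classes: list of labels."""
    m, k = len(X), len(classes)
    # sign[i][l] is +1 if sample i carries label l, otherwise -1.
    sign = [[1 if y[i] == c else -1 for c in classes] for i in range(m)]
    D = [[1.0 / (m * k)] * k for _ in range(m)]  # initial weights 1/(mk)
    model = []
    for _ in range(T):
        best = None
        # Weak learner: try every feature/threshold; each label gets the
        # polarity that maximises its weighted correlation with the truth.
        for j in range(len(X[0])):
            for thr in sorted({x[j] for x in X}):
                side = [1 if x[j] > thr else -1 for x in X]
                pol, r = [], 0.0
                for l in range(k):
                    corr = sum(D[i][l] * sign[i][l] * side[i] for i in range(m))
                    pol.append(1 if corr >= 0 else -1)
                    r += abs(corr)
                if best is None or r > best[0]:
                    best = (r, j, thr, pol)
        r, j, thr, pol = best
        r = min(r, 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
        model.append((alpha, j, thr, pol))
        # Re-weight: correctly handled (sample, label) pairs lose weight.
        Z = 0.0
        for i in range(m):
            s = 1 if X[i][j] > thr else -1
            for l in range(k):
                D[i][l] *= math.exp(-alpha * sign[i][l] * pol[l] * s)
                Z += D[i][l]
        for i in range(m):
            for l in range(k):
                D[i][l] /= Z
    return model


def predict(model, classes, x):
    """Return the class whose summed weighted vote is largest."""
    scores = [0.0] * len(classes)
    for alpha, j, thr, pol in model:
        s = 1 if x[j] > thr else -1
        for l in range(len(classes)):
            scores[l] += alpha * pol[l] * s
    return classes[max(range(len(classes)), key=scores.__getitem__)]
```

With the paper's setup, `X` would hold the selected feature values at the survey points and `classes` would be the three severity classes. Taking the largest vote instead of the per-label sign is a design choice made here so that each sample receives exactly one class.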
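2.1 Model evaluation and validation
The two groups of feature variables screened by the two feature selection methods were used as inputs to build monitoring models with AdaBoost and the commonly used classification methods FLDA and SVM, and the goodness of fit of the results was evaluated. The models were tested mainly with the Spearman correlation, together with the statistics Somers' D, Kendall's Tau-c and Goodman-Kruskal Gamma, which range over [-1, 1]; the larger the value, the higher the model accuracy. Table 4 lists the parameter values. The Spearman correlations of both AdaBoost models were highly significant, whereas among the FLDA and SVM models only the SVM model built on mRMR-selected features (mRMR-SVM model) was highly significant. For the other three statistics, the AdaBoost models were highest, the FLDA models lowest and the SVM models in between. Among the six models, the two AdaBoost models and the mRMR-SVM model had clearly higher values than the rest, and the AdaBoost model built on mRMR-selected features (mRMR-AdaBoost model) was the highest of all. In addition, for all three methods the models built on mRMR-selected features had higher values than those built on CA-selected features. The monitoring model combining AdaBoost with mRMR-selected features therefore performed best.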
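The three ordinal association statistics named above can all be derived from counts of concordant and discordant pairs, as the following sketch shows. It assumes that classes are coded as integers (for example 0, 1, 2), that the predicted class is treated as the dependent variable for Somers' D, and that Tau-c uses the smaller of the numbers of distinct observed and predicted classes; the paper does not state these choices, and the function name `ordinal_association` is hypothetical.

```python
def ordinal_association(observed, predicted):
    """Goodman-Kruskal Gamma, Somers' D and Kendall's Tau-c from pair counts."""
    n = len(observed)
    concordant = discordant = tied_pred_only = 0
    for i in range(n):
        for j in range(i + 1, n):
            d_obs = observed[i] - observed[j]
            d_pred = predicted[i] - predicted[j]
            product = d_obs * d_pred
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
            elif d_obs != 0:
                # Observed classes differ but predictions are tied.
                tied_pred_only += 1
    diff = concordant - discordant
    m = min(len(set(observed)), len(set(predicted)))

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "gamma": ratio(diff, concordant + discordant),
        "somers_d": ratio(diff, concordant + discordant + tied_pred_only),
        "tau_c": ratio(2 * m * diff, n * n * (m - 1)),
    }
```

All three statistics equal 1 when the predicted classes reproduce the observed ones exactly and fall towards -1 as the ordering is reversed, which matches the [-1, 1] range quoted in the text.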
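The monitoring results of the models were further evaluated against the ground survey data of 10 May 2014. Table 5 lists the omission and commission errors, overall accuracy and Kappa coefficient of the models built with the three methods on the features from the two selection algorithms. The two FLDA models had the lowest accuracy and Kappa coefficient of all models; the mRMR-FLDA model was the better of the two, but reached only 60.5% and 0.321. Of the two SVM models, the mRMR-SVM model had an accuracy of 79.1% and a Kappa coefficient of 0.652, higher than the model built on CA-selected features (CA-SVM model). Of the two AdaBoost models, the mRMR-AdaBoost model had an overall accuracy of 88.4% and a Kappa coefficient of 0.807, higher than the 81.4% and 0.685 of the model built on CA-selected features (CA-AdaBoost model). Comparing overall accuracy across the three methods, for the CA- and mRMR-selected features respectively, the FLDA models were 27.9 and 27.9 percentage points lower, and the SVM models 14.0 and 9.3 percentage points lower, than the AdaBoost models built on the same features. The overall accuracies of the FLDA, SVM and AdaBoost models built on mRMR-selected features were 7.0, 11.7 and 7.0 percentage points higher than those of the models built on CA-selected features. Comparing omission and commission errors across the three methods, the FLDA models were worst overall, followed by the SVM models, and the AdaBoost models had the lowest errors. Comparing the two feature selection algorithms, for every method the model built on CA-selected features had clearly higher omission and commission errors than the model built on mRMR-selected features. AdaBoost models therefore outperformed FLDA and SVM models, and models built on mRMR-selected features outperformed those built on CA-selected features.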
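The validation measures used here can be computed from a confusion matrix as sketched below. The convention that rows hold the surveyed (reference) classes and columns the predicted classes is an assumption, as is the function name `validation_metrics`; the confusion matrices behind Table 5 are not reproduced in this text, so no real matrix is shown. As a plausibility check on the reported figures, with 43 validation points an overall accuracy of 88.4% is consistent with 38 correctly classified points and 81.4% with 35.

```python
def validation_metrics(matrix):
    """Overall accuracy, Cohen's Kappa and per-class omission/commission errors.

    matrix[i][j] is the number of samples of reference class i that were
    predicted as class j (rows: reference, columns: prediction).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    if n == 0:
        raise ValueError("empty confusion matrix")
    row_tot = [sum(matrix[i]) for i in range(k)]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    p_o = sum(matrix[i][i] for i in range(k)) / n             # observed agreement
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)  # chance agreement
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else float("nan")
    omission = [1 - matrix[i][i] / row_tot[i] if row_tot[i] else float("nan")
                for i in range(k)]
    commission = [1 - matrix[j][j] / col_tot[j] if col_tot[j] else float("nan")
                  for j in range(k)]
    return {"overall_accuracy": p_o, "kappa": kappa,
            "omission": omission, "commission": commission}
```

Omission error is the share of a reference class that the model missed, and commission error is the share of a predicted class that actually belongs elsewhere.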
Table 4 Goodness-of-fit evaluation of the models
Table 5 Overall validation results of the monitoring models built with the three methods
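2.2 Monitoring of wheat powdery mildew severity
The evaluation of the severity monitoring models built with the three methods and two feature selection algorithms showed that the two FLDA models were the two least accurate of the six, and even the better one reached only 60.5%, which is too low for reasonably accurate disease monitoring. Therefore, with the feature variables from the two selection algorithms as inputs, only the models built with SVM and AdaBoost were applied to the whole study area to monitor wheat disease, giving the spatial distribution of wheat powdery mildew severity in the study area (Fig. 2).
Fig. 2 Spatial distribution of wheat powdery mildew severity monitored by the SVM and AdaBoost models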
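The maps show that the CA-SVM model classified almost all wheat in the study area as healthy, with only scattered diseased patches, which is completely different from the other three models. The mRMR-SVM model and the two AdaBoost models all showed serious disease west of Liquan County and little disease to the east, with the severe areas mainly west of Liquan County. East of Liquan County, the results of these three models were essentially the same: very few severe areas, a small proportion of slight areas and mostly healthy wheat. West of Liquan County, the mRMR-SVM model differed clearly from the two AdaBoost models: its slight area was clearly larger, and its healthy and severe areas smaller, than those of the two AdaBoost models. The two AdaBoost models gave essentially the same severe areas; the healthy area of the CA-AdaBoost model was larger than that of the mRMR-AdaBoost model, and the reverse held for the slight area. In all three models, however, the slight area was the largest, followed by the healthy area, and the severe area was the smallest. Of the 43 survey points, 22 lay west of Liquan County, of which 5 were healthy, 11 slight and 6 severe; of the 21 points east of Liquan County, 17 were healthy, 4 slight and none severe. Taking the field survey, the evaluation of the four models and the mapped severity distributions together, the CA-SVM result was highly inconsistent with the actual situation, whereas the results of the other three models agreed reasonably well with it; the mRMR-AdaBoost model agreed best, indicating that it outperformed the mRMR-SVM and CA-AdaBoost models.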
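Using Landsat 8 data, this paper selected two groups of feature variables with two feature selection methods, CA (Wetness, LST and SIWSI) and mRMR (Greenness, Wetness, LST, RDVI and SR), used them as inputs to AdaBoost to monitor the severity of wheat powdery mildew in the wheat-growing area of the western Guanzhong Plain, Shaanxi Province, and compared the two AdaBoost models with models built by the commonly used classification methods FLDA and SVM. The results showed that, of the six models, the mRMR-SVM model and the two AdaBoost models monitored powdery mildew severity in the study area fairly accurately. For the CA- and mRMR-selected features respectively, the overall accuracy of the AdaBoost models was 27.9 and 27.9 percentage points higher than that of the corresponding FLDA models and 14.0 and 9.3 percentage points higher than that of the corresponding SVM models, showing that AdaBoost has good prospects for crop disease monitoring. Owing to the characteristics of the feature selection methods themselves, for FLDA, SVM and AdaBoost the overall accuracies of the models built on mRMR-selected features were 7.0, 11.7 and 7.0 percentage points higher than those of the corresponding models built on CA-selected features, indicating that mRMR has advantages over the commonly used CA for selecting modelling features for crop diseases.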
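This study used only one image to monitor wheat powdery mildew severity, whereas the occurrence and development of the disease are affected by many kinds of factors, such as meteorological factors and field management. The data were also collected only in the western Guanzhong Plain of Shaanxi Province, so both the amount of data and the spatial coverage are relatively small, and the generality of the models needs further validation. In addition, features were selected with the commonly used CA and with mRMR in order to address the inability of common methods to remove class-irrelevant and redundant features at the same time, but the complexity of the mRMR algorithm was not considered. Future work should, as far as possible, use disease data from multiple years and regions combined with multi-source data, look for simpler and more effective feature selection methods for screening model inputs, and use the modelling methods in this paper to build several monitoring models from which a better one can be chosen for accurate monitoring of disease severity.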
[1] Zhou Yilin, Duan Xiayu, Cheng Dengfa. Estimation of the effects of powdery mildew on wheat yield and protein content using hyperspectral remote sensing[J]. Acta Phytophylacica Sinica, 2009, 36(1): 32-36. (in Chinese with English abstract)
[2] Zhang Jingcheng, Yuan Lin, Wang Jihua, et al. Research progress of crop diseases and pests monitoring based on remote sensing[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(20): 1-11. (in Chinese with English abstract)
[3] Jiang Jinbao, Chen Yunhao, Huang Wenjiang. Using hyperspectral derivative index to monitor winter wheat disease[J]. Spectroscopy and Spectral Analysis, 2007, 27(12): 2475-2479. (in Chinese with English abstract)
[4] Devadas R, Lamb D W, Simpfendorfer S, et al. Evaluating ten spectral vegetation indices for identifying rust infection in individual wheat leaves[J]. Precision Agriculture, 2008, 10(6): 459-470.
[5] Mahlein A K, Rumpf T, Welke P, et al. Development of spectral indices for detecting and identifying plant diseases[J]. Remote Sensing of Environment, 2013, 128: 21-30.
[6] Lu Junjing, Huang Wenjiang, Jiang Jinbao, et al. Comparison of wavelet features and conventional spectral features on estimating severity of stripe rust in winter wheat[J]. Journal of Triticeae Crops, 2015, 35(10): 1456-1461. (in Chinese with English abstract)
[7] Wang Jing, Jing Yuanshu, Huang Wenjiang, et al. Comparative research on estimating the severity of yellow rust in winter wheat[J]. Spectroscopy and Spectral Analysis, 2015, 35(6): 1649-1653. (in Chinese with English abstract)
[8] Manuel López-López, Rocío Calderón, Victoria González-Dugo, et al. Early detection and quantification of almond red leaf blotch using high-resolution hyperspectral and thermal imagery[J]. Remote Sensing, 2016, 8(4): 276.
[9] Yuan Lin, Pu Ruiliang, Zhang Jingcheng, et al. Using high spatial resolution satellite imagery for mapping powdery mildew at a regional scale[J]. Precision Agriculture, 2016, 17: 332-348.
[10] Zhang Jingcheng, Pu Ruiliang, Yuan Lin, et al. Integrating remotely sensed and meteorological observations to forecast wheat powdery mildew at a regional scale[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2014, 7(11): 4328-4339.
[11] Ma Huiqin, Huang Wenjiang, Jing Yuanshu. Wheat powdery mildew forecasting in filling stage based on remote sensing and meteorological data[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(9): 165-172. (in Chinese with English abstract)
[12] Nie Chenwei, Yuan Lin, Wang Baotong, et al. Monitoring wheat powdery mildew based on integrated remote sensing and meteorological information[J]. Acta Phytopathologica Sinica, 2016, 46(2): 285-288. (in Chinese with English abstract)
[13] Kohavi R, John G. Wrappers for feature subset selection[J]. Artificial Intelligence, 1997, 97(1/2): 273-324.
[14] Zhang Lixin. Study on Feature Selection and Ensemble Learning Based on Feature Selection for High-dimensional Datasets[D]. Beijing: Tsinghua University, 2004. (in Chinese with English abstract)
[15] Yao Xu, Wang Xiaodan, Zhang Yuxi, et al. Summary of feature selection algorithms[J]. Control and Decision, 2012, 27(2): 161-166, 192. (in Chinese with English abstract)
[16] Ding Jianrui, Huang Jianhua, Liu Jiafeng, et al. Elastogram features selection and classification based on mRMR and SVM[J]. Journal of Harbin Institute of Technology, 2012, 44(5): 81-85. (in Chinese with English abstract)
[17] Li Biqing, Hu Lele, Chen Lei, et al. Prediction of protein domain with mRMR feature selection and analysis[J]. PLoS One, 2012, 7(6): e39308.
[18] Xiao Yan, Jiang Qigang, Wang Bin, et al. Object based land-use classification based on hybrid feature selection method of combining Relief F and PSO[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(4): 211-216. (in Chinese with English abstract)
[19] Sen B, Peker M. Novel approaches for automated epileptic diagnosis using FCBF selection and classification algorithms [J]. Turkish Journal of Electrical Engineering & Computer Sciences, 2013, 21(Supp. 1): 2092-2109.
[20] Huang Wenjiang, Guan Qingsong, Luo Juhua, et al. New optimized spectral indices for identifying and monitoring winter wheat diseases [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2014, 7(6): 2516-2524.
[21] Mahlein A K, Rumpf T, Welke P, et al. Development of spectral indices for detecting and identifying plant diseases [J]. Remote Sensing of Environment, 2013, 128(1): 21-30.
[22] Unler A, Murat A, Chinnam R B. mr2PSO: A maximum relevance minimum redundancy feature selection method based on swarm intelligence for support vector machine classification [J]. Information Sciences, 2011, 181(20): 4625-4641.
[23] Cheng Ximeng, Shen Zhanfeng, Xing Tingyan, et al. Efficiency and accuracy analysis of multispectral image classification based on mRMR feature selection method[J]. Journal of Geo-information Science, 2016, 18(6): 815-823. (in Chinese with English abstract)
[24] Cao Ying, Miao Qiguang, Liu Jiachen, et al. Advance and prospects of AdaBoost algorithm[J]. Acta Automatica Sinica, 2013, 39(6): 745-758. (in Chinese with English abstract)
[25] Peng X, Setlur S, Govindaraju V, et al. Using a boosted tree classifier for text segmentation in hand-annotated documents[J]. Pattern Recognition Letters, 2012, 33(7): 943-950.
[26] Hu Juan, Chen Youbin. Writer-independent off-line handwritten signature verification based on real AdaBoost[C]// Proceedings of the 2nd International Conference on Artificial Intelligence, Management Science and Electronic Commerce, New York, USA: IEEE, 2011: 6095-6098.
[27] Viola P, Jones M J. Robust real-time face detection[J]. International Journal of Computer Vision, 2004, 57(2): 137-154.
[28] Shi Yuqiong, Li Tuansheng. Farmland grain potential productivity of Guanzhong, Shaanxi Province[J]. Chinese Agricultural Science Bulletin, 2015, 31(13): 196-204. (in Chinese with English abstract)
[29] Baret F, Guyot G. Potentials and limits of vegetation indices for LAI and APAR assessment [J]. Remote Sensing of Environment, 1991, 35(2/3): 161-173.
[30] Roujean J L, Breon F M. Estimating PAR absorbed by vegetation from bidirectional reflectance measurements [J]. Remote Sensing of Environment, 1995, 51(3): 375-384.
[31] Fensholt R, Sandholt I. Derivation of a shortwave infrared water stress index from MODIS near-and shortwave infrared data in a semiarid environment[J]. Remote Sensing of Environment, 2003, 87(1): 111-121.
[32] Ding C, Peng H. Minimum redundancy feature selection from microarray gene expression data[J]. Journal of Bioinformatics and Computational Biology, 2005, 3(2): 185-205.
[33] Peng Hanchuan, Long Fuhui, Ding Chris. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(8): 1226-1238.
[34] Sun Jian, Wang Chenghua. Analog circuit fault diagnosis based on mRMR and optimized SVM[J]. Chinese Journal of Scientific Instrument, 2013, 34(1): 221-226. (in Chinese with English abstract)
[35] Wang Lu, Gong Guanghong. Multiple features remote sensing image classification based on combining Relief F and mRMR[J]. Chinese Journal of Stereology and Image Analysis, 2014, 19(3): 250-257. (in Chinese with English abstract)
[36] Freund Y, Schapire R E. A decision-theoretic generalization of online learning and an application to Boosting[J]. Journal of Computer and System Sciences, 1997, 55(1): 119-139.
[37] Li Wenbo, Wang Liyan. An approach of vehicle detection based on AdaBoost algorithm[J]. Journal of Changchun University of Science and Technology: Natural Science Edition, 2009, 32(2): 292-295. (in Chinese with English abstract)
[38] Fu Zhongliang. Effectiveness analysis of AdaBoost[J]. Journal of Computer Research and Development, 2008, 45(10): 1747-1755. (in Chinese with English abstract)
[39] Cao Ying, Miao Qiguang, Liu Jiachen, et al. Advance and prospects of AdaBoost algorithm[J]. Acta Automatica Sinica, 2013, 39(6): 745-758. (in Chinese with English abstract)
[40] Schapire R E, Singer Y. Improved boosting algorithms using confidence-rated predictions[J]. Machine Learning, 1999, 37(3): 297-336.
Remote sensing monitoring of wheat powdery mildew based on AdaBoost model combining mRMR algorithm
Ma Huiqin1,2, Huang Wenjiang2※, Jing Yuanshu1, Dong Yingying2, Zhang Jingcheng3,
Nie Chenwei2, Tang Cuicui2,4, Zhao Jinling4, Huang Linsheng4
(1. Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, School of Applied Meteorology, Nanjing University of Information Science & Technology, Nanjing 210044, China; 2. Key Laboratory of Digital Earth Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094, China; 3. College of Life Information Science and Instrument Engineering, Hangzhou Dianzi University, Hangzhou 310018, China; 4. School of Electronic and Information Engineering, Anhui University, Hefei 230039, China)
Wheat powdery mildew has become one of the most serious wheat diseases in China, so modern remote sensing technology is needed to improve monitoring of the disease, guide disease prevention and safeguard China's grain production. Feature selection is one of the key issues in building inversion models, and the choice of feature selection method directly affects disease classification accuracy. In this study, a Landsat 8 image was used to extract eighteen feature variables. Two groups of features were then obtained: Wetness, land surface temperature (LST) and shortwave infrared water stress index (SIWSI) by correlation analysis (CA), and Greenness, Wetness, LST, re-normalized difference vegetation index (RDVI) and simple ratio (SR) by the minimum redundancy maximum relevance (mRMR) algorithm. The basic idea of AdaBoost is to combine a number of weak classifiers into a strong classifier with high classification ability. It was originally designed for binary classification, and we adapted it to the three-class problem through one-against-all binary decomposition. AdaBoost and two common classification methods, Fisher linear discriminant analysis (FLDA) and support vector machine (SVM), were then used with the two feature groups to monitor the severity of wheat powdery mildew (healthy, slight, severe) in the western Guanzhong Plain, Shaanxi Province, China. The model combining mRMR with AdaBoost (mRMR-AdaBoost model) produced the highest Spearman correlation (0.868) of the six models. Moreover, its Somers' D, Goodman-Kruskal Gamma and Kendall's Tau-c values were higher than those of the CA-based models and of the mRMR-based FLDA and SVM models, indicating that the mRMR-AdaBoost model performed better than the other five models. Validation showed that the overall accuracy and Kappa coefficient of the AdaBoost models were 81.4% and 0.685 with CA and 88.4% and 0.807 with mRMR; these accuracies were 27.9 and 27.9 percentage points higher than those of the FLDA models and 14.0 and 9.3 percentage points higher than those of the SVM models with the corresponding selection algorithms. The overall accuracies of the FLDA, SVM and AdaBoost models with mRMR were 7.0, 11.7 and 7.0 percentage points higher than those of the corresponding models with CA. Furthermore, the mRMR-AdaBoost model had the lowest omission and commission errors of all six models. When the severity maps produced by the SVM and AdaBoost models were compared with the ground survey, the maps of the mRMR-SVM model and the two AdaBoost models were similar to one another and close to the survey results, and the mRMR-AdaBoost map was the closest. These results show that AdaBoost has good prospects for remote sensing monitoring of crop diseases and that mRMR has advantages over CA for selecting feature variables for crop disease monitoring models. The results can provide a methodological reference for monitoring other crop diseases.
Keywords: diseases; remote sensing; monitoring; wheat; mRMR algorithm; AdaBoost method
10.11975/j.issn.1002-6819.2017.05.024
S4; TP79
A
1002-6819(2017)-05-0162-08
Ma Huiqin, Huang Wenjiang, Jing Yuanshu, Dong Yingying, Zhang Jingcheng, Nie Chenwei, Tang Cuicui, Zhao Jinling, Huang Linsheng. Remote sensing monitoring of wheat powdery mildew based on AdaBoost model combining mRMR algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(5): 162-169. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2017.05.024 http://www.tcsae.org
2016-06-21
2017-02-11
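Funding: Key Project of International Cooperation of the Bureau of International Co-operation, Chinese Academy of Sciences (131211KYSB20150034); National Key Research and Development Program of China (2016YFD030702); International Cooperation Project of the National Natural Science Foundation of China (61661136004); National Natural Science Foundation of China (41271412, 41601467); Natural Science Research Project of Jiangsu Higher Education Institutions (15KJA170003).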
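Ma Huiqin, female, from Gansu Province; her research focuses on agricultural meteorology and quantitative remote sensing of vegetation. School of Applied Meteorology, Nanjing University of Information Science & Technology, Nanjing 210044.
Email: 1033513161@qq.com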
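※Corresponding author: Huang Wenjiang, PhD, research professor and doctoral supervisor; his research focuses on quantitative remote sensing of vegetation. Key Laboratory of Digital Earth Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094. Email: huangwj@radi.ac.cn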