LIAO Zhen-qi,DAI Yu-long,WANG Han,Quirine M.KETTERINGS,LU Jun-sheng,ZHANG Fu-cang,LI Zhi-jun,FAN Jun-liang#
1 Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas of the Ministry of Education,Northwest A&F University,Yangling 712100,P.R.China
2 Department of Animal Science,Cornell University,Ithaca,NY 14853,USA
Abstract The accurate and rapid estimation of canopy nitrogen content (CNC) in crops is the key to optimizing in-season nitrogen fertilizer application in precision agriculture. However,the determination of CNC from field sampling data for leaf area index (LAI),canopy photosynthetic pigments (CPP;including chlorophyll a,chlorophyll b and carotenoids) and leaf nitrogen concentration (LNC) can be time-consuming and costly. Here we evaluated the use of high-precision unmanned aerial vehicle (UAV) multispectral imagery for estimating the LAI,CPP and CNC of winter wheat over the whole growth period. A total of 23 spectral features (SFs;five original spectrum bands,17 vegetation indices and the gray scale of the RGB image) and eight texture features (TFs;contrast,entropy,variance,mean,homogeneity,dissimilarity,second moment,and correlation) were selected as inputs for the models. Six machine learning methods,i.e.,multiple stepwise regression (MSR),support vector regression (SVR),gradient boosting decision tree (GBDT),Gaussian process regression (GPR),back propagation neural network (BPNN) and radial basis function neural network (RBFNN),were compared for the retrieval of winter wheat LAI,CPP and CNC values,and a double-layer model was proposed for estimating CNC based on LAI and CPP. The results showed that the inversion of winter wheat LAI,CPP and CNC by the combination of SFs+TFs greatly improved the estimation accuracy compared with that by using only the SFs. The RBFNN and BPNN models outperformed the other machine learning models in estimating winter wheat LAI,CPP and CNC. The proposed double-layer models (R2=0.67-0.89,RMSE=13.63-23.71 mg g-1,MAE=10.75-17.59 mg g-1) performed better than the direct inversion models (R2=0.61-0.80,RMSE=18.01-25.12 mg g-1,MAE=12.96-18.88 mg g-1) in estimating winter wheat CNC. The best winter wheat CNC accuracy was obtained by the double-layer RBFNN model with SFs+TFs as inputs (R2=0.89,RMSE=13.63 mg g-1,MAE=10.75 mg g-1). The results of this study can provide guidance for the accurate and rapid determination of winter wheat canopy nitrogen content in the field.
Keywords: UAV multispectral imagery,spectral features,texture features,canopy photosynthetic pigment content,canopy nitrogen content
Wheat (TriticumaestivumL.) ranks third globally,after maize (ZeamaysL.) and rice (OryzasativaL.),in terms of total production area (Fuet al.2021). Winter wheat is an important crop in Northwest China,but its production is affected by the low and uneven rainfall as well as poor soil fertility (Fanget al.2022). Water and nitrogen are critical to the growth and development of winter wheat (Xionget al.2018;Wooet al.2020). Appropriate supplemental irrigation significantly increases the chlorophyll content of wheat leaves,and improves crop yield and water-nitrogen use efficiency (Al-Ghzawiet al.2018). Appropriately increasing the nitrogen (N) fertilizer application rate can improve the photosynthetic productivity of wheat,thereby increasing grain yield (Maet al.2021). The ridge-furrow film mulching system has been an effective planting pattern in semi-arid and semi-humid regions of China to reduce soil evaporation,collect rainwater and improve crop yields (Liaoet al.2022b). This system can significantly increase the leaf area index (LAI) and leaf chlorophyll content,and promote plant N absorption,thereby increasing crop yields (Guet al.2021).
The accurate and rapid monitoring of the N concentration in crops during the growth period of winter wheat is the key to optimizing in-season N management of winter wheat. Leaf color monitoring,tissue sampling and analyses,non-destructive chlorophyll monitoring (Padillaet al.2018),LAI assessment and leaf photosynthetic pigment determination have been used for monitoring N concentrations,but such methods are labor-intensive and costly,and for the most part they are also destructive(Padillaet al.2018;Aliet al.2020). Spectral features of the radiation reflected,transmitted or absorbed by intact leaves can provide estimates of the leaf chlorophyll content. As vegetation indices (VIs) may correlate with chlorophyll content and biomass,they can be used to indirectly estimate the N status of crops (Aliet al.2020).
Hand-held or machine-mounted proximal optical sensors can provide radiometric indicators and indirect measurements of indicator compounds that are sensitive to crop nitrogen status,which can then be used to indirectly determine crop N status in a non-destructive manner(Padillaet al.2018). Satellite platforms can provide remote sensing data over large areas,but the precise N management of crops during the growing season requires a large amount of timely multi-temporal data,and satellites are usually limited by factors such as revisitation intervals or weather conditions (Zhanget al.2021). Highquality and real-time unmanned aerial vehicle (UAV)remote sensing data may be better suited for determining plant characteristics such as LAI,canopy photosynthetic pigments (CPP;including chlorophylla,chlorophyllband carotenoids) and canopy nitrogen content (CNC) (Yaoet al.2017;Liuet al.2018). Zhaet al.(2020) found that machine learning algorithms improved the prediction accuracy of the UAV-based rice nitrogen nutrition index compared with stepwise multiple linear regression. Qiuet al.(2021) also estimated the nitrogen nutrition index in rice from UAV RGB images coupled with machine learning algorithms.
There are obvious correlations among crop biochemical parameters and VIs,and estimating N based on VIs is an important method for diagnosing crop N status (Geipelet al.2016). Studies have shown that accurately describing the complex canopy structure using only VIs is difficult (Luoet al.2020). Therefore,the effect of surrounding pixels on the target pixel should be considered when estimating LAI,biomass or leaf N content. Texture features (TFs) can describe the spatial correlation of pixels and help reflect changes in the vegetation structure (Wulderet al.1998).Crops can be classified based on TFs extracted from the Gray Level Co-occurrence Matrix (GLCM) and spectral features (SFs),resulting in improved classification accuracy(Kwak and Park 2019). Several studies have shown that VIs combined with TFs can significantly improve the estimation accuracy of LAI and biomass (Zhouet al.2017;Hlatshwayoet al.2019;Yueet al.2019). However,few studies have examined the effects of TFs on the estimation of winter wheat LAI,CPP and CNC. Since chlorophyll is the main leaf component that determines its reflectance in the visible region of the spectrum,optical technology has great potential for providing information on canopy chlorophyll and nitrogen contents (Clevers and Gitelson 2013). Spectroscopic techniques have become feasible for the non-destructive estimation of large-scale changes in canopy chlorophyll,carotenoid and nitrogen contents.
The objectives of this study were to: (1) explore the potential of SFs combined with TFs for non-destructively estimating the wide-ranging changes in winter wheat LAI,CPP and CNC;(2) compare the performances of six machine learning models for the retrieval of winter wheat LAI,CPP and CNC;and (3) specifically propose a double-layer model for improving winter wheat CNC predictions based on LAI and CPP from UAV multispectral images. The results of this study can provide guidance for optimizing winter wheat nitrogen fertilization by accurately estimating the canopy nitrogen content.
Fig. 1 Location and arrangement of the experimental plots.The upper image is a satellite image of the experimental station,and the lower image is an unmanned aerial vehicle image of the experimental plots.
Fig. 2 The flowchart of double-layer model for estimating winter wheat canopy nitrogen content (CNC).
Fig. 3 Boxplots showing the normal distribution curve characteristics for leaf area index (LAI;A),chlorophyll a (Chl a;B),chlorophyll b (Chl b;C),carotenoids (Car;D),chlorophyll (Chl;E) and leaf nitrogen concentration (LNC;F) at the different growth stages of winter wheat.
Fig. 4 Violin plots showing the distribution characteristics of canopy chlorophyll a content (A),canopy chlorophyll b content(B),canopy carotenoid content (C) and canopy nitrogen content (D) at different growth stages of winter wheat. CCLCa,canopy chlorophyll a content;CCLCb,canopy chlorophyll b content;CCC,canopy carotenoid content;CNC,canopy nitrogen content;IQR,inter-quartile range.
Fig. 5 Correlation coefficient matrix (P<0.01) of the relationship between feature variables (spectral features and texture features)and field data (the deeper the red color,the stronger the positive correlation. A,spectral features. B,mean. C,homogeneity. D,variance. E,contrast. F,dissimilarity. G,entropy. H,second moment. I,correlation. The deeper the blue color,the stronger the negative correlation;and the deeper the white color,the weaker the correlation.
Fig. 5 (Continued from preceding page)
Fig. 6 Taylor diagrams for the leaf area index and canopy photosynthetic pigment estimation models. A,leaf area index. B,canopy chlorophyll a content. C,canopy chlorophyll b content. D,canopy carotenoid content. CRMSE,centered root mean square error.
Fig. 7 Scatter plots of measured winter wheat canopy nitrogen concentration and estimated values from the double-layer model.A,multiple stepwise regression with spectral features. B,support vector regression with spectral features. C,gradientboosting decision tree with spectral features. D,Gaussian process regression with spectral features. E,back propagation neural network with spectral features. F,radial basis function neural network with spectral features. G,multiple stepwise regression with spectral and texture features. H,support vector regression with spectral and texture features. I,gradient boosting decision tree with spectral and texture features. J,Gaussian process regression with spectral and texture features. K,back propagation neural network with spectral and texture features. L,radial basis function neural network with spectral and texture features.
The study site was located at the Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas (34°18′N,108°24′E;521 m altitude)sponsored by the Ministry of Education,Northwest A&F University,Yangling,China (Fig.1). Over the past 25 years (1995-2019),the average annual air temperature was 13.3°C,annual precipitation was 595 mm,pan evaporation was 1 500 mm,sunshine duration was 2 185 h and the average frost-free period exceeded 210 d. This location has a warm temperate and monsoon climate with abundant light and heat resources,but the seasonal distribution of precipitation is uneven.According to the USDA soil texture classification,the soil is loam,with a pH of 8.0,dry bulk density of 1.4 g cm-3,field capacity of 24% (gravimetric),permanent wilting point of 8.5% (gravimetric),organic matter of 13.8 g kg-1,and total nitrogen of 0.9 g kg-1,nitrate nitrogen of 81.3 mg kg-1,available phosphorus of 25.10 mg kg-1,available potassium of 144.7 mg kg-1in the 0-20 cm soil layer prior to the experiment.
The field experiment was conducted in a split-split-split plot design with three replicates. The main plot was planting mode (P1,flat cultivation with no mulching;P2,ridge-furrow cultivation with plastic film mulching on the ridge);the split plot was seeding rate (S1,90 kg ha-1;S2,135 kg ha-1S3,180 kg ha-1);the split-split plot was N rate(N1,100 kg ha-1;N2,200 kg ha-1;urea with 46% N),and the sub-sub-subplot was irrigation level (W1,rainfed;W2,30 mm irrigation in both winter and spring). Each ridgefurrow cultivation plot was 8 m long and 6 m wide. The flat cultivation plots were 8 m long and 3 m wide. The ridge and furrow were 60 cm wide,and the ridge covered by the plastic film was 10 cm high. Fertilizers were applied before sowing. To ensure sufficient P,150 kg P2O5was applied as superphosphate (16% P2O5). In addition,potassium sulfate (51% K2O) was applied to supply 90 kg K2O ha-1for each treatment. The widely cultivated wheat variety Xiaoyan 22 (bred by the Northwest A&F University,Yangling,China) was planted on October 15,2020,and harvested on June 7,2021. Other management practices(weed/pest/disease control and tillage) followed the standard local practices.
LAI measurementThe LAI-2200 plant canopy analyzer(LI-COR Biosciences,Inc.,Lincoln,NE,USA) was used to measure wheat LAI. Measurements were conducted during the daytime on bright sunny days and a 45° view cap was used to avoid direct sunlight. Three sampling areas were selected by the three-point sampling method for each treatment plot,and three repeated measurements were performed in each sampling area. Each sampling unit included one A value (measured at the top of the canopy) above the canopy and four B values (measured at four marking areas under the canopy) under the canopy with three replicates. The samples were collected at the jointing (March 21),booting (April 11),heading-anthesis(April 30) and grain-filling (May 21) stages. A total of 72 measurements were obtained at each sampling stage,resulting in a total of 288 data points for the experiment.All data were pooled for subsequent analyses.
Canopy photosynthetic pigment content measurementsFive fully expanded flag leaves were sampled at the same locations where LAI values were measured,and they were soaked with 95% alcohol for about three days.The absorption values of the extraction mixtures were measured at 663,645,and 470 nm wavelengths using an ultraviolet spectrophotometer (UV-2250,Kyoto,Japan)to determine the contents of chlorophylla(Chla,mg L-1),chlorophyllb(Chlb,mg L-1),chlorophyll (Chl,Chla+Chlb,mg L-1) and carotenoids (Car,mg L-1) (Guet al.2021;Liaoet al.2022b):
where A645,A663,and A470are the absorbance values at 645,663 and 470 nm,respectively;DW (g) is the leaf fresh weight (0.1 g) and VT(mL) is the extract volume(25 mL).
The CPP content (including CCLCa,CCLCb,CCC)was obtained by multiplying LAI by leaf photosynthetic pigments (Atzbergeret al.2010;Louet al.2021):
where CCLCais the canopy chlorophyllacontent (mg g-1),CCLCbis the canopy chlorophyllbcontent (mg g-1),and CCC is the canopy carotenoid content (mg g-1).
Canopy N content measurementLeaves were sampled at each of the four growth stages for tissue N measurements. The samples were dried in an oven at 105°C for 0.5 h,and subsequently at 75°C to constant weight. Dried samples were ground to pass a 0.5-mm sieve,and then digested using the H2SO4-H2O2method for chemical analysis (Yanet al.2022). The leaf N content (LNC,mL) was determined by the Kjeldahl digestion method (Kjeltec 2300,FOSS Tecator,Sweden),and canopy nitrogen content (CNC) was obtained by multiplying LNC and LAI:
where A (mg L-1) is the concentration measured by the flow analyzer and M (g) is the weight of the plant sample before digestion.
A DJI M200 V2 Quadcopter UAV (DJI Company,Shenzhen,China) equipped with a multiple camera array wireless system (Prime ALTUM,MicaSense Inc.,Seattle,WA,USA) was used. The expanded dimensions(including propellers and landing gear) of the DJI M200V2 were 883 mm×886 mm×398 mm (length×width×height).The weight (including two TB55 batteries) and maximum take-off weight of the DJI M200V2 were 4.69 and 6.14 kg,respectively. The corresponding maximum flight times were 38 and 24 min,respectively. The weight and dimensions (length×width×height) of the Prime ALTUM camera were 410 g and 82.0 mm×67.0 mm×64.5 mm,respectively. The Prime ALTUM camera was equipped with GPS,and it could capture five spectral images and one thermal infrared image simultaneously (Table 1). The image resolutions of the spectral unit and thermal infrared image unit were 2 064×1 544 pixels and 160×120 pixels,respectively.
Table 1 Spectral information of the Prime ALTUM camera
The UAV multispectral images were collected on sunny and windless days between 13:00 and 14:00 h,shortly before the collection of field data (16:00-18:00 h) on the same day. The flight altitude was set at 30 m above the ground,providing images with spatial resolutions of 7.24 mm/pixel on the ground. The flight path was designed to obtain continuous images with overlaps of up to 80%. The image was calibrated before each flight using the standard reflectivity radiation calibration panelsupplied with the Prime ALTUM camera.
The Pix4D Mapper Software (Version 4.5.6,Pix4DInc.,Switzerland) was used for image preprocessing to generate corrected and geo-referenced spectral reflectance data,including initial processing,conformal generation,and exponential calculation (with reflectance correction). The output of each layer was a single high-resolution TIFF image. The method of vegetation index threshold (Zhouet al.2021) was used for the threshold segmentation of the preprocessed multispectral image in the ENVI 5.3 Software(Harris Corp,Bloomfield,CO) and R (Version 4.1.3) to eliminate the soil and mulching background,and accurately obtain the average reflectance of the winter wheat canopy.The UAV estimation data and field measurement data at the sampling points were corresponded by the Real-Time Kinematic Differential Global Positioning System (RTKDGPS).
Five single-band reflectance data,17 VIs (Table 2) and gray scale were selected as the inputs,as they are most closely related to LAI and leaf photosynthetic pigments.
Table 2 Vegetation indices derived from the blue (B),green (G),red (R),red edge (RE) and near infrared (NIR) bands
According to the grey level co-occurrence matrix(GLCM),the contrast (CON),entropy (ENT),variance(VAR),mean (MEA),homogeneity (HOM),dissimilarity(DIS),second moment (SEC),and correlation (COR),were selected as TFs and their correlations with the measured plant data were evaluated. The image texture was generated from the multispectral image after removing the soil and mulch background data and each of the TFs were analyzed with the smallest window size (3×3 pixels). Further details of the calculations are available from the literature (Yueet al.2019;Zhouet al.2021).
where Ngis the number of distinct gray levels in the quantized image;p(i,j) is the value of the (i,j)th entry in the gray level co-occurrence matrix;μxandμyare the mean values ofxrows andyrows in the matrix calculation,respectively;andσxandσyare the standard deviations ofxrows andyrows in the matrix calculation,respectively.
The 23 SFs and their corresponding eight TFs were preliminarily screened. Variable inputs included SFs only and SFs+TFs. The selected SFs and combinations of SFs+TFs were subjected to full subset selection in order to obtain the most sensitive variables. The leaps package of the software R 4.1.3 and the ‘regsubsets()’ function were used to select the variables. After performing the optimal subset regression,the returned subset regression equation of the combination of independent variables and the corresponding evaluation index of each regression equation were used. The function ‘which()’ was then used to select the best performing regression equation using the adjustedR2,Marlows Cp and Bayesian information criterion (BIC) as evaluation criteria. A multicollinearity check was performed on the screened variables by optimal subset screening. The final screening results are shown in Table 3.
Table 3 Statistics of the variable screening based on full subset selection
Machine learning algorithmsSix machine learning algorithms were used for modeling,including the multiple stepwise regression (MSR),support vector regression(SVR),gradient boosting decision tree (GBDT),Gaussian process regression (GPR),back propagation neural network (BPNN) and radial basis function neural network(RBFNN).
The MSR automatically selects the most important variables from a large number of variables,and establishes a prediction or interpretation model for regression analysis.It introduces new variables one by one and requires the sum of partial regression squares to be significant.Old variables are tested one by one and variables withinsignificant sums of partial regression squares are excluded. The essence is to establish the optimal multiple linear regression equation (Jinet al.2015).
The SVM is a class of generalized linear classifiers that performs the binary classification of data in a supervised learning manner. Its principle is to establish a classification hyperplane that maximizes the distances between learning samples. SVR can be obtained by extending SVM from a classification problem to a regression problem (Shafieeet al.2021). The kernel function of the SVR was a Gaussian function in this study.
The GBDT is also known as multiple additive regression tree,which is an iterative decision tree algorithm (Liuet al.2021). This algorithm consists of multiple decision trees,and the conclusions of all trees are accumulated to make the final prediction. The natural advantage of GBDT is that a variety of discriminative features and feature combinations can be found.
The GPR is a regression analysis of data based on the prior Gaussian process. It continuously adjusts the hyperparameters by observing the mean and covariance,which has the advantages of easy implementation,adaptive acquisition of hyperparameters and probabilistic significance of the predicted output (Fuet al.2020). The kernel function of GPR was the squared exponential in this study.
The BPNN is a multi-layer feedforward neural network trained according to the error back propagation algorithm(Zhouet al.2021). It has the advantages of global approximation and high accuracy,and it can describe nonlinear relationships between data. BPNN includes two calculation processes of forward and reverse. In the forward calculation,the input variable is transferred from the input layer to the output layer through the hidden layer calculation. If the expected value is not obtained in the output layer,the process will automatically transfer to the reverse calculation. The error signal will be returned along the original path,and the weight of each hidden layer will be modified to minimize the error. In this study,the maximum number of iterations was set to 1 000,and the minimum error of the training target was 0.001.
The RBFNN is a classic single hidden layer feedforward neural network (Heet al.2019). Its hidden layer neuron activation function is a radial basis function,and the output layer is a linear combination of hidden layer neuron outputs. The network has the advantages of global optimality,limited computational complexity,fast computational speed,and best approximation performance. The number of neurons in the hidden layer varied between 40 and 100,and the radial basis expansion speed ranged from 1 to 9 in this study. The maximum number of iterations was set to 1 000,and the minimum error of the training target was 0.001.
Each dataset of the sampling period was arranged in the descending order of measured values,and samples selected from every three samples (out of 288 samples total per sampling period) were used as the validation dataset,so that the ratio of calibration data to validation data was approximately 2:1. This selection method can ensure the range consistency and homogeneous distribution of the calibration samples and validation samples.
Establishing the canopy nitrogen content estimation modelTwo approaches were used for model development,i.e.,the direct estimation model and a double-layer model. In the direct estimation model,CNC was estimated directly from the SFs or SFs+TFs after optimal subset screening. In the double-layer model approach,CPP was first estimated (the first layer model)from the SFs or SFs+TFs after optimal subset selection.Then,a second layer model (named the EPNC model)was developed for estimating CNC by CPP. All modeling and validation methods were the same as the CPP content model (Fig.2).
The PROSAIL is a canopy-leaf coupled radiative transfer model (RTM) coupled with the leaf optical model(PROSPECT) and the canopy model (SAIL). This model was used to simulate the spectral reflectance of plants in the 400 to 2 500 nm band (Verrelstet al.2014). To improve the accuracy and applicability of the EPNC model,part of the data used for the EPNC model came from the measured data in the field,while the other part came from the LUT-based PROSAIL inversion (Look up table,LUT) data. A 50-50% split of these data yielded two subsets that were used for EPNC model training and evaluation.
The LAI,CCLC and CCC were retrieved by using a LUT-based method for PROSAIL model inversion. These variable retrievals were performed in the automated radiative transfer model operator (ARTMO) toolbox.The ARTMO toolbox is a software package written in Matlab that provides various tools for running different RTMs in forward or inverse modes at both leaf and canopy scales (Verrelstet al.2012;Riveraet al.2013).At the same time,the models were parameterized by using either the variable ranges observed in the field during each measurement date or the values obtained from the relevant literature (Wanget al.2022;Zhang Jet al.2022). A separate LUT was built for each date to constrain the model and reduce the effect of discomfort(Chakhvashviliet al.2022). To produce LUTs,PROSAIL was first run in the forward mode separately for each date. Either a Gaussian or uniform distribution was used for the variables (Table 4). The relevant parameters were derived from the measured data and related literature values (Liet al.2019b;Zhaiet al.2021;Wanget al.2022)and LOPEX93 data (Hosgoodet al.1995). Finally,the retrieved chlorophyll was divided into Chlaand Chlbaccording to the ratio of Chlato Chlbfrom the field data in the different treatments. The details of LUT-based PROSAIL inversion referred to Chakhvashviliet al.(2022).Since the spectral data simulated by the PROSAIL model were different from the spectral band of the image collected by UAV,the simulation data were converted to the target data through the spectral response function(Lianget al.2015).
Table 4 PROSAIL variables used in the construction of individual LUT,leaf area index,canopy chlorophyll content and canopy carotenoid content ranges
Evaluation indicatorsThe accuracy of the models was assessed using three commonly used statistical indicators: coefficient of determination (R2),root mean squared error (RMSE) and mean absolute error (MAE)(Liaoet al.2023):
whereis the measured value;is the average of the measured values;Yi is the predicted value;is the average of the simulated values;nis the number of measured values,andiis the sample ordinal. TheR2ranges from 0 (most unstable) to 1 (most stable). The RMSE characterizes the degree of deviation between the predicted and measured values. When theR2value is close to 1 and the values of RMSE and MAE are closer to 0,the estimated values are more consistent with the measured values. The closer the value is to 0,the higher prediction accuracy and the stronger the predictive ability of the model.
For statistical analysis,the R programming language(Version 4.1.3) was used. The correlation coefficient matrix of the relationships among various traits was drawn using the R package (corrplot),and the Taylor diagram function in the R package (openair) was used to draw the Taylor diagram. The Taylor diagram is an effective way to evaluate model performance,which was invented by Taylor (2001). Taylor diagrams include three statistics,top left (standard deviation),top right (correlation),and bottom(centered RMS error). The variability is represented by the standard deviation of the observed and modeled values. The Taylor diagrams show the variability of observations (given by the standard deviation) and aremarked “observed” on thex-axis. The magnitude of the variability is measured in radial distance from the origin of the plot. To aid interpretation,the radial dashed line is shown from the “observed” point. The next statistic to consider is the correlation coefficient,R,as shown in the figure. This is shown on the arc and points that lie closest to thex-axis have the highest correlation. The grey lines show this specific correlation coefficient. Finally,the right plot in the figure highlights the centered root mean square error (CRMSE). It is centered because the mean values of the data (observations and predictions) were subtracted. The concentric dashed lines emanating from the ‘observed’ points show the values of the CRMSE,so points furthest from the ‘observed’ value are the worst performing models because they have the highest RMS errors. Taken as a whole,the best model is the one with reasonably similar variability compared with the observations,the highest correlation and the lowest CRMSE. The Originpro 2021 (OriginLab Corporation,USA) was used to create the other figures. The inversion modeling was based on MATLAB 2021a (MathWorks,Natic,US).
As shown in Fig.3,LAI,photosynthetic pigments and LNC showed the same trends over the growth period,following unimodal curves;that is,they first gradually increased and then gradually decreased. The wheat canopy LAI over the entire period ranged from 1 to 7,and the ranges of variations of Chla,Chlb,Car,Chl,and LNC were 0-5,0-1.2,0-0.8,0-6,0-45 mg g-1,respectively.At the same growth stage,the differences of each index in the different treatments were also large,which showed that the treatments had a great impact on the indicators involved in this study.
As shown in Fig.4,the canopy photosynthetic pigments and canopy nitrogen content also followed unimodal curves,and their trends of change and changes were roughly the same. The maximum values were obtained at the heading-anthesis stage. The size of the violin diagram shows that different treatments had great effects on canopy photosynthetic pigments and canopy nitrogen content. The variation ranges of CCLCa,CCLCb,CCC,CNC were 0-30,0-8,0-6,and 0-250 mg g-1,respectively.
A correlation coefficient matrix (P<0.01) was used to analyze the relationships between each spectral feature,each texture feature and the winter wheat field leaf data(LAI,Chla,Chlb,Chl,Car,LNC,CCLCa,CCLCb,CCLC,CCC and CNC) (Fig.5). There were strong positive correlations between the leaf photosynthetic pigments(Chla,Chlb,Car and Chl) and leaf nitrogen content(r≥0.80). Stronger positive correlations existed between the canopy photosynthetic pigments (CCLCa,CCLCband CCC) and canopy nitrogen content (r≥0.90). There was no strong correlation between LAI and the feature variables (r=-0.57-0.58). There was also no strong relationship between LAI and photosynthetic pigments or LNC. The spectral indices very sensitive to photosynthetic pigments and leaf nitrogen content were GNDVI (r≥0.74),GCI(r≥0.84) and gray scale (r≤0.65). The NDVI achieved a slightly higherR2for carotenoid content than either Chla,Chlbor total Chl content. Among the eight texture features,COR was most sensitive to the field data. The texture features corresponding to some vegetation indices(NIR,DVI,MSR,GNDVI,EVI,TVI,SRCI and MSAVI) also had strong correlations with the field data.
Due to the correlations between most vegetation indexes,feature variables and LAI and canopy photosynthetic pigments (CCLCa,CCLCb,and CCC),the optimal variables were screened from the full subset. The models for estimating LAI,CCLCa,CCLCband CCC of winter wheat by including spectral features or spectral features combined with texture features were established based on six machine learning models. Compared with the inversion modeling using only spectral features,the inversion with spectral features combined with texture features showed improvements in the estimation accuracy to varying degrees (Table 5). All inversion models performed better in estimating CPP than LAI. Models for inverting the canopy chlorophyll content of winter wheat using spectral features combined with texture features had high accuracy and good stability,so it is feasible and effective to invert the CPP in winter wheat using this method. TheR2in estimating CCLCaranged from 0.77 to 0.96,theR2in estimating CCLCbranged from 0.73 to 0.94,and theR2in estimating CCC ranged from 0.68 to 0.94(RMSE=0.17-0.48 mg g-1,MAE=0.13-0.36 mg g-1). The accuracy in predicting canopy chlorophyll content was better for the BPNN and RBFNN models,withR2≥0.85.Among the six machine learning models,the BPNN and RBFNN models had better accuracy in estimating LAI and CPP,while MSR had the worst accuracy,and the SVR,GBDT,and GPR models had better accuracy in estimating some indicators. The best inversion results of LAI and CCLCa,CCLCband CCC were obtained by the RBFNN_SFs+TFs,BPNN_SFs+TFs,RBFNN_SFs+TFs and BPNN_SFs+TFs models,respectively.
Table 5 Inversion results of winter wheat leaf area index and canopy photosynthetic pigments
The Taylor diagram plots in Fig.6 are divided into three components to aid interpretation. The first part is the concentric circle with the origin as the center,and the larger the radius,the larger the standard deviation. The second part is the closer point to thex-axis,which indicates a greater correlation coefficient,The last part is the larger the radius of the golden dotted concentric circle with the variance of the observed value as the center,the larger the RMS errors. These plots show that the variability of all models was less than the measured value because none of them exceeded the black dashed line. The closer the model is to the dashed line,the closer the observed variability. The performance of each model in estimating LAI varied,but the performance of each model was relatively good for estimating photosynthetic pigments.Overall,the best estimation models for LAI and CPP were the BPNN_SFs+TFs and RBFNN_SFs+TFs models,because they had fairly similar variability,the highest correlations and the smallest RMS errors compared to the observations. However,the GPR_SFs+TFs model also had good performance in the estimations of some indicators,and it was easy to conclude that the MSR_SFs model was the worst.
The data in Table 6 show that the inversion accuracy of winter wheat CNC was improved when using spectral features combined with texture features compared with the accuracy using only spectral features. Compared with the accuracy of canopy photosynthetic pigments,the accuracy of winter wheat CNC was worse (R2=0.61-0.80),but its accuracy was better than that of LAI. In the inversion results of winter wheat CNC,the RBFNN_SFs+TFs and BPNN_SFs+TFs models had good accuracy,while the accuracies of the SVR,GBDT and GPR models were similar.
The correlation matrix (Fig.5) shows that there was a strong correlation between CNC and CPP. Therefore,the winter wheat CNC was estimated through the measuredcanopy photosynthetic pigments based on six machine learning methods. The results showed that the estimation accuracy of winter wheat CNC through canopy photosynthetic pigments was very good (R2=0.86-0.96,RMSE=7.92-14.83 mg g-1,MAE=5.84-10.88 mg g-1)based on the six machine learning regression models (Table 7). The estimation accuracy of machine learning models was ranked as:RBFNN>BPNN>GPR>SVR>GBDT>MSR.
Table 7 Estimation results of winter wheat canopy nitrogen content based on canopy photosynthetic pigments
High accuracy of canopy photosynthetic pigment retrieval was obtained through spectral feature inversion or spectral features combined with texture features (Table 7). The canopy photosynthetic pigments could also accurately estimate the CNC of winter wheat. Therefore,the canopy photosynthetic pigments were inverted by spectral features combined with texture features,and the winter wheat CNC was then estimated by the inversion of the canopy photosynthetic pigments. Compared with only using spectral features (R2=0.67-0.86) as the input variables,the accuracy of the estimation model (R2=0.75-0.89) with spectral features combined with texture features as inputs was greatly improved (Fig.6). The estimation results are shown in Fig.7. The winter wheat CNC estimation model under the coupling of LAI and photosynthetic pigments (double-layer model) had good accuracy. The accuracy of the estimation model for winter wheat CNC based on the proposed double-layer model (R2=0.67-0.89,RMSE=13.63-23.71 mg g-1,MAE=10.75-17.59 mg g-1) was better than that of the direct inversion model (R2=0.61-0.80,RMSE=18.01-25.12 mg g-1,MAE=12.96-18.88 mg g-1).Among them,the double-layer model based on RBFNN with the screened SFs+TFs as the input variable had the best accuracy in estimating wheat CNC (R2=0.89,RMSE=13.63 mg g-1,MAE=10.75 mg g-1).
The LAI,photosynthetic pigments and LNC are important indicators for characterizing crop photosynthetic characteristics (Wuet al.2020).They mainly affect the spatial distribution of the crop populations,light interception and light use efficiency,as well as solar energy capture for leaf photosynthesis,thereby affecting photosynthetic productivity,biomass and yield (Penget al.2021;Liaoet al.2022a). In this study,LAI,photosynthetic pigments and LNC all showed unimodal changes with wheat growth and development because the winter wheat leaves grew rapidly at the early growth stage and then LAI decreased sharply at the later growth stage due to leaf abscission and senescence. During the reproductive growth period,the nitrogen in wheat was mainly transported to the grains,so the nitrogen content of the leaves decreased. There was a strong linear correlation between photosynthetic pigments and LNC,and nitrogen is required for the synthesis of photosynthetic pigments,resulting in a decrease in the leaf photosynthetic pigment contents (Siguaet al.2018). Therefore,the canopy photosynthetic pigment contents and CNC also increased at first and then decreased. In addition to chlorophyll,plant leaves also can collect and transmit radiant energy through carotenoids,which makes important contributionsto the daylight response. However,there are still relatively few studies on carotenoid content and LNC in field experiments,and studies on canopy photosynthetic pigments and canopy nitrogen content mainly focus on remote sensing scales (Chouet al.2020). Guet al.(2021)found that the LAI and chlorophyll contents of wheat could be significantly increased under ridge-furrow film mulching with supplemental irrigation,which increased and then decreased over time. Their results were consistent with the results of this study. Other studies have shown that the nitrogen application rate and seeding rates are the main factors affecting chlorophyll content and LNC in wheat (Shiet al.2010;Fenget al.2014),which may be the main reason for the large differences between the different treatments at the same growth stages.
The LAI can be used to quantitatively describe the key parameters of plant photosynthetic efficiency,respiration,transpiration,etc.,so it has been widely used in crop growth monitoring (Yanget al.2017). The photosynthetic pigments of higher plants mainly include chlorophyll and carotenoids,and nitrogen nutrition plays an important role in the regulation of photosynthetic pigment synthesis in crop leaves. The N content of crop leaves is known to be related to the chlorophyll concentration and LAI (Revillet al.2021),but this study found no strong relationship between LAI and photosynthetic pigments or LNC. The leaf is one of the most important organs of crops,and a change in crop color is essentially a change in the photosynthetic pigment concentration in plants. Nitrogen is the main component of photosynthetic pigments,and there was a close relationship between photosynthetic pigment concentration and nitrogen content. Therefore,some studies have used proximal sensing to estimate leaf or canopy chlorophyll content to assess the nitrogen status in crops (Frelset al.2018). The photosynthetic pigments,proteins,water and carbon-containing compounds are the main substances that affect the absorption and reflection of light by leaves. Among them,the photosynthetic pigment content was found to be significantly positively correlated with nitrogen content in leaves (Penget al.2021),which was consistent with the results of this study. Total leaf chlorophyll content includes the contents of Chlaand Chlb,which can provide key information for crop nitrogen nutrition status (Cuiet al.2019). In addition,Gitelsonet al.(2014) also found that chlorophyll was closely related to the nitrogen content and could be used as a close surrogate for the leaf nitrogen concentration. Therefore,it should be feasible to indirectly estimate the nitrogen content of leaves through the content of photosynthetic pigments. There was a weak correlation between LAI and feature variables,and Zhanget al.(2021) found that most texture features were weakly correlated with maize LAI. Non-destructive measurements of canopy biophysical variables are required for many applications from precision agriculture to the global assessment of carbon and nutrient cycling(Atzbergeret al.2010). Therefore,practical information on canopy photosynthetic pigments or CNN are important for understanding plant function and status. This study found a strong correlation between canopy photosynthetic pigments and canopy nitrogen,which was consistent with the results of Clevers and Gitelson (2013). Most studies have used UAV multispectral imagery to analyze crops in precision agriculture,and VIs such as NDVI,GNDVI,or SAVI have been used to monitor crop water and nitrogen status (Kopa?ková-Strnadováet al.2021). Some studies have shown that the NDVI has a linear relationship with LAI (Songet al.2020),and NDVI has an exponential relationship with canopy chlorophyll content (Atzbergeret al.2010).
Kopa?ková-Strnadováet al.(2021) found that NDVI achieved a slightly higherR2for estimating total chlorophyll content than the Chlaand Chlbcontents.This was not consistent with the findings of this study,which may be related to the differences in crop varieties.Previous studies have shown that MTVI2 and EVI could estimate LAI and biomass more accurately than other OSVIs (Jinet al.2015;Songet al.2020). Since MTVI2 includes green,red and NIR bands,a decrease or increase in the reflectivity in these bands affects the total area of the triangle,which is highly correlated with LAI (Jinet al.2015;Songet al.2020). These results suggest that OSVIs could be used to estimate the LAI and biomass of winter wheat (Jinet al.2015). In some studies,a combination of canopy chlorophyll content index (CCCI)and canopy nitrogen index (CNI) was used for predicting the N status of the wheat canopy (Palkaet al.2021). This study found that the spectral indices very sensitive to photosynthetic pigments and leaf nitrogen content were GNDVI (r≥0.74),GCI (r≥0.84) and gray scale (r≤0.65).The results of this study showed that the retrieval accuracy of leaf photosynthetic pigments by machine learning was better than that of leaf nitrogen content.Chouet al.(2020) also found that leaf photosynthetic pigments were more closely related to leaf spectral reflectance data than leaf nitrogen content.
VIs are effective methods for extracting spectral features from remote sensing images. However,spectral information (vegetation index or reflectance) tends to be saturated when the crop canopy coverage is high (Zhanget al.2022). The saturation problem can be overcome to some extent by using spectral features combined with texture features (Maimaitijianget al.2017;Cenet al.2019). Essentially,the spectral information contained in multispectral remote sensing images is limited,but UAV multispectral images also have many texture features related to crop growth information. The texture features of images can describe the spatial correlation of pixels and reflect changes in vegetation structure. Image texture information can improve the LAI and estimation accuracy in wheat monitoring (Zhanget al.2021). Duanet al.(2019) developed a crop LAI monitoring method based on the Fourier spectral texture features of UAV images,which effectively improved the estimation accuracy of the crop LAI. Zhanget al.(2021) found that combining visible light,color index and texture features from fixed-wing UAV images could improve the estimation performance of wheat LAI and leaf dry matter. The results of this study show that models with spectral features combined with texture features not only improved the prediction accuracy of LAI compared with that using only spectral features,but it also improved the accuracy of canopy photosynthetic pigments and CNC. This improvement may be because the wheat canopy spectrum does not decrease with the reduction of leaf area due to the appearance of wheat ears at the later stage of growth,while the textural features improve the accuracy of crop monitoring by smoothing the canopy structure (Zhanget al.2021).Therefore,it is crucial to combine spectral features and texture features to make full use of the image spatial information for LAI estimation. Image texture features were also used to estimate the N-state of crops,which can improve the N-state determination by combining VIs with gray-level co-occurrence matrix (GLCM)-based textures derived from UAV-based multispectral VIs. Combining 2D wavelet textures with RGB-based VIs resulted in higher accuracy for estimating the nitrogen density in winter wheat plants than using VIs or wavelet textures alone(Yanget al.2019),which was consistent with the results of this study. This improvement may occur because the spatial features of the image can provide information that is complementary to the spectral and color space features(Fuet al.2020). Although texture features have improved the estimates of LAI and LNC in crops,few studies have examined the effect of texture features on estimating wheat photosynthetic pigments (Pu and Cheng 2015).This study also showed that combining the spectral features and texture features could improve the estimation accuracy of wheat photosynthetic pigments.
Crop nitrogen retrieval methods based on remote sensing images are mainly divided into empirical models and physical models. The former are mainly achieved by establishing the statistical relationships between vegetation indexes and nitrogen. They have high computational efficiency and can provide accurate estimations at the regional scale. The empirical modeling methods are simple and easy to operate,but these methods lack theoretical support. They mainly depend on the modeled data,vegetation type and characteristics,and the derived statistical relationship is only applicable to a specific area and vegetation type. By simulating the physical relationship between the spectral reflectance of remote sensing images and the physiological parameters of vegetation,a physics-based model is realized,which considers more factors and is suitable for nitrogen inversion in different regions and at different scales.
The physical models have strong theoretical support,but the modeling process is very complicated.Although the PROSAIL RTM used in conjunction with hyperspectral reflectance has been shown to provide an efficient method for estimating crop LAI and LCC,it is unclear how this method could be used to provide sufficient LNC for UAV multispectral images with high estimation accuracy (Zhenget al.2018). The research on LNC estimation using the vegetation index method and crop canopy spectrum at the same time is relatively limited and unsatisfactory,mainly because the crop canopy spectrum is not only affected by leaf biochemical parameters and leaf distribution,but also by crop canopy structure,soil nutrients,atmospheric conditions and other factors (Zhenget al.2018;Fuet al.2020). Whether using spectral features or texture features alone,or spectral features combined with texture features to estimate crop nitrogen,different regression methods can also improve the accuracy of crop nitrogen estimation.These regression methods can be mainly divided into two categories: parametric regression and nonparametric regression. Parametric regression methods,such as the commonly used multiple linear regression are better suited for data that exhibit linear or exponential relationships. Nonparametric regression methods,such as Random Forest (RF),are completely data-driven in the model building process. Therefore,making full use of different image features,such as spectrum,texture or color,through multiple regression methods may be a potential way to overcome the limitations of virtual instruments (Zhanget al.2021).
Currently,a variety of machine learning algorithms,such as RF,decision tree,artificial neural network (ANN),SVR,GPR,etc.,have been used in nitrogen estimation from remote sensing images (Fuet al.2020). The RF algorithm has become popular among the many machine learning algorithms due to its superior multi-dimensional data processing capability. Remote sensing data analysis should always provide a better understanding of underlying mechanisms and the identification of important spatial and spectral features that contribute the most to traits of interest in crops. Kernel-based machine learning regression algorithms have been widely used in crop nitrogen estimation due to their high generalization performance and ability to describe global nonlinear relationships. Previous work has shown that kernelbased machine learning regressions (e.g.,SVR and kernel partial least squares regression) can be interpreted as partial least squares regression rather than black-box techniques (Verrelstet al.2011,2012). The proposed visualization method can explain the driving forces behind the established kernel-based regression model. Some studies have used partial least squares (PLS),BPNN and extreme learning machine (ELM) to establish CNC prediction models based on different combinations of band reflectance and VIs (Camps-Vallset al.2016). Least squares support vector machines (LSSVMs),a variant of SVM,are also suitable for structural risk minimization and they outperform ANN in model generalization when limited training samples are available (Camps-Vallset al.2016;Fuet al.2020).
This study showed that the inversion model of winter wheat nitrogen content based on the BPNN and RBFNN models had the best accuracy in estimating winter wheat LNC by extracting crop information as a whole in the experimental plot. Compared with the traditional inversion method,this model has higher stability and it can provide a reference for the decision-making and management of water and fertilizer in winter wheat fields.Our results are consistent with the results of Leeet al.(2020),and we found that SVR performed better and was more consistent than multiple linear regression,but the difference between GBDT and SVR was only minor.At the same time,remote sensing images combined with a nitrogen dilution curve or nitrogen nutrition index is another method for crop nitrogen nutrition diagnosis(Palkaet al.2021),but the calculations of critical nitrogen concentration and nitrogen nutrition index require crop biomass data. Compared with traditional inversion methods,the proposed double-layer model for estimating winter wheat CNC improved the estimation accuracy of CNC,so it provides a new method for predicting the nitrogen nutrition status of the crop canopy and optimizing nitrogen fertilizer application. Although some errors accumulated in the double-layer model,the accuracy of the double-layer model was higher due to the large differences in the direct inversion accuracies of CNC(R2=0.61-0.80,RMSE=18.01-25.12 mg g-1,MAE=12.96-18.88 mg g-1) and photosynthetic pigments (R2=0.68-0.94,RMSE=0.17-0.48 mg g-1,MAE=0.13-0.36 mg g-1),and the prediction accuracy of winter wheat CNC based on the canopy photosynthetic pigments was excellent (R2=0.86-0.96,RMSE=7.92-14.83 mg g-1,MAE=5.84-10.89 mg g-1). However,Zhenget al.(2018) showed that an N inversion algorithm based on vegetation indexes was susceptible to the band configuration,exponential form and fitting function,and the model portability was poor.The double-layer model proposed in this study has a certain physical mechanism and provides higher inversion accuracy.
Studies have shown that the inversion accuracy of winter wheat LAI based on PROSAIL combined with a multiple regression model was better than that of the individual PROSAIL and multiple regression models (Wanget al.2022). This improvement may occur because the hybrid model has both the simplicity of the empirical model and the universal applicability of the physical model,so it has strong anti-noise resilience and is not susceptible to over-fitting (Wanget al.2022),thereby improving the accuracy of the model. However,the leaf water content can affect the spectral absorption characteristics of the leaf nitrogen content (Schlerfet al.2010),which brings some uncertainty into the estimation of the leaf nitrogen content. Therefore,the influence of the leaf water content on the estimation accuracy of leaf nitrogen content should be reduced by some spectral transformation techniques (such as logarithmic transformation),and such refinements need to be explored in future studies.
(1) Compared with the spectral features of UAV images,the spectral features combined with texture features of UAV images improved the inversion accuracy of winter wheat LAI,canopy photosynthetic pigments and CNC,and the results also showed better stability.
(2) The six machine learning algorithms selected could accurately estimate winter wheat canopy photosynthetic pigments than LAI and CNC,especially the RBFNN and BPNN models. Excellent prediction accuracy of CNC was also achieved by canopy photosynthetic pigments.
(3) The accuracy of the double-layer model in estimating CNC was better than that of the direct inversion model. Among the models,the double-layer model based on RBFNN with the screened SFs+TFs as inputs had the best accuracy in estimating winter wheat CNC.
Acknowledgements
This study was funded by the Key Research and Development Program of Shaanxi Province of China(2022NY-063) and the Chinese Universities Scientific Fund (2452020018).
Declaration of competing interest
The authors declare that they have no conflict of interest.
Journal of Integrative Agriculture2023年7期