Lei Li, Muhammad Adeel Hassan, Shurong Yang, Furong Jing, Mengjiao Yang, Awais Rasheed c,d, Jiankang Wang, Xianchun Xia, Zhonghu He c,*, Yonggui Xiao *
a Institute of Crop Sciences,National Wheat Improvement Centre,Chinese Academy of Agricultural Sciences (CAAS),Beijing 100081,China
b Electronic Information School,Foshan Polytechnic,Foshan 528137,Guangdong,China
c International Maize and Wheat Improvement Centre (CIMMYT) China Office,c/o CAAS,Beijing 100081,China
d Department of Plant Sciences,Quaid-i-Azam University,Islamabad 44000,Pakistan
Keywords: Deep learning; High-throughput phenotyping; QTL mapping; RGB imaging
ABSTRACT Spike number (SN) per unit area is one of the major determinants of grain yield in wheat. Development of high-throughput techniques to count SN in large populations enables rapid and cost-effective selection and facilitates genetic studies. In the present study, we applied a deep-learning algorithm, Faster Region-based Convolutional Neural Networks (Faster R-CNN), to Red-Green-Blue (RGB) images to explore image-based detection of SN and its application to identifying the loci underlying SN. A doubled haploid population of 101 lines derived from the Yangmai 16/Zhongmai 895 cross was grown at two sites for SN phenotyping and genotyped using the high-density wheat 660K SNP array. Analysis of manual spike number (MSN) in the field, image-based spike number (ISN), and verification of spike number (VSN) by Faster R-CNN revealed significant variation (P < 0.001) among genotypes, with high heritabilities ranging from 0.71 to 0.96. The coefficient of determination (R2) between ISN and VSN was 0.83, which was higher than that between ISN and MSN (R2 = 0.51) and that between VSN and MSN (R2 = 0.50). VSN data predicted wheat spike number with an average accuracy of 86.7% when validated against MSN data. Three QTL, Qsnyz.caas-4DS, Qsnyz.caas-7DS, and Qsnyz.caas-7DL, were identified based on MSN, ISN and VSN data, and Qsnyz.caas-7DS was detected in all three data sets. These results indicate that using a Faster R-CNN model for image-based identification of SN per unit area is a precise and rapid phenotyping method that can be used for genetic studies of SN in wheat.
Digital phenotyping of crop traits relies on different types of advanced sensors,image quality,appropriate image analysis and data mining tools[1].The development of new digital phenotyping techniques is getting attention because of the superiority of these techniques over manual methods.Field-based phenotyping of important crop traits using conventional tools remains a bottleneck [2].Image-based phenotyping requires huge image analysis that can be managed through new algorithms to measure a wide array of traits[3-6].Spike number(SN)per unit area is a key indicator used to predict grain yield in wheat(Triticum aestivum L.)[7].Accurate quantification of SN in large populations is important to determine grain yield for selection.
Traditionally, SN is determined by manual counting, a laborious and time-consuming method. Usually, a limited unit area within each plot is selected for spike counting; these counts can be misleading because of heterogeneity across the whole plot. To overcome the limitations of spike counting under field conditions, different image-based approaches for automatic spike counting were recently developed using thermal, red-green-blue (RGB), and multispectral imagery [8-11]. Combinations of imaging and machine learning techniques have been used successfully to count wheat spikes with high precision. A method based on a Gabor filter and a K-means clustering algorithm was reported to detect spikes with 90.7% accuracy, despite limited time efficiency caused by algorithmic complexity [12]. The use of multi-feature optimization and a twin-support-vector-machine to recognize spikes also yielded better results, but this approach still needs improvement [13]. Thermal sensors are considered more efficient at capturing crop features based on color contrast and canopy temperature differences. A high correlation (R2 = 0.80) between automatic spike counts from thermal images and manual observations was previously reported in wheat [11]. However, thermal sensors are expensive, which limits their availability to many research programs [14]. RGB imagery-based phenotyping has relatively lower cost, higher resolution, and easier adaptation to varying light conditions than thermal sensing [10,15]. RGB cameras have been used successfully for phenotyping key traits such as biomass and senescence in wheat and maize (Zea mays L.) [14,16-19]. Wheat SN has been predicted with high accuracy (up to R2 = 0.75) during the mid-grain filling stage, when spikes appear green against a background of light yellowish leaves in RGB images [8].
In recent years,the improvement of deep learning,artificial intelligence,and convolutional neural networks (CNN) for detection of plant traits has increased the importance of image-based phenotyping in crops [20].In the context of spike detection two main approaches have been reported for detection of spikes in a large field area.The first approach is a regression model through TasselNet,and the second is a target detection model using Faster R-CNN [17].Previously,target detection using the Faster R-CNN algorithm has shown more robustness and precision with high repeatability in detecting wheat spikes at mid-grain filling stage[21,22].However,these studies have been done under relatively low densities of spikes (130-180 spikes m-2) [22],whereas spike densities in China are high,i.e.,around 600-700 spikes m-2.Therefore,it is important to know whether the Faster R-CNN algorithm is effective to count spikes under high densities,in which many small spikes overlap with each other in RGB images.Most of the previous studies have focused on improving the accuracy of the model and exploiting the detection algorithm for prediction of spikes.There is no report on the application of Faster R-CNN model-based spike counting data for genetic studies.Quantitative trait loci(QTL)mapping using high-density genetic maps is a powerful approach to identify and understand the genetic basis of important traits in crops [23].The integration of RGB image and Faster R-CNN algorithm based rapid SN data with QTL mapping can accelerate breeding activities in wheat.
The aims of this study were to validate the Faster R-CNN algorithm for counting spikes using RGB imagery in a doubled haploid(DH) population of wheat and to evaluate its accuracy for QTL mapping using the high-density 660K SNP array.Our findings provide a new avenue for fast and cost-effective digital phenotyping of wheat agronomic traits.
A DH population of 101 lines derived from the Yangmai 16/Zhongmai 895 cross was used to detect spike numbers for QTL mapping. A further 207 wheat genotypes, including accessions from the Yellow and Huai Valleys Winter Wheat Zone (YHVWWZ) of China and from five other countries (Turkey, Australia, Japan, Argentina and Italy), were used for RGB imaging to train the Faster R-CNN model. These genotypes have been described previously [24]. The DH population was planted in 2017-2018, and the 207 wheat genotypes were planted during the 2018-2019 cropping season. Field trials were conducted at Xinxiang (35°18′0′′N, 113°52′0′′E) and Luohe (33°34′0′′N, 114°2′0′′E) in Henan province, China, using randomized complete blocks with three replications. The plot area was 3.0 m² (1.0 m × 3.0 m) with 6 rows and 0.2 m spacing between rows. The planting density was maintained at 270 plants m⁻². To ensure the same growth conditions and density during the grain-filling stage, plots were treated with supplementary and tailored seedlings during the early tillering stage. Field management followed local standards to ensure uniform experimental conditions.
Data acquisition for all traits was done by both manual spike counting and ground-based RGB imaging using a digital camera (DJI HG310 Head Camera) at the early grain filling stage. RGB images were taken vertically from 0.7 to 0.9 m above the canopy under natural light. To ensure a consistent sampling area for each RGB image, a 0.5 m × 0.5 m square made of four red plastic tubes was used to obtain the SN per unit area. For each plot, two red squares were randomly placed in non-marginal areas to take two images (Fig. 1a). All images were taken on sunny days between 2:00 and 4:00 PM. The original images were about 3-4 MB in size with a resolution of 4000 × 2250 pixels. However, the original images contained many spikes outside the red squares, which caused errors in the statistical analysis. The images were therefore cropped, compressed and saved at a resolution of 1000 × 1004 pixels (Fig. 1b). Spike recognition was performed using a supervised learning strategy. We therefore used LabelImg (https://github.com/lilei max/github) to annotate spikes in the 1032 images of the training data set, which comprised 207 lines × 1 replicate × 1 shot for Luohe and 207 lines × 2 replicates × 2 shots for Xinxiang, less three images removed from the Xinxiang data set because of lodging, and in the 808 (101 lines × 2 replicates × 2 images × 2 locations) images of the testing data set (Fig. 1c). Ten images were randomly selected from the training set as a validation set for the model. The testing data set was labeled so that the labeled counts could be correlated with the automatic spike counts obtained using Faster R-CNN. Labeled image data were saved in the VOC 2007 format. After training the model using data from the 207 genotypes, three types of spike counting data sets were generated for the DH population: (1) manual spike number counted in the field (MSN) using the 0.25 m² sampling sites, (2) image-based spike number (ISN) counted within the red squares using LabelImg, and (3) verification of spike number (VSN) obtained through automatic counting by integrating RGB images and the Faster R-CNN model.
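As an illustration of how an ISN value can be derived from a labeled image, the following minimal sketch counts the bounding boxes in a VOC-format annotation and converts the count to spikes per square metre for the 0.5 m × 0.5 m square. The class label "spike", the file name and the function names are assumptions made for this example and are not taken from the study.

```python
import xml.etree.ElementTree as ET

# Area of the red-tube sampling square described in the text (0.5 m x 0.5 m).
QUADRAT_AREA_M2 = 0.5 * 0.5


def count_boxes(voc_xml: str, label: str = "spike") -> int:
    """Count <object> entries of one class in a VOC-style annotation string.

    The class name "spike" is a hypothetical label chosen for this sketch.
    """
    root = ET.fromstring(voc_xml)
    return sum(1 for obj in root.iter("object") if obj.findtext("name") == label)


def spikes_per_m2(count: int, area_m2: float = QUADRAT_AREA_M2) -> float:
    """Convert a count inside the sampling square to spikes per square metre."""
    return count / area_m2


# Hypothetical two-box annotation used only to exercise the functions.
EXAMPLE_XML = """<annotation>
  <filename>plot_001.jpg</filename>
  <object><name>spike</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>40</xmax><ymax>90</ymax></bndbox>
  </object>
  <object><name>spike</name>
    <bndbox><xmin>200</xmin><ymin>35</ymin><xmax>236</xmax><ymax>110</ymax></bndbox>
  </object>
</annotation>"""

if __name__ == "__main__":
    n = count_boxes(EXAMPLE_XML)
    print(n, spikes_per_m2(n))
```

Under this conversion, a count of 150 boxes within the square corresponds to 600 spikes m⁻², the lower end of the density range cited for China in the introduction.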
In total, 1032 images from the 207 wheat genotypes were used in the TensorFlow implementation of the Faster R-CNN model, together with an object detection application programming interface (API), to identify spikes. Compared with Fast R-CNN, Faster R-CNN adds a region proposal network (RPN) and is a representative two-phase model [25,26]. The advantage of the RPN lies in weight sharing and translation invariance, which not only ensure accuracy but also support fast end-to-end recognition [27]. The function of the RPN is to generate anchors rapidly through convolution of the feature maps. In this way, spikes can be enclosed by the anchors as much as possible, and each generated anchor was evaluated through full connection convolution. Faster R-CNN can usually be divided into four parts: a backbone, an RPN, region of interest (ROI) pooling, and a fully connected classification and regression model. Selecting a suitable backbone is a prerequisite for obtaining high-quality feature maps. Several backbones, such as VGG [28], GoogLeNet [29], ResNet [30], MobileNet [31], NASNet [32] and Inception v3 [33], can be selected. Taking the amount of computation and the composite indices into account, ResNet50 was chosen as the backbone. It is a deep residual network consisting of many Conv Blocks and Identity Blocks (Fig. S1), which can resolve the degeneration problem of deep neural networks and generate a high-quality feature map [30].
Fig. 1. Preparation of images for spike detection. (a) Training image before modification (4000 × 2250 pixels). (b) Training image after modification (1000 × 1004 pixels). (c) Training image after labelling (1000 × 1004 pixels).
To make the model more suitable for recognizing spikes at both high and low densities, we adjusted some parameters of the original framework. First, we set the feature stride to 8, so that the stride of the first conv2D of ResNet50 was 1, to obtain good results for both sparse and dense spikes. We then adjusted the number and scale of the anchors, so that 12 anchors were generated for every pixel of the feature maps. The anchor scales were 0.25, 0.5, 1.0 and 2.0, and the aspect ratios were 0.5, 1.0 and 2.0. Table S1 shows the size of each anchor.
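The anchor configuration described above can be sketched as follows. Four scales combined with three aspect ratios give the 12 anchor shapes per feature-map position. The base anchor size of 256 pixels and the width/height convention (width = base × scale × √ratio, height = base × scale / √ratio) are assumptions for this illustration; the actual sizes are those listed in Table S1. The feature-map size is approximated as the image size divided by the stride, rounded up, and the exact value depends on padding.

```python
import itertools
import math

SCALES = (0.25, 0.5, 1.0, 2.0)    # anchor scales given in the text
ASPECT_RATIOS = (0.5, 1.0, 2.0)   # aspect ratios given in the text
BASE_SIZE = 256                   # assumed base anchor size in pixels
FEATURE_STRIDE = 8                # feature stride given in the text


def anchor_shapes(scales=SCALES, ratios=ASPECT_RATIOS, base=BASE_SIZE):
    """Return (width, height) for every scale/ratio combination.

    Each anchor keeps an area of (base * scale) ** 2 while its
    width/height ratio equals the aspect ratio.
    """
    shapes = []
    for scale, ratio in itertools.product(scales, ratios):
        width = base * scale * math.sqrt(ratio)
        height = base * scale / math.sqrt(ratio)
        shapes.append((width, height))
    return shapes


def anchors_per_image(width, height, stride=FEATURE_STRIDE):
    """Approximate number of anchors generated for one image."""
    cols = math.ceil(width / stride)
    rows = math.ceil(height / stride)
    return cols * rows * len(anchor_shapes())


if __name__ == "__main__":
    for w, h in anchor_shapes():
        print(f"{w:7.1f} x {h:7.1f}")
    print(anchors_per_image(1000, 1004))
```

For a 1000 × 1004 pixel image this approximation yields on the order of 10⁵ candidate anchors, which is why the filtering steps that follow are needed.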
The concept of intersection-over-union (IOU) was introduced to evaluate the quality of the thousands of anchors by their overlap ratios. We adopted IOU = 0.7 as the threshold for comparing the overlap of each anchor with the labeled ground truth box. Anchors with an IOU greater than 0.7 were treated as foreground, anchors with an IOU < 0.3 were regarded as background, and those with an IOU between 0.3 and 0.7 were ignored. Regression and classification were then used to obtain information on the anchors. In regression, regularization can prevent overfitting and improve generalization ability, and L2 regularization in particular had a better effect. Sigmoid and softmax, both well suited to classification, are used in Faster R-CNN. We applied L2 regularization and softmax through two 1 × 1 convolutions to obtain the adjusted coordinates and the score of each anchor. Where more than one anchor corresponded to one spike, a non-maximum suppression (NMS) strategy was adopted. At the training stage, the principle was to build the model faster and better according to the results of classification and NMS. We selected the top 300 anchors and combined them with the regression results to adjust the anchors. After this step, the RPN loss and the proposals were obtained.
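The anchor-labeling rule and the NMS step described above can be sketched in plain Python. Boxes are written as (xmin, ymin, xmax, ymax). The 0.7 and 0.3 labeling thresholds and the limit of 300 retained anchors come from the text; the NMS overlap threshold of 0.7 is an assumption, as the value used in the study is not stated.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, fg_thresh=0.7, bg_thresh=0.3):
    """Return 1 (foreground), 0 (background) or -1 (ignored) for one anchor."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best > fg_thresh:
        return 1
    if best < bg_thresh:
        return 0
    return -1


def nms(boxes, scores, overlap_thresh=0.7, top_k=300):
    """Greedy non-maximum suppression; returns indices of the kept boxes.

    overlap_thresh is an assumed value; top_k follows the 300 anchors
    mentioned in the text.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= overlap_thresh for j in keep):
            keep.append(i)
            if len(keep) == top_k:
                break
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
    print(nms(boxes, [0.9, 0.8, 0.7]))
```

In this usage example the second box overlaps the first with an IOU of about 0.82 and is suppressed, whereas the distant third box is kept.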
The ROI pooling and fully connected classification and regression sections can be regarded as a further confirmation step similar to the RPN. ROI pooling crops the feature maps according to the proposals and pools them to the same size. This operation not only implements end-to-end training, but also allows weight sharing with ResNet50. After ROI pooling, the model connects a convolution (the last layer of ResNet50) and a flattening step, which makes full classification and regression easier. To classify the 300 proposals, we used 0.3 as the threshold to determine whether a proposal contained a spike, following the same regression and classification strategy used by the RPN. This process is illustrated in Fig. 2.
We trained the model on a Dell PowerEdge C4130 server with two E5-2603 CPUs, 128 GB of memory and two Nvidia Tesla P100 GPUs. Adopting the concept of transfer learning, we initialized training with weights pretrained on VOC2007, a data set that contains 9963 images belonging to 20 object categories.
Evaluating the model requires a loss function. Faster R-CNN contains two regression steps and two classification steps. The regression and classification losses were calculated using the following formulas:
where M is the number of training images, a_i is the residual that measures the difference between the regressed coordinates and the ground truth coordinates for the ith image, b_i is the score for the ith image, and y_i is either 1 or 0 for the ith image (1 for labeled, 0 for background).
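For illustration only, one form of the two losses that is consistent with the symbol definitions above is sketched below. It assumes a mean squared residual for the regression term and a binary cross-entropy for the classification term; both choices are assumptions made for this sketch and are not necessarily the exact expressions used in this study.

```python
import math


def regression_loss(residuals):
    """Mean squared residual over M items (assumed form): (1/M) * sum(a_i ** 2)."""
    m = len(residuals)
    return sum(a * a for a in residuals) / m


def classification_loss(scores, labels, eps=1e-12):
    """Binary cross-entropy over M items (assumed form).

    -(1/M) * sum(y_i * log(b_i) + (1 - y_i) * log(1 - b_i)),
    with scores clipped to (eps, 1 - eps) to avoid log(0).
    """
    m = len(scores)
    total = 0.0
    for b, y in zip(scores, labels):
        b = min(max(b, eps), 1.0 - eps)
        total -= y * math.log(b) + (1 - y) * math.log(1.0 - b)
    return total / m


if __name__ == "__main__":
    print(regression_loss([1.0, -1.0, 2.0]))
    print(classification_loss([0.9, 0.2], [1, 0]))
```

Under these assumed forms, both losses approach zero as the regressed coordinates approach the ground truth and as the scores approach the labels.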
During training, the total loss function consisted of the RPN regression loss, the RPN classification loss, the fully connected regression loss and the fully connected classification loss. The initial weights were updated at each iteration through back-propagation and stochastic gradient descent, so that the value of the total loss function decreased continuously and the predicted number of spikes moved closer to the real value.
We set the initial learning rate to 3 × 10⁻⁴ on the basis of several experiments to avoid overfitting, and scheduled the learning rate to decrease to one tenth of its value at 90,000 iterations. The maximum number of training steps was 120,000 iterations, with a batch size of 6 and a momentum of 0.9. During training, the loss functions began to converge after 45,000 iterations, so we terminated training early. The final number of iterations was 50,000. Fig. 3 shows the loss function curve up to 50,000 iterations during the debugging process.
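The training schedule described above can be written as a simple step function. This sketch assumes that the learning rate is multiplied by 0.1 at 90,000 iterations. The number of passes over the data is a rough estimate that assumes all 1032 training images were used at every pass.

```python
BASE_LR = 3e-4          # initial learning rate from the text
DECAY_STEP = 90_000     # iteration at which the rate was scheduled to drop
DECAY_FACTOR = 0.1      # assumed interpretation of the one-tenth decay
BATCH_SIZE = 6          # batch size from the text
FINAL_STEP = 50_000     # iteration at which training was stopped
N_TRAIN_IMAGES = 1032   # training images from the text


def learning_rate(step, base=BASE_LR, decay_step=DECAY_STEP, factor=DECAY_FACTOR):
    """Step-decay schedule: base rate before decay_step, base * factor afterwards."""
    return base * factor if step >= decay_step else base


def images_seen(steps=FINAL_STEP, batch_size=BATCH_SIZE):
    """Total number of images processed after a given number of iterations."""
    return steps * batch_size


def approx_epochs(steps=FINAL_STEP, batch_size=BATCH_SIZE, n_images=N_TRAIN_IMAGES):
    """Approximate number of passes over the training set."""
    return images_seen(steps, batch_size) / n_images


if __name__ == "__main__":
    print(learning_rate(FINAL_STEP), images_seen(), round(approx_epochs(), 1))
```

Because training stopped at 50,000 iterations, the scheduled decay at 90,000 iterations was never reached under this reading, and the whole run used the initial learning rate.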
The convergence of the loss value is of concern during the training phase,but other indicators also need to be considered.Thus,we introduced the concepts of accuracy(A),precision(P),recall(R)and harmonic mean of precision and recall (F1).
Fig. 3. The total loss curve over 5 × 10⁴ iterations during the training process.
where TP (true positive) and FP (false positive) mean that the model reports a spike in a proposal when a spike is present or absent, respectively; TN (true negative) and FN (false negative) mean that the model reports no spike when a spike is absent or present, respectively. In this study, no image yielded a TN result.
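The four indicators follow the standard definitions: A = (TP + TN)/(TP + TN + FP + FN), P = TP/(TP + FP), R = TP/(TP + FN), and F1 = 2PR/(P + R). A minimal sketch is given below. The counts in the usage example are hypothetical and are not taken from Table S2.

```python
def detection_metrics(tp, fp, fn, tn=0):
    """Accuracy, precision, recall and F1 from detection counts.

    tn defaults to 0, matching the statement that no TN results occurred.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical image: 87 spikes detected, 1 false detection, 12 spikes missed.
    print(detection_metrics(tp=87, fp=1, fn=12))
```

With TN equal to zero, accuracy reduces to TP/(TP + FP + FN), so missed spikes (FN) lower both recall and accuracy.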
The DH population and the parents (Yangmai 16/Zhongmai 895) were genotyped using commercially available Affymetrix wheat 660K SNP array at the Capital Bio Corporation (Beijing,China;https://www.capitalbio.com).Previously,this array was used for genome-wide QTL mapping studies [23].The averaged value of MSN,ISN and VSN from two replicates in each environment and the best linear unbiased prediction (BLUP) were used for QTL mapping.IciMapping 4.0 was used for linkage map construction using the Kosambi mapping approach.The inclusive composite interval mapping-additive (ICIM-ADD) method was used in QTL analysis at a logarithm of the odds (LOD) threshold of 2.5.To assess the accuracy of QTL calling from the VSN data set,we cross-validated our results with the ground truth data.QTL with overlapping confidence intervals identified in different environments or data sets were considered to be identical.Differences between the phenotypic variances explained by QTL from both data sets were detected as validation for the image-based QTL.
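The rule that QTL with overlapping confidence intervals are treated as identical can be sketched as an interval-overlap check. The QTL record layout, the positions in cM and the function names below are hypothetical and serve only to illustrate the rule; they are not output from IciMapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTL:
    chromosome: str
    left_cm: float    # left bound of the confidence interval (hypothetical units: cM)
    right_cm: float   # right bound of the confidence interval
    source: str       # data set or environment in which the QTL was detected


def same_locus(a: QTL, b: QTL) -> bool:
    """True if two QTL lie on the same chromosome with overlapping intervals."""
    return (a.chromosome == b.chromosome
            and a.left_cm <= b.right_cm
            and b.left_cm <= a.right_cm)


def group_loci(qtls):
    """Group QTL so that each one overlaps at least one member of its group."""
    groups = []
    for q in sorted(qtls, key=lambda item: (item.chromosome, item.left_cm)):
        for group in groups:
            if any(same_locus(q, member) for member in group):
                group.append(q)
                break
        else:
            groups.append([q])
    return groups


if __name__ == "__main__":
    # Hypothetical intervals used only to exercise the grouping rule.
    calls = [
        QTL("7D", 10.0, 20.0, "MSN"),
        QTL("7D", 15.0, 25.0, "VSN"),
        QTL("4D", 40.0, 48.0, "MSN"),
    ]
    for group in group_loci(calls):
        print([(q.chromosome, q.source) for q in group])
```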
The phenotypic data for MSN, ISN and VSN from the two environments were subjected to analysis of variance and correlation analysis using Python 3.7 to establish linear regression functions. SAS 9.4 software (SAS Institute, Cary, NC, USA) was used to estimate the variance components, and the broad-sense heritability (h2) was calculated using the following equation.
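A commonly used expression for broad-sense heritability in multi-environment trials is h2 = σ²g / (σ²g + σ²ge/e + σ²ε/(re)), where σ²g, σ²ge and σ²ε are the genotypic, genotype × environment and residual variances, e is the number of environments and r is the number of replicates. The sketch below assumes this form, which may differ from the exact equation used in this study, and the variance components in the usage example are hypothetical.

```python
def broad_sense_heritability(var_g, var_ge, var_err, n_env, n_rep):
    """Entry-mean broad-sense heritability (assumed form).

    h2 = var_g / (var_g + var_ge / n_env + var_err / (n_env * n_rep))
    """
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    return var_g / (var_g + var_ge / n_env + var_err / (n_env * n_rep))


if __name__ == "__main__":
    # Hypothetical variance components for two environments and two replicates.
    print(round(broad_sense_heritability(10.0, 4.0, 8.0, n_env=2, n_rep=2), 3))
```

Under this form, a larger residual variance, as might be expected from image-based counts affected by occlusion, lowers h2 for a fixed genotypic variance.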
DH lines showed significant variation (P < 0.001) in spike number in both environments (Table 1). There was also a significant genotype × environment interaction. The broad-sense heritability values of MSN, ISN, and VSN ranged from 0.90 to 0.93, 0.76 to 0.77, and 0.71 to 0.78, respectively. The coefficients of variation (CV) for the MSN, ISN and VSN data sets were 11.6%, 10.1% and 9.4% in Xinxiang and 11.2%, 13.4% and 12.5% in Luohe, respectively (Fig. S2). Transgressive segregation was observed among the DH lines in the MSN, ISN, and VSN data sets, with some DH lines outperforming the parents for SN, as is typical of a quantitative trait controlled by polygenes. Absolute values of skewness and kurtosis were smaller than 1 for all three data sets across environments (Fig. S2).
Performance of Faster R-CNN can be considered from the perspective of images and lines.The model was centered on different images,so the performance of the model was judged by the loss function curve during training and the evaluation metrics in different training and testing images.Therefore,the phenotypic data was obtained through averaging the MSN,ISN and VSN from different images of the same line.The linear regression functions and the coefficients of determination (R2) between the average MSN,ISN and VSN values were used to evaluate the performance of the model and map QTL.
The total loss of Faster R-CNN was close to 0 and converged after 50,000 iterations (Fig. 3). From the testing image data set, 50 images were randomly selected to evaluate the performance of the trained model. Results for each image are given in Table S2. The F1 scores ranged from 0.87 to 0.97. The accuracy of the model for spike prediction when compared with the MSN ranged from 76.0% to 98.0%, with an average of 86.7%. The proportion of FP was small and could be ignored. The average FN ratio was large (12.5%) and had a greater impact on the accuracy of the model. The low recall rate for some images indicates that some spikes could not be detected in those images. Fig. 4 shows the ability of the model to identify spikes under different amounts of solar illumination; testing images were cropped and enlarged to show the details of the spikes. Under bright conditions, the model identified spikes better. The failure of the model to recognize spikes in some images was due to dark images or to spikes being obscured by leaves, although some erect spikes also went unrecognized in several images (Fig. 4).
We averaged the phenotypic data of the 101 lines in the DH population from two replicates in each environment to establish three linear regression functions (Fig. 5). There was a high linear correlation between VSN and ISN (R2 = 0.83, MSE = 21.40), indicating that the model performs well in recognizing spikes. The smallest error was observed between VSN and ISN. The R2 values between MSN and VSN and between MSN and ISN were similar (R2 ≈ 0.50). This not only reflects the difference between the model and manual counting, but also indicates that the difference in performance was mainly caused by the labeling. The MSE between MSN and ISN was greater than that between MSN and VSN, indicating that the latter was more stable. Overall, the error between MSN and VSN was closely related to the quality of the labeling.
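The regression statistics reported above can be reproduced in outline with an ordinary least-squares fit. The sketch below computes R2 and the mean squared error of the fitted line for two lists of line means. It assumes that the MSE refers to the residuals of the fitted regression, which is not stated explicitly, and the numbers in the usage example are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_mse(x, y):
    """Coefficient of determination and residual MSE of the fitted line."""
    slope, intercept = linear_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot, ss_res / len(y)


if __name__ == "__main__":
    # Hypothetical line means (spikes per sampling square), for illustration only.
    isn = [120.0, 135.0, 150.0, 142.0, 128.0]
    vsn = [115.0, 131.0, 149.0, 137.0, 126.0]
    r2, mse = r2_and_mse(isn, vsn)
    print(round(r2, 3), round(mse, 2))
```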
Three QTL, Qsnyz.caas-4DS, Qsnyz.caas-7DS and Qsnyz.caas-7DL, were detected on chromosomes 4DS, 7DS and 7DL, respectively (Fig. 6; Table 2). Qsnyz.caas-4DS was located between SNPs AX-89421921 and AX-109478820 and was identified only in the MSN data set, across all environments. This QTL, which explained 5.6% to 7.2% of the phenotypic variation, was close to the functional marker of the semi-dwarf gene Rht-D1.
Qsnyz.caas-7DS was identified in all three data sets (MSN, ISN, and VSN) and explained 8.1% to 16.6% of the phenotypic variance in SN, with LOD scores ranging from 3.34 to 4.86. Qsnyz.caas-7DL, located between markers AX-109122450 and AX-108816163, was detected only in the MSN data set and accounted for 7.5% to 8.2% of the phenotypic variance. The additive effects of the three QTL showed that the alleles increasing SN were contributed by Yangmai 16.
Development of deep learning and image-based integrative methods can decrease the workload of spike counting when conducting large germplasm screens under field conditions.In terms of subjectivity,deep learning technology solves the problem of phenotypic errors caused by individual subjectivity differences,so it is more stable in repeated measurements and has a great significance in wheat breeding.The use of CNN for image analysis is becoming more acceptable,while it has proven to be able to extract features of wheat effectively under greenhouse conditions[34].
In the present study, a Faster R-CNN model was trained and used to detect wheat spikes from RGB images under field conditions. The model showed an average accuracy of 86.7%, in agreement with a previous report in which images acquired at three different growth stages were used to train Faster R-CNN and four models were developed with accuracies ranging from 88.0% to 94.0% [22]. Compared with the studies of Hasan et al. [22] and Lu et al. [35], the number of images used to train the model in our study was three to four times greater, which improved the repeatability of our model and alleviated the overfitting problem [36]. The difficulty of correctly identifying very large spikes was raised previously [21], and we tried to resolve this issue by adjusting the proportion and size of the anchors and increasing the number of anchors at each pixel of the feature maps. The problem could also be alleviated by increasing the sample size of very large spikes at the training stage, and changing the model might solve it fully. Our model also had some disadvantages, mainly due to consistent background noise in the images. We adopted horizontal flipping as a data augmentation strategy, but not other strategies such as random cropping, random scaling, color jittering, and noise. This reduced the number of suitable images available to train the model and limited its generalization ability. False negatives had a great influence on the accuracy of the model (Table S2). Previous reports have pointed out that differences in spike size among cultivars, the presence or absence of awns, spikes at different stages, variable lighting conditions, different camera angles during image capture, and the complexity of the background can affect the accuracy of the model [21,22,37]. Our results showed that the model did not perform well on some images. This is because the DH lines have different spike characteristics, especially lines with spikes that grow vertically or that sit in the lower part of the canopy (Fig. 4; Table S2). The SN of wheat comprises tiller spikes and main-stem spikes. When images were taken from above the canopy, some of the main-stem spikes could not be labeled because they were occluded by leaves. In addition, most tiller spikes were not easy to annotate with LabelImg and lacked the corresponding lateral texture features, which made it hard for the model to distinguish them. Moreover, spike prediction accuracy in some images was low because parts of the images were dark and leaves overlapped with spikes owing to the high plant density within the plots. In the field counts, however, every tiller and main-stem spike was counted, resulting in a large gap between MSN and ISN. These deficiencies of the model can be addressed using multi-angle imagery and image fusion in addition to data augmentation.
Table 1 Comparison of different methods for counting spike number.
Table 2 QTL identified for spikes per unit area for DH lines derived from the cross between Yangmai 16 and Zhongmai 895.
Fig. 4. Comparison of the performance of the model under dark (a) and bright (b) solar illumination. The original images are shown on the left, together with the corresponding spike identification results of the model. Red boxes indicate recognized spikes and blue boxes indicate unrecognized spikes.
Fig. 5. Linear regression analysis between Avg.MSN, Avg.ISN and Avg.VSN. Red represents the linear regression of Avg.ISN and Avg.VSN; blue represents the linear regression of Avg.MSN and Avg.ISN; green represents the linear regression of Avg.MSN and Avg.VSN. Avg.MSN, average of the manual spike number in the field from two replicates at one location; Avg.ISN, average of the image-based spike number from two replicates at one location; Avg.VSN, average of the verification of spike number by Faster R-CNN from two replicates at one location.
The improved ISN data set can significantly reduce errors between manual counts and the model prediction.The labeling process was the premise for training the model and served as a bridge between the MSN and VSN data sets.The high determination coefficient (R2=0.83) between the image-based data sets(ISN and VSN) as compared with the manual data (MSN) showed that the model can identify most spikes that are easily labeled.Some spikes that the model failed to identify were mainly affected by external noises.There are three possible reasons for the relatively low R2value between MSN and ISN (R2=0.50).First,the planting density led to serious occlusion,so we were only able to label spikes on the surface of the canopy,especially for some genotypes with more tillers.Second,we used rectangular labeling,instead of the dotted outlines tagging approach such as TasselNet.We added some background characteristics for genotypes with different type of spike growth,especially in areas of dense spikes where only a small part of the heads could be seen in the picture.This resulted in inconsistent sizes of the labeling box and difficulty of labeling.Third,the presence of red plastic tubes changed the position and density of the surrounding spikes.During the cutting process,some spikes outside the tubes were still identified by the model but discarded in the manual identification step.The use of multiple labeling approaches is an effective way to solve this problem,i.e.,through inspecting the labeling and correcting the incorrect labels and unlabeled boxes through repeated training.
Fig. 6. QTL mapping of spikes per unit area for the DH population. XX-Avg.MSN, average of the manual spike number in the field from two replicates in Xinxiang; XX-Avg.ISN, average of the image-based spike number from two replicates in Xinxiang; XX-Avg.VSN, average of the verification of spike number by Faster R-CNN from two replicates in Xinxiang; LH-Avg.MSN, average of the manual spike number in the field from two replicates in Luohe; LH-Avg.ISN, average of the image-based spike number from two replicates in Luohe; LH-Avg.VSN, average of the verification of spike number by Faster R-CNN from two replicates in Luohe; BLUP, best linear unbiased prediction of spike number in Luohe and Xinxiang.
The SN per unit area is one of the three major factors that determine grain yield in wheat,but its heritability was reported moderate because it can be easily influenced by environments [38].Therefore,precise phenotyping of SN is crucial for making breeding decisions and for conducting genetic studies under particular environments.Previous studies have focused on improving model algorithms for spike recognition,but none of them have used this information to check its accuracy for QTL mapping.Using the three data sets,we observed high heritabilities(0.71 to 0.90).Significant variation among the DH lines for all the three data sets indicated that image-based spike count data can be used for QTL identification.A QTL on chromosome 7D (Qsnyz.caas-7DS) was identified across all the data sets (MSN,ISN,and VSN).This indicates that using Faster R-CNN and an image integration approach for spike detection has the potential for QTL analysis.Phenotypic variance explained by the VSN data set based Qsnyz.caas-7DS was 16.6%when validated through MSN and ISN data sets.Previously,a similar QTL for SN on chromosome 7D has been detected in several studies on different sets of wheat genotypes [39-42].Qsnyz.caas-7DL,which was detected only using MSN under two environments,was not reported before and is likely a new QTL.Another QTL on chromosome 4DS was only detected using MSN across two environments.This QTL was close to the dwarf gene Rht-D1b and explained up to 7.2% of the phenotypic variance in SN.A similar QTL on chromosome 4D accounting for 9.2%of the phenotypic variance was reported in a BC2F2population derived from wild relatives of wheat [43].
Our results showed that Faster R-CNN model predictions can be used to map QTL for SN in wheat. However, efforts are still needed to increase the correlation between counts obtained using the model and those obtained manually. Here, we used a 0.5 m × 0.5 m square, which limits the sampling size and reduces the throughput. In the future, we will try to acquire images of a larger area of wheat spikes using an unmanned aerial vehicle carrying a high-resolution lens.
The development of deep learning and image-based integrative methods can reduce the workload of spike counting for large breeding.We have presented details for an automatic system for spike counting using RGB images captured from a ground-based camera and integration with quantitative genetic analysis.This includes a pipeline for employing machine learning techniques for image classification and spike counting.The spike counting system was successfully able to identify wheat spikes with relatively high accuracy.The accuracy and generalization of the Faster R-CNN model can be improved by expanding the size of training data sets and increasing the number of annotations.Some factors such as the impacts of image brightness and spike densities per unit area should be investigated further to increase the accuracy of the Faster R-CNN model.High accuracy in deep learning models can increase the precision in quantitative genetic analysis for future crop breeding.
CRediT authorship contribution statement
Lei Li: Conceptualization, Methodology, Investigation, Writing - original draft, Visualization, Formal analysis, Writing - review & editing. Muhammad Adeel Hassan: Conceptualization, Methodology, Investigation, Writing - original draft, Writing - review & editing. Shurong Yang: Investigation, Formal analysis. Furong Jing: Investigation, Data curation. Mengjiao Yang: Investigation. Awais Rasheed: Writing - review & editing. Jiankang Wang: Writing - review & editing. Xianchun Xia: Resources, Writing - review & editing. Zhonghu He: Supervision, Project administration, Funding acquisition. Yonggui Xiao: Conceptualization, Writing - review & editing, Supervision, Project administration, Funding acquisition.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was funded by the National Natural Science Foundation of China (31671691,3171101265,and 31961143007),the National Key Research and Development Program of China(2016YFD0101804),and the Fundamental Research Funds for the Institute Planning in Chinese Academy of Agricultural Sciences(S2018QY02).
Appendix A.Supplementary data
Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2022.07.007.