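Abstract: China leads the world in fruit production, but a shrinking and aging labor force means that traditional manual picking can no longer meet the demand for fast, efficient harvesting. Developing automated fruit picking equipment with integrated computer vision has become key to resolving the labor shortage. Most fruits are roughly globular and a large share of recognition research targets them, so this paper discusses recognition algorithms for globular fruits such as citrus and peach. Organized by application scenario, it analyzes the differences and improvements in network structure between traditional and deep learning-based globular fruit recognition algorithms, summarizes fruit picking recognition algorithms, and outlines future development trends. Traditional algorithms are effective in simple scenes but are often limited by their hand-designed features in complex environments, whereas deep learning-based algorithms, owing to their efficiency and accuracy, are better suited to automated fruit picking. The review summarizes research progress on globular fruit recognition: deep learning algorithms are effective and adaptable in complex environments and are better suited to deployment on automated picking equipment. It also proposes future research directions, namely improving accuracy and adaptability by optimizing algorithm performance, constructing and augmenting datasets, and incorporating multimodal data.
Keywords: fruit picking; object detection algorithms; deep learning; convolutional neural networks; computer vision
CLC number: S66  Document code: A  Article ID: 1009-9980(2025)02-0412-15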
Research progress in globular fruit picking recognition algorithm based on deep learning
LI Hui1, ZHANG Jun2*, YU Shuochen2, LI Zhixin2
(1School of Information Engineering, Huzhou University, Huzhou 313000, Zhejiang, China; 2Food Science Research Institute, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, Zhejiang, China)
Abstract: China is a global leader in fruit production, and fruit picking still relies mainly on manual labor. Manual pickers can select fruit by size and quality, which reduces losses, and can adapt their techniques and tools to the characteristics and picking requirements of each fruit crop. However, the picking workforce is shrinking and aging, and traditional manual picking can no longer meet the demand for fast and efficient harvesting. Developing automated fruit picking equipment with integrated computer vision has therefore become the key to solving the labor shortage, and it can effectively improve the efficiency and quality of fruit picking. Such equipment usually relies on object detection algorithms to identify targets, and these algorithms can be divided into traditional algorithms and deep learning-based algorithms. Traditional algorithms identify the position and bounding box of a specific object in an image or video, usually through image preprocessing (scaling, grayscale conversion, or normalization), feature extraction (using hand-designed features or features learned automatically by machine learning), classification or regression (determining object class and location), and non-maximum suppression to further refine and filter the detections. When traditional fruit detection algorithms process images from complex environments, their limited representational ability and robustness make them sensitive to illumination, occlusion, and other factors, and recognition accuracy declines. Furthermore, processing speed drops as feature complexity and the amount of computation increase. When the scene changes, fruit types are added, or features are updated, the feature extractor must be redesigned and adjusted, and in special cases the entire system must be retrained. By comparison, deep learning-based fruit detection algorithms can extract and learn rich features from large amounts of data and are more accurate and robust when processing noisy data. When moving to a new environment or adding new categories, they can improve recognition ability and accuracy through transfer learning, data augmentation, multi-model combination, feature fusion, and multimodal data. Deep learning-based fruit detection algorithms fall into two categories: one-stage and two-stage object detection algorithms. A one-stage algorithm achieves end-to-end detection by using a single convolutional neural network to predict object locations and categories directly. It treats object detection as a regression problem and completes localization and classification in one step, which gives fast detection while maintaining high accuracy. During training and deployment, one-stage algorithms can use pruning and quantization to reduce model size, which makes them suitable for mobile devices or embedded systems with limited resources. Two-stage algorithms, also called region-of-interest-based or region-proposal-based algorithms, usually work in two stages: 1) generate a large number of candidate regions by selective search, a region proposal network (RPN), or similar methods; 2) pass the candidate regions through a network containing classifiers and bounding box regressors to identify and precisely locate the objects. Traditional algorithms are effective in simple scenarios but are often limited by their hand-designed features in complex environments, whereas deep learning-based algorithms are better suited to automated fruit picking because of their efficiency and accuracy. This paper summarizes the improvements and applications of traditional and deep learning-based globular fruit detection algorithms, analyzes their advantages and disadvantages in different usage scenarios, and puts forward future development trends. First, taking model optimization and lightweight design as the starting point, efficient network architectures or model compression techniques can be adopted to reduce computational complexity and model size, increase processing speed, and suit mobile automated picking equipment. Second, data processing should be strengthened: preprocessing and synthesizing data improves model generalization and adaptability to changing environments. Third, combining spectral, infrared, laser, and other sensor data can improve recognition accuracy and robustness. Fourth, adaptive adjustment algorithms should be developed that tune strategies and parameters according to real-time feedback, so that models adapt to different picking operations and environmental conditions. Among deep learning-based fruit picking recognition algorithms, YOLO predicts bounding boxes and class probabilities in a single forward pass and achieves near real-time detection, which is important for orchard picking robots that need a fast response. The end-to-end design of YOLO simplifies training and detection, reduces complexity, and enables faster deployment in picking robot systems. In the changeable environment of orchards and groves, YOLO can effectively distinguish fruit from background, which improves detection accuracy. Through continued research by domestic scholars, YOLO has been iteratively optimized, and its ability to detect objects of different sizes and shapes has improved significantly, so that it can cope with differences in fruit maturity, size, and occlusion and perform better in complex environments.
Key words: Fruit picking; Object detection algorithms; Deep learning; Convolutional neural networks; Computer vision
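China ranks first in the world in fruit production [1], and the picking of common globular fruits such as citrus, apples, and peaches relies mainly on manual labor. Manual pickers can select fruit according to ripeness, size, and quality to reduce losses, and can use different techniques and tools according to the characteristics of each fruit tree and the picking requirements. As the population ages and the labor force shrinks, manual labor alone cannot complete picking tasks efficiently, so developing automated fruit picking equipment has become an important way to address this problem.
Automated fruit picking equipment with integrated computer vision uses picking recognition algorithms to identify fruit in the orchard, assess fruit quality, and locate fruit precisely [2]. Compared with traditional manual picking, such equipment offers fast recognition, high accuracy, and low cost, and improves picking efficiency and quality [3]. Globular fruits present a nearly circular contour in two-dimensional images, and this stable geometry helps algorithms reduce the false detections caused by complex shapes [4]. In addition, because globular fruits have regular shapes and consistent color distributions, effective data augmentation is easy to apply, which increases the diversity of the training data, improves generalization, and further improves the versatility and stability of the equipment so that it adapts better to different orchard environments. Automated picking equipment working in natural environments nevertheless faces many problems and requirements, such as illumination changes, occlusion, diversity, complexity, background interference, and real-time constraints, all of which affect recognition accuracy. To overcome these challenges, the robustness and adaptability of object detection algorithms must be improved continuously; that is, under limited computing resources, the number of parameters should be reduced and the running speed increased so that the algorithms can be better applied to automated picking equipment [5].
In recent years, various algorithms and techniques have been proposed for object detection in globular fruit picking. They fall into two main categories: traditional object detection algorithms and deep learning-based object detection algorithms. The latter are further divided into one-stage [6] and two-stage [7] object detection algorithms. This paper introduces the object detection algorithms applied to globular fruit picking and summarizes and analyzes the research results for each category.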
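1 Traditional object detection algorithms
Traditional object detection algorithms [8] identify the position and bounding box of a specific object in an image or video, usually through the following steps: image preprocessing (such as scaling, grayscale conversion, or normalization), feature extraction (using hand-designed features or features learned automatically by machine learning), classification or regression (determining object class and location), and further refinement and filtering of the detections with methods such as non-maximum suppression. Liu et al. [9] used the simple linear iterative clustering algorithm to segment orchard images into superpixel blocks, extracted color features from the superpixels to determine candidate regions, and described fruit shape with histograms of oriented gradients to detect and locate fruit. Xia et al. [10] proposed a fruit recognition technique based on statistical features of the HSV color space: the RGB fruit image is converted to HSV, the hue distribution is approximated as a Laplace distribution and used as the feature description of the fruit, the image is segmented with the mean shift algorithm [11], and the fruit category is determined by computing the Mahalanobis distance of the hue data of the input image and comparing it with preset feature Mahalanobis distances. Zou [12] captured RGB images of citrus with an industrial camera, converted them from RGB to HSV, computed color histograms for the H, S, and V channels, and judged citrus maturity from the peak of the H channel and the corresponding hue value; experiments showed a maturity detection accuracy above 90%. Chen et al. [13] proposed a fruit recognition algorithm based on multiple color and texture features: color features are extracted from the RGB and HSV color spaces with color moments and non-uniform quantization, texture features are extracted with local binary patterns, the color and texture feature vectors are combined and optimized, and a BP neural network is trained as the classifier. The experimental results showed that combining multiple features raised classification accuracy above 90.0%, higher than single-feature algorithms. Xu et al. [14] designed a color-information-based algorithm for recognizing citrus on the tree: images were collected under various weather and lighting conditions, the difference between citrus fruit and branches and leaves in the R-B color index was used to build a color space for citrus recognition, and a dynamic threshold method segmented the fruit from the background, enabling recognition of single or multiple fruits on the tree. The traditional algorithms above rely mainly on single or combined features such as fruit shape and color. They model the background and fuse features to extract fruit information and segment the fruit, thereby achieving effective fruit detection in natural environments.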
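The color-index idea described for citrus (separating fruit from foliage by the R-B difference and a data-dependent threshold) can be illustrated with a short sketch. This is not the implementation from [14]: the source does not specify the dynamic threshold rule, so Otsu's between-class variance criterion is swapped in here as an assumption, and the function names and pixel values are hypothetical.

```python
def rb_index(pixel):
    """R-B color index of one (R, G, B) pixel."""
    r, _g, b = pixel
    return r - b


def otsu_threshold(values):
    """Pick the threshold t that maximizes between-class variance.

    Values <= t form the background class, values > t the fruit class.
    Otsu's rule is an assumed stand-in for the unspecified dynamic threshold.
    """
    candidates = sorted(set(values))
    best_t, best_var = candidates[0], -1.0
    n = len(values)
    for t in candidates[:-1]:  # the largest value cannot split the data
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        w0, w1 = len(low) / n, len(high) / n
        m0, m1 = sum(low) / len(low), sum(high) / len(high)
        between_var = w0 * w1 * (m0 - m1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def segment_fruit(pixels):
    """Return one boolean per pixel: True where the pixel is labeled fruit."""
    values = [rb_index(p) for p in pixels]
    t = otsu_threshold(values)
    return [v > t for v in values]


# Two orange-like pixels followed by two leaf-like pixels (made-up values).
sample = [(230, 140, 30), (220, 130, 40), (60, 120, 50), (50, 110, 60)]
mask = segment_fruit(sample)
```

Because the threshold is recomputed from each image, the split adapts to overall brightness, which is the property a dynamic threshold is meant to provide; it still fails when fruit and leaves have similar R-B values, as with green citrus.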
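Traditional object detection algorithms perform well in specific scenes, but because they depend on hand-designed features they adapt poorly to complex scenes and changing targets. In natural environments their representational ability and robustness are limited, and they are easily affected by illumination changes, occlusion by branches and leaves, overlapping fruit, and similar factors, which lowers recognition accuracy. When the scene changes, fruit types are added, or features are updated, the feature extractor must be redesigned and adjusted, and in special cases the entire system must even be retrained. By contrast, deep learning-based object detection algorithms can extract and learn rich features from large amounts of data and are more accurate and robust. When the scene changes or fruit types are added, they can improve recognition ability and robustness through transfer learning, data augmentation, multi-model combination, feature fusion, and multimodal data.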
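2 Deep learning-based object detection algorithms
Deep learning-based object detection algorithms fall into two categories: one-stage and two-stage. One-stage algorithms use a single convolutional neural network (CNN) to predict object locations and categories directly, achieving end-to-end detection. By treating object detection as a regression problem and performing localization and classification in one step, they achieve fast detection while maintaining high accuracy. During training and deployment, one-stage algorithms can apply techniques such as pruning and quantization to reduce model size, which makes them suitable for resource-limited mobile devices or embedded systems. Two-stage algorithms, also known as region-of-interest-based or region-proposal-based object detection algorithms, usually run in two stages: 1) generate a large number of candidate regions with methods such as selective search or a region proposal network (RPN) [15]; 2) process the candidate regions with a network containing a classifier and a bounding box regressor to recognize and precisely locate the objects.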
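2.1 One-stage object detection algorithms
One-stage object detection algorithms skip the candidate region generation step and produce class probabilities and position coordinates directly from the feature maps before classification and regression. Common one-stage detectors include YOLO and SSD [16], which are often paired with networks such as MobileNet [17], ShuffleNet [18], and Swin Transformer [19] as backbones; models of the YOLO series are the most widely used. Redmon et al. [20] proposed YOLOv1 to address the slow detection speed and repeated feature extraction over regions in two-stage algorithms; it converts object detection into a regression problem and uses global features to predict bounding boxes. By dividing the image into a uniform grid, YOLOv1 avoids repeated computation and increases detection speed, making it suitable for automated picking equipment with strict real-time requirements, although its accuracy is relatively low. YOLOv2, proposed by Redmon et al. [21], improved accuracy by introducing anchor boxes and joint training, but still produced false detections in complex scenes and for small objects. To further improve multi-scale feature extraction, Redmon et al. [22] proposed YOLOv3, which uses the deeper DarkNet-53 backbone combined with a feature pyramid network (FPN) [23] for feature fusion, significantly strengthening detection of objects at different scales. YOLOv4, proposed by Bochkovskiy et al. [24], introduced Mosaic data augmentation, the CSPDarkNet-53 backbone, and the SPP module, improving accuracy under complex backgrounds and occlusion while remaining real-time. Building on the success of YOLOv4, YOLOv5 [25] was released; with adaptive anchor boxes, the Focus module, and a lightweight design, it is better suited to resource-constrained devices. YOLOv6, proposed by Li et al. [26], removed anchor boxes and adopted the EfficientRep and Rep-PAN modules; although detection accuracy improved, its complex structure is not well suited to deployment on resource-constrained mobile picking equipment. Wang et al. [27] proposed YOLOv7, which optimizes feature extraction with BConv, E-ELAN, and MPConv layers, does not rely on anchor boxes, and improves hardware adaptability. Ultralytics released YOLOv8 [28], which adopts the C2f module and a decoupled head, further improving detection speed and accuracy and meeting the needs of multiple scenarios. Throughout the long evolution of the YOLO series, its core optimizations have centered on the balance between speed and accuracy. With real-time performance as its goal, YOLO maintains high detection accuracy while meeting the need for fast response in a wide range of applications (Table 1 lists the improvements and objectives of each YOLO version), which is exactly the key concern of research on automated fruit picking. YOLO has therefore become the most widely used object detection algorithm for fruit recognition. As the technology develops, many YOLO-based optimized models continue to emerge, providing more effective solutions for improving recognition accuracy and real-time performance.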
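To make the "detection as regression on a uniform grid" idea concrete, the sketch below decodes one grid-cell prediction into pixel coordinates. It assumes the YOLOv1-style parameterization in which the box center is an offset within the cell and the width and height are fractions of the whole image; the function name, the 7 x 7 grid, and the 448 x 448 image size are illustrative assumptions rather than details given in this review.

```python
def decode_cell_prediction(row, col, pred, grid_size, img_w, img_h):
    """Convert one cell's regressed (x, y, w, h) into a pixel-space box.

    x, y: center offset inside the cell, each in [0, 1]
    w, h: box size as a fraction of the full image
    Returns (x_min, y_min, x_max, y_max) in pixels.
    """
    x, y, w, h = pred
    center_x = (col + x) / grid_size * img_w
    center_y = (row + y) / grid_size * img_h
    box_w = w * img_w
    box_h = h * img_h
    return (
        center_x - box_w / 2,
        center_y - box_h / 2,
        center_x + box_w / 2,
        center_y + box_h / 2,
    )


# A fruit centered in the middle cell of a 7 x 7 grid on a 448 x 448 image.
box = decode_cell_prediction(3, 3, (0.5, 0.5, 0.25, 0.5), 7, 448, 448)
```

Every cell is decoded by the same arithmetic in one forward pass, which is why no separate proposal stage is needed.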
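In complex natural environments, early-stage citrus fruits are similar in color to the branches and leaves in the background, so traditional algorithms struggle to identify the fruit precisely and often misidentify green foliage as fruit or miss fruit altogether. To address problems of this kind, Song et al. [29] proposed D-YOLOv3, which adopts the densely connected convolutional network (DenseNet) [30] to strengthen feature propagation and enable feature reuse. When building the dataset, Song et al. collected citrus images under different weather conditions and applied Gaussian blur, color balancing, and other processing, which increased dataset diversity and effectively improved the generalization and robustness of the model; experiments showed that D-YOLOv3 reached a precision of 83.0% on early-stage citrus fruit. Lü et al. [31] proposed YOLO-GC, an early-stage citrus fruit detection algorithm based on an improved YOLOv5s. To address low accuracy and large model size, the backbone was replaced with the lightweight GhostNet, and the global attention mechanism (GAM) [32] was embedded to improve the extraction of fruit features. To reduce missed detections caused by occlusion by branches and leaves and by overlap, YOLO-GC uses the GIoU loss function together with the soft non-maximum suppression (Soft-NMS) [33] algorithm to optimize bounding box regression. Experiments showed that, compared with YOLOv5s, YOLO-GC reduced the weight file size by 53.9% to only 6.69 MB and raised the average precision by 1.2% to 97.8%. When detecting green citrus fruit on an edge device, inference took only 108 ms. Tie et al. [34] proposed YOLOv5-SC, a citrus recognition method that improves YOLOv5 with a hybrid attention mechanism. SE [35] attention and CA [36] attention are embedded in the backbone so that the network captures not only direction and position information but also channel information, allowing the model to better extract and locate citrus image features. YOLOv5-SC introduces Varifocal Loss [37] as the loss function, which better balances the losses of positive and negative samples. Experiments showed that YOLOv5-SC reached an average precision of 95.1% and reduced the misdetection of green background as green citrus fruit.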
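The GIoU loss mentioned above extends IoU with a penalty based on the smallest box enclosing both the prediction and the ground truth, so that non-overlapping boxes still receive a useful training signal. A minimal sketch for axis-aligned boxes follows; the function names are hypothetical, boxes are assumed to be (x_min, y_min, x_max, y_max) with positive area, and this is not the code of any cited model.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, true_box):
    """Regression loss: 0 for a perfect match, approaching 2 for distant boxes."""
    return 1.0 - giou(pred_box, true_box)
```

For two unit boxes separated by a one-unit gap, IoU is 0 regardless of distance, whereas GIoU is -1/3 and keeps decreasing as the gap widens, which is what lets the regressor pull a missed box toward an occluded fruit.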
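In natural environments, globular fruits are distributed on the tree in all kinds of poses. When recognizing long-distance, wide-field-of-view scenes such as orchards, leaf occlusion, small fruit targets, and densely distributed fruit can all cause missed or false detections. To solve such problems, Ma et al. [38] proposed a pear recognition algorithm based on an improved YOLOv4, in which the max pooling in the SPP module is replaced with average pooling to retain more target information and reduce missed and false detections. In addition, the convolutions before and after the SPP module, some convolutions in PANet, and the convolutions in the output part are replaced with depthwise separable convolutions, which reduces model size while maintaining the effect of the convolutions. When the trained improved YOLOv4 model was tested on newly acquired image samples, its weight file was 136 MB and its average precision reached 90.2%. Liu et al. [39] improved YOLOv5 and proposed an orange recognition algorithm. Some C3 modules in the backbone are replaced with RepVGG modules to strengthen feature extraction, and the ordinary convolutions in the neck are replaced with ghost shuffle convolution (GConv) [40], which lowers the number of parameters while maintaining accuracy. To locate targets more accurately, efficient channel attention (ECA) [41] is added before the prediction head. Experiments showed that the improved algorithm reached an average precision of 90.1% for orange detection and effectively reduced false and missed detections; the detection results are shown in Fig. 1. The algorithm performed well in unoccluded scenes, under complex lighting, under occlusion by branches and leaves, and in scenes with dense small targets, demonstrating strong robustness and generalization.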
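The size reduction obtained from depthwise separable convolution can be checked with a parameter count. A standard k x k convolution needs k*k*C_in*C_out weights, whereas the depthwise step (k*k*C_in) followed by the 1 x 1 pointwise step (C_in*C_out) needs far fewer. The sketch below ignores bias terms, and the layer sizes are illustrative assumptions, not values from the cited models.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = kernel * kernel * c_in   # one kernel x kernel filter per input channel
    pointwise = c_in * c_out             # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical neck layer: 3 x 3 kernel, 256 input and 512 output channels.
standard = standard_conv_params(3, 256, 512)
separable = depthwise_separable_params(3, 256, 512)
ratio = separable / standard   # equals 1/c_out + 1/kernel**2
```

For this layer the separable version keeps roughly 11% of the weights, which is consistent with the order of the model-size reductions reported for the lightweight variants above.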
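He et al. [42] designed an improved algorithm based on YOLOv5s that effectively raised the accuracy of plum recognition. The downsampling convolutions in the backbone are replaced with FM modules so that feature information for heavily occluded and small targets is not lost during downsampling, and a weighted combination of focal loss and cross-entropy is used as the classification loss to improve recognition of dense targets. In testing, the average precision increased by 2.8% to 97.6%, and the average precision for small targets reached 92.0%. To recognize citrus fruit precisely, Huang et al. [43] proposed a method based on an improved YOLOv5 model. The convolutional block attention module (CBAM) [44] is introduced to improve feature extraction and reduce missed detections of occluded and small targets, and the Alpha-IoU [45] loss replaces the GIoU loss for bounding box regression to improve localization precision. The model reached an average precision of 91.3%, with a detection time of 16.7 ms per citrus image. Yuan et al. [46] proposed YOLOv4-Tiny-Peach, a real-time peach recognition algorithm for orchard environments based on an improved YOLOv4-Tiny. CBAM is introduced into the backbone, a large-scale shallow feature layer is added to the neck to improve small-target accuracy, and a bidirectional feature pyramid network (BiFPN) [47] fuses features at different scales. After training, YOLOv4-Tiny-Peach reached an average precision of 87.9%; compared with YOLOv4-Tiny, the improvement was most pronounced for wide-field-of-view scenes and early-stage peaches. To improve the visual detection capability of all-weather automated picking equipment at night, Xiong et al. [48] proposed Des-YOLOv3, which draws on ResNet [49] and DenseNet to reuse and fuse multi-layer features, strengthening robustness to small targets and to overlapping or occluded fruit at night. The detection results are shown in Fig. 2, and experiments showed that Des-YOLOv3 reached an average precision of 97.7%. Xiong et al. [50] later revisited night-time picking and proposed BI-YOLOv5s, a citrus recognition algorithm that combines an improved YOLOv5s with active lighting: BiFPN provides multi-scale cross connections and weighted feature fusion, CA attention strengthens the extraction of location information, and the C3TR module reduces computation and extracts global information. Experiments showed that, under the selected light-source color, the model recognized citrus at night with an accuracy of 95.3%, enabling all-weather automated picking. Yu et al. [51] replaced some ordinary convolutions in the YOLOv8 series with omni-dimensional dynamic convolution to improve robustness, and replaced the loss function with MPDIoU [52], which resolves the degeneration problem of the original CIoU loss. Experiments showed that the average precision of the improved YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, and YOLOv8x models rose to 88.3%, 89.3%, 89.6%, 89.9%, and 90.1%, respectively. Yue et al. [53] designed Rep-YOLOv8, a new feature fusion network based on YOLOv8 that fuses high-level semantic features with low-level spatial features. An EMA attention module is integrated into YOLOv8 to suppress general feature information such as background and occluding branches and leaves, so that the model focuses more on fruit regions. Finally, the C2f module is replaced with a three-branch DWR module, multi-scale feature fusion improves small-target detection, and the Inner-SIoU [54] loss improves model accuracy. In orchard experiments with apples as the detection target, comparisons were made across different fruit counts and maturity levels. The results showed that the algorithm reached an average precision of 94.0%, and in wide-field-of-view scenes with ripe fruit all metrics improved significantly, providing effective support for fruit recognition tasks.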
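The weighted combination of focal loss and cross-entropy used as a classification loss can be sketched for a single sample as follows. Focal loss scales cross-entropy by (1 - p)^gamma, where p is the probability assigned to the true class, so easy samples contribute little and hard, densely packed targets dominate the gradient. The mixing weight, the value gamma = 2, and the function names are assumptions for illustration; the source does not give the weighting used in [42].

```python
import math


def cross_entropy(p_true):
    """Cross-entropy for one sample; p_true is the probability of the true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Focal loss: cross-entropy down-weighted for well-classified samples."""
    return (1.0 - p_true) ** gamma * cross_entropy(p_true)


def weighted_cls_loss(p_true, weight=0.5, gamma=2.0):
    """Hypothetical mix: weight * focal + (1 - weight) * cross-entropy."""
    return weight * focal_loss(p_true, gamma) + (1.0 - weight) * cross_entropy(p_true)


easy = weighted_cls_loss(0.9)   # confidently correct sample
hard = weighted_cls_loss(0.3)   # poorly classified sample
```

With gamma = 0 the focal term reduces to plain cross-entropy, so the mixing weight only matters when gamma is positive.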
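Object detection algorithms usually have large numbers of parameters and complex network structures, and when such models are deployed on embedded platforms the limited computing resources severely restrict real-time response. To address this, Lü et al. [55] proposed YOLO-LITE, a lightweight citrus recognition method based on an improved YOLOv3, which uses MobileNet-v2 as the backbone to ease deployment on mobile terminals and introduces the GIoU [56] bounding box regression loss. Experiments showed that YOLO-LITE detected citrus at up to 246 frames/s with a weight file of 28 MB. Wang et al. [57] proposed YOLOv4-CA, a lightweight real-time apple detection algorithm based on YOLOv4. It uses the lightweight MobileNet-v3 as the feature extraction network and integrates the SE attention module as a basic block of the neck, increasing the network's sensitivity to feature channels and strengthening feature extraction. To compress the parameters and computation effectively, all ordinary convolutions in the feature fusion network are replaced with depthwise separable convolutions. Experiments showed an average detection precision of 92.2%, a detection speed of 15.11 frames/s on an embedded platform, and a memory footprint of 54.1 MB, maintaining accuracy while meeting the real-time requirements of picking robots. Zeng et al. [58] proposed YOLO-Faster for fast real-time peach detection. Starting from YOLOv5s, the backbone is replaced with FasterNet [59], and partial convolution (PConv) [60] is introduced to reduce redundant computation and memory access, making the model faster and lighter. Between the backbone and the neck, a convolutional attention module and a regular convolution module are added in series to strengthen feature fusion and feature extraction between the two and improve detection accuracy. SIoU [61] is used as the loss function to address mismatch between predicted and ground-truth boxes and to better measure how well they match, improving the quality of the detections. After training on a self-built dataset and deployment on a Jetson Nano embedded device, the algorithm reached an average precision of 88.6% with a weight file of 8.3 MB, an improvement of 1% in average precision over YOLOv5s. Zhao et al. [62] proposed an apple recognition algorithm based on an improved YOLOv3. The residual modules of DarkNet-53 are combined with the CSP module [63] to reduce computation, and an SPP module fuses global and local features to improve recall for small targets. The Soft-NMS algorithm strengthens recognition of overlapping and occluded fruit. The improved algorithm reached an average precision of 96.3%, 3.8% higher than YOLOv3, meeting the accuracy and real-time requirements of automated apple picking. However, accuracy may suffer when lighting is insufficient or when the surface texture of the fruit is indistinct. Yan et al. [64] optimized YOLOv5s to improve the model's representational ability and its handling of spatial information loss, making it better suited to deployment on embedded devices. First, the convolution layer on the bridge branch of the BottleneckCSP module in the backbone is removed, and the input feature map of the BottleneckCSP module is concatenated directly along the depth dimension with the output feature map of the other branch, reducing the number of parameters in the module. Second, SE attention is embedded in the network, which learns a new feature recalibration strategy automatically and effectively improves representational ability. Finally, the output of a lower feature extraction layer with a larger receptive field is fused with the output of the feature extraction layer located before the medium-size object detection layer, compensating for the spatial information lost because of the low resolution of high-level features; the detection results are shown in Fig. 3. Wang [65] aimed at accurate and efficient citrus recognition for picking and built LT-YOLOv7, a lightweight object detection model for picking robots, to address the high storage requirements of YOLOv7 and its poor fit for mobile terminals. RepVGG [66] serves as the backbone, and its multi-scale feature maps are concatenated with the YOLOv7 neck to retain global features and reduce overall computation. Depthwise separable convolution is introduced into the neck to reduce parameters, save memory, and improve accuracy. In addition, ECA is introduced to strengthen feature representation, improve target discrimination, and reduce interference from leaves, branches, and similar factors. At prediction time, the model filters predicted boxes with the soft DIoU_NMS algorithm to improve recognition of overlapping objects. The optimized LT-YOLOv7 reached an average precision of 97.0% on overlapping and occluded citrus fruit; as shown in Fig. 4, the algorithm still detects fruit well even when it is occluded.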
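Soft-NMS helps with overlapping fruit because it lowers the score of a neighboring box instead of deleting it, so a second fruit partly hidden behind the first can survive. The sketch below uses the Gaussian decay form, score * exp(-IoU^2 / sigma); the values of sigma and the score threshold, and the function names, are illustrative assumptions. It shows plain Soft-NMS, not the soft DIoU_NMS variant used in LT-YOLOv7.

```python
import math


def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS. Returns (box, score) pairs in selection order."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the highest-scoring box that is still in play.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))
        # Decay the scores of the others according to their overlap with it.
        decayed = []
        for box, score in remaining:
            new_score = score * math.exp(-(iou(best_box, box) ** 2) / sigma)
            if new_score >= score_thresh:
                decayed.append((box, new_score))
        remaining = decayed
    return kept


# Two heavily overlapping detections plus one isolated detection.
detections = soft_nms(
    [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)],
    [0.9, 0.8, 0.7],
)
```

Hard NMS with a typical IoU threshold of 0.5 would discard the second box outright; here it is kept with a reduced score, and a later confidence threshold decides whether it counts as a separate fruit.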
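Yang et al. [67] proposed MobileOne-YOLOv7 to address the high density and overlap of apple fruit and the parameterization of network models. MobileOne-YOLOv7 uses multi-scale feature extraction and builds a feature pyramid as model input. Multi-scale training improves robustness and avoids excessive computation during multi-scale feature extraction. The last ELAN module of the backbone is replaced with a MobileOne module to strengthen the nonlinearity and representational ability of the model. The SPPCSPC module is changed to an SPPFCSPC module, turning serial channels into parallel channels and speeding up feature fusion while keeping the receptive field unchanged. In addition, a prediction head is added to the neck to improve detection accuracy for objects at different scales. Re-parameterizable branches increase model capacity during training and simplify the structure at inference, lowering memory access cost. Zhang et al. [68] proposed a lightweight apple recognition algorithm based on an improved YOLOv7, replacing some ordinary convolutions in the multi-branch stacked module with PConv to reduce parameters and computation. ECA is added to reduce false and missed detections of occluded targets and keep the accuracy balanced. During training, a learning-rate optimization strategy based on the sparrow search algorithm (SSA) [69] significantly improved detection accuracy. Experiments showed an average precision of 97.0%, with parameters and computation reduced by 22.9% and 27.4%, respectively, making the model suitable for deployment on embedded devices.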
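2.2 Two-stage object detection algorithms
Girshick et al. [70] proposed R-CNN, which uses selective search to extract about 2000 bottom-up, category-independent region proposals. A large CNN computes a fixed-length feature vector for each region, and linear support vector machines (SVM) [71] classify these features to determine whether each region contains a particular object category. SPP-Net, proposed by He et al. [72], improved on R-CNN so that images of arbitrary size and scale can be processed. Spatial pyramid pooling extracts and aggregates features at different scales to produce a fixed-length output. Unlike R-CNN, SPP-Net does not need to run the CNN on every candidate region; it computes a feature map from the whole image once and extracts region-of-interest features directly from it, reducing redundant computation and increasing speed. Building on SPP-Net, Girshick [73] proposed Fast R-CNN. Fast R-CNN takes the whole image and a set of object proposals as input and produces a feature map through several convolutional and max pooling layers, so a single convolutional pass removes the redundancy of repeated convolution. A region of interest (ROI) pooling layer extracts a fixed-length feature vector from the feature map for each proposal; this vector passes through fully connected layers that finally branch into two output layers, one for classification and one for regression.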
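The fixed-length output of spatial pyramid pooling can be demonstrated on a single-channel feature map. The sketch below max-pools the map into 1 x 1, 2 x 2, and 4 x 4 grids of bins and concatenates the results, giving 21 values regardless of input size. The pyramid levels, the floor/ceil bin boundaries, and the function name are assumptions chosen for illustration.

```python
import math


def spp_pool(feature_map, levels=(1, 2, 4)):
    """Spatial pyramid max pooling of a 2D list into a fixed-length vector.

    Each level n splits the map into n x n bins (bins may overlap slightly
    when the size is not divisible by n) and keeps the maximum of each bin.
    Assumes the map is at least as large as the finest grid.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            row_start = math.floor(i * height / n)
            row_end = math.ceil((i + 1) * height / n)
            for j in range(n):
                col_start = math.floor(j * width / n)
                col_end = math.ceil((j + 1) * width / n)
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(row_start, row_end)
                    for c in range(col_start, col_end)
                ))
    return pooled


# Feature maps of different sizes yield vectors of the same length.
map_5x7 = [[r * 7 + c for c in range(7)] for r in range(5)]
map_8x8 = [[r * 8 + c for c in range(8)] for r in range(8)]
vec_a = spp_pool(map_5x7)
vec_b = spp_pool(map_8x8)
```

ROI pooling in Fast R-CNN can be viewed as the single-level special case of this operation applied to each proposal's region of the shared feature map.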
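Faster R-CNN, proposed by Ren et al. [74], abandons the traditional selective search algorithm and introduces the RPN. The RPN slides a window over the feature map to generate anchor boxes of different sizes, labels them positive or negative according to preset thresholds, and outputs candidate bounding boxes with their scores. After ROI pooling, these candidate regions are mapped to fixed-size feature maps and then passed through fully connected layers for object classification and precise localization. Dai et al. [75] proposed the region-based R-FCN, which consists of a shared fully convolutional architecture. R-FCN outputs position-sensitive score maps that encode relative spatial position, and its ROI pooling layer extracts information from these score maps. Cai et al. [76] proposed Cascade R-CNN, which comprises a proposal sub-network and an ROI detection sub-network. Cascade R-CNN decomposes the regression task with cascaded bounding box regression, using a specialized regressor at each step. Cascaded regression acts as a resampling mechanism and addresses the problem that the initial distribution of hypotheses is heavily skewed toward low quality. Among these, Faster R-CNN was the first deep learning-based object detection algorithm to achieve end-to-end detection.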
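The anchor generation step of an RPN can be sketched as follows: at every sliding-window position on the feature map, boxes of several scales and aspect ratios are centered on the corresponding image location. The stride of 16, the three scales, and the three ratios used here are assumed illustrative settings, and the function name is hypothetical; the positive/negative labeling by IoU threshold described above is omitted for brevity.

```python
import math


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Generate anchors as (x1, y1, x2, y2) in image pixels.

    Each anchor has area scale**2 and aspect ratio height / width = ratio.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Center of this feature-map cell in image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


# A tiny 2 x 3 feature map gives 2 * 3 * 9 = 54 anchors.
anchors = make_anchors(2, 3)
```

For roughly round fruit, ratios near 1.0 are the most relevant, so a fruit-specific detector could plausibly use fewer ratios and smaller scales than a general-purpose one.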
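Building on two-stage object detection algorithms, researchers have optimized detection models and proposed efficient and accurate globular fruit detection algorithms for automated picking. Ren et al. [77] used citrus fruit images collected in orchards to compare traditional detection algorithms with Faster R-CNN experimentally. They found that, with enhanced preprocessing and unoccluded fruit, traditional detection algorithms outperformed Faster R-CNN, but when fruit overlapped or was occluded, Faster R-CNN performed better. Wan et al. [78] proposed a multi-class fruit detection framework based on an improved Faster R-CNN. The backbone is VGG-16, comprising 13 convolutional layers, 13 ReLU layers, and 4 pooling layers. To avoid overfitting when training on few samples and to balance model complexity against the amount of data, the algorithm applies weight decay to the high-order parameters through regularization, adds two loss functions to optimize the parameters of the convolutional and pooling layers, and adjusts automatically according to the shooting angle to keep the size and kernel parameters of each convolutional layer reasonable, improving detection accuracy. Liu et al. [79] proposed a fruit recognition and localization algorithm based on an improved R-FCN, composed of an RPN and an FCN. The RPN generates candidate region boxes, the FCN extracts pixel-level features, and the detection results are visualized by deconvolution. To address the low accuracy of recognizing occluded and overlapping citrus fruit, Huang et al. [80] proposed a deep learning-based algorithm for segmenting overlapping citrus and restoring their shape. A Mask R-CNN with a PointRend branch recognizes citrus and performs instance segmentation with refined edges; a coarse shape restoration model is built on an encoder-decoder U-Net, with a local penalty loss and an intersection-over-union shape loss; ROIs are extracted from the coarse restoration result by machine vision methods; and a PConv-based fine shape restoration model finally completes the restoration of the fruit shape. With this method the average precision of citrus fruit recognition reached 93.7% and the segmentation precision reached 96.3%. For yield estimation in apple orchards, Jing et al. [81] proposed a method for recognizing fruit on the side of apple trees based on different feature networks. Side-view images of apple trees were collected in the orchard to test Faster R-CNN combined with different feature extraction networks. VGG-16 and ResNet-50 were each used as the feature extraction network to train two Faster R-CNN models. Although both used the same training parameters, the Faster R-CNN model with VGG-16 outperformed the one with ResNet-50 on all metrics, with a recognition precision of 91.0% and an inference time of 1.4 s per image. Jia et al. [82] captured RGB images of different fruits in natural environments with a camera and added a likelihood function and a regularization function to Faster R-CNN to keep the size and kernel parameters of the convolutional layers within a reasonable range. In recognition tests on different fruits, the overall recognition accuracy reached 99.7%, while the accuracy for oranges was 77.3%. Lu et al. [83] captured images of green citrus fruit in orchards and used a deep-shallow feature fusion strategy to enrich the feature information extracted at each stage of the Mask R-CNN backbone; by introducing composite connection blocks between backbones, the number of channels was reduced and model accuracy improved. The improved Mask R-CNN reached an average precision of 95.4% on green citrus fruit, 1.4% higher than the original model. To aggregate attention features from different levels of a CNN, Min et al. [84] designed the multi-scale attention network (MSANet). MSANet introduces a hybrid attention mechanism that aggregates spatial-channel attention and multiple attention features from different layers into a final unified representation, making that representation more robust and comprehensive.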
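3 Summary and future trends
3.1 Summary
This paper reviews detection algorithms that perform well in fruit picking, focusing on traditional object detection algorithms and deep learning-based object detection algorithms. Traditional algorithms for globular fruit recognition rely on hand-designed features extracted through explicit rules, which makes every step of the algorithm highly interpretable. They need little data, since a small amount of labeled data is enough for tuning and optimization, and because they require no complex deep neural network computation, their computational complexity and demands on computing resources are relatively low. In natural environments, however, traditional fruit detection algorithms often fail to extract useful information accurately from complex scenes with overlapping fruit, changing illumination, and occluding branches. When the target fruit type changes, the algorithm may need to be modified by hand, so generalization is poor.
By contrast, deep learning-based object detection algorithms use multi-layer neural network architectures, and researchers can adjust modules within the network to strengthen feature representation, reduce the number of parameters, and speed up inference. The advantage of deep neural network architectures is that they learn from large-scale data to extract complex, abstract, multi-level features on their own, and hierarchical feature learning progressively captures semantic information from low to high levels, improving detection accuracy and robustness for complex targets and changing environments.
In globular fruit recognition, researchers have introduced lightweight feature extraction networks (such as MobileNet and GhostNet) and various attention modules (such as SE, CA, and CBAM), reducing computational cost and memory footprint and making these algorithms better suited to resource-limited embedded devices. At the same time, these improvements increase the model's sensitivity to key features, so that the network captures the key features of the target more precisely and suppresses general feature information, which raises detection accuracy, improves performance in complex environments, and reduces the false and missed detections caused by complex backgrounds. In addition, researchers have applied a variety of advanced loss functions (such as the SIoU and Alpha-IoU losses) to balance the influence of positive and negative samples and improve bounding box regression accuracy.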
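3.2 Future trends
In recent years, object detection algorithms have been widely applied to globular fruit recognition and have shown excellent performance in complex detection tasks such as occluded and overlapping fruit, yield prediction, fruit classification and grading, and surface defect detection. However, because orchard conditions are complex and changeable, the general applicability of existing globular fruit detection algorithms still needs improvement. Based on the development trends of these algorithms, future research can focus on the following directions:
(1) Model optimization: Deep learning-based object detection algorithms need continual improvement to match fruit recognition requirements. Introducing attention mechanisms, changing the structure of the feature extraction network, optimizing the loss function, and adjusting network depth and width can raise recognition accuracy, increase recognition speed, and lower the rates of missed and false detections. Drawing on large models from other fields, researchers can also examine whether models with excellent performance can be adapted to fruit recognition to further improve accuracy and efficiency.
(2) Dataset construction and augmentation: According to the needs of different picking tasks, images of globular fruits at each growth stage and of different varieties should be collected to build a dataset covering different weather, lighting conditions (front lighting and backlighting), degrees of fruit overlap, and occlusion. Data can be augmented with image processing methods (such as rotation, flipping, cropping, scaling, noise addition, and color transformation) or with image generation based on generative adversarial networks (GAN) [85]. Training on diverse datasets strengthens the generalization and robustness of the model [86].
(3) Multimodal data: To further improve the accuracy and adaptability of globular fruit recognition, future research can incorporate three-dimensional information acquired by lidar and depth cameras [87] to capture fruit shape and position more completely. Multimodal data helps strengthen model robustness, especially when fruit is heavily occluded or lighting is extremely poor.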
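For the geometric augmentations listed under dataset construction, the image and its bounding box annotations must be transformed together. The sketch below shows this for a horizontal flip on an image stored as a nested list; the function name is hypothetical, and boxes are assumed to be (x_min, y_min, x_max, y_max) in pixel coordinates with x measured from the left edge.

```python
def hflip_with_boxes(image, boxes):
    """Flip an image left-right and mirror its bounding boxes to match.

    image: list of rows, each row a list of pixel values
    boxes: list of (x_min, y_min, x_max, y_max) in pixels
    """
    width = len(image[0])
    flipped_image = [list(reversed(row)) for row in image]
    # After mirroring, the old right edge becomes the new left edge.
    flipped_boxes = [
        (width - x_max, y_min, width - x_min, y_max)
        for (x_min, y_min, x_max, y_max) in boxes
    ]
    return flipped_image, flipped_boxes


# A 2 x 3 toy image with one box covering the left column.
toy_image = [[1, 2, 3],
             [4, 5, 6]]
toy_boxes = [(0, 0, 1, 2)]
aug_image, aug_boxes = hflip_with_boxes(toy_image, toy_boxes)
```

Applying the flip twice returns the original image and boxes, which is a convenient sanity check when adding other geometric transforms such as rotation or cropping.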
References:
[1] 劉袁,黃彪,陳昌銀,楊文達(dá),張華東,楊濤.水果采摘機(jī)器人采摘裝置機(jī)研究現(xiàn)狀[J].農(nóng)業(yè)科學(xué),2021,11(2):129-132.
LIU Yuan,HUANG Biao,CHEN Changyin,YANG Wenda,ZHANG Huadong,YANG Tao. Research status of fruit picking robot picking device[J]. Journal of Agricultural Sciences,2021,11(2):129-132.
[2] 戴軍. 機(jī)器視覺技術(shù)在瓜菜檢測應(yīng)用中的研究進(jìn)展[J]. 中國瓜菜,2023,36(11):1-9.
DAI Jun. Research progress of machine vision technology in the detection of cucurbits and vegetables[J]. China Cucurbits and Vegetables,2023,36(11):1-9.
[3] 吳劍橋,范圣哲,貢亮,苑進(jìn),周強(qiáng),劉成良. 果蔬采摘機(jī)器手系統(tǒng)設(shè)計(jì)與控制技術(shù)研究現(xiàn)狀和發(fā)展趨勢[J]. 智慧農(nóng)業(yè)(中英文),2020,2(4):17-40.
WU Jianqiao,FAN Shengzhe,GONG Liang,YUAN Jin,ZHOU Qiang,LIU Chengliang. Research status and development direction of design and control technology of fruit and vegetable picking robot system[J]. Smart Agriculture,2020,2(4):17-40.
[4] 初廣麗,張偉,王延杰,丁南南,劉艷瀅. 基于機(jī)器視覺的水果采摘機(jī)器人目標(biāo)識(shí)別方法[J]. 中國農(nóng)機(jī)化學(xué)報(bào),2018,39(2):83-88.
CHU Guangli,ZHANG Wei,WANG Yanjie,DING Nannan,LIU Yanying. A method of fruit picking robot target identification based on machine vision[J]. Journal of Chinese Agricultural Mechanization,2018,39(2):83-88.
[5] 楊健,楊嘯治,熊串,劉力. 基于改進(jìn)YOLOv5的番茄果實(shí)識(shí)別估產(chǎn)方法[J]. 中國瓜菜,2024,37(6):61-68.
YANG Jian,YANG Xiaozhi,XIONG Chuan,LIU Li. An improved YOLOv5-based method for tomato fruit identification and yield estimation[J]. China Cucurbits and Vegetables,2024,37(6):61-68.
[6] DENG J,XUAN X J,WANG W F,LI Z,YAO H W,WANG Z Q. A review of research on object detection based on deep learning[J]. Journal of Physics:Conference Series,2020,1684(1):012028.
[7] DU L X,ZHANG R Y,WANG X T. Overview of two-stage object detection algorithms[C]//Journal of Physics:Conference Series. IOP Publishing,2020,1544(1):012033.
[8] 蔣煥煜,彭永石,申川,應(yīng)義斌. 基于雙目立體視覺技術(shù)的成熟番茄識(shí)別與定位[J]. 農(nóng)業(yè)工程學(xué)報(bào),2008,24(8):279-283.
JIANG Huanyu,PENG Yongshi,SHEN Chuan,YING Yibin. Recognizing and locating ripe tomatoes based on binocular stereovision technology[J]. Transactions of the Chinese Society of Agricultural Engineering,2008,24(8):279-283.
[9] LIU X Y,ZHAO D A,JIA W K,JI W,SUN Y P. A detection method for apple fruits based on color and shape features[J]. IEEE Access,2019,7:67923-67933.
[10] 夏康利,何強(qiáng). 基于顏色統(tǒng)計(jì)的水果采摘機(jī)器人水果識(shí)別的研究[J]. 南方農(nóng)機(jī),2022,53(24):11-16.
XIA Kangli,HE Qiang. Research on fruit recognition for fruit-picking robots based on color statistics[J]. China Southern Agricultural Machinery,2022,53(24):11-16.
[11] COMANICIU D,MEER P. Mean shift:A robust approach toward feature space analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2002,24(5):603-619.
[12] 鄒偉. 基于機(jī)器視覺技術(shù)的柑橘果實(shí)成熟度分選研究[J]. 農(nóng)業(yè)與技術(shù),2023,43(17):41-44.
ZOU Wei. Research on citrus fruit maturity sorting based on machine vision technology[J]. Agriculture and Technology,2023,43(17):41-44.
[13] 陳雪鑫,卜慶凱. 基于多顏色和局部紋理的水果識(shí)別算法研究[J]. 青島大學(xué)學(xué)報(bào)(工程技術(shù)版),2019,34(3):52-58.
CHEN Xuexin,BU Qingkai. Research on fruit recognition algorithm based on multi-color and local texture[J]. Journal of Qingdao University (Engineering & Technology Edition),2019,34(3):52-58.
[14] 徐惠榮,葉尊忠,應(yīng)義斌. 基于彩色信息的樹上柑橘識(shí)別研究[J]. 農(nóng)業(yè)工程學(xué)報(bào),2005,21(5):98-101.
XU Huirong,YE Zunzhong,YING Yibin. Identification of citrus fruit in a tree canopy using color information[J]. Transactions of the Chinese Society of Agricultural Engineering,2005,21(5):98-101.
[15] FAN Q,ZHUO W,TANG C K,TAI Y W. Few-shot object detection with attention-RPN and multi-relation detector[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle,WA,USA:IEEE,2020:4012-4021.
[16] LIU W,ANGUELOV D,ERHAN D,SZEGEDY C,REED S,FU C Y,BERG A C. SSD:Single shot MultiBox detector[C]//Computer Vision-ECCV 2016:14th European Conference. Amsterdam,The Netherlands:Springer International Publishing,2016:21-37.
[17] HOWARD A G,ZHU M L,CHEN B,KALENICHENKO D,WANG W J,WEYAND T,ANDREETTO M,ADAM H. MobileNets:Efficient convolutional neural networks for mobile vision applications[EB/OL]. 2017:1704.04861. https://arxiv.org/abs/1704.04861v1.
[18] ZHANG X Y,ZHOU X Y,LIN M X,SUN J. ShuffleNet:An extremely efficient convolutional neural network for mobile devices[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA:IEEE,2018:6848-6856.
[19] LIU Z,LIN Y T,CAO Y,HU H,WEI Y X,ZHANG Z,LIN S,GUO B N. Swin transformer:Hierarchical vision transformer using shifted windows[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal,QC,Canada:IEEE,2021:9992-10002.
[20] REDMON J,DIVVALA S,GIRSHICK R,FARHADI A. You only look once:Unified,real-time object detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas,NV,USA:IEEE,2016:779-788.
[21] REDMON J,FARHADI A. YOLO9000:Better,faster,stronger[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu,HI,USA:IEEE,2017:6517-6525.
[22] REDMON J,FARHADI A. YOLOv3:An incremental improvement[EB/OL]. 2018:1804.02767. https://arxiv.org/abs/1804.02767v1.
[23] LIN T Y,DOLLÁR P,GIRSHICK R,HE K M,HARIHARAN B,BELONGIE S. Feature pyramid networks for object detection[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu,HI,USA:IEEE,2017:936-944.
[24] BOCHKOVSKIY A,WANG C Y,LIAO H Y M. YOLOv4:Optimal speed and accuracy of object detection[EB/OL]. 2020:2004.10934. https://arxiv.org/abs/2004.10934v1
[25] JOCHER G. YOLOv5 by Ultralytics (Version 7.0)[CP]. 2020. https://doi.org/10.5281/zenodo.3908559.
[26] LI C Y,LI L L,JIANG H L,WENG K H,GENG Y F,LI L,KE Z D,LI Q Y,CHENG M,NIE W Q,LI Y D,ZHANG B,LIANG Y F,ZHOU L Y,XU X M,CHU X X,WEI X M,WEI X L. YOLOv6:A single-stage object detection framework for industrial applications[EB/OL]. 2022:2209.02976. https://arxiv.org/abs/2209.02976.
[27] WANG C Y,BOCHKOVSKIY A,LIAO H Y M. YOLOv7:Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver,BC,Canada:IEEE,2023:7464-7475.
[28] VARGHESE R,M S. YOLOv8:A novel object detection algorithm with enhanced performance and robustness[C]//2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS). Chennai,India:IEEE,2024:1-6.
[29] SONG Zhongshan,LIU Yue,ZHENG Lu,TIE Jun,WANG Jin. Identification of green citrus based on improved YOLOV3 in natural environment[J]. Journal of Chinese Agricultural Mechanization,2021,42(11):159-165.
[30] HUANG G,LIU Z,VAN DER MAATEN L,WEINBERGER K Q. Densely connected convolutional networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu,HI,USA:IEEE,2017:2261-2269.
[31] LÜ Qiang,LIN Gang,JIANG Jie,WANG Mingzhi,ZHANG Haoyang,YI Shilai. Detecting green citrus fruit in natural scenes using improved YOLOv5s model[J]. Transactions of the Chinese Society of Agricultural Engineering,2024,40(18):147-154.
[32] LIU Y C,SHAO Z R,HOFFMANN N. Global attention mechanism:Retain information to enhance channel-spatial interactions[EB/OL]. 2021:2112.05561. https://arxiv.org/abs/2112.05561v1
[33] BODLA N,SINGH B,CHELLAPPA R,DAVIS L S. Soft-NMS:Improving object detection with one line of code[C]//2017 IEEE International Conference on Computer Vision (ICCV). Venice,Italy:IEEE,2017:5561-5569.
[34] TIE Jun,ZHAO Jie,ZHENG Lu,WU Lifeng,HONG Bowen. Application of improved YOLOv5 model in citrus recognition in natural environment[J]. Journal of Agricultural Science and Technology,2024,26(7):111-120.
[35] HU J,SHEN L,SUN G. Squeeze-and-excitation networks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA:IEEE,2018:7132-7141.
[36] HOU Q B,ZHOU D Q,FENG J S. Coordinate attention for efficient mobile network design[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville,TN,USA:IEEE,2021:13708-13717.
[37] ZHANG H Y,WANG Y,DAYOUB F,SÜNDERHAUF N. VarifocalNet:An IoU-aware dense object detector[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville,TN,USA:IEEE,2021:8510-8519.
[38] MA Shuai,ZHANG Yan,ZHOU Guihong,LIU Bo. Recognition of pear fruit under natural environment using an improved YOLOv4 model[J]. Journal of Hebei Agricultural University,2022,45(3):105-111.
[39] LIU Zhongyi,WEI Dengfeng,LI Meng,ZHOU Shaofa,LU Li,DONG Yuxue. Orange fruit recognition method based on improved YOLOv5[J]. Jiangsu Agricultural Sciences,2023,51(19):173-181.
[40] HAN K,WANG Y H,TIAN Q,GUO J Y,XU C J,XU C. GhostNet:More features from cheap operations[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle,WA,USA:IEEE,2020:1577-1586.
[41] WANG Q L,WU B G,ZHU P F,LI P H,ZUO W M,HU Q H. ECA-Net:Efficient channel attention for deep convolutional neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle,WA,USA:IEEE,2020:11531-11539.
[42] HE Yinghao,TANG Dezhao,NI Ming,CAI Qiqi. Recognizing plums in orchard environment based on improved YOLOv5[J]. Journal of Huazhong Agricultural University,2024,43(5):31-40.
[43] HUANG Tongbin,HUANG Heqing,LI Zhen,LÜ Shilei,XUE Xiuyun,DAI Qiufang,WEN Wei. Citrus fruit recognition method based on the improved model of YOLOv5[J]. Journal of Huazhong Agricultural University,2022,41(4):170-177.
[44] WOO S,PARK J,LEE J Y,KWEON I S. CBAM:Convolutional block attention module[EB/OL]. 2018:1807.06521. https://arxiv.org/abs/1807.06521v2
[45] HE J B,ERFANI S,MA X J,BAILEY J,CHI Y,HUA X S. Alpha-IoU:A family of power intersection over union losses for bounding box regression[EB/OL]. 2021:2110.13675. https://arxiv.org/abs/2110.13675v2
[46] YUAN Yingchun,ZHANG Ao,HE Zhenxue,ZHANG Ruochen,LEI Hao. Peach fruit real-time recognition in complex orchard environment based on improved YOLOv4-tiny[J]. Journal of Chinese Agricultural Mechanization,2024,45(8):254-261.
[47] ZHU L,DENG Z J,HU X W,FU C W,XU X M,QIN J,HENG P A. Bidirectional feature pyramid network with recurrent attention residual modules for shadow detection[C]//Proceedings of the European Conference on Computer Vision (ECCV). Cham:Springer International Publishing,2018:122-136.
[48] XIONG Juntao,ZHENG Zhenhui,LIANG Jia’en,ZHONG Zhuo,LIU Bolin,SUN Baoxia. Citrus detection method in night environment based on improved YOLOv3 network[J]. Transactions of the Chinese Society for Agricultural Machinery,2020,51(4):199-206.
[49] HE K M,ZHANG X Y,REN S Q,SUN J. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas,NV,USA:IEEE,2016:770-778.
[50] XIONG Juntao,HUO Zhaowei,HUANG Qiyin,CHEN Haoran,YANG Zhengang,HUANG Yuhua,SU Yingmiao. Detection method of citrus in nighttime environment combined with active light source and improved YOLOv5s model[J]. Journal of South China Agricultural University,2024,45(1):97-107.
[51] YU Shengxin,WEI Yingying,FANG Hui,LI Min,CHAI Xiujuan,ZENG Zhikang,QIN Zelin. Citrus fruit recognition in natural environment based on improved YOLOv8[J]. Hubei Agricultural Sciences,2024,63(8):23-27.
[52] MA S L,XU Y. MPDIoU:A loss for efficient and accurate bounding box regression[EB/OL]. 2023:2307.07662. https://arxiv.org/abs/2307.07662v1.
[53] YUE Youjun,QI Xiao,ZHAO Hui,WANG Hongjun. Research on apple detection model in complex orchard environments based on improved YOLOv8[J/OL]. Journal of Nanjing University of Information Science & Technology,2024:1-13(2024-07-15). https://doi.org/10.13878/j.cnki.jnuist.20240410002.
[54] ZHANG H,XU C,ZHANG S J. Inner-IoU:More effective intersection over union loss with auxiliary bounding box[EB/OL]. 2023:2311.02877. https://arxiv.org/abs/2311.02877v4.
[55] LÜ Shilei,LU Sihua,LI Zhen,HONG Tiansheng,XUE Yueju,WU Benlei. Orange recognition method using improved YOLOv3-LITE lightweight neural network[J]. Transactions of the Chinese Society of Agricultural Engineering,2019,35(17):205-214.
[56] REZATOFIGHI H,TSOI N,GWAK J,SADEGHIAN A,REID I,SAVARESE S. Generalized intersection over union:A metric and a loss for bounding box regression[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach,CA,USA:IEEE,2019:658-666.
[57] WANG Zhuo,WANG Jian,WANG Xiaoxiong,SHI Jia,BAI Xiaoping,ZHAO Yongjia. Lightweight real-time apple detection method based on improved YOLOv4[J]. Transactions of the Chinese Society for Agricultural Machinery,2022,53(8):294-302.
[58] ZENG Jun,CHEN Renfan,ZOU Tengyue. Rapid maturity detection model for peaches in natural environment based on improved YOLO[J]. China Southern Agricultural Machinery,2023,54(24):24-27.
[59] CHEN J R,KAO S H,HE H,ZHUO W P,WEN S,LEE C H,CHAN S H G. Run,Don’t walk:Chasing higher FLOPS for faster neural networks[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver,BC,Canada:IEEE,2023:12021-12031.
[60] LIU G L,REDA F A,SHIH K J,WANG T C,TAO A,CATANZARO B. Image inpainting for irregular holes using partial convolutions[C]//Proceedings of the European Conference on Computer Vision (ECCV). Cham:Springer International Publishing,2018:85-100.
[61] GEVORGYAN Z. SIoU loss:More powerful learning for bounding box regression[EB/OL]. 2022:2205.12740. https://arxiv.org/abs/2205.12740v1
[62] ZHAO Hui,QIAO Yanjun,WANG Hongjun,YUE Youjun. Apple fruit recognition in complex orchard environment based on improved YOLOv3[J]. Transactions of the Chinese Society of Agricultural Engineering,2021,37(16):127-135.
[63] WANG C Y,LIAO H Y M,WU Y H,CHEN P Y,HSIEH J W,YEH I H. CSPNet:A new backbone that can enhance learning capability of CNN[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Seattle,WA,USA:IEEE,2020:1571-1580.
[64] YAN B,FAN P,LEI X Y,LIU Z J,YANG F Z. A real-time apple targets detection method for picking robot based on improved YOLOv5[J]. Remote Sensing,2021,13(9):1619.
[65] WANG Yihan. Research on detection and localization of citrus in natural environment based on improved YOLOv7[D]. Ya’an:Sichuan Agricultural University,2023.
[66] DING X H,ZHANG X Y,MA N N,HAN J G,DING G G,SUN J. RepVGG:Making VGG-style ConvNets great again[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville,TN,USA:IEEE,2021:13728-13737.
[67] YANG H W,LIU Y Z,WANG S W,QU H X,LI N,WU J,YAN Y F,ZHANG H J,WANG J X,QIU J F. Improved apple fruit target recognition method based on YOLOv7 model[J]. Agriculture,2023,13(7):1278.
[68] ZHANG Zhen,ZHOU Jun,JIANG Zizhen,HAN Hongqi. Lightweight apple recognition method in natural orchard environment based on improved YOLOv7 model[J]. Transactions of the Chinese Society for Agricultural Machinery,2024,55(3):231-242.
[69] XUE J K,SHEN B. A novel swarm intelligence optimization approach:Sparrow search algorithm[J]. Systems Science & Control Engineering,2020,8(1):22-34.
[70] GIRSHICK R,DONAHUE J,DARRELL T,MALIK J. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus,OH,USA:IEEE,2014:580-587.
[71] HEARST M A,DUMAIS S T,OSUNA E,PLATT J,SCHÖLKOPF B. Support vector machines[J]. IEEE Intelligent Systems and Their Applications,1998,13(4):18-28.
[72] HE K M,ZHANG X Y,REN S Q,SUN J. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2015,37(9):1904-1916.
[73] GIRSHICK R. Fast R-CNN[C]//2015 IEEE International Conference on Computer Vision (ICCV). Santiago,Chile:IEEE,2015:1440-1448.
[74] REN S Q,HE K M,GIRSHICK R,SUN J. Faster R-CNN:Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2017,39(6):1137-1149.
[75] DAI J F,LI Y,HE K M,SUN J. R-FCN:Object detection via region-based fully convolutional networks[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems,2016:379-387.
[76] CAI Z W,VASCONCELOS N. Cascade R-CNN:Delving into high quality object detection[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA:IEEE,2018:6154-6162.
[77] REN Hui,ZHU Hongqian. Research on the method of identifying target orange with deep learning[J]. Computer Era,2021(1):57-60.
[78] WAN S H,GOUDOS S. Faster R-CNN for multi-class fruit detection using a robotic vision system[J]. Computer Networks,2020,168:107036.
[79] LIU J,ZHAO M R,GUO X F. A fruit detection algorithm based on R-FCN in natural scene[C]//2020 Chinese Control and Decision Conference (CCDC). Hefei,China:IEEE,2020:487-492.
[80] HUANG Leilei,MIAO Yubin. Overlapping citrus segmentation and morphological restoration based on deep learning[J]. Journal of Agricultural Mechanization Research,2023,45(10):70-75.
[81] JING Weibin,LI Cunjun,JING Xia,ZHAO Ye,CHENG Cheng. Fruit identification with apple tree side view based on deep learning[J]. China Agricultural Informatics,2019,31(5):75-83.
[82] JIA Yanping,SANG Yanli,LI Yueru. Fruit identification using improved Faster R-CNN model[J]. Food & Machinery,2023,39(8):129-135.
[83] LU J Q,YANG R F,YU C R,LIN J H,CHEN W D,WU H W,CHEN X,LAN Y B,WANG W X. Citrus green fruit detection via improved feature network extraction[J]. Frontiers in Plant Science,2022,13:946154.
[84] MIN W Q,WANG Z L,YANG J H,LIU C L,JIANG S Q. Vision-based fruit recognition via multi-scale attention CNN[J]. Computers and Electronics in Agriculture,2023,210:107911.
[85] GOODFELLOW I J,POUGET-ABADIE J,MIRZA M,XU B,WARDE-FARLEY D,OZAIR S,COURVILLE A,BENGIO Y. Generative adversarial networks[J]. Communications of the ACM,2020,63(11):139-144.
[86] GE Minghui,ZHANG Jun,LU Huijuan. Research progress of food packaging defect detection based on machine vision[J]. Food & Machinery,2023,39(9):95-102.
[87] REN Lei,ZHANG Jun,LU Shengmin. Research on machine vision algorithm for automatic sorting of membrane-removed mandarin segments[J]. Acta Agriculturae Zhejiangensis,2015,27(12):2212-2217.
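Received: 2024-06-18 Accepted: 2024-12-06
Funding: National Citrus Industry Technology System (CARS-26-29)
About the author: LI Hui, male, master's degree, mainly engaged in research on detection algorithms for picking robots based on 3D vision. E-mail:3023763876@qq.com
*Author for correspondence. E-mail:hunterzju@163.com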