Improved YOLO V4 model for face recognition of dairy cows by fusing coordinate information
Yang Shuqin1,2,3, Liu Yangqihang1,2,3, Wang Zhen1, Han Yuanyuan1, Wang Yongsheng4, Lan Xianyong5
(1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China; 2. Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, China; 3. Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling 712100, China; 4. College of Veterinary Medicine, Northwest A&F University, Yangling 712100, China; 5. College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China)
To identify individual dairy cows accurately, a cow face recognition model fusing coordinate information was proposed based on the YOLO V4 object detection network. First, a facial image dataset of 71 cows was collected and expanded by data augmentation to improve the generalization of the model. Then, a coordinate attention mechanism and a coordinate convolution module containing coordinate channels were introduced into the feature extraction layers and the detection head of the YOLO V4 network, respectively, to increase the model's sensitivity to target position and improve recognition accuracy. The experimental results show that the improved YOLO V4 model effectively extracts individual facial features of cows, with a mean average precision (mAP) of 93.68% and an average frame rate of 18 frames/s. Although its detection speed is lower than that of the anchor-free CenterNet, its mAP is 10.92 percentage points higher; compared with Faster R-CNN and SSD, its detection speed is higher and its accuracy is 1.51 and 16.32 percentage points higher, respectively; compared with the original YOLO V4, its mAP is 0.89 percentage points higher at essentially the same detection speed. This study provides effective technical support for cow face image recognition in precision dairy farming.
image recognition; animals; dairy cow face; YOLO V4; attention mechanism; coordinate convolution
Precision livestock farming is an important direction in the development of modern smart animal husbandry [1-3]. In precision dairy farming, individual identification of cows is a prerequisite for intelligent, large-scale production [4-7]: it provides basic information for formulating individual feeding plans and analyzing milk yield and health status [8], and is also an important link in management tasks such as milk traceability, epidemic and disease prevention, and insurance claims [2].
Traditional identification of dairy cows relies mainly on manual methods such as ear tags, branding, neck chains, and tattoos [9]. These methods are time-consuming and laborious, and they easily trigger stress responses that injure both cows and handlers. Radio Frequency Identification (RFID) applied to individual cow identification can trace all information about a cow from birth to slaughter by its tag number [10-11], but it still has shortcomings in durability and cost. Other researchers have identified individual cows using physiological features such as muzzle print patterns, the iris, and retinal blood vessels [12-14], but these features are inconvenient to capture in practice, which limits the applicability of such methods [15].
With the spread of cameras on farms, a growing number of studies have used computer vision to identify individual livestock and their behavior [16-19]. For example, Cai et al. [20] used local binary patterns to extract texture features and built a facial description model for cattle face recognition. With the development of deep learning, object detection networks such as Faster R-CNN [21], SSD [22], the YOLO series [23-25], and CenterNet [26] have further advanced cattle image recognition. For example, Zhao et al. [27] identified 30 cows with a convolutional neural network by extracting trunk features; references [28] and [29] used back-pattern texture as the feature and identified individual cows with YOLO V3 and R-CNN models, respectively; and Yao et al. [30] built a dataset of more than 10 000 cow images under different conditions and compared the performance of several deep learning methods for cattle face detection, although that study only detected faces without identifying individuals.
In summary, previous hand-crafted-feature methods for cow face recognition impose strict requirements on data collection, and their accuracy drops under complex conditions when individual facial features change. To achieve contactless, low-cost, and efficient individual identification, this study built a facial image dataset of 71 Holstein cows in various postures and, based on the YOLO V4 model [31], fused coordinate information into the feature maps to increase the model's sensitivity to cow position, thereby improving both the accuracy and the speed of cow face recognition and providing effective technical support for individual cow identification.
The cow images were captured at Keyuan Cloning Co., Ltd. in Yangling District, Xianyang, Shaanxi Province. A Sony FDR-AX100E camera was used to track and film 71 American Holstein cows in real farm scenes on January 20, 2019, October 14, 2020, and January 17, 2021. Each video lasts about 1 min at 30 frames/s with a resolution of 1 440×1 080 pixels. The dataset covers cows at different growth stages (rearing, heifer, dry, and lactation periods), under different illumination, in different postures, and with different degrees of occlusion. The captured cow faces fall into two color types, all black and black-and-white; all-black cows account for a small share of the dataset, while cows with black-and-white faces make up the majority, as shown in Fig. 1.
Frames were extracted from the videos at 15 frames/s, and blurry, heavily occluded, or poorly lit images were removed, yielding 6 486 cow face images in 71 classes, of which 90% were used for training and the remaining 10% for testing. To strengthen the robustness of the recognition model against the tilt, brightness, and resolution variations that occur in real shooting, the original training images were augmented with rotations from -10° to 10°, random brightness adjustment, and cropping, giving a final training set of 16 614 images: 10 940 images from 2019 and 2020 plus 5 674 images taken in 2021, with the remaining 649 images from 2021 used as the test set. The LabelImg annotation tool was used to label the training and test images.
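The augmentation step above can be sketched as sampling one set of transform parameters per image. The rotation range comes from the text; the brightness and crop ranges below are illustrative assumptions, not the authors' exact values:

```python
import random

def sample_augmentation(seed=None):
    """Sample one set of augmentation parameters:
    rotation in [-10, 10] degrees (as stated in the text),
    a multiplicative brightness factor, and a crop ratio
    (both ranges are assumed for illustration)."""
    rng = random.Random(seed)
    angle = rng.uniform(-10.0, 10.0)    # rotation in degrees (from the text)
    brightness = rng.uniform(0.7, 1.3)  # brightness factor (assumed range)
    crop_ratio = rng.uniform(0.8, 1.0)  # fraction of each side kept (assumed range)
    return {"angle": angle, "brightness": brightness, "crop_ratio": crop_ratio}
```

In practice these parameters would drive an image library (e.g. `torchvision.transforms.RandomRotation`, `ColorJitter`, `RandomCrop`) applied to each training image.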
Object detection is the task of accurately and quickly recognizing and localizing objects in an image. The YOLO algorithm merges classification, localization, and detection into a single network: feeding an image into the network yields both the positions and the classes of the targets. It treats detection as a regression problem and offers both good detection speed and good accuracy.
YOLO V4 builds on YOLO V3 by improving the backbone feature extraction network, the feature-fusion neck, and the classification-regression prediction head. During detection, YOLO V4 divides the input image into grids of different sizes; the grid cell containing an object's center coordinates is responsible for detecting that object. For the backbone, YOLO V4 adopts the Cross Stage Partial Network (CSPNet) [32] idea to construct the CSPDarknet53 structure; in the neck, it introduces the Spatial Pyramid Pooling (SPP) module [33] and the Path Aggregation Network (PANet) [34]. The SPP module applies max pooling with kernels of different sizes to the final feature map of CSPDarknet53 to enlarge the receptive field and extract important contextual features. PANet passes feature information from bottom to top, fusing rich features and avoiding information loss.
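The grid-responsibility rule described above can be made concrete with a minimal sketch: given an object's center in pixel coordinates, find the grid cell that must predict it (the function name and clamping detail are illustrative, not from the paper):

```python
def responsible_cell(cx, cy, img_w, img_h, grid_size):
    """Return the (col, row) index of the grid cell whose region contains
    the object center (cx, cy); in YOLO-style detectors this cell is
    responsible for predicting that object."""
    col = int(cx * grid_size / img_w)
    row = int(cy * grid_size / img_h)
    # clamp centers lying exactly on the right/bottom image border
    return min(col, grid_size - 1), min(row, grid_size - 1)
```

For a 416×416 input and a 13×13 grid, a face centered at (208, 208) is assigned to cell (6, 6).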
Cow face recognition is closely tied to the position of the face in the image, and accurate face localization helps improve individual recognition accuracy. In this study, a Coordinate Attention module [35] and a coordinate convolution (CoordConv) module [36] were added to the backbone feature extraction network and the detection head of the YOLO V4 model, respectively, improving cow face detection accuracy from two directions. Fig. 2 shows the structure of the improved YOLO V4 cow face detection network.
2.2.1 Adding coordinate attention to the backbone
In the feature extraction stage, high-resolution feature maps are more sensitive to position. To preserve positional information in the cow images, coordinate attention is therefore added after the CBM (Convolution-Batch Normalization-Mish) module, which consists of a convolutional layer, batch normalization, and the Mish [37] activation function.
The coordinate attention module encodes channel relationships and long-range dependencies with precise positional information, in two steps: coordinate information embedding and coordinate attention generation. First, pooling kernels of different sizes are applied to the input features along the horizontal and vertical directions for each channel; aggregating features along the two spatial directions yields a pair of direction-aware feature maps. Each map captures long-range dependencies of the input along one spatial direction, so positional information is preserved in the generated maps. The two attention maps are then applied to the input feature map by multiplication to emphasize the representations of the regions of interest.
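The information-embedding step above (pooling each channel along each spatial direction) can be sketched in plain Python on a single H×W channel; this is only the first stage of coordinate attention, with the attention-generation convolutions omitted:

```python
def directional_pools(channel):
    """Average-pool one feature channel (H x W nested list) along each
    spatial direction, as in coordinate attention's embedding step:
    pooling along the width gives one value per row (H values),
    pooling along the height gives one value per column (W values)."""
    h, w = len(channel), len(channel[0])
    pool_h = [sum(row) / w for row in channel]
    pool_w = [sum(channel[i][j] for i in range(h)) / h for j in range(w)]
    return pool_h, pool_w
```

The two resulting vectors are what the module later transforms into the pair of direction-aware attention maps multiplied back onto the input.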
2.2.2 Coordinate convolution module
As shown in Fig. 2, in the detection head, a two-channel feature built from coordinate information is stacked with the high-level semantic features to strengthen the head's position sensitivity. The core of coordinate convolution is to add explicit coordinate features before the convolutional layer so that convolution is computed jointly with positional information. Two extra channels are appended to the convolution input to represent the row and column coordinates of the feature map: in the row-coordinate channel, every pixel holds its row index, with the first row filled with 0, the second with 1, the third with 2, and so on; the column-coordinate channel is built analogously, and both are normalized. This preserves the relative positions of features and lets the convolution access feature coordinates.
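The two coordinate channels described above can be built directly (a minimal sketch; normalization to [0, 1] is one common convention for CoordConv):

```python
def coord_channels(h, w):
    """Build the two extra CoordConv input channels: one holds each
    pixel's row index, the other its column index, filled 0, 1, 2, ...
    and then normalized to [0, 1] (0.0 when there is a single row/column)."""
    rows = [[i / (h - 1) if h > 1 else 0.0 for _ in range(w)] for i in range(h)]
    cols = [[j / (w - 1) if w > 1 else 0.0 for j in range(w)] for _ in range(h)]
    return rows, cols
```

These channels are concatenated with the feature map along the channel axis before the convolution in the detection head.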
2.3.1 Training parameter settings
The hardware was a GeForce RTX 2080Ti GPU with 12 GB of memory running Ubuntu 16.04, and the model was built with the PyTorch deep learning framework. Training images were resized to 416×416 pixels, the model was trained for 100 epochs with a batch size of 8, and the weights were saved after every epoch, yielding 100 sets of weights from which model performance was evaluated.
Transfer learning with a pretrained model was used to speed up training. In the first 50 epochs the backbone parameters were frozen and the classifier was trained with a learning rate of 1×10-3; the larger rate updates parameters quickly and accelerates convergence. In the last 50 epochs the backbone was unfrozen and the learning rate was reduced to 1×10-4, fine-tuning the network with smaller steps to approach the optimum.
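The two-stage schedule above reduces to a simple per-epoch rule (the function name is illustrative; epochs are counted from 0):

```python
def stage_settings(epoch, freeze_epochs=50):
    """Return (backbone_frozen, learning_rate) for one epoch of the
    two-stage schedule: backbone frozen with lr = 1e-3 for the first
    50 epochs, then unfrozen with lr = 1e-4 for fine-tuning."""
    if epoch < freeze_epochs:
        return True, 1e-3
    return False, 1e-4
```

In a PyTorch training loop, the frozen flag would toggle `requires_grad` on the backbone parameters and the returned rate would configure the optimizer at the stage boundary.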
2.3.2 Evaluation metrics
Average precision (AP), mean average precision (mAP), and average frame rate (Frames Per Second, FPS) were used as the evaluation metrics, where FPS is the number of images recognized per second. AP is obtained by plotting the precision-recall (P-R) curve, with recall R on the horizontal axis and precision P on the vertical axis, and integrating to find the area under the curve; mAP is the mean of the per-class AP values:

$AP=\int_{0}^{1}P(R)\,\mathrm{d}R$  (1)

$mAP=\frac{1}{n}\sum_{i=1}^{n}AP_{i}$  (2)

where $P(R)$ is precision as a function of recall, $n$ is the number of cow classes, and $AP_i$ is the average precision of the $i$-th cow class.
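Equations (1) and (2) can be computed numerically from sampled P-R points; the sketch below uses simple trapezoidal integration (other conventions, such as VOC-style interpolation, give slightly different values):

```python
def average_precision(recalls, precisions):
    """Approximate AP (Eq. 1) as the area under the precision-recall
    curve via trapezoidal integration; recalls must be in increasing order."""
    ap = 0.0
    for i in range(1, len(recalls)):
        ap += (recalls[i] - recalls[i - 1]) * (precisions[i] + precisions[i - 1]) / 2
    return ap

def mean_average_precision(ap_per_class):
    """mAP (Eq. 2): the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)
```

For example, a detector with perfect precision at all recall levels has AP = 1.0 for that class.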
The performance of the improved model for cow face recognition was verified and analyzed from two aspects: test results of the different models, and recognition of cows with occluded faces.
The SSD, CenterNet, Faster R-CNN, and YOLO V4 models, the CA-YOLO V4 model (YOLO V4 with coordinate attention added to the feature extraction network), and the improved YOLO V4 model were applied to the test set; the results are shown in Table 1.
Table 1 Recognition results of different models for individual cows
The improved YOLO V4 cow face recognition model proposed in this study outperforms the other models. CA-YOLO V4, which adds the coordinate attention module to YOLO V4, raises mAP by 0.7 percentage points over the original model. Building on this, the improved model with both coordinate attention and coordinate convolution further improves detection on the cow dataset: its mAP is 0.19 percentage points higher than CA-YOLO V4 and 0.89 percentage points higher than the original YOLO V4. Compared with CenterNet, its detection speed is lower but its mAP is 10.92 percentage points higher; compared with Faster R-CNN and SSD, both its speed and its accuracy improve, with accuracy gains of 1.51 and 16.32 percentage points, respectively. In other words, adding coordinate attention in the feature extraction network and coordinate information in the detection head effectively improves cow face recognition accuracy, verifying the effectiveness of the proposed method.
Fig. 3 compares some detection results of the different models. SSD, CenterNet, Faster R-CNN, and YOLO V4 all miss or misidentify some cows. For example, because cows 15083 and 17125 have large black facial patches similar to their bodies, the extracted facial features are insufficient, so SSD misses cow 17125 and CenterNet misses cow 15083. Owing to similar facial patch distributions, Faster R-CNN misidentifies cow 14188 as cow 18044, and YOLO V4 misidentifies cow 17060 as cow 18051. The proposed model recognizes all four of these cows correctly.
During filming, as the cows moved inside the pens, the captured face images were often occluded to varying degrees by the farm environment. A total of 120 occluded face images of 10 cow classes were selected for testing; the results are shown in Table 2.
As Table 2 shows, the AP of the proposed model on the 10 occluded classes is higher than or equal to that of the other four models, with an mAP of 92.60%, which is 30.52 and 12.79 percentage points higher than CenterNet and SSD, respectively. Compared with YOLO V4, mAP improves by 10.95 percentage points, with the AP of cows 20093, 20098, and 20121 rising by 40, 9, and 44 percentage points; compared with Faster R-CNN, mAP improves by 7.91 percentage points, with the AP of cows 17107, 20104, and 20121 rising by 26, 14, and 24 percentage points. These results show that fusing coordinate information into the feature extraction network and the detection head helps the improved YOLO V4 model cope with facial occlusion.
Table 2 Recognition accuracy of different models for cows with occluded faces
Note: The number listed for each cow ID is the number of its face-occlusion sample images.
1) Effect of facial texture on recognition
The proposed improved YOLO V4 model reaches 93.68% accuracy for cow face recognition, higher than the CA-YOLO V4, YOLO V4, CenterNet, Faster R-CNN, and SSD models, but some recognition errors remain. The improved model fails to correctly identify some cows with black-and-white faces; one single-cow error case is shown in Fig. 4, where cow 18044 in Fig. 4a is misidentified as cow 15036. As the face image of cow 15036 in Fig. 4b shows, the facial patches of the two cows are extremely similar and lack features that clearly distinguish them, which causes the error.
2) Effect of facial occlusion on recognition
As shown in Fig. 5, cows 17107 and 20121 are both severely occluded, leaving insufficient information for feature extraction; moreover, both cows' faces are essentially black and their facial contours resemble the black parts of their bodies, which further complicates feature extraction and makes accurate recognition difficult for the improved YOLO V4 model.
Cow face recognition is an important prerequisite for precise individual feeding and behavior understanding in smart dairy farming. To achieve fast, accurate, and contactless cow face recognition, this study fused coordinate information into the YOLO V4 object detection model, enhancing recognition performance by increasing the model's sensitivity to cow position. The main conclusions are as follows:
1) An improved YOLO V4 model for individual cow face recognition was proposed by adding a position-embedding attention module to the feature extraction layers and a coordinate convolution module to the detection head. On the test set its mean average precision reaches 93.68%, which is 16.32, 10.92, 0.89, 1.51, and 0.19 percentage points higher than the SSD, CenterNet, YOLO V4, Faster R-CNN, and CA-YOLO V4 models, respectively. Its detection speed is slightly lower than that of the original YOLO V4, but its face recognition accuracy is higher. The results show that the improved YOLO V4 model strengthens position sensitivity in cow face localization and further improves recognition accuracy.
2) The improved YOLO V4 model also recognizes occluded cow faces better than the SSD, CenterNet, YOLO V4, and Faster R-CNN models, reaching a recognition rate of 92.60%, which is 12.79, 30.52, 10.95, and 7.91 percentage points higher, respectively. However, recognition accuracy still needs improvement for faces whose features are obscured by large occlusion areas or dim lighting.
[1] Matthews S G, Miller A L, Clapp J, et al. Early detection of health and welfare compromises through automated detection of behavioural changes in pigs[J]. The Veterinary Journal, 2016, 217: 43-51.
[2] He Dongjian, Liu Dong, Zhao Kaixuan. Review of perceiving animal information and behavior in precision livestock farming[J]. Transactions of the Chinese Society for Agricultural Machinery, 2016, 47(5): 231-244. (in Chinese with English abstract)
[3] Santoni M M, Sensuse D I, Arymurthy A M, et al. Cattle race classification using gray level co-occurrence matrix convolutional neural networks[J]. Procedia Computer Science, 2015, 59: 493-502.
[4] Tsai D M, Huang C Y. A motion and image analysis method for automatic detection of estrus and mating behavior in cattle[J]. Computers and Electronics in Agriculture, 2014, 104: 25-31.
[5] Kumar S, Pandey A, Kondamudi S, et al. Deep learning framework for recognition of cattle using muzzle point image pattern[J]. Measurement, 2018, 116: 1-17.
[6] He Dongjian, Liu Chang, Xiong Hongting. Design and experiment of implantable sensor and real-time detection system for temperature monitoring of cow[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(12): 195-202. (in Chinese with English abstract)
[7] Liu Zhongchao, He Dongjian. Recognition method of cow estrus behavior based on convolutional neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(7): 186-193. (in Chinese with English abstract)
[8] Liu Zhongchao, Zhai Tiansong, He Dongjian. Research status and progress of individual information monitoring of dairy cows in precision breeding[J]. Heilongjiang Animal Science and Veterinary Medicine, 2019(13): 30-33, 38. (in Chinese with English abstract)
[9] Jiang Guobin. Marking methods for individual identification of dairy cows[J]. Feed Review, 2018(5): 86. (in Chinese)
[10] Sun Yukun, Wang Yujie, Huo Pengju, et al. Research progress on methods and application of dairy cow identification[J]. Journal of China Agricultural University, 2019, 24(12): 62-70. (in Chinese with English abstract)
[11] Geng Liwei, Qian Dongping, Zhao Chunhui. Cow identification technology system based on radio frequency[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2009, 25(5): 137-141. (in Chinese with English abstract)
[12] Barry B, Gonzales-Barron U, Butler F, et al. Using muzzle pattern recognition as a biometric approach for cattle identification[J]. Transactions of the ASABE, 2007, 50(3): 1073-1080.
[13] Lu Y, He X, Wen Y, et al. A new cow identification system based on iris analysis and recognition[J]. International Journal of Biometrics, 2014, 6(1): 18-32.
[14] Allen A, Golden B, Taylor M, et al. Evaluation of retinal imaging technology for the biometric identification of bovine animals in Northern Ireland[J]. Livestock Science, 2008, 116(1): 42-52.
[15] Xu Beibei, Wang Wensheng, Guo Leifeng, et al. A review and future prospects on cattle recognition based on non-contact identification[J]. Journal of Agricultural Science and Technology, 2020, 22(7): 79-89. (in Chinese with English abstract)
[16] Yan Hongwen, Liu Zhenyu, Cui Qingliang, et al. Multi-target detection based on feature pyramid attention and deep convolution network for pigs[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(11): 193-202. (in Chinese with English abstract)
[17] Hu Zhiwei, Yang Hua, Lou Tiantian. Instance detection of group breeding pigs using a pyramid network with dual attention feature[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(5): 166-174. (in Chinese with English abstract)
[18] Cai Cheng, Song Xiaoxiao, He Jinrong. Algorithm and realization for cattle face contour extraction based on computer vision[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(11): 171-177. (in Chinese with English abstract)
[19] Song Huaibo, Niu Mantang, Ji Cunhui, et al. Monitoring of multi-target cow ruminant behavior based on video analysis technology[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(18): 211-218. (in Chinese with English abstract)
[20] Cai C, Li J. Cattle face recognition using local binary pattern descriptor[C]//Signal and Information Processing Association Annual Summit and Conference, 2013: 1-4.
[21] Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[22] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot MultiBox detector[C]//European Conference on Computer Vision, 2016: 21-37.
[23] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
[24] Redmon J, Farhadi A. YOLO9000: Better, faster, stronger[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 6517-6525.
[25] Redmon J, Farhadi A. YOLOv3: An incremental improvement[J]. arXiv preprint arXiv: 1804.02767, 2018.
[26] Zhou X, Wang D, Krähenbühl P. Objects as points[J]. arXiv preprint arXiv: 1904.07850, 2019.
[27] Zhao Kaixuan, He Dongjian. Recognition of individual dairy cattle based on convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(5): 181-187. (in Chinese with English abstract)
[28] He Dongjian, Liu Jianmin, Xiong Hongting, et al. Individual identification of dairy cows based on improved YOLO v3[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(4): 250-260. (in Chinese with English abstract)
[29] Andrew W, Greatwood C, Burghardt T. Visual localisation and individual identification of holstein friesian cattle via deep learning[C]//Proceedings of the IEEE International Conference on Computer Vision Workshops, 2017: 2850-2859.
[30] Yao Liyao, Xiong Hao, Zhong Yijian, et al. Comparison of cow face detection algorithms based on deep network model[J]. Journal of Jiangsu University: Natural Science Edition, 2019, 40(2): 197-202. (in Chinese with English abstract)
[31] Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: Optimal speed and accuracy of object detection[J]. arXiv preprint arXiv: 2004.10934, 2020.
[32] Wang C Y, Liao H Y M, Yeh I H, et al. CSPNet: A new backbone that can enhance learning capability of CNN[J]. arXiv preprint arXiv: 1911.11929, 2019.
[33] He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[34] Liu S, Qi L, Qin H, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 8759-8768.
[35] Hou Q, Zhou D, Feng J. Coordinate attention for efficient mobile network design[J]. arXiv preprint arXiv: 2103.02907, 2021.
[36] Liu R, Lehman J, Molino P, et al. An intriguing failing of convolutional neural networks and the CoordConv solution[J]. arXiv preprint arXiv: 1807.03247, 2018.
[37] Misra D. Mish: A self regularized non-monotonic neural activation function[J]. arXiv preprint arXiv: 1908.08681, 2019.
Improved YOLO V4 model for face recognition of dairy cows by fusing coordinate information
Yang Shuqin1,2,3, Liu Yangqihang1,2,3, Wang Zhen1, Han Yuanyuan1, Wang Yongsheng4, Lan Xianyong5
(1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China; 2. Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, China; 3. Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling 712100, China; 4. College of Veterinary Medicine, Northwest A&F University, Yangling 712100, China; 5. College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China)
Individual identification of dairy cows is one of the most important prerequisites for intelligent, precise, and large-scale dairy farming. It provides basic information for formulating individual feeding plans and analyzing milk production efficiency and health status, and serves as an important link in the management of milk source traceability, disease prevention, and insurance claim settlement. Traditional manual identification of cows, such as ear tags, brands, neck chains, and pricks, is time-consuming and laborious, and can easily cause stress responses that injure both cows and people. Current identification using Radio Frequency Identification (RFID) or physiological characteristics, such as muzzle print patterns, the iris, and retinal blood vessels, still has defects in durability, cost, and accessibility. In this study, a cow face identification model fusing coordinate information was proposed based on an improved YOLO V4 detection model, in order to identify individual dairy cows accurately and without contact. Holstein cows were taken as the research object. First, facial images of 71 cows were collected on a working dairy farm over three years, covering different growth stages, lighting conditions, postures, and degrees of occlusion. A preprocessing step removed blurry, severely occluded, poorly lit, and abnormal images. The preprocessed dataset was then enhanced and expanded by -10° to 10° rotation, random brightness adjustment, and cropping, thereby improving the generalization performance of the model. In total, 16 614 training images were obtained, including 10 940 images from 2019 and 2020 and 5 674 images taken in 2021, while the remaining 649 images from 2021 were used as the test set.
Secondly, a coordinate attention module and a coordinate convolution (CoordConv) module containing coordinate channels were introduced into the feature extraction layers and the detection head of the YOLO V4 network, respectively, to improve the model's sensitivity to target location. Finally, the improved YOLO V4 model was compared with five object detection models to verify its effectiveness. The test results showed that the mean average precision of the improved YOLO V4 model was 93.68%, which was 16.32, 10.92, 0.89, 1.51, and 0.19 percentage points higher than that of the SSD, CenterNet, YOLO V4, Faster R-CNN, and CA-YOLO V4 models, respectively. The detection speed of the improved model was slightly lower than that of the original YOLO V4. Furthermore, the improved YOLO V4 model achieved better recognition of cows with occluded faces than the other models: its recognition rate reached 92.60%, which was 12.79, 30.52, 10.95, and 7.91 percentage points higher than that of the SSD, CenterNet, YOLO V4, and Faster R-CNN models, respectively. Nevertheless, recognition accuracy still needs to be improved when facial features are obscured by large occlusion areas or dim light. Overall, the experiments demonstrated that fusing coordinate information enhances the position sensitivity of the improved YOLO V4 model to the cow face, leading to higher recognition accuracy. This finding can provide effective technical support for cow face identification in precision dairy farming.
image recognition; animals; dairy cow face; YOLO V4; attention mechanism; coordinate convolution
Yang Shuqin, Liu Yangqihang, Wang Zhen, et al. Improved YOLO V4 model for face recognition of dairy cows by fusing coordinate information[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(15): 129-135. (in Chinese with English abstract) doi: 10.11975/j.issn.1002-6819.2021.15.016 http://www.tcsae.org
Received: 2021-06-04
Revised: 2021-07-21
Foundation item: Shaanxi Agricultural Science and Technology Innovation and Transformation Project (NYKJ-2020-YL-07)
Author: Yang Shuqin, Ph.D., associate professor, research interest: applications of computer vision in agricultural information. Email: yangshuqin1978@163.com
doi: 10.11975/j.issn.1002-6819.2021.15.016
CLC number: TP391.4  Document code: A  Article ID: 1002-6819(2021)-15-0129-07