熱點(diǎn)追蹤
人臉特征提取與識(shí)別
·編者按·
人臉識(shí)別,也稱作面部識(shí)別或人像識(shí)別,是利用計(jì)算機(jī)技術(shù),基于人的臉部特征信息進(jìn)行身份識(shí)別的技術(shù),主要包括三個(gè)主要環(huán)節(jié),即人臉檢測(cè)、臉部特征點(diǎn)定位和特征提取和識(shí)別。廣義的人臉識(shí)別研究范圍主要包括人臉檢測(cè)、人臉表征、人臉鑒別、表情姿態(tài)分析和生理分類等,該研究同屬于生物特征識(shí)別、計(jì)算機(jī)視覺、人工智能等多個(gè)領(lǐng)域,涉及到模式識(shí)別、圖象處理及生理、心理學(xué)等多方面的知識(shí)。狹義的人臉識(shí)別是將待識(shí)別人臉?biāo)崛〉奶卣髋c數(shù)據(jù)庫中人臉的特征進(jìn)行對(duì)比,根據(jù)相似度判別分類。本專題側(cè)重關(guān)注狹義人臉識(shí)別,即人臉特征提取與識(shí)別技術(shù)。
與指紋、視網(wǎng)膜、虹膜、基因等其他人體生物特征識(shí)別系統(tǒng)相比,人臉識(shí)別具有簡(jiǎn)便、友好、直接的特點(diǎn),更易于為用戶所接受,并且還能夠獲得一些額外信息,因而,該技術(shù)在許多領(lǐng)域有著廣闊的應(yīng)用前景。人臉識(shí)別可用于安全驗(yàn)證系統(tǒng)、公安系統(tǒng)、人機(jī)交互系統(tǒng)等,此外還可用于檔案管理、醫(yī)學(xué)領(lǐng)域、金融領(lǐng)域以及視頻會(huì)議等方面。
人臉識(shí)別技術(shù)的具體實(shí)現(xiàn)方法是將檢測(cè)出的人臉圖像信息與數(shù)據(jù)庫中的人臉圖像進(jìn)行對(duì)比,從中找出與之匹配的人臉。人臉識(shí)別的理論研究以正面圖像輸入模式為主,主要經(jīng)歷了三個(gè)階段。第一階段主要研究人臉識(shí)別所需要的面部特征,主要的研究者以Bertillon、Allen和Parke為代表;第二階段主要研究人機(jī)交互識(shí)別,主要的研究者有Goldstion,Harmon和Lesk;第三階段開啟了機(jī)器自動(dòng)識(shí)別技術(shù)研究,主要包括基于幾何特征、基于代數(shù)特征和基于連接機(jī)制的三種識(shí)別技術(shù)。常用的人臉特征提取與識(shí)別方法主要包括:基于主元分析的人臉識(shí)別方法、基于奇異值分解的人臉識(shí)別方法、基于幾何結(jié)構(gòu)特征與灰度特征融合的人臉識(shí)別方法、非線性建模人臉識(shí)別方法、基于隱馬爾可夫模型的人臉識(shí)別方法和基于圖像重建和圖像融合的人臉識(shí)別方法。
目前,深度學(xué)習(xí)和大數(shù)據(jù)是人臉識(shí)別研究的一個(gè)主要熱點(diǎn)方向,現(xiàn)有的大規(guī)模人臉數(shù)據(jù)集合包括:北京曠視科技(Megvii)有限公司旗下的新型視覺服務(wù)平臺(tái)Face++的5 million images of 20000 subjects,谷歌公司的人工智能系統(tǒng)FaceNet的100-200 million images of 8 million subjects,騰訊公司的優(yōu)圖團(tuán)隊(duì)Tencent-BestImage的1 million images of 20000 subjects,以及中國(guó)科學(xué)院自動(dòng)化研究所的494414 images of 10575 subjects。
本專題得到了譚鐵牛院士(中國(guó)科學(xué)院自動(dòng)化研究所)的大力支持。
·熱點(diǎn)數(shù)據(jù)排行·
截至2015年5月7日,中國(guó)知網(wǎng)(CNKI)和Web of Science(WOS)的數(shù)據(jù)報(bào)告顯示,以人臉識(shí)別為詞條檢索到的期刊文獻(xiàn)分別為4612與4795條,本專題將相關(guān)數(shù)據(jù)按照:研究機(jī)構(gòu)發(fā)文數(shù)、作者發(fā)文數(shù)、期刊發(fā)文數(shù)、被引用頻次進(jìn)行排行,結(jié)果如下。
研究機(jī)構(gòu)發(fā)文數(shù)量排名(CNKI)
研究機(jī)構(gòu)發(fā)文數(shù)量排名(WOS)
作者發(fā)文數(shù)量排名(CNKI)
作者發(fā)文數(shù)量排名(WOS)
期刊發(fā)文數(shù)量排名(CNKI)
期刊發(fā)文數(shù)量排名(WOS)
根據(jù)中國(guó)知網(wǎng)(CNKI)數(shù)據(jù)報(bào)告,以人臉識(shí)別為詞條檢索到的高被引論文排行結(jié)果如下。
國(guó)內(nèi)數(shù)據(jù)庫高被引論文排行
根據(jù)Web of Science統(tǒng)計(jì)數(shù)據(jù),以人臉識(shí)別為詞條檢索到的高被引論文排行結(jié)果如下。
國(guó)外數(shù)據(jù)庫高被引論文排行
·經(jīng)典文獻(xiàn)推薦·
基于Web of Science檢索結(jié)果,利用Histcite軟件選取LCS(Local Citation Score,本地引用次數(shù))TOP 30文獻(xiàn)作為節(jié)點(diǎn)進(jìn)行分析,并結(jié)合專家意見,得到本領(lǐng)域推薦的經(jīng)典文獻(xiàn)如下。
來源出版物:Journal of Cognitive Neuroscience,1991,3(1):71-86
A comparative study of texture measures with classification based on feature distributions
Ojala,T; Pietikainen,M; Harwood,D
Abstract: This paper evaluates the performance both of some texture measures which have been successfully used in various applications and of some new promising approaches proposed recently. For classification a method based on Kullback discrimination of sample and prototype distributions is used. The classification results for single features with one-dimensional feature value distributions and for pairs of complementary features with two-dimensional distributions are presented.
Keywords: texture analysis; classification; feature distribution; Brodatz textures; Kullback discriminant; performance evaluation
來源出版物:Pattern Recognition,1996,29(1): 51-59
Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection
Belhumeur,PN; Hespanha,JP; Kriegman,DJ
Abstract: We develop a face recognition algorithm which is insensitive to large variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high dimensional image space - if the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's Linear Discriminant and produces well separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The Eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the Eigenface technique for tests on the Harvard and Yale Face Databases.
Keywords: appearance-based vision; face recognition; illumination invariance; Fisher's linear discriminant
來源出版物:IEEE Transactions on Pattern Analysis and Machine Intelligence,1997,19(7): 711-720
Robust real-time face detection
Viola,P; Jones,MJ
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the "Integral Image" which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
Keywords: face detection; boosting; human sensing
來源出版物:International Journal of Computer Vision,2004,57(2): 137-154
PCA versus LDA
Martinez,AM; Kak,AC; et al.
Abstract: In the context of the appearance-based paradigm for object recognition, it is generally believed that algorithms based on LDA (Linear Discriminant Analysis) are superior to those based on PCA (Principal Components Analysis). In this communication, we show that this is not always the case. We present our case first by using intuitively plausible arguments and then by showing actual results on a face database. Our overall conclusion is that when the training data set is small, PCA can outperform LDA and, also, that PCA is less sensitive to different training data sets.
Keywords: face recognition; pattern recognition; principal components analysis; linear discriminant analysis; learning from undersampled distributions; small training data sets
來源出版物:IEEE Transactions on Pattern Analysis and Machine Intelligence,2001,23(2): 228-233
·推薦綜述·
人臉識(shí)別技術(shù)綜述
張翠平,蘇光大
(清華大學(xué)電子工程系“智能技術(shù)與系統(tǒng)”國(guó)家重點(diǎn)實(shí)驗(yàn)室圖形圖象分室,北京 100084)
摘編自《中國(guó)圖像圖形學(xué)報(bào)》2015年5卷11期:885~894頁,圖、表、參考文獻(xiàn)已省略
0 Introduction
Computer face recognition is the technology of analyzing face images by computer, extracting effective identifying information from them, and using it to "recognize" identity. It has a wide range of application backgrounds: criminal identification in public security systems, verification of driver's licenses and passports against their actual holders, surveillance systems in banks and customs, automatic entry-guard systems, and so on. Although humans are very good at face recognition and can remember and distinguish thousands of different faces, the task is much harder for computers: faces are highly expressive; faces change with age; face images are affected by illumination, imaging angle, and imaging distance; and reconstructing a 3D face from a 2D image is an ill-posed process, with no good 3D model for describing faces available so far. Moreover, face recognition involves image processing, computer vision, pattern recognition, and neural networks, and is closely tied to our understanding of the human brain. All these factors make face recognition a highly challenging topic.
Computer face recognition has developed gradually over the past twenty years and became a research hotspot in the 1990s: between 1990 and 1998 alone, several thousand related papers can be found in EI. The face databases used in experiments are usually small — the most common ones contain only about 100 face images, and the MIT, Yale, and CMU databases are all small — and since input conditions differ across databases, different recognition programs are hard to compare. To promote deeper research and practical deployment of face recognition algorithms, the U.S. Department of Defense launched the Face Recognition Technology (FERET) program [1], which comprises a common face database and a common testing protocol; the FERET database can be used to test and compare all kinds of face recognition algorithms. By 1997 it stored 14,126 images of 1,199 individuals, with within-person variation covering different expressions, illumination, head poses, and capture sessions more than 18 months apart. The FERET database is still being expanded, and various recognition programs are tested on it periodically; the analysis of the test results provides some guidance for future work. Because the FERET database includes images of military personnel, it is not available outside the United States, so researchers in other countries can only use local databases, such as the Manchester face database in the UK [2].
Humans recognize faces using several senses — vision, hearing, smell, touch — and recognition may be done by a single sense or by several in combination for storing and retrieving faces, whereas computer face recognition relies mainly on visual data. Progress in computer face recognition is also limited by our understanding of the human recognition system itself. Research shows [2] that human visual processing is hierarchical: the lowest-level visual stage (the retinal function) acts as an information transcoder, converting the large amount of image data received by the eye into a fairly regular, compact representation. Physiological studies show that the retina contains low-level and high-level cells: the spatial response of the low-level cells resembles the result of a wavelet transform [2], while the high-level cells respond to concrete lines, surfaces, and even object patterns based on the responses of groups of low-level cells. By analogy, in computer face recognition we may define features obtained by simple processing of large amounts of image data as low-level features, and descriptive features such as lines, surfaces, and patterns as high-level features. Thus KL-transform coefficient features, wavelet features, and various statistical features are low-level, whereas the results of shape analysis of facial parts are high-level. Since reconstruction of transmitted visual data depends on prior knowledge formed early in the brain, face detection in the human recognition system results from holistic recognition and feature-based recognition acting together [3]: recognizing a person at a distance is mainly holistic, while at close range the recognition of feature parts matters more. The facial parts also contribute unequally to recognition — the eyes and mouth matter more than the nose, and the upper half of the face more than the lower half — and distinctive faces are easier to recognize and remember [3]: a crooked mouth or a single eye, say, is recalled more readily, while a face without distinctive traits takes longer to identify. Brain research further shows [3] that expression recognition and face recognition, though related, are on the whole separate, parallel processes. These conclusions provide some inspiration for designing effective recognition methods. Among existing algorithms, the eigenface method [4] and neural network methods [5] are holistic, whereas methods that form feature vectors from extracted parts such as the eyes [6] are feature-based.
Face recognition research began in the late 1960s; the earliest work is found in [7], where Bledsoe built a semi-automatic face recognition system using the distances and ratios between facial feature points as features. Early research followed two main directions: one extracted geometric features of the face [7], including normalized distances and ratios between facial parts and the 2D topology formed by feature points such as the eye corners, mouth corners, and nose tip; the other was template matching, which recognizes faces by computing the autocorrelation between a template and the image gray levels. After a fairly thorough review and comparison of the two families in 1993, Brunelli concluded that template matching outperforms geometric features [8]. Current research also follows two main directions. The first is the holistic approach, which considers global properties of the pattern and includes the eigenface method, SVD-based methods [9], iso-density line analysis and matching [10], elastic graph matching [11], hidden Markov model methods [12], and neural networks. The second is feature-based analysis, which forms a recognition feature vector from relative ratios of facial fiducial points together with other shape or category parameters describing the facial features. Holistic recognition preserves both the topological relations among facial parts and the information of the parts themselves, while part-based recognition designs its algorithms on local contour and gray-level information extracted from the parts. Reference [8] holds that whole-face analysis is better than part-based analysis because it retains more information, but this claim is debatable: part-based recognition is more intuitive, extracting and using the most useful features — the positions of key points and shape analysis of the parts — whereas whole-face recognition takes the entire face image as the pattern, so illumination, viewing angle, and face scale strongly affect recognition, and effectively removing these disturbances is crucial. Part-based methods have their own difficulty, however: building good models to represent the parts to be recognized. A recent trend is to combine holistic recognition with feature analysis, as in the analytic-and-holistic method of Kin-Man Lam [13] and the flexible-model approach of Andreas Lanitis for interpreting and coding faces [14].
Before introducing the important face recognition methods, we briefly cover some other methods used in face recognition. The SVD method and the eigenface method both belong to statistical analysis: both reduce the dimensionality of the large amount of image data representing a face and then classify, differing only in the choice of transform basis. Iso-density line analysis tries to capture the 3D information of the face by extracting iso-density (i.e., iso-gray-level) lines from the 2D face image: just as contour lines on a map reflect terrain, the iso-density lines of different faces can be compared for similarity. The HMM is a statistical method that has been successful in speech processing. Neural network methods usually require the face as a one-dimensional input vector, so the number of input nodes is huge, and a major goal of such recognizers is dimensionality reduction. According to the analysis of self-organizing neural networks in [15], P nodes of a self-organizing network can represent the original N inputs (P < N), but since classification is then performed on the P outputs, recognition is only about as effective as classifying after extracting face-space feature vectors — i.e., at the level of the eigenface method — so this paper does not treat neural networks separately. Note that because faces live in a high-dimensional space (a 100×100 image is 10,000-dimensional), the network's input layer is huge, training involves very many parameters, and implementation is difficult in practice; an advantage of neural networks, however, is that subspaces can be designed for specific problems, e.g., gender classification [15].
1常用的人臉識(shí)別方法簡(jiǎn)介
1.1基于KL變換的特征臉識(shí)別方法
1.1.1基本原理
KL變換是圖象壓縮中的一種最優(yōu)正交變換。人們將它用于統(tǒng)計(jì)特征提取,從而形成了子空間法模式識(shí)別的基礎(chǔ)。若將KL變換用于人臉識(shí)別,則需假設(shè)人臉處于低維線性空間,且不同人臉具有可分性。由于高維圖象空間KL變換后可得到一組新的正交基,因此可通過保留部分正交基,以生成低維人臉空間。而低維空間的基則是通過分析人臉訓(xùn)練樣本集的統(tǒng)計(jì)特性來獲得。KL變換的生成矩陣可以是訓(xùn)練樣本集的總體散布矩陣,也可以是訓(xùn)練樣本集的類間散布矩陣,即可采用同一人的數(shù)張圖象的平均來進(jìn)行訓(xùn)練,這樣可在一定程度上消除光線等的干擾,且計(jì)算量也得到減少,而識(shí)別率不會(huì)下降。
也就是說,根據(jù)總體散布矩陣或類間散布矩陣可求出一組正交的特征向量μ1,μ2,…,μn,其對(duì)應(yīng)的全部特征值分別為λ1,λ2,…,λn,這樣,在新的正交空間中,人臉樣本X就可以表示為
若通過選用m(m < n)個(gè)特征向量作為正交基,則在該正交空間的子空間中,就可得到以下近似表達(dá)式
如將子空間的正交基按照?qǐng)D象陣列排列,則可以看出這些正交基呈現(xiàn)人臉的形狀,因此這些正交基也被稱作特征臉,這種人臉識(shí)別方法也叫特征臉方法。關(guān)于正交基的選擇有不同的考慮,即與較大特征值對(duì)應(yīng)的正交基(也稱主分量)可用來表達(dá)人臉的大體形狀,而具體細(xì)節(jié)還需要用與小特征值對(duì)應(yīng)的特征向量(也稱次分量)來加以描述,因此也可理解為低頻成分用主分量表示,而高頻成分用次分量表示。其中,采用主分量作正交基的方法稱為主分量方法(PCA)。同時(shí),也有人采用m個(gè)次分量作為正交基,原因是所有人臉的大體形狀和結(jié)構(gòu)相似,真正用來區(qū)別不同人臉的信息是那些用次分量表達(dá)的高頻成分。由訓(xùn)練得到特征臉后,將待識(shí)別人臉投影到新的m維人臉空間,即用一系列特征臉的線性加權(quán)和來表示它,這樣即得到一投影系數(shù)向量來代表待識(shí)別人臉,這時(shí)候,人臉識(shí)別問題已轉(zhuǎn)化為m低維空間的坐標(biāo)系數(shù)矢量分類問題,而分類最簡(jiǎn)單的做法是最小距離分類。
KL變換在90年代初受到了很大的重視,實(shí)際用于人臉識(shí)別也取得了很好的效果,其識(shí)別率從70~100%不等,這取決于人臉庫圖象的質(zhì)量。從壓縮能量的角度來看,KL變換是最優(yōu)的,它不僅使得從n維空間降到m維空間前后的均方誤差最小,而且變換后的低維空間有很好的人臉表達(dá)能力,然而這不是說已經(jīng)具有很好的人臉辨別能力。選擇訓(xùn)練樣本的散布矩陣作為KL變換的生成矩陣,是由于其最大特征向量抓住了該樣本集合的主要分布,但這是圖象統(tǒng)計(jì),而不是人臉統(tǒng)計(jì)方法。它雖然考慮了圖象之間所有的差異,但由于它不管這樣的差異是由照明、發(fā)型變更或背景導(dǎo)致,還是屬于人臉的內(nèi)在差異,因此特征臉識(shí)別的方法用于人臉識(shí)別存在理論的缺陷。研究表明,特征臉的方法隨著光線、角度及人臉的尺寸等因素的引入,識(shí)別率急劇下降。雖然可通過采用同一人的訓(xùn)練樣本的平均來計(jì)算類間散布矩陣,但也只能在一定程度上糾正這個(gè)缺點(diǎn)。研究結(jié)果表明,主分量的方法使得變換后表達(dá)能力最佳,次分量的方法則考慮了高頻的人臉區(qū)分能力。由于對(duì)KL變換而言,外在因素帶來的圖象差異和人臉本身帶來的差異是不加任何區(qū)分的,因此,不管如何選擇正交基,也不能根本解決問題。其改善的一個(gè)思路是針對(duì)干擾所在,對(duì)輸入圖象作規(guī)范化處理,其中包括將輸入圖的均值方差歸一化、人臉尺寸歸一化等;另一種改進(jìn)是考慮到局部人臉圖象受外在干擾相對(duì)較小,在進(jìn)行人臉識(shí)別時(shí),除計(jì)算特征臉之外,還可利用KL變換計(jì)算出特征眼睛、特征嘴巴等。然后將局部特征向量加權(quán)進(jìn)行匹配,就能夠得到一些好的效果。
1.1.2對(duì)特征臉方法的改進(jìn)
一種較好的特征臉改進(jìn)方法是fisher臉方法(fisherface)[17],眾所周知,fisher線性判別準(zhǔn)則是模式識(shí)別里的經(jīng)典方法,一般應(yīng)用fisher準(zhǔn)則是假設(shè)不同類別在模式空間是線性可分的,而引起它們可分的主要原因是不同人臉之間的差異。fisher的判別準(zhǔn)則是:不同類樣本盡可能遠(yuǎn),同類樣本盡可能近。文獻(xiàn)[17]對(duì)用KL變換和fisher準(zhǔn)則分別求出來的一些特征臉進(jìn)行比較后得出如下結(jié)論,即認(rèn)為特征臉很大程度上反映了光照等的差異,而fisher臉則能壓制圖象之間的與識(shí)別信息無關(guān)的差異。Belhumeur的試驗(yàn)[17],是通過對(duì)160幅人臉圖象(一共16個(gè)人,每個(gè)人10幅不同條件下的圖象)進(jìn)行識(shí)別,若采用KL變換進(jìn)行識(shí)別,其識(shí)別率為81%;若采用fisher方法則識(shí)別率為99.4%,顯然fisher方法有了很大的改進(jìn)。 Chengjun Liu在KL變換基礎(chǔ)上提出了PRM(Probalistic Reasoning Models)模型[18],并在PRM中采用了貝葉斯分類器,它是利用最大后驗(yàn)概率進(jìn)行分類,其類條件概率密度的方差參數(shù)用類內(nèi)散布矩陣來估計(jì),而且,PRM是采用馬氏距離,而不是采用最小歐氏距離的判別準(zhǔn)則,并且特征臉和fisher臉均可以看成是PRM的特殊情況。
文獻(xiàn)[19]的改進(jìn)方法是將人臉圖象進(jìn)行差異分類,即分為臉間差異和臉內(nèi)差異,其中臉內(nèi)差異屬于同一個(gè)人臉的各種可能變形,而臉間差異則表示不同人的本質(zhì)差異,而實(shí)際人臉圖的差異為兩者之和。通過分析人臉差異圖,如果臉內(nèi)差異比臉間差異大,則認(rèn)為兩人臉屬于同一人的可能性大,反之屬不同人的可能性大。 假設(shè)該兩類差異都是高斯分布,則先估計(jì)出所需的條件概率密度[19],最后也歸為求差圖在臉內(nèi)差異特征空間和臉間差異特征空間的投影問題。如果說fisher臉的方法是試圖減少光照等的外在干擾,那么文獻(xiàn)[19]則是解決表情干擾的一點(diǎn)有效嘗試,雖然這樣的嘗試還很初步。文獻(xiàn)[19]中提到,ARPA在1996年進(jìn)行的FERET人臉識(shí)別測(cè)試中,該算法取得了最好的識(shí)別效果,其綜合識(shí)別能力優(yōu)于其它任何參加測(cè)試的算法。
1.1.3特征臉方法小結(jié)
如今特征臉方法用于人臉識(shí)別仍存在如下一些弊?。菏紫龋捎谧鳛橐环N圖象的統(tǒng)計(jì)方法,圖象中的所有象素被賦予了同等的地位,可是角度、光照、尺寸及表情等干擾會(huì)導(dǎo)致識(shí)別率急劇下降,因此較好的識(shí)別算法[19]都對(duì)人臉進(jìn)行了矯正處理,且只考慮裸臉;其次,根據(jù)文獻(xiàn)[2],人臉在人臉空間的分布近似高斯分布,且普通人臉位于均值附近,而特殊人臉則位于分布邊緣。由此可見,越普通的人臉越難識(shí)別,雖然特征臉的方法本質(zhì)上是抓住了人群的統(tǒng)計(jì)特性,但好的表達(dá)能力不等于好的區(qū)分能力;特征臉雖反映了特定庫的統(tǒng)計(jì)特性,但不具有普遍代表性,而廣泛的應(yīng)用,則需要訓(xùn)練出的特征臉具有普遍意義;采用此方法的重要假設(shè)是人臉處于低維線性空間,即人臉相加和相減后還是人臉[2],顯然這是不可能的,因?yàn)榧词乖诙ㄎ缓统叽缦嗤那闆r下,由于部件的相對(duì)位置不同,相加、相減后的人臉也一樣存在模糊,因此文獻(xiàn)[14]提出形狀無關(guān)人臉(shapeless face)的概念,即依據(jù)臉部基準(zhǔn)點(diǎn)將人臉變形到標(biāo)準(zhǔn)臉,再進(jìn)行特征臉處理。總之,有效的特征臉識(shí)別方法需要做大量預(yù)處理,以減少干擾。而如何表達(dá),并去除表情因素則是識(shí)別的另一關(guān)鍵。
1.2形狀和灰度分離的可變形模型
文獻(xiàn)[14]提出了一個(gè)形狀和灰度分離的模型,即從形狀、總體灰度、局部灰度分布3個(gè)方面來描述一個(gè)人臉(如圖1、圖2、圖3所示)。其中,點(diǎn)分布模型(圖1)用來描述人臉的形狀特征,該點(diǎn)分布模型中是用每點(diǎn)的局部灰度信息(圖3是采用耳朵上一點(diǎn)附近的方向投影)來描述人臉的局部灰度特征;然后用點(diǎn)分布模型將圖象進(jìn)行變形,以生成形狀無關(guān)人臉(圖2),再做特征臉分析,從而得到人臉的總體灰度模式分布特征。這種三者相結(jié)合的識(shí)別方法,識(shí)別率為92%(300個(gè)人臉),雖然該方法作了一些改進(jìn),但構(gòu)成該方法的基礎(chǔ)仍是KL變換。一般在特征臉的方法中,是由行或列掃描后的人臉圖象數(shù)據(jù)來生成特征臉子空間,這里則對(duì)應(yīng)于3種由不同類型參數(shù)生成的3種特征子空間。該方法首先是循序取每點(diǎn)坐標(biāo)位置信息,并將其排列成待訓(xùn)練數(shù)據(jù)以生成形狀特征子空間;然后對(duì)點(diǎn)分布模型的每一點(diǎn)(如圖3中耳朵附近一點(diǎn))取局部投影信息來代表該點(diǎn)附近的局部灰度特征,再通過訓(xùn)練后生成與該點(diǎn)相對(duì)應(yīng)的局部灰度分布特征子空間。若將所有人臉的關(guān)鍵點(diǎn)都變形到規(guī)定位置,則生成形狀無關(guān)人臉,然后對(duì)所有的形狀無關(guān)人臉進(jìn)行特征臉分析,以生成特征臉子空間。雖然每一個(gè)特征子空間都可以單獨(dú)用來識(shí)別人臉,但若要完整地描述一個(gè)人臉,則需要 3個(gè)特征子空間的人臉參數(shù)。文獻(xiàn)[14]還試圖通過形狀特征子空間來分離和表情相關(guān)的參數(shù),而設(shè)計(jì)形狀和灰度分離的模型是希望能夠有一個(gè)好的人臉模型。試驗(yàn)中,將這樣的模型用于三維姿態(tài)復(fù)原、身份識(shí)別、性別識(shí)別、表情識(shí)別以及人臉的重建,均取得了一定的效果。
1.3基于小波特征的彈性匹配方法
1.3.1基本原理
在KL變換中,待識(shí)別人臉X和庫中人臉C之間采用了通常的歐氏距離來進(jìn)行匹配。雖然歐氏距離計(jì)算簡(jiǎn)單,但是當(dāng)X和C只有位移、膨脹(如affine變換)或是表情不同時(shí),則歐氏距離不會(huì)等于零,甚至很大,此外,若C作為人臉庫中的已知人臉模板,應(yīng)該是描述人臉的關(guān)鍵特征,它的維數(shù)并不需要和待識(shí)別人臉一樣,因而此時(shí)歐氏距離就不合適;而彈性圖匹配法是在二維的空間中定義了這樣一個(gè)距離,它對(duì)通常的人臉變形具有一定的不變性,也不要求C、X維數(shù)一定相同??刹捎脤傩酝?fù)鋱D來表達(dá)人臉(圖4采用的是規(guī)則的二維網(wǎng)格圖),其拓?fù)鋱D的任一頂點(diǎn)均包含一特征矢量,它記錄了人臉在該頂點(diǎn)位置的分布信息(如圖5),如文獻(xiàn)[11]中介紹的二維拓?fù)鋱D的頂點(diǎn)矢量就是人臉經(jīng)小波變換后的特征矢量。在圖象的敏感位置(如輪廓線、突出點(diǎn)等),小波變換后生成的特征矢量的模較大。用拓?fù)鋱D分別代表已知和待識(shí)別人臉,還可根據(jù)匹配拓?fù)鋱D算出它們的“距離”,作為人臉的相似度準(zhǔn)則。由于篇幅所限,詳細(xì)的拓?fù)鋱D生成過程文獻(xiàn)[11]、[15]。
人臉的相似度可用拓?fù)鋱D的“距離”來表示,而最佳的匹配應(yīng)同時(shí)考慮頂點(diǎn)特征矢量的匹配和相對(duì)幾何位置的匹配。由圖 6(和圖5一樣,它們的每一頂點(diǎn)均為一特征矢量)可見,特征匹配即:S1上的頂點(diǎn)i,與S中相對(duì)應(yīng)的頂點(diǎn)j(j= M(i),M為匹配函數(shù)),其特征的匹配度則表示i和j頂點(diǎn)的特征矢量相似度,而幾何位置的匹配則為S中相近的兩頂點(diǎn),匹配后,S1中對(duì)應(yīng)的兩頂點(diǎn)也應(yīng)該相近,因此文獻(xiàn)[11]用了以下能量函數(shù)E(M)來評(píng)價(jià)待識(shí)別人臉圖象矢量場(chǎng)和庫中已知人臉的矢量場(chǎng)之間的匹配程度
式中的第一項(xiàng)是計(jì)算兩個(gè)矢量場(chǎng)中對(duì)應(yīng)的局部特征Xj和Ci的相似程度,第二項(xiàng)則是計(jì)算局部位置關(guān)系和匹配次序。由此可見,最佳匹配也就是最小能量函數(shù)時(shí)的匹配。
在求能量函數(shù)實(shí)現(xiàn)匹配的時(shí)候,可以有如下兩種匹配的方法:其中一種是嚴(yán)格的匹配方法;另一種匹配即所謂彈性圖匹配方法(見圖7)。由圖7可見,網(wǎng)格S經(jīng)過了變形,即由原來網(wǎng)格S中的一點(diǎn)對(duì)S1中一點(diǎn)的嚴(yán)格匹配,變成了S中一點(diǎn)和S1中一點(diǎn)領(lǐng)域范圍內(nèi)的匹配,其目的是為了進(jìn)一步減小能量函數(shù),通過最終收斂到一個(gè)最小值,來實(shí)現(xiàn)彈性匹配,正是這樣的匹配容忍了表情的細(xì)微變化。
根據(jù)Jun Zhang[15]對(duì)綜合MIT、Olivetti、W wizmann、和Bem等人臉庫所形成的包括272幅照片的綜合人臉庫,分別用KL方法和彈性匹配方法進(jìn)行識(shí)別試驗(yàn)比較[15],所得的識(shí)別率分別為66%和93%。其中KL變換的識(shí)別率很低,其原因主要是由于綜合庫里來自4個(gè)人臉庫的人臉圖象在光照上有很大的差異所造成的,文獻(xiàn)[15]之所以作出了彈性圖形匹配優(yōu)于KL變換的結(jié)論,其原因之一是由于拓?fù)鋱D的頂點(diǎn)采用了小波變換特征,因?yàn)樗鼘?duì)于光線、變換、尺寸和角度具有一定的不變性。大家知道,小波特征分析是一種時(shí)頻分析,即空間—頻率分析,若空間一點(diǎn)周圍區(qū)域的不同的頻率響應(yīng)構(gòu)成該點(diǎn)的特征串,則其高頻部分就對(duì)應(yīng)了小范圍內(nèi)的細(xì)節(jié),而低頻部分則對(duì)應(yīng)了該點(diǎn)周圍較大范圍內(nèi)的概貌。根據(jù)該原理,文獻(xiàn)[20]提出了用數(shù)學(xué)形態(tài)學(xué)上的腐蝕擴(kuò)張方法形成的多尺度(多分辨率)特征矢量來取代小波特征,并證明了它具有和小波特征相似的效果,它能夠反映空間一點(diǎn)周圍的高低頻信息?,F(xiàn)已證明,彈性圖形匹配能保留二維圖象的空間相關(guān)性信息,而特征臉方法在將圖象排成一維向量后,則丟失了很多空間相關(guān)性信息。 這些都是彈性匹配方法優(yōu)于特征臉方法的原因,如向人臉庫中加入新的人臉時(shí),由于不能保證已有特征臉的通用性,因而有可能需要重新計(jì)算特征臉;而對(duì)于彈性匹配的方法,則不需要改變已有的數(shù)據(jù),通過直接加入新的模板數(shù)據(jù)即可,但計(jì)算較復(fù)雜是彈性匹配的一大缺點(diǎn)。根據(jù)引言中提出的低層次特征和高層次特征的定義,這里的小波特征類似于外界景物在人眼視網(wǎng)膜上的響應(yīng),屬低層次特征,沒有線、面、模式的概念。 由于低層次特征中信息的冗余不僅使得計(jì)算復(fù)雜,而且由于大量與識(shí)別無關(guān)的信息沒有過濾掉,因而識(shí)別率會(huì)大打折扣,另外特征臉也存在這樣的問題,其中典型的無用信息就是頭發(fā)。
針對(duì)彈性匹配方法的缺陷,可從以下兩方面進(jìn)行改進(jìn):一是降低計(jì)算復(fù)雜度,即對(duì)表達(dá)人臉的二維矢量場(chǎng)進(jìn)行特征壓縮和提??;二是減少冗余信息,即將所提取出來的低層次特征和高層次特征(如眼角、鼻端的位置等)結(jié)合起來,以突出關(guān)鍵點(diǎn)的識(shí)別地位。
1.3.2對(duì)彈性匹配方法的改進(jìn)及分析
文獻(xiàn)[20]提出了一種彈性匹配的改進(jìn)方法,即將KL變換應(yīng)用于小波變換,來生成二維網(wǎng)格中頂點(diǎn)的矢量串,以減少其維數(shù),從而大大減少了表達(dá)一幅人臉?biāo)枰奶卣鲾?shù)量,而識(shí)別率不會(huì)明顯下降。
文獻(xiàn)[21]是采用人臉基準(zhǔn)點(diǎn),而不是采用二維網(wǎng)格作為拓?fù)鋱D的節(jié)點(diǎn),同時(shí)節(jié)點(diǎn)特征也是小波變換特征,即它忽略了除重要人臉部件以外的特征數(shù)據(jù),把研究的重點(diǎn)直接定位到感興趣的區(qū)域(參照?qǐng)D8)。
文獻(xiàn)[21]還采用了和文獻(xiàn)[11]不同的結(jié)構(gòu)來存儲(chǔ)人臉特征(如圖9所示)。
由于文獻(xiàn)[11]中特征庫的存儲(chǔ)是面向人臉的,即對(duì)每一張人臉都需要存儲(chǔ)描述該人臉的整個(gè)拓?fù)鋱D,因而導(dǎo)致了人臉的特征庫很龐大,文獻(xiàn)[21]中特征庫的存儲(chǔ)是面向人臉基準(zhǔn)點(diǎn)的(如圖9),且對(duì)應(yīng)每個(gè)基準(zhǔn)點(diǎn)有一串的特征矢量,當(dāng)由某一人臉的對(duì)應(yīng)基準(zhǔn)點(diǎn)提取出來的矢量不同于庫中已有的任意矢量時(shí),就添入到該結(jié)構(gòu)中存儲(chǔ)起來,并編號(hào)。這樣識(shí)別每個(gè)人臉只需知道人臉對(duì)應(yīng)基準(zhǔn)點(diǎn)在該存儲(chǔ)結(jié)構(gòu)中的特征矢量序號(hào)即可。該存儲(chǔ)結(jié)構(gòu)一個(gè)主要優(yōu)點(diǎn)是,由于不同人臉在同一個(gè)基準(zhǔn)點(diǎn)所對(duì)應(yīng)的特征矢量可能相同,因此和面向人臉的存儲(chǔ)形式相比,數(shù)據(jù)量會(huì)大大減少;另一優(yōu)點(diǎn)是該存儲(chǔ)結(jié)構(gòu)有很強(qiáng)的表達(dá)潛力,設(shè)有10個(gè)基準(zhǔn)點(diǎn),如庫中每一基準(zhǔn)點(diǎn)都存儲(chǔ)了50個(gè)特征矢量,那么該存儲(chǔ)結(jié)構(gòu)能表達(dá)5010個(gè)不同的人臉。由此可見,文獻(xiàn)[21]對(duì)文獻(xiàn)[11]方法的一大改進(jìn)是結(jié)合了人臉的高層次特征。
另外,彈性匹配方法在實(shí)現(xiàn)時(shí),需要考慮具體的參數(shù)選擇,如二維網(wǎng)格的大小、小波變換參數(shù)的選擇等,這些參數(shù)都會(huì)影響識(shí)別的效果。毫無疑問,有效的識(shí)別效果依賴于關(guān)鍵識(shí)別信息的提取,如采取多大的人臉分辨率?能否對(duì)提取出來的特征(具體的或抽象的)進(jìn)行篩選?經(jīng)驗(yàn)知識(shí)使我們關(guān)注人臉部件及其附近的特征,而能否再次對(duì)這些特征進(jìn)行篩選?并有何依據(jù)?文獻(xiàn)[2]正是希望能夠回答這些問題。
文獻(xiàn)[2]的方法稱為緊湊多層圖形方法,它是采用三維的拓?fù)鋱D來表達(dá)人臉(如圖10)。
該圖構(gòu)成了一個(gè)金字塔的人臉模型,而且每一層中節(jié)點(diǎn)的特征矢量也是小波變換的結(jié)果。通過這樣的金字塔模型就實(shí)現(xiàn)了同一個(gè)人臉的多分辨率表達(dá)。另外,文獻(xiàn)[2]有如下兩點(diǎn)創(chuàng)新:(1)將高低層特征聯(lián)系起來,并通過手工選擇一些關(guān)鍵點(diǎn)(如眼角、嘴角等)來定位三維拓?fù)鋱D,同時(shí)去除了背景、頭發(fā)等所在節(jié)點(diǎn);(2)對(duì)三維拓?fù)鋱D的特征進(jìn)行了特征選擇,選出了活躍的特征(包括節(jié)點(diǎn)內(nèi)的特征分量和不同節(jié)點(diǎn)之間兩種活躍性能比較),還去除了相當(dāng)多的貢獻(xiàn)不大的特征,從而形成了人臉的稀疏表達(dá)。由于特征選擇后,不同人臉的拓?fù)鋱D保留的節(jié)點(diǎn)不完全一樣,因此用于比較的兩個(gè)人臉的三維拓?fù)鋱D在數(shù)值上和結(jié)構(gòu)上都不相同,為此,文獻(xiàn)[2]定義了一種距離來計(jì)算它們的相似度。為提取活躍特征,我們?cè)鴩L試?yán)媚切┦止ぬ崛〉年P(guān)鍵點(diǎn),來生成訓(xùn)練庫的形狀無關(guān)模型(不是形狀無關(guān)人臉),即通過插值小波變換后生成的二維拓?fù)鋱D來形成人臉的連續(xù)表達(dá)模型,并假設(shè)所有人的臉內(nèi)差異相同(即表情等),然后根據(jù)訓(xùn)練庫的統(tǒng)計(jì)形狀無關(guān)模型,在一人一張照片的情況下,估計(jì)出個(gè)人表達(dá)模型中的活躍特征。打個(gè)比方,人的眼睛都是相似的,假設(shè)眼睛的分布為高斯分布,那么一個(gè)眼睛離平均眼睛越遠(yuǎn),這個(gè)眼睛的特征就越顯著,即,若有一定的與眾(平均眼)不同性,就可以認(rèn)為是該人的活躍特征,詳細(xì)內(nèi)容參考文獻(xiàn)[2],該文有很多創(chuàng)新,它是以人腦對(duì)人臉的識(shí)別為依據(jù),因此有很好的參考價(jià)值。
通過上述的介紹分析,可看出彈性匹配方法比特征臉識(shí)別方法前進(jìn)了一大步。它是采用小波變換特征來描述人臉的局部信息,并和人眼視網(wǎng)膜對(duì)圖象的響應(yīng)相似[2],而且一定程度上容忍光線等干擾,對(duì)細(xì)微表情也不敏感。而且彈性匹配中的人臉模型還考慮了局部人臉細(xì)節(jié),并保留了人臉的空間分布信息,且它的可變形匹配方式一定程度上能夠容忍人臉從三維到二維投影引起的變形。目前還沒有見到國(guó)內(nèi)有利用彈性匹配進(jìn)行識(shí)別的相關(guān)報(bào)道,但是從國(guó)外眾多的關(guān)于彈性匹配的研究結(jié)果來看,它在人臉識(shí)別眾方法中具有重要地位。
1.4傳統(tǒng)的部件建模的方法
文獻(xiàn)[8]認(rèn)為在人臉識(shí)別中,模型匹配方法要優(yōu)于基于相對(duì)距離的特征分析方法。盡管如此,傳統(tǒng)的部件分析方法還是被一些研究室用于人臉識(shí)別,究其原因,一方面是由于其它方法還處于摸索階段,另一方面是利用曲線去擬合部件、分析部件的形狀比較直觀,也容易取得一定的成果[6]。
在各種人臉識(shí)別方法中,定位眼睛往往是人臉識(shí)別的第一步,由于兩眼睛的對(duì)稱性以及眼珠呈現(xiàn)為低灰度值的圓形,因此在人臉圖象清晰端正的時(shí)候,眼睛的提取是比較容易的,如從400幅人像庫上可取得96%的眼睛定位率[6],但是如果人臉圖象模糊,或者噪聲很多,則往往需要利用更多的信息(如眼睛和眉毛、鼻子的相對(duì)位置等),而且這將使得眼睛的定位變得很復(fù)雜。由于通常眼睛的形狀模型為橢圓[22],嘴巴的形狀模型為拋物線[22],因此橢圓和拋物線的參數(shù)和位置能夠用作表達(dá)當(dāng)前人臉的特征,文獻(xiàn)[6]考慮到眼睛用橢圓表達(dá)過于簡(jiǎn)單,故又采用了二值化,并通過跟蹤以得到眼睛形狀的方法,由于眉毛和臉形的形狀具有任意性,因此在一些研究中曾采用snake動(dòng)態(tài)能量曲線來逼近形狀[13,22],如臉頰的形狀采用了折線,下巴采用拋物線的模型。這些都是傳統(tǒng)的提取和分析形狀的方法。雖然人臉是剛體,可實(shí)際圖象中,部件未必輪廓分明,有時(shí)人用眼看也只是個(gè)大概,計(jì)算機(jī)提取就更成問題。另外,由于用拋物線、橢圓或者直線作為模型也不能很好的表達(dá)反映變化多端的人臉部件,且由于人臉識(shí)別還受到表情的影響,且不能在模型中表達(dá)表情,因而導(dǎo)致描述同一個(gè)人的不同人臉時(shí),其模型參數(shù)可能相差很大,而失去識(shí)別意義,這也是部件建模識(shí)別方法近年受冷落的原因。盡管如此,在正確提取部件以及表情變化微小的前提下,該方法依然奏效,因此在許多方面仍可應(yīng)用,如對(duì)標(biāo)準(zhǔn)身份證照的應(yīng)用。
2人臉識(shí)別方法的分析和總結(jié)
2.1特征來源以及特征的后處理
眾所周知,人臉的結(jié)構(gòu)大體相同,所不同的是一些細(xì)節(jié)上的差異,原始的人臉圖象不僅數(shù)據(jù)龐大,而且還會(huì)隨著拍攝條件及表情神態(tài)變化而變化,這就使得人臉的識(shí)別成為模式分析中的一個(gè)難題。一般從人臉圖象上進(jìn)行有效的識(shí)別需要提取穩(wěn)定的人臉特征,目前所利用的特征可以概括為形狀、灰度分布、頻域特征3種。其中,形狀特征包括人臉各部件的形狀以及人臉各部件之間的相對(duì)位置,這是最初研究所采用的特征;灰度分布特征,即將人臉圖象看成一維或二維灰度模式,所計(jì)算出的不同灰度模式之間的距離就是整體的灰度分布特征,例如特征臉的方法,此外還有描述局部關(guān)鍵點(diǎn)領(lǐng)域的灰度分布特征的分析方法;頻域特征,即將人臉圖象變換到頻域后所做的特征臉分析方法就是頻域特征臉方法,此時(shí)的特征即為頻域特征,如小波特征就是頻域特征。雖然形狀特征是3個(gè)特征中最具體形象的特征,但是它也和灰度特征一樣受到光照、角度和表情的影響,而頻域特征雖然相對(duì)較穩(wěn)定,但作為低層次特征,不易直接用于匹配和識(shí)別,因此對(duì)它進(jìn)行進(jìn)一步的解釋是目前需要解決的問題。
在彈性匹配中,若對(duì)每個(gè)節(jié)點(diǎn)運(yùn)用KL變換,則能夠減少特征數(shù),而不降低識(shí)別率。其特征后處理的一個(gè)重要方面是特征的選擇,也就是需選出最活躍的識(shí)別特征,去除對(duì)識(shí)別不重要的信息。在人臉識(shí)別的特征選擇中,生物心理學(xué)家首先研究了人臉各部件對(duì)識(shí)別的重要性,接著文獻(xiàn)[2]從模式識(shí)別的角度出發(fā),結(jié)合人臉各部件信息,并運(yùn)用最大后驗(yàn)概率,對(duì)表達(dá)人臉的低層次特征進(jìn)行了篩選,從而減少了人臉信息的存儲(chǔ)量,并改善了識(shí)別的效果。
2.2人臉的定位問題
雖然人臉定位問題是人臉識(shí)別的第一步,但在前面介紹各種人臉識(shí)別方法的時(shí)候,并沒有介紹具體的定位問題。事實(shí)上,對(duì)大多數(shù)方法而言,人臉的定位過程也就是人臉識(shí)別特征的生成過程,而且定位算法也是和識(shí)別算法密切相關(guān)的。為了說明這一點(diǎn),下面給出一些人臉識(shí)別所采用的定位方法:
方法1特征臉的方法也可用于定位人臉,這是因?yàn)槿四樐J皆谔卣髂樧涌臻g的投影系數(shù)基本相似,若先將子圖在特征臉空間投影后重建,然后比較原圖和重建圖,就能夠說明原圖是否是人臉,這是因?yàn)樘卣髂樋臻g能反映人臉的分布,而對(duì)于非人臉則沒有很好的表達(dá)力,因此重建圖和原圖的差異會(huì)較大。
Method 2: the earliest template matching computed the similarity between the face image and a template face image directly, the best match giving the position of the face in the image. Elastic graph matching also uses a form of template matching, but what participates in the matching is the 2D (or 3D) graph expressed with wavelet features: sliding the template graph over the graph generated from the whole image (strictly or elastically), the best match gives the position of the face. Reference [2], for example, uses its multiresolution 3D model and localizes starting from the lowest resolution, increasing the resolution step by step until the position no longer changes, on the consideration that the resolution needed for localization can be far lower than that needed for recognition.
Method 3: locating the face simultaneously locates the specific parts. While the basic principles of Methods 1 and 2 do not require locating the facial parts, methods that rely on part analysis for recognition [6] usually apply prior knowledge (the shape of the eyes' projection histogram, the proportions of the facial layout, and so on) to give a rough face position first and then locate each facial part precisely. Part localization typically uses projection methods, the Hough transform, and matching by constructing model energy functions.
2.3 Comparison of recognition performance
Because the face databases used differ, the relative merits of different recognition algorithms are not directly comparable, and the discussion above has compared them theoretically as far as possible. According to the FERET test conducted by Moghaddam et al. in 1996 [19], the Bayesian eigenface method that distinguishes intrapersonal from extrapersonal variation performed best: among 5,000 probe images, the first-candidate recognition rate was 89.5%, while the flexible model separating gray level and shape achieved 92% on 300 images. According to the tests in [15], on a combined library of 2,000 face images, elastic graph matching with wavelet features achieved a 93% recognition rate, while PCA reached only 66%.
3結(jié)論
人臉識(shí)別是一個(gè)跨學(xué)科富挑戰(zhàn)性的前沿課題,但目前人臉識(shí)別還只是研究課題,尚不是實(shí)用化領(lǐng)域的活躍課題。人臉識(shí)別難度較大,主要難在人臉都是有各種變化的相似剛體,由于人臉部件不僅存在各種變形,而且和皮膚之間是平緩過渡,因此人臉是不能用經(jīng)典的幾何模型來進(jìn)行識(shí)別分類的典型例子。如今人臉識(shí)別研究人員已經(jīng)慢慢地將研究重點(diǎn)從傳統(tǒng)的點(diǎn)和曲線的分析方法,過渡到用新的人臉模型來表達(dá)和識(shí)別人臉,其中彈性圖匹配就是較成功的嘗試。雖然人臉識(shí)別算法的開發(fā)需要工程人員的努力,但也和解剖學(xué)、生理學(xué)等的研究密切相關(guān)。從目前的研究成果來看,就二維圖象而言,成功的人臉識(shí)別至少需要考慮以下幾個(gè)方面:(1)由于外部干擾不可避免,預(yù)處理的效果將會(huì)影響到識(shí)別結(jié)果,好的人臉模型應(yīng)能夠在識(shí)別的同時(shí),抑制分離外在干擾的影響;(2)細(xì)節(jié)是區(qū)分不同人臉的關(guān)鍵,因此很多識(shí)別方法都十分注重細(xì)節(jié),如彈性圖匹配中的局部細(xì)節(jié),就是通過節(jié)點(diǎn)的小波變換特征加以表達(dá),而在灰度形狀分離的可變形模型中,局部灰度投影分布也描述了人臉細(xì)節(jié),另外,傳統(tǒng)的點(diǎn)和曲線的方法更是直接從局部細(xì)節(jié)入手,可是特征臉方法則缺少對(duì)細(xì)節(jié)的考慮,故需和別的方法相結(jié)合,才能取得好的識(shí)別效果;(3)在匹配的時(shí)候,不僅要考慮各種因素所導(dǎo)致的人臉微小變形,而且在容忍變形的同時(shí),還不能損害到人臉識(shí)別的有效性,如彈性圖匹配的方法不論從特征的選擇上,還是從匹配的方法上都力圖遵循這一原則。由此可見,人臉變形在人臉識(shí)別中具有重要意義,因?yàn)槿四権S富的變形就是導(dǎo)致傳統(tǒng)的點(diǎn)線分析方法失敗的原因;(4)對(duì)于表達(dá)人臉的各種特征需要進(jìn)行比較和選擇,以找出人臉最活躍的特征。這可以通過如下兩種途徑:一是比較同一個(gè)人的多張圖片,以得到穩(wěn)定的特征;另一種方法就是比較不同人的圖片,以得出該人最“與眾不同”之處[2]。
此外,實(shí)用的識(shí)別系統(tǒng)還必須考慮計(jì)算復(fù)雜度,現(xiàn)有的識(shí)別方法中,通過從人臉圖中提取出特征串,來對(duì)數(shù)據(jù)庫進(jìn)行檢索的方法速度快,而利用拓?fù)鋵傩詧D匹配來確定匹配度的方法則相對(duì)慢,而且隨數(shù)據(jù)庫增加,前者的識(shí)別率要比后者下降得快,因此改進(jìn)的思路是將兩者相結(jié)合,首先用快速的特征串匹配,來縮小檢索范圍,再進(jìn)行拓?fù)鋱D慢匹配,此外,用減小拓?fù)鋱D存儲(chǔ)量的方法也能夠加快匹配速度,但這需要提取有效特征和去掉冗余信息。
本文介紹和分析的各種人臉識(shí)別方法同樣可用于攝像機(jī)輸入人臉的識(shí)別,而對(duì)于攝像機(jī)圖象而言,人臉的定位和表情的分析還可以利用序列圖象之間的相關(guān)性信息,如從攝像機(jī)輸入動(dòng)態(tài)圖可以進(jìn)行二維及三維的運(yùn)動(dòng)估計(jì),從而建立三維的人臉模型。 由于從攝像機(jī)動(dòng)態(tài)輸入圖中得到的信息很多,故還有可能進(jìn)行有效的表情分析,以作為身份辨認(rèn)的輔助手段。本文只是對(duì)目前應(yīng)用于人臉識(shí)別的技術(shù)作了選擇性的介紹,也是對(duì)文獻(xiàn)[3]、[15]的一點(diǎn)補(bǔ)充。由于人臉識(shí)別的理論還不完善,具體算法的實(shí)現(xiàn)也有很多的因素待研究,因此計(jì)算機(jī)人臉識(shí)別的實(shí)用化還需要眾多研究人員的不懈努力。
·高被引論文摘要·
被引頻次:673
人臉識(shí)別技術(shù)綜述
張翠平,蘇光大
首先對(duì)計(jì)算機(jī)人臉自動(dòng)識(shí)別技術(shù)的研究背景及發(fā)展歷程做了簡(jiǎn)單回顧,然后對(duì)人臉正面像的識(shí)別方法,按照識(shí)別特征的不同進(jìn)行了分類綜述,主要介紹了特征臉(Eigenface)方法、基于小波特征的彈性匹配(ElasticMatching)的方法、形狀和灰度模型分離的可變形模型(Flexible Model)以及傳統(tǒng)的部件建模等分析方法。通過對(duì)各種識(shí)別方法的分析與比較,總結(jié)了影響人臉識(shí)別技術(shù)實(shí)用化的幾個(gè)因素,并提出了研究和開發(fā)成功的人臉識(shí)別技術(shù)所需要考慮的幾個(gè)重要方面,進(jìn)而展望了人臉識(shí)別技術(shù)今后的發(fā)展方向。
人臉識(shí)別;特征臉;小波特征;形狀無關(guān)模型
來源出版物:中國(guó)圖象圖形學(xué)報(bào),2000,5(11):885-894
被引頻次:636
人臉自動(dòng)識(shí)別方法綜述
周杰,盧春雨,張長(zhǎng)水,等
摘要:人臉自動(dòng)識(shí)別是模式識(shí)別、圖像處理等學(xué)科的一大研究熱點(diǎn),近幾年來關(guān)于人臉識(shí)別的研究取得了很大進(jìn)展。本文重點(diǎn)對(duì)近三、四年來人臉識(shí)別的研究進(jìn)行綜述并對(duì)各種方法加以評(píng)論。
關(guān)鍵詞:人臉自動(dòng)識(shí)別;人臉檢測(cè);人臉定位
來源出版物:電子學(xué)報(bào),2000,28(4):102-106
被引頻次:466
生物特征識(shí)別技術(shù)綜述
孫冬梅,裘正定
摘要:生物特征識(shí)別技術(shù)作為一種身份識(shí)別的手段,具有獨(dú)特的優(yōu)勢(shì),近年來已逐漸成為國(guó)際上的研究熱點(diǎn)。本文綜述了各種生物特征識(shí)別技術(shù)的基本原理和一些關(guān)鍵技術(shù),對(duì)每種生物特征的優(yōu)勢(shì)和不足進(jìn)行了分析,并對(duì)生物特征識(shí)別技術(shù)中存在的問題和未來的研究方向進(jìn)行了討論。
關(guān)鍵詞:生物特征識(shí)別;身份識(shí)別;身份認(rèn)證;人臉識(shí)別 指紋識(shí)別;虹膜識(shí)別;手形識(shí)別;掌紋識(shí)別;簽名識(shí)別;說話人識(shí)別
來源出版物:電子學(xué)報(bào),2001,29:1744-1748
被引頻次:368
人臉識(shí)別理論研究進(jìn)展
周激流,張曄
摘要:綜述了人臉識(shí)別理論的概念和研究現(xiàn)狀,討論了其中的關(guān)鍵技術(shù)和難點(diǎn)以及應(yīng)用和發(fā)展前景,最后對(duì)人臉識(shí)別研究中應(yīng)注意的問題提出了我們的看法。
關(guān)鍵詞:人臉識(shí)別;面部特征提?。槐砬?姿態(tài)分析
來源出版物:計(jì)算機(jī)輔助設(shè)計(jì)與圖形學(xué)學(xué)報(bào),1999,11(2):180-184
被引頻次:300
綜述人臉識(shí)別中的子空間方法
劉青山,盧漢清,馬頌德
摘要:如何描述每個(gè)個(gè)體人臉的特征,使之區(qū)別于其他個(gè)體,是人臉識(shí)別研究中的關(guān)鍵問題之一。近年來提出了大量的方法,其中隨著主元分析在人臉識(shí)別中的成功應(yīng)用之后,子空間分析因其具有描述性強(qiáng)、計(jì)算代價(jià)小、易實(shí)現(xiàn)及可分性好的特點(diǎn),受到了廣泛的關(guān)注。文中結(jié)合近年來已發(fā)表的文獻(xiàn),按照線性和非線性的劃分,對(duì)子空間分析在人臉識(shí)別中的應(yīng)用作一回顧、比較和總結(jié),以供其他人參考。
關(guān)鍵詞:主元分析;子空間分析;人臉識(shí)別
來源出版物:自動(dòng)化學(xué)報(bào),2003,29(6):900-911
被引頻次:154
基于奇異值特征和統(tǒng)計(jì)模型的人像識(shí)別算法
洪子泉,楊靜宇
摘要:人像識(shí)別是模式識(shí)別領(lǐng)域中的一個(gè)前沿課題。目前多數(shù)研究者采用人臉的一維和二維幾何特征來完成識(shí)別任務(wù)。人臉的幾何特征抽取以及這些特征的有效性都面臨著很多問題,至今人像識(shí)別的研究仍然處于較低的水平。作者證明了圖象矩陣的奇異值特征矢量具備了代數(shù)上和幾何上的不變性以及穩(wěn)定性,提出用它作為識(shí)別人臉的代數(shù)特征。本文的人像識(shí)別算法是基于奇異值特征矢量建立Sammon最佳鑒別平面上的正態(tài)Bayes分類模型。在本文的實(shí)驗(yàn)中,我們用9張人像照片建立的統(tǒng)計(jì)模型能完全正確地識(shí)到這9張照片。對(duì)同一個(gè)人的不同歷史時(shí)期的照片,本文也給出識(shí)別實(shí)驗(yàn)結(jié)果。
關(guān)鍵詞:人像識(shí)別;奇異值特征;圖象識(shí)別;代數(shù)特征抽??;鑒別向量;維數(shù)壓縮
來源出版物:計(jì)算機(jī)研究與發(fā)展,1994,31(3):60-65
被引頻次:140
基于奇異值分解和判別式KL投影的人臉識(shí)別
周德龍,高文,趙德斌
摘要:臉識(shí)別是計(jì)算機(jī)視覺和模式識(shí)別領(lǐng)域的一個(gè)活躍課題,有著十分廣泛的應(yīng)用前景。提出了一種新的彩色人臉識(shí)別方法。該算法采用模擬K-L變換、奇異值分解、主分量分析和Fisher線性判別分析技術(shù)來提取最終特征,可以使分類器的設(shè)計(jì)更加簡(jiǎn)潔、有效,使用較少的特征向量數(shù)目就能取得較高的識(shí)別率。仿真結(jié)果表明了該方法的有效性。
關(guān)鍵詞:人臉識(shí)別;特征提取;K-L變換;奇異值特征向量;主分量分析;Fisher線性判別分析
來源出版物:軟件學(xué)報(bào),2003,14(4):783-789
被引頻次:135
人臉識(shí)別系統(tǒng)中的特征提取
李華勝,楊樺,袁保宗
摘要:研究了人臉識(shí)別系統(tǒng)中正面人臉的特征提取。通過區(qū)域增長(zhǎng)從人臉圖像中分割出人臉,再利用邊緣檢測(cè)、Hough變換、模板匹配和方差投影技術(shù)可以快速有效地提取出人臉面部器官眼睛、鼻子和嘴巴特征。實(shí)驗(yàn)結(jié)果表明本文所采用的方法具有較高的準(zhǔn)確率和光照魯棒性。
關(guān)鍵詞:人臉識(shí)別;Hough變換;模板匹配;方差投影
來源出版物:北京交通大學(xué)學(xué)報(bào),2001,25(2):18-21
被引頻次:123
人臉識(shí)別研究綜述
李武軍,王崇駿,張煒,等
摘要:人臉識(shí)別已成為多個(gè)學(xué)科領(lǐng)域的研究熱點(diǎn)之一。本文對(duì)人臉識(shí)別的發(fā)展歷史、研究現(xiàn)狀進(jìn)行了綜述,系統(tǒng)地對(duì)目前主流人臉識(shí)別方法進(jìn)行了分類。針對(duì)人臉識(shí)別面臨的挑戰(zhàn),著重對(duì)近幾年來在光照和姿態(tài)變化處理方面的研究進(jìn)展進(jìn)行了詳細(xì)論述,并對(duì)未來人臉識(shí)別的發(fā)展方向進(jìn)行了展望。
關(guān)鍵詞:人臉識(shí)別;人臉檢測(cè);模式識(shí)別
來源出版物:模式識(shí)別與人工智能,2006,19(1):58-66
被引頻次:120
一種基于奇異值特征的神經(jīng)網(wǎng)絡(luò)人臉識(shí)別新途徑
甘俊英,張有為
摘要:本文在ZHong等人使用的奇異值分解(SVD)基礎(chǔ)上,將人臉圖像矩陣的奇異值作為識(shí)別特征,解決了奇異值處理、神經(jīng)網(wǎng)絡(luò)訓(xùn)練策略和競(jìng)爭(zhēng)選擇問題;運(yùn)用BP網(wǎng)絡(luò)進(jìn)行識(shí)別,提出了一種基于奇異值特征的神經(jīng)網(wǎng)絡(luò)人臉識(shí)別新方法。基于ORL人臉數(shù)據(jù)庫的多次反復(fù)實(shí)驗(yàn)結(jié)果表明,在大樣本情況下,識(shí)別方法具有實(shí)現(xiàn)簡(jiǎn)單、識(shí)別速度快、識(shí)別率高的特點(diǎn),為人臉的實(shí)時(shí)識(shí)別提供了一種新途徑。
關(guān)鍵詞:人臉識(shí)別;奇異值特征;神經(jīng)網(wǎng)絡(luò);模式識(shí)別
來源出版物:電子學(xué)報(bào),2004,32(1):170-173
被引頻次:4503
Eigenfaces for recognition
Turk,M; Pentland,A
Abstract: see the "經(jīng)典文獻(xiàn)推薦" (Classic Literature Recommendations) column
Citations: 3890
Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection
Belhumeur,PN; Hespanha,JP; Kriegman,DJ
Abstract: see the "經(jīng)典文獻(xiàn)推薦" (Classic Literature Recommendations) column
Citations: 1936
Shape matching and object recognition using shape contexts
Belongie,S; Malik,J; Puzicha,J
Abstract: We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework,the measurement of similarity is preceded by 1)solving for correspondences between points on the two shapes,2)using the correspondences to estimate an aligning transform. In order to solve the correspondence problem,we attach a descriptor,the shape context,to each point. The shape context at a reference point captures the distribution of the remaining points relative to it,thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape contexts,enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences,we estimate the transformation that best aligns the two shapes;regularized thin-plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points,together with a term measuring the magnitude of the aligning transform. We treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. Results are presented for silhouettes,trademarks,handwritten digits,and the COIL data set.
Keywords: shape; object recognition; digit recognition; correspondence problem; MPEG7; image registration; deformable templates
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,2002,24(4):509-522
Citations: 1614
Robust face recognition via sparse representation
Wright,J; Yang,AY; Ganesh,A; et al.
Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination,as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by l(1)-minimization,we propose a general classification algorithm for(image-based)object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction,we show that if sparsity in the recognition problem is properly harnessed,the choice of features is no longer critical. What is critical,however,is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as Eigenfaces and Laplacianfaces,as long as the dimension of the feature space surpasses certain threshold,predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard(pixel)basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.
Keywords: face recognition; feature extraction; occlusion and corruption; sparse representation; compressed sensing; l(1)-minimization;validation and outlier rejection
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,2009,31(2):210-227
Citations: 1271
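The SRC pipeline summarized above (code the probe over a dictionary of all training samples via l(1)-minimization, then assign the class whose coefficients give the smallest reconstruction residual) can be sketched with a basic iterative soft-thresholding (ISTA) solver. This is a toy reconstruction under assumed parameters (`lam`, iteration count), not the authors' solver:

```python
import numpy as np

def ista(A, b, lam=0.01, iters=500):
    """Solve min_x 0.5*||Ax - b||^2 + lam*||x||_1 by iterative
    soft-thresholding, a basic l1-minimization scheme."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x - A.T @ (A @ x - b) / L      # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrinkage
    return x

def src_classify(train, labels, probe, lam=0.01):
    """Sparse-representation classification: sparse-code the probe over
    the column-normalized training dictionary, then pick the class whose
    restricted coefficients yield the smallest residual."""
    A = train / np.linalg.norm(train, axis=0)
    x = ista(A, probe, lam)
    best, best_r = None, np.inf
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        r = np.linalg.norm(probe - A @ np.where(mask, x, 0.0))
        if r < best_r:
            best, best_r = c, r
    return best
```

On well-separated synthetic classes this recovers the correct label; the paper's point is that with enough features the particular feature choice matters little once the sparse code is computed correctly.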
From few to many: Illumination cone models for face recognition under variable lighting and pose
Georghiades,AS; Belhumeur,PN; Kriegman,DJ
Abstract: We present a generative appearance-based method for recognizing human faces under variation in lighting and viewpoint. Our method exploits the fact that the set of images of an object in fixed pose,but under all possible illumination conditions,is a convex cone in the space of images. Using a small number of training images of each face taken with different lighting directions,the shape and albedo of the face can be reconstructed. In turn,this reconstruction serves as a generative model that can be used to render-or synthesize-images of the face under novel poses and illumination conditions. The pose space is then sampled and,for each pose,the corresponding illumination cone is approximated by a low-dimensional linear subspace whose basis vectors are estimated using the generative model. Our recognition algorithm assigns to a test image the identity of the closest approximated illumination cone(based on Euclidean distance within the image space). We test our face recognition method on 4050 images from the Yale Face Database B; these images contain 405 viewing conditions(9 poses x 45 illumination conditions)for 10 individuals. The method performs almost without error,except on the most extreme lighting directions,and significantly outperforms popular recognition methods that do not use a generative model.
Keywords: face recognition; image-based rendering; appearance-based vision; face modeling; illumination and pose modeling; lighting;illumination cones; generative models
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,2001,23(6):643-660
Citations: 1248
Face recognition by elastic bunch graph matching
Wiskott,L; Fellous,JM; Kruger,N; et al.
Abstract: We present a system for recognizing human faces from single images out of a large database containing one image per person. Faces are represented by labeled graphs,based on a Gabor wavelet transform. Image graphs of new faces are extracted by an elastic graph matching process and can be compared by a simple similarity function. The system differs from the preceding one in three respects. Phase information is used for accurate node positioning. Object-adapted graphs are used to handle large rotations in depth. Image graph extraction is based on a novel data structure,the bunch graph,which is constructed from a small set of sample image graphs.
Keywords: face recognition; different poses; Gabor wavelets; elastic graph matching; bunch graph; ARPA/ARL FERET database; Bochum database
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,1997,19(7):775-779
Citations: 1147
Face recognition using Laplacianfaces
He,XF; Yan,SC; Hu,YX; et al.
Abstract: We propose an appearance-based face recognition method called the Laplacianface approach. By using Locality Preserving Projections(LPP),the face images are mapped into a face subspace for analysis. Different from Principal Component Analysis(PCA)and Linear Discriminant Analysis(LDA)which effectively see only the Euclidean structure of face space,LPP finds an embedding that preserves local information,and obtains a face subspace that best detects the essential face manifold structure. The Laplacianfaces are the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the face manifold. In this way,the unwanted variations resulting from changes in lighting,facial expression,and pose may be eliminated or reduced. Theoretical analysis shows that PCA,LDA,and LPP can be obtained from different graph models. We compare the proposed Laplacianface approach with Eigenface and Fisherface methods on three different face data sets. Experimental results suggest that the proposed Laplacianface approach provides a better representation and achieves lower error rates in face recognition.
Keywords: face recognition; principal component analysis; linear discriminant analysis; locality preserving projections; face manifold;subspace learning
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,2005,27(3):328-340
Citations: 958
Face recognition - features versus templates
BRUNELLI,R; POGGIO,T
Abstract: Over the last 20 years,several different techniques have been proposed for computer recognition of human faces. The purpose of this paper is to compare two simple but general strategies on a common database(frontal images of faces of 47 people: 26 males and 21 females,four images per person). We have developed and implemented two new algorithms; the first one is based on the computation of a set of geometrical features,such as nose width and length,mouth position,and chin shape,and the second one is based on almost-grey-level template matching. The results obtained on the testing sets(about 90% correct recognition using geometrical features and perfect recognition using template matching)favor our implementation of the template-matching approach.
Keywords: classification; face recognition; karhunen-loeve expansion; template matching
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,1993,15(10):1042-1052
Citations: 953
Two-dimensional PCA: A new approach to appearance-based face representation and recognition
Yang,J; Zhang,D; Frangi,AF ; et al.
Abstract: In this paper,a new technique coined two-dimensional principal component analysis(2DPCA)is developed for image representation. As opposed to PCA,2DPCA is based on 2D image matrices rather than 1D vectors,so the image matrix does not need to be transformed into a vector prior to feature extraction. Instead,an image covariance matrix is constructed directly using the original image matrices,and its eigenvectors are derived for image feature extraction. To test 2DPCA and evaluate its performance,a series of experiments were performed on three face image databases: ORL,AR,and Yale face databases. The recognition rate across all trials was higher using 2DPCA than PCA. The experimental results also indicated that the extraction of image features is computationally more efficient using 2DPCA than PCA.
Keywords: principal component analysis(PCA); eigenfaces; feature extraction; image representation; face recognition
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,2004,26(1):131-137
Citations: 953
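The core of 2DPCA as described above (an image covariance matrix built directly from the 2D image matrices, whose top eigenvectors serve as projection axes) fits in a few lines of NumPy. This is a sketch of the technique, not the authors' code; the number of axes `k` is an illustrative choice.

```python
import numpy as np

def twod_pca(images, k=4):
    """2DPCA: accumulate the image covariance matrix from the 2D image
    matrices themselves and return the top-k projection axes."""
    imgs = np.asarray(images, dtype=float)
    mean = imgs.mean(axis=0)
    G = sum((a - mean).T @ (a - mean) for a in imgs) / len(imgs)
    vals, vecs = np.linalg.eigh(G)        # eigenvalues in ascending order
    return vecs[:, ::-1][:, :k]           # top-k eigenvectors as columns

def project(image, axes):
    """Feature matrix Y = A X: one feature column per projection axis."""
    return np.asarray(image, dtype=float) @ axes
```

Note that `G` has the size of the image width only, so the eigen-decomposition is far cheaper than vectorized PCA on the full pixel covariance.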
PCA versus LDA
Martinez,AM; Kak,AC
Abstract: see the "經(jīng)典文獻(xiàn)推薦" (Classic Literature Recommendations) column
·Recommended Paper Abstracts·
Multi-pose face recognition based on improved ORB features
周凱汀,鄭力新
Abstract: To overcome the limitations of multi-pose face recognition based on global features and a single template sample per individual, a multi-pose face recognition method based on improved ORB local features and multiple template samples per individual is proposed. The sampling pattern of the ORB operator is first modified to improve robustness to changes in facial viewpoint, and a template library is built from several training samples per individual; improved ORB features are then extracted from and matched between the test sample and the template samples. In the feature extraction stage, a fixed number of keypoints is extracted from every sample to avoid interference from varying keypoint counts; in the matching stage, a dual model-based and orientation-based strategy removes mismatched point pairs, and the number and mean distance of matched pairs measure how well the test sample fits each template sample. Experiments on the CAS-PEAL-R1 and XJTU databases show that the improved ORB features recognize better; compared with building a single template per individual from multiple training samples, the method better resists pose interference for the same number of training samples and recognizes more accurately. Compared with the SIFT operator, the ORB operator is markedly faster in both feature extraction and feature matching.
Keywords: face recognition; multi-pose; multi-view; ORB; feature matching
Source: 計(jì)算機(jī)輔助設(shè)計(jì)與圖形學(xué)學(xué)報(bào),2015,27(2):287-295
Contact: 鄭力新,1050920138@qq.com
Face recognition via discriminant sparsity preserving embedding
馬小虎,譚延琪
Abstract: Given the rapid development of sparse representation (SR) in recent years for feature extraction and dimensionality reduction of high-dimensional data such as face images, the original sparsity preserving projection (SPP) algorithm is improved into an algorithm called discriminant sparsity preserving embedding (DSPE). The sparse weights in SPP are updated by solving a least-squares problem, yielding discriminant sparse weights that better reflect discriminative information; the low-dimensional feature subspace is then computed so as to optimally preserve these sparse weight relations. DSPE is a linear supervised learning algorithm that, by introducing discriminative information, reduces the dimensionality of high-dimensional data effectively. Extensive experiments on the ORL, Yale, Extended Yale B, and CMU PIE databases verify the effectiveness of the algorithm.
Keywords: face recognition; sparse representation; sparsity preserving projection; discriminant sparsity preserving embedding
Source: 自動(dòng)化學(xué)報(bào),2014,40(1):73-82
Contact: 馬小虎,xhma@suda.edu.cn
A PCA face recognition algorithm based on an improved BP neural network
李康順,李凱,張文生
Abstract: Face recognition, a hot research problem in pattern recognition, has attracted wide attention. The traditional BP algorithm offers self-learning, adaptivity, strong nonlinear mapping ability, and a clear advantage in face recognition accuracy, but it converges slowly, oscillates during training, and easily falls into local minima. To address these shortcomings, a PCA face recognition algorithm based on an improved BP neural network is proposed: PCA extracts the principal features of the image, and a new weight-update rule improves the BP algorithm used for classification. Simulations on images from the ORL face database show faster convergence and a higher recognition rate than the traditional algorithm.
Keywords: face recognition; principal component analysis; BP neural network; additional momentum; resilient gradient descent
Source: 計(jì)算機(jī)應(yīng)用與軟件,2014,(1):158-161
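The "additional momentum" named in the keywords is the classic heavy-ball modification of the BP weight update. A minimal sketch of that update rule follows; the learning rate and momentum coefficient are illustrative values, and this is not the paper's specific rule:

```python
import numpy as np

def momentum_step(w, grad, velocity, lr=0.1, mu=0.9):
    """One gradient step with additional momentum:
    v <- mu*v - lr*grad ; w <- w + v.
    The momentum term damps the oscillation and slow convergence of
    plain gradient descent that the abstract criticizes."""
    velocity = mu * velocity - lr * grad
    return w + velocity, velocity

# Minimize f(w) = 0.5*||w||^2 (gradient = w) from a fixed start point.
w = np.array([4.0, -2.0])
v = np.zeros_like(w)
for _ in range(400):
    w, v = momentum_step(w, w, v)
assert np.linalg.norm(w) < 1e-6
```

In a full BP network the same update is applied per weight matrix, with `grad` coming from backpropagated errors.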
Face recognition based on histograms of nonsubsampled Contourlet oriented gradients
奉俊鵬,楊恢先,蔡勇勇,等
Abstract: To address the limited accuracy of face recognition systems, a face recognition algorithm based on histograms of nonsubsampled Contourlet oriented gradients (HNOG) is proposed. The face image is first decomposed by the nonsubsampled Contourlet transform (NSCT); each coefficient matrix is partitioned into blocks, a histogram of oriented gradients (HOG) is computed for each block, and all block histograms are concatenated into the HNOG feature of the face image, which is then classified by a multi-channel nearest-neighbor classifier. Experimental results on the YALE, ORL, and CAS-PEAL-R1 face databases show that HNOG features are highly discriminative and low-dimensional, and are robust to variations in illumination, expression, and pose.
Keywords: nonsubsampled Contourlet transform; histogram of oriented gradients; face recognition; nearest-neighbor classifier
Source: 計(jì)算機(jī)應(yīng)用,2014,34(1):158-161
Contact: 楊恢先,yanghx@xtu.edu.cn
Relative gradient histogram feature description for face recognition
楊利平,辜小花
Abstract: Because patterns of oriented edge magnitudes (POEM) cannot obtain enough descriptive information under severe illumination change, this paper analyzes the properties of relative-gradient-magnitude images and proposes a relative gradient histogram descriptor. The relative-gradient-magnitude image is decomposed according to gradient orientation, filtered, encoded with local binary patterns, and reduced in dimension, yielding a low-dimensional histogram feature robust to illumination change, especially non-uniform illumination. Face recognition experiments on FERET and subsets of YaleB confirm that under mild illumination change the descriptor performs on par with POEM and significantly better than the classic local binary pattern feature, while under severe illumination change its recognition accuracy is at least 5% higher than POEM's, demonstrating its effectiveness and strong robustness to illumination variation.
Keywords: face recognition; relative gradient histogram; local binary patterns; feature description
Source: 光學(xué)精密工程,2014,22(1):152-159
Contact: 楊利平,yanglp@cqu.edu.cn
Face recognition based on a simplified pulse-coupled neural network
聶仁燦,姚紹文,周冬明
Abstract: A novel face recognition method based on the simplified pulse-coupled neural network (S-PCNN) is proposed. By analyzing neuron oscillation behavior, the oscillation time series (OTS) is first decomposed into a capture oscillation time series (C-OTS) and a self-excited oscillation time series (S-OTS). The discriminative properties of OTS, C-OTS, and S-OTS are then analyzed via geometric image transforms and oscillation-frequency maps. Finally, a face recognition architecture using C-OTS+S-OTS and the cosine distance measure is presented. Experimental results on face databases verify the effectiveness of the proposed method and show better recognition performance than other traditional algorithms.
Keywords: simplified pulse-coupled neural network; oscillation time series; face recognition
Source: 計(jì)算機(jī)科學(xué),2014,41(2):297-301
Contact: 聶仁燦,huomu_ren@163.com
Single-sample face recognition via sub-pattern Gabor feature fusion
王科俊,鄒國(guó)鋒
Abstract: Because traditional face recognition methods perform poorly with a single training sample, a sub-pattern Gabor feature fusion method is proposed for single-sample face recognition. Gabor transforms first extract local facial information; to exploit the spatial layout of facial organs, the Gabor face image is partitioned into blocks that form sub-patterns, each classified by a minimum-distance classifier. The sub-pattern classification results are then fused at the decision level to produce the final classification. Two sub-pattern Gabor feature fusion variants are proposed, differing in how sub-patterns are formed and how decisions are fused. Experiments and comparative analysis on the ORL and CAS-PEAL-R1 face databases show that the proposed methods effectively raise single-sample recognition accuracy and improve the performance of single-sample face recognition systems.
Keywords: single-sample face recognition; Gabor transform; local features; image sub-patterns; decision-level fusion; fuzzy synthesis
Source: 模式識(shí)別與人工智能,2013,26(1):50-56
Contact: 王科俊,15124551941@139.com
A joint sparse representation face recognition algorithm based on low-rank subspace recovery
胡正平,李靜
Abstract: To handle shadows, specularities, and occlusions that destroy the low-rank structure of face images, a joint sparse representation recognition algorithm based on low-rank subspace recovery is proposed. All training images of each individual are first stacked into a matrix D, which is decomposed into a low-rank matrix A and a sparse error matrix E: A represents the "clean" faces of the class and strictly follows the subspace structure, while E captures the error terms caused by shadows, specularities, and occlusions that break the low-rank structure. A training dictionary is then built from A and E, the test sample is represented as a joint sparse linear combination of the two, and residuals computed from the two sparse approximations are used for classification. Experiments show that the sparse representation recognition algorithm is effective and improves recognition accuracy.
Keywords: face recognition; sparse representation; joint sparsity; low-rank subspace recovery
Source: 電子學(xué)報(bào),2013,41(5):987-991
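The D = A + E split in the abstract is the robust-PCA model. A minimal proximal sketch using singular-value thresholding for the low-rank part and soft shrinkage for the sparse part is shown below; the step sizes are heuristic assumptions, and this is not the paper's algorithm:

```python
import numpy as np

def shrink(M, tau):
    """Elementwise soft shrinkage (proximal map of the l1 norm)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svt(M, tau):
    """Singular-value thresholding (proximal map of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def lowrank_sparse(D, lam=None, mu=None, iters=100):
    """Split D into low-rank A plus sparse E by alternating the two
    proximal maps (a simple scheme for the robust-PCA objective)."""
    m, n = D.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))   # standard RPCA weight
    mu = mu or 0.25 * np.abs(D).mean()      # heuristic step size
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(iters):
        A = svt(D - E, mu)
        E = shrink(D - A, lam * mu)
    return A, E
```

Practical systems typically use the inexact augmented Lagrangian RPCA solver instead; this alternating sketch only illustrates the decomposition.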
Face recognition based on low-resolution local binary patterns
戴金波,肖霄,趙宏偉
Abstract: To improve the accuracy of face recognition, a method based on low-resolution local binary patterns is proposed. The original face image is filtered and downsampled into a low-resolution image, which is divided into rectangular blocks; local binary patterns are computed for each block, the histogram of each block's LBP map is collected, and the histograms are concatenated into the final feature vector of the image. Experiments show that the algorithm achieves better recognition on both ORL and YALE and is robust to changes in illumination, expression, and pose.
Keywords: computer application; local binary patterns; low resolution; feature extraction; face recognition
Source: 吉林大學(xué)學(xué)報(bào):工學(xué)版,2013,43(2):435-438
Contact: 趙宏偉,zhaohw@jlu.edu.cn
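The block-histogram LBP feature described above can be sketched directly in NumPy. This is a generic basic-LBP reconstruction, not the paper's code; the grid size and the 256-bin histogram are the standard choices assumed here:

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbor LBP: threshold each neighbor against the center
    pixel and pack the 8 comparison bits into one code per pixel."""
    img = np.asarray(img, dtype=float)
    c = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

def lbp_histogram_feature(img, grid=(2, 2)):
    """Concatenate per-block 256-bin LBP histograms into one vector."""
    code = lbp_image(img)
    gh, gw = grid
    bh, bw = code.shape[0] // gh, code.shape[1] // gw
    feats = [np.bincount(code[i * bh:(i + 1) * bh,
                              j * bw:(j + 1) * bw].ravel(),
                         minlength=256)
             for i in range(gh) for j in range(gw)]
    return np.concatenate(feats)
```

In the paper's variant, the same histogram feature is simply computed on the filtered, downsampled image rather than the original.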
A face recognition method for variable illumination
李昕昕,陳丹,許鳳嬌
Abstract: The traditional Retinex algorithm struggles to remove shadows under heavy side lighting, so a logarithmic conduction function is proposed that achieves good illumination compensation. To improve the recognition rate, face recognition is treated as a typical pattern classification problem, and a support vector machine (SVM) method based on local binary pattern (LBP) features is proposed; the one-versus-one scheme converts the multi-class problem into the two-class problems an SVM classifier can solve, enabling efficient face recognition. Simulations on CMU PIE, AR, CAS-PEAL, and a self-collected face database show that the method effectively removes illumination effects and outperforms traditional methods in recognition performance.
Keywords: face recognition; illumination; local binary patterns; support vector machine; Retinex
Source: 計(jì)算機(jī)應(yīng)用,2013,33(2):507-510
Contact: 李昕昕,xinxinli@foxmail.com
Facenet: A unified embedding for face recognition and clustering
Florian Schroff; Dmitry Kalenichenko; James Philbin
Abstract: Despite significant recent advances in the field of face recognition,implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system,called FaceNet,that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced,tasks such as face recognition,verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors. Our method uses a deep convolutional network trained to directly optimize the embedding itself,rather than an intermediate bottleneck layer as in previous deep learning approaches. To train,we use triplets of roughly aligned matching / non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128 bytes per face. On the widely used Labeled Faces in the Wild(LFW)dataset,our system achieves a new record accuracy of 99.63%. On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate in comparison to the best published result by 30% on both datasets.
Source: arXiv preprint arXiv:1503.03832,2015
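The triplet training objective described in the abstract is easy to state concretely: pull each anchor-positive distance below the anchor-negative distance by a margin, on L2-normalized embeddings. A minimal NumPy sketch (the margin value is the paper's default; everything else is illustrative):

```python
import numpy as np

def l2_normalize(x):
    """FaceNet constrains embeddings to the unit hypersphere."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """FaceNet-style triplet loss on embedding batches: penalize
    triplets where d(anchor, positive) + margin > d(anchor, negative)."""
    d_ap = np.sum((anchor - positive) ** 2, axis=-1)
    d_an = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_ap - d_an + margin, 0.0).mean()
```

The paper's online triplet mining selects, within each mini-batch, the hard triplets for which this hinge is active, since easy triplets contribute zero gradient.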
Face Search at Scale: 80 Million Gallery
Dayong Wang; Charles Otto; Anil K. Jain
Abstract:Due to the prevalence of social media websites,one challenge facing computer vision researchers is to devise methods to process and search for persons of interest among the billions of shared photos on these websites. Facebook revealed in a 2013 white paper that its users have uploaded more than 250 billion photos,and are uploading 350 million new photos each day. Due to this humongous amount of data,large-scale face search for mining web images is both important and challenging. Despite significant progress in face recognition,searching a large collection of unconstrained face images has not been adequately addressed. To address this challenge,we propose a face search system which combines a fast search procedure,coupled with a state-of-the-art commercial off the shelf(COTS)matcher,in a cascaded framework. Given a probe face,we first filter the large gallery of photos to find the top-k most similar faces using deep features generated from a convolutional neural network. The k candidates are re-ranked by combining similarities from deep features and the COTS matcher. We evaluate the proposed face search system on a gallery containing 80 million web-downloaded face images. Experimental results demonstrate that the deep features are competitive with state-of-the-art methods on unconstrained face recognition benchmarks(LFW and IJB-A). Further,the proposed face search system offers an excellent trade-off between accuracy and scalability on datasets consisting of millions of images. Additionally,in an experiment involving searching for face images of the Tsarnaev brothers,convicted of the Boston Marathon bombing,the proposed face search system could find the younger brother's(Dzhokhar Tsarnaev)photo at rank 1 in 1 second on a 5M gallery and at rank 8 in 7 seconds on an 80M gallery.
Source: arXiv preprint arXiv:1507.07242,2015
Non-rigid visible and infrared face registration via regularized Gaussian fields criterion
Ma,JY; Zhao,J; Ma,Y; et al.
Abstract: Registration of multi-sensor data(particularly visible color sensors and infrared sensors)is a prerequisite for multimodal image analysis such as image fusion. Typically,the relationships between image pairs are modeled by rigid or affine transformations. However,this cannot produce accurate alignments when the scenes are not planar,for example,face images. In this paper,we propose a regularized Gaussian fields criterion for non-rigid registration of visible and infrared face images. The key idea is to represent an image by its edge map and align the edge maps by a robust criterion with a non-rigid model. We model the transformation between images in a reproducing kernel Hilbert space and a sparse approximation is applied to the transformation to avoid high computational complexity. Moreover,a coarse-to-fine strategy by applying deterministic annealing is used to overcome local convergence problems. The qualitative and quantitative comparisons on two publicly available databases demonstrate that our method significantly outperforms the state-of-the-art method with an affine model. As a result,our method will be beneficial for fusion-based face recognition.
Keywords: registration; image fusion; infrared; non-rigid; face recognition; Gaussian fields
Source: Pattern Recognition,2015,48(3):772-784
Contact: Ma,JY; jiayima@whu.edu.cn
Fully automatic 3D facial expression recognition using polytypic multi-block local binary patterns
Li,XL; Ruan,QQ; Jin,Y; et al.
Abstract: 3D facial expression recognition has been greatly promoted for overcoming the inherent drawbacks of 2D facial expression recognition and has achieved superior recognition accuracy to the 2D. In this paper,a novel holistic,full-automatic approach for 3D facial expression recognition is proposed. First,3D face models are represented in 2D-image-like structure which makes it possible to take advantage of the wealth of 2D methods to analyze 3D models. Then an enhanced facial representation,namely polytypic multi-block local binary patterns(P-MLBP),is proposed. The P-MLBP involves both the feature-based irregular divisions to depict the facial expressions accurately and the fusion of depth and texture information of 3D models to enhance the facial feature. Based on the BU-3DFE database,three kinds of classifiers are employed to conduct 3D facial expression recognition for evaluation. Their experimental results outperform the state of the art and show the effectiveness of P-MLBP for 3D facial expression recognition. Therefore,the proposed strategy is validated for 3D facial expression recognition; and its simplicity opens a promising direction for fully automatic 3D facial expression recognition.
Keywords: 3D facial expression recognition; automatic data normalization; P-MLBP; feature-based irregular divisions; feature fusion
Source: Signal Processing,2015,108:297-308
Contact: Li,XL; 09112087@bjtu.edu.cn
UGC-JU face database and its benchmarking using linear regression classifier
Seal,A; Bhattacharjee,D; Nasipuri,M; et al.
Abstract: In this paper,a new face database is presented which will be freely available to academics and the research community for research purposes. The face database consists of both visual and thermal face images of 84 persons with varying poses,expressions and occlusions(39 different variations for each type,visual or thermal). A new thermal face image recognition technique based on Gappy Principal Component Analysis and a Linear Regression Classifier is also presented. The recognition performance of this technique on the thermal face images of this database is found to be 98.61%,which can be considered the initial benchmark recognition performance for this database.
Keywords: thermal face images; visual images; face database; GappyPCA; LRC classifier; decision level fusion
Source: Multimedia Tools and Applications,2015,74(9):2913-2937
Contact: Seal,A; ayanseal30@ieee.org
Learning face representation from scratch
Dong Yi; Zhen Lei; Shengcai Liao; Stan Z. Li
Abstract: Pushed by big data and deep convolutional neural networks(CNN),the performance of face recognition is becoming comparable to that of humans. Using private large-scale training datasets,several groups have achieved very high performance on LFW,i.e.,97% to 99%. While there are many open-source implementations of CNNs,no large-scale face dataset is publicly available. The current situation in the field of face recognition is that data is more important than algorithms. To solve this problem,this paper proposes a semi-automatic way to collect face images from the Internet and builds a large-scale dataset containing about 10000 subjects and 500000 images,called CASIA-WebFace. Based on this database,we use an 11-layer CNN to learn a discriminative representation and obtain state-of-the-art accuracy on LFW and YTF. The publication of CASIA-WebFace will attract more research groups to this field and accelerate the development of face recognition in the wild.
Source: arXiv preprint arXiv:1411.7923,2014
Joint sparse representation for robust multimodal biometrics recognition
Shekhar,S; Patel,VM; Nasrabadi,NM; et al.
Abstract: Traditional biometric recognition systems rely on a single biometric signature for authentication. While the advantage of using multiple sources of information for establishing the identity has been widely recognized,computational models for multimodal biometrics recognition have only recently received attention. We propose a multimodal sparse representation method,which represents the test data by a sparse linear combination of training data,while constraining the observations from different modalities of the test subject to share their sparse representations. Thus,we simultaneously take into account correlations as well as coupling information among biometric modalities. A multimodal quality measure is also proposed to weigh each modality as it gets fused. Furthermore,we also kernelize the algorithm to handle nonlinearity in data. The optimization problem is solved using an efficient alternating direction method. Various experiments show that the proposed method compares favorably with competing fusion-based methods.
Keywords: Multimodal biometrics; feature fusion; sparse representation
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,2014,36(1):113-126
Contact: Shekhar,S; sshekha@umiacs.umd.edu
Half-quadratic-based iterative minimization for robust sparse representation
He,R; Zheng,WS; Tan,TN; et al.
Abstract: Robust sparse representation has shown significant potential in solving challenging problems in computer vision such as biometrics and visual surveillance. Although several robust sparse models have been proposed and promising results have been obtained,they are either for error correction or for error detection,and learning a general framework that systematically unifies these two aspects and explores their relation is still an open problem. In this paper,we develop a half-quadratic(HQ)framework to solve the robust sparse representation problem. By defining different kinds of half-quadratic functions,the proposed HQ framework is applicable to performing both error correction and error detection. More specifically,by using the additive form of HQ,we propose an l(1)-regularized error correction method by iteratively recovering corrupted data from errors incurred by noises and outliers; by using the multiplicative form of HQ,we propose an l(1)-regularized error detection method by learning from uncorrupted data iteratively. We also show that the l(1)-regularization solved by the soft-thresholding function has a dual relationship to the Huber M-estimator,which theoretically guarantees the performance of robust sparse representation in terms of M-estimation. Experiments on robust face recognition under severe occlusion and corruption validate our framework and findings.
Keywords: l(1)-minimization; half-quadratic optimization; sparse representation; M-estimator; correntropy
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence,2014,36(2):261-275
Contact: He,R; rhe@nlpr.ia.ac.cn
Image Quality Assessment for Fake Biometric Detection: Application to Iris,F(xiàn)ingerprint,and Face Recognition
Galbally,Javier; Marcel,Sebastien; Fierrez,Julian
Abstract: To ensure the actual presence of a real legitimate trait in contrast to a fake self-manufactured synthetic or reconstructed sample is a significant problem in biometric authentication,which requires the development of new and efficient protection measures. In this paper,we present a novel software-based fake detection method that can be used in multiple biometric systems to detect different types of fraudulent access attempts. The objective of the proposed system is to enhance the security of biometric recognition frameworks,by adding liveness assessment in a fast,user-friendly,and non-intrusive manner,through the use of image quality assessment. The proposed approach presents a very low degree of complexity,which makes it suitable for real-time applications,using 25 general image quality features extracted from one image(i.e.,the same acquired for authentication purposes)to distinguish between legitimate and impostor samples. The experimental results,obtained on publicly available data sets of fingerprint,iris,and 2D face,show that the proposed method is highly competitive compared with other state-of-the-art approaches and that the analysis of the general image quality of real biometric samples reveals highly valuable information that may be very efficiently used to discriminate them from fake traits.
Keywords: image quality assessment; biometrics; security; attacks; countermeasures
Source: IEEE Transactions on Image Processing,2014,23(2):710-724
Contact: Galbally,Javier; javier.galbally@jrc.ec.europa.eu
Robust face recognition via occlusion dictionary learning
Ou,WH; You,XG; Tao,DC; et al.
Abstract: Sparse representation based classification(SRC)has recently been proposed for robust face recognition. To deal with occlusion,SRC introduces an identity matrix as an occlusion dictionary on the assumption that the occlusion has a sparse representation in this dictionary. However,the results show that SRC's use of this occlusion dictionary is not nearly as robust to large occlusion as it is to random pixel corruption. In addition,the identity matrix renders the expanded dictionary large,which results in expensive computation. In this paper,we present a novel method,namely structured sparse representation based classification(SSRC),for face recognition with occlusion. A novel structured dictionary learning method is proposed to learn an occlusion dictionary from the data instead of an identity matrix. Specifically,a mutual-incoherence-of-dictionaries regularization term is incorporated into the dictionary learning objective function,which encourages the occlusion dictionary to be as independent as possible of the training sample dictionary,so that the occlusion can be sparsely represented by a linear combination of atoms from the learned occlusion dictionary and effectively separated from the occluded face image. Classification can thus be efficiently carried out on the recovered non-occluded face images,and the size of the expanded dictionary is also much smaller than that used in SRC. Extensive experiments demonstrate that the proposed method achieves better results than existing sparse representation based face recognition methods,especially in dealing with large contiguous occlusion and severe illumination variation,while the computational cost is much lower.
Keywords: face recognition; occlusion dictionary learning; mutual incoherence; structured sparse representation
Source: Pattern Recognition,2014,47(4):1559-1572
Contact: You,XG; you1231cncn@gmail.com
Discriminative multimanifold analysis for face recognition from a single training sample per person
Lu,JW; Tan,YP; Wang,G
Abstract: Conventional appearance-based face recognition methods usually assume that there are multiple samples per person(MSPP)available for discriminative feature extraction during the training phase. In many practical face recognition applications such as law enforcement,e-passport,and ID card identification,this assumption,however,may not hold as there is only a single sample per person(SSPP)enrolled or recorded in these systems. Many popular face recognition methods fail to work well in this scenario because there are not enough samples for discriminant learning. To address this problem,we propose in this paper a novel discriminative multimanifold analysis(DMMA)method by learning discriminative features from image patches. First,we partition each enrolled face image into several nonoverlapping patches to form an image set for each sample per person. Then,we formulate the SSPP face recognition as a manifold-manifold matching problem and learn multiple DMMA feature spaces to maximize the manifold margins of different persons. Finally,we present a reconstruction-based manifold-manifold distance to identify the unlabeled subjects. Experimental results on three widely used face databases are presented to demonstrate the efficacy of the proposed approach.
Keywords: face recognition; manifold learning; subspace learning; single training sample per person
來源出版物:IEEE Transactions on Pattern Analysis and Machine Intelligence,2013,35(1): 39-51
聯(lián)系郵箱:Lu,JW; jiwen.lu@adsc.com.sg
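Two core DMMA steps, the nonoverlapping patch partition and a reconstruction-based set-to-set distance, can be mimicked in a few lines of numpy. The random "images", the patch size, and the plain least-squares reconstruction below are simplifications of the paper's learned manifold feature spaces, chosen only to show the mechanics.

```python
import numpy as np

def patches(img, ph, pw):
    """Partition an image into nonoverlapping ph x pw patches, one vector per patch."""
    H, W = img.shape
    return np.array([img[i:i + ph, j:j + pw].ravel()
                     for i in range(0, H - ph + 1, ph)
                     for j in range(0, W - pw + 1, pw)])

def set_to_set_distance(P, Q):
    """Reconstruction-based distance: approximate each patch in P by a
    least-squares combination of the patches in Q; average the residuals."""
    C, *_ = np.linalg.lstsq(Q.T, P.T, rcond=None)   # solve all patches at once
    R = P.T - Q.T @ C
    return float(np.mean(np.linalg.norm(R, axis=0)))

rng = np.random.default_rng(1)
img_a = rng.normal(size=(8, 8))
img_b = img_a + 0.01 * rng.normal(size=(8, 8))   # near-duplicate of img_a
img_c = rng.normal(size=(8, 8))                  # unrelated image

Pa, Pb, Pc = (patches(x, 4, 4) for x in (img_a, img_b, img_c))
print(set_to_set_distance(Pa, Pb) < set_to_set_distance(Pa, Pc))
```

The near-duplicate image set reconstructs the probe's patches far better than the unrelated set, which is the intuition behind identifying a subject by the smallest manifold-manifold distance.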
Fast l1-minimization algorithms for robust face recognition
Yang,AY; Zhou,ZH; Balasubramanian,AG; et al.
Abstract: l1-minimization refers to finding the minimum l1-norm solution to an underdetermined linear system b = Ax. Under certain conditions described in compressive sensing theory,the minimum l1-norm solution is also the sparsest solution. In this paper,we study the speed and scalability of l1-minimization algorithms. In particular,we focus on the numerical implementation of a sparsity-based classification framework in robust face recognition,where sparse representation is sought to recover human identities from high-dimensional facial images that may be corrupted by illumination,facial disguise,and pose variation. Although the underlying numerical problem is a linear program,traditional algorithms are known to suffer poor scalability for large-scale applications. We investigate a new solution based on a classical convex optimization framework,known as augmented Lagrangian methods. We conduct extensive experiments to validate and compare its performance against several popular l1-minimization solvers,including the interior-point method,Homotopy,FISTA,SESOPCD,approximate message passing,and TFOCS. To aid peer evaluation,the code for all the algorithms has been made publicly available.
Keywords: l1-minimization; augmented Lagrangian methods; face recognition
來源出版物:IEEE Transactions on Image Processing,2013,22(8): 3234-3246
聯(lián)系郵箱:Yang,AY; yang@eecs.berkeley.edu
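A minimal sketch of the augmented Lagrangian approach the abstract describes, applied to basis pursuit (min ||x||_1 subject to Ax = b): proximal-gradient inner iterations on the augmented Lagrangian, plus dual ascent on the multiplier. The problem sizes, the penalty mu, and the iteration counts are arbitrary toy choices, not the paper's tuned solver.

```python
import numpy as np

def alm_l1(A, b, mu=1.0, outer=30, inner=200):
    """Augmented Lagrangian sketch for min ||x||_1 s.t. A x = b.
    Inner loop: proximal-gradient (soft-thresholding) steps in x;
    outer loop: dual ascent on the multiplier lam."""
    m, n = A.shape
    x = np.zeros(n)
    lam = np.zeros(m)
    L = mu * np.linalg.norm(A, 2) ** 2     # Lipschitz constant of the smooth part
    for _ in range(outer):
        for _ in range(inner):
            grad = -A.T @ lam + mu * A.T @ (A @ x - b)
            g = x - grad / L
            x = np.sign(g) * np.maximum(np.abs(g) - 1.0 / L, 0.0)
        lam = lam + mu * (b - A @ x)       # dual update enforces the constraint
    return x

rng = np.random.default_rng(2)
A = rng.normal(size=(15, 40))
x_true = np.zeros(40)
x_true[[3, 17, 29]] = [1.5, -2.0, 1.0]     # 3-sparse ground-truth signal
b = A @ x_true

x_hat = alm_l1(A, b)
print(sorted(int(i) for i in np.argsort(np.abs(x_hat))[-3:]))
```

With 15 Gaussian measurements of a 3-sparse signal in 40 dimensions, the minimum-l1 solution typically coincides with the true sparse signal, so the three largest recovered coefficients land on the true support.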
Hybrid Deep Learning for Face Verification
Sun,Y; Wang,XG; Tang,XO
Abstract: This paper proposes a hybrid convolutional network(ConvNet)-Restricted Boltzmann Machine(RBM)model for face verification in wild conditions. A key contribution of this work is to directly learn relational visual features,which indicate identity similarities,from raw pixels of face pairs with a hybrid deep network. The deep ConvNets in our model mimic the primary visual cortex to jointly extract local relational visual features from the two face images being compared,using the learned filter pairs. These relational features are further processed through multiple layers to extract high-level and global features. Multiple groups of ConvNets are constructed in order to achieve robustness and characterize face similarities from different aspects. The top-layer RBM performs inference from complementary high-level features extracted from different ConvNet groups with a two-level average pooling hierarchy. The entire hybrid deep network is jointly fine-tuned to optimize for the task of face verification. Our model achieves competitive face verification performance on the LFW dataset.
來源出版物:Proceedings of the IEEE International Conference on Computer Vision(ICCV),2013: 1489-1496
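As a toy illustration (not the paper's architecture) of the idea of relational features from paired filters, one can apply a shared filter bank to both faces of a pair and combine the rectified responses before pooling; every size, filter, and combination rule here is a made-up assumption.

```python
import numpy as np

def conv2d_valid(img, k):
    """Minimal 'valid' 2-D correlation (no padding, stride 1)."""
    H, W = img.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(3)
face1, face2 = rng.normal(size=(2, 12, 12))   # the two faces being compared
filters = rng.normal(size=(4, 3, 3))          # one filter bank shared by the pair

# Relational feature maps: rectified responses of the two faces to the same
# filters, combined by subtraction and then globally max-pooled.
maps = [np.maximum(conv2d_valid(face1, f), 0) - np.maximum(conv2d_valid(face2, f), 0)
        for f in filters]
feature = np.array([m.max() for m in maps])   # crude global max pooling
print(feature.shape)
```

The point of combining responses across the pair (rather than describing each face separately) is that the resulting features directly encode similarity, which is what the verification task needs.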
編輯:王微
經(jīng)典文獻(xiàn)
文章題目 | 第一作者 | 來源出版物
1. Eigenfaces for Recognition | Turk,M | Journal of Cognitive Neuroscience,1991,3(1): 71-86
2. A comparative study of texture measures with classification based on feature distributions | Ojala,T | Pattern Recognition,1996,29(1): 51-59
3. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection | Belhumeur,PN | IEEE Transactions on Pattern Analysis and Machine Intelligence,1997,19(7): 711-720
4. Robust real-time face detection | Viola,P | International Journal of Computer Vision,2004,57(2): 137-154
5. PCA versus LDA | Martinez,AM | IEEE Transactions on Pattern Analysis and Machine Intelligence,2001,23(2): 228-233

Eigenfaces for Recognition
Turk,M; Pentland,A
Abstract: We have developed a near-real-time computer system that can locate and track a subject's head,and then recognize the person by comparing characteristics of the face to those of known individuals. The computational approach taken in this system is motivated by both physiology and information theory,as well as by the practical requirements of near-real-time performance and accuracy. Our approach treats the face recognition problem as an intrinsically two-dimensional(2-D)recognition problem rather than requiring recovery of three-dimensional geometry,taking advantage of the fact that faces are normally upright and thus may be described by a small set of 2-D characteristic views. The system functions by projecting face images onto a feature space that spans the significant variations among known face images. The significant features are known as "eigenfaces," because they are the eigenvectors(principal components)of the set of faces; they do not necessarily correspond to features such as eyes,ears,and noses. The projection operation characterizes an individual face by a weighted sum of the eigenface features,and so to recognize a particular face it is necessary only to compare these weights to those of known individuals. Some particular advantages of our approach are that it provides for the ability to learn and later recognize new faces in an unsupervised manner,and that it is easy to implement using a neural network architecture.
來源出版物:Journal of Cognitive Neuroscience,1991,3(1): 71-86
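The eigenface procedure in the abstract above can be sketched with plain SVD-based PCA on a toy gallery; the random vectors stand in for face images, and the gallery size and component count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy gallery: 6 "face images" of 64 pixels each, one image per row.
faces = rng.normal(size=(6, 64))

mean_face = faces.mean(axis=0)
X = faces - mean_face                        # center the data

# Eigenfaces are the principal components of the face set: the right
# singular vectors of the centered data matrix.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
eigenfaces = Vt[:4]                          # keep the top 4 components

# Each face is characterised by its weights in eigenface space ...
weights = X @ eigenfaces.T

# ... and a probe is recognised by the nearest weight vector.
probe = faces[2] + 0.05 * rng.normal(size=64)
w = (probe - mean_face) @ eigenfaces.T
print(int(np.argmin(np.linalg.norm(weights - w, axis=1))))
```

Note that recognition compares only the low-dimensional weight vectors, never the raw pixels, which is what makes the approach near-real-time on the hardware of the era.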