Puja S. Prasad, Adepu Sree Lakshmi, Sandeep Kautish, Simar Preet Singh, Rajesh Kumar Shrivastava, Abdulaziz S. Almazyad, Hossam M. Zawbaa and Ali Wagdy Mohamed
1 CSE Department, Geethanjali College of Engineering, Hyderabad, Telangana, 501301, India
2 Department of Computer Science, LBEF Campus, Kathmandu, 44600, Nepal
3 School of Computer Science Engineering and Technology (SCSET), Bennett University, Greater Noida, 201310, India
4 Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh, 11543, Saudi Arabia
5 CeADAR Ireland's Centre for AI, Technological University Dublin, Dublin, D07 EWV4, Ireland
6 Operations Research Department, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza, 12613, Egypt
7 Applied Science Research Center, Applied Science Private University, Amman, 11937, Jordan
ABSTRACT Pupil dynamics are an important characteristic for face spoofing detection. The face recognition system is one of the most widely used biometrics for authenticating individual identity. The main threats to a facial recognition system are presentation attacks such as print attacks, 3D mask attacks, and replay attacks. The proposed model uses pupil characteristics for liveness detection during the authentication process. The pupillary light reflex is an involuntary reaction that controls the pupil's diameter under different light intensities. The proposed framework consists of a two-phase methodology. In the first phase, the pupil's diameter is calculated by applying a stimulus (light) to one eye of the subject and measuring the constriction of the pupil in both eyes across different video frames. These measurements are converted into a feature space using the parameters defined by the Kohn and Clynes model. A Support Vector Machine classifies a subject as legitimate when the diameter change is normal (i.e., the eye is alive) or as illegitimate when there is no change, or abnormal oscillation of pupil behavior, due to a printed photograph, video, or 3D mask of the subject in front of the camera. In the second phase, we perform the facial recognition process. The scale-invariant feature transform (SIFT) extracts features from the facial images, each feature being a 128-dimensional vector. These features are scale, rotation, and orientation invariant and are used for recognizing facial images. A brute-force matching algorithm matches features between two different images, with a threshold of 0.08 for good matches. To analyze the performance of the framework, we tested our model on two face antispoofing datasets, the Replay-Attack and CASIA-SURF datasets, which were used because they contain videos of the subjects, each sample having three modalities (RGB, IR, Depth). The CASIA-SURF dataset showed an 89.9% Equal Error Rate, while the Replay-Attack dataset showed a 92.1% Equal Error Rate.
KEYWORDS SIFT; pupil; CASIA-SURF; pupillary light reflex; replay attack dataset; brute force
Biometric measures are used to secure the digital world. No two people share the same biometric print. The word biometric combines two roots, bio and metric, and the field deals with the physiological and behavioral characteristics of a person. Digitization demands a robust authentication system, because the growing use of digital devices makes data readily available in every corner of the world.
Due to increasing cybersecurity risks, decreasing hardware costs, and voluminous data, many organizations prefer biometric authentication systems because of their advantages over traditional authentication methods such as signatures or passwords. The goal of such a system is to ensure that the available applications are accessed only by a genuine user and not by others. Such services include computer systems, secure access to buildings, cell phones, smartphones, laptops, and ATMs. In the absence of a robust individual recognition system, these assets remain susceptible to the tricks of an impostor. This is why biometrics has been adopted in many applications. The physiological and behavioral characteristics used for biometric recognition include fingerprint, hand geometry, iris, retina, face, palmprint, ear, DNA, voice, gait, signature, keystroke dynamics, etc. Because biometrics involves pattern recognition, organs that yield stable patterns are most often used. However, significant issues are involved in designing and commissioning a practical biometric system. Table 1 lists the most commonly used biometrics; these identifiers are also called mature biometrics.
Table 1: Commonly used biometrics
The face is considered one of the most popular biometric modalities for authentication purposes. Its popularity has grown with the emergence of dedicated face recognition conferences, such as the International Conference on Audio- and Video-Based Biometric Person Authentication and the Automatic Face and Gesture Recognition Conference, and with systematic empirical evaluations of face recognition techniques, including those reported by Grother et al. 2019 [1] and Phillips et al. 2003 [2,3].
Facial Recognition Technology (FRT) is designed to authenticate a person using facial images without any physical contact. Designing FRT involves two major stages. In the first stage, the subject is enrolled: features of the facial image are extracted using a feature extraction or image processing algorithm, and a template is created. In the second stage, when the subject comes for authentication, the facial features are extracted and matched against the template using a pattern-matching algorithm. While taking or stealing someone's biometric traits is difficult, it is still possible for a fraudster to circumvent a biometric system using spoofed or artificial traits. A large number of studies have shown that it is quite possible to construct gummy fingers from lifted fingerprint impressions and use them to attack a biometric system. Behavioral biometrics like voice and signatures are more susceptible to such attacks than physiological traits. The facial recognition system is vulnerable to several presentation attacks (direct attacks or spoof attacks). A presentation attack uses fake faces or facial artifacts to gain unauthorized access through a facial recognition authentication system. A presentation attack may be dynamic or static in nature and may employ two-dimensional as well as three-dimensional artifacts.
In a two-dimensional static presentation attack, an attacker may use a photograph or a flat paper or plastic mask as an artifact. In a two-dimensional dynamic attack, the fraud perpetrator uses a video shown on a screen or several photographs presented one by one. Three-dimensional presentation attacks are likewise static or dynamic in nature: static attacks use a sculpture or 3D print, whereas attacks using robots or well-prepared makeup are dynamic.
Fig. 1 shows the anatomy of the human eye. The human eye consists of many parts that can be used in biometrics, such as the iris, pupil, retina, eye movement, etc. The cornea is the transparent, dome-shaped front part that is responsible for focusing light when it falls on the eye. The anterior chamber lies behind the cornea and is filled with a fluid called the aqueous humour. The iris is the colored part of the eye, located behind the anterior chamber, with a dark hole at its center called the pupil. When light falls on the eye, the iris muscles constrict and dilate the pupil to control the amount of light entering the eye. The pupillary light reflex is the involuntary reflex that controls the diameter of the pupil. The pupil is one of the most important parts of our vision system: it dilates (gets bigger) when the light is dim and constricts (gets smaller) when the light is brighter.
Figure 1: Different parts of the eye
Liveness can be verified by calculating the diameter of the pupil in different video frames of the subject and comparing the measurements. If the diameter differs between two or more frames, the pupillary light reflex is present; this variation is not possible with artifacts, as shown in Fig. 2. Because the pupillary light reflex causes the change in pupil diameter, it has become an active research area for detecting liveness in both face and iris biometric modalities.
Figure 2: Response of the pupil to light
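To make the comparison concrete, the check below is a minimal sketch, not the paper's implementation: given per-frame pupil diameters, it declares liveness only when the diameter varies by more than an assumed threshold.

```python
import numpy as np

def is_live(diameters_px, min_change_px=2.0):
    """Minimal liveness heuristic: a live pupil constricts/dilates under
    changing illumination, so its measured diameter should vary across
    frames; photographs, replayed static faces, and 3D masks yield a
    near-constant diameter. min_change_px is an assumed threshold."""
    d = np.asarray(diameters_px, dtype=float)
    d = d[~np.isnan(d)]                 # skip frames where detection failed
    return d.size >= 2 and (d.max() - d.min()) >= min_change_px
```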
The initial work related to facial recognition appeared in 1954, when Bruner and Tagiuri wrote a handbook of psychology chapter on the perception of people [4]. Some of the earliest related work is Darwin's study of the functionality of emotions in 1872; according to Darwin, facial expressions evolved as the result of certain kinds of emotions and serve an important communicative function [5]. Galton worked on facial profile-based biometrics in 1888.
Because facial recognition is among the most popular biometrics, it is also more vulnerable to direct and indirect attacks enabled by advancing technology. Table 2 summarizes important contributions in this area. Research has been ongoing over the last four decades to strengthen biometric authentication systems. Different biometric traits have their own pros and cons, so to make authentication systems more robust, two or more traits are now often combined; using more than one biometric is called multimodal biometrics. Combining the iris pattern with the pupil is also used for authentication purposes.
Table 2: Related work
The pupil diameter changes with the intensity of light, and this pupillary light reflex has been used as one trait for authenticating people's identity [19,20]. Biometric authentication techniques have also been used for identifying animals, such as sheep, using the retina [21]. As noted in the introduction, early studies of face perception and expression date back to Bruner and Tagiuri [22] and Darwin. Actual work on automatic facial recognition by machines started in the 1970s with Kelly, who for the first time performed complex processing tasks on pictures taken from a television camera [23]. Shallow methods of face detection and recognition do not use deep learning; instead, they extract features from an image using handcrafted image descriptors such as SIFT, MOPS, GLOH, and LBP [23-27] and combine these local descriptors using a pooling mechanism to generate overall face descriptors such as Fisher Vectors [28,29]. Face recognition and detection have also been approached with locality preserving projections (LPP) [30] and modular PCA [31], in which the face image is divided into sub-images and PCA is applied to each sub-image. Improved recognition rates have been found under different lighting directions, illumination, expressions, and poses across different datasets [32,33]. Some facial recognition techniques use HOG feature extraction and fast PCA to enhance the accuracy rate, with a Support Vector Machine used for recognizing faces [34]. The unified LDA/PCA algorithm combines LDA and PCA and reduces the drawbacks of LDA in facial recognition systems. The Learning-Based Descriptor [35] reduces the issues that occur during the matching and representation of facial images and is highly discriminative, compact, and easy to extract [36]. Content-based facial recognition techniques are also gaining popularity [37,38]. Table 3 summarizes some important methods used by early researchers.
Table 3: Existing FRT techniques
Our proposed algorithm consists of two main steps. In the first step, we find the pupil diameter to verify the liveness of the subject. After the subject is verified as live, the facial recognition algorithm runs to authenticate the person's identity; a high-level sketch follows the step list below.
Measuring minute fluctuations in pupil diameter in response to a stimulus is called pupillometry. The diameter measurement can be performed using digital image processing.
1. Capture facial images under different illumination conditions.
2. Extract the iris portion in order to calculate the diameter of the pupil using MATLAB image processing.
3. Find the diameter of the pupil in five different frames under different illumination conditions.
4. Compare the diameter of the pupil across the different frames.
5. Extract the facial feature vectors using SIFT and MOPS.
6. Train the model to find the embedding function of the images.
7. Train the proposed system to obtain the classifier model using the SIFT feature vectors.
8. Test the classifier by giving it a query image feature vector.
9. Calculate the similarity index between the query image and the dataset images using the classifier.
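The sketch below is a reading aid, not the paper's implementation: it strings the nine steps into a single two-phase driver. The helper callables and the liveness threshold are hypothetical; 0.08 is the good-match threshold quoted in the abstract.

```python
def authenticate(frames, template_descriptors, measure_diameter,
                 extract_features, similarity_index,
                 min_change_px=2.0, match_threshold=0.08):
    """Two-phase sketch of the proposed scheme (hypothetical signature).
    Phase 1 (steps 1-4): verify the pupillary light reflex.
    Phase 2 (steps 5-9): SIFT-based recognition against the template."""
    diameters = [measure_diameter(f) for f in frames]
    if max(diameters) - min(diameters) < min_change_px:
        return False            # no light reflex: likely an artifact
    query = extract_features(frames[0])
    return similarity_index(query, template_descriptors) > match_threshold
```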
In this step, we detect the pupil inside the captured image, and its size is calculated using the segmentation process.
3.1.1 Detection and Localization
Detection and localization are two important steps: detection confirms the presence of the pupil inside the frame, and localization gives its position. The Hough transform is used to find the boundary between the iris and the pupil. The pupil diameter is calculated using the MATLAB Image Processing Toolbox, which provides several algorithms and environment tools for processing images, analyzing images, visualization, and developing algorithms.
The different image processing steps involved in finding the pupil dynamics are:
· Preprocessing
· Segmentation
· Data processing
· Feature extraction
The transform is modified to be sensitive to a dark circular shape rather than to lighter, irregular shapes. Using the gradient and sensitivity parameters, the proposed algorithm becomes robust: it detects the pupil even when the eyelashes half cover it.
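A minimal OpenCV analogue of this detection step is sketched below; the paper's MATLAB implementation is not reproduced, and all parameter values are assumptions.

```python
import cv2
import numpy as np

def detect_pupil(gray):
    """Detect the pupil as a dark circle with the circular Hough
    transform. cv2.HoughCircles votes on gradient information, so a
    circle partially occluded by eyelashes can still be recovered;
    the parameters below are illustrative."""
    blurred = cv2.medianBlur(gray, 5)          # suppress lash/noise pixels
    circles = cv2.HoughCircles(
        blurred, cv2.HOUGH_GRADIENT, dp=1.5,
        minDist=blurred.shape[0] // 2,         # expect one pupil per eye crop
        param1=80,                             # Canny upper threshold
        param2=20,                             # accumulator ("sensitivity")
        minRadius=5, maxRadius=80)
    if circles is None:
        return None
    x, y, r = circles[0, 0]
    return float(x), float(y), 2.0 * float(r)  # center and diameter (px)
```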
3.1.2 Artifacts Removal
Two types of noise are detected in the raw linear pupil-radius signal. The first is noise generated at the time of pupil detection because of blinks, called pupil detection error; the second is pupil segmentation error, which arises from eye motion, a non-circular pupil, a partially covered pupil, or off-axis gaze. Segmentation error is very difficult to identify and is only observed when a sudden change in radius is marked relative to the previous neighboring frame, whereas detection error can be rectified during the modeling of pupil dynamics. For building the classification model, we use the support vector machine, as it is considered one of the best classifiers and performs very well on low-dimensional feature vectors.
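The rule just described, treating an abrupt radius change as a segmentation error and a missing radius as a blink-induced detection error, can be sketched as follows; the jump threshold is an assumption.

```python
import numpy as np

def clean_radius_signal(radii, max_jump_px=3.0):
    """Drop blink frames (NaN radii) and flag segmentation errors,
    i.e., frames whose radius jumps abruptly relative to the previous
    valid frame. Returns the retained frame indices and radii."""
    r = np.asarray(radii, dtype=float)
    idx = np.flatnonzero(~np.isnan(r))      # blink/detection errors removed
    r = r[idx]
    keep = np.ones(r.size, dtype=bool)
    last = r[0] if r.size else 0.0
    for i in range(1, r.size):
        if abs(r[i] - last) > max_jump_px:  # sudden change: segmentation error
            keep[i] = False
        else:
            last = r[i]
    return idx[keep], r[keep]
```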
The captured image contains different types of noise. The specular reflections produced on the eye's surface by the infrared illumination are removed using the noise-removal functions of the MATLAB image processing tools. The RGB images are converted into binary images, and the circular regions of the iris and the pupil are computed. The Hough transform is used for extracting the features, as it finds instances of objects within a certain class of shapes.
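The MATLAB noise-removal functions are not reproduced here; the OpenCV sketch below shows one common way to achieve a similar effect, masking near-saturated specular pixels and inpainting them. The threshold value is an assumption.

```python
import cv2
import numpy as np

def remove_specular_highlights(gray, saturation_thresh=230):
    """Mask near-saturated pixels (specular reflections of the
    illuminator on the cornea) and fill them by inpainting so they do
    not corrupt the binary segmentation of the iris and pupil."""
    _, mask = cv2.threshold(gray, saturation_thresh, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, np.ones((3, 3), np.uint8))   # cover halo pixels
    return cv2.inpaint(gray, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
```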
All facial images have certain content that describes them and differentiates them from other facial images; these are called image features. Edges, corners or interest points, blobs, and ridges provide rich information about image content. Image features are used as input for the facial recognition process, which follows these major steps:
· Detect the face using a camera, either solo or in a crowd.
· Analyze the geometry of the face using a suitable algorithm. The geometry includes the depth of the eyes, the distance between the eyes and eyebrows, the distance between the chin and forehead, and the contours of the nose, lips, ears, etc.
· After analyzing the face, generate a mathematical representation called a faceprint that contains the digital information of the facial features.
· Finally, find a match for the given person. Image matching is an important task for a facial recognition system.
After obtaining the feature descriptors, the main task is to use the key points of the descriptor for matching. The goal of matching is to correctly match images of the same subject even when they are taken from different angles, viewpoints, and scales and with different camera parameters. Panorama stitching of two images of the same scene or object, with key points found for matching, is a classic task for a matching algorithm.
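A minimal OpenCV sketch of this matching stage follows. Lowe's ratio test with a 0.75 ratio is used here as the good-match criterion; that value is an assumption, and the paper's own 0.08 threshold is defined on its similarity index rather than on this ratio.

```python
import cv2

def match_keypoints(img_a, img_b, ratio=0.75):
    """Brute-force matching of SIFT descriptors between two facial
    images. For each descriptor in img_a the two nearest neighbours in
    img_b are found; a match is kept only when the best match is
    clearly better than the second best (Lowe's ratio test)."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)       # Euclidean distance for SIFT
    pairs = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in pairs if m.distance < ratio * n.distance]
    return good, kp_a, kp_b
```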
3.2.1 SIFT Algorithm
The Scale-Invariant Feature Transform is a feature extraction algorithm that is invariant to scaling and rotation. SIFT is also robust to viewpoint and illumination changes and gives good results. Continuous improvement of SIFT is ongoing to enhance both performance and accuracy, and a number of variants have been introduced by changing steps in the SIFT pipeline. MOPS, the Multi-Scale Oriented Patches descriptor, is one such variant, in which the patch around each key point is rotated according to the dominant gradient orientation before the histogram and descriptor are computed. Multi-Scale Oriented Patches consist of normalized patches oriented via blurred local gradients, with features positioned at Harris corners. Rotating each patch to its dominant gradient orientation is useful because it gives all key points the same canonical orientation [20]. The SIFT algorithm has the following four main steps:
I) Constructing the scale-space extrema
II) Localizing key points
III) Estimating orientation
IV) Computing the keypoint descriptor
3.2.2 Scale-Space Extrema Construction
Using the Difference of Gaussians (DoG), the scale-space extrema step locates interest points that are invariant to scale and orientation; the DoG approximates the Laplacian of the Gaussian. Finding the extrema in scale space means finding pixels whose value is the maximum or minimum both across the surrounding scale images and within their spatial neighborhood. Extrema construction first creates a scale space for the image: the image is convolved with a Gaussian, and then a second image of the same scale is convolved with the Gaussian operator scaled by a factor k.
All these images together form an octave; the number of images in an octave depends on the value of k. The image is then subsampled and the same process repeated, so the scale space is a group of blurred images at different scales. Blurring here refers to convolving the Gaussian operator with every pixel of the image to output blurred images [46].
To find the extrema, the Laplacian is approximated by a Difference of Gaussians: the difference between successive Gaussian images is taken, and this set is constructed for every octave.
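The construction just described can be sketched in a few lines; the number of intervals and the base sigma follow Lowe's commonly cited defaults and are assumptions here.

```python
import cv2
import numpy as np

def dog_octave(img, num_intervals=2, sigma=1.6):
    """One octave of the DoG pyramid: blur the image with geometrically
    increasing sigmas (k = 2**(1/num_intervals)) and subtract successive
    Gaussian images. Repeating on a 2x-subsampled image gives the next
    octave."""
    img = img.astype(np.float32)
    k = 2.0 ** (1.0 / num_intervals)
    gaussians = [cv2.GaussianBlur(img, (0, 0), sigma * (k ** i))
                 for i in range(num_intervals + 3)]   # s + 3 images per octave
    return [gaussians[i + 1] - gaussians[i]
            for i in range(len(gaussians) - 1)]
```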
3.2.3 Keypoint Localization
This step finds the exact location of each minimum and maximum, i.e., whether it lies between two pixel coordinates or between two scales.
3.2.4 Determination of Exact Location and Scale
A Taylor series expansion is used to find the exact location. To determine the location of an extremum, the first and second derivatives are computed simply by finite differences to find where the actual extremum lies. Low-contrast points, as well as edge points, then need to be removed.
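For reference, the standard formulation from Lowe [47] expands the DoG function D around a candidate point x = (x, y, σ)ᵀ and solves for the sub-pixel offset of the extremum:

$$D(\mathbf{x}) \approx D + \frac{\partial D}{\partial \mathbf{x}}^{T}\mathbf{x} + \frac{1}{2}\,\mathbf{x}^{T}\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\,\mathbf{x}, \qquad \hat{\mathbf{x}} = -\left(\frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\right)^{-1}\frac{\partial D}{\partial \mathbf{x}}.$$

Candidate points with |D(x̂)| below a small contrast threshold (Lowe uses 0.03 for images normalized to [0, 1]) are discarded as low-contrast points.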
3.2.5 Edge Elimination
Very similar to the Harris corner detector, Lowe [47] proposed using the Hessian of D to find the curvature, i.e., sharp changes in different directions; the eigenvalues of the Hessian give a good estimate for corner detection. Let α and β be the larger and smaller eigenvalues of the Hessian H, with α = rβ. Then

$$\frac{\operatorname{Tr}(H)^{2}}{\operatorname{Det}(H)}=\frac{(\alpha+\beta)^{2}}{\alpha\beta}=\frac{(r+1)^{2}}{r},$$

which is minimal when r = 1, i.e., when α and β are close to each other (both equally high), the situation that indicates a corner point. SIFT therefore rejects a keypoint if Tr(H)²/Det(H) exceeds a threshold. At the end of this step, the exact location and scale of every extremum are known, and stable keypoints have been selected by rejecting edge and low-contrast points [47].
Orientation estimation is required for rotation invariance. This step finds the orientation of each extremum, using the scale of the point to choose the appropriate Gaussian-smoothed image L. Then, using finite differences, the gradient magnitude and orientation are computed:

$$m(x,y)=\sqrt{\bigl(L(x+1,y)-L(x-1,y)\bigr)^{2}+\bigl(L(x,y+1)-L(x,y-1)\bigr)^{2}},$$

$$\theta(x,y)=\tan^{-1}\!\left(\frac{L(x,y+1)-L(x,y-1)}{L(x+1,y)-L(x-1,y)}\right),$$

where m(x, y) is the gradient magnitude at every point and θ(x, y) is the gradient orientation. The result of this step is a magnitude and an orientation for every point in the image.
To create the orientation histogram, an area surrounding the key point is selected, and the orientations and magnitudes of all points within it are considered. Each histogram entry is weighted by its gradient magnitude and by a Gaussian window, which improves performance: points farther away contribute less to the histogram, so the peak is determined by local factors rather than by a single point. If another peak is within 80 percent of the maximum peak, it is also retained as a keypoint with a different direction [48].
The gradient information around each key point is used to build the descriptor. A 16 × 16 window around the detected key point is divided into sixteen 4 × 4 cells, and for each cell a gradient orientation histogram with 8 bins is created, with farther points contributing less to the histogram than closer ones. The raw keypoint descriptor formed at the end of this step is a 128-dimensional non-negative vector: 16 histograms with 8 values each produce the 128 components, obtained simply from the gradient orientations of the area surrounding the key point. To decrease the effect of contrast, the 128-d vector is normalized to unit length; the values are then clipped at 0.2, and the resulting vector is normalized to unit length once again, making the descriptor robust to different photometric variations.
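The two-stage normalization can be written directly; the following is a minimal sketch of the operation described above.

```python
import numpy as np

def normalize_sift_descriptor(vec, clip=0.2):
    """Normalize a raw 128-d SIFT descriptor: scale to unit length
    (linear contrast invariance), clip components at 0.2 to damp large
    gradient magnitudes, then renormalize to unit length."""
    v = np.asarray(vec, dtype=np.float64)
    v /= np.linalg.norm(v) + 1e-12
    v = np.minimum(v, clip)
    return v / (np.linalg.norm(v) + 1e-12)
```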
In the proposed system, two popular antispoofing image and video datasets, CASIA-SURF and the Replay-Attack dataset, are used to test our proposed algorithm.
Every biometric system makes one of four decisions: (1) authorize a legitimate person (True Positive); (2) authorize an illegitimate person (False Positive); (3) deny an illegitimate person (True Negative); (4) deny a legitimate person (False Negative). Our proposed method is evaluated using two different classifiers: decision trees and random forests. The analysis of Table 4 reveals that decision tree classifiers are more accurate than random forest classifiers on the M2VTS, ORL, and Face 94 databases, while random forest classifiers achieve higher accuracy on the Yale 2B, FERET, and Face 94 databases. Results for M2VTS, ORL, and FERET are encouraging.
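Since results are reported as Equal Error Rates, the sketch below shows one straightforward way to compute an EER from match scores by sweeping the decision threshold; it is illustrative and not the paper's evaluation code.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Find the operating point where the false positive rate (accepting
    an impostor) equals the false negative rate (rejecting a genuine
    user) and return the error rate there. labels: 1 genuine, 0 impostor."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    genuine, impostor = scores[labels == 1], scores[labels == 0]
    best_gap, eer = np.inf, None
    for t in np.unique(scores):
        fpr = np.mean(impostor >= t)    # impostors wrongly accepted
        fnr = np.mean(genuine < t)      # genuine users wrongly rejected
        if abs(fpr - fnr) < best_gap:
            best_gap, eer = abs(fpr - fnr), (fpr + fnr) / 2.0
    return eer
```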
Table 4: Details of the datasets used in the proposed work for facial recognition
The primary cause of the performance decline on Face 94 was its small dataset size. In terms of true positive rate, the random forest classifier performs better on the Yale 2B dataset and the decision tree on the M2VTS dataset. Performance expressed as an equal error rate is shown in Table 5. The random forest classifier's true positive and false positive rates across all datasets are displayed in Tables 6 and 7, respectively. The high AUC in Table 8 indicates that the suggested feature set is effective for the facial recognition system. Performance evaluation in terms of execution time is shown in Table 9. The proposed algorithm is compared with different existing algorithms; significant improvements are found for the M2VTS, Yale 2B, and Face 94 datasets, while performance diminishes in the case of FERET and ORL.
Table 5: Performance evaluation (equal error rate)
Table 6: Performance table, true positive rate for the random forest classifier
Table 7: Performance table, false positive rate for the random forest classifier
Table 8: Area under curve
Table 9: Evaluation time in seconds (decision tree)
A strong facial authentication method is proposed in this study, built on two-way authentication. First, the liveness of the facial images is validated by measuring the change in pupil diameter under various lighting conditions during image capture. Following the confirmation of liveness, the facial recognition system integrates SIFT and MOPS, two distinct feature descriptors, into a deep neural network architecture for recognition and detection. The accuracy, area under the curve, false positive rate, and true positive rate of two distinct classifiers, decision trees and random forests, are measured. The Yale 2B, M2VTS, FERET, ORL, and Face 94 datasets are used for training, which has been demonstrated to be computationally efficient. Analysis of the results shows that the proposed algorithm is an efficient and acceptable method for facial recognition. In sum, this technique appears to be a strong contender for liveness detection and has considerable potential for real-world application.
Our proposed algorithm has three limitations, which open paths for future work in this area. First, the algorithm does not cover measurements of elderly people, whose pupil-size changes are more elusive. Second, the dynamics feature measurement takes time, and devices designed to capture iris images do not allow additional time during capture. Third, drug or alcohol ingestion also alters pupil dynamics.
Acknowledgement: The authors present their appreciation to King Saud University for funding the publication of this research through the Researchers Supporting Program (RSPD2023R809), King Saud University, Riyadh, Saudi Arabia.
Funding Statement: The research is funded by the Researchers Supporting Program at King Saud University (RSPD2023R809).
Author Contributions: The authors confirm their contributions to the paper as follows: study conception and design: Puja S. Prasad, Ali Wagdy Mohamed; data collection: Adepu Sree Lakshmi, Hossam M. Zawbaa; analysis and interpretation of results: Sandeep Kautish, Abdulaziz S. Almazyad; draft manuscript preparation: Simar Preet Singh, Rajesh Kumar Shrivastava. All authors reviewed the results and approved the final version of the manuscript.
Availability of Data and Materials: In this paper, we used open-access data, which is available in open repositories for researchers.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.