The first stage (1964 to 1990)
In this stage, face recognition was usually studied only as a general pattern recognition problem, and the main technical approach was based on the geometric features of the face. This is most evident in the research on face profiles, where a great deal of work went into extracting and analyzing the structural characteristics of profile curves. Artificial neural networks were also used for face recognition for a time. Besides Bledsoe, other early researchers engaged in automatic face recognition (AFR) include Goldstein, Harmon and Takeo Kanade. In 1973, Kanade completed the first doctoral thesis on AFR at Kyoto University. Now a professor at the Robotics Institute of Carnegie Mellon University (CMU), he remains one of the active figures in the field, and his research group is an important force in face recognition. Generally speaking, this was the initial stage of face recognition research: it produced few highly significant results and saw essentially no practical application.
The second stage (1991 to 1997)
Although this stage was relatively short, it was a high point of face recognition research and can fairly be described as fruitful. Several representative face recognition algorithms were born, the US military organized the well-known FERET evaluation of face recognition algorithms, and a number of commercial face recognition systems emerged, the most famous being the FaceIt system from Visionics (now Identix).
The Eigenface method proposed by Turk and Pentland of the MIT Media Lab is undoubtedly the most famous face recognition method of this period. Many later face recognition techniques are related to Eigenface to some degree, and Eigenface, together with the normalized correlation method, has become a baseline algorithm for testing face recognition performance.
Another important work of this period is the comparative experiment carried out around 1992 by Brunelli and Poggio of the MIT Artificial Intelligence Laboratory. They compared the recognition performance of a method based on structural features with that of a method based on template matching, and reached a clear conclusion: template matching outperforms the feature-based method. Together with Eigenface, this guiding conclusion essentially brought research on purely structural-feature-based face recognition to a halt. It also greatly promoted the development of appearance-based linear subspace modeling and of face recognition based on statistical pattern recognition, which gradually became the mainstream face recognition technology.
The Fisherface method proposed by Belhumeur et al. is another important achievement of this period. It first uses principal component analysis (PCA) to reduce the dimensionality of the image's appearance features. Linear discriminant analysis (LDA) is then applied to the reduced principal components, so as to obtain "between-class scatter that is as large as possible and within-class scatter that is as small as possible". This method is still one of the mainstream face recognition methods and has produced many variants, such as the null-space method, the subspace discriminant model, the enhanced discriminant model, direct LDA, and some recent improvements based on kernel learning.
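The two steps described above (PCA for dimensionality reduction, then LDA on the reduced features) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `fisherface_fit` and `fisherface_project`, the use of NumPy, and the pseudo-inverse used to handle a possibly singular within-class scatter matrix are all assumptions made for the example.

```python
import numpy as np

def fisherface_fit(X, y, n_pca):
    """X: (n_samples, n_pixels) image vectors, y: integer class labels.
    Returns the training mean and a combined PCA+LDA projection matrix."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean = X.mean(axis=0)
    Xc = X - mean

    # Step 1: PCA via SVD; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_pca = Vt[:n_pca].T                      # (n_pixels, n_pca)
    Z = Xc @ W_pca                            # reduced features, zero mean

    # Step 2: LDA in the reduced space.
    classes = np.unique(y)
    Sw = np.zeros((n_pca, n_pca))             # within-class scatter
    Sb = np.zeros((n_pca, n_pca))             # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        Sb += len(Zc) * np.outer(mc, mc)      # global mean of Z is zero

    # Directions maximizing between-class over within-class scatter.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)[: len(classes) - 1]
    W_lda = evecs[:, order].real              # at most (classes - 1) directions
    return mean, W_pca @ W_lda

def fisherface_project(X, mean, W):
    """Project image vectors into the discriminant subspace."""
    return (np.asarray(X, dtype=float) - mean) @ W
```

Recognition can then be done, for example, by nearest-neighbor matching of the projected vectors; that choice of classifier is likewise an assumption here.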
Moghaddam of MIT, for his part, proposed a face recognition method based on Bayesian probability estimation in dual eigenspaces. Using a "difference" approach, this method turns the similarity computation for a pair of face images into a two-class classification problem (intra-class differences versus inter-class differences). Both the intra-class and the inter-class difference data are first reduced in dimensionality by principal component analysis, and the class-conditional probability density of each of the two classes is estimated. Finally, recognition is carried out by Bayesian decision (maximum likelihood or maximum a posteriori).
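A rough sketch of this difference-based decision is given below. It assumes zero-mean Gaussian densities for both difference classes, models each one in its own PCA subspace, and uses an isotropic term for the residual outside that subspace; these modeling details, along with the names `fit_difference_model`, `log_density` and `same_person`, are assumptions of the example and not taken from the text above.

```python
import numpy as np

def fit_difference_model(diffs, k):
    """Fit a zero-mean Gaussian to difference vectors (n, d), keeping the
    top-k principal directions and one shared variance for the rest (k < d)."""
    D = np.asarray(diffs, dtype=float)
    n, d = D.shape
    _, s, Vt = np.linalg.svd(D, full_matrices=False)
    eigvals = s ** 2 / n                      # variance along each direction
    basis = Vt[:k].T                          # (d, k)
    rho = max(eigvals[k:].sum() / (d - k), 1e-12)  # mean leftover variance
    return basis, eigvals[:k], rho

def log_density(delta, model):
    """Log of the estimated density of one difference vector."""
    basis, var, rho = model
    delta = np.asarray(delta, dtype=float)
    d, k = basis.shape
    z = delta @ basis                         # coordinates inside the subspace
    eps2 = max(delta @ delta - z @ z, 0.0)    # squared residual outside it
    inside = -0.5 * np.sum(z ** 2 / var + np.log(2 * np.pi * var))
    outside = -0.5 * (eps2 / rho + (d - k) * np.log(2 * np.pi * rho))
    return inside + outside

def same_person(delta, intra, extra, prior_intra=0.5):
    """MAP decision: is the image difference more likely intra-class?"""
    li = log_density(delta, intra) + np.log(prior_intra)
    le = log_density(delta, extra) + np.log(1.0 - prior_intra)
    return bool(li > le)
```

With `prior_intra=0.5` the rule reduces to a maximum likelihood decision; other priors give the maximum a posteriori variant.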
Elastic Graph Matching (EGM), another important face recognition method, was also proposed in this stage. Its basic idea is to describe a face with an attributed graph. Each vertex of the graph represents a key facial feature point, and its attribute is the multi-resolution, multi-orientation local feature at that point, a set of Gabor transform coefficients called a Jet. The attribute of each edge is the geometric relationship between the feature points it connects. For any input face image, elastic graph matching uses an optimized search strategy to locate a set of predefined key feature points and, at the same time, extracts their Jet features to obtain the attributed graph of the input image. Recognition is then completed by computing the similarity between this graph and the attributed graphs of known faces. The advantage of this method is that it retains the global structure of the face while also modeling its key local features. Several extensions of the method have appeared recently.
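The graph comparison step can be illustrated with a small sketch. It assumes each Jet is stored as a list of Gabor magnitude responses, uses a normalized dot product as the Jet similarity, and ignores the edge (geometry) attributes entirely; the functions `jet_similarity` and `graph_similarity` and the dictionary layout are hypothetical choices for illustration only.

```python
import math

def jet_similarity(jet_a, jet_b):
    """Normalized dot product of two Jets (lists of Gabor magnitudes)."""
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm_a = math.sqrt(sum(a * a for a in jet_a))
    norm_b = math.sqrt(sum(b * b for b in jet_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def graph_similarity(graph_a, graph_b):
    """Mean Jet similarity over the feature points both graphs share.
    Each graph maps a feature-point name to its Jet."""
    shared = sorted(graph_a.keys() & graph_b.keys())
    if not shared:
        return 0.0
    return sum(jet_similarity(graph_a[p], graph_b[p]) for p in shared) / len(shared)
```

A fuller version would add a penalty for distortion of the edge geometry between the two graphs.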
Local Feature Analysis (LFA) was proposed by Atick and colleagues at Rockefeller University. In essence, LFA is a statistics-based method for low-dimensional object description. PCA can extract only global features and cannot retain local topological structure. LFA, by contrast, extracts local features on the basis of the global PCA description while retaining global topological information, so it has better descriptive and discriminative power. LFA was commercialized as the well-known FaceIt system, and for that reason no new academic progress on it was published afterwards.
The FERET project, funded by the Counterdrug Technology Development Program Office of the US Department of Defense, was undoubtedly a crucial event of this stage. Its goal was to develop AFR technology usable by security, intelligence and law enforcement agencies. The project had three parts: funding a number of face recognition research efforts, creating the FERET face image database, and organizing the FERET face recognition performance evaluations. The project held three evaluations, in 1994, 1995 and 1996. Several of the best-known face recognition algorithms took part, which greatly advanced their refinement and practical use. Another important contribution of these tests was to point out the direction for further development: face recognition under non-ideal acquisition conditions, such as varying illumination and pose, gradually became a hot research topic.
Flexible models, including the active shape model (ASM) and the active appearance model (AAM), are important contributions to face modeling in this period. ASM/AAM describes the face as two separate parts, 2D shape and texture, each modeled statistically with PCA; the two parts are then combined by a further PCA to give a single statistical model of the face. Flexible models have good face synthesis ability, so analysis-by-synthesis techniques can be used for feature extraction and modeling of face images. They have been widely used in face alignment and recognition, and many improved models have appeared.
Generally speaking, face recognition technology developed very rapidly in this stage. The proposed algorithms achieved very good performance under ideal image acquisition conditions, with cooperative subjects, and on small and medium-sized frontal face databases, and as a result several well-known face recognition companies emerged. In terms of technical approach, linear subspace discriminant analysis, statistical appearance models and statistical pattern recognition methods for 2D face images were the mainstream technologies of this stage.
The third stage (1998 to present)
The FERET'96 evaluation of face recognition algorithms showed that mainstream face recognition technology was not robust to the illumination and pose changes caused by non-ideal acquisition conditions or uncooperative subjects. Illumination and pose therefore gradually became research hotspots. At the same time, commercial face recognition systems developed further. Building on the FERET tests, the US military organized two evaluations of commercial systems, in 2000 and 2002.
The multi-pose, multi-illumination face recognition method based on the illumination cone model, proposed by Georghiades et al., is one of the important achievements of this period. They proved an important conclusion: all images of the same face, taken from the same viewpoint under different illumination conditions, form a convex cone in image space, called the illumination cone. To compute the illumination cone from a small number of face images with unknown illumination, they also extended the traditional photometric stereo method. Under the assumptions of a Lambertian reflectance model, a convex surface and distant point light sources, they can recover the 3D shape of the object and the surface reflectance (albedo) at each surface point from a few images of the same viewpoint with unknown illumination. Traditional photometric stereo, by comparison, recovers the surface normal directions from images taken under known illumination. With the recovered shape and albedo, images under arbitrary illumination can easily be synthesized for that viewpoint, which completes the computation of the illumination cone. Recognition is carried out by computing the distance from the input image to each illumination cone.
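For reference, the traditional photometric stereo step mentioned above (known light directions, Lambertian surface, no shadows) can be sketched as a least-squares problem. This is the classical known-illumination case, not the extended unknown-illumination method of Georghiades et al.; the function name `photometric_stereo` and the array layout are assumptions of this example.

```python
import numpy as np

def photometric_stereo(images, lights):
    """Classical Lambertian photometric stereo with known lighting.

    images: (k, n_pixels) intensities, one row per light source
    lights: (k, 3) light direction vectors scaled by light intensity
    Returns (albedo, normals): shapes (n_pixels,) and (n_pixels, 3).
    Assumes no pixel is in shadow, so intensity = albedo * (normal . light).
    """
    I = np.asarray(images, dtype=float)
    L = np.asarray(lights, dtype=float)
    # Solve L @ B = I for B = albedo * normal, one column per pixel.
    B, *_ = np.linalg.lstsq(L, I, rcond=None)         # (3, n_pixels)
    albedo = np.linalg.norm(B, axis=0)
    normals = np.where(albedo > 0, B / np.maximum(albedo, 1e-12), 0.0)
    return albedo, normals.T
```

Once albedo and normals are known, an image under a new distant light `s` can be synthesized as `albedo * np.maximum(normals @ s, 0)`, which is the kind of re-lighting the illumination cone construction relies on.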
During this period, statistical learning theory, represented by the support vector machine (SVM), was also applied to face identification and verification. An SVM is a two-class (binary) classifier, while face recognition is a multi-class problem. Three strategies are commonly used to bridge this gap: the intra-class/inter-class difference method, the one-versus-rest method and the one-versus-one method.
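The one-versus-one strategy can be illustrated with a short voting sketch. The binary decision function is passed in as a plain callable and stands in for a trained SVM; the names `one_vs_one_predict` and `classifier_counts` are hypothetical, and the count of one classifier for the difference method reflects the fact that it needs only a single intra-class versus inter-class model.

```python
from collections import Counter
from itertools import combinations

def classifier_counts(n_classes):
    """How many binary classifiers each strategy needs for n_classes people."""
    return {
        "difference": 1,                                   # intra vs inter class
        "one_vs_rest": n_classes,                          # one per person
        "one_vs_one": n_classes * (n_classes - 1) // 2,    # one per pair
    }

def one_vs_one_predict(x, classes, binary_decide):
    """Vote over all class pairs.

    binary_decide(a, b, x) must return either a or b; here it stands in
    for a binary classifier trained on classes a and b only."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_decide(a, b, x)] += 1
    return votes.most_common(1)[0][0]
```

The quadratic growth of the one-versus-one count is one reason the difference method is attractive when the number of enrolled people is large.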
The method for face image analysis and recognition under multiple poses and illuminations based on the 3D morphable model, proposed by Blanz and Vetter, is a pioneering work of this stage. The method is essentially an analysis-by-synthesis technique. Its main contribution is that, on top of a statistical morphable model of 3D shape and texture (analogous to AAM in 2D), it also uses computer graphics simulation to model the perspective projection and illumination parameters of the image acquisition process. As a result, intrinsic attributes of the face, such as shape and texture, are fully separated from extrinsic parameters such as camera configuration and illumination, which benefits the analysis and recognition of face images. Blanz's experiments show that the method achieves high recognition rates on the CMU-PIE (pose, illumination and expression) face database and on the FERET multi-pose face database, which demonstrates its effectiveness.
At the International Conference on Computer Vision (ICCV) in 2001, Viola and Jones, researchers at Compaq's research laboratory, demonstrated their real-time face detection system based on simple rectangular features and AdaBoost, which detected near-frontal faces in CIF-format video at more than 15 frames per second. The main contributions of this method are: using simple rectangular features that can be computed quickly as face image features; a learning method based on AdaBoost that combines a large number of weak classifiers into a strong classifier; and a cascade technique to improve detection speed. Strategies based on this kind of face/non-face learning can now achieve near-real-time multi-pose face detection and tracking, which provides a good foundation for back-end face recognition.
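The reason rectangular features can be computed quickly is the integral image (summed-area table), which gives the sum over any rectangle with four lookups. The sketch below is a plain Python illustration under that assumption; the function names and the particular left-minus-right feature are chosen for the example and do not reproduce the detector itself.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros.
    ii[y][x] holds the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a rectangle (width should be even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

Because each feature costs a fixed number of lookups regardless of rectangle size, thousands of candidate features can be evaluated per window, which is what makes boosting and cascading over them practical.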
Shashua et al. proposed a face image recognition and rendering technique based on the quotient image. It is a rendering technique learned from an image set of a specific object class: given a small number of training images under different illuminations, it can synthesize images of any input face under various illumination conditions. On this basis, Shashua and colleagues also defined an illumination-invariant signature image of the face, which can be used for illumination-invariant face recognition. Experiments have demonstrated its effectiveness.
Basri and Jacobs represented illumination with spherical harmonics and described Lambertian reflection as a convolution, and from this they proved analytically an important conclusion: the set of all Lambertian reflectance functions produced by arbitrary distant light sources lies close to a low-dimensional linear subspace. This means that the set of images of a convex Lambertian object under all illumination conditions can be approximated by a low-dimensional linear subspace. The result not only agrees with the empirical findings of earlier statistical illumination modeling methods, but also gives theoretical support to linear subspace methods for object recognition. Moreover, the illumination function can be constrained to be non-negative by convex optimization, which provides an important idea for solving the illumination problem.
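A small sketch can make the subspace idea concrete. It assumes the nine-dimensional case (spherical harmonics up to second order), which comes from Basri and Jacobs' paper rather than from the paragraph above, and it drops the normalization constants of the harmonics because they do not change the spanned subspace. The names `harmonic_basis` and `subspace_residual` are hypothetical.

```python
import numpy as np

def harmonic_basis(normals, albedo):
    """Nine 'harmonic images' for a surface with given unit normals and albedo.

    normals: (n_pixels, 3), albedo: (n_pixels,). Returns (n_pixels, 9).
    Columns are unnormalized spherical harmonics of order 0, 1 and 2
    evaluated at each normal, scaled by the albedo."""
    n = np.asarray(normals, dtype=float)
    a = np.asarray(albedo, dtype=float)
    nx, ny, nz = n[:, 0], n[:, 1], n[:, 2]
    cols = [
        np.ones_like(nx),            # order 0
        nx, ny, nz,                  # order 1
        3 * nz ** 2 - 1,             # order 2
        nx * ny, nx * nz, ny * nz,
        nx ** 2 - ny ** 2,
    ]
    return a[:, None] * np.stack(cols, axis=1)

def subspace_residual(image, basis):
    """Distance from an image vector to the span of the basis columns."""
    image = np.asarray(image, dtype=float)
    coef, *_ = np.linalg.lstsq(basis, image, rcond=None)
    return float(np.linalg.norm(image - basis @ coef))
```

Recognition under this model would compare an input image against each enrolled face's harmonic subspace and pick the smallest residual; the non-negativity constraint on lighting mentioned above is not enforced in this sketch.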
After the FERET project, several commercial face recognition systems appeared. The relevant departments of the US Department of Defense went on to organize the Face Recognition Vendor Test (FRVT) for commercial systems, which has been held twice so far: FRVT 2000 and FRVT 2002. On the one hand, these two tests compared the performance of well-known face recognition systems. In FRVT 2002, for example, Cognitec, Identix and Eyematic were far ahead of the other systems, with little difference among the three. On the other hand, the tests gave a comprehensive summary of the state of face recognition technology: under ideal conditions (frontal visa photos), on 121,589 images of 37,437 people, the highest rank-1 identification rate was 73%, and the equal error rate (EER) for face verification was about 6%. Another important contribution of the FRVT tests is that they pointed out problems that current face recognition algorithms urgently need to solve. FRVT 2002, for example, showed that the performance of commercial systems is still very sensitive to indoor/outdoor lighting changes, pose, elapsed time and other varying conditions, and that effective recognition on large-scale face databases remains a serious problem. These problems still require further effort.
Generally speaking, face recognition under non-ideal imaging conditions (especially illumination and pose), with uncooperative subjects, and on large-scale face databases has gradually become the focus of research in this stage. Meanwhile, nonlinear modeling methods, statistical learning theory, boosting-based learning and 3D-model-based face modeling and recognition have become the technical trends receiving the most attention.
In short, face recognition is a research topic with both scientific value and broad application prospects. Over several decades, researchers around the world have achieved fruitful results, and automatic face recognition has been applied successfully under certain constraints. These achievements have deepened our understanding of automatic face recognition, and especially of its challenges. Existing systems may already surpass humans in the speed, and even the accuracy, of comparing massive amounts of face data, but they remain far less robust and accurate than humans on the general face recognition problem under complex, changing conditions. The essential reason for this gap is still unknown; after all, our understanding of the human visual system is still superficial. From the perspective of pattern recognition and computer vision, the gap may mean that we have not yet found a sensor that samples facial information adequately (consider the difference between a monocular camera and the human binocular system). It may also mean that we have adopted an inappropriate face modeling method (the internal representation of the face), or that we do not yet know the limit of accuracy that automatic face recognition can reach. In any case, giving computing devices a human-like ability to recognize faces is the dream of many researchers in this field. I believe that as research deepens, our understanding will come closer to the correct answers to these questions.