Features versus Context: An Approach for Precise and Detailed Detection and Delineation of Faces and Facial Features

  • Authors:
  • Liya Ding; Aleix M. Martinez

  • Affiliations:
  • The Ohio State University, Columbus

  • Venue:
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year:
  • 2010

Abstract

The appearance-based approach to face detection has seen great advances in the last several years. In this approach, we learn the image statistics describing the texture pattern (appearance) of the object class we want to detect, e.g., the face. However, this approach has had limited success in providing an accurate and detailed description of the internal facial features, i.e., eyes, brows, nose, and mouth. In general, this is due to the limited information carried by the learned statistical model. While the face template is relatively rich in texture, facial features (e.g., eyes, nose, and mouth) do not carry enough discriminative information to tell them apart from all possible background images. We resolve this problem by adding the context information of each facial feature in the design of the statistical model. In the proposed approach, the context information defines the image statistics most correlated with the surroundings of each facial component. This means that when we search for a face or facial feature, we look for those locations which most resemble the feature yet are most dissimilar to its context. This dissimilarity with the context features forces the detector to gravitate toward an accurate estimate of the position of the facial feature. Learning to discriminate between feature and context templates is difficult, however, because the context and the texture of the facial features vary widely under changing expression, pose, and illumination, and may even resemble one another. We address this problem with the use of subclass divisions. We derive two algorithms to automatically divide the training samples of each facial feature into a set of subclasses, each representing a distinct construction of the same facial component (e.g., closed versus open eyes) or its context (e.g., different hairstyles). The first algorithm is based on a discriminant analysis formulation. The second algorithm is an extension of the AdaBoost approach. We provide extensive experimental results using still images and video sequences for a total of 3,930 images. We show that the results are almost as good as those obtained with manual detection.
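The scoring rule described in the abstract, together with the subclass idea, can be illustrated with a short sketch. This is a minimal illustration under simplifying assumptions, not the paper's implementation: the function names are hypothetical, normalized correlation against mean templates stands in for the learned appearance statistics, and k-means clustering stands in for the paper's two subclass-division algorithms (the discriminant-analysis formulation and the AdaBoost extension).

```python
import numpy as np
from sklearn.cluster import KMeans


def similarity(patch, template):
    """Normalized cross-correlation between a vectorized image patch
    and a class template (a simple stand-in for the paper's learned
    appearance statistics)."""
    p = (patch - patch.mean()) / (patch.std() + 1e-8)
    t = (template - template.mean()) / (template.std() + 1e-8)
    return float(np.mean(p * t))


def make_subclasses(samples, n_subclasses):
    """Divide the training samples of one class (feature or context)
    into subclasses, each summarized by its mean template.
    NOTE: k-means is a simplified stand-in for the paper's two
    algorithms (discriminant analysis; an AdaBoost extension)."""
    km = KMeans(n_clusters=n_subclasses, n_init=10).fit(samples)
    return km.cluster_centers_


def feature_vs_context_score(patch, feature_templates, context_templates):
    """Core rule from the abstract: prefer locations that most resemble
    the facial feature yet are most dissimilar to its context. Each side
    is matched against its best-fitting subclass (e.g., open vs. closed
    eyes for the feature; different hairstyles for the context)."""
    s_feat = max(similarity(patch, t) for t in feature_templates)
    s_ctx = max(similarity(patch, t) for t in context_templates)
    return s_feat - s_ctx


def localize(candidate_patches, feature_templates, context_templates):
    """Return the index of the candidate location with the highest
    feature-versus-context score."""
    scores = [feature_vs_context_score(p, feature_templates, context_templates)
              for p in candidate_patches]
    return int(np.argmax(scores))
```

In this sketch, subtracting the context similarity is what penalizes slightly offset candidates, whose windows still contain much of the surrounding context; this mirrors the abstract's observation that dissimilarity with the context pulls the detector toward a precise estimate of the feature's position.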