Locating nose-tips and estimating head poses in images by Tensorposes

  • Authors:
  • Jilin Tu; Yun Fu; Thomas S. Huang

  • Affiliations:
  • Visualization and Computer Vision Laboratory, General Electric, Niskayuna, NY; BBN Technologies, Cambridge, MA; Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Year:
  • 2009

Abstract

This paper introduces a head pose estimation system that automatically localizes the nose-tips of faces and simultaneously estimates head poses in images. In the training stage, the nose-tips of the faces are first manually labeled. The appearance variations caused by head pose changes are then characterized by a Tensorposes model. Given an image with unknown head pose and nose-tip location, the nose-tip of the face is automatically localized in a coarse-to-fine fashion after skin color segmentation, and the head pose is estimated simultaneously. The performance of our system is evaluated on the Pointing'04 head pose image data set. We first evaluate the classification performance of the Tensorposes models using image patches of faces cropped according to the manually labeled nose-tip locations in the Pointing'04 data set. Using a leave-one-person-out evaluation strategy, we obtain the optimal parameters of the Tensorposes model and evaluate the discriminative power of the Tensorposes models built by high-order singular value decomposition (HOSVD) and by multilinear independent component analysis (MICA), as well as naive principal component analysis (PCA) subspace models. It is shown that the Tensorposes models obtained by HOSVD and MICA decomposition perform similarly well, and both perform much better than the naive PCA subspace models. The Tensorposes model is then utilized to automatically localize the nose-tip in a test image and to simultaneously estimate the head pose. The nose-tip localization and pose estimation accuracy of the proposed system are evaluated against the ground truth. Finally, a cross-database evaluation of the system is carried out on the Pointing'04 database, a selected subset of the CMU PIE database, and pictures from the CLEAR'07 head pose evaluation database. The experiments show that our system generalizes reasonably well to real-world scenarios.
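
The abstract describes the Tensorposes model as a multilinear decomposition of face appearance over identity and pose. As an illustration only, the following NumPy sketch (hypothetical array names, sizes, and data layout; not the authors' implementation) shows how an HOSVD of a (person × pose × pixel) face tensor can yield a pose-specific appearance basis, and how an unlabeled face patch could then be assigned a pose by minimum reconstruction error.

```python
import numpy as np

# Hypothetical training data layout (assumption, not from the paper's release):
# a tensor of vectorized face patches indexed by (person, pose, pixel).
n_people, n_poses, n_pixels = 10, 13, 32 * 32
D = np.random.rand(n_people, n_poses, n_pixels)  # placeholder training tensor

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# HOSVD: one orthonormal mode matrix per axis, from the SVD of each unfolding.
U_people, _, _ = np.linalg.svd(unfold(D, 0), full_matrices=False)
U_poses,  _, _ = np.linalg.svd(unfold(D, 1), full_matrices=False)
U_pixels, _, _ = np.linalg.svd(unfold(D, 2), full_matrices=False)

# Core tensor Z such that D ≈ Z x_0 U_people x_1 U_poses x_2 U_pixels.
Z = np.einsum('ijk,ia,jb,kc->abc', D, U_people, U_poses, U_pixels,
              optimize=True)

# Pose-specific appearance basis: B[:, p, :] spans the face-patch subspace
# for pose p across identities, since D[i, p, :] = sum_a U_people[i, a] * B[a, p, :].
B = np.einsum('abc,jb,kc->ajk', Z, U_poses, U_pixels, optimize=True)

def estimate_pose(patch):
    """Classify a vectorized face patch by reconstruction error per pose."""
    errors = []
    for p in range(n_poses):
        basis = B[:, p, :]                                   # (n_people, n_pixels)
        coeff, *_ = np.linalg.lstsq(basis.T, patch, rcond=None)
        errors.append(np.linalg.norm(basis.T @ coeff - patch))
    return int(np.argmin(errors))

# Usage example with a placeholder test patch.
pose_index = estimate_pose(np.random.rand(n_pixels))
```

In the system described above, a classifier of this kind would be applied to candidate nose-tip locations produced by skin color segmentation and coarse-to-fine search, so that the best-matching patch yields both the nose-tip position and the pose estimate.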