Unsupervised feature extraction for the representation and recognition of lip motion video

  • Authors:
  • Michelle Jeungeun Lee; Kyungsuk David Lee; Soo-Young Lee

  • Affiliations:
  • Korea Advanced Institute of Science and Technology, Department of BioSystems, Daejeon, South Korea; Department of Computer Science, University of Wisconsin at Madison, Madison, Wisconsin; Korea Advanced Institute of Science and Technology, Department of BioSystems, Daejeon, South Korea

  • Venue:
  • ICIC'06 Proceedings of the 2006 international conference on Computational Intelligence and Bioinformatics - Volume Part III
  • Year:
  • 2006


Abstract

Lip-reading recognition is reported with lip-motion features extracted from multiple video frames by three unsupervised learning algorithms: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Since human perception of facial motion proceeds through two different pathways, i.e., the lateral fusiform gyrus for the invariant aspects of faces and the superior temporal sulcus for the changeable aspects, we extracted dynamic video features from multiple consecutive frames for the latter. The multiple-frame features require fewer coefficients for the same frame length than single-frame static features. The ICA-based features are the most sparse, while their corresponding coefficients for the video representation are the least sparse; PCA-based features show the opposite characteristics, and the NMF-based features fall in between. The ICA-based features also yield much better recognition performance than the others.
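The pipeline described in the abstract can be sketched as follows. This is a minimal illustration using scikit-learn's PCA, FastICA, and NMF on synthetic data; the frame size, window length, component count, and variable names are assumptions for demonstration, not the paper's actual settings.

```python
# Sketch: multi-frame unsupervised feature extraction for lip-motion video.
# Synthetic data stands in for real lip-region frames (assumed dimensions).
import numpy as np
from sklearn.decomposition import PCA, FastICA, NMF

rng = np.random.default_rng(0)
n_clips, n_frames, frame_dim = 100, 4, 64  # assumed: 4-frame windows of 64-pixel frames

# Each sample concatenates several consecutive frames into one vector,
# so the learned basis captures motion across frames, not a single image.
clips = rng.random((n_clips, n_frames * frame_dim))  # non-negative, as NMF requires

n_components = 10  # assumed number of basis features

# Fit each decomposition; rows of the result are per-clip coefficient vectors.
pca_codes = PCA(n_components=n_components).fit_transform(clips)
ica_codes = FastICA(n_components=n_components, random_state=0).fit_transform(clips)
nmf_codes = NMF(n_components=n_components, init="random", random_state=0,
                max_iter=500).fit_transform(clips)

print(pca_codes.shape, ica_codes.shape, nmf_codes.shape)
```

The resulting coefficient vectors would then serve as inputs to a recognizer; the abstract's sparsity comparison concerns both the learned basis images and these coefficients.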