Multimodal Sparse Features for Object Detection

  • Authors:
  • Martin Haker;Thomas Martinetz;Erhardt Barth

  • Affiliations:
  • Institute for Neuro- and Bioinformatics, University of Lübeck, Lübeck, Germany 23538;Institute for Neuro- and Bioinformatics, University of Lübeck, Lübeck, Germany 23538;Institute for Neuro- and Bioinformatics, University of Lübeck, Lübeck, Germany 23538

  • Venue:
  • ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part II
  • Year:
  • 2009

Abstract

In this paper, the sparse coding principle is employed to represent multimodal image data, i.e., image intensity and range. We estimate an image basis for frontal face images taken with a Time-of-Flight (TOF) camera to obtain a sparse representation of facial features, such as the nose. These features are then evaluated in an object detection scenario: we estimate the position of the nose by template matching, followed by the application of thresholds estimated from a labeled training set. The main contribution of this work is to show that the templates can be learned simultaneously on both intensity and range data based on the sparse coding principle. These multimodal templates significantly outperform both templates generated by averaging over a set of aligned image patches containing the facial feature of interest and multimodal templates computed via Principal Component Analysis (PCA). The system achieves an average detection rate of 96.4% at a false positive rate of 3.7%.
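
The detection step described in the abstract (template matching on intensity and range, followed by a threshold) can be illustrated with a minimal sketch. The abstract does not state which matching score is used or how the two modalities are combined, so the normalized cross-correlation, the equal-weight average of the two score maps, and the single threshold below are all assumptions made for illustration. The function names `ncc_map` and `detect_feature` are hypothetical, and the templates are placeholders for basis functions that would be learned by sparse coding.

```python
import numpy as np


def ncc_map(image, template):
    """Normalized cross-correlation of `template` at every valid position.

    Assumption: NCC is used as the matching score; the abstract does not
    specify the score actually used by the authors.
    """
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.linalg.norm(t)
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.linalg.norm(p) * t_norm
            # Flat patches (zero variance) get a neutral score of 0.
            scores[y, x] = (p * t).sum() / denom if denom > 0 else 0.0
    return scores


def detect_feature(intensity, range_img, tmpl_intensity, tmpl_range, threshold):
    """Return ((row, col), score) of the best match, or (None, score).

    Assumption: the two modalities are combined by averaging their score
    maps, and one threshold decides whether a detection is accepted.
    """
    score = 0.5 * (ncc_map(intensity, tmpl_intensity)
                   + ncc_map(range_img, tmpl_range))
    y, x = np.unravel_index(np.argmax(score), score.shape)
    best = float(score[y, x])
    if best >= threshold:
        return (int(y), int(x)), best
    return None, best


if __name__ == "__main__":
    # Synthetic data only: random images with the templates planted at (7, 9).
    rng = np.random.default_rng(0)
    tmpl_i = rng.normal(size=(5, 5))
    tmpl_r = rng.normal(size=(5, 5))
    img_i = rng.normal(size=(20, 20))
    img_r = rng.normal(size=(20, 20))
    img_i[7:12, 9:14] = tmpl_i
    img_r[7:12, 9:14] = tmpl_r
    print(detect_feature(img_i, img_r, tmpl_i, tmpl_r, threshold=0.9))
```

In a setting like the one described, the threshold would be chosen on a labeled training set to trade detection rate against false positive rate; the value 0.9 above is arbitrary and applies only to the synthetic example.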