View-invariant gesture recognition using 3D optical flow and harmonic motion context

  • Authors:
  • M. B. Holte; T. B. Moeslund; P. Fihl

  • Affiliations:
  • Computer Vision and Media Technology Laboratory, Aalborg University, Niels Jernes Vej 14, DK-9220 Aalborg, Denmark (all authors)

  • Venue:
  • Computer Vision and Image Understanding
  • Year:
  • 2010


Abstract

This paper presents an approach to view-invariant gesture recognition. The approach is based on 3D data captured by a SwissRanger SR4000 camera, which produces both a depth map and an intensity image of the scene. Since the two information types are aligned, the intensity image can be used to define a region of interest for the relevant 3D data. This data fusion improves the quality of the motion detection and hence results in better recognition. The gesture recognition is based on finding motion primitives (temporal instances) in the 3D data. Motion is detected by a 3D version of optical flow, which results in velocity-annotated point clouds. The 3D motion primitives are represented efficiently by introducing motion context. The motion context is transformed into a view-invariant representation using spherical harmonic basis functions, yielding a harmonic motion context representation. A probabilistic Edit Distance classifier is applied to identify which gesture best describes a string of primitives. The approach is trained on data from one viewpoint and tested on data from a very different viewpoint. The recognition rate is 94.4%, which is similar to the recognition rate when training and testing on gestures from the same viewpoint; hence the approach is indeed view-invariant.
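The key idea behind the harmonic representation is that expanding a function on the sphere in spherical harmonics, and then keeping only the per-degree energies of the expansion, yields a descriptor that is invariant to 3D rotation of the underlying data, and hence to viewpoint changes. The sketch below illustrates that principle only; it is not the paper's implementation. The `sphere_function` kernel density, the sample points, and all parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.special import sph_harm

def sphere_function(points, theta, phi):
    """Illustrative stand-in for one shell of a motion context: a smooth
    kernel density of the points' directions, evaluated on a (theta, phi)
    grid (theta = polar angle, phi = azimuth)."""
    dirs = points / np.linalg.norm(points, axis=1, keepdims=True)
    grid = np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)
    # Gaussian-like kernel on the dot product (angular proximity).
    return sum(np.exp(10.0 * (grid @ d)) for d in dirs)

def sh_energy_descriptor(f, theta, phi, l_max=6):
    """Per-degree spherical-harmonic energies sum_m |<f, Y_lm>|^2.
    These energies are unchanged by a 3D rotation of f."""
    # Quadrature weights for the sphere: sin(theta) dtheta dphi.
    dtheta = theta[1, 0] - theta[0, 0]
    dphi = phi[0, 1] - phi[0, 0]
    w = np.sin(theta) * dtheta * dphi
    energies = []
    for l in range(l_max + 1):
        e = 0.0
        for m in range(-l, l + 1):
            # scipy's sph_harm takes (m, l, azimuth, polar).
            Y = sph_harm(m, l, phi, theta)
            c = np.sum(f * np.conj(Y) * w)  # coefficient <f, Y_lm>
            e += abs(c) ** 2
        energies.append(e)
    return np.array(energies)
```

Computing the descriptor for a point set and for a rotated copy of it gives nearly identical energy vectors, which is why the representation can be trained on one viewpoint and tested on another.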