Real-time 3D pointing gesture recognition for mobile robots with cascade HMM and particle filter

  • Authors:
  • Chang-Beom Park; Seong-Whan Lee

  • Affiliations:
  • Department of Computer Science and Engineering, Korea University, Anam-dong, Seongbuk-ku, Seoul 136-713, Republic of Korea;Department of Computer Science and Engineering, Korea University, Anam-dong, Seongbuk-ku, Seoul 136-713, Republic of Korea and Department of Brain and Cognitive Engineering, Korea University, Anam ...

  • Venue:
  • Image and Vision Computing
  • Year:
  • 2011

Abstract

In this paper, we present a real-time 3D pointing gesture recognition algorithm for mobile robots, based on a cascade hidden Markov model (HMM) and a particle filter. Among the various human gestures, the pointing gesture is especially useful for human-robot interaction (HRI): it is highly intuitive, does not involve a priori assumptions, and has no substitute in other modes of interaction. A major issue in pointing gesture recognition is the difficulty of accurately estimating the pointing direction, which stems from the difficulty of hand tracking and the unreliability of direction estimation. The proposed method uses a stereo camera and 3D particle filters for reliable hand tracking, and a cascade of two HMMs for a robust estimate of the pointing direction. When a subject enters the camera's field of view, his or her face and two hands are located and tracked using particle filters. The first-stage HMM takes the hand position estimate and maps it to a more accurate position by modeling the kinematic characteristics of finger pointing. The resulting 3D coordinates are fed into the second-stage HMM, which discriminates pointing gestures from other types of gestures. Finally, the pointing direction is estimated for the pointing state. The proposed method can handle both large and small pointing gestures. Experimental results show gesture recognition and target selection rates of better than 89% and 99%, respectively, during human-robot interaction.
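
To make the hand-tracking step more concrete, the following is a minimal sketch of a generic bootstrap particle filter that tracks a single 3D point. It is not the authors' implementation: the abstract does not specify the motion or observation models, so the random-walk motion model, the isotropic Gaussian likelihood, the noise parameters, and the names `particle_filter_step` and `track` are all assumptions chosen for illustration.

```python
import math
import random


def particle_filter_step(particles, observation,
                         motion_std=0.02, obs_std=0.05, rng=random):
    """One predict/update/resample cycle of a bootstrap particle filter.

    particles:   list of (x, y, z) hypotheses for the hand position.
    observation: noisy (x, y, z) measurement (e.g. from stereo depth).
    Both noise levels are illustrative assumptions, not values from the paper.
    """
    # Predict: diffuse each particle with a random-walk motion model (assumed).
    predicted = [tuple(c + rng.gauss(0.0, motion_std) for c in p)
                 for p in particles]

    # Update: weight each particle by an isotropic Gaussian likelihood (assumed).
    weights = []
    for p in predicted:
        d2 = sum((a - b) ** 2 for a, b in zip(p, observation))
        weights.append(math.exp(-d2 / (2.0 * obs_std ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]

    # Estimate: weighted mean of the particle positions.
    estimate = tuple(sum(w * p[i] for w, p in zip(weights, predicted))
                     for i in range(3))

    # Resample: draw particles with replacement in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of 3D observations; return the estimates."""
    rng = random.Random(seed)
    first = observations[0]
    # Initialise particles in a cloud around the first observation.
    particles = [tuple(c + rng.gauss(0.0, 0.1) for c in first)
                 for _ in range(n_particles)]
    estimates = []
    for obs in observations:
        particles, est = particle_filter_step(particles, obs, rng=rng)
        estimates.append(est)
    return estimates
```

Calling `track(observations)` with a list of noisy `(x, y, z)` tuples returns one smoothed position per frame. In a pipeline like the one described, such estimates would then be passed to the first-stage HMM for refinement; that stage is not sketched here because the abstract gives no details of its states or parameters.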