Multi-modal descriptors for multi-class hand pose recognition in human computer interaction systems

  • Authors:
  • Jordi Abella, Raúl Alcaide, Anna Sabaté, Joan Mas, Sergio Escalera, Jordi Gonzàlez and Coen Antens

  • Affiliations:
  • Computer Vision Center, Cerdanyola, Barcelona, Spain (all authors); Universitat de Barcelona (S. Escalera, J. Gonzàlez)

  • Venue:
  • Proceedings of the 15th ACM International Conference on Multimodal Interaction
  • Year:
  • 2013


Abstract

Hand pose recognition in advanced Human-Computer Interaction (HCI) systems is becoming more feasible thanks to affordable multi-modal RGB-Depth cameras. The depth data generated by these sensors is highly valuable input, although designing 3D descriptors that yield robust object representations remains a critical step. This paper presents an overview of different multi-modal descriptors and provides a comparative study of two feature descriptors, Multi-modal Hand Shape (MHS) and Fourier-based Hand Shape (FHS), which compute local and global 2D-3D hand shape statistics to robustly describe hand poses. A new dataset of 38K hand poses, covering five hand shape categories recorded from eight users, has been created for real-time hand pose and gesture recognition. Experimental results show good performance of the fused MHS and FHS descriptors, improving recognition accuracy while ensuring real-time computation in HCI scenarios.
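To illustrate the general idea behind Fourier-based shape description of a hand contour (the paper's exact FHS formulation is not reproduced here, and all function names and parameters below are assumptions for illustration), a minimal sketch: treating the contour points as complex numbers and keeping normalized FFT magnitudes produces a descriptor that is invariant to the hand's position, scale, and in-plane rotation.

```python
import numpy as np

def fourier_shape_descriptor(contour, n_coeffs=16):
    """Generic Fourier descriptor of a closed 2D contour.

    contour: (N, 2) array of (x, y) boundary points, ordered along
    the contour. This is an illustrative sketch, not the paper's FHS.
    """
    # Represent each boundary point as a complex number z = x + iy.
    z = contour[:, 0] + 1j * contour[:, 1]
    coeffs = np.fft.fft(z)
    # Drop the DC term: removes dependence on position (translation).
    coeffs = coeffs[1:n_coeffs + 1]
    # Normalize by the first coefficient's magnitude: removes scale.
    coeffs = coeffs / np.abs(coeffs[0])
    # Keep magnitudes only: removes rotation and start-point phase.
    return np.abs(coeffs)

# Sanity check: a translated, scaled, rotated copy of the same shape
# should yield (numerically) the same descriptor.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
c1 = np.stack([np.cos(t), np.sin(t)], axis=1)
c2 = 3.0 * np.stack([np.cos(t + 0.5), np.sin(t + 0.5)], axis=1) + 2.0
d1 = fourier_shape_descriptor(c1)
d2 = fourier_shape_descriptor(c2)
```

Such invariances are what make contour-based descriptors attractive for hand poses, where the hand's distance from the camera and its orientation vary freely between users.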