Coupled grouping and matching for sign and gesture recognition

Authors:
Ruiduo Yang; Sudeep Sarkar
Affiliations:
Computer Science and Engineering, University of South Florida, Tampa, FL 33620, USA; Computer Science and Engineering, University of South Florida, Tampa, FL 33620, USA
Venue:
Computer Vision and Image Understanding
Year:
2009

Abstract

Matching an image sequence to a model is a core problem in gesture or sign recognition. In this paper, we consider this matching problem without requiring a perfect segmentation of the scene. Instead of requiring that low- and mid-level processes produce near-perfect segmentation, we accept that such processes can only produce uncertain information and use an intermediate grouping module to generate multiple candidates. From the set of low-level image primitives, such as constant-color region patches found in each image, a ranked set of salient, overlapping groups of these primitives is formed, based on low-level cues such as region shape, proximity, or color. These groups correspond to underlying object parts of interest, such as the hands. The sequence of these frame-wise group hypotheses is then matched to a model by casting the matching as a minimization problem. We show that these hypotheses can be coupled with both non-statistical matching (matching to sample-based models of signs) and statistical matching (matching to HMM models). Our algorithm not only produces a matching score but also selects the best group in each image frame; that is, recognition and final segmentation of the scene are coupled. In addition, there is no need to track features across sequences, which is known to be a hard task. We demonstrate our method using data from sign language recognition and gesture recognition. Compared with results obtained using the ground-truth hand groups, our method incurs less than 5% performance loss for both models. We also tested our algorithm on a sports video dataset that has a moving background.
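
The coupling of matching and group selection described above can be illustrated with a minimal sketch. This is not the formulation used in the paper; it is a simplified dynamic-programming alignment written under stated assumptions: each candidate group is summarized by a feature vector, the model is a sample-based sequence of feature vectors, the local cost is a squared Euclidean distance, and the alignment is monotone (each frame either stays on the current model element or advances to the next). The names `sq_dist` and `match_with_group_selection` are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors (assumed cost)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_with_group_selection(frames, model):
    """Align candidate groups to a model sequence and pick one group per frame.

    frames[t] is a non-empty list of candidate feature vectors for frame t.
    model is a list of feature vectors (a sample-based model, by assumption).
    Returns (total_cost, picks), where picks[t] = (model_index, candidate_index).
    """
    T, M = len(frames), len(model)
    if M == 0 or T < M or any(len(c) == 0 for c in frames):
        raise ValueError("need a non-empty model, at least M frames, "
                         "and at least one candidate per frame")

    # local[t][j] = (lowest cost, index of the candidate achieving it):
    # the inner minimization over the group hypotheses of frame t.
    local = [
        [min((sq_dist(c, m), k) for k, c in enumerate(cands)) for m in model]
        for cands in frames
    ]

    INF = math.inf
    D = [[INF] * M for _ in range(T)]    # D[t][j]: best cost ending at (t, j)
    back = [[0] * M for _ in range(T)]   # model index used at frame t - 1
    D[0][0] = local[0][0][0]             # alignment must start at model[0]

    for t in range(1, T):
        for j in range(M):
            stay = D[t - 1][j]
            advance = D[t - 1][j - 1] if j > 0 else INF
            if advance < stay:
                best, back[t][j] = advance, j - 1
            else:
                best, back[t][j] = stay, j
            D[t][j] = best + local[t][j][0]

    # Backtrack from the last model element to recover the per-frame choices.
    picks = []
    j = M - 1
    for t in range(T - 1, -1, -1):
        picks.append((j, local[t][j][1]))
        j = back[t][j]
    picks.reverse()
    return D[T - 1][M - 1], picks
```

Because the minimum over candidates is taken inside the alignment, the returned `picks` give both the matching path and the selected group in every frame, so no separate tracking step is needed in this sketch. A statistical variant would replace the distance with a negative log-likelihood under an HMM state, which is likewise an assumption about how such a coupling could be written rather than a description of the paper's implementation.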