Handling Movement Epenthesis and Hand Segmentation Ambiguities in Continuous Sign Language Recognition Using Nested Dynamic Programming

Authors:
Ruiduo Yang;Sudeep Sarkar;Barbara Loeding
Affiliations:
University of South Florida, Tampa;University of South Florida, Tampa;University of South Florida Polytechnic, Lakeland
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2010

Abstract

We consider two crucial problems in continuous sign language recognition from unaided video sequences. At the sentence level, we consider the movement epenthesis (me) problem; at the feature level, we consider the problem of hand segmentation and grouping. We construct a framework that handles both problems using an enhanced, nested version of the dynamic programming approach. To address movement epenthesis, a dynamic programming (DP) process employs a virtual me option that does not need explicit models; we call this the enhanced level building (eLB) algorithm. This formulation also allows the incorporation of grammar models. Nested within the eLB is another DP that handles the problem of selecting among multiple hand candidates. We demonstrate our ideas on four American Sign Language data sets: one with a simple background, one with the signer wearing short sleeves, one with a complex background, and one across signers. We compare the performance with Conditional Random Field (CRF) and Latent Dynamic CRF (LDCRF) based approaches. The experiments show more than 40 percent improvement over the CRF and LDCRF approaches in terms of the frame labeling rate. We also show the flexibility of our approach when handling a changing context, and we find a 70 percent improvement in sign recognition rate over the unenhanced DP matching algorithm that does not accommodate the me effect.
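
The virtual me option can be illustrated with a toy sketch. This is not the authors' implementation. It assumes one-dimensional frame features, one template per sign, a plain dynamic time warping (DTW) match cost, and a fixed per-frame penalty `me_cost` for frames labeled as me. The names `dtw`, `level_building`, `me_cost`, and `max_levels` are hypothetical, and the nested DP over hand candidates and the grammar models are left out.

```python
from math import inf

ME = "me"  # hypothetical label for movement-epenthesis segments


def dtw(a, b):
    """Plain DTW distance between two 1-D sequences (absolute difference)."""
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = abs(a[i - 1] - b[j - 1]) + step
    return d[n][m]


def level_building(frames, signs, me_cost, max_levels):
    """Segment `frames` into sign and me segments, one segment per level.

    cost[l][t] is the best cost of covering frames[:t] with exactly l
    segments. Returns (list of (label, start, end), total cost).
    """
    T = len(frames)
    cost = [[inf] * (T + 1) for _ in range(max_levels + 1)]
    back = [[None] * (T + 1) for _ in range(max_levels + 1)]
    cost[0][0] = 0.0
    for l in range(1, max_levels + 1):
        for t in range(1, T + 1):
            for s in range(t):
                prev = cost[l - 1][s]
                if prev == inf:
                    continue
                seg = frames[s:t]
                # Virtual me option: a fixed per-frame penalty, no me model.
                options = [(me_cost * len(seg), ME)]
                options += [(dtw(seg, tpl), name) for name, tpl in signs.items()]
                c, label = min(options)
                if prev + c < cost[l][t]:
                    cost[l][t] = prev + c
                    back[l][t] = (s, label)
    # Choose the number of levels with the lowest total cost, then backtrack.
    best_l = min(range(1, max_levels + 1), key=lambda l: cost[l][T])
    labels, t = [], T
    for l in range(best_l, 0, -1):
        s, label = back[l][t]
        labels.append((label, s, t))
        t = s
    return labels[::-1], cost[best_l][T]


# Toy usage: two signs separated by one transitional frame (value 3.0).
signs = {"A": [1.0, 1.0, 1.0], "B": [5.0, 5.0, 5.0]}
frames = [1.0, 1.0, 1.0, 3.0, 5.0, 5.0, 5.0]
labels, total = level_building(frames, signs, me_cost=0.5, max_levels=3)
print(labels)  # [('A', 0, 3), ('me', 3, 4), ('B', 4, 7)]
```

In this toy setting the transitional frame is absorbed by the me label at cost 0.5. With a very large `me_cost`, the me option is effectively disabled and the frame is forced into one of the neighboring signs at a higher match cost, which loosely mirrors the abstract's contrast with unenhanced DP matching.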