American Sign Language recognition: reducing the complexity of the task with phoneme-based modeling and parallel hidden Markov models

  • Authors:
  • Christian Philipp Vogler;Dimitris N. Metaxas

  • Venue:
  • Ph.D. thesis
  • Year:
  • 2003

Abstract

In this thesis I present a framework for recognizing American Sign Language (ASL) from 3D data. The goal is to develop approaches that will scale well with increasing vocabulary sizes. Scalability is a major concern, because the computational treatment of ASL is a very complex undertaking. Two points particularly stand out: First, ASL is a highly inflected language, resulting in too many inflectional variants to model each appearance separately. Second, in ASL events occur both sequentially and simultaneously. Unlike speech recognition, ASL recognition cannot consider all possible combinations of simultaneous events explicitly, because of their sheer number. As a result, the computational treatment of ASL is much more complex than that of spoken languages. Reducing the complexity of the task requires a two-pronged approach, which encompasses work on both the modeling and the computational sides. On the modeling side, I tackle the many inflectional variants by breaking the signs down into their constituent phonemes, which are limited in number. I use the Movement-Hold phonological model for ASL as a guideline, and extend the parts of it that are not directly applicable to recognition systems. In addition, I recast it to describe simultaneous events in independent channels, so that it is no longer necessary to consider all their possible combinations. The result is a significant reduction in modeling complexity. On the recognition side, I propose parallel hidden Markov models (PaHMMs) as an extension of conventional hidden Markov models, and develop a PaHMM recognition algorithm specifically geared toward the properties of sign languages. PaHMMs are the computational counterpart to modeling simultaneous events in independent channels, and allow the channels to be put together on the fly at recognition time, instead of having to consider all their combinations a priori. I validate the modeling approach and the PaHMM recognition algorithm in a pilot study with experiments on 53-sign and 22-sign data sets. In the PaHMM experiments, the independent channels consist of the hand movements of both hands and the handshape of the strong hand. The results demonstrate the viability of both the phoneme modeling and the description of simultaneous events in independent channels.
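
The central computational idea, scoring independent channels separately and combining them only at recognition time, can be sketched in a few lines of code. The Python fragment below is an illustrative simplification, not the recognition algorithm developed in the thesis: it uses small discrete-observation HMMs (the thesis works with continuous 3D features), made-up sign names and channel labels, and random toy parameters. Each candidate sign has one HMM per channel, each channel's observation sequence is scored with Viterbi independently, and the per-channel log-likelihoods are summed under the channel-independence assumption before the best-scoring sign is selected.

    # Minimal sketch of PaHMM-style recognition under a channel-independence
    # assumption. All model names, state counts, and probabilities are
    # illustrative placeholders, not values from the thesis.
    import numpy as np

    def viterbi_log_likelihood(log_pi, log_A, log_B, obs):
        """Log-probability of the best state path for a discrete-observation HMM.

        log_pi: (S,)   initial state log-probabilities
        log_A:  (S, S) state transition log-probabilities
        log_B:  (S, V) emission log-probabilities over a discrete codebook
        obs:    sequence of observation symbols (ints in [0, V))
        """
        delta = log_pi + log_B[:, obs[0]]
        for t in range(1, len(obs)):
            delta = np.max(delta[:, None] + log_A, axis=0) + log_B[:, obs[t]]
        return float(np.max(delta))

    def score_sign(channel_hmms, channel_obs):
        """Sum per-channel Viterbi scores, i.e. treat the channels as
        independent (the PaHMM combination step)."""
        return sum(
            viterbi_log_likelihood(*channel_hmms[ch], channel_obs[ch])
            for ch in channel_obs
        )

    def recognize(lexicon, channel_obs):
        """Return the sign whose per-channel HMMs jointly score highest."""
        return max(lexicon, key=lambda sign: score_sign(lexicon[sign], channel_obs))

    if __name__ == "__main__":
        rng = np.random.default_rng(0)

        def random_hmm(n_states=3, n_symbols=4):
            # Toy parameters; a real system would train these from data.
            A = rng.dirichlet(np.ones(n_states), size=n_states)
            B = rng.dirichlet(np.ones(n_symbols), size=n_states)
            pi = rng.dirichlet(np.ones(n_states))
            return np.log(pi), np.log(A), np.log(B)

        channels = ["strong_hand_movement", "weak_hand_movement", "strong_hand_shape"]
        # Toy lexicon: two hypothetical signs, each with one HMM per channel.
        lexicon = {
            "SIGN_A": {ch: random_hmm() for ch in channels},
            "SIGN_B": {ch: random_hmm() for ch in channels},
        }
        # Toy observation sequences, one per channel (vector-quantized features).
        channel_obs = {ch: rng.integers(0, 4, size=10).tolist() for ch in channels}

        print(recognize(lexicon, channel_obs))

Because the channels are combined only through a sum of per-channel log-scores, combinations of simultaneous events never have to be enumerated ahead of time; they are assembled on the fly at recognition time, which is the complexity reduction the abstract refers to.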