Recent developments in visual sign language recognition

  • Authors:
  • Ulrich von Agris, Jörg Zieren, Ulrich Canzler, Britta Bauer, Karl-Friedrich Kraiss

  • Affiliations:
  • RWTH Aachen University, Institute of Man–Machine Interaction, Ahornstrasse 55, 52074 Aachen, Germany (all authors)

  • Venue:
  • Universal Access in the Information Society
  • Year:
  • 2008

Abstract

Research in the field of sign language recognition has made significant advances in recent years. These achievements provide the basis for future applications with the objective of supporting the integration of deaf people into hearing society. Translation systems, for example, could facilitate communication between deaf and hearing people in public situations. Further applications, such as user interfaces and automatic indexing of signed videos, become feasible. The current state of sign language recognition is roughly 30 years behind that of speech recognition, corresponding to the gradual transition from isolated to continuous recognition for small-vocabulary tasks. Research efforts have mainly focused on either robust feature extraction or statistical modeling of signs. However, current recognition systems are still designed for signer-dependent operation under laboratory conditions. This paper describes a comprehensive concept for robust visual sign language recognition that represents the recent developments in this field. The proposed recognition system aims for signer-independent operation and uses a single video camera for data acquisition to ensure user-friendliness. Since sign languages make use of both manual and facial means of expression, both channels are employed for recognition. For mobile operation in uncontrolled environments, sophisticated algorithms were developed that robustly extract manual and facial features. Manual feature extraction relies on a multiple hypotheses tracking approach to resolve ambiguities in hand positions. For facial feature extraction, an active appearance model is applied that identifies areas of interest such as the eye and mouth regions. In the next processing step, a numerical description of the facial expression, head pose, line of sight, and lip outline is computed. The system employs a resolution strategy for dealing with mutual occlusions of the signer’s hands and face. Classification is based on hidden Markov models, which compensate for variance in timing and amplitude in the articulation of a sign. The classification stage is designed for recognition of isolated signs as well as of continuous sign language. In the latter case, a stochastic language model can be utilized that considers unigram and bigram probabilities of single and successive signs. For statistical modeling, each sign is represented either as a whole (word model) or as a composition of smaller subunits, similar to phonemes in spoken languages. While recognition based on word models is limited to rather small vocabularies, subunit models open the door to large vocabularies. Achieving signer independence constitutes a challenging problem, as the articulation of a sign is subject to high interpersonal variance. This problem cannot be solved by simple feature normalization alone and must be addressed at the classification level. Therefore, dedicated adaptation methods known from speech recognition were implemented and modified to account for the specifics of sign languages. For rapid adaptation to unknown signers, the proposed recognition system employs a combined approach of maximum likelihood linear regression (MLLR) and maximum a posteriori (MAP) estimation.
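
Illustrative sketches

The multiple hypotheses tracking approach used for manual feature extraction can be pictured as follows: rather than committing to a single hand position per frame, the tracker carries several trajectory hypotheses forward and lets later frames disambiguate them. The Python sketch below is a minimal beam-search illustration under assumed inputs (lists of candidate (x, y) detections per frame, scored by a crude motion-smoothness penalty); it is not the authors’ implementation.

```python
import math

def track_multiple_hypotheses(frames, beam_width=5):
    """Track a hand across frames while keeping the `beam_width`
    best trajectory hypotheses instead of committing early.

    `frames` is a list of candidate (x, y) hand detections per frame;
    candidates may be ambiguous (other hand, face, skin-colored objects).
    Returns the trajectory with the best motion-smoothness score.
    """
    # One initial hypothesis (score, trajectory) per first-frame candidate.
    hypotheses = [(0.0, [c]) for c in frames[0]]

    for candidates in frames[1:]:
        extended = []
        for score, traj in hypotheses:
            for cand in candidates:
                # Penalize large frame-to-frame jumps (crude motion model).
                jump = math.dist(traj[-1], cand)
                extended.append((score - jump, traj + [cand]))
        # Prune: keep only the best-scoring hypotheses (beam search).
        extended.sort(key=lambda h: h[0], reverse=True)
        hypotheses = extended[:beam_width]

    return max(hypotheses, key=lambda h: h[0])[1]

# Example: three frames, each with two ambiguous candidates.
frames = [
    [(10, 10), (50, 80)],
    [(12, 11), (48, 79)],
    [(14, 13), (47, 81)],
]
print(track_multiple_hypotheses(frames))  # returns the smoothest trajectory
```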
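
Classification of an isolated sign with hidden Markov models reduces to scoring the observed feature sequence under each sign’s model and choosing the maximum. The sketch below uses a discrete-output HMM and the forward algorithm; the toy models and sign names are invented, and a full system would use continuous observation densities over the extracted features.

```python
import numpy as np

def forward_likelihood(obs, pi, A, B):
    """P(obs | model) for a discrete-output HMM via the forward algorithm.

    pi: (N,) initial state probabilities
    A:  (N, N) transition matrix, A[i, j] = P(state j | state i)
    B:  (N, M) emission matrix, B[i, k] = P(symbol k | state i)
    """
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

def classify(obs, models):
    """Return the sign whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda sign: forward_likelihood(obs, *models[sign]))

# Toy two-state, two-symbol models for two invented signs.
models = {
    "HELLO": (np.array([0.9, 0.1]),
              np.array([[0.7, 0.3], [0.2, 0.8]]),
              np.array([[0.9, 0.1], [0.1, 0.9]])),
    "THANKS": (np.array([0.5, 0.5]),
               np.array([[0.5, 0.5], [0.5, 0.5]]),
               np.array([[0.2, 0.8], [0.8, 0.2]])),
}
print(classify([0, 0, 1, 1], models))
```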
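
For continuous recognition, the stochastic language model weights hypothesized sign sequences by unigram and bigram statistics, i.e. P(w1, ..., wn) = P(w1) * P(w2|w1) * ... * P(wn|wn-1). A minimal sketch follows; the add-one smoothing is an illustrative assumption, not a detail from the paper.

```python
from collections import Counter

class BigramModel:
    """Uni-/bigram sign language model; add-one smoothing is an
    illustrative assumption."""

    def __init__(self, sentences):
        self.unigrams = Counter(w for s in sentences for w in s)
        self.bigrams = Counter((a, b) for s in sentences
                               for a, b in zip(s, s[1:]))
        self.vocab = len(self.unigrams)
        self.total = sum(self.unigrams.values())

    def p_unigram(self, w):
        return (self.unigrams[w] + 1) / (self.total + self.vocab)

    def p_bigram(self, prev, w):
        return (self.bigrams[(prev, w)] + 1) / (self.unigrams[prev] + self.vocab)

    def sequence_prob(self, signs):
        """P(w1) times the bigram probabilities of successive signs."""
        p = self.p_unigram(signs[0])
        for prev, w in zip(signs, signs[1:]):
            p *= self.p_bigram(prev, w)
        return p

corpus = [["I", "GO", "HOME"], ["I", "GO", "SCHOOL"], ["YOU", "GO", "HOME"]]
lm = BigramModel(corpus)
print(lm.sequence_prob(["I", "GO", "HOME"]))
```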
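
Subunit modeling mirrors the phoneme-based lexica of speech recognition: each sign is described as a sequence of subunits with their own trained models, so the vocabulary can grow by adding lexicon entries instead of training new whole-sign models. The sketch below composes a whole-sign model by concatenating per-subunit state lists; the lexicon and subunit names are hypothetical.

```python
# Hypothetical subunit lexicon: each sign is a sequence of subunit IDs,
# analogous to a pronunciation dictionary in speech recognition.
LEXICON = {
    "HOUSE": ["flat-hand", "move-down", "hands-apart"],
    "HOME":  ["flat-hand", "move-down"],
}

# Hypothetical trained subunit models, reduced here to named state lists;
# in a real system each state carries transition and emission parameters.
SUBUNIT_STATES = {
    "flat-hand":   ["fh1", "fh2"],
    "move-down":   ["md1", "md2", "md3"],
    "hands-apart": ["ha1", "ha2"],
}

def compose_sign_states(sign):
    """Build a left-to-right whole-sign model by concatenating the state
    sequences of its subunits. Adding a new sign to the vocabulary only
    needs a lexicon entry, not new training data."""
    return [state
            for unit in LEXICON[sign]
            for state in SUBUNIT_STATES[unit]]

print(compose_sign_states("HOUSE"))
# ['fh1', 'fh2', 'md1', 'md2', 'md3', 'ha1', 'ha2']
```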
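
The combined signer adaptation can be sketched for a single Gaussian mean: MLLR first applies a shared affine transform estimated from all adaptation data; MAP then interpolates the transformed mean with the sample mean of the frames aligned to that Gaussian, weighted by the amount of data available. The single-Gaussian simplification, the given transform, and the weight tau below are assumptions; the paper’s exact formulation differs.

```python
import numpy as np

def mllr_map_adapt(mu, adapt_frames, W, b, tau=10.0):
    """Adapt one Gaussian mean to a new signer.

    1. MLLR: apply a shared affine transform, mu' = W @ mu + b
       (W and b are assumed given; normally estimated from all data).
    2. MAP:  interpolate mu' with the sample mean of the frames
       aligned to this Gaussian; tau weights the prior.
    """
    mu_mllr = W @ mu + b
    n = len(adapt_frames)
    if n == 0:
        return mu_mllr                        # no aligned data: MLLR only
    sample_mean = np.mean(adapt_frames, axis=0)
    # Standard MAP mean update: more data shifts the estimate
    # from the (transformed) prior mean toward the sample mean.
    return (tau * mu_mllr + n * sample_mean) / (tau + n)

mu = np.array([0.0, 0.0])
W = np.eye(2) * 1.1               # assumed transform for illustration
b = np.array([0.2, -0.1])
frames = np.array([[0.5, 0.1], [0.6, 0.0], [0.4, 0.2]])
print(mllr_map_adapt(mu, frames, W, b))
```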