Extending the bioinspired hierarchical temporal memory paradigm for sign language recognition

Authors:
David Rozado;Francisco B. Rodriguez;Pablo Varona
Affiliations:
GNB group at Escuela Politécnica Superior, Calle Francisco Tomás y Valiente, 11, Universidad Autónoma de Madrid, Madrid 28049, Spain;GNB group at Escuela Politécnica Superior, Calle Francisco Tomás y Valiente, 11, Universidad Autónoma de Madrid, Madrid 28049, Spain;GNB group at Escuela Politécnica Superior, Calle Francisco Tomás y Valiente, 11, Universidad Autónoma de Madrid, Madrid 28049, Spain
Venue:
Neurocomputing
Year:
2012

Abstract

Sign language recognition (SLR) using the spatial positions and arrangements of the hands over time is a challenging multivariable time series recognition problem with several potential applications. Here we explore, for SLR purposes, a hierarchically connected network of nodes based on a Bayesian-like paradigm known as hierarchical temporal memory (HTM), which models neocortical principles of organization and information coding. HTM is a broad paradigm for pattern recognition, control, attention and forward prediction that exploits the hierarchy in time and space existing in the physical world during both learning and inference. In this work we focus on HTM capabilities for pattern recognition. We extend the traditional HTM paradigm with an original top node in order to improve HTM's performance in problems where instances unfold over time. The extended top node stores and compares sequences of spatio-temporally codified inputs to handle the temporal evolution of instances in sign language. Sequence comparison is carried out using the Needleman-Wunsch algorithm for sequence alignment, which employs dynamic programming. We compare the performance of the extended HTM with traditional HTMs and with machine learning algorithms routinely used in the literature for SLR. The extended HTM improves on the performance of traditional HTMs for SLR, reaching 91% recognition accuracy on a data set of 95 categories of Australian sign language. When sufficient training instances are available, the extended HTM matches or outperforms state-of-the-art methods for SLR, such as Hidden Markov Models or Metafeatures T-Classes, without using a language model or pre-processing the sensor data. The extended HTM employs relatively small feature vectors in comparison to methods in the literature. Our method learns the spatio-temporal data structures and transitions that occur in the data without depending on manually predefined features to be searched for, and it works well in real time.
These results suggest that the extended HTM approach is a valid bioinspired alternative to existing SLR engines and that it can be successfully applied to other machine learning tasks whose input instances also unfold over time.
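
The sequence comparison step mentioned in the abstract can be illustrated with a minimal sketch of Needleman-Wunsch global alignment scoring. This is not the authors' implementation: the scoring values (`match`, `mismatch`, `gap`), the function names, the nearest-sequence classification rule in `classify`, and the use of integers as stand-ins for the top node's codified inputs are all assumptions made for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of sequences a and b (dynamic programming).

    The scoring parameters are illustrative assumptions, not values
    taken from the paper.
    """
    n, m = len(a), len(b)
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    # Initially that prefix is empty, so b[:j] is aligned against j gaps.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        # Aligning a[:i] with an empty prefix of b costs i gaps.
        curr = [i * gap] + [0] * m
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # a[i-1] aligned with a gap
                curr[j - 1] + gap,   # b[j-1] aligned with a gap
            )
        prev = curr
    return prev[m]


def classify(query, stored):
    """Return the label of the stored sequence that aligns best with query.

    `stored` is a list of (label, sequence) pairs. This nearest-sequence
    rule is a hypothetical stand-in for how a top node might compare an
    incoming sequence against the sequences it has stored.
    """
    best_label, _ = max(
        stored, key=lambda item: needleman_wunsch_score(query, item[1])
    )
    return best_label


if __name__ == "__main__":
    # Integers stand in for codified inputs; labels and values are made up.
    stored = [("hello", [1, 2, 3, 4]), ("thanks", [5, 6, 7, 8])]
    print(classify([1, 2, 4], stored))
```

Keeping only two rows of the dynamic programming table is enough when just the score is needed; recovering the alignment itself would require the full table and a traceback.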