To overcome the computational complexity of the asynchronous hidden Markov model (AHMM), we present a novel multidimensional dynamic time warping (DTW) algorithm for hybrid fusion of asynchronous data. We show that this multidimensional DTW concept requires significantly less decoding time than the AHMM while providing the same data fusion flexibility, so it can be applied to a wide range of real-time multimodal classification tasks. Because it optimally exploits mutual information during decoding even when the input streams are not synchronous, our algorithm outperforms both late and early fusion techniques in a challenging bimodal speech and gesture fusion experiment.
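For readers unfamiliar with DTW, the following is a minimal sketch of the classic two-sequence recursion that the multidimensional variant builds on. It is not the algorithm presented in this paper: it aligns only one pair of scalar sequences, and the function name `dtw_distance`, the absolute-difference local cost, and the three allowed step directions are assumptions chosen for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two scalar sequences (illustrative assumption,
    not the multidimensional algorithm described in the abstract)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: absolute difference of the two aligned samples.
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three predecessor alignments:
            # advance only in a, advance only in b, or advance in both.
            cost[i][j] = local + min(
                cost[i - 1][j],
                cost[i][j - 1],
                cost[i - 1][j - 1],
            )
    return cost[n][m]
```

Because the warping path may repeat samples, two sequences that differ only in timing, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost. This tolerance to timing differences is the property that makes DTW a candidate for fusing streams that are not synchronous.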