We propose a method that integrates the recognition of speech and accompanying gestures such as pointing. Simultaneously produced speech and pointing complement each other, so integrating the two modalities can improve the recognition of both. As an example of such multimodal input, we chose the explanation of a geometry problem: while the problem was being solved, speech and fingertip movements were recorded with a close-talking microphone and a 3D position sensor. To capture the correspondence between an utterance and its gestures, we model the probability distribution of the time gap between their starting times, and we propose an integrative recognition method based on this distribution. With this method, we obtained an improvement of approximately 3 points in both speech and fingertip-movement recognition performance.
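The integration idea can be sketched as joint rescoring: each speech hypothesis and each gesture hypothesis carries its own recognizer score, and a time-gap term rewards pairs whose onsets align the way co-verbal speech and pointing typically do. The sketch below is illustrative only; the Gaussian gap model, its parameters, the weight `alpha`, and all labels and scores are assumptions for demonstration, not the paper's actual distribution or data.

```python
import math

def time_gap_logprob(gap_ms, mean=200.0, std=150.0):
    # Gaussian log-density over the lag between utterance onset and
    # gesture onset. mean/std are illustrative, not fitted values.
    return -0.5 * ((gap_ms - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def rescore(speech_hyps, gesture_hyps, alpha=1.0):
    """Jointly rescore (speech, gesture) hypothesis pairs.

    Each hypothesis is (label, log_score, onset_ms). The joint score adds
    the two unimodal log-scores plus a weighted time-gap term, so a pair
    whose onsets are plausibly synchronized can outrank the unimodal best.
    """
    best = None
    for s_label, s_score, s_onset in speech_hyps:
        for g_label, g_score, g_onset in gesture_hyps:
            joint = s_score + g_score + alpha * time_gap_logprob(g_onset - s_onset)
            if best is None or joint > best[0]:
                best = (joint, s_label, g_label)
    return best

# Hypothetical N-best lists: (label, log_score, onset time in ms).
speech = [("this triangle", -2.0, 0.0), ("these angles", -1.8, 0.0)]
gesture = [("point@A", -1.0, 180.0), ("point@B", -1.5, 900.0)]
print(rescore(speech, gesture))
```

Here the gesture onset 180 ms after the utterance falls near the mode of the assumed gap distribution, so the pairing with `point@A` wins even though `point@B` might be competitive on gesture score alone.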