We propose a method that integrates the recognition of speech and accompanying gestures such as pointing. Simultaneously produced speech and pointing complement each other, so integrating the two modalities can improve the recognition of both. As an example of such multimodal input, we chose the explanation of a geometry problem: while the problem was being solved, speech and fingertip movements were recorded with a close-talking microphone and a 3D position sensor. To capture the correspondence between an utterance and its gestures, we model the probability distribution of the time gap between their starting times, and we propose an integrative recognition method based on this distribution. With this method, we obtained an improvement of approximately 3 points in both speech and fingertip-movement recognition performance.
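The integration idea can be sketched as joint rescoring: each speech hypothesis and each gesture hypothesis carries its own recognizer score, and a time-gap term rewards pairs whose onsets align the way co-verbal speech and pointing typically do. The sketch below is illustrative only; the Gaussian gap model, its parameters, the weight `alpha`, and all labels and scores are assumptions for demonstration, not the paper's actual distribution or data.

```python
import math

def time_gap_logprob(gap_ms, mean=200.0, std=150.0):
    # Gaussian log-density over the lag between utterance onset and
    # gesture onset. mean/std are illustrative, not fitted values.
    return -0.5 * ((gap_ms - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def rescore(speech_hyps, gesture_hyps, alpha=1.0):
    """Jointly rescore (speech, gesture) hypothesis pairs.

    Each hypothesis is (label, log_score, onset_ms). The joint score adds
    the two unimodal log-scores plus a weighted time-gap term, so a pair
    whose onsets are plausibly synchronized can outrank the unimodal best.
    """
    best = None
    for s_label, s_score, s_onset in speech_hyps:
        for g_label, g_score, g_onset in gesture_hyps:
            joint = s_score + g_score + alpha * time_gap_logprob(g_onset - s_onset)
            if best is None or joint > best[0]:
                best = (joint, s_label, g_label)
    return best

# Hypothetical N-best lists: (label, log_score, onset time in ms).
speech = [("this triangle", -2.0, 0.0), ("these angles", -1.8, 0.0)]
gesture = [("point@A", -1.0, 180.0), ("point@B", -1.5, 900.0)]
print(rescore(speech, gesture))
```

Here the gesture onset 180 ms after the utterance falls near the mode of the assumed gap distribution, so the pairing with `point@A` wins even though `point@B` might be competitive on gesture score alone.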