Classification of natural hand gestures is usually approached by applying pattern recognition to the movements of the hand alone. However, the gesture categories most frequently cited in the psychology literature are fundamentally multimodal: their definitions refer to the surrounding linguistic context. We address the question of whether gestures are naturally multimodal, or whether they can be classified from hand-movement data alone. First, we describe an empirical study showing that removing auditory information significantly impairs human raters' ability to classify gestures. We then present an automatic gesture classification system based solely on an n-gram model of the linguistic context; the system is intended to supplement a visual classifier, but on its own it achieves 66% accuracy on a three-class classification problem. This exceeds the accuracy human raters achieve when presented with the same information.
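To make the approach concrete, here is a minimal sketch of a class-conditional n-gram classifier of the kind the abstract describes: one smoothed bigram language model per gesture class, with the class whose model best explains the transcript context winning. This is an illustration under stated assumptions, not the paper's implementation; the `NgramGestureClassifier` name, the bigram order, Laplace smoothing, and the three-way label set (deictic/iconic/beat) are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import math


class NgramGestureClassifier:
    """Class-conditional bigram model over transcript context.

    Hypothetical sketch: the label set, n-gram order, and smoothing
    are assumptions, not the paper's exact configuration.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                    # Laplace smoothing constant
        self.bigrams = defaultdict(Counter)   # label -> bigram counts
        self.unigrams = defaultdict(Counter)  # label -> left-context counts
        self.priors = Counter()               # label -> training-example count
        self.vocab = set()

    def _tokens(self, text):
        # Pad with sentence boundaries so bigrams cover the edges.
        return ["<s>"] + text.lower().split() + ["</s>"]

    def fit(self, contexts, labels):
        for text, label in zip(contexts, labels):
            toks = self._tokens(text)
            self.vocab.update(toks)
            self.priors[label] += 1
            for prev, cur in zip(toks, toks[1:]):
                self.bigrams[label][(prev, cur)] += 1
                self.unigrams[label][prev] += 1

    def _log_prob(self, toks, label):
        # log P(label) + sum of smoothed log P(cur | prev, label)
        v = len(self.vocab)
        lp = math.log(self.priors[label] / sum(self.priors.values()))
        for prev, cur in zip(toks, toks[1:]):
            num = self.bigrams[label][(prev, cur)] + self.alpha
            den = self.unigrams[label][prev] + self.alpha * v
            lp += math.log(num / den)
        return lp

    def predict(self, text):
        toks = self._tokens(text)
        return max(self.priors, key=lambda lab: self._log_prob(toks, lab))


if __name__ == "__main__":
    clf = NgramGestureClassifier()
    clf.fit(
        ["look at this one here",
         "it spins around like this",
         "and then we move on"],
        ["deictic", "iconic", "beat"],  # hypothetical three-class label set
    )
    print(clf.predict("that one over there"))
```

A generative bigram model is only one way to exploit linguistic context; a discriminative classifier over n-gram features would be a natural alternative, and nothing here depends on the specific three class labels chosen for the toy training data.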