Multimodal integration - a statistical view

  • Authors:
  • Lizhong Wu; S. L. Oviatt; P. R. Cohen

  • Affiliations:
  • Dept. of Comput. Sci. & Eng., Oregon Graduate Inst. of Sci. & Technol., Portland, OR

  • Venue:
  • IEEE Transactions on Multimedia
  • Year:
  • 1999

Abstract

We present a statistical approach to developing multimodal recognition systems and, in particular, to integrating the posterior probabilities of the parallel input signals involved in a multimodal system. We first identify the primary factors that influence multimodal recognition performance by evaluating the multimodal recognition probabilities. We then develop two techniques, an estimate approach and a learning approach, designed to optimize recognition accuracy during the multimodal integration process. We evaluate these methods using Quickset, a speech/gesture multimodal system, and report evaluation results based on an empirical corpus collected with Quickset. From an architectural perspective, the integration technique presented offers enhanced robustness, and it is premised on more realistic assumptions than previous multimodal systems using semantic fusion. From a methodological standpoint, the evaluation techniques we describe provide a valuable tool for assessing multimodal systems.
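The abstract describes integrating posterior probabilities from parallel recognizers (here, speech and gesture). The paper's specific estimation and learning methods are not given in this record, so the following is only a minimal illustrative sketch of one common way such integration can be done: a weighted log-linear (weighted-product) combination of per-interpretation posteriors, followed by renormalization. The function name, the example labels, and the weight values are all hypothetical, not taken from the paper.

```python
import math

def fuse_posteriors(speech_post, gesture_post, w_speech=0.6, w_gesture=0.4):
    """Weighted log-linear fusion of posterior distributions from two
    parallel recognizers over the same set of candidate interpretations.

    speech_post, gesture_post: dicts mapping interpretation -> posterior.
    w_speech, w_gesture: modality weights (hypothetical values; in a
    learning approach these would be tuned on a labeled corpus).
    """
    fused = {}
    for label in speech_post:
        # Combine in log space: a weighted product of the two posteriors.
        score = (w_speech * math.log(speech_post[label]) +
                 w_gesture * math.log(gesture_post[label]))
        fused[label] = math.exp(score)
    # Renormalize so the fused scores again form a distribution.
    total = sum(fused.values())
    return {label: v / total for label, v in fused.items()}

# Hypothetical Quickset-style interpretations of one multimodal command:
speech = {"create_unit": 0.7, "move_unit": 0.3}
gesture = {"create_unit": 0.4, "move_unit": 0.6}
fused = fuse_posteriors(speech, gesture)
best = max(fused, key=fused.get)
```

In this sketch the speech recognizer's higher confidence in "create_unit" outweighs the gesture recognizer's preference for "move_unit" because the speech modality carries the larger weight; learning those weights from data is one way a "learning approach" to integration could be realized.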