To improve the robustness of multimodal input interpretation, this paper presents a new salience-driven approach. The approach is based on the observation that, during multimodal conversation, deictic gestures (e.g., pointing or circling) on a graphical display can signal which part of the application's physical world (i.e., the representation of the domain and task) is salient during the communication. This salient part of the physical world primes what users tend to communicate in speech and can in turn be used to constrain hypotheses for spoken language understanding, thus improving overall input interpretation. Our experimental results indicate the potential of this approach to reduce word error rate and improve concept identification in multimodal conversation.
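
As a rough illustration of the idea of using gesture-derived salience to constrain speech hypotheses, the sketch below re-ranks an n-best list from a recognizer. It is not the method evaluated in the paper: the entity names, the lexicon, the `boost` and `weight` parameters, and the additive scoring rule are all assumptions made for this example.

```python
def salience_from_gesture(selected, entities, boost=5.0):
    """Assumed salience model: entities touched by the gesture get
    `boost` times the base weight; weights are normalized to sum to 1."""
    weights = {e: (boost if e in selected else 1.0) for e in entities}
    total = sum(weights.values())
    return {e: w / total for e, w in weights.items()}


def rescore(nbest, salience, lexicon, weight=1.0):
    """Re-rank (hypothesis, log_score) pairs.

    Each word contributes the salience of its most salient associated
    entity (0.0 if it names no entity), scaled by `weight`.
    """
    rescored = []
    for text, log_score in nbest:
        bonus = sum(
            max((salience[e] for e in lexicon.get(word, ())), default=0.0)
            for word in text.split()
        )
        rescored.append((text, log_score + weight * bonus))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical domain: the user circles house_1 while speaking.
entities = ["house_1", "house_2", "train_station"]
lexicon = {"house": ["house_1", "house_2"], "station": ["train_station"]}
salience = salience_from_gesture({"house_1"}, entities)

# Hypothetical recognizer output: the misrecognition scores slightly higher.
nbest = [
    ("how much is this horse", -10.0),
    ("how much is this house", -10.5),
]
best = rescore(nbest, salience, lexicon)[0][0]
print(best)
```

In this toy setup the gesture raises the salience of `house_1`, so the hypothesis that mentions "house" overtakes the acoustically preferred "horse" reading. A real system would more plausibly fold salience into the language model or the understanding component rather than apply a flat additive bonus.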