Gesture is a non-verbal modality that can contribute crucial information to the understanding of natural language. But not all gestures are informative, and noncommunicative hand motions may confuse natural language processing (NLP) and impede learning. People have little difficulty ignoring irrelevant hand movements and focusing on meaningful gestures, suggesting that an automatic system could also be trained to perform this task. However, the informativeness of a gesture is context-dependent and labeling enough data to cover all cases would be expensive. We present conditional modality fusion, a conditional hidden-variable model that learns to predict which gestures are salient for coreference resolution, the task of determining whether two noun phrases refer to the same semantic entity. Moreover, our approach uses only coreference annotations, and not annotations of gesture salience itself. We show that gesture features improve performance on coreference resolution, and that by attending only to gestures that are salient, our method achieves further significant gains. In addition, we show that the model of gesture salience learned in the context of coreference accords with human intuition, by demonstrating that gestures judged to be salient by our model can be used successfully to create multimedia keyframe summaries of video. These summaries are similar to those created by human raters, and significantly outperform summaries produced by baselines from the literature.
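The idea of a conditional hidden-variable model that gates gesture features by salience can be illustrated with a minimal sketch. The abstract does not give the model's functional form, so everything below is an assumption made for illustration: a log-linear potential over a coreference label y in {-1, +1} and a binary hidden salience variable h in {0, 1}, the split of features into `x_verbal`, `x_gesture`, and `x_salience`, and all function and weight names. It is not the authors' implementation. The sketch shows how h can be summed out to score coreference, and how a posterior over h can be read off without any salience labels.

```python
import math


def dot(w, x):
    """Inner product of two equal-length lists of floats."""
    return sum(wi * xi for wi, xi in zip(w, x))


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def potential(y, h, x_verbal, x_gesture, x_salience,
              w_verbal, w_gesture, w_salience):
    """Assumed potential psi(y, h, x).

    Verbal features always vote on the label y. Gesture features vote
    only when the hidden variable h marks the gesture as salient, and
    the salience features then contribute a label-independent score.
    """
    score = y * dot(w_verbal, x_verbal)
    if h == 1:
        score += y * dot(w_gesture, x_gesture)
        score += dot(w_salience, x_salience)
    return score


def coreference_probability(x_verbal, x_gesture, x_salience,
                            w_verbal, w_gesture, w_salience):
    """P(y = +1 | x), marginalizing over the hidden salience h."""
    args = (x_verbal, x_gesture, x_salience, w_verbal, w_gesture, w_salience)
    pos = log_sum_exp([potential(+1, h, *args) for h in (0, 1)])
    neg = log_sum_exp([potential(-1, h, *args) for h in (0, 1)])
    return math.exp(pos - log_sum_exp([pos, neg]))


def salience_posterior(x_verbal, x_gesture, x_salience,
                       w_verbal, w_gesture, w_salience):
    """P(h = 1 | x), marginalizing over the coreference label y."""
    args = (x_verbal, x_gesture, x_salience, w_verbal, w_gesture, w_salience)
    on = log_sum_exp([potential(y, 1, *args) for y in (-1, +1)])
    off = log_sum_exp([potential(y, 0, *args) for y in (-1, +1)])
    return math.exp(on - log_sum_exp([on, off]))
```

Under these assumptions, the weights could be fit by maximizing the log of `coreference_probability` (or its complement) on pairs labeled only for coreference, since h never needs to be observed. `salience_posterior` is the kind of quantity that could then rank gestures for a downstream use such as keyframe selection.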