Prosody-based automatic segmentation of speech into sentences and topics. Speech Communication, special issue on accessing information in spoken audio (Shriberg et al., 2000).
Multimodal human discourse: gesture and speech. ACM Transactions on Computer-Human Interaction (TOCHI).
Finite-state multimodal parsing and understanding. Proceedings of the 18th Conference on Computational Linguistics (COLING 2000), Volume 1.
Visual and linguistic information in gesture classification. Proceedings of the 6th International Conference on Multimodal Interfaces (ICMI '04).
Multimodal model integration for sentence unit detection. Proceedings of the 6th International Conference on Multimodal Interfaces (ICMI '04).
Non-verbal cues for discourse structure. Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL 2001).
Towards a model of face-to-face grounding. Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL 2003), Volume 1.
Gesture improves coreference resolution. Proceedings of the Human Language Technology Conference of the NAACL (NAACL 2006), Companion Volume: Short Papers.
Gesture features for coreference resolution. Proceedings of the Third International Conference on Machine Learning for Multimodal Interaction (MLMI 2006).
Although the natural-language-processing community has focused largely on text, face-to-face spoken language is ubiquitous and offers the potential for breakthrough applications in domains such as meetings, lectures, and presentations. Because spontaneous spoken language is typically more disfluent and less structured than written text, identifying features from additional modalities that can aid language understanding may be critical. However, due to the long-standing emphasis on text datasets, there has been relatively little work on non-textual features in unconstrained natural language; prosody is the most studied non-textual modality (e.g., Shriberg et al., 2000).
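To make the prosody reference concrete, the sketch below computes one classic prosodic cue for sentence segmentation: the pause duration at each word boundary, derived from a time-aligned transcript. This is an illustrative Python toy, not the feature set of Shriberg et al. (2000); the Word record and pause_features helper are hypothetical names introduced here for illustration.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Word:
        text: str
        start: float  # start time in seconds
        end: float    # end time in seconds

    def pause_features(words: List[Word]) -> List[float]:
        # Pause length (in seconds) following each word; long inter-word
        # pauses tend to correlate with sentence boundaries in speech.
        return [max(0.0, nxt.start - cur.end)
                for cur, nxt in zip(words, words[1:])]

    words = [Word("so", 0.00, 0.18), Word("yeah", 0.20, 0.55),
             Word("anyway", 1.40, 1.95)]  # long pause before "anyway"
    print([round(p, 2) for p in pause_features(words)])  # [0.02, 0.85]

In a segmentation system, such pause values would be combined with other prosodic cues (e.g., pitch and duration patterns) as classifier features at each word boundary.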