In this paper, we adopt a direct modeling approach that uses conversational gesture cues to detect sentence boundaries, called SUs, in videotaped conversations. We treat SU detection as a classification task: at each inter-word boundary, the classifier decides whether an SU boundary is present. In addition to gesture cues, we use prosodic and lexical knowledge sources. In this first investigation, we find that gesture features complement the prosodic and lexical knowledge sources for this task. The model that combines all of the knowledge sources achieves the lowest overall SU detection error rate.
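The per-boundary classification framing can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three features (one hypothetical cue each for prosody, lexical context, and gesture), the toy data, and the use of a plain logistic regression as a stand-in classifier. The abstract does not specify the features or the classifier, and the numbers below are not results from the paper.

```python
import math

def boundary_features(pause_sec, next_is_filler, hands_at_rest):
    """Build one feature vector for an inter-word boundary.

    Hypothetical cues, one per knowledge source:
      pause_sec      - prosodic: silence length after the word
      next_is_filler - lexical: the following word is a filler such as "uh"
      hands_at_rest  - gesture: the speaker's hands are in rest position
    The leading 1.0 is the bias term.
    """
    return [1.0, pause_sec, float(next_is_filler), float(hands_at_rest)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, x):
    """Estimated probability that boundary x is an SU boundary."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def train(examples, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights by batch gradient descent."""
    w = [0.0] * len(examples[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(examples, labels):
            err = score(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(examples) for wi, g in zip(w, grad)]
    return w

def is_su_boundary(w, x, threshold=0.5):
    """Binary decision for a single inter-word boundary."""
    return score(w, x) >= threshold

# Toy, made-up training data: one row per inter-word boundary.
X = [
    boundary_features(0.05, False, False),
    boundary_features(0.80, False, True),
    boundary_features(0.10, True, False),
    boundary_features(0.60, False, True),
    boundary_features(0.02, False, False),
    boundary_features(0.90, False, False),
]
y = [0, 1, 0, 1, 0, 1]  # 1 = SU boundary, 0 = no boundary

weights = train(X, y)
predictions = [is_su_boundary(weights, x) for x in X]
print(predictions)
```

Each boundary is scored independently from the concatenated feature vector, so adding or removing a knowledge source in this sketch amounts to adding or removing columns before training.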