Machine Learning
BoosTexter: A Boosting-based System for Text Categorization
Machine Learning - Special issue on information retrieval
Prosody-based automatic segmentation of speech into sentences and topics
Speech Communication - Special issue on accessing information in spoken audio
Fast and Robust Features for Prosodic Classification
TSD '99 Proceedings of the Second International Workshop on Text, Speech and Dialogue
Using conditional random fields for sentence boundary detection in speech
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
A Comparison of Language Models for Dialog Act Segmentation of Meeting Transcripts
TSD '08 Proceedings of the 11th international conference on Text, Speech and Dialogue
Multi-view semi-supervised learning for dialog act segmentation of speech
IEEE Transactions on Audio, Speech, and Language Processing
The CALO meeting assistant system
IEEE Transactions on Audio, Speech, and Language Processing
We explore the use of prosodic features beyond pauses, including duration, pitch, and energy features, for automatic sentence segmentation of ICSI meeting data. We examine two different approaches to boundary classification: score-level combination of independent language and prosodic models using HMMs, and feature-level combination of models using a boosting-based method (BoosTexter). We report classification results for reference word transcripts as well as for transcripts from a state-of-the-art automatic speech recognizer (ASR). We also compare results using the lexical model plus a pause-only prosody model, versus results using additional prosodic features. Results show that (1) information from pauses is important, including pause duration both at the boundary and at the previous and following word boundaries; (2) adding duration, pitch, and energy features yields significant improvement over pause alone; (3) the integrated boosting-based model performs better than the HMM for ASR conditions; (4) training the boosting-based model on recognized words yields further improvement.
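As a rough illustration of what score-level combination means, the sketch below merges two independent per-word boundary posteriors (one lexical, one prosodic) by log-linear interpolation and then thresholds the result. This is a simplified stand-in, not the HMM-based combination used in the paper: the function names `combine_scores` and `segment`, the interpolation weight, the threshold, and the toy probabilities are all assumptions made for illustration.

```python
import math


def combine_scores(p_lex, p_pros, weight=0.5):
    """Log-linearly interpolate two boundary posteriors (hypothetical helper).

    p_lex and p_pros are P(boundary) from a lexical and a prosodic model.
    weight is the share given to the lexical model (an assumed tuning knob).
    """
    eps = 1e-12  # clamp so that log(0) never occurs
    p_lex = min(max(p_lex, eps), 1 - eps)
    p_pros = min(max(p_pros, eps), 1 - eps)
    log_b = weight * math.log(p_lex) + (1 - weight) * math.log(p_pros)
    log_n = weight * math.log(1 - p_lex) + (1 - weight) * math.log(1 - p_pros)
    # Renormalize over the two classes (boundary / no boundary).
    m = max(log_b, log_n)
    b = math.exp(log_b - m)
    n = math.exp(log_n - m)
    return b / (b + n)


def segment(words, p_lex, p_pros, weight=0.5, threshold=0.5):
    """Split words into sentences wherever the combined posterior exceeds threshold."""
    sentences, current = [], []
    for word, pl, pp in zip(words, p_lex, p_pros):
        current.append(word)
        if combine_scores(pl, pp, weight) > threshold:
            sentences.append(current)
            current = []
    if current:  # keep trailing words that were never closed by a boundary
        sentences.append(current)
    return sentences


# Toy posteriors, invented for this example only.
words = ["okay", "so", "we", "start"]
lexical = [0.9, 0.1, 0.2, 0.8]
prosodic = [0.8, 0.2, 0.1, 0.9]
result = segment(words, lexical, prosodic)
```

Feature-level combination, by contrast, would pass the lexical and prosodic features together to a single classifier (BoosTexter in the paper) rather than merging the outputs of two separately trained models.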