Sentence boundary detection in speech is important for enriching speech recognition output, making it easier for humans to read and for downstream modules to process. In previous work, we developed hidden Markov model (HMM) and maximum entropy (Maxent) classifiers that integrate textual and prosodic knowledge sources for detecting sentence boundaries. In this paper, we evaluate the use of a conditional random field (CRF) for this task and compare its results with those of our earlier models. We evaluate across two corpora (conversational telephone speech and broadcast news speech) on both human transcriptions and speech recognition output. In general, our CRF model yields a lower error rate than the HMM and Maxent models on the NIST sentence boundary detection task in speech, although the best results are achieved by three-way voting among the classifiers. This probably occurs because each model has different strengths and weaknesses in modeling the knowledge sources.
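To make the three-way voting concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes each classifier emits one hard decision per inter-word position and that the decisions are combined by simple majority; the abstract does not say whether hard decisions or posterior probabilities are combined. The function name `majority_vote`, the labels `"B"` (boundary) and `"NB"` (no boundary), and the example predictions are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(*predictions: Sequence[str]) -> List[str]:
    """Combine per-position labels from several classifiers by majority.

    Assumption: every classifier labels the same inter-word positions,
    so all prediction sequences have equal length.
    """
    if not predictions:
        raise ValueError("at least one prediction sequence is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all prediction sequences must have the same length")

    voted = []
    for labels in zip(*predictions):
        # Pick the most frequent label at this position. With three voters
        # and two labels there is always a strict majority, so no tie-break
        # is needed.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of the three models on five inter-word positions.
hmm_pred = ["NB", "B", "NB", "NB", "B"]
maxent_pred = ["NB", "B", "B", "NB", "NB"]
crf_pred = ["NB", "NB", "B", "NB", "B"]

combined = majority_vote(hmm_pred, maxent_pred, crf_pred)
print(combined)
```

In this toy example each model is overruled exactly once by the other two, which illustrates how voting can help when the models make different errors.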