Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Shallow parsing with conditional random fields
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
A TAG-based noisy channel model of speech repairs
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Using conditional random fields for sentence boundary detection in speech
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Exploring features for identifying edited regions in disfluent sentences
Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Reconstructing spontaneous speech
Reconstructing spontaneous speech
What lies beneath: semantic and syntactic analysis of manually reconstructed spontaneous speech
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
Integrating sentence- and word-level error identification for disfluency correction
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Hi-index | 0.00 |
This paper presents a conditional random field-based approach for identifying speaker-produced disfluencies (i.e. if and where they occur) in spontaneous speech transcripts. We emphasize false start regions, which are often missed in current disfluency identification approaches as they lack lexical or structural similarity to the speech immediately following. We find that combining lexical, syntactic, and language model-related features with the output of a state-of-the-art disfluency identification system improves overall word-level identification of these and other errors. Improvements are reinforced under a stricter evaluation metric requiring exact matches between cleaned sentences annotator-produced reconstructions, and altogether show promise for general reconstruction efforts.