Speech disfluencies (such as filled pauses, repetitions, and restarts) are among the characteristics that distinguish spontaneous speech from planned or read speech. We introduce a language model that predicts disfluencies probabilistically and uses an edited, fluent context to predict the following words. The model is a generalization of the standard N-gram language model. It uses dynamic programming to compute the probability of a word sequence, taking into account possible hidden disfluency events. We analyze the model's performance for various disfluency types on the Switchboard corpus. We find that the model reduces word perplexity in the neighborhood of disfluency events; however, the overall differences are small and have no significant impact on recognition accuracy. We also note that modeling the most frequent type of disfluency, filled pauses, requires segmenting utterances into linguistic (rather than acoustic) units. Our analysis illustrates a generally useful technique for language model evaluation based on local perplexity comparisons.
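The dynamic-programming idea can be illustrated with a small forward-style computation. The following is a minimal sketch under strong simplifying assumptions, not the model described in the abstract: it uses a bigram rather than a general N-gram, only two hidden event types (filled pause and one-word repetition), and made-up probabilities. The names `uniform_bigram`, `toy_event_probs`, `forward_prob`, and `total_mass`, along with the toy vocabulary, are hypothetical and introduced only for illustration. The hidden state is the last word of the edited (fluent) context, and the probability of the observed words is summed over all event sequences.

```python
import itertools
from collections import defaultdict

VOCAB = ["i", "want", "uh", "um"]   # toy vocabulary (assumption)
FILLED_PAUSES = {"uh", "um"}        # toy filled-pause inventory (assumption)
START = "<s>"                       # sentence-start context

def uniform_bigram(word, context):
    """Hypothetical stand-in for a trained bigram P(word | context)."""
    return 1.0 / len(VOCAB)

def toy_event_probs(context):
    """Hypothetical probabilities of (fluent, filled pause, repetition)."""
    p_fp = 0.10
    # Nothing can be repeated at the start of the sentence.
    p_rep = 0.0 if context == START else 0.05
    return 1.0 - p_fp - p_rep, p_fp, p_rep

def forward_prob(words, bigram=uniform_bigram, event_probs=toy_event_probs):
    """P(words), summed over hidden disfluency events.

    alpha maps each possible fluent context (last word after editing)
    to the total probability of reaching it with the words seen so far.
    """
    alpha = {START: 1.0}
    for w in words:
        nxt = defaultdict(float)
        for ctx, mass in alpha.items():
            p_flu, p_fp, p_rep = event_probs(ctx)
            # Fluent continuation: w is predicted from ctx and becomes the context.
            nxt[w] += mass * p_flu * bigram(w, ctx)
            # Filled pause: w is edited out, so the fluent context is unchanged.
            if w in FILLED_PAUSES:
                nxt[ctx] += mass * p_fp / len(FILLED_PAUSES)
            # Repetition: w repeats the context word and is edited out.
            if w == ctx:
                nxt[ctx] += mass * p_rep
        alpha = nxt
    return sum(alpha.values())

def total_mass(length):
    """Sanity check: probabilities of all sequences of a given length sum to 1."""
    return sum(forward_prob(list(seq))
               for seq in itertools.product(VOCAB, repeat=length))

if __name__ == "__main__":
    print(forward_prob(["i", "uh", "want"]))
    print(total_mass(2))
```

Because a filled pause can be explained either as an ordinary word or as an edited-out event, the two readings leave different fluent contexts behind, which is why the computation tracks a distribution over contexts rather than a single history.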