Automatic speech recognition (ASR) systems have more trouble processing spontaneous speech (e.g. debates) than prepared speech (e.g. broadcast news). These difficulties stem from peculiarities of spontaneous speech (false starts, repetitions, schwas, etc.). In this paper, we highlight some of these peculiarities, especially in French. We show that manual transcriptions that are unrelated to the target application, but that contain only very spontaneous speech, make it possible to estimate a better language model, strongly decreasing perplexity and significantly decreasing the word error rate on spontaneous speech. However, the other knowledge bases used by the ASR system also have to be adapted. For example, our work shows that adding specific pronunciation variants seems useful, but that this addition has to be constrained and modeled. Finally, we compare the errors of our CMU Sphinx-based ASR system on spontaneous vs. prepared speech.