As core speech recognition technology improves, opening up a wider range of applications, genericity and portability are becoming important issues. Most of today's recognition systems are still tuned to a particular task, and porting a system to a new task (or language) requires a substantial investment of time and money, as well as human expertise. This paper addresses issues in speech recognizer portability and in the development of generic core speech recognition technology. First, the genericity of wide-domain models is assessed by evaluating their performance on several tasks of varied complexity. Then, techniques aimed at enhancing the genericity of these wide-domain models are investigated. Multi-source acoustic training is shown to reduce the performance gap between task-independent and task-dependent acoustic models, and for some tasks to outperform task-dependent acoustic models. Transparent methods for porting generic models to a specific task are also explored. Transparent unsupervised acoustic model adaptation is contrasted with supervised adaptation, and incremental unsupervised adaptation of both the acoustic and linguistic models is investigated. Experimental results on a dialog task show that with the proposed scheme, a transparently adapted generic system can perform nearly as well (about a 1% absolute gap in word error rate) as a task-specific system trained on several tens of hours of manually transcribed data.
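The reported gap is an absolute difference in word error rate (WER), which is easy to confuse with a relative difference. The sketch below is a minimal illustration, not the scoring tool used in the paper: it assumes the standard definition of WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function names and the example rates are hypothetical and are not taken from the paper's results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def absolute_gap(wer_a: float, wer_b: float) -> float:
    """Difference in WER, in the same units as the inputs."""
    return wer_a - wer_b


def relative_gap(wer_a: float, wer_b: float) -> float:
    """Difference in WER as a fraction of the baseline wer_b."""
    return (wer_a - wer_b) / wer_b


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
    # Hypothetical rates, chosen only to illustrate the two kinds of gap.
    adapted, task_specific = 0.21, 0.20
    print(absolute_gap(adapted, task_specific))
    print(relative_gap(adapted, task_specific))
```

With the hypothetical rates above, the two systems differ by one point of WER in absolute terms, which is a 5% relative difference against the task-specific baseline.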