This article presents a novel approach to joint speaker identification and speech recognition. For a limited number of recurring users, the interaction of speaker identification, speech recognition and speaker adaptation achieves unsupervised speaker tracking and automatic adaptation of the human-computer interface. A compact model of speech and speaker characteristics is presented together with a technique for efficient information retrieval. Speaker-specific profiles allow the speech recognizer to take individual speech characteristics into account and thereby achieve higher recognition rates. The profiles are initialized and continuously adapted by a balanced strategy of short-term and long-term speaker adaptation combined with robust speaker identification. The resulting self-learning speech-controlled system can track different users. Each speaker needs only a very short enrollment; subsequent utterances are used for unsupervised adaptation, which continuously improves speech recognition rates. Additionally, the detection of unknown speakers is examined, with the aim of avoiding explicit training of new speaker profiles. The system is suitable for in-car applications on embedded devices, e.g. speech-controlled navigation, hands-free telephony or infotainment systems. Results are presented for a subset of the SPEECON database. They validate the benefit of the speaker adaptation scheme and the unified modeling in terms of speaker identification and speech recognition rates.