Automatic Speech Recognition: The Development of the Sphinx Recognition System
Das ISADORA-System - ein akustisch-phonetisches Netzwerk zur automatischen Spracherkennung (The ISADORA system: an acoustic-phonetic network for automatic speech recognition)
Mustererkennung 1991, 13. DAGM-Symposium
An investigation of PLP and IMELDA acoustic representations and of their potential for combination
ICASSP-91: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing
This paper addresses the choice of suitable subword units for the HMM-based front end of a speaker-independent, large-vocabulary continuous speech dialog system (EVAR [1]). In contrast to the well-known approach of using context-dependent phone-like units (for instance, generalized triphones), we developed inventories of larger subword units, so-called context-freezing units (CFUs). CFU models can be regarded as an approximation to the highly desirable situation of having whole-word HMMs, within the limits imposed by the training speech data at hand. Recognition experiments indicate an advantage of the context-freezing units over triphone/biphone/phone combinations in terms of word accuracy, at least for German speech. Using triphones whose contexts are generalized by means of broad phonetic classes, we achieved results comparable to those obtained with CFUs.
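
As an illustration of what generalizing triphone contexts by broad phonetic classes can look like, the following minimal sketch maps the left and right neighbours of each phone to a coarse class label, so that phonetically similar contexts share one model. The class table, the phone symbols, the `#` boundary marker, and the function name `generalized_triphones` are assumptions made for this example only; the abstract does not specify the class inventory or labelling scheme actually used.

```python
# Hypothetical broad phonetic classes (illustrative only; not the
# inventory used in the paper, which the abstract does not list).
BROAD_CLASS = {
    "a": "VOWEL", "e": "VOWEL", "i": "VOWEL", "o": "VOWEL", "u": "VOWEL",
    "p": "PLOSIVE", "t": "PLOSIVE", "k": "PLOSIVE",
    "b": "PLOSIVE", "d": "PLOSIVE", "g": "PLOSIVE",
    "f": "FRICATIVE", "s": "FRICATIVE", "z": "FRICATIVE", "v": "FRICATIVE",
    "m": "NASAL", "n": "NASAL",
    "l": "LIQUID", "r": "LIQUID",
}

BOUNDARY = "#"  # assumed marker for the word boundary


def context_class(phone):
    """Map a neighbouring phone (or None at a boundary) to its broad class."""
    if phone is None:
        return BOUNDARY
    return BROAD_CLASS.get(phone, "OTHER")


def generalized_triphones(phones):
    """Return one unit label per phone, written as left-class - phone + right-class.

    The centre phone stays specific; only the contexts are generalized.
    """
    units = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] if i > 0 else None
        right = phones[i + 1] if i + 1 < len(phones) else None
        units.append(f"{context_class(left)}-{phone}+{context_class(right)}")
    return units


if __name__ == "__main__":
    # Both words yield the same unit for the centre vowel, so its
    # training data would be pooled across the two contexts.
    print(generalized_triphones(["t", "a", "k"]))
    print(generalized_triphones(["d", "a", "g"]))
```

In this sketch the vowel in `t a k` and in `d a g` receives the same label, which is the data-sharing effect that context generalization aims for when training data are limited.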