Pronunciation variability is an important issue that must be addressed when developing practical automatic recognition systems for spontaneous speech. This paper analyzes the factors that may affect recognition performance, including those specific to the Chinese language. Based on a study of the INITIAL/FINAL (IF) characteristics of Chinese and a development of the Bayesian equation, and using a seed database with careful phonetic transcription, the paper proposes the concepts of generalized INITIAL/FINAL (GIF) and generalized syllable (GS), GIF modeling and IF-GIF modeling, and context-dependent pronunciation weighting. With these methods, the Chinese syllable error rate (SER) is reduced by 6.3% and 4.2% compared with GIF modeling and IF modeling, respectively, when no language model, such as a syllable or word N-gram, is used. The methods also remain effective when additional data without phonetic transcription are used to refine the acoustic model through the proposed iterative forced-alignment based transcribing (IFABT) method, which achieves a 5.7% SER reduction.