Recently, polynomial segment models (PSMs) have been shown to be a competitive alternative to the HMM in large-vocabulary continuous speech recognition tasks [Li, C., Siu, M., Au-yeung, S., 2006. Recursive likelihood evaluation and fast search algorithm for polynomial segment model with application to speech recognition. IEEE Trans. on Audio, Speech and Language Processing 14, 1704-1708]. Their more constrained nature raises the issue of robustness under environmental mismatches. In this paper, we examine the robustness properties of PSMs using the Aurora 4 corpus under both clean training and multi-condition training. In addition, we generalize two unsupervised model adaptation schemes, namely maximum likelihood linear regression (MLLR) and reference speaker weighting (RSW), so that they are applicable to PSMs, and we explore their effectiveness in PSM environmental adaptation. Our experiments showed that although the word error rate differences between PSMs and HMMs were smaller under noisy test environments than under the clean test environment, PSMs were still competitive under mismatched conditions. After model adaptation, especially with RSW adaptation, the word error rates were reduced for both HMMs and PSMs. The best word error rate was obtained with RSW-adapted PSMs by rescoring lattices generated with the adapted HMMs. Overall, with model adaptation, the recognition word error rate can be reduced by more than 20%.
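To make the "more constrained nature" of a PSM concrete, the following is a minimal sketch of how a segment can be scored against a polynomial mean trajectory. It assumes a formulation common in the PSM literature (a single Gaussian per segment, a diagonal time-invariant variance, and time normalized to [0, 1] over the segment); the abstract does not specify the exact model used in the paper, and the function names `psm_mean_trajectory` and `psm_segment_log_likelihood` are hypothetical.

```python
import math


def psm_mean_trajectory(coeffs, num_frames):
    """Evaluate the polynomial mean trajectory of a segment.

    coeffs[r][d] is the order-r coefficient for feature dimension d
    (assumed layout). Time is normalized to [0, 1] across the segment.
    """
    dims = len(coeffs[0])
    trajectory = []
    for n in range(num_frames):
        # Normalized time; a one-frame segment is placed at t = 0.
        t = n / (num_frames - 1) if num_frames > 1 else 0.0
        trajectory.append(
            [sum(c[d] * t ** r for r, c in enumerate(coeffs)) for d in range(dims)]
        )
    return trajectory


def psm_segment_log_likelihood(frames, coeffs, variances):
    """Log-likelihood of a segment under a single-Gaussian PSM.

    Assumes a diagonal covariance (variances[d]) shared by all frames.
    """
    trajectory = psm_mean_trajectory(coeffs, len(frames))
    log_lik = 0.0
    for obs, mean in zip(frames, trajectory):
        for x, m, v in zip(obs, mean, variances):
            log_lik += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return log_lik


# Usage: a linear trajectory 1 + 2t over three frames of one-dimensional features.
example_coeffs = [[1.0], [2.0]]
example_frames = [[1.0], [2.0], [3.0]]
example_score = psm_segment_log_likelihood(example_frames, example_coeffs, [1.0])
```

Because every frame's mean is tied to the same few polynomial coefficients, the model has less freedom to absorb a mismatch between training and test conditions than a set of independent per-state means would, which is the robustness question the paper examines.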