Evaluation of the robustness of the polynomial segment models to noisy environments with unsupervised adaptation

Authors:
Jeff Siu-Kei Au-Yeung;Manhung Siu
Affiliations:
Department of ECE, Hong Kong University of Science and Technology, Clearwater Bay, Hong Kong;BBN Technologies, 10 Moulton Street, Cambridge, MA 02138, United States
Venue:
Speech Communication
Year:
2008

Citing 13

A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal

Signal Processing
Environment normalization training and environment adaptation using mixture stochastic trajectory model

Speech Communication
A comparison of novel techniques for rapid speaker adaptation

Speech Communication
Spoken Language Processing: A Guide to Theory, Algorithm, and System Development

Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
A segmental-feature HMM for continuous speech recognition based on a parametric trajectory model

Speech Communication
Model Parameter Estimation for Mixture Density Polynomial Segment Models

Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), Volume 2
A semi-continuous stochastic trajectory model for phoneme-based continuous speech recognition

Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '96), Volume 1
The HDM: a segmental hidden dynamic model of coarticulation

Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '99), Volume 1
Using a large vocabulary continuous speech recognizer for a constrained domain with limited training

Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '99), Volume 1
Unified frame and segment based models for automatic speech recognition

Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), Volume 2
Fast algorithms for phone classification and recognition using segment-based models

IEEE Transactions on Signal Processing
Recursive likelihood evaluation and fast search algorithm for polynomial segment model with application to speech recognition

IEEE Transactions on Audio, Speech, and Language Processing
A Robust Viterbi Algorithm Against Impulsive Noise With Application to Speech Recognition

IEEE Transactions on Audio, Speech, and Language Processing

Abstract

Recently, polynomial segment models (PSMs) have been shown to be a competitive alternative to the HMM in large-vocabulary continuous speech recognition tasks [Li, C., Siu, M., Au-yeung, S., 2006. Recursive likelihood evaluation and fast search algorithm for polynomial segment model with application to speech recognition. IEEE Trans. on Audio, Speech and Language Processing 14, 1704-1708]. Their more constrained nature raises the issue of robustness under environmental mismatch. In this paper, we examine the robustness properties of PSMs using the Aurora 4 corpus under both clean training and multi-condition training. In addition, we generalize two unsupervised model adaptation schemes, namely maximum likelihood linear regression (MLLR) and reference speaker weighting (RSW), so that they apply to PSMs, and explore their effectiveness for PSM environmental adaptation. Our experiments showed that although the word error rate differences between PSMs and HMMs were smaller under noisy test environments than under the clean test environment, PSMs remained competitive under mismatched conditions. After model adaptation, especially RSW adaptation, word error rates were reduced for both HMMs and PSMs. The best word error rate was obtained with RSW-adapted PSMs by rescoring lattices generated with the adapted HMMs. Overall, with model adaptation, the recognition word error rate can be reduced by more than 20%.
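
For readers unfamiliar with PSMs, the "more constrained nature" mentioned above can be illustrated with a minimal sketch. It assumes the common formulation in which the mean of each segment is a polynomial in normalized time, and it simplifies to a single scalar feature with a fixed variance. The function names are hypothetical and the code is not taken from the paper.

```python
import math

def normalized_times(n_frames):
    """Map frame indices 0..N-1 onto [0, 1] so segments of any length share one time axis."""
    if n_frames == 1:
        return [0.0]
    return [i / (n_frames - 1) for i in range(n_frames)]

def mean_trajectory(coeffs, n_frames):
    """Evaluate the polynomial mean b0 + b1*t + b2*t^2 + ... at each normalized time t."""
    return [
        sum(b * t ** r for r, b in enumerate(coeffs))
        for t in normalized_times(n_frames)
    ]

def segment_log_likelihood(frames, coeffs, variance):
    """Gaussian log-likelihood of a scalar feature segment around the polynomial mean.

    Assumption for illustration: one scalar feature and one fixed variance per segment.
    """
    means = mean_trajectory(coeffs, len(frames))
    const = -0.5 * math.log(2 * math.pi * variance)
    return sum(const - (y - m) ** 2 / (2 * variance) for y, m in zip(frames, means))

# The same linear trajectory (b0 = 1, b1 = 2) stretched over 3 and 5 frames.
short = mean_trajectory([1.0, 2.0], 3)
long_ = mean_trajectory([1.0, 2.0], 5)

# A segment lying on the trajectory scores higher than a shifted copy of it.
ll_match = segment_log_likelihood(short, [1.0, 2.0], 1.0)
ll_shift = segment_log_likelihood([y + 1.0 for y in short], [1.0, 2.0], 1.0)
```

Because one small set of coefficients describes the whole segment, a mismatch between training and test environments moves every frame of the trajectory together. In this sketch, that is the sense in which the model is more constrained than a sequence of independent per-state means.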