Discriminative semi-parametric trajectory model for speech recognition

Authors:
K. C. Sim;M. J. F. Gales
Affiliations:
Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ, United Kingdom;Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ, United Kingdom
Venue:
Computer Speech and Language
Year:
2007

Abstract

Hidden Markov models (HMMs) are the most commonly used acoustic models for speech recognition. In an HMM, successive observations are assumed to be independent given the state sequence; this is known as the conditional independence assumption. Consequently, temporal (inter-frame) correlations are poorly modelled. This limitation may be reduced by incorporating some form of trajectory modelling. In this paper, a general perspective on trajectory modelling is provided, in which time-varying model parameters are used for the Gaussian components. A discriminative semi-parametric trajectory model is then described in which the Gaussian mean vector and covariance matrix parameters vary with time. The time variation is modelled as a semi-parametric function of the observation sequence via a set of centroids in the acoustic space. The model parameters are estimated discriminatively using the minimum phone error (MPE) criterion. The performance of these models is investigated and benchmarked against a state-of-the-art CUHTK Mandarin evaluation system.
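
To make the idea of centroid-driven, time-varying Gaussian parameters more concrete, the following is a minimal sketch and not the authors' formulation. It assumes diagonal covariances, uses only the current frame rather than a window of the observation sequence, and assumes the mean is shifted by posterior-weighted offsets while the variance is scaled by posterior-weighted positive factors. All function and parameter names (`centroid_posteriors`, `time_varying_gaussian`, `mean_offsets`, `var_scales`) are hypothetical, and the MPE estimation of the offsets and scales is not shown.

```python
import math


def centroid_posteriors(obs, centroids):
    """Return P(centroid | obs) for a single frame.

    centroids: list of (weight, mean, var) tuples, where mean and var are
    per-dimension lists (diagonal Gaussians). Assumed layout, for illustration.
    """
    log_scores = []
    for weight, mean, var in centroids:
        # Log-likelihood of the frame under a diagonal Gaussian centroid.
        log_lik = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (o - m) ** 2 / v)
            for o, m, v in zip(obs, mean, var)
        )
        log_scores.append(math.log(weight) + log_lik)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_scores)
    scores = [math.exp(s - peak) for s in log_scores]
    total = sum(scores)
    return [s / total for s in scores]


def time_varying_gaussian(mean, var, posteriors, mean_offsets, var_scales):
    """Return the frame-dependent (mean_t, var_t) of one Gaussian component.

    Assumption: mean_t = mean + sum_i post_i * offset_i, and
    var_t = var * sum_i post_i * scale_i (element-wise, scales positive).
    """
    dim = len(mean)
    mean_t = [
        mean[d] + sum(p * b[d] for p, b in zip(posteriors, mean_offsets))
        for d in range(dim)
    ]
    var_t = [
        var[d] * sum(p * s[d] for p, s in zip(posteriors, var_scales))
        for d in range(dim)
    ]
    return mean_t, var_t


# Toy example: two centroids, a frame that lies on the first one.
centroids = [
    (0.5, [0.0, 0.0], [1.0, 1.0]),
    (0.5, [10.0, 10.0], [1.0, 1.0]),
]
post = centroid_posteriors([0.0, 0.0], centroids)
mean_t, var_t = time_varying_gaussian(
    mean=[0.0, 0.0],
    var=[1.0, 1.0],
    posteriors=post,
    mean_offsets=[[1.0, 1.0], [-1.0, -1.0]],
    var_scales=[[2.0, 2.0], [0.5, 0.5]],
)
```

In this toy setting the frame is explained almost entirely by the first centroid, so the component's mean and variance move towards that centroid's offset and scale. As the frame moves through the acoustic space the posteriors change, and the Gaussian parameters change with them, which is the sense in which the parameters are time-varying.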