Factoring Gaussian precision matrices for linear dynamic models

Authors:
Joe Frankel;Simon King
Affiliations:
Centre for Speech Technology Research, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, United Kingdom;Centre for Speech Technology Research, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, United Kingdom
Venue:
Pattern Recognition Letters
Year:
2007

Abstract

The linear dynamic model (LDM), also known as the Kalman filter model, has been studied in the engineering and control communities and, more recently, in machine learning and speech technology. The Gaussian noise processes are usually assumed to have diagonal, or occasionally full, covariance matrices. A number of recent papers have considered modelling the precision matrix rather than the covariance matrix of a Gaussian distribution, and this work applies such ideas to the LDM. A Gaussian precision matrix P can be factored into the form P = U^T S U, where U is a transform and S is a diagonal matrix. By varying the form of U, the covariance can be specified as diagonal or full, or U can be structured to model a given set of spatial dependencies. Furthermore, the transform and scaling components can be shared between models, allowing richer distributions with only marginally more parameters than are required to specify diagonal covariances. The method described in this paper allows the construction of models with an appropriate number of parameters for the amount of available training data. We provide illustrative experimental results on synthetic and real speech data in which models with factored precision matrices and automatically selected numbers of parameters are as good as or better than models with diagonal covariances on small data sets, and as good as models with full covariance matrices on larger data sets.
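To make the factorisation P = U^T S U concrete, the following is a minimal sketch in Python with NumPy. It rests on assumptions that the abstract does not state: U is taken to be unit upper triangular and is obtained from a Cholesky decomposition of a known precision matrix, whereas the paper may estimate or constrain U differently. The names `factor_precision` and `log_density` are hypothetical and chosen only for illustration.

```python
import numpy as np

def factor_precision(cov):
    """Return (U, s) such that inv(cov) == U.T @ diag(s) @ U.

    Assumption for this sketch: U is unit upper triangular, derived from
    the Cholesky factor of the precision matrix.
    """
    precision = np.linalg.inv(cov)
    chol = np.linalg.cholesky(precision)  # lower triangular, precision = chol @ chol.T
    d = np.diag(chol)
    U = (chol / d).T  # divide column j by d[j], then transpose: unit upper triangular
    s = d ** 2        # diagonal entries of S
    return U, s

def log_density(x, mean, U, s):
    """Gaussian log-density evaluated directly from the factored precision."""
    z = U @ (x - mean)  # transform the residual
    # det(U) == 1 for unit triangular U, so log det(P) == sum(log(s)).
    return 0.5 * (np.sum(np.log(s)) - len(x) * np.log(2.0 * np.pi) - z @ (s * z))

# Illustrative data only: a random symmetric positive definite covariance.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
cov = A @ A.T + 4.0 * np.eye(4)
U, s = factor_precision(cov)
x = rng.standard_normal(4)
print(log_density(x, np.zeros(4), U, s))
```

In this form the special cases mentioned in the abstract are easy to see: setting U to the identity leaves a diagonal precision (and hence diagonal covariance) with only the entries of S to estimate, a dense triangular U corresponds to a full covariance, and zeroing selected off-diagonal entries of U gives intermediate models. Sharing U across several models while keeping a separate S for each is one way the parameter count could stay close to the diagonal case.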