Speaker recognition with mixtures of Gaussians with sparse regression matrices

Authors:
Constantinos Boulis
Affiliations:
University of Washington
Venue:
HLT-SRWS '04 Proceedings of the Student Research Workshop at HLT-NAACL 2004
Year:
2004

Citing 4
Cited 0

Learning Belief Networks in the Presence of Missing Values and Hidden Variables
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning

Natural statistical models for automatic speech recognition

Learning with mixtures of trees
The Journal of Machine Learning Research

Factored sparse inverse covariance matrices
ICASSP '00 Proceedings of the Acoustics, Speech, and Signal Processing, 2000. on IEEE International Conference - Volume 02

Abstract

When estimating a mixture of Gaussians, there are usually two choices for the covariance type of each Gaussian component: diagonal or full. Imposing either structure, though, may be restrictive and lead to degraded performance, increased computation, or both. In this work, several criteria for estimating the structure of the regression matrices of a mixture of Gaussians are introduced and evaluated. Most of the criteria attempt to estimate a discriminative structure, which is suited to classification tasks. Results are reported on the 1996 NIST speaker recognition task, and performance is compared with structural EM, a well-known, non-discriminative structure-finding algorithm.
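
The abstract does not define a regression matrix, so the following is only a minimal sketch under one common assumption: each Gaussian component is written in factored form, where every dimension is linearly predicted from the later dimensions through a strictly upper-triangular matrix B, and the residuals are independent. Under that assumption, an all-zero B behaves like a diagonal covariance, a fully dense upper triangle behaves like a full covariance, and a sparse B sits in between. The function names, the parameter names, and the choice of predicting from later dimensions are illustrative and are not taken from the paper.

```python
import math


def log_density(x, mean, B, noise_var):
    """Log-density of one Gaussian component in regression (factored) form.

    Assumed model (illustrative, not the paper's notation):
        x[i] - mean[i] = sum_{j > i} B[i][j] * (x[j] - mean[j]) + e[i]
    with independent residuals e[i] ~ N(0, noise_var[i]).
    Only entries of B above the diagonal are read.
    """
    d = len(x)
    centered = [x[i] - mean[i] for i in range(d)]
    total = 0.0
    for i in range(d):
        # Predict dimension i from the later dimensions only.
        pred = sum(B[i][j] * centered[j] for j in range(i + 1, d))
        resid = centered[i] - pred
        total -= 0.5 * (math.log(2.0 * math.pi * noise_var[i])
                        + resid * resid / noise_var[i])
    return total


def num_free_parameters(B):
    """Count means, residual variances, and non-zero regression weights."""
    d = len(B)
    nonzero = sum(1 for i in range(d) for j in range(i + 1, d) if B[i][j] != 0)
    return 2 * d + nonzero


# Diagonal-like component: no regression weights at all.
B_diag = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# Sparse component: a single dependency between dimensions 0 and 2.
B_sparse = [[0.0, 0.0, 0.5], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# Full-like component: every entry above the diagonal is free.
B_full = [[0.0, 0.3, 0.5], [0.0, 0.0, -0.2], [0.0, 0.0, 0.0]]

for name, B in [("diagonal", B_diag), ("sparse", B_sparse), ("full", B_full)]:
    print(name, num_free_parameters(B))
```

In this sketch, the structure-estimation problem the abstract describes would amount to choosing which entries of B are allowed to be non-zero. The selection criteria themselves are not shown here.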