On semi-supervised learning of Gaussian mixture models for phonetic classification

  • Authors:
  • Jui-Ting Huang;Mark Hasegawa-Johnson

  • Affiliations:
  • University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • SemiSupLearn '09 Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing
  • Year:
  • 2009


Abstract

This paper investigates semi-supervised learning of Gaussian mixture models using a unified objective function that takes both labeled and unlabeled data into account. Two methods are compared in this work: the hybrid discriminative/generative method and the purely generative method. They differ in the criterion applied to the labeled data: the hybrid method uses class posterior probabilities, while the purely generative method uses data likelihood. We conducted experiments on the TIMIT database and a standard synthetic data set from the UCI Machine Learning Repository. The results show that the two methods behave similarly under various conditions. For both methods, unlabeled data improve training on models of higher complexity, where the supervised method performs poorly. In addition, there is a trend that more unlabeled data yields greater improvement in classification accuracy over the supervised model. We also provide experimental observations on the relative weights of the labeled and unlabeled parts of the training objective, and suggest a critical value that could be useful for selecting a good weighting factor.
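The purely generative variant described above can be illustrated with a minimal sketch: a unified objective that sums the joint log-likelihood of labeled data and an α-weighted marginal log-likelihood of unlabeled data, optimized by EM. This is an assumption-laden illustration, not the paper's implementation — one diagonal Gaussian per class, and all names (`alpha`, `n_iter`) are illustrative.

```python
import numpy as np

def log_gauss(X, pi, mu, var):
    # log pi_c + log N(x; mu_c, diag(var_c)) for each class c -> shape (n, C)
    ll = -0.5 * ((((X[:, None, :] - mu[None]) ** 2) / var[None]).sum(-1)
                 + np.log(2 * np.pi * var).sum(-1)[None])
    return np.log(pi)[None] + ll

def semisup_gmm(Xl, yl, Xu, n_classes, alpha=1.0, n_iter=50):
    """Sketch of the purely generative semi-supervised objective:
    J = sum_labeled log p(x, y) + alpha * sum_unlabeled log p(x)."""
    # initialize from labeled data only (the supervised MLE baseline)
    pi = np.array([np.mean(yl == c) for c in range(n_classes)])
    mu = np.array([Xl[yl == c].mean(axis=0) for c in range(n_classes)])
    var = np.array([Xl[yl == c].var(axis=0) + 1e-6 for c in range(n_classes)])

    for _ in range(n_iter):
        # E-step on unlabeled data: responsibilities p(y | x)
        logp = log_gauss(Xu, pi, mu, var)
        r = np.exp(logp - logp.max(1, keepdims=True))
        r /= r.sum(1, keepdims=True)
        # M-step: labeled points count with weight 1, unlabeled with weight alpha
        for c in range(n_classes):
            w_l = (yl == c).astype(float)
            Nc = w_l.sum() + alpha * r[:, c].sum()
            pi[c] = Nc
            mu[c] = (w_l @ Xl + alpha * r[:, c] @ Xu) / Nc
            var[c] = (w_l @ ((Xl - mu[c]) ** 2)
                      + alpha * r[:, c] @ ((Xu - mu[c]) ** 2)) / Nc + 1e-6
        pi /= pi.sum()
    return pi, mu, var

def predict(X, pi, mu, var):
    # Bayes classification: argmax_c p(y = c | x)
    return log_gauss(X, pi, mu, var).argmax(1)
```

Setting `alpha=0` recovers the supervised baseline, which makes the weighting factor discussed in the abstract easy to probe empirically.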