The expectation maximization (EM) algorithm is a popular approach to learning Gaussian mixture models from unlabeled data. In many applications, additional sources of information, such as a priori knowledge of the mixing proportions, are available alongside the unlabeled data. We present a weakly supervised approach, in the form of a penalized expectation maximization algorithm, that uses this a priori knowledge to guide model training. The algorithm penalizes models whose predicted mixing proportions diverge strongly from the a priori mixing proportions. We also present an extension that incorporates both labeled and unlabeled data in a semi-supervised setting. Systematic evaluations on several publicly available datasets show that the proposed algorithms outperform the standard expectation maximization algorithm. The performance gains are particularly significant when the amount of unlabeled data is limited and in the presence of noise.
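The abstract does not specify the exact form of the penalty, but one natural reading is a KL-divergence term between the prior mixing proportions and the model's estimated proportions, which admits a closed-form M-step update for the weights. The Python sketch below illustrates that reading; the penalty direction KL(pi0 || pi), the strength parameter lam, and the function name penalized_em are illustrative assumptions, not the paper's stated formulation.

    import numpy as np
    from scipy.stats import multivariate_normal

    def penalized_em(X, pi0, lam=10.0, n_iter=100, seed=0):
        """Fit a Gaussian mixture to X while shrinking the mixing
        proportions toward the prior pi0 (hypothetical interface)."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        pi0 = np.asarray(pi0, dtype=float)
        k = len(pi0)
        pi = pi0.copy()  # start the weights at the prior proportions
        mu = X[rng.choice(n, size=k, replace=False)].astype(float)
        cov = np.stack([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(k)])
        for _ in range(n_iter):
            # E-step: responsibilities r[i, j] are proportional to
            # pi[j] * N(x_i | mu[j], cov[j]), normalized over components.
            r = np.column_stack([pi[j] * multivariate_normal.pdf(X, mu[j], cov[j])
                                 for j in range(k)])
            r /= r.sum(axis=1, keepdims=True)
            nk = r.sum(axis=0)
            # Penalized M-step for the weights: a KL(pi0 || pi) penalty of
            # strength lam yields pi[j] = (nk[j] + lam * pi0[j]) / (n + lam),
            # which reduces to the standard EM update nk[j] / n when lam = 0.
            pi = (nk + lam * pi0) / (n + lam)
            # Standard M-step for the means and covariances.
            for j in range(k):
                mu[j] = r[:, j] @ X / nk[j]
                diff = X - mu[j]
                cov[j] = (r[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
        return pi, mu, cov

Setting lam = 0 recovers standard EM, while a very large lam pins the estimated proportions to pi0; intermediate values trade data fit against divergence from the prior, which is the behavior the abstract describes.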