Semi-Supervised Learning with Measure Propagation

Authors:
Amarnag Subramanya;Jeff Bilmes
Venue:
The Journal of Machine Learning Research
Year:
2011

Citing 36
Cited 1

A critical investigation of recall and precision as measures of retrieval system performance

ACM Transactions on Information Systems (TOIS)
Elements of information theory

Elements of information theory
Combining labeled and unlabeled data with co-training

COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Inductive learning algorithms and representations for text categorization

Proceedings of the seventh international conference on Information and knowledge management
An optimal algorithm for approximate nearest neighbor searching in fixed dimensions

Journal of the ACM (JACM)
Learning to classify text from labeled and unlabeled documents

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Approximate nearest neighbor queries in fixed dimensions

SODA '93 Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms
An Algorithm for Finding Best Matches in Logarithmic Expected Time

ACM Transactions on Mathematical Software (TOMS)
Neural Networks for Pattern Recognition

Neural Networks for Pattern Recognition
Transductive Inference for Text Classification using Support Vector Machines

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Learning from Labeled and Unlabeled Data using Graph Mincuts

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Term Weighting Approaches in Automatic Text Retrieval

Term Weighting Approaches in Automatic Text Retrieval
The information geometry of em variants for speech and image processing

The information geometry of em variants for speech and image processing
Distributional word clusters vs. words for text categorization

The Journal of Machine Learning Research
Convex Optimization

Convex Optimization
In Defense of One-Vs-All Classification

The Journal of Machine Learning Research
Matrix Exponentiated Gradient Updates for On-line Learning and Bregman Projection

The Journal of Machine Learning Research
Beyond the point cloud: from transductive to semi-supervised learning

ICML '05 Proceedings of the 22nd international conference on Machine learning
Propagating distributions on a hypergraph by dual information regularization

ICML '05 Proceedings of the 22nd international conference on Machine learning
Learning from labeled and unlabeled data on a directed graph

ICML '05 Proceedings of the 22nd international conference on Machine learning
Semi-supervised learning with graphs

Semi-supervised learning with graphs
Label propagation through linear neighborhoods

ICML '06 Proceedings of the 23rd international conference on Machine learning
Large scale semi-supervised linear SVMs

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Clustering with Bregman Divergences

The Journal of Machine Learning Research
Large Scale Transductive SVMs

The Journal of Machine Learning Research
Simple, robust, scalable semi-supervised learning via expectation regularization

Proceedings of the 24th international conference on Machine learning
Large scale manifold transduction

Proceedings of the 25th international conference on Machine learning
Graph transduction via alternating minimization

Proceedings of the 25th international conference on Machine learning
Statistical framework for a Spanish spoken dialogue corpus

Speech Communication
Challenges in searching social media

Proceedings of the 2008 ACM workshop on Search in social media
Soft-supervised learning for text classification

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Graph-based learning for statistical machine translation

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Keepin' it real: semi-supervised learning with realistic tuning

SemiSupLearn '09 Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing
Semi-Supervised Learning

Semi-Supervised Learning
SWITCHBOARD: telephone speech corpus for research and development

ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
On information regularization

UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence

Graph-Based transduction with confidence

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II

Abstract

We describe a new objective for graph-based semi-supervised learning based on minimizing the Kullback-Leibler divergence between discrete probability measures that encode class membership probabilities. We show how the proposed objective can be efficiently optimized using alternating minimization. We prove that the alternating minimization procedure converges to the correct optimum and derive a simple test for convergence. In addition, we show how this approach can be scaled to solve the semi-supervised learning problem on very large data sets; in one instance, we use a data set with over 10^8 samples. In this context, we propose a graph node ordering algorithm that is also applicable to other graph-based semi-supervised learning approaches. We compare the proposed approach against other standard semi-supervised learning algorithms on the semi-supervised learning benchmark data sets (Chapelle et al., 2007), and on other real-world tasks such as text classification on Reuters and WebKB, speech phone classification on TIMIT and Switchboard, and linguistic dialog-act tagging on Dihana and Switchboard. In each case, the proposed approach outperforms the state of the art. Lastly, we show that our objective can be generalized into a form that includes the standard squared-error loss, and we prove a geometric rate of convergence in that case.
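
The abstract does not state the objective itself, so the following is only a minimal sketch of what alternating minimization over KL-divergence terms on a graph can look like. Everything in it is an assumption made for illustration: the objective C(p, q) = sum over labeled i of KL(r_i || q_i) + mu * sum over i, j of w'_ij * KL(p_i || q_j) - nu * sum over i of H(p_i), with w' = W + alpha * I; the hyperparameter names mu, nu, and alpha; the function name; and the fixed iteration count, which stands in for the convergence test mentioned above. Under that assumed objective, each half-step has a closed form: q is a weighted average of the neighbors' p (plus the label distribution r on labeled nodes), and p is a normalized weighted geometric mean of the neighbors' q.

```python
import numpy as np

def measure_propagation_sketch(W, R, labeled, mu=1.0, nu=0.1, alpha=1.0, iters=200):
    """Alternating minimization for an ASSUMED KL-based graph objective.

    W       : (n, n) nonnegative edge-weight matrix (hypothetical input)
    R       : (n, k) label distributions; rows of unlabeled nodes are all zero
    labeled : (n,) boolean mask of labeled nodes
    Returns the two sets of per-node class distributions (p, q).
    """
    n, k = R.shape
    Wp = W + alpha * np.eye(n)            # assumed self-loop weight alpha > 0
    delta = labeled.astype(float)         # 1 for labeled nodes, 0 otherwise
    gamma = nu + mu * Wp.sum(axis=1)      # per-node scale used in the p-step
    p = np.full((n, k), 1.0 / k)          # start from uniform distributions

    for _ in range(iters):
        # q-step: with p fixed, q_j is proportional to
        # delta_j * r_j + mu * sum_i w'_ij * p_i
        num = delta[:, None] * R + mu * (Wp.T @ p)
        q = num / num.sum(axis=1, keepdims=True)

        # p-step: with q fixed, p_i is proportional to
        # exp(mu * sum_j w'_ij * log q_j / gamma_i)
        logits = mu * (Wp @ np.log(q)) / gamma[:, None]
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)

    return p, q

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by one weak edge.
W = np.zeros((6, 6))
for a, b, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    W[a, b] = W[b, a] = w

R = np.zeros((6, 2))
R[0, 0] = 1.0                              # node 0 is labeled class 0
R[5, 1] = 1.0                              # node 5 is labeled class 1
labeled = np.array([True, False, False, False, False, True])

p, q = measure_propagation_sketch(W, R, labeled)
pred = p.argmax(axis=1)                    # hard labels from the soft measures
```

In this sketch the self-loop weight alpha keeps every q strictly positive, so the logarithm in the p-step is always defined. A practical implementation would replace the fixed iteration count with a stopping rule and use sparse matrices for large graphs.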