Semi-supervised learning using randomized mincuts

Authors:
Avrim Blum;John Lafferty;Mugizi Robert Rwebangira;Rajashekar Reddy
Affiliations:
Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA
Venue:
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Year:
2004

Citing 8
Cited 38

Learnability with respect to fixed distributions

Theoretical Computer Science
Polynomial-time approximation algorithms for the Ising model

SIAM Journal on Computing
On the chromatic roots of generalized theta graphs

Journal of Combinatorial Theory Series B
PAC-Bayesian Stochastic Model Selection

Machine Learning
A Database for Handwritten Text Recognition Research

IEEE Transactions on Pattern Analysis and Machine Intelligence
Learning from Labeled and Unlabeled Data using Graph Mincuts

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Detecting a network failure

FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Network failure detection and graph connectivity

SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms

Estimating and computing density based distance metrics

ICML '05 Proceedings of the 22nd international conference on Machine learning
Data Clustering with Partial Supervision

Data Mining and Knowledge Discovery
An analysis of graph cut size for transductive learning

ICML '06 Proceedings of the 23rd international conference on Machine learning
MISSL: multiple-instance semi-supervised learning

ICML '06 Proceedings of the 23rd international conference on Machine learning
Word sense disambiguation using label propagation based semi-supervised learning

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Relation extraction using label propagation based semi-supervised learning

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Query expansion with the minimum user feedback by transductive learning

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Classification in Networked Data: A Toolkit and a Univariate Case Study

The Journal of Machine Learning Research
A learning framework using Green's function and kernel regularization with application to recommender system

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Semisupervised Query Expansion with Minimal Feedback

IEEE Transactions on Knowledge and Data Engineering
The Cost of Learning Directed Cuts

ECML '07 Proceedings of the 18th European conference on Machine Learning
Classifying networked entities with modularity kernels

Proceedings of the 17th ACM conference on Information and knowledge management
Label propagation via bootstrapped support vectors for semantic relation extraction between named entities

Computer Speech and Language
Using graph-based metrics with empirical risk minimization to speed up active learning on networked data

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Automating knowledge capture in the aerospace domain

Proceedings of the fifth international conference on Knowledge capture
Semi-supervised polarity lexicon induction

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Improving learning in networked data by combining explicit and mined links

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Graph based semi-supervised approach for information extraction

TextGraphs-1 Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing
Query Selection via Weighted Entropy in Graph-Based Semi-supervised Classification

ACML '09 Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning
A discriminative model for semi-supervised learning

Journal of the ACM (JACM)
Interactive image segmentation using probabilistic hypergraphs

Pattern Recognition
Network Elucidation Template: A framework for human-guided network inference

Computers and Industrial Engineering
Semi-supervised learning applied to large data sets with very few labeled examples

FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 1
Learning unknown graphs

ALT'09 Proceedings of the 20th international conference on Algorithmic learning theory
A classification algorithm based on local cluster centers with a few labeled training examples

Knowledge-Based Systems
Semi-supervised ranking for document retrieval

Computer Speech and Language
Ant based semi-supervised classification

ANTS'10 Proceedings of the 7th international conference on Swarm intelligence
Predicting the labels of an unknown graph via adaptive exploration

Theoretical Computer Science
Do they belong to the same class: active learning by querying pairwise label homogeneity

Proceedings of the 20th ACM international conference on Information and knowledge management
Tri-training and data editing based semi-supervised clustering algorithm

MICAI'06 Proceedings of the 5th Mexican international conference on Artificial Intelligence
Query expansion with the minimum relevance judgments

AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
Mining inter-entity semantic relations using improved transductive learning

IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
Graffiti: graph-based classification in heterogeneous networks

World Wide Web
On the complexity of finding an unknown cut via vertex queries

COCOON'07 Proceedings of the 13th annual international conference on Computing and Combinatorics
Set-Similarity joins based semi-supervised sentiment analysis

ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part I
Aggregation pheromone metaphor for semi-supervised classification

Pattern Recognition
Semi-supervised learning using greedy max-cut

The Journal of Machine Learning Research
Random spanning trees and the prediction of weighted graphs

The Journal of Machine Learning Research

Abstract

In many application domains there is a large amount of unlabeled data but only a very limited amount of labeled training data. One general approach that has been explored for utilizing this unlabeled data is to construct a graph on all the data points based on distance relationships among examples, and then to use the known labels to perform some type of graph partitioning. One natural partitioning to use is the minimum cut that agrees with the labeled data (Blum & Chawla, 2001), which can be thought of as giving the most probable label assignment if one views labels as generated according to a Markov Random Field on the graph. Zhu et al. (2003) propose a cut based on a relaxation of this field, and Joachims (2003) gives an algorithm based on finding an approximate min-ratio cut.

In this paper, we extend the mincut approach by adding randomness to the graph structure. The resulting algorithm addresses several shortcomings of the basic mincut approach, and can be given theoretical justification from both a Markov random field perspective and from sample complexity considerations. In cases where the graph does not have small cuts for a given classification problem, randomization may not help. However, our experiments on several datasets show that when the structure of the graph supports small cuts, randomization can result in highly accurate classifiers with good accuracy/coverage tradeoffs. In addition, we are able to achieve good performance with a very simple graph-construction procedure.
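The abstract does not spell out how randomness is added to the graph, so the following is only a minimal sketch under stated assumptions: each round perturbs every edge weight with uniform noise, computes a minimum cut separating the positively labeled nodes from the negatively labeled ones, and the fraction of rounds in which a node lands on the positive side serves as a soft label. The names `min_cut_source_side` and `randomized_mincut`, the `noise` and `rounds` parameters, and the plain Edmonds-Karp max-flow routine are illustrative choices, not the authors' implementation.

```python
import random
from collections import deque

def min_cut_source_side(n, cap, s, t):
    """Edmonds-Karp max-flow. Returns the nodes on the source side of a min s-t cut.
    cap maps directed pairs (u, v) to capacities; every pair needs its reverse."""
    res = dict(cap)                      # residual capacities
    adj = [[] for _ in range(n)]
    for (u, v) in cap:
        adj[u].append(v)
    while True:
        parent = {s: None}               # BFS tree over edges with residual capacity
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:              # no augmenting path left:
            return set(parent)           # reachable nodes form the source side
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= bottleneck
            res[(v, u)] += bottleneck

def randomized_mincut(n, edges, pos, neg, rounds=50, noise=0.5, seed=0):
    """Soft labels in [0, 1]: fraction of noisy mincuts placing each node with `pos`.
    edges is a list of undirected (u, v, weight) triples over nodes 0..n-1.
    The uniform weight perturbation is an assumption of this sketch."""
    rng = random.Random(seed)
    s, t = n, n + 1                      # artificial source and sink
    big = sum(w for _, _, w in edges) + noise * len(edges) + 1.0
    votes = [0] * n
    for _ in range(rounds):
        cap = {}
        for u, v, w in edges:
            noisy = w + rng.uniform(0.0, noise)
            cap[(u, v)] = noisy
            cap[(v, u)] = noisy
        for p in pos:                    # tie labeled nodes to source/sink so the
            cap[(s, p)] = big            # cut must agree with the known labels
            cap[(p, s)] = 0.0
        for q in neg:
            cap[(q, t)] = big
            cap[(t, q)] = 0.0
        side = min_cut_source_side(n + 2, cap, s, t)
        for v in range(n):
            if v in side:
                votes[v] += 1
    return [c / rounds for c in votes]

# Two triangles joined by one weak bridge; one labeled node in each triangle.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 0.1)]
print(randomized_mincut(6, edges, pos=[0], neg=[5], rounds=20, noise=0.2))
```

In this toy graph the bridge is the cheapest cut under every perturbation, so nodes 0 to 2 receive a fraction of 1 and nodes 3 to 5 a fraction of 0. On graphs with several near-minimal cuts the fractions fall strictly between 0 and 1, and thresholding them away from 0.5 is one way to trade coverage for accuracy.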