Surrogate learning: from feature independence to semi-supervised classification

Authors:
Sriharsha Veeramachaneni;Ravi Kumar Kondadadi
Affiliations:
Thomson Reuters Research and Development, Eagan, MN;Thomson Reuters Research and Development, Eagan, MN
Venue:
SemiSupLearn '09 Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing
Year:
2009

Abstract

We consider the task of learning a classifier from the feature space X to the set of classes Y = {0, 1}, when the features can be partitioned into class-conditionally independent feature sets X1 and X2. We show that the class-conditional independence can be used to represent the original learning task in terms of 1) learning a classifier from X2 to X1 (in the sense of estimating the probability P(x1 | x2)) and 2) learning the class-conditional distribution of the feature set X1. This fact can be exploited for semi-supervised learning because the former task can be accomplished purely from unlabeled samples. We present an experimental evaluation of the idea in two real-world applications.
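
The decomposition described in the abstract can be checked numerically on a toy distribution. The sketch below is illustrative only and is not taken from the paper: it assumes x1 is binary and x2 takes three discrete values, and every probability in it is a made-up number. Under class-conditional independence, P(x1 | x2) = P(x1 | y=0) (1 - P(y=1 | x2)) + P(x1 | y=1) P(y=1 | x2), so whenever P(x1 | y=1) differs from P(x1 | y=0) the posterior P(y=1 | x2) can be solved for from P(x1 | x2), which needs no labels, and the class-conditional distribution of x1.

```python
import numpy as np

# Hypothetical toy distribution; all numbers are illustrative assumptions.
p_y = np.array([0.7, 0.3])                   # P(y=0), P(y=1)
p_x1_given_y = np.array([0.2, 0.9])          # P(x1=1 | y=0), P(x1=1 | y=1)
p_x2_given_y = np.array([[0.5, 0.3, 0.2],    # P(x2=k | y=0), k = 0, 1, 2
                         [0.1, 0.3, 0.6]])   # P(x2=k | y=1)

# Ground truth P(y=1 | x2). Computing this directly would need labeled data.
p_y_and_x2 = p_y[:, None] * p_x2_given_y     # joint P(y, x2), shape (2, 3)
p_x2 = p_y_and_x2.sum(axis=0)                # marginal P(x2)
true_posterior = p_y_and_x2[1] / p_x2

# Task 1: P(x1=1 | x2). This involves only features, so in practice it
# could be estimated from unlabeled samples. Here it is computed exactly,
# using the independence of x1 and x2 given y.
p_x1_given_x2 = (p_y_and_x2 * p_x1_given_y[:, None]).sum(axis=0) / p_x2

# Task 2: the class-conditional distribution of x1 (p_x1_given_y above).

# Solve the mixture identity for P(y=1 | x2).
a0, a1 = p_x1_given_y
recovered_posterior = (p_x1_given_x2 - a0) / (a1 - a0)


def posterior_y1(x1: int, k: int) -> float:
    """P(y=1 | x1, x2=k), assembled only from the two surrogate quantities."""
    q = recovered_posterior[k]                   # P(y=1 | x2=k)
    lik1 = a1 if x1 == 1 else 1.0 - a1           # P(x1 | y=1)
    lik0 = a0 if x1 == 1 else 1.0 - a0           # P(x1 | y=0)
    return lik1 * q / (lik1 * q + lik0 * (1.0 - q))


print(np.allclose(recovered_posterior, true_posterior))
```

In this toy setting the recovered posterior matches the one computed from the full joint distribution. With real data, P(x1 | x2) would be replaced by the output of a classifier trained on unlabeled samples, and the inversion becomes sensitive to estimation error when P(x1 | y=1) and P(x1 | y=0) are close.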