In many learning problems, labeled examples are rare or expensive, while numerous unlabeled and positive examples are available. However, most learning algorithms only use labeled examples. We therefore address the problem of learning with the help of positive and unlabeled data, given only a small number of labeled examples. We present both theoretical and empirical arguments showing that learning algorithms can be improved by the use of both unlabeled and positive data. As an illustrative problem, we consider the statistical-query learning of monotone conjunctions in the presence of classification noise and give empirical evidence for our assumptions. We give theoretical results on the improvement of Statistical Query learning algorithms from positive and unlabeled data. Lastly, we apply these ideas to tree induction algorithms. We modify the code of C4.5 to obtain an algorithm which takes as input a set LAB of labeled examples, a set POS of positive examples, and a set UNL of unlabeled examples, and which uses all three sets to construct the decision tree. We provide experimental results based on data taken from the UCI repository which confirm the relevance of this approach.
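The central estimation trick behind learning from positive and unlabeled data can be sketched as follows. Assuming an estimate `prior` of the positive-class probability Pr[f=1] is available (the paper's own estimators and guarantees differ in detail), any statistical query of the form Pr[h=1, f=1] or Pr[h=1, f=0] can be approximated from a positive sample and an unlabeled sample alone. The function name and signature below are illustrative, not the authors' code:

```python
def estimate_queries(h, pos, unl, prior):
    """Estimate Pr[h(x)=1, f(x)=1] and Pr[h(x)=1, f(x)=0] from a
    positive sample `pos` and an unlabeled sample `unl`, given an
    estimate `prior` of Pr[f(x)=1].  Hypothetical sketch of the
    positive/unlabeled statistical-query estimation idea.
    """
    # Pr[h(x)=1 | f(x)=1], estimated on the positive sample.
    p_h_given_pos = sum(1 for x in pos if h(x)) / len(pos)
    # Pr[h(x)=1], estimated on the unlabeled sample.
    p_h = sum(1 for x in unl if h(x)) / len(unl)
    # Pr[h=1, f=1] = Pr[f=1] * Pr[h=1 | f=1].
    p_h1_f1 = prior * p_h_given_pos
    # Pr[h=1, f=0] = Pr[h=1] - Pr[h=1, f=1]; clamp sampling noise at 0.
    p_h1_f0 = max(0.0, p_h - p_h1_f1)
    return p_h1_f1, p_h1_f0
```

On an exhaustively enumerated Boolean domain with target f(x) = x0 AND x1 and hypothesis h(x) = x0, both estimates recover the true joint probabilities exactly, since no sampling error is involved.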
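For the tree-induction part, the same estimate can drive an entropy-based split criterion: at a candidate node, the fraction of positives among examples reaching the node is obtained by Bayes' rule from the POS and UNL samples, then plugged into the usual binary entropy. This is a minimal sketch of that idea, assuming a known class prior; the helper names are hypothetical and this is not the authors' modified C4.5 code:

```python
from math import log2

def node_pos_ratio(reaches, pos, unl, prior):
    """Estimate Pr[f=1 | x reaches node] from positive and unlabeled
    samples, given an estimate `prior` of Pr[f=1] (illustrative)."""
    # Pr[x reaches node], from the unlabeled sample.
    p_node = sum(1 for x in unl if reaches(x)) / len(unl)
    # Pr[x reaches node | f=1], from the positive sample.
    p_node_pos = sum(1 for x in pos if reaches(x)) / len(pos)
    if p_node == 0.0:
        return 0.0
    # Bayes' rule; clamp at 1 to absorb sampling noise.
    return min(1.0, prior * p_node_pos / p_node)

def binary_entropy(q):
    """Entropy of a node whose estimated positive-class ratio is q."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * log2(q) - (1.0 - q) * log2(1.0 - q)
```

A C4.5-style learner would compute this entropy for each child of a candidate split and pick the split maximizing information gain, exactly as in the labeled-data case, but with class ratios estimated from POS and UNL instead of counted from LAB.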