Communications of the ACM
Equivalence of models for polynomial learnability
Information and Computation
C4.5: programs for machine learning
Efficient noise-tolerant learning from statistical queries
STOC '93 Proceedings of the twenty-fifth annual ACM symposium on Theory of computing
An introduction to computational learning theory
Combining labeled and unlabeled data with co-training
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
Noise-tolerant learning, the parity problem, and the statistical query model
STOC '00 Proceedings of the thirty-second annual ACM symposium on Theory of computing
Machine Learning
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Positive and Unlabeled Examples Help Learning
ALT '99 Proceedings of the 10th International Conference on Algorithmic Learning Theory
On the Efficiency of Noise-Tolerant PAC Algorithms Derived from Statistical Queries
COLT '00 Proceedings of the Thirteenth Annual Conference on Computational Learning Theory
PAC Learning from Positive Statistical Queries
ALT '98 Proceedings of the 9th International Conference on Algorithmic Learning Theory
Rough set and ensemble learning based semi-supervised algorithm for text classification
Expert Systems with Applications: An International Journal
A survey of recent trends in one class classification
AICS'09 Proceedings of the 20th Irish conference on Artificial intelligence and cognitive science
TAMC'11 Proceedings of the 8th annual conference on Theory and applications of models of computation
A new PU learning algorithm for text classification
MICAI'05 Proceedings of the 4th Mexican international conference on Advances in Artificial Intelligence
Automatic state abstraction from demonstration
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Two
Named entity disambiguation in streaming data
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
A parallel genetic programming for single class classification
Proceedings of the 15th annual conference companion on Genetic and evolutionary computation
Theoretical Computer Science
In many machine learning settings, examples of one class (the positive class) are easy to obtain, and unlabeled data are abundant. In this paper we investigate the design of algorithms that learn from positive and unlabeled data only. Many machine learning and data mining algorithms use examples to estimate probabilities. We therefore design an algorithm based on positive statistical queries (estimates of probabilities over the set of positive instances) and instance statistical queries (estimates of probabilities over the instance space). The algorithm guesses the weight of the target concept (the proportion of positive instances in the instance space) with the help of a hypothesis testing algorithm. We prove that any class that is learnable in the Statistical Query model [Kea93], and for which a lower bound on the weight of any target concept f can be estimated in polynomial time, is learnable from positive statistical queries and instance statistical queries only. We then design POSC4.5, a decision tree induction algorithm based on C4.5 [Qui93] that uses only positive and unlabeled examples, and we give experimental results for this algorithm.
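As a rough illustration of why the two query types can stand in for labeled statistics, the sketch below combines an estimate over positive examples with an estimate over unlabeled examples to recover the joint probabilities a labeled sample would give. It relies only on the identities P(A and positive) = w · P(A | positive) and P(A and negative) = P(A) − w · P(A | positive), where w is the weight of the target concept. The function names are hypothetical, the weight is assumed to be supplied by the caller (the abstract says the algorithm guesses it via hypothesis testing, which is not shown here), and clamping the negative estimate at zero is a choice made for this sketch, not something taken from the paper.

```python
from typing import Callable, Sequence, Tuple, TypeVar

X = TypeVar("X")


def estimate_probability(sample: Sequence[X], predicate: Callable[[X], bool]) -> float:
    """Empirical frequency of `predicate` over a sample (a simulated statistical query)."""
    if not sample:
        raise ValueError("sample must be non-empty")
    return sum(1 for x in sample if predicate(x)) / len(sample)


def labeled_estimates(
    positive_sample: Sequence[X],
    unlabeled_sample: Sequence[X],
    predicate: Callable[[X], bool],
    weight: float,
) -> Tuple[float, float]:
    """Return estimates of (P(A and positive), P(A and negative)) for the event A = predicate.

    `weight` is the assumed proportion of positive instances in the instance space.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # Positive statistical query: P(A | positive), estimated on positive examples.
    p_given_pos = estimate_probability(positive_sample, predicate)
    # Instance statistical query: P(A), estimated on unlabeled examples.
    p_instance = estimate_probability(unlabeled_sample, predicate)
    joint_pos = weight * p_given_pos
    # Sampling error can push the difference slightly below zero, so clamp it.
    joint_neg = max(0.0, p_instance - joint_pos)
    return joint_pos, joint_neg


# Toy data: the event A is "x is even", with an assumed weight of 0.5.
joint_pos, joint_neg = labeled_estimates(
    positive_sample=[2, 4, 6, 1],
    unlabeled_sample=[1, 2, 3, 4, 5, 6, 7, 8],
    predicate=lambda x: x % 2 == 0,
    weight=0.5,
)
```

A decision tree learner in the style of C4.5 needs exactly this kind of per-test class frequency to score candidate splits, which is one way to read how a positive-and-unlabeled variant could be built on top of such estimates.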