A sequential algorithm for training text classifiers
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Partially Supervised Classification of Text Documents
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
PEBL: positive example based learning for Web page classification using SVM
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
One-class SVMs for document classification
The Journal of Machine Learning Research
A needle in a haystack: local one-class optimization
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Text Classification without Labeled Negative Documents
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Learning to classify texts using positive and unlabeled data
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Directly Identify Unexpected Instances in the Test Set by Entropy Maximization
APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management
Distributional similarity vs. PU learning for entity set expansion
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Negative training data can be harmful to text classification
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Semi-supervised learning from only positive and unlabeled data using entropy
WAIM'10 Proceedings of the 11th international conference on Web-age information management
Estimate unlabeled-data-distribution for semi-supervised PU learning
APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
Ensemble based positive unlabeled learning for time series classification
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Positive unlabeled learning for time series classification
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Two
What users care about: a framework for social content alignment
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Instance selection and instance weighting for cross-domain sentiment classification via PU learning
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Traditional classification builds a classifier from labeled training examples drawn from a set of predefined classes and then applies it to assign test instances to those same classes. In practice, this paradigm can be problematic because the test data may contain instances that belong to none of the predefined classes. Detecting such unexpected instances in the test set is therefore an important practical issue. The problem can be formulated as learning from positive and unlabeled examples (PU learning). However, current PU learning algorithms are effective only when the unlabeled set contains a large proportion of negative instances. This paper proposes a novel technique to solve this problem in the text classification domain. The technique first generates a single artificial negative document AN; the positive set P and {AN} are then used to build a naïve Bayesian classifier. Our experimental results show that this method significantly outperforms existing techniques.
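The abstract's pipeline (generate a single artificial negative document AN, then train naïve Bayes on P vs {AN}) can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact procedure: the rule used here to assemble AN (keeping words that are relatively more frequent in the unlabeled set U than in the positive set P) is an assumed heuristic, and the function names `build_an` and `train_nb` are hypothetical.

```python
# Hedged sketch of PU learning with one artificial negative document (AN).
# Assumption: AN is built from words over-represented in U relative to P;
# the actual paper may construct AN differently.
from collections import Counter
import math

def build_an(P, U):
    """Assemble AN as a bag of words whose relative frequency in the
    unlabeled set U exceeds that in the positive set P (assumed rule)."""
    p_counts = Counter(w for d in P for w in d)
    u_counts = Counter(w for d in U for w in d)
    p_total = sum(p_counts.values()) or 1
    u_total = sum(u_counts.values()) or 1
    an = Counter()
    for w, c in u_counts.items():
        excess = c / u_total - p_counts.get(w, 0) / p_total
        if excess > 0:
            # Scale the excess mass back to a pseudo word count.
            an[w] = max(1, round(excess * u_total))
    return an

def train_nb(P, an):
    """Two-class multinomial naive Bayes over P vs {AN}, with Laplace
    smoothing and equal class priors (a simplifying assumption)."""
    vocab = set(an) | {w for d in P for w in d}
    V = len(vocab)
    pos = Counter(w for d in P for w in d)
    pos_total, neg_total = sum(pos.values()), sum(an.values())

    def log_prob(doc, counts, total):
        return sum(math.log((counts.get(w, 0) + 1) / (total + V)) for w in doc)

    def classify(doc):
        lp = log_prob(doc, pos, pos_total)
        ln = log_prob(doc, an, neg_total)
        return "positive" if lp >= ln else "unexpected"

    return classify

# Toy usage: "unexpected" test instances come from a topic absent from P.
P = [["ship", "port", "cargo"], ["ship", "sea", "crew"]]
U = [["ship", "port"], ["stock", "market", "trade"], ["market", "stock"]]
classify = train_nb(P, build_an(P, U))
print(classify(["ship", "sea"]))      # positive
print(classify(["stock", "market"]))  # unexpected
```

The key point the sketch illustrates is that no individually labeled negative documents are needed: a single synthetic document standing in for the negative class is enough to give naive Bayes a second class to score against.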