The input to an algorithm that learns a binary classifier normally consists of two sets of examples, where one set consists of positive examples of the concept to be learned, and the other set consists of negative examples. However, it is often the case that the available training data are an incomplete set of positive examples, and a set of unlabeled examples, some of which are positive and some of which are negative. The problem solved in this paper is how to learn a standard binary classifier given a nontraditional training set of this nature. Under the assumption that the labeled examples are selected randomly from the positive examples, we show that a classifier trained on positive and unlabeled examples predicts probabilities that differ by only a constant factor from the true conditional probabilities of being positive. We show how to use this result in two different ways to learn a classifier from a nontraditional training set. We then apply these two new methods to solve a real-world problem: identifying protein records that should be included in an incomplete specialized molecular biology database. Our experiments in this domain show that models trained using the new methods perform better than the current state-of-the-art biased SVM method for learning from positive and unlabeled examples.
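The constant-factor result can be illustrated with a small simulation. Write s = 1 for a labeled example and y = 1 for a true positive. If labeled examples are drawn at random from the positives, then p(s=1 | x) = c · p(y=1 | x), where c = p(s=1 | y=1) does not depend on x. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation. The binned feature, the labeling rate `C_TRUE`, and the helper names are all hypothetical. The abstract does not say how the constant is estimated; here it is taken as the largest predicted value, which is only reasonable when some region of the feature space is purely positive.

```python
import random

C_TRUE = 0.3   # hypothetical labeling rate p(s=1 | y=1); unknown in practice
N_BINS = 10    # the feature x is a bin index 0..9 (illustrative choice)

def true_p_pos(x):
    """True p(y=1 | x), known only because this is a simulation."""
    return x / (N_BINS - 1)  # 0.0 in bin 0, rising to 1.0 in bin 9

def simulate(n, rng):
    """Return (x, s) pairs: s=1 is a labeled positive, s=0 is unlabeled."""
    rows = []
    for _ in range(n):
        x = rng.randrange(N_BINS)
        y = rng.random() < true_p_pos(x)
        # Labeled examples are a random sample of the positives only.
        s = 1 if (y and rng.random() < C_TRUE) else 0
        rows.append((x, s))
    return rows

def fit_g(rows):
    """Nontraditional classifier g(x) ~ p(s=1 | x): labeled fraction per bin."""
    totals = [0] * N_BINS
    labeled = [0] * N_BINS
    for x, s in rows:
        totals[x] += 1
        labeled[x] += s
    return [labeled[b] / totals[b] if totals[b] else 0.0 for b in range(N_BINS)]

rng = random.Random(0)
g = fit_g(simulate(300_000, rng))

# Assumption: bin 9 is purely positive, so the largest g value approximates c.
c_hat = max(g)

# Dividing by the estimated constant recovers an estimate of p(y=1 | x).
corrected = [min(1.0, value / c_hat) for value in g]

print(f"estimated c = {c_hat:.3f} (simulation used {C_TRUE})")
for b in range(N_BINS):
    print(f"bin {b}: g = {g[b]:.3f}  corrected = {corrected[b]:.3f}  "
          f"true = {true_p_pos(b):.3f}")
```

In this setup the raw scores `g` underestimate the true probabilities by roughly the factor c, while the rescaled values track them closely. When the classes overlap everywhere, the maximum-based estimate of c is biased low, and a different estimator would be needed.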