Most learning methods assume that the training set is drawn randomly from the population to which the learned model is to be applied. However, in many applications this assumption is invalid. For example, lending institutions build models of who is likely to repay a loan from training sets consisting of people in their records to whom loans were given in the past; however, the institution previously approved loan applications based on who was thought unlikely to default. Learning from only approved loans yields an incorrect model because the training set is a biased sample of the general population of applicants. The issue of including rejected samples in the learning process, or alternatively using rejected samples to adjust a model learned from accepted samples only, is called reject inference.

The main contribution of this paper is a systematic analysis of the different cases that arise in reject inference, with explanations of which cases arise in various real-world situations. We use Bayesian networks to formalize each case as a set of conditional independence relationships and identify eight cases, including the familiar missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR) cases. For each case we present an overview of available learning algorithms. These algorithms have been published in separate fields of research, including epidemiology, econometrics, clinical trial evaluation, sociology, and credit scoring; our second major contribution is to describe these algorithms in a common framework.
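The loan example can be illustrated with a small simulation. The sketch below is not taken from the paper: the score variable `x`, the linear default and acceptance probabilities, and the use of inverse-probability weighting are all assumptions chosen for illustration. It assumes a MAR-style setting in which acceptance depends only on the observed score, and it assumes the acceptance probability is known exactly, whereas in practice it would have to be estimated.

```python
import random


def simulate(n=20000, seed=0):
    """Simulate loan applicants (all quantities here are hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()               # assumed credit score in [0, 1]
        p_default = 0.6 - 0.5 * x      # assumed: higher score, lower default risk
        defaulted = rng.random() < p_default
        p_accept = 0.1 + 0.8 * x       # assumed: acceptance depends on x only (MAR)
        accepted = rng.random() < p_accept
        rows.append((x, defaulted, accepted, p_accept))
    return rows


rows = simulate()

# Default rate over all applicants (unobservable in practice for rejected ones).
true_rate = sum(d for _, d, _, _ in rows) / len(rows)

# Naive estimate: default rate among accepted applicants only.
accepted_rows = [r for r in rows if r[2]]
naive_rate = sum(d for _, d, _, _ in accepted_rows) / len(accepted_rows)

# Inverse-probability-weighted estimate: each accepted applicant is weighted
# by 1 / P(accepted | x), so under-represented low-score applicants count more.
weights = [1.0 / p for _, _, _, p in accepted_rows]
ipw_rate = (
    sum(w * d for w, (_, d, _, _) in zip(weights, accepted_rows)) / sum(weights)
)

print(f"all applicants: {true_rate:.3f}")
print(f"accepted only:  {naive_rate:.3f}")
print(f"reweighted:     {ipw_rate:.3f}")
```

In this setup the accepted-only estimate understates the population default rate, because accepted applicants skew toward high scores, while the reweighted estimate lands much closer to it. Reweighting of this kind relies on acceptance depending only on observed variables; it would not by itself correct an MNAR case, where acceptance also depends on information missing from the data.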