Policy mining: learning decision policies from fixed sets of data

  • Authors:
  • Bianca Zadrozny; Charles P. Elkan

  • Venue:
  • Doctoral dissertation, University of California, San Diego
  • Year:
  • 2003

Abstract

In this thesis we present a new data mining methodology for extracting decision policies from datasets containing descriptions of interactions with an environment. This methodology, which we call policy mining, is valuable for applications in which experimental interaction is not feasible but fixed sets of collected data are available, such as direct marketing, credit card fraud detection, recommender systems and medical treatment. Recent advances in classifier learning and the availability of a wide variety of off-the-shelf learners make classifier learning an attractive core generalization tool for policy mining. However, successfully applying classifier learning to policy mining requires three improvements to current classifier learning technology. First, standard classifier learners assume that all incorrect predictions are equally costly. This thesis presents two general methods for cost-sensitive learning that account for the fact that misclassification costs differ across examples and are unknown for some examples. The proposed methods are evaluated carefully in experiments on large, difficult and highly cost-sensitive datasets from the direct marketing domain. Second, most existing learning methods produce classifiers that output ranking scores along with the class label. These scores, however, are classifier-dependent and cannot easily be combined with other sources of information for decision-making. This thesis presents a fast and effective calibration algorithm for transforming ranking scores into accurate class membership probability estimates. Experimental results on datasets from a variety of domains show that the method produces probability estimates comparable to or better than those produced by other methods. Finally, learning algorithms commonly assume that the available data consists of examples drawn at random from the same underlying distribution as the examples about which the learned model is expected to make predictions. In many situations, however, this assumption is violated because we do not have control over the data-gathering process. This thesis formalizes the sample selection bias problem in machine learning and presents methods for learning and evaluation under sample selection bias.
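
Illustrative sketches

The three contributions above lend themselves to short code illustrations. The snippets below are not the algorithms from the thesis; they are minimal sketches of the general techniques each contribution concerns, with all helper names and parameters chosen here for illustration.

For the first contribution, the core of example-dependent cost-sensitive decision making is to choose, for each example, the action with the lowest expected cost under a per-example cost matrix. A minimal sketch, assuming calibrated probability estimates are already available:

```python
import numpy as np

def expected_cost_decisions(prob_pos, cost_matrix):
    """Pick the cheapest action per example (hypothetical helper).

    prob_pos    : shape (n,), calibrated estimates of P(y = 1 | x).
    cost_matrix : shape (n, 2, 2); cost_matrix[i, a, y] is the cost of
                  taking action a on example i when its true class is y.
                  Costs may differ from example to example.
    """
    prob = np.stack([1.0 - prob_pos, prob_pos], axis=1)    # (n, 2): P(y=0|x), P(y=1|x)
    exp_cost = np.einsum("nay,ny->na", cost_matrix, prob)  # (n, 2): expected cost per action
    return exp_cost.argmin(axis=1)                         # action minimizing expected cost
```

In a direct-marketing setting, for instance, the cost of mailing a non-responder is the fixed mailing cost, while the cost of not mailing a responder is the donation foregone, which varies per person.

For the second contribution, the abstract does not describe the calibration algorithm itself; one standard way to turn classifier-dependent ranking scores into probability estimates is isotonic regression, which fits a monotone map from scores to empirical class frequencies on held-out data. A sketch using scikit-learn:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real domain; split off a calibration set.
X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
raw_scores = clf.predict_proba(X_cal)[:, 1]      # classifier-dependent ranking scores

iso = IsotonicRegression(out_of_bounds="clip")   # clip scores outside the fitted range
iso.fit(raw_scores, y_cal)                       # learn monotone score -> probability map
calibrated = iso.predict(raw_scores)             # calibrated P(y = 1 | x) estimates
```

For the third contribution, a common correction for sample selection bias, when selection depends only on the features, is importance weighting: train a selector model to distinguish the biased training sample from an unbiased reference sample, then weight each training example by the inverse odds of selection. A sketch, with `selection_weights` a hypothetical helper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def selection_weights(X_train, X_reference):
    """Estimate weights proportional to the inverse odds of selection.

    X_train     : features of the (possibly biased) labeled sample.
    X_reference : an unlabeled sample from the target population.
    """
    X = np.vstack([X_train, X_reference])
    s = np.concatenate([np.ones(len(X_train)), np.zeros(len(X_reference))])
    selector = LogisticRegression(max_iter=1000).fit(X, s)  # models P(selected | x)
    p_sel = selector.predict_proba(X_train)[:, 1]
    w = (1.0 - p_sel) / np.clip(p_sel, 1e-6, None)          # proportional to density ratio
    return w / w.mean()                                     # normalize for stability

# The weights plug into any learner accepting per-example weights, e.g.
#   LogisticRegression().fit(X_train, y_train, sample_weight=w)
```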