Machine learning paradigms for utility-based data mining
UBDM '05 Proceedings of the 1st international workshop on Utility-based data mining
Data acquisition and cost-effective predictive modeling: targeting offers for electronic commerce
Proceedings of the ninth international conference on Electronic commerce
Get another label? improving data quality and data mining using multiple, noisy labelers
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Decision Support Systems
Efficiently learning the accuracy of labeling sources for selective sampling
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Induction over Strategic Agents
Information Systems Research
Repeated labeling using multiple noisy labelers
Data Mining and Knowledge Discovery
Data mining requires certain information; for example, supervised learning requires training data. Some prior research has recognized that this information often does not simply present itself for free but involves various acquisition costs. In addition, applying the learned models involves costs and benefits. I introduce a general economic setting that includes as special cases the settings of many different streams of prior research, such as cost-sensitive learning, traditional active learning, semi-supervised learning, active feature acquisition, progressive sampling, and budgeted learning, all of which are inextricably interwoven. For data mining in the general setting, I suggest a strategy of maximum expected-utility data acquisition. Finally, I discuss the many open research issues that must be addressed. As a simple example, we must be able to deal with the seemingly straightforward problem of handling missing values in induction and inference.
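The idea of maximum expected-utility data acquisition can be illustrated with a minimal Python sketch. It is not the formulation from the paper, only one simple reading of the phrase: each candidate acquisition has a cost and a distribution over benefits, and the learner buys the candidate with the highest expected benefit net of cost, or nothing if no candidate has positive expected utility. The `Acquisition` class, the function names, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Acquisition:
    """A hypothetical candidate purchase, such as a label or a feature value."""
    name: str
    cost: float  # assumed price of acquiring this piece of information
    # Assumed outcome model: (probability, benefit if that outcome occurs).
    outcomes: Sequence[Tuple[float, float]]


def expected_utility(a: Acquisition) -> float:
    """Expected benefit over the possible outcomes, minus the acquisition cost."""
    return sum(p * b for p, b in a.outcomes) - a.cost


def choose_acquisition(candidates: Sequence[Acquisition]) -> Optional[Acquisition]:
    """Return the candidate with maximum expected utility, or None if none pays off."""
    best = max(candidates, key=expected_utility, default=None)
    if best is None or expected_utility(best) <= 0:
        return None
    return best


# Illustrative candidates mixing the kinds of information the abstract mentions.
candidates = [
    Acquisition("class label for example 17", cost=1.0,
                outcomes=[(0.5, 4.0), (0.5, 0.0)]),
    Acquisition("feature value for example 42", cost=0.25,
                outcomes=[(0.25, 2.0), (0.75, 0.0)]),
    Acquisition("extra unlabeled example", cost=0.5,
                outcomes=[(1.0, 0.25)]),
]
choice = choose_acquisition(candidates)
```

In this toy setup the label is chosen because its expected benefit exceeds its cost by the widest margin. In practice the hard part is estimating the benefit distribution, which the sketch simply assumes is given.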