Data acquisition is the first and one of the most important steps in many data mining applications, and it is a time-consuming and costly task. Acquiring too few examples makes the learned model and its future predictions inaccurate, while acquiring more examples than necessary wastes time and money. It is therefore important to estimate the number of examples a learning algorithm needs. However, most previous learning algorithms learn from a given, fixed set of examples; to our knowledge, little prior work in machine learning can dynamically acquire examples while learning and decide the ideal number of examples needed. In this paper, we propose a simple on-line framework for fast data acquisition (FDA). FDA is an extrapolation method that estimates the number of examples needed in each acquisition round and acquires them all at once. Compared to the naïve step-by-step data acquisition strategy, FDA significantly reduces the number of data acquisition and model-building rounds, and thus the total cost, including the costs of misclassification, data acquisition arrangement, computation, and the acquired examples themselves.
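The abstract does not spell out the extrapolation step, but one common way to realize this kind of estimate is to fit a power-law learning curve to the (training size, error) pairs observed so far and solve for the size at which the error is expected to reach a target. The sketch below illustrates that idea under the assumption of an `err(n) ≈ a * n^(-b)` curve; the function name `estimate_examples_needed` and the target-error parameter are hypothetical, not taken from the paper.

```python
import numpy as np

def estimate_examples_needed(sizes, errors, target_error):
    """Extrapolation sketch (assumed power-law form, not the paper's exact method).

    Fit err(n) ~ a * n**(-b) to observed (training size, error) pairs by a
    linear regression in log-log space, then solve for the training-set size
    at which the fitted curve reaches target_error.
    """
    log_n = np.log(np.asarray(sizes, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    # log err = log a - b * log n  ->  slope = -b, intercept = log a
    slope, intercept = np.polyfit(log_n, log_e, 1)
    # Invert the fitted curve: n = exp((log target - log a) / slope)
    n_needed = np.exp((np.log(target_error) - intercept) / slope)
    return int(round(n_needed))

# Example: synthetic curve err(n) = n**(-0.5); halving the error
# requires quadrupling the data, so reaching 0.025 needs ~1600 examples.
n = estimate_examples_needed([100, 200, 400],
                             [0.1, 0.1 / np.sqrt(2), 0.05],
                             target_error=0.025)
```

In an on-line acquisition loop, such an estimate would let the learner request one large batch (the FDA idea of acquiring the estimated examples at once) instead of buying one example at a time and retraining after each purchase.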