In many data mining domains, misclassification costs differ from example to example, just as class membership probabilities do. In these domains, both costs and probabilities are unknown for test examples, so both cost estimators and probability estimators must be learned. After discussing how to make optimal decisions given cost and probability estimates, we present decision tree and naive Bayesian learning methods for obtaining well-calibrated probability estimates. We then explain how to obtain unbiased estimators for example-dependent costs. Two difficulties must be taken into account: in general, probabilities and costs are not independent random variables, and the training examples for which costs are known are not representative of all examples. The latter problem is called sample selection bias in econometrics, and our solution to it is based on the Nobel Prize-winning work of the economist James Heckman. In a comprehensive experimental comparison on the well-known, large, and challenging dataset from the KDD'98 data mining contest, the methods we propose perform better than MetaCost and all other known methods.
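As an illustration of making decisions from cost and probability estimates, the following is a minimal sketch and not the paper's implementation. It assumes a binary setting in which taking an action has a fixed cost and yields an example-dependent benefit only when the example is positive. The names `Estimate`, `expected_net_benefit`, and `should_act`, and all numeric values, are hypothetical. The sketch also simplifies by multiplying a separately estimated probability and benefit, whereas the abstract notes that probabilities and costs are in general not independent.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Per-example estimates; both fields are assumed to come from learned models."""
    prob_positive: float        # estimated P(positive | x), assumed well calibrated
    benefit_if_positive: float  # estimated example-dependent benefit if x is positive


def expected_net_benefit(est: Estimate, action_cost: float) -> float:
    """Expected benefit of acting on this example minus the fixed cost of acting."""
    return est.prob_positive * est.benefit_if_positive - action_cost


def should_act(est: Estimate, action_cost: float) -> bool:
    """Act only when the expected net benefit is strictly positive."""
    return expected_net_benefit(est, action_cost) > 0.0


# Illustrative values only (hypothetical, not taken from any dataset).
ACTION_COST = 0.68
likely_but_small = Estimate(prob_positive=0.30, benefit_if_positive=2.0)
unlikely_but_large = Estimate(prob_positive=0.05, benefit_if_positive=20.0)

for name, est in [("likely_but_small", likely_but_small),
                  ("unlikely_but_large", unlikely_but_large)]:
    print(name, round(expected_net_benefit(est, ACTION_COST), 2),
          should_act(est, ACTION_COST))
```

With these values the example with the higher probability is rejected and the one with the lower probability is accepted, which is why a probability threshold alone is not enough when costs are example-dependent.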