Tree induction is one of the most effective and widely used methods for building classification models. However, many applications require cases to be ranked by the probability of class membership. Probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in high dimensions and on large data sets). Unfortunately, decision trees have been found to provide poor probability estimates. Several techniques have been proposed to build more accurate PETs, but, to our knowledge, there has not been a systematic experimental analysis of which techniques actually improve the probability-based rankings, and by how much. In this paper we first discuss why the decision-tree representation is not intrinsically inadequate for probability estimation. Inaccurate probabilities are partially the result of decision-tree induction algorithms that focus on maximizing classification accuracy and minimizing tree size (for example via reduced-error pruning). Larger trees can be better for probability estimation, even if the extra size is superfluous for accuracy maximization. We then present the results of a comprehensive set of experiments, testing some straightforward methods for improving probability-based rankings. We show that using a simple, common smoothing method—the Laplace correction—uniformly improves probability-based rankings. In addition, bagging substantially improves the rankings, and is even more effective for this purpose than for improving accuracy. We conclude that PETs, with these simple modifications, should be considered when rankings based on class-membership probability are required.
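The two modifications named above can be illustrated with a minimal sketch. The abstract gives no formulas, so this assumes the standard form of the Laplace correction, (k + 1) / (n + C) for a leaf holding n training cases, k of them in the class of interest, with C classes. It also assumes that a bagged ensemble scores a case by averaging the per-tree leaf estimates. The function names `leaf_probability` and `bagged_probability` are hypothetical and are not taken from the paper.

```python
def leaf_probability(k, n, num_classes=2, laplace=True):
    """Estimate class-membership probability at a single leaf.

    k: training cases at the leaf that belong to the class of interest
    n: total training cases at the leaf
    Assumes the standard Laplace correction (k + 1) / (n + num_classes).
    """
    if laplace:
        return (k + 1) / (n + num_classes)
    if n == 0:
        raise ValueError("raw frequency is undefined for an empty leaf")
    return k / n


def bagged_probability(leaf_counts, num_classes=2):
    """Average Laplace-corrected leaf estimates over the trees of an ensemble.

    leaf_counts: one (k, n) pair per tree, taken from the leaf that the
    case being scored falls into. Averaging is an assumption of this sketch.
    """
    estimates = [leaf_probability(k, n, num_classes) for k, n in leaf_counts]
    return sum(estimates) / len(estimates)


# Two pure leaves: raw frequencies tie at 1.0, so they cannot be ranked.
small_raw = leaf_probability(3, 3, laplace=False)
large_raw = leaf_probability(50, 50, laplace=False)

# With the correction, the leaf backed by more evidence scores higher.
small_smoothed = leaf_probability(3, 3)
large_smoothed = leaf_probability(50, 50)

# A case that lands in different leaves across three bagged trees.
ensemble_score = bagged_probability([(3, 3), (1, 3), (50, 50)])
```

In this sketch the uncorrected estimates for the 3-case and 50-case leaves are identical, while the corrected estimates separate them, which is one way smoothing can change a probability-based ranking. Averaging over several trees additionally produces a finer range of scores than any single tree's leaves provide.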