An Empirical Comparison of Selection Measures for Decision-Tree Induction

  • Authors:
  • John Mingers

  • Affiliations:
  • School of Industrial and Business Studies, University of Warwick, Coventry CV4 7AL, U.K. BSRCD@CU.WARWICK.AC.UK

  • Venue:
  • Machine Learning
  • Year:
  • 1989


Abstract

One approach to induction is to develop a decision tree from a set of examples. When used with noisy rather than deterministic data, the method involves three main stages – creating a complete tree able to classify all the examples, pruning this tree to give statistical reliability, and processing the pruned tree to improve understandability. This paper is concerned with the first stage – tree creation – which relies on a measure for “goodness of split,” that is, how well the attributes discriminate between classes. Some problems encountered at this stage are missing data and multi-valued attributes. The paper considers a number of different measures and experimentally examines their behavior in four domains. The results show that the choice of measure affects the size of a tree but not its accuracy, which remains the same even when attributes are selected randomly.
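The abstract does not name the specific selection measures compared, but one widely used "goodness of split" measure for tree creation is Quinlan's information gain: the reduction in class entropy achieved by partitioning the examples on an attribute. The sketch below is illustrative only (the function names and the toy `outlook` attribute are assumptions, not from the paper):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Reduction in class entropy from splitting the examples on `attribute`.

    `examples` is a list of dicts mapping attribute names to values;
    `labels` is the parallel list of class labels.
    """
    n = len(labels)
    # Partition the class labels by the attribute's value in each example.
    partitions = {}
    for ex, y in zip(examples, labels):
        partitions.setdefault(ex[attribute], []).append(y)
    # Expected entropy remaining after the split, weighted by partition size.
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

# Hypothetical toy data: the attribute separates the classes perfectly,
# so the gain equals the full initial entropy of 1 bit.
examples = [{"outlook": "sunny"}, {"outlook": "sunny"},
            {"outlook": "rain"}, {"outlook": "rain"}]
labels = ["no", "no", "yes", "yes"]
gain = information_gain(examples, labels, "outlook")
```

At each node, tree creation would evaluate every candidate attribute this way and split on the one with the highest score; the paper's experiments suggest this choice affects tree size but not classification accuracy.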