The Role of Occam's Razor in Knowledge Discovery

Authors:
Pedro Domingos
Affiliations:
Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195. pedrod@cs.washington.edu
Venue:
Data Mining and Knowledge Discovery
Year:
1999

Citing 49
Cited 52

Randomization tests

Randomization tests
Occam's razor

Information Processing Letters
Quantifying inductive bias: AI learning algorithms and Valiant's learning framework

Artificial Intelligence
Inferring decision trees using the minimum description length principle

Information and Computation
Learning from hints in neural networks

Journal of Complexity
Elements of information theory

Elements of information theory
Neural networks and the bias/variance dilemma

Neural Computation
Bayesian interpolation

Neural Computation
Original Contribution: Stacked generalization

Neural Networks
Overfitting Avoidance as Bias

Machine Learning
Induction with randomization testing: decision-oriented analysis of large data sets

Induction with randomization testing: decision-oriented analysis of large data sets
Very Simple Classification Rules Perform Well on Most Commonly Used Datasets

Machine Learning
Theory refinement combining analytical and empirical methods

Artificial Intelligence
Grammatically biased learning: learning logic programs using an explicit antecedent description language

Artificial Intelligence
The nature of statistical learning theory

The nature of statistical learning theory
Learning Bayesian Networks: The Combination of Knowledge and Statistical Data

Machine Learning
Creating advice-taking reinforcement learners

Machine Learning - Special issue on reinforcement learning
Bagging predictors

Machine Learning
Unifying instance-based and rule-based induction

Machine Learning
Metaqueries for data mining

Advances in knowledge discovery and data mining
On the Optimality of the Simple Bayesian Classifier under Zero-One Loss

Machine Learning - Special issue on learning with probabilistic representations
Efficient Approximations for the Marginal Likelihood of Bayesian Networks with Hidden Variables

Machine Learning - Special issue on learning with probabilistic representations
Knowledge-Based Learning in Exploratory Science: Learning Rules to Predict Rodent Carcinogenicity

Machine Learning - Special issue on applications of machine learning and the knowledge discovery process
Boosting in the limit: maximizing the margin of learned ensembles

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Advances in kernel methods: support vector learning

Advances in kernel methods: support vector learning
Multiple Comparisons in Induction Algorithms

Machine Learning
Neural Networks for Pattern Recognition

Neural Networks for Pattern Recognition
On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality

Data Mining and Knowledge Discovery
A Simple Constraint-Based Algorithm for Efficiently Mining Observational Databases for Causal Relationships

Data Mining and Knowledge Discovery
An Empirical Comparison of Pruning Methods for Decision Tree Induction

Machine Learning
Text Categorization with Support Vector Machines: Learning with Many Relevant Features

ECML '98 Proceedings of the 10th European Conference on Machine Learning
Boosting the margin: A new explanation for the effectiveness of voting methods

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Knowledge Acquisition from Examples Via Multiple Models

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Declarative Bias in Equation Discovery

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Characterizing the generalization performance of model selection strategies

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A Process-Oriented Heuristic for Model Selection

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
A New SQL-like Operator for Mining Association Rules

VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Process-Oriented Estimation of Generalization Error

IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
Option Decision Trees with Majority Votes

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Extracting comprehensible models from trained neural networks

Extracting comprehensible models from trained neural networks
The lack of a priori distinctions between learning algorithms

Neural Computation
Decision tree grafting

IJCAI'97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2
Further experimental evidence against the utility of Occam's razor

Journal of Artificial Intelligence Research
Oversearching and layered search in empirical learning

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Lookahead and pathology in decision tree induction

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Lessons in neural network training: overfitting may be harder than expected

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
An empirical evaluation of bagging and boosting

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
A new metric-based approach to model selection

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Bagging, boosting, and C4.5

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1

Towards an effective cooperation of the user and the computer for classification

Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Knowledge discovery with second-order relations

Knowledge and Information Systems
Beyond Occam's Razor: Process-Oriented Evaluation

ECML '00 Proceedings of the 11th European Conference on Machine Learning
Phase Transitions and Stochastic Local Search in k-Term DNF Learning

ECML '02 Proceedings of the 13th European Conference on Machine Learning
Possibilistic Induction in Decision-Tree Learning

ECML '02 Proceedings of the 13th European Conference on Machine Learning
The Role of Domain Knowledge in a Large Scale Data Mining Project

SETN '02 Proceedings of the Second Hellenic Conference on AI: Methods and Applications of Artificial Intelligence
Learning of Boolean Functions Using Support Vector Machines

ALT '01 Proceedings of the 12th International Conference on Algorithmic Learning Theory
Defining Similarity Measures: Top-Down vs. Bottom-Up

ECCBR '02 Proceedings of the 6th European Conference on Advances in Case-Based Reasoning
Data mining tasks and methods: scalability

Handbook of data mining and knowledge discovery
Knowledge evaluation: Other evaluations: minimum description length

Handbook of data mining and knowledge discovery
An empirical comparison of supervised machine learning techniques in bioinformatics

APBC '03 Proceedings of the First Asia-Pacific bioinformatics conference on Bioinformatics 2003 - Volume 19
Data Mining for Generating Predictive Models of Local Hydrology

Applied Intelligence
Machine Learning for Computer Graphics: A Manifesto and Tutorial

PG '03 Proceedings of the 11th Pacific Conference on Computer Graphics and Applications
ART: A Hybrid Classification Model

Machine Learning
A Bias-Variance Analysis of a Real World Learning Problem: The CoIL Challenge 2000

Machine Learning
Pareto-optimal patterns in logical analysis of data

Discrete Applied Mathematics - Discrete mathematics & data mining (DM & DM)
Induction of comprehensible models for gene expression datasets by subgroup discovery methodology

Journal of Biomedical Informatics - Special issue: Biomedical machine learning
Randomised restarted search in ILP

Machine Learning
Design and evaluation of visualization support to facilitate decision trees classification

International Journal of Human-Computer Studies
A Dichotomic Search Algorithm for Mining and Learning in Domain-Specific Logics

Fundamenta Informaticae - Advances in Mining Graphs, Trees and Sequences
Take a load off: cognitive considerations for game design

Proceedings of the 3rd Australasian conference on Interactive entertainment
Argument based machine learning

Artificial Intelligence
Hybrid systems of local basis functions

Intelligent Data Analysis
Rule effectiveness in rule-based systems: A credit scoring case study

Expert Systems with Applications: An International Journal
Learning as Data Compression

CiE '07 Proceedings of the 3rd conference on Computability in Europe: Computation and Logic in the Real World
LEGAL-tree: a lexicographic multi-objective genetic algorithm for decision tree induction

Proceedings of the 2009 ACM symposium on Applied Computing
Argument Based Rule Learning

Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
Operator equalisation, bloat and overfitting: a study on human oral bioavailability prediction

Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Using crossover based similarity measure to improve genetic programming generalization ability

Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Any time induction of decision trees: an iterative improvement approach

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Learning from ambiguously labeled examples

Intelligent Data Analysis - Selected papers from IDA2005, Madrid, Spain
Occam's razor just got sharper

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Pareto-optimal patterns in logical analysis of data

Discrete Applied Mathematics
XKey: A tool for the generation of identification keys

Expert Systems with Applications: An International Journal
Interactive visual decision tree classification

HCI'07 Proceedings of the 12th international conference on Human-computer interaction: interaction platforms and techniques
Analysis on classification performance of rough set based reducts

PRICAI'06 Proceedings of the 9th Pacific Rim international conference on Artificial intelligence
An effective sampling scheme for better multi-layer perceptrons

AIKED'10 Proceedings of the 9th WSEAS international conference on Artificial intelligence, knowledge engineering and data bases
Evaluating learning algorithms and classifiers

International Journal of Intelligent Information and Database Systems
Lexicographic multi-objective evolutionary induction of decision trees

International Journal of Bio-Inspired Computation
Measuring bloat, overfitting and functional complexity in genetic programming

Proceedings of the 12th annual conference on Genetic and evolutionary computation
Open issues in genetic programming

Genetic Programming and Evolvable Machines
Investigating better multi-layer perceptrons for the task of classification

WSEAS Transactions on Computers
A Randomized Exhaustive Propositionalization Approach for Molecule Classification

INFORMS Journal on Computing
Genetic programming, validation sets, and parsimony pressure

EuroGP'06 Proceedings of the 9th European conference on Genetic Programming
Generality is predictive of prediction accuracy

Data Mining
Learning from ambiguously labeled examples

IDA'05 Proceedings of the 6th international conference on Advances in Intelligent Data Analysis
Sample complexity of linear learning machines with different restrictions over weights

ICAISC'12 Proceedings of the 11th international conference on Artificial Intelligence and Soft Computing - Volume Part II
Data-Mining-Driven Neighborhood Search

INFORMS Journal on Computing
Performance of classification models from a user perspective

Decision Support Systems
Operator equalisation for bloat free genetic programming and a survey of bloat control methods

Genetic Programming and Evolvable Machines
A few useful things to know about machine learning

Communications of the ACM

Quantified Score

Hi-index	0.02

Abstract

Many KDD systems incorporate an implicit or explicit preference for simpler models, but this use of “Occam's razor” has been strongly criticized by several authors (e.g., Schaffer, 1993; Webb, 1996). This controversy arises partly because Occam's razor has been interpreted in two quite different ways. The first interpretation (simplicity is a goal in itself) is essentially correct, but is at heart a preference for more comprehensible models. The second interpretation (simplicity leads to greater accuracy) is much more problematic. A critical review of the theoretical arguments for and against it shows that it is unfounded as a universal principle, and demonstrably false. A review of empirical evidence shows that it also fails as a practical heuristic. This article argues that its continued use in KDD risks causing significant opportunities to be missed, and should therefore be restricted to the comparatively few applications where it is appropriate. The article proposes and reviews the use of domain constraints as an alternative for avoiding overfitting, and examines possible methods for handling the accuracy–comprehensibility trade-off.