Explorations of an Incremental, Bayesian Algorithm for Categorization

Authors:
John R. Anderson;Michael Matessa
Affiliations:
Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213. JAOS@ANDREW.CMU.EDU;Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213
Venue:
Machine Learning
Year:
1992

Abstract

An incremental categorization algorithm is described that, at each step, assigns the next instance to the most probable category. Probabilities are estimated by a Bayesian inference scheme that assumes instances are partitioned into categories and that, within a category, features are displayed independently and probabilistically. The algorithm can be shown to be an optimization of an ideal Bayesian algorithm in which predictive accuracy is traded for computational efficiency. It can deliver predictions about any dimension of a category and does not give special treatment to the prediction of category labels. The algorithm has successfully modeled much of the empirical literature on human categorization. This paper describes its application to a number of data sets from the machine learning literature. The algorithm performs reasonably well; its only serious difficulty arises because the assumption of independent features is not always satisfied. Bayesian extensions to deal with nonindependent features are described and evaluated.
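
The incremental scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several specifics are assumptions that the abstract does not state: discrete features with a known value set per dimension, a category prior governed by a hypothetical `coupling` parameter (in the style of priors used in rational models of categorization), and add-`alpha` smoothing for the within-category feature probabilities. The class and method names are likewise illustrative.

```python
class IncrementalBayesCategorizer:
    """Sketch of incremental Bayesian categorization (assumptions noted above)."""

    def __init__(self, values, coupling=0.5, alpha=1.0):
        self.values = values      # assumed: possible values for each dimension
        self.c = coupling         # assumed prior parameter, 0 < c < 1
        self.alpha = alpha        # assumed smoothing constant
        self.sizes = []           # number of instances in each category
        self.counts = []          # counts[k][i][v]: value counts per category
        self.n = 0                # instances seen so far

    def _prior(self, k):
        # Index len(self.sizes) stands for a not-yet-created category.
        denom = (1 - self.c) + self.c * self.n
        if k == len(self.sizes):
            return (1 - self.c) / denom
        return self.c * self.sizes[k] / denom

    def _p_value(self, k, i, v):
        # Smoothed probability of value v on dimension i within category k.
        new = k == len(self.sizes)
        count = 0 if new else self.counts[k][i].get(v, 0)
        size = 0 if new else self.sizes[k]
        return (count + self.alpha) / (size + self.alpha * len(self.values[i]))

    def posterior(self, instance):
        # Features are treated as independent within a category;
        # dimensions set to None are unobserved and skipped.
        scores = []
        for k in range(len(self.sizes) + 1):
            p = self._prior(k)
            for i, v in enumerate(instance):
                if v is not None:
                    p *= self._p_value(k, i, v)
            scores.append(p)
        total = sum(scores)
        return [s / total for s in scores]

    def add(self, instance):
        # Assign the instance to the single most probable category.
        post = self.posterior(instance)
        k = max(range(len(post)), key=post.__getitem__)
        if k == len(self.sizes):
            self.sizes.append(0)
            self.counts.append([{} for _ in self.values])
        self.sizes[k] += 1
        for i, v in enumerate(instance):
            if v is not None:
                self.counts[k][i][v] = self.counts[k][i].get(v, 0) + 1
        self.n += 1
        return k

    def predict(self, instance, dim):
        # Predictive distribution for one dimension, averaged over categories.
        # A category label can be handled as just another dimension.
        probe = list(instance)
        probe[dim] = None
        post = self.posterior(probe)
        return {v: sum(w * self._p_value(k, dim, v) for k, w in enumerate(post))
                for v in self.values[dim]}


model = IncrementalBayesCategorizer(values=[(0, 1)] * 3)
labels = [model.add(x) for x in [(1, 1, 1), (1, 1, 1), (0, 0, 0)]]
dist = model.predict((1, 1, None), dim=2)
```

Committing each instance to one category, rather than tracking every possible partition, is where the trade of predictive accuracy for computational efficiency shows up in this sketch. The independence assumption sits in the product inside `posterior`.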