Recursive Prototype Reduction Schemes Applicable for Large Data Sets

Authors:
Sang-Woon Kim; B. John Oommen
Venue:
Proceedings of the Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
Year:
2002

Abstract

Most Prototype Reduction Schemes (PRS) reported in the literature process the data in its entirety to yield a subset of prototypes that are useful in nearest-neighbour-like classification. Foremost among these are the Prototypes for Nearest Neighbour (PNN) classifiers, the Vector Quantization (VQ) technique, and Support Vector Machines (SVM). These methods suffer from a major disadvantage, namely the excessive computational burden incurred by processing all the data at once. In this paper, we suggest a recursive and computationally superior mechanism. Rather than process all the data using a PRS, we propose that the data be recursively subdivided into smaller subsets. This recursive subdivision can be arbitrary, and need not utilize any underlying clustering philosophy. The advantage of this is that the PRS processes subsets of data points that effectively sample the entire space to yield smaller subsets of prototypes. These prototypes are then, in turn, gathered and processed by the PRS to yield more refined prototypes. Our experimental results demonstrate that the proposed recursive mechanism yields classification comparable to that of the best prototype condensation schemes reported to date, for both artificial and real-life data sets. The results especially demonstrate the computational advantage of using such a recursive strategy for large data sets, such as those involved in data mining and text categorization applications.
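
The recursive strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names PNN, VQ, and SVM as the underlying PRS, whereas this sketch substitutes a simple Hart-style condensed nearest-neighbour rule as a stand-in. The function names (condense, recursive_prs) and the parameters leaf_size and parts are hypothetical, chosen only for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(point, prototypes):
    """Label of the prototype closest to `point` (1-NN rule)."""
    return min(prototypes, key=lambda p: sq_dist(point, p[0]))[1]


def condense(samples):
    """Stand-in PRS (assumption): Hart-style condensed nearest neighbour.

    `samples` is a list of (point, label) pairs. A sample is added to the
    store whenever the current store misclassifies it; passes repeat until
    a full pass adds nothing.
    """
    if not samples:
        return []
    store = [samples[0]]
    in_store = [False] * len(samples)
    in_store[0] = True
    changed = True
    while changed:
        changed = False
        for i, (point, label) in enumerate(samples):
            if in_store[i]:
                continue  # each sample is added at most once
            if nearest_label(point, store) != label:
                store.append(samples[i])
                in_store[i] = True
                changed = True
    return store


def recursive_prs(samples, prs=condense, leaf_size=50, parts=2, rng=None):
    """Recursively subdivide, reduce each subset, then reduce the union.

    The split is arbitrary (a shuffle followed by striding), mirroring the
    abstract's point that no clustering is required. `leaf_size` and
    `parts` are illustrative parameters, not values from the paper.
    """
    rng = rng or random.Random(0)
    if len(samples) <= leaf_size:
        return prs(samples)          # small enough: apply the PRS directly
    shuffled = samples[:]
    rng.shuffle(shuffled)
    gathered = []
    for k in range(parts):
        chunk = shuffled[k::parts]   # arbitrary, roughly equal subsets
        gathered.extend(recursive_prs(chunk, prs, leaf_size, parts, rng))
    return prs(gathered)             # refine the gathered prototypes
```

Because the PRS only ever sees a leaf-sized subset or a set of already reduced prototypes, a PRS whose cost grows faster than linearly with its input does far less work per call than it would on the full data set. Any other reduction routine taking and returning a list of (point, label) pairs could be passed as prs.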