Enhancing prototype reduction schemes with recursion: a method applicable for "large" data sets

Authors:
Sang-Woon Kim;B. J. Oommen
Affiliations:
Div. of Comput. Sci. & Eng., Myongji Univ., Yongin, South Korea;-
Venue:
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Year:
2004

Cited by (17):

On Using Prototype Reduction Schemes and Classifier Fusion Strategies to Optimize Kernel-Based Nonlinear Subspace Methods
IEEE Transactions on Pattern Analysis and Machine Intelligence

Prototype reduction schemes applicable for non-stationary data sets
Pattern Recognition

On using prototype reduction schemes to optimize dissimilarity-based classification
Pattern Recognition

Support vector machine classification for large data sets via minimum enclosing ball clustering
Neurocomputing

On using prototype reduction schemes to enhance the computation of volume-based inter-class overlap measures
Pattern Recognition

Democratic instance selection: A linear complexity instance selection algorithm based on classifier ensemble concepts
Artificial Intelligence

A fast computation of inter-class overlap measures using prototype reduction schemes
Canadian AI'08 Proceedings of the Canadian Society for computational studies of intelligence, 21st conference on Advances in artificial intelligence

Data compression by volume prototypes for streaming data
Pattern Recognition

On using prototype reduction schemes to optimize locally linear reconstruction methods
Pattern Recognition

Clinical charge profiles prediction for patients diagnosed with chronic diseases using Multi-level Support Vector Machine
Expert Systems with Applications: An International Journal

Support vector machine classification based on fuzzy clustering for large data sets
MICAI'06 Proceedings of the 5th Mexican international conference on Artificial Intelligence

On optimizing dissimilarity-based classification using prototype reduction schemes
ICIAR'06 Proceedings of the Third international conference on Image Analysis and Recognition - Volume Part I

Time-varying prototype reduction schemes applicable for non-stationary data sets
AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence

Optimizing dissimilarity-based classifiers using a newly modified hausdorff distance
PKAW'06 Proceedings of the 9th Pacific Rim Knowledge Acquisition international conference on Advances in Knowledge Acquisition and Management

Class proximity measures - Dissimilarity-based classification and display of high-dimensional data
Journal of Biomedical Informatics

Detecting RNA sequences using two-stage SVM classifier
LSMS'07 Proceedings of the 2007 international conference on Life System Modeling and Simulation

Fast classification for large data sets via random selection clustering and Support Vector Machines
Intelligent Data Analysis

Abstract

Most prototype reduction schemes (PRS) reported in the literature process the data in its entirety to yield a subset of prototypes that are useful in nearest-neighbor-like classification. Foremost among these are the prototypes for nearest neighbor classifiers, the vector quantization technique, and the support vector machines. These methods suffer from a major disadvantage, namely, the excessive computational burden incurred by processing all the data. In this paper, we suggest a recursive and computationally superior mechanism referred to as adaptive recursive partitioning (ARP_PRS). Rather than process all the data using a PRS, we propose that the data be recursively subdivided into smaller subsets. This recursive subdivision can be arbitrary, and need not utilize any underlying clustering philosophy. The advantage of ARP_PRS is that the PRS processes subsets of data points that effectively sample the entire space to yield smaller subsets of prototypes. These prototypes are then, in turn, gathered and processed by the PRS to yield more refined prototypes. In this manner, prototypes that lie in the interior of the Voronoi spaces, and are thus ineffective in the classification, are eliminated at the subsequent invocations of the PRS. We are unaware of any PRS that employs such a recursive philosophy. Although we marginally forfeit accuracy in return for computational efficiency, our experimental results demonstrate that the proposed recursive mechanism yields classification accuracy comparable to that of the best prototype condensation schemes reported to date. Indeed, this is true for both artificial and real-life data sets. The results especially demonstrate that a fair computational advantage can be obtained by using such a recursive strategy for "large" data sets, such as those involved in data mining and text categorization applications.
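
The recursive idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not fix a particular PRS or splitting rule, so the sketch assumes Hart's condensed nearest neighbor rule as a stand-in PRS and a fixed random split in place of the paper's adaptive partitioning. The names `cnn_prs`, `arp_prs`, `leaf_size`, and `n_parts` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(point, prototypes):
    """Label of the prototype closest to `point` (1-NN rule)."""
    return min(prototypes, key=lambda p: sq_dist(point, p[0]))[1]


def cnn_prs(samples):
    """Stand-in PRS (assumption): Hart's condensed nearest neighbor rule.

    `samples` is a list of (vector, label) pairs. Samples that the current
    store misclassifies are added until a full pass adds nothing.
    """
    if not samples:
        return []
    store = [samples[0]]
    changed = True
    while changed:
        changed = False
        for s in samples:
            # The membership check guarantees termination even when the
            # same vector appears with conflicting labels.
            if s not in store and nearest_label(s[0], store) != s[1]:
                store.append(s)
                changed = True
    return store


def arp_prs(samples, prs=cnn_prs, leaf_size=50, n_parts=4, rng=None):
    """Recursive wrapper around a PRS (hypothetical parameters).

    Small sets are handed to the PRS directly. Larger sets are split
    arbitrarily, each part is reduced recursively, and the gathered
    prototypes are passed through the PRS once more.
    """
    if rng is None:
        rng = random.Random(0)
    if len(samples) <= leaf_size:
        return prs(samples)
    shuffled = samples[:]
    rng.shuffle(shuffled)  # arbitrary split, no clustering involved
    parts = [shuffled[i::n_parts] for i in range(n_parts)]
    gathered = []
    for part in parts:
        gathered.extend(arp_prs(part, prs, leaf_size, n_parts, rng))
    # Refinement pass: prototypes made redundant by other parts are dropped.
    return prs(gathered)
```

Calling `arp_prs(data)` on a list of `(vector, label)` pairs returns a subset of those pairs. Because each PRS call sees at most `leaf_size` raw samples or an already reduced prototype set, no single call processes the whole data set, which is the source of the computational saving the abstract describes.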