The Generalized Condensed Nearest Neighbor Rule as A Data Reduction Method

Authors:
Chien-Hsing Chou; Bo-Han Kuo; Fu Chang
Affiliations:
Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C. (all three authors)
Venue:
ICPR '06 Proceedings of the 18th International Conference on Pattern Recognition - Volume 02
Year:
2006

Citing 0
Cited 15

Prototype Selection Via Prototype Relevance
CIARP '08 Proceedings of the 13th Iberoamerican congress on Pattern Recognition: Progress in Pattern Recognition, Image Analysis and Applications

The Good, the Bad and the Incorrectly Classified: Profiling Cases for Case-Base Editing
ICCBR '09 Proceedings of the 8th International Conference on Case-Based Reasoning: Case-Based Reasoning Research and Development

Prototype selection based on sequential search
Intelligent Data Analysis

Fast k most similar neighbor classifier for mixed data (tree k-MSN)
Pattern Recognition

Finding Small Consistent Subset for the Nearest Neighbor Classifier Based on Support Graphs
CIARP '09 Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

A review of instance selection methods
Artificial Intelligence Review

Noise reduction for instance-based learning with a local maximal margin approach
Journal of Intelligent Information Systems

Adaptive case-based reasoning using retention and forgetting strategies
Knowledge-Based Systems

A machine learning approach to classify vigilance states in rats
Expert Systems with Applications: An International Journal

An instance selection algorithm based on reverse nearest neighbor
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I

InstanceRank based on borders for instance selection
Pattern Recognition

Efficient dataset size reduction by finding homogeneous clusters
Proceedings of the Fifth Balkan Conference in Informatics

WCOID-DG: An approach for case base maintenance based on Weighting, Clustering, Outliers, Internal Detection and Dbsan-Gmeans
Journal of Computer and System Sciences

Combining classifiers using nearest decision prototypes
Applied Soft Computing

A comparison between k-Optimum Path Forest and k-Nearest Neighbors supervised classifiers
Pattern Recognition Letters

Abstract

In this paper, we propose a new data reduction algorithm that iteratively selects some samples and ignores others that can be absorbed, or represented, by those selected. This algorithm differs from the condensed nearest neighbor (CNN) rule in employing a strong absorption criterion, in contrast to the weak criterion employed by CNN; hence, it is called the generalized CNN (GCNN) algorithm. The new criterion allows GCNN to incorporate CNN as a special case and to achieve consistency, or asymptotic Bayes-risk efficiency, under certain conditions. Moreover, GCNN can yield significantly better accuracy than other instance-based data reduction methods. We demonstrate this last claim through experiments on five datasets, some of which contain a very large number of samples.
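
The select-and-absorb loop described in the abstract can be illustrated with a minimal sketch. The abstract does not state the absorption criterion itself, so the test used here is an assumption made for illustration: a sample counts as absorbed when its nearest other-class prototype is farther away than its nearest same-class prototype by more than a threshold `margin`. With `margin = 0` this reduces to the weak, CNN-style test (the nearest prototype has the correct label). The function name `condense`, the `margin` parameter, and the seeding with the first sample of each class are all hypothetical choices, not the authors' implementation.

```python
import math


def condense(samples, labels, margin=0.0):
    """Return indices of the samples kept as prototypes.

    Assumed absorption test (not taken from the paper):
        d(nearest other-class prototype) - d(nearest same-class prototype) > margin
    margin == 0 gives the weak, CNN-style test; margin > 0 is stricter.
    """
    # Seed with the first sample of each class (arbitrary choice for this sketch).
    protos = []
    seen = set()
    for i, y in enumerate(labels):
        if y not in seen:
            seen.add(y)
            protos.append(i)

    def absorbed(i):
        # Distance to the nearest prototype with the same label.
        same = min(
            (math.dist(samples[i], samples[p]) for p in protos if labels[p] == labels[i]),
            default=math.inf,
        )
        # Distance to the nearest prototype with a different label.
        diff = min(
            (math.dist(samples[i], samples[p]) for p in protos if labels[p] != labels[i]),
            default=math.inf,
        )
        return diff - same > margin

    # Repeat full passes until one pass adds no new prototype.
    changed = True
    while changed:
        changed = False
        chosen = set(protos)
        for i in range(len(samples)):
            if i not in chosen and not absorbed(i):
                protos.append(i)
                chosen.add(i)
                changed = True
    return protos


if __name__ == "__main__":
    # Two well-separated toy clusters (made-up data).
    samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    labels = [0, 0, 0, 1, 1, 1]
    print(condense(samples, labels))              # weak test: one prototype per cluster
    print(condense(samples, labels, margin=100))  # very strict test: nothing is absorbed
```

Raising `margin` makes absorption harder, so more samples are retained; this mirrors, under the stated assumption, how a stronger criterion trades reduction rate for a denser prototype set.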