Fuzzy sets in pattern recognition: accomplishments and challenges. Fuzzy Sets and Systems, special issue "Fuzzy sets: where do we stand? Where do we go?"
Locally Adaptive Metric Nearest-Neighbor Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Learning Weighted Metrics to Minimize Nearest-Neighbor Classification Error. IEEE Transactions on Pattern Analysis and Machine Intelligence.
An improved kNN algorithm – fuzzy kNN. CIS'05: Proceedings of the 2005 International Conference on Computational Intelligence and Security, Part I.
Fuzzy nearest neighbor algorithms: taxonomy, experimental analysis and prospects. Information Sciences.
The performance of conventional crisp and fuzzy K-Nearest Neighbor (K-NN) classifiers trained on finite samples tends to be poor [1], [2]. With "holes" in the training data, the decision regions formed are unlikely to represent the underlying data distribution. To capture more useful information from the limited training samples, we propose a new fuzzy rule-based K-NN algorithm. A fuzzy rule-based initialization procedure distinguishes the proposed algorithm from the conventional fuzzy K-NN algorithm: it handles imprecise inputs (neighborhood density and distance) within the natural framework of a fuzzy logic system. Unlike conventional K-NN algorithms, the membership functions can be fine-tuned, which yields a highly versatile decision boundary; the algorithm can therefore be tuned to individual problems for better results. This advantage is demonstrated on a synthetic data set in two-dimensional space. In addition, we adopt a weighted Euclidean distance measure to mitigate the curse of dimensionality [3]. The Euclidean distance weights and the parameters of the fuzzy rule-based system are then optimized simultaneously with a Genetic Algorithm (GA). The practical applicability of the proposed algorithm is verified on four UCI data sets (Bupa liver disorders, Glass, Pima Indians diabetes, and Wisconsin breast cancer) and a Ford automotive data set, with an average improvement of 3.42% in classification rate.
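To make the two building blocks concrete, below is a minimal sketch of a conventional fuzzy K-NN classifier (in the spirit of Keller et al.'s inverse-distance membership weighting) combined with a weighted Euclidean distance. This is an illustrative assumption, not the authors' rule-based algorithm: the fuzzy rule-based initialization and the GA that tunes the weights `w` and the membership-function parameters are omitted, and the function names (`weighted_euclidean`, `fuzzy_knn_predict`) and the fuzzifier `m` are hypothetical choices.

```python
import numpy as np

def weighted_euclidean(x, y, w):
    # Weighted Euclidean distance: w scales each feature's contribution,
    # which is the measure used here to mitigate the curse of dimensionality.
    return np.sqrt(np.sum(w * (x - y) ** 2))

def fuzzy_knn_predict(X_train, y_train, x, k=5, m=2.0, w=None):
    # Fuzzy K-NN sketch: class memberships of the k nearest neighbors are
    # weighted by inverse distance raised to 2/(m-1); m is the fuzzifier.
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    if w is None:
        w = np.ones(X_train.shape[1])  # in the paper, the GA would tune these
    d = np.array([weighted_euclidean(x, xi, w) for xi in X_train])
    idx = np.argsort(d)[:k]                       # k nearest neighbors
    eps = 1e-12                                   # guard against zero distance
    inv = 1.0 / (d[idx] ** (2.0 / (m - 1.0)) + eps)
    classes = np.unique(y_train)
    # Membership in each class: normalized sum of neighbor weights per class.
    memberships = {c: inv[y_train[idx] == c].sum() / inv.sum() for c in classes}
    return max(memberships, key=memberships.get)
```

A query point is assigned to the class with the largest membership degree; the proposed algorithm additionally reshapes those memberships with tunable fuzzy rules before classification.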