A meta-learning approach for determining the number of clusters with consideration of nearest neighbors

Authors:
Jong-Seok Lee; Sigurdur Olafsson
Affiliations:
Department of Systems Management Engineering, Sungkyunkwan University, Suwon 440-746, Republic of Korea; Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA 50011, USA
Venue:
Information Sciences: an International Journal
Year:
2013

Abstract

An important and challenging problem in data clustering is determining the best number of clusters. A variety of estimation methods have been proposed over the years to address this problem. Most of these methods depend on several nontrivial assumptions about the data structure, so they may fail to discover the true clusters in a dataset that does not satisfy those assumptions. We develop a new approach that takes as its starting point the simple and intuitive observation that close objects should fall within the same cluster, whereas distant ones should not. Based on this notion, we use a new measure of clustering quality called disconnectivity, together with existing goodness measures, and we embed these measures into a meta-learning approach for estimating the number of clusters. A simulation experiment based on 13 representative models and an application to real-world datasets demonstrate the effectiveness of the proposed method.
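
The abstract does not define disconnectivity, so the following is only an illustrative sketch of the underlying intuition that close objects should share a cluster. The function name `knn_disagreement`, the parameter `k`, and the toy data are all assumptions made for this example; the score shown is a simple nearest-neighbor label-disagreement rate, not the paper's measure.

```python
import math

def knn_disagreement(points, labels, k=3):
    """Hypothetical score (not the paper's definition of disconnectivity).

    Returns the fraction of (point, neighbor) pairs, taken over each point's
    k nearest neighbors by Euclidean distance, whose cluster labels differ.
    0.0 means every point agrees with all of its k nearest neighbors.
    """
    n = len(points)
    if n != len(labels):
        raise ValueError("points and labels must have the same length")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    mismatches = 0
    for i in range(n):
        # Rank all other points by distance to point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        # Count neighbors that were assigned to a different cluster.
        mismatches += sum(labels[j] != labels[i] for j in others[:k])
    return mismatches / (n * k)

# Toy data (assumed for illustration): two tight, well-separated groups.
blob_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
blob_b = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
points = blob_a + blob_b

two_clusters = [0, 0, 0, 0, 1, 1, 1, 1]    # matches the visible grouping
three_clusters = [0, 0, 2, 2, 1, 1, 1, 1]  # splits the first group in two

score_two = knn_disagreement(points, two_clusters, k=3)
score_three = knn_disagreement(points, three_clusters, k=3)
print(score_two, score_three)
```

On this toy data the labeling that splits a tight group scores worse than the two-cluster labeling. A score of this kind is trivially zero when every point is placed in a single cluster, so it cannot select the number of clusters by itself; that is at least consistent with the abstract's description of combining a neighbor-based measure with existing goodness measures.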