Data clustering by minimizing disconnectivity
Information Sciences: an International Journal
An important and challenging problem in data clustering is determining the best number of clusters. A variety of estimation methods have been proposed over the years to address this problem. Most of them rest on nontrivial assumptions about the structure of the data, and may therefore fail to discover the true clusters in a dataset that violates those assumptions. We develop a new approach that starts from a simple and intuitive observation: close objects should fall within the same cluster, whereas distant ones should not. Building on this notion, we introduce a new measure of clustering quality, called disconnectivity, combine it with existing goodness measures, and embed them in a meta-learning approach for estimating the number of clusters. A simulation experiment on 13 representative models and an application to real-world datasets demonstrate the effectiveness of the proposed method.
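The abstract does not reproduce the paper's formal definition of disconnectivity, but the underlying intuition — close objects should share a cluster — can be sketched with a hypothetical proxy score: the fraction of points whose nearest neighbor is assigned to a different cluster (lower is better). The function name `nn_disagreement` and the score itself are illustrative assumptions, not the paper's measure.

```python
import numpy as np

def nn_disagreement(X, labels):
    """Hypothetical proxy for the 'close objects share a cluster' idea.

    NOTE: this is NOT the paper's disconnectivity measure; it merely
    illustrates the intuition. Returns the fraction of points whose
    nearest neighbor carries a different cluster label (lower is better).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all points.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)   # exclude self-matches
    nn = d2.argmin(axis=1)         # index of each point's nearest neighbor
    return float((labels != labels[nn]).mean())

# Two well-separated clusters: every nearest neighbor stays in-cluster.
X = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
print(nn_disagreement(X, [0, 0, 1, 1]))  # → 0.0
print(nn_disagreement(X, [0, 1, 0, 1]))  # → 1.0 (every neighbor disagrees)
```

A score like this could serve as one of the "goodness measures" fed to a model-selection loop over candidate cluster counts, in the spirit (though not the letter) of the meta-learning approach the abstract describes.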