Approximating clusters in very large (VL = unloadable) data sets has been considered from many angles. The proposed approach has three basic steps: (i) progressive sampling of the VL data, terminated when a sample passes a statistical goodness-of-fit test; (ii) clustering the sample with a literal (or exact) algorithm; and (iii) non-iterative extension of the literal clusters to the remainder of the data set. Extension accelerates clustering on all (loadable) data sets. More importantly, extension provides feasibility, that is, a way to find (approximate) clusters, for data sets that are too large to be loaded into the primary memory of a single computer. A good generalized sampling and extension scheme should be effective for acceleration and feasibility using any extensible clustering algorithm. A general method for progressive sampling in VL sets of feature vectors is developed, and examples are given that show how to extend the literal fuzzy (c-means) and probabilistic (expectation-maximization) clustering algorithms onto VL data. The fuzzy extension is called the generalized extensible fast fuzzy c-means (geFFCM) algorithm and is illustrated using several experiments with mixtures of five-dimensional normal distributions.
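The three steps can be illustrated with a minimal Python sketch. It is not the geFFCM implementation: the function names, the per-feature chi-square check (10 bins, threshold 16.92, the 5% critical value for 9 degrees of freedom), the sample sizes, and the use of an in-memory list in place of an unloadable data source are all assumptions made for illustration. Only the overall structure (sample until a fit test passes, run literal fuzzy c-means on the sample, then assign memberships to every point in one pass) follows the description above.

```python
import random

def memberships(x, centers, m=2.0):
    """Fuzzy c-means memberships of one point x in each cluster center."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centers]
    if any(d == 0.0 for d in d2):  # point coincides with one or more centers
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [w / total for w in inv]

def fcm(sample, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Literal fuzzy c-means on a loadable sample; returns the cluster centers."""
    centers = random.Random(seed).sample(sample, c)
    dim = len(sample[0])
    for _ in range(iters):
        u = [memberships(x, centers, m) for x in sample]
        new = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            s = sum(w)
            new.append([sum(wk * x[j] for wk, x in zip(w, sample)) / s
                        for j in range(dim)])
        shift = max(abs(a - b) for v, nv in zip(centers, new)
                    for a, b in zip(v, nv))
        centers = new
        if shift < tol:
            break
    return centers

def histogram(values, lo, hi, bins):
    """Counts of values in equal-width bins spanning [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0
    for v in values:
        counts[min(bins - 1, max(0, int((v - lo) / width)))] += 1
    return counts

def sample_passes(sample, data, bins=10, threshold=16.92):
    """Assumed stand-in test: per-feature chi-square of sample vs. full-data bins."""
    for j in range(len(data[0])):
        col = [x[j] for x in data]
        lo, hi = min(col), max(col)
        full = histogram(col, lo, hi, bins)
        samp = histogram([x[j] for x in sample], lo, hi, bins)
        stat = 0.0
        for f, s in zip(full, samp):
            expected = len(sample) * f / len(data)
            if expected > 0:
                stat += (s - expected) ** 2 / expected
        if stat > threshold:
            return False
    return True

def progressive_sample(data, start=100, step=100, seed=0):
    """Grow a random sample until it passes the fit check or covers the data."""
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)
    n = min(start, len(data))
    while n < len(data) and not sample_passes([data[i] for i in order[:n]], data):
        n = min(n + step, len(data))
    return [data[i] for i in order[:n]]

def sample_cluster_extend(data, c, m=2.0):
    """Steps (i)-(iii): sample, cluster the sample, extend to all points."""
    sample = progressive_sample(data)
    centers = fcm(sample, c, m)
    # Non-iterative extension: a single membership pass, no further center updates.
    return centers, [memberships(x, centers, m) for x in data]
```

Calling `sample_cluster_extend(data, c=2)` on a list of feature vectors returns the centers found on the sample together with one membership row per point in `data`. The iterative work is confined to the sample, so the full data set is touched only by the histogram pass and the final membership pass.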