Data clustering is a common technique for data analysis that is used in many fields, including machine learning, data mining, customer segmentation, trend analysis, pattern recognition and image analysis. Although many clustering algorithms have been proposed, most of them handle either a single data type (numerical or nominal) or a mix of the two (numerical and nominal), and only a few provide a generic method that clusters all types of data. Most real-world applications, however, require handling both feature types and their mix. In this paper, we propose an automated technique, called SpectralCAT, for unsupervised clustering of high-dimensional data that contains numerical, nominal, or mixed attributes. We suggest automatically transforming the high-dimensional input data into categorical values. This is done by discovering the optimal transformation according to the Calinski-Harabasz index for each feature and attribute in the dataset. Then, a method for spectral clustering via dimensionality reduction of the transformed data is applied. This is achieved by automatic non-linear transformations, which identify geometric patterns in the data and find the connections among them while projecting them onto low-dimensional spaces. We compare our method to several clustering algorithms using 16 public datasets from different domains and types. The experiments demonstrate that our method outperforms these algorithms in most cases.
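The role the Calinski-Harabasz index plays in the categorization step can be sketched as follows: for each numerical feature, candidate categorizations are scored by the ratio of between-group to within-group dispersion, and a categorization that aligns with the feature's value groups scores far higher than an arbitrary one. This is only an illustrative sketch, not the paper's exact transformation procedure; the function name and the synthetic feature are assumptions made for the example.

```python
import numpy as np

def calinski_harabasz(x, labels):
    """Calinski-Harabasz index of a partition of a 1-D feature:
    between-group dispersion over within-group dispersion, each
    normalized by its degrees of freedom."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    groups = [x[labels == c] for c in np.unique(labels)]
    n, k = x.size, len(groups)
    if k < 2:
        return 0.0
    overall = x.mean()
    between = sum(g.size * (g.mean() - overall) ** 2 for g in groups)
    within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    if within == 0.0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))

# Synthetic numerical feature with three well-separated value groups.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 0.1, 50),
                    rng.normal(5, 0.1, 50),
                    rng.normal(10, 0.1, 50)])

# A categorization aligned with the groups dominates an arbitrary one,
# which is what lets the index rank candidate transformations per feature.
aligned = np.repeat([0, 1, 2], 50)
arbitrary = rng.permutation(aligned)
print(calinski_harabasz(x, aligned) > calinski_harabasz(x, arbitrary))  # True
```

In the full method this scoring would be repeated over many candidate categorizations of each feature, keeping the one with the highest index before the spectral-clustering stage.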