Pattern Recognition
Classical clustering algorithms are based on the concept that a cluster center is a single point, so clusters that are not compact around a single point are beyond their reach. In this paper we present a new clustering paradigm in which the cluster center is a linear manifold: clusters are groups of points compact around a linear manifold. Since a linear manifold of dimension 0 is a point, clustering around a center point is a special case of linear manifold clustering. Linear manifold clustering (LMCLUS) identifies subsets of the data that are embedded in arbitrarily oriented lower-dimensional linear manifolds. Minimal subsets of points are repeatedly sampled to construct trial linear manifolds of various dimensions. For each trial manifold, a histogram of the distances from the data points to the manifold is computed. The sampling whose histogram shows the best separation between a mode near zero and the rest is selected, and the data points are partitioned at that separation. The repeated sampling then continues recursively on each block of the partitioned data. A broad evaluation comprising some 100 experiments on real and synthetic data sets demonstrates the general superiority of this algorithm over the competing algorithms in both accuracy and computation time. Its expected running time is linear in the data set size and dimension, and its accuracy ranges from near 0.90 to 0.99 depending on the experiment, generally much higher than that of the competing clustering algorithms.
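The core step described in the abstract — measuring point-to-manifold distances and cutting the distance histogram at the separation after the mode near zero — can be sketched in Python with NumPy. The function names, the first-valley heuristic, and all parameters below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def manifold_distances(points, sample):
    """Distance of each point to the linear manifold defined by a sampled
    minimal subset: the first sample point acts as the origin, the
    remaining sample points span the manifold's direction space."""
    origin = sample[0]
    spanning = sample[1:] - origin
    # Orthonormal basis for the manifold directions via QR decomposition.
    q, _ = np.linalg.qr(spanning.T)
    diff = points - origin
    residual = diff - diff @ q @ q.T  # component orthogonal to the manifold
    return np.linalg.norm(residual, axis=1)

def separation_threshold(distances, bins=20):
    """Crude stand-in for the separation criterion: locate the mode of the
    distance histogram (expected near zero for a good trial manifold) and
    cut at the first valley after it."""
    hist, edges = np.histogram(distances, bins=bins)
    mode = int(np.argmax(hist))
    valley = mode + int(np.argmin(hist[mode:]))
    return edges[valley + 1]
```

Points whose distance falls below the threshold would form the candidate cluster around the trial manifold; the remaining points are recursed on, as the abstract describes.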