Random direction divisive clustering

Authors:
S. K. Tasoulis; D. K. Tasoulis; V. P. Plagianakos
Affiliations:
Computer Science and Biomedical Informatics, University of Central Greece, Papassiopoulou 2-4, Lamia 35100, Greece; Winton Capital Management, 1-5 St. Mary Abbot's Place, SW8 6LS London, United Kingdom; Computer Science and Biomedical Informatics, University of Central Greece, Papassiopoulou 2-4, Lamia 35100, Greece
Venue:
Pattern Recognition Letters
Year:
2013

Abstract

Projection methods for dimension reduction have enabled the discovery of otherwise unattainable structure in ultra-high-dimensional data. More recently, one such method, Random Projection, has been shown to produce high-quality data representations with minimal computational effort, even when the data dimension is in the hundreds of thousands or millions. Here, we couple this dimension reduction technique with data clustering algorithms that are specifically designed for high-dimensional data. First, we show that the theoretical properties of both components can be combined in a sound manner, promising an effective clustering framework. Then, an experimental analysis on a series of simulated and real ultra-high-dimensional data scenarios shows that the resulting algorithms achieve high-quality data partitions while running orders of magnitude faster.
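
The coupling described in the abstract can be illustrated with a minimal sketch, assuming NumPy. It first reduces the data with a Gaussian random projection and then performs a single divisive split in the reduced space. The function names, the Gaussian projection matrix, the target dimension k, the synthetic two-cluster data, and the split on the sign of the leading principal direction (in the style of Principal Direction Divisive Partitioning) are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np


def random_projection(X, k, rng):
    """Project the rows of X (n x d) to k dimensions.

    Assumption: a dense Gaussian matrix scaled by 1/sqrt(k), one common
    choice of random projection; other constructions exist.
    """
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R


def split_once(Z):
    """Split the rows of Z in two by the sign of their coordinate along the
    leading principal direction of the centered data (a PDDP-style split)."""
    Zc = Z - Z.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(Zc, full_matrices=False)
    return (Zc @ vt[0]) > 0


# Hypothetical synthetic data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
n, d, k = 100, 5000, 20
truth = np.repeat([False, True], n)
X = rng.standard_normal((2 * n, d))
X[truth] += 3.0  # shift the second cluster in every coordinate

Z = random_projection(X, k, rng)  # 200 x 20 instead of 200 x 5000
labels = split_once(Z)

# Cluster labels are only defined up to swapping the two groups.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"reduced shape: {Z.shape}, agreement with ground truth: {agreement:.2f}")
```

A full divisive clustering procedure would apply such a split recursively to the resulting groups and would need a rule for choosing which group to split next and when to stop; those choices are left out of this sketch.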