K-Landmarks: distributed dimensionality reduction for clustering quality maintenance

Authors:
Panagis Magdalinos;Christos Doulkeridis;Michalis Vazirgiannis
Affiliations:
Athens University of Economics and Business, Athens, Greece;Athens University of Economics and Business, Athens, Greece;Athens University of Economics and Business, Athens, Greece
Venue:
PKDD'06 Proceedings of the 10th European conference on Principles and Practice of Knowledge Discovery in Databases
Year:
2006

Abstract

Due to the vast amount and pace of high-dimensional data production and the distribution of these data among network nodes, the fields of Distributed Knowledge Discovery (DKD) and Distributed Dimensionality Reduction (DDR) have become a necessity in many application areas. While a wealth of centralized dimensionality reduction (DR) algorithms is available, only a few have been proposed for distributed environments, most of them adaptations of centralized ones. In this paper, we introduce K-Landmarks, a new DDR algorithm, and we evaluate its performance against a set of well-known distributed and centralized DR algorithms. We focus primarily on each algorithm's ability to maintain clustering quality throughout the projection while retaining low stress values. Our algorithm outperforms most other algorithms, showing its suitability for highly distributed environments.