A linear-space algorithm for distance preserving graph embedding

  • Authors:
  • Tetsuo Asano;Prosenjit Bose;Paz Carmi;Anil Maheshwari;Chang Shu;Michiel Smid;Stefanie Wuhrer

  • Affiliations:
  • Japan Advanced Institute of Science and Technology, Ishikawa, Japan;Carleton University, Ottawa, Canada;Carleton University, Ottawa, Canada;Carleton University, Ottawa, Canada;National Research Council of Canada, Ottawa, Canada;Carleton University, Ottawa, Canada;Carleton University, Ottawa, Canada and National Research Council of Canada, Ottawa, Canada

  • Venue:
  • Computational Geometry: Theory and Applications
  • Year:
  • 2009

Abstract

The distance-preserving graph embedding problem is to embed the vertices of a given weighted graph onto points in d-dimensional Euclidean space, for a constant d, such that for each edge the distance between the points corresponding to its endpoints is as close to the weight of the edge as possible. If the given graph is complete, that is, if the weights are given as a full matrix, then multi-dimensional scaling [Trevor Cox, Michael Cox, Multidimensional Scaling, second ed., Chapman & Hall/CRC, 2001] can minimize the sum of squared embedding errors in quadratic time. A serious disadvantage of this approach is its quadratic space requirement. In this paper we develop a linear-space algorithm for this problem for the case when the weight of any edge can be computed in constant time. A key idea is to partition a set of n objects into O(√n) disjoint subsets (clusters) of size O(√n) such that the minimum inter-cluster distance is maximized among all possible such partitions. Experimental results are included comparing the performance of the newly developed approach to that of the well-established least-squares multi-dimensional scaling approach [Trevor Cox, Michael Cox, Multidimensional Scaling, second ed., Chapman & Hall/CRC, 2001] using three different applications. Although least-squares multi-dimensional scaling gave slightly more accurate results than our newly developed approach, it ran out of memory for data sets larger than 15,000 vertices.
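As an illustration of the quantity the abstract refers to as the sum of squared embedding errors, the following is a minimal sketch that evaluates it for a given embedding without ever storing the full weight matrix. It is not the paper's algorithm; the function name `embedding_error`, the `weight` callback, and the sample data are assumptions made for this example. The only ingredient taken from the abstract is that the weight of any edge can be computed on demand in constant time.

```python
import math
from itertools import combinations

def embedding_error(points, weight):
    """Sum of squared differences between embedded distances and edge weights.

    points: list of d-dimensional coordinate tuples, one per vertex.
    weight: hypothetical callback (i, j) -> float giving the weight of
            edge (i, j); assumed to run in constant time.
    Only the running total is kept, so extra memory is O(1) beyond the
    O(n) needed for the points themselves.
    """
    total = 0.0
    for i, j in combinations(range(len(points)), 2):
        dist = math.dist(points[i], points[j])  # Euclidean distance
        total += (dist - weight(i, j)) ** 2
    return total

# Made-up data: four vertices placed on a line at unit spacing.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

# Weights equal to index differences are reproduced exactly.
exact = embedding_error(points, lambda i, j: abs(i - j))

# Inflating every weight by 1 adds an error of 1 on each of the 6 edges.
off = embedding_error(points, lambda i, j: abs(i - j) + 1.0)
```

The loop still visits all pairs, so the running time stays quadratic; only the space usage is reduced, which matches the trade-off the abstract describes.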