On low dimensional random projections and similarity search

  • Authors:
  • Yu-En Lu;Pietro Lió;Steven Hand

  • Affiliations:
  • University of Cambridge, Cambridge, United Kingdom;University of Cambridge, Cambridge, United Kingdom;University of Cambridge, Cambridge, United Kingdom

  • Venue:
  • Proceedings of the 17th ACM conference on Information and knowledge management
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Random projection (RP) is a common technique for dimensionality reduction under L2 norm for which many significant space embedding results have been demonstrated. However, many similarity search applications often require very low dimension embeddings in order to reduce overhead and boost performance. Inspired by the use of symmetric probability distributions in previous work, we propose a novel RP algorithm, Beta Random Projection, and give its probabilistic analyses based on Beta and Gaussian approximations. We evaluate the algorithm in terms of standard similarity metrics with other RP algorithms as well as the singular value decomposition (SVD). Our experimental results show that BRP preserves both similarity metrics well and, under various dataset types including random point sets, text (TREC5) and images, provides sharper and consistent performance.