Information retrieval using a singular value decomposition model of latent semantic structure
SIGIR '88 Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval
Latent semantic indexing: a probabilistic analysis
PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Approximate nearest neighbors: towards removing the curse of dimensionality
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
Database-friendly random projections
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A note on a method for generating points uniformly on n-dimensional spheres
Communications of the ACM
Chord: A scalable peer-to-peer lookup service for internet applications
Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
RANDOM '02 Proceedings of the 6th International Workshop on Randomization and Approximation Techniques
Peer-to-peer information retrieval using self-organizing semantic overlay networks
Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications
LSH forest: self-tuning indexes for similarity search
WWW '05 Proceedings of the 14th international conference on World Wide Web
Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform
Proceedings of the thirty-eighth annual ACM symposium on Theory of computing
Very sparse random projections
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Proceedings of the international workshop on Workshop on multimedia information retrieval
Peer-to-peer similarity search in metric spaces
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Improving random projections using marginal information
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Adaptive parallel approximate similarity search for responsive multimedia retrieval
Proceedings of the 20th ACM international conference on Information and knowledge management
Hi-index | 0.00 |
Random projection (RP) is a common technique for dimensionality reduction under L2 norm for which many significant space embedding results have been demonstrated. However, many similarity search applications often require very low dimension embeddings in order to reduce overhead and boost performance. Inspired by the use of symmetric probability distributions in previous work, we propose a novel RP algorithm, Beta Random Projection, and give its probabilistic analyses based on Beta and Gaussian approximations. We evaluate the algorithm in terms of standard similarity metrics with other RP algorithms as well as the singular value decomposition (SVD). Our experimental results show that BRP preserves both similarity metrics well and, under various dataset types including random point sets, text (TREC5) and images, provides sharper and consistent performance.