Very sparse random projections

Authors:
Ping Li; Trevor J. Hastie; Kenneth W. Church
Affiliations:
Stanford University, Stanford, CA; Stanford University, Stanford, CA; Microsoft Corporation, Redmond, WA
Venue:
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2006

Citing 26
Cited 24

The Johnson-Lindenstrauss Lemma and the sphericity of some graphs

Journal of Combinatorial Theory Series A
Term-weighting approaches in automatic text retrieval

Information Processing and Management: an International Journal
On the self-similar nature of Ethernet traffic (extended version)

IEEE/ACM Transactions on Networking (TON)
Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming

Journal of the ACM (JACM)
The space complexity of approximating the frequency moments

STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Latent semantic indexing: a probabilistic analysis

PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Approximate nearest neighbors: towards removing the curse of dimensionality

STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
Foundations of statistical natural language processing

Foundations of statistical natural language processing
Term Weighting in Information Retrieval Using the Term Precision Model

Journal of the ACM (JACM)
Discovering unexpected information from your competitors' web sites

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Random projection in dimensionality reduction: applications to image and text data

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Similarity estimation techniques from rounding algorithms

STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Text Categorization with Support Vector Machines. How to Represent Texts in Input Space?

Machine Learning
An elementary proof of a theorem of Johnson and Lindenstrauss

Random Structures & Algorithms
Random Projection: A New Approach to VLSI Layout

FOCS '98 Proceedings of the 39th Annual Symposium on Foundations of Computer Science
An Algorithmic Theory of Learning: Robust Concepts and Random Projection

FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Stable distributions, pseudorandom generators, embeddings and data stream computation

FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Database-friendly random projections: Johnson-Lindenstrauss with binary coins

Journal of Computer and System Sciences - Special issue on PODS 2001
Experiments with random projections for machine learning

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
On scaling latent semantic indexing for large peer-to-peer systems

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
A comprehensive comparative study on term weighting schemes for text categorization with support vector machines

WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
PRISM: indexing multi-dimensional data in P2P networks using reference vectors

Proceedings of the 13th annual ACM international conference on Multimedia
Random Projection-Based Multiplicative Data Perturbation for Privacy Preserving Distributed Data Mining

IEEE Transactions on Knowledge and Data Engineering
Randomized algorithms and NLP: using locality sensitive hash function for high speed noun clustering

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Modern Applied Statistics with S

Modern Applied Statistics with S
Improving random projections using marginal information

COLT'06 Proceedings of the 19th annual conference on Learning Theory

Distributed sparse random projections for refinable approximation

Proceedings of the 6th international conference on Information processing in sensor networks
Very sparse stable random projections for dimension reduction in lα (0

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
A Sketch Algorithm for Estimating Two-Way and Multi-Way Associations

Computational Linguistics
Locality sensitive hash functions based on concomitant rank order statistics

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
On low dimensional random projections and similarity search

Proceedings of the 17th ACM conference on Information and knowledge management
On Estimating Frequency Moments of Data Streams

APPROX '07/RANDOM '07 Proceedings of the 10th International Workshop on Approximation and the 11th International Workshop on Randomization, and Combinatorial Optimization. Algorithms and Techniques
Beta Random Projection

Bio-Inspired Computing and Communication
A Random Network Ensemble for Face Recognition

ICB '09 Proceedings of the Third International Conference on Advances in Biometrics
A pioneering cryptic random projection based approach for privacy preserving data mining

IRI'09 Proceedings of the 10th IEEE international conference on Information Reuse & Integration
Fast Manhattan sketches in data streams

Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
High-dimensional Variable Selection with Sparse Random Projections: Measurement Sparsity and Statistical Efficiency

The Journal of Machine Learning Research
An analysis of random projection for changeable and privacy-preserving biometric verification

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Realtime training on mobile devices for face recognition applications

Pattern Recognition
Clustering and semantics preservation in cultural heritage information spaces

RIAO '10 Adaptivity, Personalization and Fusion of Heterogeneous Information
Randomly projected KD-trees with distance metric learning for image retrieval

MMM'11 Proceedings of the 17th international conference on Advances in multimedia modeling - Volume Part II
Efficient online locality sensitive hashing via reservoir counting

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
CHIRP: a new classifier based on composite hypercubes on iterated random projections

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Iterative random projections for high-dimensional data clustering

Pattern Recognition Letters
Approximate privacy-preserving data mining on vertically partitioned data

DBSec'12 Proceedings of the 26th Annual IFIP WG 11.3 conference on Data and Applications Security and Privacy
Distributed high dimensional information theoretical image registration via random projections

Digital Signal Processing
Substantial improvements in the set-covering projection classifier CHIRP (composite hypercubes on iterated random projections)

ACM Transactions on Knowledge Discovery from Data (TKDD) - Special Issue on the Best of SIGKDD 2011
Fast large-scale approximate graph construction for NLP

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Sketch-based image retrieval on mobile devices using compact hash bits

Proceedings of the 20th ACM international conference on Multimedia
Real-time compressive tracking

ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III

Abstract

There has been considerable interest in random projections, an approximate algorithm for estimating distances between pairs of points in a high-dimensional vector space. Let $A \in \mathbb{R}^{n \times D}$ be our $n$ points in $D$ dimensions. The method multiplies $A$ by a random matrix $R \in \mathbb{R}^{D \times k}$, reducing the $D$ dimensions down to just $k$ to speed up the computation. The entries of $R$ are typically drawn from the standard normal distribution $N(0,1)$. It is well known that random projections preserve pairwise distances (in expectation). Achlioptas proposed sparse random projections by replacing the $N(0,1)$ entries in $R$ with entries in $\{-1, 0, 1\}$ with probabilities $\{1/6,\ 2/3,\ 1/6\}$, achieving a threefold speedup in processing time. We recommend using $R$ with entries in $\{-1, 0, 1\}$ with probabilities $\{1/(2\sqrt{D}),\ 1 - 1/\sqrt{D},\ 1/(2\sqrt{D})\}$, achieving a significant $\sqrt{D}$-fold speedup with little loss in accuracy.
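The sampling scheme described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the test data, and the sizes n, D, k are hypothetical. The abstract gives only the entry values and probabilities; the sqrt(s) scaling of the entries and the 1/sqrt(k) normalization of the projection are assumed here as the usual convention that gives each entry unit variance, so that squared distances are preserved in expectation. Setting s = 3 gives the probabilities attributed to Achlioptas, and s = sqrt(D) gives the very sparse variant.

```python
import numpy as np

def sparse_projection_matrix(D, k, s, rng):
    """Return a D x k matrix whose entries are sqrt(s) * {-1, 0, +1},
    drawn with probabilities 1/(2s), 1 - 1/s, 1/(2s).
    The sqrt(s) factor is an assumed scaling that gives unit variance."""
    probs = [1.0 / (2.0 * s), 1.0 - 1.0 / s, 1.0 / (2.0 * s)]
    signs = rng.choice([-1.0, 0.0, 1.0], size=(D, k), p=probs)
    return np.sqrt(s) * signs

def project(A, R):
    """Project the rows of A (n x D) down to k dimensions.
    The 1/sqrt(k) factor is an assumed normalization."""
    k = R.shape[1]
    return (A @ R) / np.sqrt(k)

rng = np.random.default_rng(0)
n, D, k = 20, 4096, 256              # hypothetical sizes
A = rng.standard_normal((n, D))      # synthetic data, for illustration only

s = np.sqrt(D)                       # very sparse setting: s = sqrt(D)
R = sparse_projection_matrix(D, k, s, rng)
B = project(A, R)

# Compare one pairwise squared distance before and after projection.
true_d2 = np.sum((A[0] - A[1]) ** 2)
est_d2 = np.sum((B[0] - B[1]) ** 2)
nonzero_fraction = np.count_nonzero(R) / R.size

print(f"nonzero fraction of R: {nonzero_fraction:.4f} (expected {1 / s:.4f})")
print(f"true squared distance: {true_d2:.1f}, estimate: {est_d2:.1f}")
```

Only about a 1/sqrt(D) fraction of the entries of R is nonzero, which is where the speedup would come from if R were stored and multiplied in a sparse format; the dense array above is used only to keep the sketch short.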