We address the problem of designing data structures that allow efficient search for approximate nearest neighbors. More specifically, given a database consisting of a set of vectors in some high-dimensional Euclidean space, we want to construct a space-efficient data structure that lets us find, given a query vector, the closest or nearly closest vector in the database. We also address this problem when distances are measured by the L1 norm and in the Hamming cube. Significantly improving and extending recent results of Kleinberg, we construct data structures whose size is polynomial in the size of the database, together with search algorithms that run in time nearly linear or nearly quadratic in the dimension. (Depending on the case, the extra factors are polylogarithmic in the size of the database.)
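To make the problem statement concrete, the following minimal Python sketch spells out what a "closest or nearly closest" answer means under the three distance measures named above. It is a brute-force linear scan used only as a reference baseline, not the data structure constructed in the paper. The function names and the parameter `eps` are illustrative assumptions: the abstract does not define "nearly closest", and the sketch adopts the common formalization in which a returned vector is acceptable when its distance to the query is at most (1 + eps) times the distance of the true nearest neighbor.

```python
import math

def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def l1(u, v):
    """L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def hamming(u, v):
    """Hamming distance between two equal-length 0/1 vectors."""
    return sum(a != b for a, b in zip(u, v))

def exact_nearest(database, query, dist):
    """Index of the exact nearest neighbor, found by scanning every vector.

    This takes time linear in the database size; it is the baseline that
    an efficient search structure is meant to avoid.
    """
    return min(range(len(database)), key=lambda i: dist(database[i], query))

def is_approx_nearest(database, query, candidate, eps, dist):
    """Check the assumed (1 + eps) criterion for a candidate index."""
    best = min(dist(p, query) for p in database)
    return dist(database[candidate], query) <= (1 + eps) * best

# Hypothetical toy data, chosen only for illustration.
points = [(0, 0), (3, 4), (1, 1)]
query = (1, 0)
nearest = exact_nearest(points, query, l2)
ok = is_approx_nearest(points, query, 2, 0.1, l2)
```

In this toy example, the vectors at indices 0 and 2 are both at Euclidean distance 1 from the query, so either is a valid answer for any `eps >= 0`, while the vector at index 1 is not acceptable for `eps = 0.1`. The same two functions work unchanged with `l1` or `hamming` passed as `dist`.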