PODS '95 Proceedings of the fourteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Approximate nearest neighbors: towards removing the curse of dimensionality
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
The art of computer programming, volume 3: (2nd ed.) sorting and searching
The art of computer programming, volume 3: (2nd ed.) sorting and searching
ACM Computing Surveys (CSUR)
Locality-sensitive hashing scheme based on p-stable distributions
SCG '04 Proceedings of the twentieth annual symposium on Computational geometry
LSH forest: self-tuning indexes for similarity search
WWW '05 Proceedings of the 14th international conference on World Wide Web
Similarity Search: The Metric Space Approach (Advances in Database Systems)
Similarity Search: The Metric Space Approach (Advances in Database Systems)
Multi-probe LSH: efficient indexing for high-dimensional similarity search
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Combining Metric Features in Large Collections
SISAP '08 Proceedings of the First International Workshop on Similarity Search and Applications (sisap 2008)
Effective Proximity Retrieval by Ordering Permutations
IEEE Transactions on Pattern Analysis and Machine Intelligence
Approximate similarity search in metric spaces using inverted files
Proceedings of the 3rd international conference on Scalable information systems
Approximate similarity search: A multi-faceted problem
Journal of Discrete Algorithms
Counting distance permutations
Journal of Discrete Algorithms
Metric Index: An Efficient and Scalable Solution for Similarity Search
SISAP '09 Proceedings of the 2009 Second International Workshop on Similarity Search and Applications
MiPai: Using the PP-Index to Build an Efficient and Scalable Similarity Search System
SISAP '09 Proceedings of the 2009 Second International Workshop on Similarity Search and Applications
On locality-sensitive indexing in generic metric spaces
Proceedings of the Third International Conference on SImilarity Search and APplications
Hi-index | 0.00 |
We present the Permutation Prefix Index (this work is a revised and extended version of Esuli (2009b), presented at the 2009 LSDS-IR Workshop, held in Boston) (PP-Index), an index data structure that supports efficient approximate similarity search. The PP-Index belongs to the family of the permutation-based indexes, which are based on representing any indexed object with ''its view of the surrounding world'', i.e., a list of the elements of a set of reference objects sorted by their distance order with respect to the indexed object. In its basic formulation, the PP-Index is strongly biased toward efficiency. We show how the effectiveness can easily reach optimal levels just by adopting two ''boosting'' strategies: multiple index search and multiple query search, which both have nice parallelization properties. We study both the efficiency and the effectiveness properties of the PP-Index, experimenting with collections of sizes up to one hundred million objects, represented in a very high-dimensional similarity space.