Parallel approaches to permutation-based indexing using inverted files

Authors: Hisham Mohamed; Stéphane Marchand-Maillet
Affiliations: Université de Genève, Geneva, Switzerland (both authors)
Venue: SISAP '12: Proceedings of the 5th International Conference on Similarity Search and Applications
Year: 2012

Abstract

We present parallel strategies for building and searching permutation-based indexes for high-dimensional data using inverted files. We discuss three parallelization strategies: posting-list decomposition, reference-point decomposition, and multiple independent inverted files. We study the performance, efficiency, and effectiveness of these strategies on high-dimensional datasets of millions of images. Experimental results show good performance compared to the sequential version, with the same efficiency and effectiveness.
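
To make the idea concrete, here is a minimal Python sketch of a permutation-based inverted file with the reference points split across workers. It is an illustration only, not the authors' implementation. It assumes Euclidean distance, a Spearman-footrule-style score in which a missing reference costs a penalty of k, and a simple modulo assignment of reference points to workers. All function names are hypothetical, and the workers are simulated by a sequential loop.

```python
import math
from collections import defaultdict

def permutation(obj, refs, k):
    """Ids of the k reference points closest to obj, nearest first."""
    order = sorted(range(len(refs)), key=lambda r: (math.dist(obj, refs[r]), r))
    return order[:k]

def build_inverted_file(data, refs, k):
    """Map reference id -> posting list of (object id, rank of that reference)."""
    postings = defaultdict(list)
    for obj_id, obj in enumerate(data):
        for rank, ref_id in enumerate(permutation(obj, refs, k)):
            postings[ref_id].append((obj_id, rank))
    return postings

def split_by_reference(postings, n_workers):
    """Assumed assignment: worker w owns the references with id % n_workers == w."""
    parts = [dict() for _ in range(n_workers)]
    for ref_id, plist in postings.items():
        parts[ref_id % n_workers][ref_id] = plist
    return parts

def partial_scores(part, query_perm, k):
    """Score contribution from the posting lists held by one worker."""
    scores = defaultdict(int)
    for q_rank, ref_id in enumerate(query_perm):
        for obj_id, rank in part.get(ref_id, ()):
            # A shared reference replaces the missing-reference penalty k
            # with the rank difference, so each contribution is negative.
            scores[obj_id] += abs(q_rank - rank) - k
    return scores

def search(parts, query, refs, k, n_results):
    """Merge the partial scores of all workers and return the best object ids."""
    query_perm = permutation(query, refs, k)
    total = defaultdict(int)
    for part in parts:  # in a parallel setting, each part is scanned by its own worker
        for obj_id, score in partial_scores(part, query_perm, k).items():
            total[obj_id] += score
    ranked = sorted(total, key=lambda o: (total[o], o))  # lower score = more similar
    return ranked[:n_results]

# Small usage example with made-up 2-D points.
refs = [(0.0, 0.0), (10.0, 1.0), (2.0, 9.0), (8.0, 12.0)]
data = [(1.0, 1.5), (9.0, 2.0), (1.0, 8.0), (7.0, 11.0), (5.0, 4.0)]
postings = build_inverted_file(data, refs, k=2)
parts = split_by_reference(postings, n_workers=2)
candidates = search(parts, (1.2, 1.0), refs, k=2, n_results=3)
```

Because the partial scores are integers that are simply summed, the merged ranking does not depend on how many workers hold the posting lists. Objects that share no reference point with the query never appear in any scanned posting list and are left out of the candidate set.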