Effect of inverted index partitioning schemes on performance of query processing in parallel text retrieval systems

Authors:
B. Barla Cambazoglu; Aytul Catal; Cevdet Aykanat
Affiliations:
Department of Computer Engineering, Bilkent University, Bilkent, Ankara, Turkey; Scientific and Technological Research Council of Turkey (TÜBİTAK), Kavaklıdere, Ankara, Turkey; Department of Computer Engineering, Bilkent University, Bilkent, Ankara, Turkey
Venue:
ISCIS'06 Proceedings of the 21st international conference on Computer and Information Sciences
Year:
2006

Abstract

Shared-nothing parallel text retrieval systems require the inverted index that represents a document collection to be partitioned among a number of processors. In general, the index can be partitioned by either the terms or the documents in the collection, and the choice of partitioning greatly affects the query processing performance of the parallel system. In this work, we investigate the effect of these two index partitioning schemes on query processing. We conduct experiments on a 32-node PC cluster, considering the case where the index is stored entirely on disk. Performance results are reported for a large (30 GB) document collection using an MPI-based parallel query processing implementation.
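
To make the two schemes named in the abstract concrete, the following is a minimal Python sketch of term-based and document-based partitioning on a toy in-memory index. It is an illustration only, not the authors' MPI implementation: the function names, the four-document collection, and the round-robin assignment of terms and documents to processors are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def partition_by_term(index, num_procs):
    """Term-based: a processor holds the complete posting lists of its terms.

    Terms are assigned round-robin in sorted order (an assumption for this sketch).
    """
    parts = [dict() for _ in range(num_procs)]
    for i, term in enumerate(sorted(index)):
        parts[i % num_procs][term] = index[term]
    return parts


def partition_by_document(index, num_procs):
    """Document-based: a processor holds the postings of its own documents only.

    Documents are assigned by doc_id modulo num_procs (an assumption for this sketch).
    """
    parts = [defaultdict(list) for _ in range(num_procs)]
    for term, postings in index.items():
        for doc_id in postings:
            parts[doc_id % num_procs][term].append(doc_id)
    return [dict(p) for p in parts]


# Hypothetical toy collection, used only for illustration.
docs = {
    0: "parallel text retrieval",
    1: "inverted index partitioning",
    2: "parallel query processing",
    3: "text index",
}
index = build_inverted_index(docs)
term_parts = partition_by_term(index, 2)
doc_parts = partition_by_document(index, 2)
```

In this sketch, a query term is served by exactly one processor under term-based partitioning, whereas under document-based partitioning every processor may hold part of that term's posting list and contributes partial results for its own documents.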