Shared-nothing parallel text retrieval systems require the inverted index that represents a document collection to be partitioned among a number of processors. In general, the index can be partitioned by either the terms or the documents in the collection, and the choice of partitioning greatly affects the query processing performance of the parallel system. In this work, we investigate the effect of these two index partitioning schemes on query processing. We conduct experiments on a 32-node PC cluster, considering the case where the index is stored entirely on disk. Performance results are reported for a large (30 GB) document collection using an MPI-based parallel query processing implementation.
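To make the two partitioning schemes concrete, the following is a minimal single-process sketch in Python. It is not the MPI-based implementation evaluated in this work. The toy collection, the round-robin assignment of documents and terms to nodes, and the helper names (`build_index`, `partition_by_document`, `partition_by_term`, `nodes_with_postings`) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def partition_by_document(docs, num_nodes):
    """Document-based: each node indexes only its own subset of documents.

    Round-robin assignment by document id is an illustrative assumption.
    """
    local_docs = [dict() for _ in range(num_nodes)]
    for doc_id, text in docs.items():
        local_docs[doc_id % num_nodes][doc_id] = text
    return [build_index(subset) for subset in local_docs]


def partition_by_term(docs, num_nodes):
    """Term-based: each node holds the complete posting lists of a subset of terms.

    Round-robin assignment over the sorted vocabulary is an illustrative assumption.
    """
    global_index = build_index(docs)
    nodes = [dict() for _ in range(num_nodes)]
    for i, term in enumerate(sorted(global_index)):
        nodes[i % num_nodes][term] = global_index[term]
    return nodes


def nodes_with_postings(nodes, query_terms):
    """Return the ids of the nodes that store postings for any query term."""
    return [i for i, index in enumerate(nodes)
            if any(term in index for term in query_terms)]


# Hypothetical four-document collection split across two nodes.
docs = {
    0: "parallel text retrieval",
    1: "inverted index partitioning",
    2: "parallel query processing",
    3: "text index on disk",
}
doc_nodes = partition_by_document(docs, 2)
term_nodes = partition_by_term(docs, 2)

print(nodes_with_postings(doc_nodes, ["text"]))   # nodes holding part of the list
print(nodes_with_postings(term_nodes, ["text"]))  # the single node owning the term
```

In this sketch, the posting list for a term is spread over several nodes under document-based partitioning, so a query typically involves every node, each contributing a partial answer. Under term-based partitioning, a query touches only the nodes that own its terms, but each of those nodes must return a complete posting list. How these trade-offs play out on a disk-resident index is the question the experiments address.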