(Sync|Async)+ MPI search engines

Authors:
Mauricio Marin; Veronica Gil Costa
Affiliations:
Yahoo! Research, Santiago, University of Chile; DCC, University of San Luis, Argentina
Venue:
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Year:
2007

Abstract

We propose a parallel MPI search engine that switches automatically between asynchronous message passing and bulk-synchronous message passing modes of operation. When the observed query traffic is low or moderate, queries are processed with the standard multi-threaded multiple managers/workers model of message passing. When query traffic increases, a round-robin approach is applied instead, to prevent the unstable behavior caused by queries that demand a large amount of resources in computation, communication, and disk accesses. This is achieved by (i) a suitable object-oriented, multi-threaded MPI software design and (ii) an "atomic" organization of query processing, which enables a novel control strategy that decides the proper mode of operation.
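
The abstract does not give the control strategy itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical controller that compares the observed query rate against two thresholds (`high` and `low`, invented here, with hysteresis added to avoid rapid flipping), and a hypothetical `round_robin` helper that serves each query in fixed-size "atomic" quanta so that no single expensive query monopolizes a processor. No MPI calls are made; the code only illustrates the decision and scheduling logic.

```python
from collections import deque
from dataclasses import dataclass

ASYNC, SYNC = "async", "sync"


@dataclass
class ModeController:
    """Hypothetical mode switch; thresholds are assumptions, not from the paper."""
    high: float = 100.0  # assumed rate at or above which SYNC mode is entered
    low: float = 60.0    # assumed rate at or below which ASYNC mode is restored
    mode: str = ASYNC

    def observe(self, queries_per_second: float) -> str:
        # Two thresholds give hysteresis: the mode changes only when the
        # traffic clearly crosses the band between `low` and `high`.
        if self.mode == ASYNC and queries_per_second >= self.high:
            self.mode = SYNC
        elif self.mode == SYNC and queries_per_second <= self.low:
            self.mode = ASYNC
        return self.mode


def round_robin(work, quantum):
    """Serve queries in turns of at most `quantum` work units each.

    `work` maps a query id to its remaining units of work. The result is
    the sequence of (query id, units served) pairs in service order.
    """
    pending = deque(work.items())
    schedule = []
    while pending:
        query_id, remaining = pending.popleft()
        served = min(quantum, remaining)
        schedule.append((query_id, served))
        if remaining > served:
            # Unfinished queries rejoin the back of the queue.
            pending.append((query_id, remaining - served))
    return schedule
```

With a quantum of 2 units, a query needing 5 units and a query needing 2 units are interleaved, so the short query finishes after its first turn instead of waiting for the long one to complete.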