The amount of information available over the Internet is increasing daily, and so are the importance and scale of Web search engines. Systems based on a single centralised index present several problems (such as lack of scalability), which has led to the use of distributed information retrieval systems to search for and locate the required information effectively. A distributed retrieval system can be clustered and/or replicated. In this paper, using simulations, we present a detailed performance analysis, in terms of both throughput and response time, of a clustered system compared to a replicated system. In addition, we consider the effect of changes in the query topics over time. We show that a clustered system does not improve on the performance of the best replicated system. Indeed, the main advantage of a clustered system is the reduction of network traffic. However, the use of a switched network eliminates the network bottleneck, markedly improving the performance of the replicated systems. Moreover, we illustrate the negative effect on performance of changes over time in the query topics when a distributed clustered system is used. In contrast, the performance of a distributed replicated system is query independent.