PDIS '93 Proceedings of the second international conference on Parallel and distributed information systems
A common class of existing information retrieval systems provides access to abstracts. For example, Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of the literature on physics, computer science, electrical engineering, etc. In this paper, we study this database using a trace-driven simulation. We focus on physical index design, inverted index caching, and database scaling in a distributed shared-nothing system. All three issues are shown to have a strong effect on response time and throughput. Database scaling is explored in two ways. The first assumes an “optimal” configuration for a single host and then linearly scales the database by duplicating the host architecture as needed. The second determines the optimal number of hosts given a fixed database size.