A Cache-Based Distributed Terabyte Text Retrieval System in CADAL

  • Authors:
  • Jun Cheng; Wen Gao; Bin Liu; Tie-jun Huang; Ling Zhang

  • Venue:
  • ICADL '02 Proceedings of the 5th International Conference on Asian Digital Libraries: Digital Libraries: People, Knowledge, and Technology
  • Year:
  • 2002

Abstract

The China-America Digital Academic Library (CADAL) project aims to create a searchable collection of one million digital books, freely available over the Internet. Serving a collection of this size requires a text retrieval system that operates at terabyte scale. This paper presents a cache-based, distributed terabyte text retrieval system that combines full-text retrieval, distributed computing, and caching techniques. Data is distributed across index servers by subject, so each query is searched only on the index servers for the relevant subjects. Cache servers reduce response time. To reduce network load, the system returns only highly relevant results for each query. The prototype system shows the effectiveness of our design.
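
The three ideas in the abstract (subject-partitioned index servers, a query cache in front of them, and returning only the top-ranked results) can be illustrated with a minimal single-process sketch. This is not the CADAL implementation. The class names `IndexServer` and `Retriever`, the term-frequency scoring, the LRU eviction policy, the `top_k` cutoff, and the assumption that the caller supplies the subject are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
from collections import OrderedDict, defaultdict


class IndexServer:
    """Hypothetical in-memory stand-in for one subject-specific index server."""

    def __init__(self):
        # term -> {doc_id: term frequency}
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term][doc_id] = self.postings[term].get(doc_id, 0) + 1

    def search(self, query):
        # Assumed scoring: sum of term frequencies (the paper's ranking is not given).
        scores = defaultdict(int)
        for term in query.lower().split():
            for doc_id, tf in self.postings.get(term, {}).items():
                scores[doc_id] += tf
        # Highest score first; ties broken by doc_id for a stable order.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


class Retriever:
    """Routes a query to one subject's index server, behind an LRU query cache."""

    def __init__(self, cache_size=128, top_k=10):
        self.servers = {}           # subject -> IndexServer
        self.cache = OrderedDict()  # (subject, normalized query) -> results
        self.cache_size = cache_size
        self.top_k = top_k
        self.searches = 0           # number of times an index server was contacted

    def add(self, subject, doc_id, text):
        self.servers.setdefault(subject, IndexServer()).add(doc_id, text)
        self.cache.clear()  # simplest possible invalidation when the index changes

    def query(self, subject, query):
        key = (subject, " ".join(query.lower().split()))
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        server = self.servers.get(subject)
        if server is None:
            hits = []
        else:
            self.searches += 1
            # Only the top_k results leave the index server.
            hits = server.search(query)[: self.top_k]
        self.cache[key] = hits
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return hits


if __name__ == "__main__":
    r = Retriever(top_k=2)
    r.add("history", "h1", "silk road trade history")
    r.add("physics", "p1", "quantum trade offs")
    print(r.query("history", "trade"))  # only the "history" server is searched
    print(r.query("history", "trade"))  # answered from the cache
```

In this sketch a repeated query never reaches an index server, and a query for one subject never touches the postings of another, which mirrors the two sources of savings the abstract describes.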