Efficient and effective metasearch for text databases incorporating linkages among documents

  • Authors:
  • Clement Yu; Weiyi Meng; Wensheng Wu; King-Lup Liu

  • Affiliations:
  • Dept. of CS, U. of Illinois at Chicago, Chicago, IL; Dept. of CS, SUNY at Binghamton, Binghamton, NY; Dept. of Computer Science, UIUC; School of CSTIS, DePaul University, Chicago, IL

  • Venue:
  • SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
  • Year:
  • 2001

Abstract

Linkages among documents have a significant impact on the importance of documents, as it can be argued that important documents are pointed to by many documents or by other important documents. Metasearch engines can be used to help ordinary users retrieve information from multiple local sources (text databases). There is a search engine associated with each database. In a large-scale metasearch engine, the contents of each local database are represented by a representative. Each user query is evaluated against the set of representatives of all databases in order to determine the appropriate databases (search engines) to search (invoke). In previous work, the linkage information between documents has not been utilized in determining the appropriate databases to search. In this paper, such information is employed to determine the degree of relevance of a document with respect to a given query. Specifically, the importance (rank) of each document as determined by the linkages is integrated in each database representative to facilitate the selection of databases for each given query. We establish a necessary and sufficient condition to rank databases optimally, while incorporating the linkage information. A method is provided to estimate the desired quantities stated in the necessary and sufficient condition. The estimation method runs in time linearly proportional to the number of query terms. Experimental results are provided to demonstrate the high retrieval effectiveness of the method.
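
The database selection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's formulas, so everything concrete here is an assumption made for illustration: the per-term statistics stored in a representative (`max_weight`, `max_rank`), the linear combination controlled by `alpha`, and the names `TermStats`, `Representative`, `estimate_best_score`, and `rank_databases` are all hypothetical. The sketch only shows the general shape: each database is summarized by a compact representative that includes linkage-based importance, a per-database score is estimated in a single pass over the query terms, and databases are ranked by that estimate.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TermStats:
    # Hypothetical per-term summary; not taken from the paper.
    max_weight: float  # largest normalized weight of the term in any document
    max_rank: float    # linkage-based importance of the document with that weight


@dataclass
class Representative:
    name: str
    terms: Dict[str, TermStats]


def estimate_best_score(rep: Representative, query: Dict[str, float],
                        alpha: float = 0.5) -> float:
    """Estimate the score of the best document in a database.

    Assumed scoring: alpha * term similarity + (1 - alpha) * linkage rank.
    The loop visits each query term once, so the cost is linear in the
    number of query terms.
    """
    best = 0.0
    for term, query_weight in query.items():
        stats = rep.terms.get(term)
        if stats is None:
            continue  # term does not occur in this database
        score = alpha * query_weight * stats.max_weight + (1 - alpha) * stats.max_rank
        best = max(best, score)
    return best


def rank_databases(reps: List[Representative], query: Dict[str, float],
                   alpha: float = 0.5) -> List[Representative]:
    """Order databases by descending estimated best-document score."""
    return sorted(reps, key=lambda r: estimate_best_score(r, query, alpha),
                  reverse=True)


if __name__ == "__main__":
    db_a = Representative("A", {"search": TermStats(max_weight=0.9, max_rank=0.2)})
    db_b = Representative("B", {"search": TermStats(max_weight=0.5, max_rank=0.9)})
    query = {"search": 1.0}
    for rep in rank_databases([db_a, db_b], query):
        print(rep.name, estimate_best_score(rep, query))
```

In this toy setup, setting `alpha` to 1.0 ignores linkage information entirely, which makes it easy to compare a content-only ordering against one that also weighs document importance.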