A Meta-Search Method Reinforced by Cluster Descriptors

Authors:
Yipeng Shen;Dik Lun Lee
Venue:
WISE '01: Proceedings of the Second International Conference on Web Information Systems Engineering (WISE'01), Volume 1
Year:
2001

Abstract

A meta-search engine acts as an agent for its participant search engines. It receives queries from users and redirects them to one or more of the participant search engines for processing. A meta-search engine incorporating many participant search engines is better than a single global search engine in terms of the number of pages indexed and the freshness of the indexes. The meta-search engine stores descriptive data (i.e., descriptors) about the index maintained by each participant search engine so that it can estimate the relevance of each search engine when a query is received. The ability of the meta-search engine to select the most relevant search engines determines the quality of the final result. To facilitate the selection process, the document space covered by each search engine must be described not only concisely but also precisely. Existing methods tend to focus on the conciseness of the descriptors by keeping a single descriptor for a search engine's entire index. This paper proposes to partition a search engine's document space into clusters and keep a descriptor for each cluster. We show that cluster descriptors can provide a finer and more accurate representation of the document space, and hence enable the meta-search engine to improve the selection of relevant search engines. Two cluster-based search engine selection scenarios (independent and high-correlation) are discussed in this paper. Experiments verify that cluster-based search engine selection can effectively identify the most relevant search engines and consequently improve the quality of the search results.
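The contrast between a single whole-index descriptor and per-cluster descriptors can be illustrated with a minimal sketch. The abstract does not give the paper's descriptor format or scoring function, so everything below is an assumption made for illustration: descriptors are term-frequency centroids, relevance is cosine similarity, an engine's score is the best score among its descriptors, and the engine names, documents, and function names are hypothetical. The independent and high-correlation scenarios are not modeled.

```python
import math
from collections import Counter


def centroid(docs):
    """Term-frequency centroid of a group of documents (assumed descriptor form)."""
    total = Counter()
    for doc in docs:
        total.update(doc.split())
    return {term: count / len(docs) for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_engines(query, descriptors):
    """Rank engines by their best-matching descriptor (max is an assumed aggregation)."""
    q = Counter(query.split())
    scores = {
        name: max(cosine(q, d) for d in descs)
        for name, descs in descriptors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical engines. Engine "A" covers two unrelated topics; "B" covers one.
engine_clusters = {
    "A": [
        ["car engine repair", "car engine oil"],
        ["stock market price", "stock market index",
         "stock bond price", "stock bond index"],
    ],
    "B": [
        ["car rental travel", "car travel hotel"],
    ],
}

# One descriptor per cluster (the cluster-based representation).
cluster_descriptors = {
    name: [centroid(cluster) for cluster in clusters]
    for name, clusters in engine_clusters.items()
}

# One descriptor for the whole index (the baseline the abstract contrasts against).
whole_descriptors = {
    name: [centroid([doc for cluster in clusters for doc in cluster])]
    for name, clusters in engine_clusters.items()
}

by_cluster = rank_engines("car engine", cluster_descriptors)
by_whole = rank_engines("car engine", whole_descriptors)
print("cluster descriptors:", by_cluster)
print("whole-index descriptors:", by_whole)
```

In this toy data, engine A's finance documents dilute its whole-index descriptor, so the query "car engine" matches engine B slightly better under the baseline, while the per-cluster descriptors expose A's car cluster and rank A first. The sketch only shows why a finer descriptor can change the selection; it says nothing about the paper's measured results.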