Discovering the representative of a search engine

Authors:
King-Lup Liu;Adrain Santoso;Clement Yu;Weiyi Meng
Affiliations:
DePaul University, Chicago, IL;DePaul University, Chicago, IL;University of Illinois at Chicago, Chicago, IL;SUNY-Binghamton, Binghamton, NY
Venue:
Proceedings of the tenth international conference on Information and knowledge management
Year:
2001

Abstract

Given the large number of search engines on the Internet, it is difficult for a person to determine which ones could serve his or her information needs. A common solution is to construct a metasearch engine on top of the search engines. Upon receiving a user query, the metasearch engine sends it to those underlying search engines that are likely to return the desired documents. The selection algorithm that decides whether a search engine should receive the query typically bases its decision on the search-engine representative, which contains characteristic information about the search engine's database. However, an underlying search engine may not be willing to provide the needed information to the metasearch engine. This paper shows that the needed information can be estimated from an uncooperative search engine with good accuracy. Two pieces of information that permit accurate search engine selection are the number of documents indexed by the search engine and the maximum weight of each term. In this paper, we present techniques for estimating these two pieces of information.
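
To make the role of a search-engine representative concrete, here is a minimal sketch of how a metasearch engine might use one to decide where to forward a query. It is not the selection algorithm from the paper, which the abstract does not spell out. The names `Representative`, `upper_bound_similarity`, and `select_engines` are hypothetical, and the sketch assumes non-negative term weights and a dot-product similarity. Under those assumptions, the sum of each query weight times the engine's maximum weight for that term bounds the similarity of any document in that engine. The document count is stored because the abstract names it as part of the representative, but this simple bound does not use it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Representative:
    """Hypothetical per-engine summary: document count and per-term maximum weights."""
    num_docs: int
    max_weight: Dict[str, float] = field(default_factory=dict)


def upper_bound_similarity(query: Dict[str, float], rep: Representative) -> float:
    """Upper bound on the dot-product similarity of any document in the engine.

    Assumes non-negative weights, so each document weight is at most the
    engine's maximum weight for that term.
    """
    return sum(w * rep.max_weight.get(term, 0.0) for term, w in query.items())


def select_engines(
    query: Dict[str, float],
    reps: Dict[str, Representative],
    threshold: float,
) -> List[str]:
    """Return the names of engines whose bound meets the threshold, best first."""
    scored = [(upper_bound_similarity(query, rep), name) for name, rep in reps.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Illustrative, made-up representatives for two engines.
reps = {
    "news": Representative(num_docs=1000, max_weight={"election": 0.8, "poll": 0.5}),
    "sports": Representative(num_docs=500, max_weight={"poll": 0.1, "score": 0.9}),
}
query = {"election": 1.0, "poll": 1.0}
selected = select_engines(query, reps, threshold=0.5)
```

With these made-up numbers, only the "news" engine clears the threshold, so the query would be forwarded to it alone. An engine whose bound falls below the threshold cannot contain a document that reaches it, which is why the maximum term weights are useful for pruning.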