Given the large number of search engines on the Internet, it is difficult for a person to determine which ones could serve his or her information needs. A common solution is to construct a metasearch engine on top of the search engines. Upon receiving a user query, the metasearch engine forwards it to those underlying search engines that are likely to return the desired documents. The selection algorithm that decides whether a search engine should receive the query typically bases its decision on the search-engine representative, which contains characteristic information about the search engine's database. However, an underlying search engine may not be willing to provide the needed information to the metasearch engine. This paper shows that the needed information can be estimated from an uncooperative search engine with good accuracy. Two pieces of information that permit accurate search-engine selection are the number of documents indexed by the search engine and the maximum weight of each term. We present techniques for estimating these two pieces of information.
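To make the role of the search-engine representative concrete, the following is a minimal sketch of how a metasearch engine might use the two pieces of information named above. It is an illustration under stated assumptions, not the estimation or selection method of this paper: the names `Representative`, `upper_bound_similarity`, and `select_engines` are hypothetical, similarity is assumed to be a dot product with unit query-term weights, and the representative values are assumed to have been estimated already.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Representative:
    """Hypothetical search-engine representative (all names are illustrative)."""
    name: str
    num_docs: int  # estimated number of indexed documents (not used by the bound below)
    max_weight: Dict[str, float] = field(default_factory=dict)  # term -> maximum term weight


def upper_bound_similarity(query_terms: List[str], rep: Representative) -> float:
    # Assumption: dot-product similarity with unit query-term weights, so the
    # sum of per-term maximum weights bounds the similarity of any document
    # in the engine's database to the query.
    return sum(rep.max_weight.get(term, 0.0) for term in query_terms)


def select_engines(query_terms: List[str],
                   reps: List[Representative],
                   threshold: float) -> List[str]:
    # Keep engines whose best possible document could reach the threshold,
    # ordered from the highest bound to the lowest.
    scored = [(upper_bound_similarity(query_terms, rep), rep.name) for rep in reps]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]
```

Under these assumptions, an engine whose bound falls below the threshold cannot contain a document similar enough to the query, so it can be skipped without being contacted. The sketch keeps `num_docs` in the representative only to mirror the two quantities named in the abstract; how that count enters the actual selection decision is not specified here.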