The effectiveness of GIOSS for the text database discovery problem
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Searching distributed collections with inference networks
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
STARTS: Stanford proposal for Internet meta-searching
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Experiences with selecting search engines using metasearch
ACM Transactions on Information Systems (TOIS)
Analyses of multiple evidence combination
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
An algorithm for suffix stripping
Readings in information retrieval
TREC and TIPSTER experiments with INQUERY
Readings in information retrieval
Effective retrieval with distributed collections
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Inquirus, the NECI meta search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
A technique for measuring the relative size and overlap of public Web search engines
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Automatic discovery of language models for text databases
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Comparing the performance of database selection algorithms
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Cluster-based language models for distributed retrieval
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
A decision-theoretic approach to database selection in networked IR
ACM Transactions on Information Systems (TOIS)
Architecture of a metasearch engine that supports user information needs
Proceedings of the eighth international conference on Information and knowledge management
GlOSS: text-source discovery over the Internet
ACM Transactions on Database Systems (TODS)
Server selection on the World Wide Web
DL '00 Proceedings of the fifth ACM conference on Digital libraries
Collection selection and results merging with topically organized U.S. patents and TREC data
Proceedings of the ninth international conference on Information and knowledge management
Query-based sampling of text databases
ACM Transactions on Information Systems (TOIS)
Document language models, query models, and risk minimization for information retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Modeling score distributions for combining the outputs of search engines
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
The effectiveness of query expansion for distributed information retrieval
Proceedings of the tenth international conference on Information and knowledge management
Approaches to collection selection and results merging for distributed information retrieval
Proceedings of the tenth international conference on Information and knowledge management
Expert agreement and content based reranking in a meta search environment using Mearf
Proceedings of the 11th international conference on World Wide Web
Using sampled data and regression to merge search engine results
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Pruning long documents for distributed information retrieval
Proceedings of the eleventh international conference on Information and knowledge management
A language modeling framework for resource selection and results merging
Proceedings of the eleventh international conference on Information and knowledge management
Fusion Via a Linear Combination of Scores
Information Retrieval
QProber: A system for automatic classification of hidden-Web databases
ACM Transactions on Information Systems (TOIS)
Precision and Recall of GlOSS Estimators for Database Discovery
PDIS '94 Proceedings of the Third International Conference on Parallel and Distributed Information Systems
Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Evaluating different methods of estimating retrieval quality for resource selection
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Relevant document distribution estimation method for resource selection
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Result merging strategies for a current news metasearcher
Information Processing and Management: an International Journal
A Methodology for Collection Selection in Heterogeneous Contexts
ITCC '02 Proceedings of the International Conference on Information Technology: Coding and Computing
Adaptive combination of evidence for information retrieval
Adaptive combination of evidence for information retrieval
Comparing the performance of collection selection algorithms
ACM Transactions on Information Systems (TOIS)
A semisupervised learning method to merge search engine results
ACM Transactions on Information Systems (TOIS)
A unified model for metasearch, pooling, and system evaluation
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
ACSC '04 Proceedings of the 27th Australasian conference on Computer science - Volume 26
Collection selection for managed distributed document databases
Information Processing and Management: an International Journal
When one sample is not enough: improving text database selection using shrinkage
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Unified utility maximization framework for resource selection
Proceedings of the thirteenth ACM international conference on Information and knowledge management
A two-phase sampling technique for information extraction from hidden web databases
Proceedings of the 6th annual ACM international workshop on Web information and data management
Modeling search engine effectiveness for federated search
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Accurately interpreting clickthrough data as implicit feedback
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
The FedLemur project: Federated search in the real world
Journal of the American Society for Information Science and Technology
Random sampling from a search engine's index
Proceedings of the 15th international conference on World Wide Web
Performance prediction of data fusion for information retrieval
Information Processing and Management: an International Journal
ProbFuse: a probabilistic approach to data fusion
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Capturing collection size for distributed non-cooperative retrieval
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Distributed query sampling: a quality-conscious approach
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Adaptive query-based sampling for distributed IR
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Result merging methods in distributed information retrieval with overlapping databases
Information Retrieval
Distributed text retrieval from overlapping collections
ADC '07 Proceedings of the eighteenth conference on Australasian database - Volume 63
Federated text retrieval from uncooperative overlapped collections
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Evaluating sampling methods for uncooperative collections
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Estimating collection size with logistic regression
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Computing pagerank in a distributed internet search system
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Central-rank-based collection selection in uncooperative distributed information retrieval
ECIR'07 Proceedings of the 29th European conference on IR research
Results merging algorithm using multiple regression models
ECIR'07 Proceedings of the 29th European conference on IR research
Compact features for detection of near-duplicates in distributed retrieval
SPIRE'06 Proceedings of the 13th international conference on String Processing and Information Retrieval
Adaptive query-based sampling of distributed collections
SPIRE'06 Proceedings of the 13th international conference on String Processing and Information Retrieval
Sample sizes for query probing in uncooperative distributed information retrieval
APWeb'06 Proceedings of the 8th Asia-Pacific Web conference on Frontiers of WWW Research and Development
SUSHI: scoring scaled samples for server selection
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Information Sciences: an International Journal
A joint probabilistic classification model for resource selection
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Research proposal for distributed deep web search
PIKM '10 Proceedings of the 3rd workshop on Ph.D. students in information and knowledge management
Foundations and Trends in Information Retrieval
A weighted curve fitting method for result merging in federated search
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Mixture model with multiple centralized retrieval algorithms for result merging in federated search
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Reducing the uncertainty in resource selection
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Distributed information retrieval and applications
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Search result diversification in resource selection for federated search
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Hi-index | 0.00 |
In federated information retrieval, a query is routed to multiple collections and a single answer list is constructed by combining the results. Such metasearch provides a mechanism for locating documents on the hidden Web and, by use of sampling, can proceed even when the collections are uncooperative. However, the similarity scores for documents returned from different collections are not comparable, and, in uncooperative environments, document scores are unlikely to be reported. We introduce a new merging method for uncooperative environments, in which similarity scores for the sampled documents held for each collection are used to estimate global scores for the documents returned per query. This method requires no assumptions about properties such as the retrieval models used. Using experiments on a wide range of collections, we show that in many cases our merging methods are significantly more effective than previous techniques.