The proliferation of online information resources increases the importance of effective and efficient information retrieval in a multicollection environment. Multicollection searching comprises three steps: collection selection (also referred to as database selection), query processing, and results merging. In this work, we focus on evaluating the first step, collection selection.

In this article, we present a detailed discussion of the methodology that we used to evaluate and compare collection selection approaches, covering both test environments and evaluation measures. We compare the CORI, CVV and gGLOSS collection selection approaches using six test environments built from three document testbeds. We note similar trends in performance among the collection selection approaches, but the CORI approach consistently outperforms the others, suggesting that effective collection selection can be achieved using limited information about each collection.

The contributions of this work are the assembled evaluation methodology and the application of that methodology to compare collection selection approaches in a standardized environment.
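
To make the idea of an evaluation measure for collection selection concrete, the following is a minimal sketch of a recall-style ratio: the merit captured by the top n collections that an approach selects, divided by the merit of the best possible top n. The abstract does not say which measures the article uses, so this particular measure, the function name `recall_at_n`, the use of relevant-document counts as merit, and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List


def recall_at_n(selected: List[str], merit: Dict[str, int], n: int) -> float:
    """Merit captured by the top-n selected collections,
    relative to the merit of the best possible top n.

    `selected` is a ranking produced by some selection approach
    (hypothetical input); `merit` maps each collection to its merit
    for the query, assumed here to be a count of relevant documents.
    """
    # Ideal baseline: the n collections with the highest merit.
    best = sum(sorted(merit.values(), reverse=True)[:n])
    if best == 0:
        # No collection holds relevant documents; define the ratio as 0.
        return 0.0
    # Merit actually captured by the approach's top-n choices.
    got = sum(merit.get(name, 0) for name in selected[:n])
    return got / best


# Hypothetical data for a single query.
merit = {"A": 8, "B": 0, "C": 5, "D": 2}
ranking = ["A", "D", "C", "B"]

for n in (1, 2, 3):
    print(n, round(recall_at_n(ranking, merit, n), 3))
```

In this sketch a value of 1.0 means the approach chose collections as good as the ideal ranking for that cutoff. Averaging the ratio over many queries would give one curve per approach, which is one plausible way to compare approaches such as CORI, CVV and gGLOSS under identical conditions.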