The use of MMR, diversity-based reranking for reordering documents and producing summaries
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Cluster-based language models for distributed retrieval
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Server Ranking for Distributed Text Retrieval Systems on the Internet
Proceedings of the Fifth International Conference on Database Systems for Advanced Applications (DASFAA)
Beyond independent relevance: methods and evaluation metrics for subtopic retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Relevant document distribution estimation method for resource selection
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
A semisupervised learning method to merge search engine results
ACM Transactions on Information Systems (TOIS)
Combining the language model and inference network approaches to retrieval
Information Processing and Management: an International Journal - Special issue: Bayesian networks and information retrieval
Less is more: probabilistic models for retrieving fewer relevant documents
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Federated text retrieval from uncooperative overlapped collections
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Novelty and diversity in information retrieval evaluation
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Robust result merging using sample-based score estimates
ACM Transactions on Information Systems (TOIS)
SUSHI: scoring scaled samples for server selection
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
An Effectiveness Measure for Ambiguous and Underspecified Queries
ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
Expected reciprocal rank for graded relevance
Proceedings of the 18th ACM conference on Information and knowledge management
Classification-based resource selection
Proceedings of the 18th ACM conference on Information and knowledge management
Probabilistic models of ranking novel documents for faceted topic retrieval
Proceedings of the 18th ACM conference on Information and knowledge management
Central-rank-based collection selection in uncooperative distributed information retrieval
ECIR'07 Proceedings of the 29th European conference on IR research
Exploiting query reformulations for web search result diversification
Proceedings of the 19th international conference on World wide web
A joint probabilistic classification model for resource selection
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Document allocation policies for selective searching of distributed indexes
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A multi-collection latent topic model for federated search
Information Retrieval
Aggregated search result diversification
ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
Diversity by proportionality: an election-based approach to search result diversification
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Explicit relevance models in intent-oriented information retrieval diversification
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Evaluating aggregated search pages
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Personalized diversification of search results
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Combining implicit and explicit topic representations for result diversification
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Federated search in the wild: the combined power of over a hundred search engines
Proceedings of the 21st ACM international conference on Information and knowledge management
Reducing the uncertainty in resource selection
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Distributed information retrieval and applications
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Hi-index | 0.00 |
Prior research in resource selection for federated search mainly focused on selecting a small number of information sources that are most relevant to a user query. However, result novelty and diversification are largely unexplored, which does not reflect the various kinds of information needs of users in real world applications. This paper proposes two general approaches to model both result relevance and diversification in selecting sources, in order to provide more comprehensive coverage of multiple aspects of a user query. The first approach focuses on diversifying the document ranking on a centralized sample database before selecting information sources under the framework of Relevant Document Distribution Estimation (ReDDE). The second approach first evaluates the relevance of information sources with respect to each aspect of the query, and then ranks the sources based on the novelty and relevance that they offer. Both approaches can be applied with a wide range of existing resource selection algorithms such as ReDDE, CRCS, CORI and Big Document. Moreover, this paper proposes a learning based approach to combine multiple resource selection algorithms for result diversification, which can further improve the performance. We propose a set of new metrics for resource selection in federated search to evaluate the diversification performance of different approaches. To our best knowledge, this is the first piece of work that addresses the problem of search result diversification in federated search. The effectiveness of the proposed approaches has been demonstrated by an extensive set of experiments on the federated search testbed of the Clueweb dataset.