Classification-based resource selection

Authors:
Jaime Arguello; Jamie Callan; Fernando Diaz
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Yahoo!, Montreal, PQ, Canada
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Abstract

In some retrieval situations, a system must search across multiple collections. This task, referred to as federated search, occurs, for example, when searching a distributed index or aggregating content for web search. Resource selection is the subtask of deciding, given a query, which collections to search. Most existing resource selection methods rely on evidence found in collection content. We present an approach to resource selection that combines multiple sources of evidence to inform the selection decision. We derive evidence from three sources: collection documents, the topic of the query, and query click-through data. We combine this evidence by treating resource selection as a multiclass machine learning problem. Although machine-learned approaches often require large amounts of manually generated training data, we present a method that uses automatically generated training data instead. We build on and compare against prior resource selection work, and we evaluate across three experimental testbeds.
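
To make the idea of combining evidence through a learned classifier concrete, here is a minimal sketch in Python. It is not the authors' implementation. The abstract does not give the feature definitions, the learner, or the procedure for generating training labels, so everything below is an assumption made for illustration: the three feature names (`content_score`, `topic_match`, `click_rate`), the toy feature values, the hand-written labels, and the collection names are all hypothetical. The sketch also swaps in a single binary logistic-regression scorer shared across collections, which simplifies the multiclass formulation the abstract mentions.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Estimated probability that a collection is worth searching."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(examples, labels, epochs=500, lr=0.5):
    """Fit logistic regression with plain batch gradient descent."""
    n_features = len(examples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(examples, labels):
            err = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        n = len(examples)
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def rank_collections(features_by_collection, weights, bias, k):
    """Return the k collections with the highest predicted probability."""
    scores = {
        name: predict(weights, bias, x)
        for name, x in features_by_collection.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical training data: one row per (query, collection) pair, with
# features [content_score, topic_match, click_rate]. The labels are
# placeholders written by hand; the paper generates its labels automatically
# by a method this section does not describe.
train_x = [
    [0.9, 1.0, 0.8],
    [0.8, 1.0, 0.1],
    [0.7, 0.0, 0.9],
    [0.2, 0.0, 0.1],
    [0.1, 0.0, 0.0],
    [0.3, 0.0, 0.2],
]
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(train_x, train_y)

# Hypothetical feature vectors for one incoming query.
candidates = {
    "sports": [0.85, 1.0, 0.7],
    "news": [0.4, 0.0, 0.3],
    "travel": [0.1, 0.0, 0.05],
}
ranking = rank_collections(candidates, weights, bias, k=2)
print(ranking)
```

The design point the sketch tries to show is that each evidence source becomes one or more features, and the learner decides how much weight each source deserves, so no hand-tuned combination rule is needed.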