Learning to rank relevant and novel documents through user feedback

Authors:
Abhimanyu Lad;Yiming Yang
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Year:
2010

Citing 22
Cited 1

Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval

SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
The use of MMR, diversity-based reranking for reordering documents and producing summaries

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Question-answering by predictive annotation

SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Model-based feedback in the language modeling approach to information retrieval

Proceedings of the tenth international conference on Information and knowledge management
Novelty and redundancy detection in adaptive filtering

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Relevance thresholds: a multi-stage predictive model of how users evaluate information

Information Processing and Management: an International Journal
Beyond independent relevance: methods and evaluation metrics for subtopic retrieval

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
A question answering system supported by information extraction

ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Text classification and named entities for new event detection

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Less is more: probabilistic models for retrieving fewer relevant documents

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Utility-based information distillation over temporally sequenced documents

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Generalizing from relevance feedback using named entity wildcards

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
An evaluation of adaptive filtering in the context of realistic task-based information exploration

Information Processing and Management: an International Journal
Learning diverse rankings with multi-armed bandits

Proceedings of the 25th international conference on Machine learning
Predicting diverse subsets using structural SVMs

Proceedings of the 25th international conference on Machine learning
Novelty and diversity in information retrieval evaluation

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Diversifying search results

Proceedings of the Second ACM International Conference on Web Search and Data Mining
Turning down the noise in the blogosphere

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Modeling Expected Utility of Multi-session Information Distillation

ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
An Analysis of NP-Completeness in Novelty and Diversity Ranking

ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
Boosting a Semantic Search Engine by Named Entities

ISMIS '09 Proceedings of the 18th International Symposium on Foundations of Intelligent Systems
Discounted cumulated gain based evaluation of multiple-query IR sessions

ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval

A model for mining relevant and non-redundant information

Proceedings of the 27th Annual ACM Symposium on Applied Computing

Quantified Score

Hi-index	0.00

Visualization

Abstract

We consider the problem of learning to rank relevant and novel documents so as to directly maximize a performance metric called Expected Global Utility (EGU), which has several desirable properties: (i) It measures retrieval performance in terms of relevant as well as novel information, (ii) gives more importance to top ranks to reflect common browsing behavior of users, as opposed to existing objective functions based on set-coverage, (iii) accommodates different levels of tolerance towards redundancy, which is not taken into account by existing evaluation measures, and (iv) extends naturally to the evaluation of session-based retrieval comprising multiple ranked lists. Our ground truth is defined in terms of "information nuggets", which are obviously not known to the retrieval system when processing a new user query. Therefore, our approach uses observable query and document features (words and named entities) as surrogates for nuggets, whose weights are learned based on user feedback in an iterative search session. The ranked list is produced to maximize the weighted coverage of these surrogate nuggets. The optimization of such coverage-based metrics is known to be NP-hard. Therefore, we use a greedy algorithm and show that it guarantees good performance due to the submodularity of the objective function. Our experiments on Topic Detection and Tracking data show that the proposed approach represents an efficient and effective retrieval strategy for maximizing EGU, as compared to a purely-relevance based ranking approach that uses Indri, as well as a MMR-based approach for non-redundant ranking.