We consider the problem of learning to rank relevant and novel documents so as to directly maximize a performance metric called Expected Global Utility (EGU), which has several desirable properties: (i) it measures retrieval performance in terms of both relevant and novel information; (ii) it gives more importance to top ranks to reflect the common browsing behavior of users, unlike existing objective functions based on set coverage; (iii) it accommodates different levels of tolerance towards redundancy, which existing evaluation measures do not take into account; and (iv) it extends naturally to the evaluation of session-based retrieval comprising multiple ranked lists. Our ground truth is defined in terms of "information nuggets", which are not known to the retrieval system when it processes a new user query. Our approach therefore uses observable query and document features (words and named entities) as surrogates for nuggets, and learns the weights of these surrogates from user feedback over an iterative search session. The ranked list is produced to maximize the weighted coverage of these surrogate nuggets. Because optimizing such coverage-based metrics is known to be NP-hard, we use a greedy algorithm and show that it guarantees good performance due to the submodularity of the objective function. Our experiments on Topic Detection and Tracking data show that the proposed approach is an efficient and effective retrieval strategy for maximizing EGU, compared with a purely relevance-based ranking approach that uses Indri and with an MMR-based approach for non-redundant ranking.
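The greedy selection over weighted surrogate features can be illustrated with a minimal sketch. This is not the paper's EGU objective or its learning procedure: the function name `greedy_rank`, the `gamma` parameter (a hypothetical stand-in for tolerance towards redundancy), and the toy documents and weights are all assumptions made for illustration. Each document is treated as a set of features, and at every rank the document with the largest marginal weighted coverage is chosen.

```python
from collections import Counter


def greedy_rank(docs, weights, k, gamma=0.0):
    """Greedily rank documents by marginal weighted feature coverage.

    docs    -- dict mapping a document id to its set of features
               (stand-ins for words and named entities)
    weights -- dict mapping a feature to a non-negative weight
    k       -- number of documents to rank
    gamma   -- hypothetical redundancy tolerance in [0, 1]: a feature already
               covered n times contributes weight * gamma**n
               (0.0 = no credit for repeats, 1.0 = full credit)
    """
    seen = Counter()          # how often each feature has been covered so far
    remaining = dict(docs)    # documents not yet placed in the ranking
    ranking = []

    def gain(doc_id):
        # Marginal gain of adding doc_id given the features already covered.
        return sum(
            weights.get(f, 0.0) * gamma ** seen[f] for f in remaining[doc_id]
        )

    while remaining and len(ranking) < k:
        # Iterating over sorted ids makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        ranking.append(best)
        seen.update(remaining.pop(best))
    return ranking


if __name__ == "__main__":
    toy_docs = {"d1": {"a", "b"}, "d2": {"a", "b"}, "d3": {"c"}}
    toy_weights = {"a": 2.0, "b": 1.0, "c": 1.5}
    print(greedy_rank(toy_docs, toy_weights, k=3))             # redundancy penalized
    print(greedy_rank(toy_docs, toy_weights, k=3, gamma=1.0))  # redundancy tolerated
```

With `gamma=0.0` the near-duplicate `d2` is pushed below `d3`, because its features are already covered by `d1`; with `gamma=1.0` the ranking reduces to ordering by total feature weight. Since the marginal gain of any document can only shrink as more features are covered, this toy objective is submodular, which is the property the abstract relies on for the greedy guarantee.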