A model for mining relevant and non-redundant information

Authors:
Laura Langohr; Hannu Toivonen
Affiliations:
University of Helsinki, Finland; University of Helsinki, Finland
Venue:
Proceedings of the 27th Annual ACM Symposium on Applied Computing
Year:
2012

Abstract

We propose a relatively simple yet powerful model for choosing relevant and non-redundant pieces of information. The model addresses data mining and information retrieval settings where relevance is measured with respect to a set of key or query objects, either specified by the user or obtained by a data mining step. The problem is not only to identify other relevant objects, but also to ensure that they are not related to possible negative query objects and that they are not redundant with respect to each other. The model assumes only a similarity or distance function for the objects. It has a simple parameterization that allows for different behaviors with respect to query objects. We analyze the model and give two efficient, approximate methods. We illustrate and evaluate the model on two applications: linguistics and social networks. The results indicate that the model and methods are useful for finding a relevant and non-redundant set of results. Although this area has been a popular research topic, our contribution is a simple, generic model that covers several related approaches and offers a systematic way to account for positive and negative query objects as well as non-redundancy of the output.
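To make the setting in the abstract concrete, the following is a minimal sketch of one way such a selection could work. It is not the authors' model or either of their two approximate methods, which the abstract does not spell out. It assumes a greedy rule in the spirit of Maximal Marginal Relevance: each step picks the candidate with the highest similarity to a positive query object, minus penalties for similarity to negative query objects and to objects already selected. The function names, the `neg_weight` and `red_weight` parameters, and the toy similarity on numbers are all hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def greedy_select(
    candidates: Sequence[T],
    positives: Sequence[T],
    negatives: Sequence[T],
    sim: Callable[[T, T], float],
    k: int,
    neg_weight: float = 1.0,   # hypothetical penalty weight for negative queries
    red_weight: float = 1.0,   # hypothetical penalty weight for redundancy
) -> List[T]:
    """Greedily pick up to k objects (illustrative heuristic, not the paper's method)."""
    selected: List[T] = []
    remaining = list(candidates)

    def score(c: T) -> float:
        # Relevance: similarity to the closest positive query object.
        rel = max((sim(c, q) for q in positives), default=0.0)
        # Penalty: similarity to the closest negative query object.
        neg = max((sim(c, q) for q in negatives), default=0.0)
        # Redundancy: similarity to the closest already selected object.
        red = max((sim(c, s) for s in selected), default=0.0)
        return rel - neg_weight * neg - red_weight * red

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def toy_sim(a: float, b: float) -> float:
    """Assumed toy similarity on numbers: 1 when equal, decaying with distance."""
    return 1.0 / (1.0 + abs(a - b))


candidates = [0.9, 1.1, 4.8, 9.0]
with_penalty = greedy_select(candidates, [1.0, 5.0], [9.2], toy_sim, k=2)
without_penalty = greedy_select(
    candidates, [1.0, 5.0], [9.2], toy_sim, k=2, red_weight=0.0
)
```

In this toy run, the redundancy penalty steers the second pick away from a near-duplicate of the first, while setting `red_weight=0.0` lets two nearly identical objects through. Only a pairwise similarity function is required, which matches the single assumption stated in the abstract.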