Result diversification is an effective method to reduce the risk that none of the returned results satisfies a user's query intent. It has been shown to decrease query abandonment substantially. However, computing an optimally diverse set is NP-hard for the usual objectives. Existing greedy diversification algorithms require random access to the input set, which makes them impractical for large result sets or continuous data. To address this issue, we present a novel diversification approach that treats the input as a stream and processes each element incrementally, maintaining a near-optimal diverse set at any point in the stream. Our approach has linear computational complexity and constant memory complexity with respect to input size, without significant loss of diversification quality. In an extensive evaluation on several real-world data sets, we show the applicability and efficiency of our algorithm for large result sets as well as for continuous query scenarios such as news stream subscriptions.
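To make the streaming idea concrete, here is a minimal Python sketch of one way such a single-pass diversifier could look. It is not the algorithm from the paper, whose details the abstract does not give. It assumes a max-sum objective (the sum of pairwise distances within the selected set), a Euclidean distance, and a simple swap rule: an arriving element replaces a selected one only if the swap increases the objective. The names `stream_diversify`, `euclidean`, and `contrib` are hypothetical and chosen for illustration.

```python
from typing import Callable, Iterable, List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance; an assumed dissimilarity, any metric would do."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def stream_diversify(
    stream: Iterable[Point],
    k: int,
    dist: Callable[[Point, Point], float] = euclidean,
) -> List[Point]:
    """Keep k items from a stream, swapping greedily to raise the
    sum of pairwise distances (assumed max-sum objective).

    Each element is seen once; memory is O(k) and the work per
    element is O(k) distance computations, independent of stream length.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    selected: List[Point] = []
    # contrib[i] = sum of distances from selected[i] to the other members
    contrib: List[float] = []

    for item in stream:
        d_new = [dist(item, s) for s in selected]

        if len(selected) < k:
            # Fill phase: accept everything until the set has k members.
            for i, d in enumerate(d_new):
                contrib[i] += d
            selected.append(item)
            contrib.append(sum(d_new))
            continue

        total = sum(d_new)
        # Change in the objective if `item` replaces selected[j].
        gains = [(total - d_new[j]) - contrib[j] for j in range(k)]
        j = max(range(k), key=gains.__getitem__)

        if gains[j] > 0:
            old = selected[j]
            for i in range(k):
                if i != j:
                    contrib[i] += d_new[i] - dist(selected[i], old)
            selected[j] = item
            contrib[j] = total - d_new[j]

    return selected
```

For example, `stream_diversify([(0,), (1,), (2,), (10,), (5,)], k=2)` ends with the two extreme points `(0,)` and `(10,)`. The sketch never revisits earlier elements, which is the property that makes a streaming approach suitable for continuous queries. It carries no approximation guarantee, though, and should not be read as reproducing the near-optimality claimed above.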