In many areas of life, we now have almost complete electronic archives reaching back well over two decades. Examples include the body of research papers in computer science, all news articles written in the US, and most people's personal email. However, we have only limited methods for analyzing and understanding these collections. While keyword-based retrieval systems allow efficient access to individual documents in an archive, we still lack methods for understanding a corpus as a whole. In this paper, we explore methods that provide a temporal summary of such corpora in terms of landmark documents, authors, and topics. In particular, we explicitly model the temporal nature of influence between documents and reinterpret summarization as a coverage problem over words anchored in time. The resulting models provide monotone submodular objectives for computing informative and non-redundant summaries over time, which can be efficiently optimized with greedy algorithms. Our empirical study shows the effectiveness of our approach over several baselines.
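To make the coverage view concrete, the following is a minimal sketch of greedy selection under a generic weighted word-coverage objective. It is not the paper's model: the function names, the representation of a time-anchored word as a (word, year) pair, the default weight of 1.0, and the toy documents are all assumptions chosen for illustration. The objective simply sums the weights of the distinct pairs covered, which is monotone and submodular, so the standard greedy procedure under a cardinality budget is known to achieve at least a (1 - 1/e) fraction of the optimum.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: a word anchored in time is a (word, year) pair.
TimedWord = Tuple[str, int]


def coverage(selected: List[str],
             docs: Dict[str, Set[TimedWord]],
             weights: Dict[TimedWord, float]) -> float:
    """Total weight of the distinct time-anchored words covered by `selected`."""
    covered: Set[TimedWord] = set()
    for doc_id in selected:
        covered |= docs[doc_id]
    # Words without an explicit weight default to 1.0 (an assumption).
    return sum(weights.get(w, 1.0) for w in covered)


def greedy_summary(docs: Dict[str, Set[TimedWord]],
                   weights: Dict[TimedWord, float],
                   budget: int) -> List[str]:
    """Pick up to `budget` documents, each time taking the largest marginal gain."""
    selected: List[str] = []
    covered: Set[TimedWord] = set()
    for _ in range(budget):
        best_id, best_gain = None, 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for doc_id in sorted(docs):
            if doc_id in selected:
                continue
            gain = sum(weights.get(w, 1.0) for w in docs[doc_id] - covered)
            if gain > best_gain:
                best_id, best_gain = doc_id, gain
        if best_id is None:
            break  # no remaining document adds new coverage
        selected.append(best_id)
        covered |= docs[best_id]
    return selected


# Toy corpus (invented for illustration only).
example_docs: Dict[str, Set[TimedWord]] = {
    "a": {("svm", 1998), ("kernel", 1998), ("margin", 1998)},
    "b": {("svm", 1998), ("kernel", 1998)},
    "c": {("topic", 2003), ("lda", 2003)},
    "d": {("margin", 1998), ("topic", 2003)},
}

summary = greedy_summary(example_docs, {}, budget=2)
print(summary, coverage(summary, example_docs, {}))
```

In this toy corpus the first pick is the document covering the most pairs, and the second pick is the one that adds the most pairs not yet covered, so a document that largely repeats earlier content is passed over. That is the sense in which a coverage objective favors summaries that are informative and non-redundant.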