This work presents a sentence ranking strategy based on distant supervision for the multi-document summarization problem. Because large training datasets of document clusters paired with human-made summaries are difficult to obtain, we propose building training and testing corpora from Wikinews. Wikinews articles are modeled as "distant" summaries of their cited sources, on the grounds that the first sentences of a Wikinews article tend to summarize the event covered in the news story. Sentences from the cited sources are represented as tuples of numerical features and labeled according to a relationship with the given distant summary that is based on Zipf's law. Ranking functions are trained using linear regression and ranking SVMs, and the resulting rankers are also combined using Borda count. The top-ranked sentences are concatenated to build summaries, which are compared with the first sentences of the distant summary using ROUGE evaluation measures. Experimental results show that the proposed method is effective and that combining different ranking techniques improves the quality of the generated summaries.
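The Borda count step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `borda_count`, the sentence ids, the two example rankings, and the tie-breaking rule are all assumptions made for illustration. It only shows the general idea of merging several sentence rankings into one consensus ranking.

```python
from collections import defaultdict


def borda_count(rankings):
    """Merge several rankings (best item first) into one consensus ranking.

    Illustrative helper, not taken from the paper. Each ranking is assumed
    to contain the same set of items.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            # The top item earns n - 1 points, the last item earns 0.
            scores[item] += n - 1 - position
    # Highest total first; ties are broken by item id (an assumption,
    # chosen only to make the output deterministic).
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical sentence ids as ordered by two hypothetical rankers.
regression_ranking = ["s3", "s1", "s2", "s4"]
svm_ranking = ["s1", "s3", "s4", "s2"]

combined = borda_count([regression_ranking, svm_ranking])
summary_ids = combined[:2]  # keep the two top-ranked sentences
print(combined)     # ['s1', 's3', 's2', 's4']
print(summary_ids)  # ['s1', 's3']
```

In this toy example `s1` and `s3` both collect 5 points and `s2` and `s4` both collect 1, so the final order depends on the tie-breaking rule. A real system would need to choose that rule deliberately, for example by falling back to one ranker's score.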