A serious bottleneck in the development of trainable text summarization systems is the shortage of training data. Constructing such data is tedious, especially because a text generally has many different correct summaries. Fortunately, the Internet can serve as a source of suitable training data. In this paper, we present a summarization system trained on data drawn from the web. The procedure involves structuring the articles downloaded from various websites, building adequate corpora of (summary, text) and (extract, text) pairs, training on positive and negative data, and automatically learning to perform extraction-based summarization at a level comparable to the best DUC systems.
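The abstract does not say how (extract, text) pairs are derived from (summary, text) pairs, or how positive and negative examples are chosen. The sketch below illustrates one plausible way to do it, under stated assumptions: each sentence of the text is labeled positive when a sufficient share of its distinct words also appears in the human-written summary, and negative otherwise. The function names, the word-overlap criterion, and the `threshold` value of 0.5 are all hypothetical and are not taken from the paper.

```python
import re


def tokenize(text):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def label_sentences(summary, text, threshold=0.5):
    """Return (sentence, label) pairs; 1 = positive example, 0 = negative.

    Assumption: a sentence is positive when at least `threshold` of its
    distinct words also occur in the summary. The threshold is hypothetical.
    """
    summary_words = set(tokenize(summary))
    labeled = []
    for sentence in split_sentences(text):
        words = set(tokenize(sentence))
        overlap = len(words & summary_words) / len(words) if words else 0.0
        labeled.append((sentence, 1 if overlap >= threshold else 0))
    return labeled


def build_extract_pair(summary, text, threshold=0.5):
    """Turn a (summary, text) pair into an (extract, text) pair."""
    extract = [s for s, label in label_sentences(summary, text, threshold)
               if label == 1]
    return extract, text
```

Calling `label_sentences(summary, text)` on a downloaded article and its summary would give the positive and negative sentence examples for a classifier, while `build_extract_pair` gives the corresponding (extract, text) pair. A real system would likely need a stronger sentence splitter and a less brittle similarity measure than raw word overlap.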