This work proposes a new extractive text-summarization algorithm based on the importance of the topics contained in a document. The algorithm proceeds in three steps. First, the document is partitioned with the TextTiling algorithm, which identifies topics (coherent segments of text) based on the TF-IDF metric. Second, for each topic the algorithm computes a measure of its relative relevance in the document. This measure is based on TF-ISF (Term Frequency - Inverse Sentence Frequency), our adaptation of the well-known TF-IDF (Term Frequency - Inverse Document Frequency) measure from information retrieval. Finally, the summary is generated by selecting from each topic a number of sentences proportional to the importance of that topic.
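The scoring and selection steps can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration, since the abstract does not specify them: topics are passed in already segmented (TextTiling itself is not reproduced), a sentence's score is the mean TF-ISF of its words with ISF(w) = log(N / SF(w)), a topic's importance is the mean score of its sentences, and the highest-scoring sentences are taken within each topic. The names `tokenize`, `tf_isf_scores`, and `summarize` are hypothetical.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Naive tokenizer (assumption): split on whitespace, strip punctuation, lowercase.
    words = (w.strip(".,;:!?\"'()").lower() for w in sentence.split())
    return [w for w in words if w]


def tf_isf_scores(sentences):
    """Return one score per sentence: the mean TF-ISF of its words (assumed form)."""
    tokens = [tokenize(s) for s in sentences]
    n = len(tokens)
    # Sentence frequency: in how many sentences each word occurs.
    sf = Counter(w for toks in tokens for w in set(toks))
    scores = []
    for toks in tokens:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        total = sum(tf[w] * math.log(n / sf[w]) for w in tf)
        scores.append(total / len(toks))
    return scores


def summarize(topics, budget):
    """topics: list of topics, each a list of sentences; budget: target summary size."""
    sentences = [s for topic in topics for s in topic]
    scores = tf_isf_scores(sentences)

    # Split the flat score list back into one list per topic.
    per_topic, start = [], 0
    for topic in topics:
        per_topic.append(scores[start:start + len(topic)])
        start += len(topic)

    # Topic importance (assumption): mean sentence score within the topic.
    importance = [sum(sc) / len(sc) if sc else 0.0 for sc in per_topic]
    total = sum(importance)

    summary = []
    for topic, sc, imp in zip(topics, per_topic, importance):
        share = imp / total if total > 0 else 1 / len(topics)
        # Sentences allotted in proportion to importance; rounding means the
        # allotments may not add up exactly to the budget.
        k = min(len(topic), round(budget * share))
        best = sorted(range(len(topic)), key=lambda j: sc[j], reverse=True)[:k]
        summary.extend(topic[j] for j in sorted(best))  # keep document order
    return summary


if __name__ == "__main__":
    example = [
        ["Cats chase mice.", "Cats sleep."],
        ["The market fell sharply today.", "Stocks fell."],
    ]
    print(summarize(example, budget=2))
```

Computing ISF over the sentences of the single input document, rather than over a document collection, is what lets the measure work without an external corpus; a word that appears in every sentence gets an ISF of zero and contributes nothing to any score.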