An algorithm for one-page summarization of a long text based on thematic hierarchy detection

Authors: Yoshio Nakao
Affiliations: Fujitsu Laboratories Ltd., Nakahara-ku, Kawasaki, Japan
Venue: ACL '00: Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics
Year: 2000

Abstract

This paper presents an algorithm for text summarization that uses the thematic hierarchy of a text. The algorithm is intended to generate a one-page summary, enabling the user to skim a large electronic book on a computer display. It first detects the thematic hierarchy of a source text using lexical cohesion, measured by term repetition. It then identifies boundary sentences, at which a topic of appropriate granularity probably starts. Finally, it generates a structured summary that outlines the thematic hierarchy. The paper mainly describes and evaluates the boundary-sentence identification step of the algorithm, and then briefly discusses the readability of one-page summaries.
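
The idea of measuring lexical cohesion by term repetition can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the regex tokenizer, the cosine measure, the fixed window size, and the local-minimum rule are all assumptions chosen for illustration. The sketch scores each gap between sentences by how many terms the windows on either side share, and treats gaps where that score dips as candidate topic boundaries.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase alphabetic tokens; a crude stand-in for real term extraction."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cohesion_scores(sentences, window):
    """Map each gap g (between sentence g-1 and g) to a cohesion score.

    The score compares the terms of up to `window` sentences before the
    gap with the terms of up to `window` sentences after it.
    """
    bags = [Counter(tokenize(s)) for s in sentences]
    scores = {}
    for gap in range(1, len(sentences)):
        left = sum(bags[max(0, gap - window):gap], Counter())
        right = sum(bags[gap:gap + window], Counter())
        scores[gap] = cosine(left, right)
    return scores


def candidate_boundaries(scores):
    """Gaps whose cohesion is strictly lower than at both neighbouring gaps."""
    gaps = sorted(scores)
    return [
        g
        for prev, g, nxt in zip(gaps, gaps[1:], gaps[2:])
        if scores[g] < scores[prev] and scores[g] < scores[nxt]
    ]


# Toy input (hypothetical): three sentences on one topic, then three on another.
sentences = [
    "The cat chased the mouse.",
    "The mouse escaped the cat.",
    "A cat naps after the chase.",
    "Stock prices fell sharply.",
    "Investors sold stock as prices fell.",
    "Prices of stock recovered later.",
]

boundaries = candidate_boundaries(cohesion_scores(sentences, window=2))
print(boundaries)  # [3]: the topic shift falls before the fourth sentence
```

Repeating the scoring with several window sizes, large for coarse topics and small for fine ones, is one plausible way to obtain boundaries at different levels of a hierarchy; how the levels are reconciled is left open in this sketch.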