Text Entailment for Logical Segmentation and Summarization

Authors:
Doina Tatar;Andreea Diana Mihis;Dana Lupsa
Affiliations:
University "Babes-Bolyai", Cluj-Napoca, Romania;University "Babes-Bolyai", Cluj-Napoca, Romania;University "Babes-Bolyai", Cluj-Napoca, Romania
Venue:
NLDB '08 Proceedings of the 13th international conference on Natural Language and Information Systems: Applications of Natural Language to Information Systems
Year:
2008

Citing 14

Attention, intentions, and the structure of discourse
  Computational Linguistics

Statistical Models for Text Segmentation
  Machine Learning - Special issue on natural language learning

A critique and improvement of an evaluation metric for text segmentation
  Computational Linguistics

Introduction to the special issue on summarization
  Computational Linguistics - Summarization

Efficiently computed lexical chains as an intermediate representation for automatic text summarization
  Computational Linguistics - Summarization

TextTiling: A Quantitative Approach to Discourse
  TextTiling: A Quantitative Approach to Discourse

Lexical cohesion computed by thesaural relations as an indicator of the structure of text
  Computational Linguistics

TextTiling: segmenting text into multi-paragraph subtopic passages
  Computational Linguistics

Advances in domain independent linear text segmentation
  NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference

Multi-paragraph segmentation of expository text
  ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics

Cohesion and collocation: using context vectors in text segmentation
  ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics

Modeling local coherence: an entity-based approach
  ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics

Measuring the semantic similarity of texts
  EMSEE '05 Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment

The PASCAL recognising textual entailment challenge
  MLCW'05 Proceedings of the First international conference on Machine Learning Challenges: evaluating Predictive Uncertainty, Visual Object Classification, and Recognizing Textual Entailment

Cited 1

Editorial: COMPENDIUM: A text summarization system for generating abstracts of research papers
  Data & Knowledge Engineering

Abstract

Summarization is the process of condensing a source text into a shorter version while preserving its information content ([2]). This paper presents original methods for extractive summarization of a single source document, based on an intuition that has not been explored until now: the logical structure of a text. The summarization relies on an original linear segmentation algorithm, which we call logical segmentation (LTT) because the score of a sentence is the number of sentences of the text that are entailed by it.

The summary is obtained by three methods: selecting the first sentence(s) of a segment, selecting the best-scored sentence(s) of a segment, and selecting the most informative sentence(s) of a segment (relative to those previously selected). Moreover, our methods permit dynamically adjusting the size of the derived summary, independently of the number of segments.

Alternatively, we propose a Dynamic Programming (DP) method, based on the continuity principle and applied to the sentences logically scored as above. This method first obtains the summary and then determines the segments.

Our segmentation methods are applied and evaluated against the segmentation of the text "I spent the first 19 years" by Morris and Hirst ([17]). The original text is reproduced at [26]. We give some statistics on the informativeness of summaries of different lengths, obtained with the above methods, relative to the original text. These statistics show that segmentation preceding summarization could improve the quality of the resulting summaries.
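
The logical scoring and the first two selection methods described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `toy_entails`, `logical_scores`, and `select_summary` are hypothetical, and the word-overlap test is only a stand-in for a real textual entailment recognizer. The sketch also assumes that a sentence does not count toward its own score and that the segments are already given as half-open index ranges, since the abstract does not specify how segment boundaries are derived.

```python
from typing import Callable, List, Sequence, Tuple

# An entailment test takes (text, hypothesis) and returns True if text entails hypothesis.
Entails = Callable[[str, str], bool]


def toy_entails(text: str, hypothesis: str) -> bool:
    """Hypothetical stand-in for an entailment recognizer:
    holds when every word of the hypothesis also occurs in the text."""
    return set(hypothesis.lower().split()) <= set(text.lower().split())


def logical_scores(sentences: Sequence[str], entails: Entails) -> List[int]:
    """Score each sentence by the number of OTHER sentences it entails
    (excluding the sentence itself is an assumption of this sketch)."""
    return [
        sum(1 for j, hyp in enumerate(sentences) if j != i and entails(sent, hyp))
        for i, sent in enumerate(sentences)
    ]


def select_summary(
    sentences: Sequence[str],
    segments: Sequence[Tuple[int, int]],
    scores: Sequence[int],
    method: str = "best",
) -> List[str]:
    """Pick one sentence per segment; segments are half-open (start, end) ranges."""
    summary = []
    for start, end in segments:
        if method == "first":
            pick = start  # first sentence of the segment
        elif method == "best":
            # highest logical score; ties go to the earliest sentence
            pick = max(range(start, end), key=lambda i: scores[i])
        else:
            raise ValueError(f"unknown method: {method}")
        summary.append(sentences[pick])
    return summary
```

With a real entailment recognizer passed in place of `toy_entails`, the same two functions would apply unchanged. The third method (most informative sentence relative to those already selected) and the DP variant are left out because the abstract does not give enough detail to sketch them faithfully.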