Using hidden Markov modeling to decompose human-written summaries

Authors:
Hongyan Jing
Affiliations:
Lucent Technologies, Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ
Venue:
Computational Linguistics - Summarization
Year:
2002

Abstract

Professional summarizers often reuse original documents to generate summaries. The task of summary sentence decomposition is to deduce whether a summary sentence is constructed by reusing the original text and to identify reused phrases. Specifically, the decomposition program needs to answer three questions for a given summary sentence: (1) Is this summary sentence constructed by reusing the text in the original document? (2) If so, what phrases in the sentence come from the original document? and (3) From where in the document do the phrases come? Solving the decomposition problem can lead to better text generation techniques for summarization. Decomposition can also provide large training and testing corpora for extraction-based summarizers. We propose a hidden Markov model solution to the decomposition problem. Evaluations show that the proposed algorithm performs well.
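The three questions above can be illustrated with a small Viterbi-style search. This is only a rough sketch under stated assumptions, not the model from the paper: each summary word is mapped to candidate (sentence, word) positions in the document, and the most likely sequence of positions is chosen. The function name `decompose`, the helper `transition`, and all transition weights are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def decompose(summary_words, doc_sentences):
    """Map each summary word to a (sentence, word) document position, or None."""
    # Index every document position at which each word occurs.
    positions = defaultdict(list)
    for s, sentence in enumerate(doc_sentences):
        for w, word in enumerate(sentence):
            positions[word.lower()].append((s, w))

    def transition(prev, cur):
        # Hypothetical weights (not values from the paper): prefer words that
        # are adjacent in the same document sentence, then nearby text.
        if prev[0] == cur[0]:
            if cur[1] == prev[1] + 1:
                return 1.0  # next word in the same sentence
            if cur[1] > prev[1]:
                return 0.9  # later in the same sentence
            return 0.8      # earlier in the same sentence
        if abs(prev[0] - cur[0]) <= 2:
            return 0.7      # a nearby sentence
        return 0.5          # a distant sentence

    # Summary words with no occurrence in the document stay unmapped (None).
    indexed = [(i, positions[w.lower()])
               for i, w in enumerate(summary_words)
               if w.lower() in positions]
    result = [None] * len(summary_words)
    if not indexed:
        return result

    # Viterbi: each layer maps position -> (best score, best previous position).
    layers = [{p: (1.0, None) for p in indexed[0][1]}]
    for _, candidates in indexed[1:]:
        prev_layer = layers[-1]
        layer = {}
        for cur in candidates:
            layer[cur] = max(
                (prev_layer[prev][0] * transition(prev, cur), prev)
                for prev in prev_layer
            )
        layers.append(layer)

    # Backtrace from the best final position.
    path = [max(layers[-1], key=lambda p: layers[-1][p][0])]
    for layer in reversed(layers[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()

    for (i, _), pos in zip(indexed, path):
        result[i] = pos
    return result
```

In this sketch, a summary sentence counts as reusing the original text if any word receives a position (question 1), runs of consecutive positions mark the reused phrases (question 2), and the positions themselves give the origin in the document (question 3). Calling `decompose(["the", "cat", "sat", "quietly"], doc)` with `doc` given as a list of tokenized sentences returns one entry per summary word.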