Combining syntax and semantics for automatic extractive single-document summarization

Authors:
Araly Barrera; Rakesh Verma
Affiliations:
Computer Science Department, University of Houston, Houston, TX; Computer Science Department, University of Houston, Houston, TX
Venue:
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
Year:
2012

Abstract

The goal of automated summarization is to tackle the "information overload" problem by extracting, and perhaps compressing, the most important content of a document. Because single-document summarization has had difficulty beating a standard baseline, especially for news articles, most current efforts focus on multi-document summarization. This study reconsiders the importance of single-document summarization by introducing a new approach and its implementation. The approach combines syntactic, semantic, and statistical methodologies, and reflects psychological findings that pinpoint specific selection patterns humans follow when constructing summaries. Our system achieves successful summary evaluation results and outperforms the baseline on two separate datasets: the Document Understanding Conference (DUC) 2002 dataset and a set of scientific magazine articles. These results have implications for extractive and abstractive single-document summarization, and could also be leveraged in multi-document summarization.