Detecting human features in summaries --- symbol sequence statistical regularity

Authors:
George Giannakopoulos;Vangelis Karkaletsis;George A. Vouros
Affiliations:
Software and Knowledge Engineering Laboratory, National Center of Scientific Research "Demokritos", Greece;Software and Knowledge Engineering Laboratory, National Center of Scientific Research "Demokritos", Greece;Department of Digital Systems, University of Piraeus, Greece
Venue:
SETN'12 Proceedings of the 7th Hellenic conference on Artificial Intelligence: theories and applications
Year:
2012

Abstract

This work studies textual summaries, aiming to identify the qualities that distinguish human-written multi-document summaries from automatically extracted ones. The measured features are based on a generic statistical regularity measure, named Symbol Sequence Statistical Regularity (SSSR). The measure is calculated over both character and word n-grams of various ranks, given a set of human and automatically extracted multi-document summaries from two different corpora. The experimental results indicate that the proposed measure has enough discriminative power to distinguish human from non-human summaries. The results also hint at the qualities a human summary holds, improving intuition about how a good summary should be generated.
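
As an illustration of the input the abstract describes, the following minimal sketch extracts character and word n-grams of several ranks from a text and counts them. It assumes that "rank" denotes the n-gram size n and that words are split on whitespace. The function names (`char_ngrams`, `word_ngrams`, `ngram_counts`) are hypothetical, and the sketch does not compute the SSSR measure itself, which the abstract does not define.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of size n, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return all overlapping word n-grams of size n as tuples.

    Assumption: words are obtained by simple whitespace splitting.
    """
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_counts(text, ranks, unit="char"):
    """Map each rank n in `ranks` to a Counter of n-gram frequencies.

    `unit` selects character ("char") or word ("word") n-grams.
    """
    extract = char_ngrams if unit == "char" else word_ngrams
    return {n: Counter(extract(text, n)) for n in ranks}
```

Frequency tables of this kind, built separately for human and automatically extracted summaries, are the sort of raw statistics a regularity measure over symbol sequences could be computed from.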