Supervised ranking in open-domain text summarization

Authors:
Tadashi Nomoto; Yuji Matsumoto
Affiliations:
National Institute of Japanese Literature, Tokyo, Japan; Nara Institute of Science and Technology, Nara, Japan
Venue:
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Year:
2002

Abstract

This paper proposes and empirically motivates integrating supervised learning with unsupervised learning to deal with human biases in summarization. In particular, we explore the use of probabilistic decision trees within a clustering framework to account for both the variation and the regularity in human-created summaries. A corpus of human-created extracts is built from a newspaper corpus and used as a test set. We build probabilistic decision trees of different flavors and integrate each of them with the clustering framework. Experiments on the corpus show that combining the two paradigms generally gives a significant boost in performance over using either one alone.