Evaluation of features for sentence extraction on different types of corpora

Authors:
Chikashi Nobata;Satoshi Sekine;Hitoshi Isahara
Affiliations:
Communications Research Laboratory, Seika-cho, Soraku-gun, Kyoto, Japan;New York University, New York, NY;Communications Research Laboratory, Seika-cho, Soraku-gun, Kyoto, Japan
Venue:
MultiSumQA '03 Proceedings of the ACL 2003 workshop on Multilingual summarization and question answering - Volume 12
Year:
2003

Abstract

We report evaluation results for our summarization system and analyze the resulting summarization data for three different types of corpora. To develop a robust summarization system, we built a system based on sentence extraction and applied it to Japanese and English newspaper articles, obtaining some of the top results at two evaluation workshops. We also created sentence extraction data from Japanese lectures and evaluated our system on these data. In addition to the evaluation results, we analyze the relationships between key sentences and the features used in sentence extraction. We find that discrete combinations of features match the distributions of key sentences better than sequential combinations.