Extracting important sentences with support vector machines

Authors:
Tsutomu Hirao; Hideki Isozaki; Eisaku Maeda; Yuji Matsumoto
Affiliations:
NTT Communication Science Laboratories, Kyoto, Japan; NTT Communication Science Laboratories, Kyoto, Japan; NTT Communication Science Laboratories, Kyoto, Japan; Nara Institute of Science and Technology, Nara, Japan
Venue:
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Year:
2002

Abstract

Extracting sentences that contain important information from a document is a form of text summarization, and it is key to automatically generating summaries similar to those written by humans. Such extraction requires integrating heterogeneous pieces of information. One approach, tuning the parameters by machine learning, has attracted considerable attention. This paper proposes a sentence extraction method based on Support Vector Machines (SVMs). To confirm its performance, we conduct experiments comparing it with three existing methods. Results on the Text Summarization Challenge (TSC) corpus show that our method achieves the highest accuracy of the methods compared. Moreover, we clarify which features are effective for sentence extraction in different document genres.
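
As a rough illustration of SVM-based sentence extraction, the sketch below trains a small linear SVM and keeps the top-scoring sentences. It is a minimal sketch under stated assumptions, not the authors' implementation: the three features (position, length, presence of a digit), the plain subgradient training loop, the ranking by decision value, and all function names are hypothetical choices made for this example. The abstract does not specify the feature set, the kernel, or the ranking rule.

```python
# Minimal sketch of sentence extraction with a linear SVM (pure Python).
# Assumptions: the features, the training loop, and ranking by decision
# value are illustrative only and are not taken from the paper.


def features(sentence, index, total):
    """Map a sentence to a small, hypothetical feature vector."""
    words = sentence.split()
    position = 1.0 - index / max(total - 1, 1)  # earlier sentences get larger values
    length = min(len(words) / 20.0, 1.0)        # word count, capped and normalized
    has_digit = 1.0 if any(ch.isdigit() for ch in sentence) else 0.0
    return [position, length, has_digit]


def train_linear_svm(xs, ys, lam=0.01, epochs=200, lr=0.1):
    """Minimize lam/2 * ||w||^2 + mean hinge loss by full-batch subgradient descent.

    xs is a list of feature vectors, ys a list of labels in {+1, -1}.
    """
    dim = len(xs[0])
    n = len(xs)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # example violates the margin: hinge loss is active
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision(w, b, x):
    """Signed score w.x + b; larger means 'more likely important'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def extract(sentences, w, b, k):
    """Return the k highest-scoring sentences, kept in document order."""
    total = len(sentences)
    scored = [(decision(w, b, features(s, i, total)), i)
              for i, s in enumerate(sentences)]
    top = sorted(scored, reverse=True)[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda t: t[1])]


if __name__ == "__main__":
    # Toy document with made-up importance labels (+1 important, -1 not).
    doc = [
        "Sales rose 12 percent in 2001 according to the company report.",
        "The increase was driven by strong demand in overseas markets.",
        "A spokesperson thanked employees.",
        "The weather was fine.",
    ]
    labels = [1, 1, -1, -1]
    xs = [features(s, i, len(doc)) for i, s in enumerate(doc)]
    w, b = train_linear_svm(xs, labels)
    for sentence in extract(doc, w, b, k=2):
        print(sentence)
```

Ranking by the decision value rather than by the binary class label lets the summary length `k` be chosen independently of how many sentences the classifier would label as important. A kernel SVM from an established library would be the more realistic choice in practice; the hand-written linear version is used here only to keep the example self-contained.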