Developing learning strategies for topic-based summarization

Authors:
You Ouyang;Sujian Li;Wenjie Li
Affiliations:
Hong Kong Polytechnic University, Hong Kong, Hong Kong;Peking University, Beijing, China;Hong Kong Polytechnic University, Hong Kong, Hong Kong
Venue:
Proceedings of the sixteenth ACM conference on information and knowledge management
Year:
2007

References: 10
Cited by: 12

The nature of statistical learning theory
A trainable document summarizer

SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet

CICLing '02 Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing
Extracting important sentences with support vector machines

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Learning surface text patterns for a Question Answering system

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
GATE: an architecture for development of robust HLT applications

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
A web-trained extraction summarization system

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Manual and automatic evaluation of summaries

AS '02 Proceedings of the ACL-02 Workshop on Automatic Summarization - Volume 4
Topic-focused multi-document summarization using an approximate oracle score

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
LIBSVM: A library for support vector machines

ACM Transactions on Intelligent Systems and Technology (TIST)

Query-sensitive mutual reinforcement chain and its application in query-oriented multi-document summarization

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Using Proximity in Query Focused Multi-document Extractive Summarization

ICCPOL '09 Proceedings of the 22nd International Conference on Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy
Learning Similarity Functions in Graph-Based Document Summarization

ICCPOL '09 Proceedings of the 22nd International Conference on Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy
Topic analysis for topic-focused multi-document summarization

Proceedings of the 18th ACM conference on Information and knowledge management
HyperSum: hypergraph based semi-supervised sentence ranking for query-oriented summarization

Proceedings of the 18th ACM conference on Information and knowledge management
Graph-based multi-modality learning for topic-focused multi-document summarization

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Focused multi-document summarization: human summarization activity vs. automated systems techniques

Journal of Computing Sciences in Colleges
Intertopic information mining for query-based summarization

Journal of the American Society for Information Science and Technology
A comparative study on ranking and selection strategies for multi-document summarization

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
A study on position information in document summarization

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
pSum-SaDE: a modified p-median problem and self-adaptive differential evolution algorithm for text summarization

Applied Computational Intelligence and Soft Computing
PPSGen: learning to generate presentation slides for academic papers

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Abstract

Most state-of-the-art topic-based summarization systems are built on the extractive framework. They score sentences from their associated features, with feature weights that are assigned manually or tuned experimentally. In this paper, we discuss how to develop learning strategies that obtain the optimal feature weights automatically, so that a sound score can be assigned to any sentence characterized by a set of features. The two fundamental issues are the training data and the learning model. To avoid costly manual annotation, we construct the training data by labeling each sentence with a "true" score calculated from human summaries. A Support Vector Regression (SVR) model is then used to learn the relation between a sentence's "true" score and its features. Once this relation has been modeled, SVR can predict an "estimated" score for any given sentence. Evaluations with the ROUGE-2 criterion on the DUC 2005 and DUC 2006 document sets demonstrate the competitiveness and adaptability of the proposed approaches.
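
The training-data step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the "true" score, so the bigram-overlap measure below is an assumption chosen only because the evaluation uses ROUGE-2; the function names (`bigrams`, `true_score`, `build_training_data`) and the `feature_fn` parameter are hypothetical and do not come from the paper.

```python
def bigrams(text):
    """Lowercase, split on whitespace, and return the adjacent word pairs."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def true_score(sentence, human_summaries):
    """Assumed 'true' score: the fraction of the sentence's bigrams found in a
    human summary, averaged over all human summaries. This is a stand-in, not
    the paper's formula."""
    sent_bigrams = bigrams(sentence)
    if not sent_bigrams or not human_summaries:
        return 0.0
    fractions = []
    for summary in human_summaries:
        summary_bigrams = set(bigrams(summary))
        hits = sum(1 for bg in sent_bigrams if bg in summary_bigrams)
        fractions.append(hits / len(sent_bigrams))
    return sum(fractions) / len(fractions)


def build_training_data(sentences, feature_fn, human_summaries):
    """Pair each sentence's feature vector with its assumed 'true' score.
    feature_fn is a hypothetical callable mapping a sentence to a list of
    numeric features."""
    return [(feature_fn(s), true_score(s, human_summaries)) for s in sentences]
```

The resulting (features, score) pairs are the kind of input a regression learner expects; in the setting the abstract describes, they would be passed to an SVR implementation to fit the mapping from features to score, which is then applied to unseen sentences.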