Applying regression models to query-focused multi-document summarization

Authors:
You Ouyang;Wenjie Li;Sujian Li;Qin Lu
Affiliations:
Department of Computing, The Hong Kong Polytechnic University, Hong Kong;Department of Computing, The Hong Kong Polytechnic University, Hong Kong;Key Laboratory of Computational Linguistics, Peking University, Ministry of Education, China;Department of Computing, The Hong Kong Polytechnic University, Hong Kong
Venue:
Information Processing and Management: an International Journal
Year:
2011

Abstract

Most existing research on applying machine learning techniques to document summarization explores either classification models or learning-to-rank models. This paper presents our recent study on how to apply a different kind of learning model, namely regression models, to query-focused multi-document summarization. We use Support Vector Regression (SVR) to estimate the importance of a sentence in the document set to be summarized from a set of pre-defined features. To learn the regression models, we propose several methods for constructing "pseudo" training data, assigning each sentence a "nearly true" importance score calculated from the human summaries provided for the corresponding document set. A series of evaluations on the DUC data sets is conducted to examine the efficiency and robustness of the proposed approaches. When compared with classification models and ranking models, regression models are consistently preferable.
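As a rough illustration of the "pseudo" training data idea, the sketch below assigns each sentence a real-valued score based on its word overlap with the human summaries. The scoring function (average clipped unigram overlap) and the names `tokenize`, `pseudo_score`, and `build_training_targets` are assumptions made for this example; the abstract does not specify the scoring methods the paper proposes.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?\"'()") for word in text.lower().split())
    return [t for t in tokens if t]


def pseudo_score(sentence, human_summaries):
    """Hypothetical 'nearly true' importance score for one sentence.

    For each human summary, compute the fraction of the sentence's tokens
    that also occur in that summary (counts are clipped), then average the
    fractions over all summaries. This is an illustrative choice only.
    """
    sent_counts = Counter(tokenize(sentence))
    total = sum(sent_counts.values())
    if total == 0 or not human_summaries:
        return 0.0
    fractions = []
    for summary in human_summaries:
        summ_counts = Counter(tokenize(summary))
        overlap = sum(min(c, summ_counts[w]) for w, c in sent_counts.items())
        fractions.append(overlap / total)
    return sum(fractions) / len(fractions)


def build_training_targets(sentences, human_summaries):
    """Pair every sentence with its pseudo score (the regression target)."""
    return [(s, pseudo_score(s, human_summaries)) for s in sentences]


if __name__ == "__main__":
    summaries = [
        "Heavy floods damaged the old bridge.",
        "The bridge was damaged by floods.",
    ]
    sentences = ["Floods damaged the bridge.", "The weather was pleasant."]
    for sentence, score in build_training_targets(sentences, summaries):
        print(f"{score:.3f}  {sentence}")
```

Scores of this kind could then serve as continuous targets for a regressor such as SVR, with each sentence represented by its feature vector. That contrasts with the binary labels a classification model would need.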