A formal model for information selection in multi-sentence text extraction

Authors:
Elena Filatova;Vasileios Hatzivassiloglou
Affiliations:
Columbia University, New York, NY;Columbia University, New York, NY
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

Abstract

Selecting important information while accounting for repetitions is a hard task for both summarization and question answering. We propose a formal model that represents a collection of documents in a two-dimensional space of textual and conceptual units, with an associated mapping between these two dimensions. This representation is then used to describe the task of selecting textual units for a summary or answer as a formal optimization task. We provide approximation algorithms and empirically validate the performance of the proposed model when used with two very different sets of features: words and atomic events.
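
To make the selection task concrete, here is a minimal sketch of how it could be posed as a coverage problem. The abstract does not specify the objective or the algorithms, so everything here is an assumption for illustration: the mapping is taken to be a dictionary from each textual unit to the set of conceptual units it expresses, each conceptual unit carries a hypothetical weight, and selection is a simple greedy heuristic under a budget on the number of units. The names `greedy_select`, `unit_concepts`, `concept_weights`, and `budget` are illustrative and do not come from the paper.

```python
from typing import Dict, List, Set


def greedy_select(
    unit_concepts: Dict[str, Set[str]],
    concept_weights: Dict[str, float],
    budget: int,
) -> List[str]:
    """Greedily pick up to `budget` textual units.

    At each step, take the unit that adds the largest total weight of
    conceptual units not yet covered. Repeated concepts add nothing,
    so redundant units are naturally skipped.
    """
    covered: Set[str] = set()
    selected: List[str] = []
    remaining = dict(unit_concepts)  # shallow copy; the input is not mutated

    while remaining and len(selected) < budget:
        def gain(unit: str) -> float:
            # Weight of the concepts this unit would newly cover.
            new_concepts = remaining[unit] - covered
            return sum(concept_weights.get(c, 0.0) for c in new_concepts)

        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:
            break  # nothing left adds new information
        selected.append(best)
        covered |= remaining.pop(best)

    return selected


# Hypothetical toy data: three sentences mapped to the concepts they mention.
units = {
    "s1": {"earthquake", "city"},
    "s2": {"earthquake", "city", "rescue"},
    "s3": {"aid"},
}
weights = {"earthquake": 3.0, "city": 1.0, "rescue": 2.0, "aid": 1.5}

print(greedy_select(units, weights, budget=2))
```

In this toy input, `s1` repeats concepts that `s2` already covers, so it contributes no new weight once `s2` is chosen. That is the sense in which a coverage-style objective accounts for repetitions. The conceptual units could equally be words or atomic events; only the mapping changes.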