Exploring content models for multi-document summarization

Authors:
Aria Haghighi; Lucy Vanderwende
Affiliations:
UC Berkeley; Microsoft Research
Venue:
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2009

Abstract

We present an exploration of generative probabilistic models for multi-document summarization. Beginning with a simple word-frequency-based model (Nenkova and Vanderwende, 2005), we construct a sequence of models, each injecting more structure into the representation of document-set content and each yielding ROUGE gains. Our final model, HierSum, uses a hierarchical LDA-style model (Blei et al., 2004) to represent content specificity as a hierarchy of topic vocabulary distributions. On the task of producing generic DUC-style summaries, HierSum yields state-of-the-art ROUGE performance and, in a pairwise user evaluation, strongly outperforms the state-of-the-art discriminative system of Toutanova et al. (2007). We also explore HierSum's capacity to produce multiple 'topical summaries' to facilitate content discovery and navigation.
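
The word-frequency starting point mentioned in the abstract can be illustrated with a minimal Python sketch in the spirit of SumBasic (Nenkova and Vanderwende, 2005). It is a simplified variant and not the authors' implementation: it scores each sentence by the average unigram probability of its words, greedily picks the best one, and squares the probabilities of the words just used so that later picks favor new content. It leaves out the original algorithm's requirement that the chosen sentence contain the currently most probable word. The function name `sumbasic`, the `max_sentences` parameter, and the assumption that input arrives as pre-tokenized sentences are all illustrative choices, not taken from the paper.

```python
from collections import Counter


def sumbasic(sentences, max_sentences=2):
    """Greedy SumBasic-style selection (simplified sketch).

    `sentences` is a list of token lists; returns the indices of the
    selected sentences in the order they were picked.
    """
    tokens = [w for s in sentences for w in s]
    total = len(tokens)
    # Unigram probability of each word over the whole document set.
    prob = {w: c / total for w, c in Counter(tokens).items()}

    def score(i):
        # Average probability of the words in sentence i.
        return sum(prob[w] for w in sentences[i]) / len(sentences[i])

    chosen = []
    # Skip empty sentences so the average is always defined.
    remaining = [i for i, s in enumerate(sentences) if s]
    while remaining and len(chosen) < max_sentences:
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
        # Squaring discounts words that are already covered,
        # which discourages redundant follow-up sentences.
        for w in set(sentences[best]):
            prob[w] = prob[w] ** 2
    return chosen
```

With a toy input in which two sentences share most of their words, the second pick tends to move to a sentence with unseen vocabulary, which is the redundancy behavior that the squaring step is meant to produce.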