Multiple documents summarization based on genetic algorithm

Authors:
Derong Liu;Yongcheng Wang;Chuanhan Liu;Zhiqi Wang
Affiliations:
Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University;Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University;Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University;Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University
Venue:
FSKD'06 Proceedings of the Third international conference on Fuzzy Systems and Knowledge Discovery
Year:
2006

Abstract

As the volume of online information grows, it becomes more important to automatically extract the core content from many information sources. We propose a model for multi-document summarization that maximizes topic coverage and minimizes content redundancy. Based on a Chinese concept lexicon and corpus, the model analyzes the topic of each document, the relationships among the documents, and the central theme of the collection in order to evaluate sentences. We present different approaches for deciding which sentences to extract, based on sentence weights and their relevance to the related documents. A genetic algorithm is designed to improve the quality of the summary. The experimental results indicate that using a genetic algorithm is useful and effective for improving the quality of multi-document summarization.
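
The abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so the following is only a minimal sketch of how a genetic algorithm could trade topic coverage against redundancy when selecting sentences. Everything in it is an assumption made for illustration: sentences are represented as sets of concept IDs, redundancy is measured as mean pairwise Jaccard overlap, and the names `fitness`, `summarize`, `budget`, and `penalty` are hypothetical rather than taken from the paper.

```python
import random
from itertools import combinations

def jaccard(a, b):
    """Overlap between two concept sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def fitness(mask, sentences, budget, penalty=0.5):
    """Assumed objective: topic coverage minus a redundancy penalty.
    Empty or over-budget selections are infeasible and score -inf."""
    chosen = [s for bit, s in zip(mask, sentences) if bit]
    if not chosen or len(chosen) > budget:
        return float("-inf")
    all_topics = set().union(*sentences)
    coverage = len(set().union(*chosen)) / len(all_topics)
    pairs = list(combinations(chosen, 2))
    redundancy = sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return coverage - penalty * redundancy

def random_mask(n, budget, rng):
    """Random feasible bit mask selecting between 1 and `budget` sentences."""
    k = rng.randint(1, min(budget, n))
    picked = set(rng.sample(range(n), k))
    return [1 if i in picked else 0 for i in range(n)]

def summarize(sentences, budget, generations=40, pop_size=20,
              mutation_rate=0.1, seed=0):
    """Return (selected sentence indices, best fitness per generation).
    Assumes at least 2 sentences and pop_size >= 3."""
    rng = random.Random(seed)
    n = len(sentences)

    def score(mask):
        return fitness(mask, sentences, budget)

    population = [random_mask(n, budget, rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        next_gen = [population[0][:]]  # elitism: carry the best mask forward
        while len(next_gen) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    history.append(score(best))
    return [i for i, bit in enumerate(best) if bit], history

# Toy input: each sentence is the set of concept IDs it mentions (made-up data).
sentences = [{1, 2}, {2, 3}, {1, 2, 3}, {4, 5}, {5, 6}, {4, 6}]
selected, history = summarize(sentences, budget=3)
```

Because the initial population is feasible and the best mask is copied unchanged into each new generation, the best fitness never decreases and the returned selection always respects the sentence budget. A real system would replace the concept sets and the objective with the sentence weights and relevance scores the abstract mentions.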