Multiple documents summarization based on genetic algorithm

  • Authors:
  • Derong Liu;Yongcheng Wang;Chuanhan Liu;Zhiqi Wang

  • Affiliations:
  • Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University;Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University;Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University;Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University

  • Venue:
  • FSKD'06 Proceedings of the Third international conference on Fuzzy Systems and Knowledge Discovery
  • Year:
  • 2006

Abstract

As the volume of online information grows, automatically extracting the core content from many information sources becomes increasingly important. We propose a model for multi-document summarization that maximizes topic coverage and minimizes content redundancy. Based on a Chinese concept lexicon and corpus, the proposed model analyzes the topic of each document, the relationships among the documents, and the central theme of the collection in order to evaluate sentences. We present different approaches for determining which sentences are suitable for extraction, based on sentence weights and their relevance to the related documents. A genetic algorithm is designed to improve the quality of the summary. Experimental results indicate that using a genetic algorithm is useful and effective for improving the quality of multi-document summarization.
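
The abstract does not give the fitness function or the genetic operators, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each sentence is represented as a set of terms, that the collection's topics are a set of terms, and that fitness is topic coverage minus a weighted average pairwise Jaccard overlap. The names `fitness`, `repair`, `summarize`, and all parameters are hypothetical.

```python
import random


def jaccard(a, b):
    """Term-overlap similarity between two sets (assumed redundancy measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def fitness(mask, sentences, topics, redundancy_weight=0.5):
    """Topic coverage minus weighted average pairwise redundancy (an assumption)."""
    chosen = [s for s, bit in zip(sentences, mask) if bit]
    if not chosen:
        return 0.0
    coverage = len(set().union(*chosen) & topics) / len(topics)
    pairs = [(a, b) for i, a in enumerate(chosen) for b in chosen[i + 1:]]
    redundancy = sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return coverage - redundancy_weight * redundancy


def repair(mask, max_sentences, rng):
    """Unselect random sentences until the summary length limit holds."""
    selected = [i for i, bit in enumerate(mask) if bit]
    while len(selected) > max_sentences:
        mask[selected.pop(rng.randrange(len(selected)))] = 0
    return mask


def summarize(sentences, topics, max_sentences=2, pop_size=20,
              generations=40, mutation_rate=0.1, seed=0):
    """Evolve bit masks over sentences; return indices of the best mask found."""
    rng = random.Random(seed)
    n = len(sentences)  # the one-point crossover below needs n >= 2

    def score(mask):
        return fitness(mask, sentences, topics)

    population = [repair([rng.randint(0, 1) for _ in range(n)], max_sentences, rng)
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=score, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation, then enforce the length limit.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(repair(child, max_sentences, rng))
        population = parents + children
    best = max(population, key=score)
    return [i for i, bit in enumerate(best) if bit]


if __name__ == "__main__":
    toy_sentences = [{"economy", "growth"}, {"economy", "rate"},
                     {"weather", "storm"}, {"sports"}]
    toy_topics = {"economy", "growth", "weather", "storm"}
    print(summarize(toy_sentences, toy_topics))
```

On the toy data, selecting sentences 0 and 2 covers every topic term with no overlap, which is the kind of summary this fitness rewards. A real system would replace the term sets with the concept-lexicon features and sentence weights the paper describes.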