Multi-document Summarization Based on Cluster Using Non-negative Matrix Factorization

  • Authors:
  • Sun Park; Ju-Hong Lee; Deok-Hwan Kim; Chan-Min Ahn

  • Affiliations:
  • Department of Computer Science & Information Engineering, Inha University, Incheon, Korea; Department of Computer Science & Information Engineering, Inha University, Incheon, Korea; Department of Electronics Engineering, Inha University,; Department of Computer Science & Information Engineering, Inha University, Incheon, Korea

  • Venue:
  • SOFSEM '07 Proceedings of the 33rd conference on Current Trends in Theory and Practice of Computer Science
  • Year:
  • 2007

Abstract

This paper introduces a new summarization method that uses non-negative matrix factorization (NMF) and K-means clustering to extract meaningful sentences from multiple documents. The proposed method can improve the quality of document summaries for two reasons: the semantic features calculated by NMF reflect the inherent semantics of the documents well, and the semantic variables derived by NMF allow the sentences most relevant to the given topic to be extracted efficiently. In addition, it uses K-means clustering to remove noise, so that biased inherent semantics of the documents are not reflected in the summaries. We perform detailed experiments on the well-known DUC test dataset. The experimental results demonstrate that the proposed method performs better than other methods based on LSA, K-means, and NMF.
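
To make the roles of the semantic features and semantic variables concrete, the following is a minimal sketch and not the authors' implementation. It assumes a term-by-sentence matrix A that is factorized as A ≈ WH using the standard multiplicative update rules, with the columns of W read as semantic features and the rows of H as semantic variables. The function names (`nmf`, `select_sentences`, `frobenius_error`), the toy matrix, and the rule of picking the highest-weighted sentence per feature are all illustrative assumptions. The K-means noise-removal step and the topic-relevance scoring mentioned in the abstract are not shown, because the abstract does not specify them.

```python
import random


def transpose(X):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    """Multiply two list-of-lists matrices."""
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def nmf(A, r, iters=200, seed=0, eps=1e-9):
    """Factorize non-negative A (m x n) into W (m x r) and H (r x n).

    Uses multiplicative updates for the squared Frobenius error.
    eps only guards against division by zero.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(r)]
        # W <- W * (A H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(m)]
    return W, H


def frobenius_error(A, W, H):
    """Squared Frobenius norm of A - WH."""
    WH = matmul(W, H)
    return sum((a - b) ** 2
               for row_a, row_b in zip(A, WH)
               for a, b in zip(row_a, row_b))


def select_sentences(H):
    """Assumed selection rule: for each semantic feature, take the sentence
    with the largest semantic-variable weight, skipping duplicates."""
    picked = []
    for row in H:
        best = max(range(len(row)), key=row.__getitem__)
        if best not in picked:
            picked.append(best)
    return picked


# Toy term-by-sentence matrix (rows: terms, columns: sentences), made up
# for illustration; real input would be term weights from the documents.
A = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
]
W, H = nmf(A, r=2)
picked = select_sentences(H)  # column indices of the extracted sentences
```

Because W and H stay non-negative, each sentence is described as an additive mix of semantic features, which is what makes the weights in H directly usable for ranking sentences in a sketch like this one.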