Multi-document Summarization Based on Cluster Using Non-negative Matrix Factorization

Authors:
Sun Park;Ju-Hong Lee;Deok-Hwan Kim;Chan-Min Ahn
Affiliations:
Department of Computer Science & Information Engineering, Inha University, Incheon, Korea;Department of Computer Science & Information Engineering, Inha University, Incheon, Korea;Department of Electronics Engineering, Inha University;Department of Computer Science & Information Engineering, Inha University, Incheon, Korea
Venue:
SOFSEM '07 Proceedings of the 33rd conference on Current Trends in Theory and Practice of Computer Science
Year:
2007

Abstract

This paper introduces a new summarization method that uses non-negative matrix factorization (NMF) and K-means clustering to extract meaningful sentences from multiple documents. The method can improve summary quality in two ways: the semantic features computed by NMF reflect the inherent semantics of the documents, and the semantic variables derived by NMF make it efficient to extract the sentences most relevant to a given topic. In addition, it uses K-means clustering to remove noise, so that biased inherent semantics of the documents are not reflected in the summaries. We perform detailed experiments on the well-known DUC test dataset. The experimental results demonstrate that the proposed method performs better than other methods based on LSA, K-means, and NMF.
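
The abstract does not give the algorithm's details, so the following is only a rough sketch of how NMF and K-means could be combined for sentence extraction, not the authors' procedure. It assumes a raw term-by-sentence count matrix, Lee-Seung multiplicative updates for NMF, a noise rule that drops the sentences farthest from their K-means centroid, and a score equal to the sum of each sentence's semantic-variable weights. The function names (`nmf`, `kmeans`, `summarize`), the 75th-percentile cutoff, and the toy sentences are all hypothetical, and the sketch ranks sentences generically rather than against a given topic.

```python
import numpy as np

def nmf(A, r, iters=200, seed=0):
    """Factor A (terms x sentences) as W @ H with W, H >= 0.
    Uses Lee-Seung multiplicative updates (an assumed choice)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, r)) + 1e-3   # columns: semantic features
    H = rng.random((r, n)) + 1e-3   # columns: semantic variables per sentence
    eps = 1e-9                      # avoids division by zero
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's K-means on the rows of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def summarize(sentences, n_features=2, n_clusters=2, n_pick=2):
    # Build a term-by-sentence count matrix from whitespace tokens.
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[index[w], j] += 1.0
    # Assumed noise rule: drop sentences far from their cluster centroid.
    labels, centers = kmeans(A.T, n_clusters)
    dist = np.linalg.norm(A.T - centers[labels], axis=1)
    keep = np.where(dist <= np.percentile(dist, 75))[0]
    # Assumed scoring rule: total semantic-variable weight per sentence.
    W, H = nmf(A[:, keep], n_features)
    scores = H.sum(axis=0)
    top = keep[np.argsort(-scores)[:n_pick]]
    return [sentences[i] for i in sorted(top)]

docs = [
    "the storm closed the coastal highway on monday",
    "heavy rain from the storm flooded the highway",
    "officials reopened the highway after the storm passed",
    "the city council approved a new library budget",
    "the library budget funds longer opening hours",
    "a local bakery won a regional pastry award",
]
summary = summarize(docs)
print(summary)
```

Selected sentences are returned in their original order. A library implementation of NMF or K-means could replace the two helper functions without changing the overall flow.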