Multi-document summarization using weighted similarity between topic and clustering-based non-negative semantic feature

Authors:
Sun Park; Ju-Hong Lee; Deok-Hwan Kim; Chan-Min Ahn
Affiliations:
Dept. of Computer Science & Information Engineering, Inha University, Incheon, Korea; Dept. of Computer Science & Information Engineering, Inha University, Incheon, Korea; Dept. of Electronics Engineering, Inha University; Dept. of Computer Science & Information Engineering, Inha University, Incheon, Korea
Venue:
APWeb/WAIM'07 Proceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management
Year:
2007

Abstract

This paper presents a new multi-document summarization method that uses the weighted similarity between a topic and non-negative semantic features to extract meaningful sentences relevant to the given topic. The proposed method decomposes a sentence into a linear combination of sparse non-negative semantic features, so that a sentence can be represented as the sum of a few intuitively comprehensible semantic features. By using the weighted similarity measure between the topic and the semantic features, it avoids extracting sentences that have high similarity with the topic but are meaningless. Clustering the sentences removes noise, which keeps biased semantics of the documents from being reflected in the summaries. In addition, the method enhances the coherence of document summaries by arranging the extracted sentences in the order of their rank. Experimental results on DUC data show that the proposed method achieves better performance than the other methods.
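
The decomposition and scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact weighting formula, so the score used here (cosine similarity between the topic and each semantic feature, weighted by the sentence's coefficient for that feature) is an assumption, the factorization uses the standard multiplicative-update rules for non-negative matrix factorization, and the clustering step is omitted. The names `nmf`, `cosine`, and `rank_sentences` are hypothetical.

```python
import math
import random


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def nmf(A, r, iters=300, seed=0, eps=1e-9):
    """Factor a non-negative term-by-sentence matrix A (m x n) into
    W (m x r, semantic features) and H (r x n, per-sentence coefficients)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + eps for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + eps for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(r)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(m)]
    return W, H


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def rank_sentences(A, topic, r=2):
    """Return (sentence indices sorted best first, raw scores).
    ASSUMPTION: score_j = sum_k cosine(topic, feature_k) * H[k][j];
    the paper's actual weighted similarity may differ."""
    W, H = nmf(A, r)
    features = transpose(W)  # one row per semantic feature, over terms
    sims = [cosine(topic, f) for f in features]
    n = len(A[0])
    scores = [sum(sims[k] * H[k][j] for k in range(r)) for j in range(n)]
    order = sorted(range(n), key=lambda j: -scores[j])
    return order, scores


# Toy data: rows are terms, columns are sentences. Sentences 0-1 use
# terms 0-1; sentences 2-3 use terms 2-3. The topic mentions terms 0-1.
A_demo = [[2.0, 1.0, 0.0, 0.0],
          [2.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 3.0],
          [0.0, 0.0, 1.0, 3.0]]
order_demo, scores_demo = rank_sentences(A_demo, [1.0, 1.0, 0.0, 0.0], r=2)
print(order_demo)
```

Because each sentence is expressed through only a few non-negative features, a sentence scores highly only when the features it loads on are themselves similar to the topic, which is the intuition the abstract gives for filtering out sentences that match the topic superficially.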