Significant sentence extraction by Euclidean distance based on singular value decomposition

  • Authors:
  • Changbeom Lee;Hyukro Park;Cheolyoung Ock

  • Affiliations:
  • School of Computer Engineering & Information Technology, University of Ulsan, Ulsan, South Korea;Department of Computer Science, Chonnam National University, Kwangju, South Korea;School of Computer Engineering & Information Technology, University of Ulsan, Ulsan, South Korea

  • Venue:
  • IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
  • Year:
  • 2005

Abstract

This paper describes an automatic summarization approach that constructs a summary by extracting the significant sentences of a document. The approach relies only on the co-occurrence relationships between terms within the document itself. Two techniques are used: principal component analysis (PCA) to extract the significant terms, and singular value decomposition (SVD) to identify the significant sentences. PCA quantifies both term frequency and term-term relationships in the document through its eigenvalue-eigenvector pairs. SVD decomposes the sentence-term matrix into sentence-concentrated and term-concentrated matrices of appropriate dimension, which are used to compute Euclidean distances between sentence and term vectors; the decomposition also removes noise caused by variability in term usage. Experimental results on Korean newspaper articles show that the proposed method is preferable to random sentence selection and to PCA alone when summarization is the goal.
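
The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the term weighting, the rule for choosing significant terms, the number of retained dimensions, or the exact scoring formula. The sketch assumes raw term counts, picks terms by the magnitude of their loading on the leading principal component, scales both sentence and term vectors by the singular values, and ranks sentences by their summed Euclidean distance to the selected term vectors. The function names (`sentence_term_matrix`, `significant_terms`, `rank_sentences`) and the parameters `n_terms` and `k` are hypothetical.

```python
import numpy as np


def sentence_term_matrix(sentences):
    """Build a sentence-by-term count matrix from whitespace-tokenized text."""
    vocab = sorted({w for s in sentences for w in s.split()})
    index = {w: j for j, w in enumerate(vocab)}
    A = np.zeros((len(sentences), len(vocab)))
    for i, s in enumerate(sentences):
        for w in s.split():
            A[i, index[w]] += 1.0
    return A, vocab


def significant_terms(A, n_terms):
    """Assumed selection rule: terms with the largest absolute loading
    on the leading principal component of the term-term covariance."""
    cov = np.cov(A, rowvar=False)           # terms are columns
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    leading = eigvecs[:, -1]                # eigenvector of largest eigenvalue
    return np.argsort(-np.abs(leading))[:n_terms]


def rank_sentences(A, term_ids, k):
    """Rank sentences by summed Euclidean distance to the chosen term
    vectors in a rank-k SVD space (smaller distance ranks first)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    sent_vecs = U[:, :k] * s[:k]            # sentence vectors, one per row
    term_vecs = Vt[:k, :].T * s[:k]         # term vectors, one per row
    diffs = sent_vecs[:, None, :] - term_vecs[term_ids][None, :, :]
    dists = np.linalg.norm(diffs, axis=2)   # shape: (sentences, chosen terms)
    return np.argsort(dists.sum(axis=1))
```

Truncating to `k` dimensions is what plays the noise-reduction role mentioned in the abstract; `k` must not exceed the smaller of the number of sentences and the number of distinct terms, and the covariance step needs at least two sentences.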