Korean text summarization using an aggregate similarity

  • Authors:
  • Jae-Hoon Kim; Joon-Hong Kim; Dosam Hwang

  • Affiliations:
  • Department of Computer Engineering, Korea Maritime University, 1 Dongsam-dong, Yeongdo-gu, Pusan, 606-791, Korea and Advanced Information Technology Research Center (AITrc), 373-1, Kusong-dong, Yu ...;-;Department of Computer Engineering, Yeungnam University, 214-1, Daedong, Kyongsan, Kyongbuk, 712-749, Korea and Advanced Information Technology Research Center (AITrc), 373-1, Kusong-dong, Yusong- ...

  • Venue:
  • IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
  • Year:
  • 2000

Abstract

In this paper, each document is represented by a weighted graph called a text relationship map. In the graph, each node represents a sentence as a vector of its nouns, an undirected link connects two nodes if the corresponding sentences are semantically related, and the weight on a link is the similarity between the two sentences. The similarity is computed as the inner product of the two sentence vectors, so it reflects the word overlap between the corresponding sentences. The importance of a node on the map, called its aggregate similarity, is defined as the sum of the weights on the links connecting it to other nodes on the map. We present a Korean text summarization system that uses the aggregate similarity. To evaluate the system, we used two test collections: PAPER-InCon, which consists of 100 papers in the computer science domain, and NEWS, which consists of 105 newspaper articles. At a compression rate of 20%, the system achieved a recall of 46.6% (PAPER-InCon) and 30.5% (NEWS), and a precision of 76.9% (PAPER-InCon) and 42.3% (NEWS). Experiments show that our system outperforms two commercial systems.
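
The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: nouns are taken as already extracted (Korean morphological analysis is out of scope here), vector elements are raw noun counts, two sentences count as "semantically related" whenever their similarity exceeds a hypothetical `threshold` of 0, and the summary keeps the top-scoring sentences in document order. The function names `similarity`, `aggregate_similarities`, and `summarize` are likewise hypothetical.

```python
from collections import Counter


def similarity(u, v):
    """Inner product of two noun-count vectors (assumed weighting: raw counts)."""
    return sum(count * v[noun] for noun, count in u.items() if noun in v)


def aggregate_similarities(sentences, threshold=0):
    """Score each sentence by the sum of the weights on its links.

    `sentences` is a list of noun lists, one per sentence. A link exists
    between two sentences when their similarity exceeds `threshold`
    (an assumed criterion; the abstract does not specify one).
    """
    vectors = [Counter(nouns) for nouns in sentences]
    scores = [0.0] * len(vectors)
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            weight = similarity(vectors[i], vectors[j])
            if weight > threshold:
                # The link is undirected, so it contributes to both nodes.
                scores[i] += weight
                scores[j] += weight
    return scores


def summarize(sentences, compression_rate=0.2):
    """Return indices of the selected sentences, in document order."""
    scores = aggregate_similarities(sentences)
    k = max(1, round(len(sentences) * compression_rate))
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return sorted(ranked[:k])


# Toy document: each inner list holds the nouns of one sentence.
doc = [
    ["text", "summary", "graph"],
    ["graph", "node", "sentence"],
    ["node", "weight", "link"],
    ["summary", "text", "system"],
    ["weather", "rain"],
]
print(aggregate_similarities(doc))
print(summarize(doc, compression_rate=0.2))
```

In this toy document the first sentence shares nouns with two others and so receives the highest aggregate similarity, while the last sentence shares none and scores zero.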