Korean text summarization using an aggregate similarity

Authors:
Jae-Hoon Kim;Joon-Hong Kim;Dosam Hwang
Affiliations:
Department of Computer Engineering, Korea Maritime University, 1 Dongsam-dong, Yeongdo-gu, Pusan, 606-791, Korea and Advanced Information Technology Research Center (AITrc), 373-1, Kusong-dong, Yu ...;-;Department of Computer Engineering, Yeungnam University, 214-1, Daedong, Kyongsan, Kyongbuk, 712-749, Korea and Advanced Information Technology Research Center (AITrc), 373-1, Kusong-dong, Yusong- ...
Venue:
IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
Year:
2000

Abstract

In this paper, each document is represented by a weighted graph called a text relationship map. In the graph, each node represents a vector of the nouns in a sentence, an undirected link connects two nodes if the two sentences are semantically related, and the weight on the link is the similarity between the pair of sentences. The similarity is based on the word overlap between the two sentences and can be computed as the inner product of their vectors. The importance of a node on the map, called its aggregate similarity, is defined as the sum of the weights on the links connecting it to other nodes on the map. We present a Korean text summarization system that uses the aggregate similarity. To evaluate the system, we used two test collections: one (PAPER-InCon) consists of 100 papers in the domain of computer science; the other (NEWS) is composed of 105 newspaper articles. At a compression rate of 20%, we achieved a recall of 46.6% (PAPER-InCon) and 30.5% (NEWS), and a precision of 76.9% (PAPER-InCon) and 42.3% (NEWS). Experiments show that our system outperforms two commercial systems.
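
The aggregate similarity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each sentence has already been reduced to a list of nouns (the Korean morphological analysis is not shown), that vector elements are raw noun counts, and that two sentences count as related whenever their inner product exceeds a hypothetical `threshold` parameter. The function names and the rounding rule used for the compression rate are also illustrative choices.

```python
from collections import Counter


def inner_product(u, v):
    """Inner product of two sparse noun-count vectors."""
    return sum(count * v[noun] for noun, count in u.items() if noun in v)


def aggregate_similarities(sentences, threshold=0.0):
    """Return one aggregate similarity score per sentence.

    `sentences` is a list of noun lists. A link is assumed to exist
    between two sentences only when their similarity exceeds `threshold`
    (an assumption; the abstract only says "semantically related").
    """
    vectors = [Counter(nouns) for nouns in sentences]
    scores = [0.0] * len(vectors)
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            weight = inner_product(vectors[i], vectors[j])
            if weight > threshold:
                # The link is undirected, so it contributes to both nodes.
                scores[i] += weight
                scores[j] += weight
    return scores


def summarize(sentences, compression_rate=0.2):
    """Pick the highest-scoring sentences, returned in document order."""
    scores = aggregate_similarities(sentences)
    # Assumed rounding rule: keep at least one sentence.
    k = max(1, round(compression_rate * len(sentences)))
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


# Toy document: five sentences, already reduced to nouns.
doc = [
    ["summarization", "system"],
    ["system", "similarity", "graph"],
    ["graph", "node"],
    ["weather"],
    ["similarity", "system"],
]
print(aggregate_similarities(doc))  # [2.0, 4.0, 1.0, 0.0, 3.0]
print(summarize(doc))               # [1]
```

In the toy document, the second sentence shares nouns with three others and therefore receives the highest aggregate similarity, so it is the single sentence kept at a 20% compression rate. The isolated sentence about weather has no links and scores zero.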