In this paper, each document is represented by a weighted graph called a text relationship map. In the graph, each node represents a vector of the nouns in a sentence, an undirected link connects two nodes if the corresponding sentences are semantically related, and the weight on a link is the similarity between the pair of sentences. This similarity is computed as the inner product of the corresponding vector elements, and thus reflects the word overlap between the two sentences. The importance of a node on the map, called its aggregate similarity, is defined as the sum of the weights on the links connecting it to the other nodes on the map. We present a Korean text summarization system based on this aggregate similarity. To evaluate our system, we used two test collections: one (PAPER-InCon) consists of 100 papers in the domain of computer science; the other (NEWS) is composed of 105 newspaper articles. At a compression rate of 20%, we achieved a recall of 46.6% (PAPER-InCon) and 30.5% (NEWS), and a precision of 76.9% (PAPER-InCon) and 42.3% (NEWS). Experiments show that our system outperforms two commercial systems.
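The scoring scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of term-frequency noun vectors, and the link threshold of 1 shared noun are all assumptions; the paper's own similarity measure and relatedness criterion may differ in detail.

```python
from collections import Counter

def noun_vector(sentence_nouns):
    # Term-frequency vector over the nouns of one sentence
    # (representation assumed for illustration).
    return Counter(sentence_nouns)

def similarity(v1, v2):
    # Inner product between corresponding vector elements,
    # which measures word overlap between the two sentences.
    return sum(v1[w] * v2[w] for w in v1 if w in v2)

def aggregate_similarities(sentences, threshold=1):
    # sentences: list of noun lists, one per sentence.
    # A link is drawn between two nodes only if their similarity
    # reaches `threshold` (a hypothetical relatedness criterion).
    vectors = [noun_vector(s) for s in sentences]
    n = len(vectors)
    scores = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            w = similarity(vectors[i], vectors[j])
            if w >= threshold:
                # The link weight contributes to the aggregate
                # similarity of both endpoints.
                scores[i] += w
                scores[j] += w
    return scores

def summarize(sentences, compression=0.2):
    # Extract the top-scoring sentences under the given
    # compression rate, preserving original sentence order.
    scores = aggregate_similarities(sentences)
    k = max(1, int(len(sentences) * compression))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

In this sketch, a sentence that shares nouns with many other sentences accumulates a high aggregate similarity and is therefore selected for the summary.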