Context-based Hierarchical Clustering for the Ontology Learning
WI '06 Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence
ACM Transactions on Asian Language Information Processing (TALIP)
This paper presents a system that creates a KWIC (keyword-in-context) index of Japanese WWW texts by automatic summarization. The system consists of three modules: a WWW spider, an important-sentence extractor, and a sentence summarizer. The most effective module is the last one, which employs KNP, a robust and fairly accurate Japanese parser. This module segments an input sentence into phrases or simple sentences and assembles a summary. The accuracy of the important-sentence extractor was 62.8%, and that of the sentence summarizer was 76.5%.
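
To make the notion of a KWIC index concrete, the following is a minimal sketch of how such an index might be built over summarized sentences. It is not the system described in the paper: the function name `kwic_index`, its parameters, and the whitespace tokenization are assumptions made for illustration only. Japanese text is not whitespace-delimited, so a real system would need a morphological analyzer or parser in place of `str.split`.

```python
def kwic_index(sentences, stopwords=frozenset(), width=30):
    """Build a simple keyword-in-context index (illustrative sketch).

    Assumes each sentence is already tokenizable by whitespace, which
    does not hold for raw Japanese text.
    Returns a list of (keyword, left_context, token, right_context,
    sentence_id) tuples sorted by keyword, then by sentence id.
    """
    entries = []
    for sent_id, sentence in enumerate(sentences):
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            # Normalize the token so it can serve as a sort key.
            keyword = token.strip(".,;:!?\"'()").lower()
            if not keyword or keyword in stopwords:
                continue
            # Keep at most `width` characters of context on each side.
            left = " ".join(tokens[:i])[-width:]
            right = " ".join(tokens[i + 1:])[:width]
            entries.append((keyword, left, token, right, sent_id))
    # Sorting by keyword groups all occurrences of a word together.
    entries.sort(key=lambda e: (e[0], e[4]))
    return entries
```

Each entry keeps the keyword together with a bounded window of its left and right context, so that a reader scanning the sorted index sees every occurrence of a word alongside the text surrounding it.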