This paper presents a method for automatically generating an association thesaurus from a text corpus and demonstrates its application to information retrieval. The method extracts terms and co-occurrence data from the corpus and statistically analyzes the correlation between terms. The paper also proposes a new method for disambiguating the structure of compound nouns, a key component of term extraction. The automatically generated thesaurus serves as an effective tool for exploring information. Finally, the paper proposes a thesaurus navigator with novel functions such as term clustering, thesaurus overview, and zooming-in.
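As a rough illustration of the co-occurrence step described above, the following minimal sketch builds a small association thesaurus from pre-extracted terms. It rests on several assumptions not stated in the abstract: terms are already extracted (the compound-noun disambiguation step is not modeled), co-occurrence is counted per document, and pointwise mutual information (PMI) stands in for the paper's unspecified correlation statistic. The function name `build_association_thesaurus` and the `min_count` parameter are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_association_thesaurus(documents, min_count=1):
    """Map each term to a list of (related_term, pmi), strongest first.

    `documents` is a list of term lists. Term extraction is assumed to
    have been done already; this is not the paper's extraction method.
    """
    term_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        terms = set(doc)  # count each term at most once per document
        term_counts.update(terms)
        # Sorting gives every unordered pair one canonical key.
        pair_counts.update(combinations(sorted(terms), 2))

    n_docs = len(documents)
    thesaurus = {}
    for (a, b), n_ab in pair_counts.items():
        if n_ab < min_count:
            continue
        # PMI is an assumed stand-in for the paper's correlation measure:
        # log2( P(a, b) / (P(a) * P(b)) ) with document-level probabilities.
        pmi = math.log2(n_ab * n_docs / (term_counts[a] * term_counts[b]))
        if pmi > 0:  # keep only terms that co-occur more than chance predicts
            thesaurus.setdefault(a, []).append((b, pmi))
            thesaurus.setdefault(b, []).append((a, pmi))

    for related in thesaurus.values():
        related.sort(key=lambda item: (-item[1], item[0]))
    return thesaurus


if __name__ == "__main__":
    sample_documents = [  # toy data for illustration only
        ["information", "retrieval", "thesaurus"],
        ["information", "retrieval"],
        ["compound", "noun"],
        ["compound", "noun", "thesaurus"],
    ]
    for term, related in sorted(build_association_thesaurus(sample_documents).items()):
        print(term, related)
```

Raw PMI tends to overrate rare pairs, so a practical version would likely raise `min_count` or substitute a more robust statistic such as a log-likelihood ratio.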