Selection of effective contextual information for automatic synonym acquisition

Authors:
Masato Hagiwara;Yasuhiro Ogawa;Katsuhiko Toyama
Affiliations:
Nagoya University, Chikusa-ku, Nagoya, Japan;Nagoya University, Chikusa-ku, Nagoya, Japan;Nagoya University, Chikusa-ku, Nagoya, Japan
Venue:
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Year:
2006

Abstract

Various methods have been proposed for automatic synonym acquisition, since synonyms are among the most fundamental kinds of lexical knowledge. Whereas many methods rely on contextual clues of words, little attention has been paid to which categories of contextual information are useful for this purpose. This study experimentally investigates the impact of contextual information selection by extracting three kinds of word relationships from corpora: dependency, sentence co-occurrence, and proximity. The evaluation results show that while dependency and proximity perform relatively well on their own, combining two or more kinds of contextual information gives more stable performance. We further investigate which dependency relations and modification categories are useful, and find that modification makes the greatest contribution, even greater than the widely adopted subject-object combination.
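
To make the three kinds of contextual information concrete, the following is a minimal Python sketch of how such context features might be collected and combined. It is an illustration only, not the authors' implementation: the function names, the window size, the use of raw counts, and cosine similarity are all assumptions, since the abstract does not specify the weighting or similarity model. Dependency triples are assumed to come from some external parser and are passed in as plain tuples.

```python
from collections import Counter, defaultdict
from math import sqrt


def proximity_contexts(sentences, window=2):
    """Count neighbours within `window` tokens of each word (assumed window size)."""
    ctx = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[word][("prox", tokens[j])] += 1
    return ctx


def sentence_contexts(sentences):
    """Use the id of each sentence a word occurs in as a context feature."""
    ctx = defaultdict(Counter)
    for sid, tokens in enumerate(sentences):
        for word in set(tokens):
            ctx[word][("sent", sid)] += 1
    return ctx


def dependency_contexts(triples):
    """Build features from (head, relation, dependent) triples.

    The triples are hypothetical input from any dependency parser.
    """
    ctx = defaultdict(Counter)
    for head, rel, dep in triples:
        ctx[dep][("dep", rel, head)] += 1
        ctx[head][("dep", rel + "-of", dep)] += 1
    return ctx


def combine(*context_maps):
    """Merge several context maps by adding their feature counts per word."""
    merged = defaultdict(Counter)
    for cmap in context_maps:
        for word, feats in cmap.items():
            merged[word].update(feats)
    return merged


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (an assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Words whose combined vectors score highly under `cosine` would be treated as synonym candidates. Tagging each feature with its source ("prox", "sent", "dep") keeps the three kinds of context separable, so any subset can be passed to `combine` to compare individual and combined settings.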