This paper presents a new method for building a large-scale hyponymy relation database by combining Wikipedia with other Web documents. We attach new words to a hyponymy database extracted from Wikipedia, using distributional similarity calculated from documents on the Web. For a given target word, our algorithm first finds the k most similar words in the Wikipedia database. The hypernyms of these k similar words are then scored according to both their distributional similarities and their hierarchical distances in the Wikipedia database. Finally, new hyponymy relations are output according to these scores. We tested two distributional similarity measures: one based on raw verb-noun dependencies (which we call "RVD"), and the other based on a large-scale clustering of verb-noun dependencies (called "CVD"). Using CVD, our method achieved an attachment accuracy of 91.0% for the top 10,000 relations and 74.5% for the top 100,000 relations, far better than the baseline approaches. Except among the very highest-scoring relations, CVD was more effective than RVD. We also confirmed that most relations extracted by our method cannot be extracted merely by applying the well-known lexico-syntactic patterns to Web documents.
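The attachment procedure described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the scoring formula, so the cosine similarity over dependency counts, the exponential `decay` applied to hierarchical distance, the single-parent hierarchy, and all names and toy data here are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse feature-count vectors (dicts)."""
    dot = sum(w * v[f] for f, w in u.items() if f in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ancestors(word, parent_of):
    """Yield (hypernym, distance) pairs while walking up the hierarchy."""
    seen = {word}
    distance = 1
    while word in parent_of and parent_of[word] not in seen:
        word = parent_of[word]
        seen.add(word)
        yield word, distance
        distance += 1


def rank_hypernyms(target_vec, vectors, parent_of, k=2, decay=0.5):
    """Score candidate hypernyms for a target word.

    Assumed scoring (hypothetical, not from the paper): each of the k most
    similar known words votes for its ancestors, weighted by its similarity
    to the target and discounted by decay ** (distance - 1).
    """
    neighbours = sorted(
        ((cosine(target_vec, vec), word) for word, vec in vectors.items()),
        reverse=True,
    )[:k]
    scores = defaultdict(float)
    for sim, word in neighbours:
        for hypernym, distance in ancestors(word, parent_of):
            scores[hypernym] += sim * decay ** (distance - 1)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy verb-noun dependency counts for words already in the hierarchy.
vectors = {
    "beagle": {"walk:obj": 3, "feed:obj": 2},
    "poodle": {"walk:obj": 2, "feed:obj": 2, "groom:obj": 1},
    "sedan": {"drive:obj": 4, "park:obj": 2},
}
# Toy hierarchy: each word maps to a single hypernym.
parent_of = {
    "beagle": "dog",
    "poodle": "dog",
    "dog": "animal",
    "sedan": "car",
    "car": "vehicle",
}
# Dependency counts observed on the Web for a new, unattached word.
target_vec = {"walk:obj": 1, "feed:obj": 1}

ranked = rank_hypernyms(target_vec, vectors, parent_of, k=2)
```

With this toy data the two nearest neighbours are "beagle" and "poodle", so "dog" ranks above "animal" as the attachment point. In this sketch, moving from an RVD-style to a CVD-style similarity would only change how the feature vectors are built (for example, cluster memberships instead of raw dependency counts); the scoring loop would stay the same.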