Knowledge discovery of semantic relationships between words using nonparametric Bayesian graph model

Authors:
Issei Sato;Minoru Yoshida;Hiroshi Nakagawa
Affiliations:
The University of Tokyo, Tokyo, Japan;The University of Tokyo, Tokyo, Japan;The University of Tokyo, Tokyo, Japan
Venue:
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2008

Abstract

We developed a nonparametric Bayesian model for automatically discovering semantic relationships between words taken from a corpus. The model is aimed at discovering semantic knowledge about words in particular domains, which has become increasingly important with the growing use of text mining, information retrieval, and speech recognition. We take the subject-predicate structure as the syntactic unit, with a noun as the subject and a verb as the predicate, and regard the subject-predicate structures extracted from a corpus as a graph. The generation of this graph can be modeled using the hierarchical Dirichlet process and the Pitman-Yor process, and we developed a probabilistic generative model of the graph on that basis. An evaluation that measured the performance of graph clustering based on WordNet similarities demonstrated that the model outperforms the baseline models.
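
To make the graph view of subject-predicate structures concrete, here is a minimal sketch in Python. It is an illustration only and not the authors' implementation. The example pairs and the `build_graph` helper are hypothetical, and the weighted noun-to-verb adjacency map is an assumed representation, since the abstract does not specify one. The sketch covers only how extracted pairs could be collected into a graph, not the generative model itself.

```python
from collections import defaultdict

# Hypothetical (subject noun, predicate verb) pairs. A real pipeline would
# obtain these by parsing the corpus; the values here are made up.
pairs = [
    ("dog", "bark"),
    ("dog", "run"),
    ("cat", "run"),
    ("cat", "sleep"),
    ("dog", "bark"),
]


def build_graph(pairs):
    """Return a weighted bipartite graph as {noun: {verb: count}}."""
    graph = defaultdict(lambda: defaultdict(int))
    for noun, verb in pairs:
        # Each observed pair adds one unit of weight to the noun-verb edge.
        graph[noun][verb] += 1
    # Convert to plain dicts so lookups of unseen keys raise KeyError.
    return {noun: dict(verbs) for noun, verbs in graph.items()}


graph = build_graph(pairs)
for noun, verbs in sorted(graph.items()):
    print(noun, verbs)
```

In this representation, nouns that share many verbs (here "dog" and "cat" both connect to "run") are the kind of candidates a graph clustering step could group together.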