Probabilistic word vector and similarity based on dictionaries

Authors:
Satoshi Suzuki
Affiliations:
NTT Communication Science Laboratories, NTT, Japan
Venue:
CICLing'03 Proceedings of the 4th international conference on Computational linguistics and intelligent text processing
Year:
2003

Citing 2
Cited 0

Probabilistic latent semantic indexing

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval

Why inverse document frequency?

NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies

Abstract

We propose a new method for computing probabilistic vector expressions of words from dictionaries. The method provides a well-founded procedure, based on a stochastic process, whose applicability is clear. The proposed method exploits the relationship between headwords and their explanatory notes in dictionaries. An explanatory note is a set of other words, each of which is in turn expanded by its own explanatory note. This expansion is applied repeatedly, and even infinitely expanded explanatory notes can be computed under a simple assumption. The vector expression we obtain is a semantic expansion of the explanatory notes of words. We explain how to acquire the vector expression from these expanded explanatory notes. We also demonstrate a word similarity computation based on a Japanese dictionary and evaluate it against a known system based on TF·IDF. The results show the effectiveness and applicability of this probabilistic vector expression.
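
The abstract does not state the "simple assumption" or the exact formulas, so the following is only an illustrative sketch of how repeated expansion of explanatory notes could yield a probability vector per headword. It assumes (this is not taken from the paper) that at each step the process stops at the current word with probability `1 - continue_prob` and otherwise moves to a uniformly chosen word of its explanatory note; under that assumption the infinite expansion converges. The toy dictionary, the function names `word_vectors` and `cosine`, and the use of cosine similarity are all hypothetical choices made for illustration.

```python
from math import sqrt

# Hypothetical toy dictionary: headword -> words in its explanatory note.
# Every word used in a note must itself be a headword.
TOY_DICTIONARY = {
    "cat": ["animal", "pet"],
    "dog": ["animal", "pet"],
    "pet": ["animal"],
    "animal": ["life"],
    "life": ["animal"],
    "car": ["machine"],
    "machine": ["car"],
}


def word_vectors(dictionary, continue_prob=0.5, iterations=200):
    """Approximate the infinitely expanded note of each headword.

    Assumption (not from the paper): expansion stops at the current word
    with probability 1 - continue_prob, otherwise it continues into a
    uniformly chosen word of the explanatory note.
    """
    words = sorted(dictionary)
    index = {w: i for i, w in enumerate(words)}
    n = len(words)

    def unit(word):
        vec = [0.0] * n
        vec[index[word]] = 1.0
        return vec

    # Start from the unexpanded state: each word is only itself.
    vectors = {w: unit(w) for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            note = dictionary[w]
            if not note:
                # Nothing to expand: the process stays on the headword.
                updated[w] = unit(w)
                continue
            vec = [(1.0 - continue_prob) * x for x in unit(w)]
            share = continue_prob / len(note)
            for u in note:
                for i, x in enumerate(vectors[u]):
                    vec[i] += share * x
            updated[w] = vec
        vectors = updated
    return vectors


def cosine(a, b):
    """Cosine similarity, used here as one possible similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


vectors = word_vectors(TOY_DICTIONARY)
```

With this toy data, `cosine(vectors["cat"], vectors["dog"])` is larger than `cosine(vectors["cat"], vectors["car"])`, because "cat" and "dog" share an explanatory note while "cat" and "car" expand into disjoint sets of words. Each vector sums to 1, so it can be read as a probability distribution over headwords.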