Probabilistic word vector and similarity based on dictionaries

  • Authors:
  • Satoshi Suzuki

  • Affiliations:
  • NTT Communication Science Laboratories, NTT, Japan

  • Venue:
  • CICLing'03 Proceedings of the 4th international conference on Computational linguistics and intelligent text processing
  • Year:
  • 2003

Abstract

We propose a new method for computing a probabilistic vector expression of words based on dictionaries. The method provides a well-founded procedure, based on a stochastic process, whose applicability is clear. It exploits the relationship between headwords and their explanatory notes in dictionaries. An explanatory note is a set of other words, each of which is in turn expanded by its own explanatory note. This expansion is applied repeatedly, and even infinitely expanded explanatory notes can be computed under a simple assumption. The vector expression we obtain is a semantic expansion of the explanatory notes of words. We explain how to acquire the vector expression from these expanded explanatory notes. We also demonstrate a word similarity computation based on a Japanese dictionary and evaluate it against a known system based on TF·IDF. The results show the effectiveness and applicability of this probabilistic vector expression.
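
The repeated expansion of explanatory notes can be pictured with a small sketch. The abstract does not say what the "simple assumption" is, so the code below substitutes one for illustration only: at every step the expansion stops with a fixed probability `stop`, and otherwise moves to a uniformly chosen word of the current note. The toy dictionary, the function names `expand` and `cosine`, and the `stop` parameter are all hypothetical and are not taken from the paper.

```python
from math import sqrt

# Toy dictionary (hypothetical data): headword -> words in its explanatory note.
# Every word used in a note is itself a headword, so expansion never dead-ends.
TOY_DICT = {
    "cat": ["animal", "pet"],
    "dog": ["animal", "pet"],
    "pet": ["animal", "home"],
    "animal": ["life"],
    "home": ["place", "life"],
    "life": ["life"],
    "place": ["place"],
}


def expand(headword, dictionary, stop=0.5, iters=200):
    """Return a probability vector over words for one headword.

    Assumption (not from the paper): after each expansion step the process
    stops with probability `stop`; otherwise it expands the word it landed on.
    The infinite expansion is approximated by `iters` steps.
    """
    words = sorted(dictionary)
    current = {w: 0.0 for w in words}  # mass that is still being expanded
    current[headword] = 1.0
    result = {w: 0.0 for w in words}   # mass that has settled on each word
    for _ in range(iters):
        reached = {w: 0.0 for w in words}
        for w, mass in current.items():
            if mass == 0.0:
                continue
            note = dictionary[w]
            for u in note:
                reached[u] += mass / len(note)  # uniform choice within the note
        for w in words:
            result[w] += stop * reached[w]            # this share settles here
            current[w] = (1.0 - stop) * reached[w]    # the rest keeps expanding
    return result


def cosine(p, q):
    """Cosine similarity between two word vectors stored as dicts."""
    dot = sum(p[w] * q[w] for w in p)
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)


vectors = {w: expand(w, TOY_DICT) for w in TOY_DICT}
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_place = cosine(vectors["cat"], vectors["place"])
```

In this toy setting "cat" and "dog" have identical notes and so receive identical vectors, while "cat" and "place" overlap only through longer expansion chains. Under the stated stopping assumption, each step contributes geometrically less mass, so the sum over infinitely many expansions converges and a finite number of iterations approximates it closely.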