Evaluation of analogical proportions through Kolmogorov complexity

  • Authors:
  • Meriam Bayoudh;Henri Prade;Gilles Richard

  • Affiliations:
  • Centre IRD de Guyane, Route de Montabo BP165, 97323 Cayenne CEDEX, France;IRIT, 118 Route de Narbonne, 31062 Toulouse Cedex 9, France;IRIT, 118 Route de Narbonne, 31062 Toulouse Cedex 9, France

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2012

Abstract

In this paper, we try to identify analogical proportions, i.e., statements of the form "a is to b as c is to d", expressed in linguistic terms. While it is conceivable to use an algebraic model for testing proportions such as "2 is to 4 as 5 is to 10", or even "read is to reader as lecture is to lecturer", there is no algebraic framework that supports statements such as "engine is to car as heart is to human" or "wine is to France as beer is to England" and helps to recognize them as meaningful analogical proportions. The idea is then to rely on text corpora, or even on the Web itself, where one may expect to find the pragmatics and the semantics of words in their common use. In that context, in order to attach a numerical value to the "analogical ratio" corresponding to the phrase "a is to b", we start from Kolmogorov's work on complexity theory. This is the basis for a universal measure of the information content of a word a, or of a word a with respect to another word b, which, in practice, is estimated in a statistical manner. We investigate the link between a purely logical, recently introduced view of analogical proportions and its counterpart based on Kolmogorov theory. The criteria proposed for testing candidate proportions fit the expected properties (symmetry, central permutation) of analogical proportions. This leads to a new computational method to define, and ultimately to try to detect, analogical proportions in natural language. Experiments with classifiers based on these ideas are reported, and the results are rather encouraging with respect to the recognition of common-sense linguistic analogies. The approach is also compared with existing work on similar problems.
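
To make the statistical estimation mentioned in the abstract more concrete, here is a minimal Python sketch. It is an illustration only, not the authors' method: it assumes that the information content of a word can be approximated by the negative log of its relative document frequency, and that the information of a word a with respect to a word b can be approximated from co-occurrence counts. The counts, the function names (info, cond_info, mismatch) and the mismatch score itself are all hypothetical choices made for this example.

```python
import math

# Toy, made-up counts (NOT data from the paper): number of documents that
# contain each word, and number that contain both words of a pair.
TOTAL_DOCS = 1000
DOC_COUNT = {"engine": 100, "car": 200, "heart": 80, "human": 160}
JOINT_COUNT = {
    frozenset(("engine", "car")): 50,
    frozenset(("heart", "human")): 40,
}


def info(word):
    """Assumed estimate of the information content of a word, in bits."""
    return -math.log2(DOC_COUNT[word] / TOTAL_DOCS)


def cond_info(word, given):
    """Assumed estimate of the information in `word` once `given` is known."""
    joint = JOINT_COUNT[frozenset((word, given))]
    return -math.log2(joint / DOC_COUNT[given])


def mismatch(a, b, c, d):
    """Hypothetical score for 'a is to b as c is to d' (0 means a perfect match).

    It compares the two conditional estimates for the pair (a, b) with the
    corresponding estimates for the pair (c, d).
    """
    forward = abs(cond_info(b, a) - cond_info(d, c))
    backward = abs(cond_info(a, b) - cond_info(c, d))
    return forward + backward


if __name__ == "__main__":
    print(info("engine"))
    print(mismatch("engine", "car", "heart", "human"))
    print(mismatch("engine", "car", "human", "heart"))
```

With these toy counts, "engine is to car as heart is to human" scores 0, while reversing the second pair gives a strictly positive score. By construction the score is unchanged when the two pairs are swapped, which mirrors the symmetry property; this toy score does not by itself guarantee central permutation, which would need co-occurrence counts for the pairs (a, c) and (b, d) as well.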