A Language-Based Similarity Measure

  • Authors:
  • Lionel Martin;Frédéric Moal

  • Affiliations:
  • -;-

  • Venue:
  • EMCL '01 Proceedings of the 12th European Conference on Machine Learning
  • Year:
  • 2001

Quantified Score

Hi-index 0.01

Visualization

Abstract

This paper presents an unified framework for the definition of similarity measures for various formalisms (attribute-value, first order logic...). The underlying idea is that the similarity between two objects does not depend only on the attribute values of the objects, but more especially on the set of the potentially relevant definitions of concepts for the problem considered. In our framework, the user defines a language with a grammar to specify the similarity measure. Each term of the language represents a property of the objects. The similarity between two objects is the probability that these two objects both satisfy or both reject simultaneously the properties of the given language. When this probability is not computable, we use a stochastic generation procedure to approximate it. This measure can be applied for both clustering and classification tasks. The empirical evaluation on common classification problems shows a very good accuracy.