Properties of possibilistic string comparison

  • Authors:
  • Antoon Bronselaer;Guy De Tré

  • Affiliations:
  • Department of Telecommunications and Information Processing, Ghent University, Ghent, Belgium;Department of Telecommunications and Information Processing, Ghent University, Ghent, Belgium

  • Venue:
  • IEEE Transactions on Fuzzy Systems
  • Year:
  • 2010

Abstract

The problem of detecting coreferent objects of arbitrary complexity is a challenging topic in current research. A possibilistic solution to this problem is to treat it as an uncertain Boolean problem. This means that two objects are either coreferent or not (i.e., a Boolean matter), but uncertainty about this decision must be dealt with. An operator that determines the uncertainty about the coreference of two objects is called an evaluator. When we deal with structured objects, decomposition into attributes (i.e., atomic subobjects) allows the definition of evaluators on well-known subdomains. This paper continues previous research on evaluators for strings, a widely used data type for attributes. First of all, the Sugeno integral based on the framework of conditional necessity is shown to be related to the existing technique. More specifically, a special case of this Sugeno integral is equivalent to the regular conjunction of transformed possibilistic truth values, which is used by existing evaluators for strings. As a consequence, a subfamily of the existing evaluator is obtained for strings. This subfamily is shown to satisfy several interesting properties, which are used to construct an efficient optimization algorithm for string evaluators. Next, the use of a frequency filter is investigated. Finally, novel and advanced techniques such as interlevel information exchange and the use of multiple quantifiers are defined and investigated. A series of tests on diverse datasets shows the high accuracy and robustness of the approach introduced in this paper.
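
The abstract's remark that a special case of the Sugeno integral coincides with a conjunction can be illustrated with a minimal sketch of the general discrete Sugeno integral. This is not the paper's evaluator and does not reproduce its conditional-necessity framework: the function name `sugeno_integral`, the example scores, and the measure `all_required` are hypothetical and chosen only for illustration. The sketch assumes a plain fuzzy measure given as a Python callable on sets, and shows that a measure assigning full weight only to the whole set makes the integral collapse to the minimum, which is the standard conjunction on [0, 1].

```python
from typing import Callable, Dict, FrozenSet


def sugeno_integral(
    values: Dict[str, float],
    measure: Callable[[FrozenSet[str]], float],
) -> float:
    """Discrete Sugeno integral of `values` with respect to `measure`.

    Elements are sorted by ascending value; for each element the measure
    is evaluated on the set of elements whose value is at least as large.
    """
    ordered = sorted(values, key=values.get)
    result = 0.0
    for i, element in enumerate(ordered):
        upper_set = frozenset(ordered[i:])
        result = max(result, min(values[element], measure(upper_set)))
    return result


# Hypothetical per-element degrees in [0, 1] (illustrative values only).
scores = {"a": 0.7, "b": 0.4, "c": 0.9}
universe = frozenset(scores)


def all_required(subset: FrozenSet[str]) -> float:
    # Assumed measure: full weight only when every element is present.
    return 1.0 if subset == universe else 0.0


# With this measure the integral equals the minimum of the scores.
conjunction_like = sugeno_integral(scores, all_required)
```

Replacing `all_required` with a measure that returns 1.0 for every non-empty set yields the maximum instead, so the choice of measure moves the aggregation between conjunctive and disjunctive behaviour.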