Soft cardinality + ML: learning adaptive similarity functions for cross-lingual textual entailment

  • Authors:
  • Sergio Jimenez; Claudia Becerra; Alexander Gelbukh

  • Affiliations:
  • Universidad Nacional de Colombia, Bogota, Ciudad Universitaria, edificio, oficina; Universidad Nacional de Colombia, Bogota; CIC-IPN Av. Juan de Dios Bátiz, Av. Mendizábal, Col. Nueva Industrial Vallejo, DF, México

  • Venue:
  • SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
  • Year:
  • 2012

Abstract

This paper presents a novel approach for building adaptive similarity functions based on cardinality using machine learning. Unlike current approaches that build feature sets from similarity scores, we build them from the cardinalities of the commonalities and differences between the pair of objects being compared. This allows the machine-learning algorithm to learn an asymmetric similarity function suitable for directional judgments. In addition to classic set cardinality, we use soft cardinality to allow flexibility in the comparison between words. Our approach uses only surface information from the text, a stop-word remover, and a stemmer to address the cross-lingual textual entailment task (Task 8) at SemEval 2012. Our system obtained the third-best result among the 29 systems submitted by 10 teams. Additionally, this paper reports results that improve on the best official score.
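
The idea of cardinality-based features can be illustrated with a minimal sketch. The abstract does not give formulas, so everything below is an assumption: soft cardinality is taken in one commonly cited form, where each word contributes the reciprocal of its summed similarity to all words in the set; the word similarity is a character-bigram Dice coefficient chosen only for illustration; and the function names (`word_sim`, `soft_cardinality`, `cardinality_features`) and the exponent `p` are hypothetical. The example is monolingual and omits stop-word removal and stemming.

```python
def bigrams(word):
    """Character bigrams of a word padded with boundary markers."""
    padded = f"#{word}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def word_sim(a, b):
    """Dice coefficient over character bigrams (an illustrative choice)."""
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def soft_cardinality(words, p=1.0):
    """Assumed form: sum over words of 1 / sum of similarities to the set.

    Identical words count once; similar words count as less than two.
    """
    unique = list(set(words))
    return sum(
        1.0 / sum(word_sim(a, b) ** p for b in unique)
        for a in unique
    )


def cardinality_features(text_a, text_b, p=1.0):
    """Cardinalities of commonalities and differences for a pair of texts."""
    a, b = set(text_a), set(text_b)
    card_a = soft_cardinality(a, p)
    card_b = soft_cardinality(b, p)
    card_union = soft_cardinality(a | b, p)
    # Intersection is inferred from the union, as with classic sets.
    card_inter = card_a + card_b - card_union
    return {
        "A": card_a,
        "B": card_b,
        "A_and_B": card_inter,
        "A_minus_B": card_a - card_inter,  # material only in A
        "B_minus_A": card_b - card_inter,  # material only in B
    }


features = cardinality_features(
    ["the", "cat", "sleeps"],
    ["the", "cats", "sleep", "soundly"],
)
```

Because `A_minus_B` and `B_minus_A` are separate features, a classifier trained on such a vector can weight them differently, which is one way an asymmetric, directional similarity function could arise.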