Acquiring entailment pairs across languages and domains: a data analysis

  • Authors: Manaal Faruqui; Sebastian Padó
  • Affiliations: Indian Institute of Technology, Kharagpur, India; Universität Heidelberg, Heidelberg, Germany
  • Venue: IWCS '11 Proceedings of the Ninth International Conference on Computational Semantics
  • Year: 2011

Abstract

Entailment pairs are sentence pairs consisting of a premise and a hypothesis, where the premise textually entails the hypothesis. Such sentence pairs are important for the development of Textual Entailment systems. In this paper, we take a closer look at a prominent strategy for their automatic acquisition from newspaper corpora: pairing the first sentences of articles with their titles. We propose a simple logistic regression model that incorporates and extends this heuristic, and we investigate its robustness across three languages and three domains. We identify two predictors that predict entailment pairs with fairly high accuracy across all languages. However, we find that robustness across domains within a language is more difficult to achieve.
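
The title/first-sentence heuristic and a logistic regression filter on top of it can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' model: the abstract does not name the two predictors, so `coverage` and `length_ratio` are hypothetical stand-ins, the sentence splitter is deliberately naive, and the two articles are made-up toy inputs.

```python
import math

def candidate_pair(article):
    """Pair an article's first sentence (premise) with its title (hypothesis)."""
    # Naive sentence split on ". "; a real pipeline would use a proper splitter.
    first_sentence = article["body"].split(". ")[0].strip().rstrip(".")
    return first_sentence, article["title"].strip()

def features(premise, hypothesis):
    """Hypothetical predictors (assumption: not the ones used in the paper)."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    coverage = len(p & h) / len(h) if h else 0.0            # title words found in premise
    length_ratio = min(len(h) / len(p), 1.0) if p else 0.0  # relative title length
    return [1.0, coverage, length_ratio]                    # leading 1.0 is the bias term

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Estimated probability that the candidate is a true entailment pair."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def train(rows, labels, lr=0.5, epochs=500):
    """Plain stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights

# Toy, made-up articles for illustration only.
good = {"title": "Bank raises interest rates",
        "body": "The central bank raises interest rates on Monday. Analysts were surprised."}
bad = {"title": "A quiet revolution",
       "body": "Farmers in the valley have started planting earlier each year. Yields are up."}

rows = [features(*candidate_pair(a)) for a in (good, bad)]
weights = train(rows, labels=[1, 0])
```

After training, `predict(weights, features(premise, title))` gives a score that can be thresholded to keep or discard a candidate pair. The features here use only surface word overlap so that the same code can run on any whitespace-tokenized language.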