A machine learning approach for recognizing textual entailment in Spanish

Authors:
Julio Javier Castillo
Affiliations:
National University of Córdoba, Córdoba, Argentina
Venue:
YIWCALA '10 Proceedings of the NAACL HLT 2010 Young Investigators Workshop on Computational Approaches to Languages of the Americas
Year:
2010

Abstract

This paper presents a system that uses machine learning algorithms to recognize textual entailment in Spanish. The datasets used include the SPARTE Corpus and Spanish translations of the RTE3, RTE4 and RTE5 datasets. The chosen features quantify lexical, syntactic and semantic matching between the text and hypothesis sentences. We analyze how dataset size and the choice of classifier affect the overall performance of two-way RTE classification in Spanish. The RTE system achieves 60.83% accuracy, and a competitive result of 66.50% accuracy is reported when the training and test sets are both taken from the SPARTE Corpus with a 70% split.
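As an illustration of what a lexical matching feature between a text and a hypothesis might look like, the following minimal sketch computes a simple word-overlap score and a 70/30 split. It is not the system described in the abstract: the function names (`tokenize`, `lexical_overlap`, `split_70_30`), the overlap formula, the ordered (unshuffled) split, and the example sentence pairs are all assumptions introduced here, since the abstract does not specify the actual features or the splitting procedure.

```python
import re

def tokenize(sentence):
    """Lowercase and split into word tokens (accented Spanish letters are kept)."""
    return re.findall(r"\w+", sentence.lower())

def lexical_overlap(text, hypothesis):
    """Fraction of distinct hypothesis tokens that also occur in the text.

    Assumed feature for illustration; the paper's own features are not
    given in the abstract.
    """
    text_tokens = set(tokenize(text))
    hyp_tokens = set(tokenize(hypothesis))
    if not hyp_tokens:
        return 0.0
    return len(hyp_tokens & text_tokens) / len(hyp_tokens)

def split_70_30(items):
    """Return the first 70% of items as training data and the rest as test data."""
    cut = len(items) * 7 // 10
    return items[:cut], items[cut:]

# Hypothetical text/hypothesis pairs, invented for this example only.
pairs = [
    ("El gato duerme en el sofá.", "El gato duerme."),
    ("Llueve en Córdoba.", "Hace sol en Madrid."),
]
features = [lexical_overlap(text, hyp) for text, hyp in pairs]
```

In a full pipeline, scores like these would be combined with syntactic and semantic features and passed to a classifier trained on the 70% portion and evaluated on the remaining 30%.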