This paper presents a novel approach to building adaptive, cardinality-based similarity functions using machine learning. Unlike current approaches, which build feature sets from similarity scores, we build feature sets from the cardinalities of the commonalities and differences between the pairs of objects being compared. This allows the machine-learning algorithm to learn an asymmetric similarity function suitable for directional judgments. In addition to classic set cardinality, we use soft cardinality to allow flexibility in the comparison between words. Our approach uses only surface information from the text, a stop-word remover, and a stemmer to address the cross-lingual textual entailment task (Task 8) at SemEval 2012. Our system obtained the third-best result among the 29 systems submitted by 10 teams. Additionally, this paper reports results that improve on the best official score.
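To make the idea of cardinality-based features concrete, the following is a minimal sketch rather than the system described in the paper. It assumes the commonly cited form of soft cardinality, in which each word contributes the inverse of its summed similarity to all words in the text, with similarities raised to an exponent p. The word-similarity function (a Dice coefficient over character bigrams), the value of p, and all function and feature names are illustrative assumptions. The intersection is approximated by inclusion-exclusion over the concatenated texts.

```python
from typing import Callable, Dict, List, Set


def bigrams(word: str) -> Set[str]:
    """Character bigrams of a word padded with '#' (an illustrative choice)."""
    padded = f"#{word}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams; a stand-in word similarity."""
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def classic_cardinality(words: List[str]) -> float:
    """Classic set cardinality: the number of distinct words."""
    return float(len(set(words)))


def soft_cardinality(words: List[str],
                     sim: Callable[[str, str], float] = dice,
                     p: float = 2.0) -> float:
    """Soft cardinality (assumed form): similar words count as less than two."""
    uniq = list(dict.fromkeys(words))  # drop exact duplicates, keep order
    return sum(1.0 / sum(sim(a, b) ** p for b in uniq) for a in uniq)


def cardinality_features(a: List[str], b: List[str],
                         card: Callable[[List[str]], float]) -> Dict[str, float]:
    """Cardinalities of commonalities and differences between two texts."""
    size_a, size_b = card(a), card(b)
    size_union = card(a + b)
    size_inter = size_a + size_b - size_union  # inclusion-exclusion
    return {
        "|A|": size_a,
        "|B|": size_b,
        "|A u B|": size_union,
        "|A n B|": size_inter,
        "|A - B|": size_a - size_inter,  # kept separate from |B - A| so a
        "|B - A|": size_b - size_inter,  # learner can weight each direction
    }


if __name__ == "__main__":
    text_a = ["the", "cat", "sat"]
    text_b = ["the", "cats", "slept"]
    print(cardinality_features(text_a, text_b, classic_cardinality))
    print(cardinality_features(text_a, text_b, soft_cardinality))
```

A feature vector of this kind, computed once with classic and once with soft cardinality, could then be passed to any off-the-shelf classifier. Keeping the two difference terms separate, rather than collapsing them into a single symmetric score, is what lets the learned function be directional.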