BUAP: three approaches for semantic textual similarity

Authors:
Maya Carrillo;Darnes Vilariño;David Pinto;Mireya Tovar;Saul León;Esteban Castillo
Affiliations:
Benemérita Universidad Autónoma de Puebla, Sur & Av. San Claudio, CU Puebla, Puebla, México;Benemérita Universidad Autónoma de Puebla, Sur & Av. San Claudio, CU Puebla, Puebla, México;Benemérita Universidad Autónoma de Puebla, Sur & Av. San Claudio, CU Puebla, Puebla, México;Benemérita Universidad Autónoma de Puebla, Sur & Av. San Claudio, CU Puebla, Puebla, México;Benemérita Universidad Autónoma de Puebla, Sur & Av. San Claudio, CU Puebla, Puebla, México;Benemérita Universidad Autónoma de Puebla, Sur & Av. San Claudio, CU Puebla, Puebla, México
Venue:
SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
Year:
2012

Citing 4

Verbs semantics and lexical selection
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics

Using bag-of-concepts to improve the performance of support vector machines in text categorization
COLING '04 Proceedings of the 20th international conference on Computational Linguistics

Corpus-based and knowledge-based measures of text semantic similarity
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1

SemEval-2012 task 6: a pilot on semantic textual similarity
SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation

Abstract

In this paper we describe the three approaches we submitted to the Semantic Textual Similarity task of SemEval 2012. The first approach calculates semantic similarity using the Jaccard coefficient with term expansion based on synonyms. The second approach uses the semantic similarity measure reported by Mihalcea et al. (2006). The third approach employs Random Indexing and Bag of Concepts based on context vectors. The first and third approaches obtained comparable performance, whereas the second approach performed very poorly. The best ALL result was obtained with the third approach, with a Pearson correlation of 0.663.
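The first approach can be illustrated with a minimal sketch of the Jaccard coefficient computed over synonym-expanded term sets. The abstract does not specify the tokenization, the synonym source, or the exact expansion strategy, so everything below is an assumption for illustration: the function names, the whitespace tokenizer, and the toy `TOY_SYNONYMS` dictionary are hypothetical and are not taken from the paper.

```python
def expand(tokens, synonyms):
    """Return the set of tokens plus any synonyms listed for them.

    `synonyms` is a hypothetical dict mapping a word to a set of synonyms;
    the paper's actual synonym resource is not described in the abstract.
    """
    expanded = set(tokens)
    for tok in tokens:
        expanded.update(synonyms.get(tok, ()))
    return expanded


def jaccard_with_expansion(s1, s2, synonyms):
    """Jaccard coefficient |A & B| / |A | B| over expanded term sets."""
    # Assumption: lowercase + whitespace tokenization, chosen for brevity.
    a = expand(s1.lower().split(), synonyms)
    b = expand(s2.lower().split(), synonyms)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sentences
    return len(a & b) / len(a | b)


# Toy synonym dictionary, for illustration only.
TOY_SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}

plain = jaccard_with_expansion(
    "the car stopped", "the automobile stopped", {}
)
expanded = jaccard_with_expansion(
    "the car stopped", "the automobile stopped", TOY_SYNONYMS
)
print(plain, expanded)  # 0.5 1.0
```

In this toy example, expansion raises the score from 0.5 to 1.0 because "car" and "automobile" are mapped onto each other, which is the effect that term expansion is meant to capture for paraphrased sentence pairs.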