The ability to accurately judge the similarity between natural language sentences is critical to the performance of several applications, such as text mining, question answering, and text summarization. Given two sentences, an effective similarity measure should determine whether they are semantically equivalent, taking into account the variability of natural language expression. That is, it should make the correct judgment even when the sentences do not share a similar surface form. In this work, we evaluate fourteen existing text similarity measures that have been used to compute similarity scores between sentences in many text applications. The evaluation is conducted on three data sets: the TREC9 question variants, the Microsoft Research paraphrase corpus, and the third Recognizing Textual Entailment data set.
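As a rough illustration of the surface-form problem described above, the following sketch scores sentence pairs with plain Jaccard word overlap. This is an assumption chosen for illustration only: the abstract does not name the fourteen measures, and the example sentences and the helper names `tokens` and `jaccard` are hypothetical, not taken from the data sets.

```python
import re

def tokens(sentence):
    """Lowercase the sentence and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))

def jaccard(a, b):
    """Jaccard word overlap: size of the shared token set over size of the union."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty sentences are treated as identical
    return len(ta & tb) / len(ta | tb)

# Hypothetical example sentences (not from TREC9, MSRP, or RTE3).
question = "How tall is Mount Everest?"
paraphrase = jaccard(question, "What is the height of Mount Everest?")
unrelated = jaccard(question, "How tall is the Eiffel Tower?")

print(f"paraphrase pair: {paraphrase:.3f}")
print(f"unrelated pair:  {unrelated:.3f}")
```

In this toy case the overlap score ranks the unrelated question above the true paraphrase, because the paraphrase shares meaning but few words. This is the kind of failure a measure that looks beyond surface form is meant to avoid.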