The Journal of Machine Learning Research
Sentence Similarity Based on Semantic Nets and Corpus Statistics
IEEE Transactions on Knowledge and Data Engineering
Unsupervised construction of large paraphrase corpora: exploiting massively parallel news sources
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
ParaEval: using paraphrases to evaluate summaries automatically
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Paraphrasing for automatic evaluation
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Semantic text similarity using corpus-based word similarity and string similarity
ACM Transactions on Knowledge Discovery from Data (TKDD)
Corpus-based and knowledge-based measures of text semantic similarity
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Extended gloss overlaps as a measure of semantic relatedness
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Automatic evaluation of text coherence: models and representations
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
A comparative study of two short text semantic similarity measures
KES-AMSTA'08 Proceedings of the 2nd KES International conference on Agent and multi-agent systems: technologies and applications
Training and testing of recommender systems on data missing not at random
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Text relatedness based on a word thesaurus
Journal of Artificial Intelligence Research
Word sense disambiguation-based sentence similarity
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Collecting highly parallel data for paraphrase evaluation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
SemEval-2012 task 6: a pilot on semantic textual similarity
SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
Modeling sentences in the latent space
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Learning the latent semantics of a concept from its definition
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Hi-index | 0.00 |
The Semantic Textual Similarity (STS) shared task (Agirre et al., 2012) computes the degree of semantic equivalence between two sentences. We show that a simple unsupervised latent semantics based approach, Weighted Textual Matrix Factorization that only exploits bag-of-words features, can outperform most systems for this task. The key to the approach is to carefully handle missing words that are not in the sentence, and thus rendering it superior to Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Our system ranks 20 out of 89 systems according to the official evaluation metric for the task, Pearson correlation, and it ranks 10/89 and 19/89 in the other two evaluation metrics employed by the organizers.