Similarity measures based on Latent Dirichlet Allocation
CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
We present experiments with several semantic similarity measures based on Latent Dirichlet Allocation (LDA), an unsupervised method. For comparison, we also report results obtained with an algebraic method, Latent Semantic Analysis (LSA). The proposed similarity measures were evaluated on two datasets: one containing student answers from conversational intelligent tutoring systems, and a standard paraphrase dataset, the Microsoft Research Paraphrase corpus. Results indicate that the measure that represents words as vectors over topics outperforms the measures based on distributions over topics and words. The proposed evaluation can also serve as an extrinsic, task-based way to assess topic coherence or to select the number of topics in an LDA model.
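To make the idea of representing words as topic vectors concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes a toy, hand-written topic-word table `PHI` in place of a trained LDA model, uses cosine similarity between word vectors, and combines word scores into a text score with a simple greedy best-match average. The table, the function names, and the choice of cosine and greedy matching are all illustrative assumptions.

```python
import math

# Hypothetical toy topic-word probabilities: PHI[t][w] stands in for P(w | topic t).
# A real model would be estimated with LDA; these numbers are made up for illustration.
PHI = [
    {"force": 0.40, "acceleration": 0.35, "mass": 0.20, "cat": 0.05},
    {"force": 0.05, "acceleration": 0.05, "mass": 0.10, "cat": 0.80},
]


def word_topic_vector(word):
    """Represent a word by its probability under each topic (0.0 if unseen)."""
    return [topic.get(word, 0.0) for topic in PHI]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_similarity(w1, w2):
    """Word-to-word similarity as the cosine of the two topic vectors."""
    return cosine(word_topic_vector(w1), word_topic_vector(w2))


def text_similarity(text_a, text_b):
    """Assumed aggregation: each word is matched to its most similar word in
    the other text, the scores are averaged, and both directions are averaged."""
    def directed(src, tgt):
        if not src or not tgt:
            return 0.0
        return sum(max(word_similarity(s, t) for t in tgt) for s in src) / len(src)

    a, b = text_a.split(), text_b.split()
    return 0.5 * (directed(a, b) + directed(b, a))
```

In this toy table, `word_similarity("force", "acceleration")` is much higher than `word_similarity("force", "cat")` because the first pair concentrates its probability in the same topic. Changing the number of rows in `PHI` corresponds to changing the number of topics, which is the parameter a task-based evaluation would vary.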