Applying latent dirichlet allocation to automatic essay grading

Authors:
Tuomo Kakkonen;Niko Myller;Erkki Sutinen
Affiliations:
University of Joensuu, Joensuu, Finland;University of Joensuu, Joensuu, Finland;University of Joensuu, Joensuu, Finland
Venue:
FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
Year:
2006

Citing 8
Cited 0

Automatic essay grading using text categorization techniques

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Unsupervised learning by probabilistic latent semantic analysis

Machine Learning
On an equivalence between PLSI and LDA

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Latent dirichlet allocation

The Journal of Machine Learning Research
Sufficient dimensionality reduction

The Journal of Machine Learning Research
Test Data Likelihood for PLSA Models

Information Retrieval
Automatic essay grading with probabilistic latent semantic analysis

EdAppsNLP 05 Proceedings of the second workshop on Building Educational Applications Using NLP
Expectation-propagation for the generative aspect model

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

We report experiments on automatic essay grading using Latent Dirichlet Allocation (LDA). LDA is a “bag-of-words” type of language modeling and dimension reduction method, reported to outperform other related methods, Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA) in Information Retrieval (IR) domain. We introduce LDA in detail and compare its strengths and weaknesses to LSA and PLSA. We also compare empirically the performance of LDA to LSA and PLSA. The experiments were run with three essay sets consisting in total of 283 essays from different domains. On contrary to the findings in IR, LDA achieved slightly worse results compared to LSA and PLSA in the experiments. We state the reasons for LSA and PLSA outperforming LDA and indicate further research directions.