Latent Semantic Analysis (LSA) makes it possible to automatically grade essays, i.e., free-text responses to examinations, by comparing them to a corpus of available learning materials. To obtain grades that correspond to those given by human assessors, it is crucial to train the system with essays that have already been graded. Noise reduction refers to a process in which the individual words used for comparing essays with the learning materials are weighted according to their significance. To find the optimal noise reduction parameters, the system is trained with different parameter settings, and each resulting model is used to predict the essay grades. Three standard validation methods, holdout, bootstrap, and k-fold cross-validation, were applied to select the noise reduction parameters. In an experiment consisting of 283 essays from three examinations, each on a different subject, the holdout validation method gave the best predictions and hence reduced the most noise.
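The grading procedure described above can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the function name `lsa_grade`, the choice of truncated SVD via `numpy.linalg.svd`, and the similarity-weighted averaging of human grades are all assumptions made for the example; the actual system's term weighting and grade-assignment rule may differ.

```python
import numpy as np

def lsa_grade(term_doc, essay_vec, grades, k=2):
    """Predict a grade for one essay by LSA similarity to graded essays.

    term_doc  : (terms x docs) weighted term-count matrix of graded essays
                (the weighting step is where noise reduction would apply).
    essay_vec : weighted term counts of the essay to be graded.
    grades    : human-assigned grades of the essays in term_doc.
    k         : number of latent dimensions to retain.
    """
    # Truncated SVD yields the k-dimensional latent semantic space.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    Uk, sk = U[:, :k], s[:k]
    # Fold the new essay into the latent space: q = S_k^{-1} U_k^T v.
    q = np.diag(1.0 / sk) @ Uk.T @ essay_vec
    # Each row of D is one graded essay in the latent space.
    D = Vt[:k].T
    # Cosine similarity between the new essay and each graded essay.
    sims = (D @ q) / (np.linalg.norm(D, axis=1) * np.linalg.norm(q) + 1e-12)
    # Illustrative grade rule (an assumption): similarity-weighted
    # average of the human grades, ignoring negative similarities.
    w = np.clip(sims, 0.0, None)
    return float(w @ grades / (w.sum() + 1e-12))
```

Validation would then wrap this in holdout, bootstrap, or k-fold splits of the graded essays, scoring each candidate weighting parameter by how closely the predicted grades match the held-out human grades.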