This paper describes an experiment in applying a standard supervised machine learning algorithm (C4.5) to the problem of developing subject classification rules for documents. The algorithm is found to produce surprisingly concise models of document classifications. While the models are highly accurate on the training sets, evaluation on held-out test sets or through cross-validation shows a significant decrease in classification accuracy. Given the difficult nature of the experimental task, however, the results of this investigation are promising and merit further study. An additional algorithm, 1R, is shown to be highly effective in generating lists of candidate terms for subject descriptions.
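As a rough illustration of how 1R might yield a list of candidate terms, the sketch below follows the general idea of 1R (a rule that tests a single attribute and predicts the majority class for each of its values) and ranks terms by the training accuracy of their one-term rule. The abstract does not give the actual procedure, so the ranking criterion, the function name `one_r_rank`, the presence/absence term representation, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter

def one_r_rank(docs, labels):
    """Rank terms by the training accuracy of a one-term rule.

    docs   : list of sets of terms (hypothetical representation)
    labels : list of subject labels, one per document
    """
    vocab = sorted({term for doc in docs for term in doc})
    n = len(docs)
    ranked = []
    for term in vocab:
        # Tally subject labels separately for documents that do and
        # do not contain the term.
        counts = {True: Counter(), False: Counter()}
        for doc, label in zip(docs, labels):
            counts[term in doc][label] += 1
        # The one-term rule predicts the majority label in each branch,
        # so it classifies the majority count of each branch correctly.
        correct = sum(max(c.values()) for c in counts.values() if c)
        ranked.append((correct / n, term))
    # Highest accuracy first; ties broken alphabetically.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked

# Toy data, invented for illustration only.
docs = [
    {"neural", "network"},
    {"neural", "learning"},
    {"database", "query"},
    {"query", "index"},
]
labels = ["ai", "ai", "db", "db"]

ranked = one_r_rank(docs, labels)
for accuracy, term in ranked:
    print(f"{term}: {accuracy:.2f}")
```

On this toy set the terms that split the two subjects cleanly rank first. These scores are training-set accuracies, so they are subject to the same optimism the abstract reports for the C4.5 models and would need checking on held-out data.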