Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Graphical Features Selection Method
IDEAL '02 Proceedings of the Third International Conference on Intelligent Data Engineering and Automated Learning
Notions of correctness when evaluating protein name taggers
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
GAPSCORE: finding gene and protein names one word at a time
Bioinformatics
Hi-index | 0.00 |
In the biological domain, extracting newly discovered functional features from the massive literature is a major challenging issue. To automatically annotate Gene References into Function (GeneRIF) in a new literature is the main goal of this paper. We tried to find GRIF words in a training corpus, and then applied these informative words to annotate the GeneRIFs in abstracts with several different weighting schemes. The experiments showed that the Classic Dice score is at most 50.18%, when the weighting schemes proposed in the paper (Hou et al., 2003) were adopted. In contrast, after employing Support Vector Machines (SVMs) and the definition of classes proposed by Jelier et al. (2003), the score greatly improved to 56.86% for Classic Dice (CD). Adopting the same features, SVMs demonstrated advantage over the Naïve Bayes Classifier. Finally, the combination of the former two models attained a score of 59.51% for CD.