Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Background and overview for KDD Cup 2002 task 1: information extraction from biomedical articles
ACM SIGKDD Explorations Newsletter
Pattern Classification (2nd Edition)
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
We aim to design a system that classifies scientific articles according to whether they report protein characterization experiments, with the goal of aiding the curators who populate JCVI's Characterized Protein (CHAR) Database of experimentally characterized proteins. We trained two classifiers on small datasets labeled by CHAR curators, and a third on a much larger dataset built from annotations in public databases. Performance varied greatly, in ways we did not anticipate. We describe the datasets and the classification method, and discuss the unexpected results.