When only a small amount of manually annotated data is available, bootstrapping is often considered as a way to compensate for the lack of sufficient training material for a machine-learning method. This paper reports a series of experimental results on bootstrapping for protein name recognition. The results show that performance changes significantly depending on the text collection from which the training samples for bootstrapping are drawn, and that an improvement can be obtained only with a well-chosen text collection.