Uncertainty sampling-based active learning for protein-protein interaction extraction from biomedical literature

Authors:
Baojin Cui;Hongfei Lin;Zhihao Yang
Affiliations:
Department of Computer Science and Engineering, Dalian University of Technology, No. 2 LingGong Road, ShaHeKou District, Dalian 116023, China;Department of Computer Science and Engineering, Dalian University of Technology, No. 2 LingGong Road, ShaHeKou District, Dalian 116023, China;Department of Computer Science and Engineering, Dalian University of Technology, No. 2 LingGong Road, ShaHeKou District, Dalian 116023, China
Venue:
Expert Systems with Applications: An International Journal
Year:
2009

Abstract

Protein-protein interaction (PPI) extraction from biomedical literature has become a research focus as the volume of that literature grows rapidly. Many methods have been proposed for PPI extraction, including natural language processing techniques and machine learning approaches. One problem with applying machine learning to PPI extraction is that large amounts of unlabeled data are available, but the cost of labeling them correctly prohibits their use. To reduce human labeling effort while maintaining PPI extraction performance, the paper presents an uncertainty sampling-based active learning method (USAL) that uses a lexical feature-based SVM model to select the most informative unlabeled samples for tagging. In addition, some specific samples are ignored to speed up the learning process while maintaining the desired accuracy. Experimental results on the AIMED and CB corpora show that the method can reduce the labeling effort by 40% and 20%, respectively, without degrading performance.
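The selection step at the heart of uncertainty sampling can be illustrated with a minimal sketch. The abstract does not state which uncertainty measure the paper uses, so this example assumes the measure commonly paired with SVMs: the samples whose decision value lies closest to the separating hyperplane are treated as the most informative. The function names, the toy weights, and the two-dimensional feature vectors are all hypothetical and are not taken from the paper, and the step that ignores certain samples is not shown because the abstract does not describe it.

```python
from typing import List, Sequence


def decision_value(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Unnormalised linear SVM score w.x + b for one feature vector."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def select_most_uncertain(scores: Sequence[float], batch_size: int) -> List[int]:
    """Return indices of the batch_size samples with the smallest |score|.

    Assumption: a small absolute decision value means the classifier is
    unsure about the sample, so it is worth sending to a human annotator.
    """
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    return ranked[:batch_size]


# Toy pool of unlabeled samples (hypothetical 2-D features, not real PPI data).
weights, bias = [1.0, -1.0], 0.0
pool = [[2.0, 0.0], [0.6, 0.5], [0.0, 3.0], [1.0, 1.2]]

scores = [decision_value(weights, bias, x) for x in pool]
query_indices = select_most_uncertain(scores, batch_size=2)
print(query_indices)  # indices of the samples to label next
```

In a full active learning loop, the selected samples would be labeled, moved from the pool into the training set, and the SVM retrained before the next round of selection.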