Literature mining and database annotation of protein phosphorylation using a rule-based system

Authors:
Z. Z. Hu;M. Narayanaswamy;K. E. Ravikumar;K. Vijay-Shanker;C. H. Wu
Affiliations:
Department of Biochemistry and Molecular Biology, Georgetown University Medical Center, Washington, DC 20057, USA;AU-KBC Research Centre, Anna University, Chennai 600044, India;AU-KBC Research Centre, Anna University, Chennai 600044, India;Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA;Department of Biochemistry and Molecular Biology, Georgetown University Medical Center, Washington, DC 20057, USA
Venue:
Bioinformatics
Year:
2005

Abstract

Motivation: A large volume of experimental data on protein phosphorylation is buried in the fast-growing PubMed literature. Although of great value, such information is limited in databases owing to the laborious process of literature-based curation. Computational literature mining holds promise for facilitating database curation.

Results: A rule-based system, RLIMS-P (Rule-based LIterature Mining System for Protein Phosphorylation), was used to extract protein phosphorylation information from MEDLINE abstracts. An annotation-tagged literature corpus developed at PIR was used to evaluate the system on two tasks: finding phosphorylation papers and extracting phosphorylation objects (kinases, substrates and sites) from abstracts. RLIMS-P achieved a precision and recall of 91.4% and 96.4%, respectively, for paper retrieval, and of 97.9% and 88.0%, respectively, for extraction of substrates and sites. By coupling high recall for paper retrieval with high precision for information extraction, RLIMS-P facilitates literature mining and database annotation of protein phosphorylation.

Availability: The program is available on request from the authors. The phosphorylation patterns and datasets used in this study are available at http://pir.georgetown.edu/iprolink/

Contact: zh9@georgetown.edu
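To make the evaluation terms in the Results concrete, the following is a minimal Python sketch of how a pattern-based extractor might pull a kinase, substrate and site from a sentence, and how precision and recall could be computed against a tagged corpus. The regular expression, the function names `extract` and `precision_recall`, and the example sentences are all hypothetical illustrations; they are not the RLIMS-P rules or the PIR corpus, which the abstract does not describe in detail.

```python
import re

# Hypothetical toy pattern, NOT one of the RLIMS-P rules. It matches
# "<kinase> phosphorylates <substrate>", optionally followed by
# "at|on <site>", where the site is a Ser/Thr/Tyr residue and position.
PATTERN = re.compile(
    r"(?P<kinase>[A-Za-z0-9-]+)\s+phosphorylates\s+"
    r"(?P<substrate>[A-Za-z0-9-]+)"
    r"(?:\s+(?:at|on)\s+(?P<site>(?:Ser|Thr|Tyr)-?\d+))?"
)


def extract(sentence):
    """Return (kinase, substrate, site) or None if the pattern does not match.

    The site element is None when the sentence names no residue.
    """
    match = PATTERN.search(sentence)
    if match is None:
        return None
    return (match.group("kinase"), match.group("substrate"), match.group("site"))


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted items against gold annotations."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Under these assumptions, `extract("PKA phosphorylates CREB at Ser133")` yields the kinase, substrate and site as a tuple, and `precision_recall` compares a set of such tuples against manually tagged ones. A real system would need many more patterns, plus handling of passive voice and nominalizations such as "phosphorylation of X by Y".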