Methodological Review: Biomedical text mining and its applications in cancer research

Authors:
Fei Zhu;Preecha Patumcharoenpol;Cheng Zhang;Yang Yang;Jonathan Chan;Asawin Meechai;Wanwipa Vongsangnak;Bairong Shen
Affiliations:
Center for Systems Biology, Soochow University, Suzhou 215006, China and School of Computer Science and Technology, Soochow University, Suzhou 215006, China;School of Information Technology, King Mongkut's University of Technology Thonburi, Thailand and School of Bioresources and Technology, King Mongkut's University of Technology Thonburi, Thailand;Center for Systems Biology, Soochow University, Suzhou 215006, China;Center for Systems Biology, Soochow University, Suzhou 215006, China and School of Computer Science and Technology, Soochow University, Suzhou 215006, China;School of Information Technology, King Mongkut's University of Technology Thonburi, Thailand;Department of Chemical Engineering, King Mongkut's University of Technology Thonburi, Thailand;Center for Systems Biology, Soochow University, Suzhou 215006, China;Center for Systems Biology, Soochow University, Suzhou 215006, China and Institute for Translational Bioinformatics and Systems Medicine, School of Biomedical Informatics, Suzhou University of Sci ...
Venue:
Journal of Biomedical Informatics
Year:
2013

Abstract

Cancer is a malignant disease that has caused millions of human deaths, and it has been studied for well over 100 years. This research has produced an enormous number of publications. This integrated but unstructured biomedical text is of great value for cancer diagnostics, treatment, and prevention. The immense volume and rapid growth of biomedical text on cancer have led to a large number of text mining techniques aimed at extracting novel knowledge from scientific text. Biomedical text mining for cancer research is automated and high-throughput in nature. However, it is error-prone because of the complexity of natural language processing. In this review, we introduce the basic concepts underlying text mining, examine some frequently used algorithms, tools, and datasets, and assess how widely these algorithms have been used. We then discuss current state-of-the-art text mining applications in cancer research and provide some resources for cancer text mining. With the development of systems biology, researchers increasingly seek to understand complex biomedical systems from a systems biology viewpoint. Thus, making full use of text mining to facilitate cancer systems biology research is fast becoming a major concern. To address this issue, we describe the general workflow of text mining in cancer systems biology and each phase of that workflow. We hope that this review can (i) provide a useful overview of current work in this field; (ii) help researchers choose text mining tools and datasets; and (iii) highlight how to apply text mining to assist cancer systems biology research.