A text-mining technique for extracting gene-disease associations from the biomedical literature

Authors:
Hisham Al-Mubaid;Rajit K. Singh
Affiliations:
School of Science and Computer Engineering, University of Houston-Clear Lake, 2700 Bay Area Blvd, Box 40, Houston, Texas 77058, USA;School of Science and Computer Engineering, University of Houston-Clear Lake, 2700 Bay Area Blvd, Box 40, Houston, Texas 77058, USA
Venue:
International Journal of Bioinformatics Research and Applications
Year:
2010

Abstract

We propose a new text-mining technique to identify associations between biological entities, specifically gene-disease associations, from the biomedical literature. The proposed method is simple: it uses two sets of documents (a positive set and a negative set) and utilises the concepts of expectation (ex), evidence (ev), and Z-scores to combine positive and negative evidence when determining the significant gene-disease associations in MEDLINE documents. Moreover, the method offers an efficient way to handle gene names, aliases, symbols, and abbreviations. We evaluated the method on discovering gene-disease associations from the literature, with strong experimental results. We verified these results and confirmed the effectiveness of the proposed technique in several ways; for example, we ran the technique on known and previously discovered gene-disease relationships. Our method was able to discover associations between genes and various diseases, including amyotrophic lateral sclerosis, tuberous sclerosis, autism, homocystinuria, bipolar disorder, and atherosclerosis, among others.
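
The abstract names the ingredients (a positive and a negative document set, an expectation ex, an evidence ev, and a Z-score) but does not give the formulas. The following is a minimal sketch of how such a score could be computed, not the authors' definition. It assumes that ev is the number of positive-set documents mentioning the gene, that ex is the count expected if the gene were mentioned at the same pooled rate in both sets, and that the Z-score uses a binomial variance. The function names `mentions` and `association_z` and the alias-matching rule are hypothetical.

```python
import math
import re


def mentions(text, aliases):
    """Return True if any alias occurs in text as a whole word (case-insensitive)."""
    pattern = r"\b(?:" + "|".join(re.escape(a) for a in aliases) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def association_z(positive_docs, negative_docs, aliases):
    """Return (ex, ev, z) for one gene against one disease.

    Assumed definitions (not taken from the paper):
      ev = documents in the positive set that mention the gene
      ex = expected such documents under the pooled mention rate
      z  = (ev - ex) / sqrt(binomial variance under the pooled rate)
    """
    n_pos, n_neg = len(positive_docs), len(negative_docs)
    k_pos = sum(mentions(d, aliases) for d in positive_docs)
    k_neg = sum(mentions(d, aliases) for d in negative_docs)

    pooled_rate = (k_pos + k_neg) / (n_pos + n_neg)
    ex = n_pos * pooled_rate
    ev = k_pos
    variance = n_pos * pooled_rate * (1 - pooled_rate)

    # A gene mentioned in no document, or in every document, carries no signal.
    z = (ev - ex) / math.sqrt(variance) if variance > 0 else 0.0
    return ex, ev, z
```

Grouping all names of a gene into one alias list means that a document using a symbol and a document using the full name both count toward the same gene. A gene would then be reported as associated with the disease when its Z-score exceeds a chosen threshold; the threshold is likewise an assumption left to the user.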