Mining the relationship between gene and disease from literature

Authors:
Yan Xu;Zhiqiang Chang;Wen Hu;Lili Yu;Huizi DuanMu;Xia Li
Affiliations:
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China and College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China;College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China;College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China;College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China;College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China and College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
Venue:
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 7
Year:
2009

Abstract

Text mining is the extraction of high-quality information, including entities and the relationships between them, from text. Although several methods have been applied to extract protein interaction relationships and other information, few studies have focused on sentence-level processing for extracting precise relationships. This paper presents several strategies for filtering out sentences that contain non-positive relationships, and then uses patterns of entities and relationship phrases to extract relationships between genes and diseases. We selected abstracts associated with "receptor" and used 1000 sentences containing entity names and relationship phrases as the test set. The method achieved a precision of 84.6%, a recall of 77.5%, and an F-score of 80.9%. We also analyzed the problems that commonly arise when extracting such relationships.
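
The two steps named in the abstract (discarding sentences with non-positive relationships, then matching a pattern of entity, relationship phrase, and entity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lexicons, the negation cues, the strict "gene + phrase + disease" adjacency pattern, and the function names are all assumptions made for illustration. The last function only checks that the reported F-score is consistent with the reported precision and recall, assuming the standard balanced F-measure.

```python
import re

# Hypothetical lexicons; the abstract does not list the dictionaries used.
GENES = ["BRCA1", "TP53"]
DISEASES = ["breast cancer", "lung cancer"]
RELATION_PHRASES = ["is associated with", "contributes to"]
# Assumed cues for "non-positive" (negated) relationships.
NEGATION_CUES = ["not", "no", "never", "failed to"]


def is_non_positive(sentence):
    """Return True if the sentence contains any assumed negation cue."""
    lowered = sentence.lower()
    return any(
        re.search(r"\b" + re.escape(cue) + r"\b", lowered)
        for cue in NEGATION_CUES
    )


def extract_relations(sentence):
    """Return (gene, phrase, disease) triples found in a positive sentence."""
    if is_non_positive(sentence):
        return []  # filtering step: skip negated sentences entirely
    triples = []
    for gene in GENES:
        for phrase in RELATION_PHRASES:
            for disease in DISEASES:
                # Strict adjacency pattern: gene, phrase, disease in order.
                pattern = r"\s+".join(
                    re.escape(part) for part in (gene, phrase, disease)
                )
                if re.search(pattern, sentence, flags=re.IGNORECASE):
                    triples.append((gene, phrase, disease))
    return triples


def f_score(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(extract_relations("BRCA1 is associated with breast cancer."))
    print(extract_relations("BRCA1 is not associated with breast cancer."))
    print(round(f_score(0.846, 0.775), 3))
```

A real system would need gene and disease name normalization and more flexible patterns than strict adjacency; the sketch only shows how the filtering step sits in front of the pattern step.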