Knowledge-based gene symbol disambiguation

  • Authors:
  • He Tan

  • Affiliations:
  • Linköpings Universitet, Linköping, Sweden

  • Venue:
  • Proceedings of the 2nd international workshop on Data and text mining in bioinformatics
  • Year:
  • 2008

Abstract

Because there is no standard naming convention for genes and gene products, gene symbol disambiguation (GSD) has become a major challenge in mining biomedical literature. Several GSD methods have been proposed that rely on MEDLINE references to genes. However, gene databases such as Entrez Gene now provide extensive information about genes, and many biomedical ontologies, such as the UMLS Metathesaurus and Semantic Network, have been developed. These knowledge sources could be used for disambiguation. In this paper we propose a method that relies on information about gene candidates from gene databases, the contexts of gene symbols, and biomedical ontologies. We implement the method and evaluate the performance of the implementation on the BioCreAtIvE II data sets.