Using contextual information to clarify gene normalization ambiguity

  • Authors:
  • Po-Ting Lai;Yue-Yang Bow;Chi-Hsin Huang;Hong-Jie Dai;Richard Tzong-Han Tsai;Wen-Lian Hsu

  • Affiliations:
  • Dept. of Computer Science & Engineering, Yuan Ze Univ., Chung-Li, Taiwan, R.O.C.;Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C.;Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C.;Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C. and Dept. of Computer Science, National Tsing-Hua Univ., Hsinchu, Taiwan, R.O.C.;Dept. of Computer Science & Engineering, Yuan Ze Univ., Chung-Li, Taiwan, R.O.C.;Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C. and Dept. of Computer Science, National Tsing-Hua Univ., Hsinchu, Taiwan, R.O.C.

  • Venue:
  • IRI'09 Proceedings of the 10th IEEE international conference on Information Reuse & Integration
  • Year:
  • 2009

Abstract

The goal of Gene Normalization (GN) is to map genes and proteins mentioned in biomedical literature to their unique database identifiers. A major difficulty in GN is inter-species gene ambiguity: the same gene name can refer to different database identifiers depending on the species in question. In this paper, we introduce a method that exploits contextual information in an abstract, such as tissue type and chromosome location, to resolve this ambiguity. Using this technique, we improve system performance (F-score) by 14.3% on the BioCreAtIvE-II GN task test set.
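
The general idea of context-based disambiguation can be illustrated with a minimal sketch. This is not the authors' system: the abstract does not describe the scoring method, so the simple term-overlap score below is an assumption made for illustration only. The `Candidate` class, the `disambiguate` function, the gene name `GENE1`, and the identifiers `ID_HUMAN` and `ID_MOUSE` are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One database entry that a gene name could refer to (hypothetical schema)."""
    identifier: str
    species: str
    context_terms: frozenset  # e.g. species words, tissue types, chromosome locations


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(candidate, tokens):
    """Count how many of the candidate's context terms occur in the abstract."""
    terms = {term.lower() for term in candidate.context_terms}
    return len(terms & tokens)


def disambiguate(abstract, candidates):
    """Return the candidate with the highest context overlap.

    Returns None when no candidate shares any term with the abstract.
    Ties are resolved in favor of the earliest candidate in the list.
    Assumption: plain term overlap stands in for whatever scoring
    the paper actually uses.
    """
    tokens = tokenize(abstract)
    best = max(candidates, key=lambda c: overlap_score(c, tokens), default=None)
    if best is None or overlap_score(best, tokens) == 0:
        return None
    return best


# Two hypothetical entries sharing the ambiguous name "GENE1".
candidates = [
    Candidate("ID_HUMAN", "human", frozenset({"human", "breast", "17q21"})),
    Candidate("ID_MOUSE", "mouse", frozenset({"mouse", "murine", "liver"})),
]

abstract = "GENE1 expression was elevated in murine liver tissue."
chosen = disambiguate(abstract, candidates)
print(chosen.identifier if chosen else "unresolved")
```

In this sketch an abstract that mentions none of the context terms is left unresolved rather than assigned to an arbitrary species, which is one possible design choice for a system that favors precision.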