Boosting performance of bio-entity recognition by combining results from multiple systems

  • Authors:
  • Luo Si;Tapas Kanungo;Xiangji Huang

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA;IBM Almaden Research Center, San Jose, CA;York University, Toronto, Canada

  • Venue:
  • Proceedings of the 5th international workshop on Bioinformatics
  • Year:
  • 2005

Abstract

The task of biomedical named-entity recognition is to identify technical terms in the domain of biology that are of special interest to domain experts. Although numerous algorithms have been proposed, the task remains challenging and an active area of research: a large accuracy gap persists between the best algorithms for biomedical named-entity recognition and those for general newswire named-entity recognition. This discrepancy is generally attributed to the inadequate feature representations and external domain knowledge of individual entity recognition systems.

To take advantage of the rich feature representations and external domain knowledge used by different systems, we propose several Meta biomedical named-entity recognition algorithms that combine the recognition results of multiple recognition systems. The proposed algorithms (majority vote, an unstructured exponential model, and a conditional random field) were tested on the GENIA biomedical corpus. Empirical results show that the F score can be improved from 0.72, attained by the best individual system, to 0.96 by our Meta entity recognition approach.
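
As a rough illustration of the simplest of the three combination strategies, majority vote, the sketch below merges per-token labels from several recognizers. The abstract does not describe how the vote is taken, so the details here are assumptions: labels are token-level BIO tags, every system labels the same token sequence, and a token without a strict majority falls back to a hypothetical default label `"O"`. The function name `majority_vote` and the example tags are illustrative only.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(
    predictions: Sequence[Sequence[str]], default: str = "O"
) -> List[str]:
    """Combine per-token labels from several systems by strict majority.

    predictions[i][j] is the label system i assigns to token j.
    Assumption: all systems label the same tokens in the same order.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all systems must label the same number of tokens")

    combined = []
    for labels in zip(*predictions):
        top_label, top_count = Counter(labels).most_common(1)[0]
        # Assumption: without a strict majority, fall back to the default label.
        if top_count > len(predictions) / 2:
            combined.append(top_label)
        else:
            combined.append(default)
    return combined


if __name__ == "__main__":
    # Hypothetical outputs of three recognizers on a four-token sentence.
    system_a = ["B-protein", "I-protein", "O", "B-DNA"]
    system_b = ["B-protein", "O", "O", "B-DNA"]
    system_c = ["B-protein", "I-protein", "O", "O"]
    print(majority_vote([system_a, system_b, system_c]))
```

A token-level vote like this treats each token independently and can produce inconsistent tag sequences (for example, an `I-` tag with no preceding `B-` tag). Sequence models such as the conditional random field named in the abstract can account for dependencies between neighboring labels.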