ME-based biomedical named entity recognition using lexical knowledge

  • Authors:
  • Kyung-Mi Park;Seon-Ho Kim;Hae-Chang Rim;Young-Sook Hwang

  • Affiliations:
  • Korea University, Seoul, Korea;Korea University, Seoul, Korea;Korea University, Seoul, Korea;Advanced Telecommunications Research Institute (ATR), Kyoto, Japan

  • Venue:
  • ACM Transactions on Asian Language Information Processing (TALIP)
  • Year:
  • 2006

Abstract

In this paper, we present a two-phase biomedical named entity (NE) recognition method based on a maximum entropy (ME) model: we first recognize biomedical terms and then assign appropriate semantic classes to the recognized terms. In this two-phase method, the performance of the term-recognition phase is critical, because semantic classification is performed on the regions identified in the recognition phase. To improve term recognition, we incorporate lexical knowledge into the pre- and postprocessing steps of the term-recognition phase. In the preprocessing step, we use domain-salient words obtained by corpus comparison as lexical knowledge. In the postprocessing step, we use χ²-based collocations obtained from the Medline corpus. In addition, we use morphological patterns extracted from the training data as features for learning the ME-based classifiers. Experimental results show that NE-recognition performance can be improved by utilizing such lexical knowledge.
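
The abstract does not spell out how the χ²-based collocations are scored. As a rough illustration only, the sketch below applies the standard Pearson χ² statistic to a 2×2 contingency table of adjacent word pairs. The function name `chi_square_bigrams` and the toy token list are hypothetical and stand in for the Medline corpus; the paper's actual procedure, thresholds, and candidate filtering may differ.

```python
from collections import Counter


def chi_square_bigrams(tokens):
    """Score each adjacent word pair with Pearson's chi-square statistic.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    bigrams = list(zip(tokens, tokens[1:]))
    n = len(bigrams)
    bigram_counts = Counter(bigrams)
    first_counts = Counter(w1 for w1, _ in bigrams)   # w1 in first position
    second_counts = Counter(w2 for _, w2 in bigrams)  # w2 in second position

    scores = {}
    for (w1, w2), o11 in bigram_counts.items():
        o12 = first_counts[w1] - o11    # w1 followed by something else
        o21 = second_counts[w2] - o11   # something else followed by w2
        o22 = n - o11 - o12 - o21       # neither w1 first nor w2 second
        denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
        # A zero denominator means a degenerate table; score it as 0.
        scores[(w1, w2)] = n * (o11 * o22 - o12 * o21) ** 2 / denom if denom else 0.0
    return scores


if __name__ == "__main__":
    # Toy input (assumption): a real run would use a large tokenized corpus.
    toy_tokens = "nf kappa b binds nf kappa b sites in t cells".split()
    ranked = sorted(chi_square_bigrams(toy_tokens).items(), key=lambda kv: -kv[1])
    for pair, score in ranked[:5]:
        print(pair, round(score, 3))
```

Pairs with high scores co-occur more often than their individual frequencies would predict, which is what makes them candidates for correcting term boundaries in a postprocessing step. On very small counts the χ² statistic is unreliable, so a practical system would likely add a minimum-frequency cutoff.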