Rich features based Conditional Random Fields for biological named entities recognition

  • Authors:
  • Chengjie Sun; Yi Guan; Xiaolong Wang; Lei Lin

  • Affiliations:
  • School of Computer Science, Harbin Institute of Technology, Mailbox 319, West Da-zhi Street 92, Harbin, Heilongjiang 150001, China (all authors)

  • Venue:
  • Computers in Biology and Medicine
  • Year:
  • 2007

Abstract

Biological named entity recognition is a critical task for automatically mining knowledge from biological literature. In this paper, the task is cast as a sequence labeling problem, and a Conditional Random Fields (CRF) model is introduced to solve it. Within the CRF framework, rich features are used, including literal, contextual, and semantic features. Among these, shallow syntactic features are introduced for the first time, and they effectively improve the model's performance. Experiments show that our method achieves an F-measure of 71.2% on an open evaluation data set, which is better than most state-of-the-art systems.
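
To make the sequence labeling formulation concrete, the following is a minimal sketch of how per-token features of the kinds named in the abstract (literal, contextual, and shallow syntactic) might be assembled before being passed to a CRF trainer. It is not the authors' feature set: the feature names, the BIO label scheme, the entity classes, the example sentence, and the chunk tags are all assumptions chosen for illustration, and no CRF training is shown.

```python
# Illustrative sketch only: feature names, labels, and example data are
# assumptions, not the feature templates used in the paper.

def token_features(tokens, i, chunk_tags=None):
    """Build a feature dict for tokens[i]."""
    word = tokens[i]
    feats = {
        # Literal (orthographic) features of the token itself.
        "word.lower": word.lower(),
        "has_digit": any(ch.isdigit() for ch in word),
        "has_hyphen": "-" in word,
        "is_capitalized": word[:1].isupper(),
        "all_caps": word.isupper(),
        "suffix3": word[-3:].lower(),
    }
    # Context features: neighbouring words in a +/-1 window.
    feats["prev_word"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next_word"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    # Shallow syntactic feature: a chunk tag, if a chunker supplied one.
    if chunk_tags is not None:
        feats["chunk"] = chunk_tags[i]
    return feats


def sentence_features(tokens, chunk_tags=None):
    """One feature dict per token; a CRF assigns one label per dict."""
    return [token_features(tokens, i, chunk_tags) for i in range(len(tokens))]


# Hypothetical sentence with hand-written BIO labels and chunk tags.
tokens = ["IL-2", "gene", "expression", "requires", "NF-kappa", "B"]
labels = ["B-DNA", "I-DNA", "O", "O", "B-protein", "I-protein"]
chunks = ["B-NP", "I-NP", "I-NP", "B-VP", "B-NP", "I-NP"]

features = sentence_features(tokens, chunk_tags=chunks)
for tok, lab, feat in zip(tokens, labels, features):
    print(tok, lab, feat["chunk"], feat["prev_word"], feat["next_word"])
```

Each sentence becomes a pair of equal-length sequences (feature dicts and labels), which is the input shape a linear-chain CRF toolkit typically expects. Semantic features, such as dictionary lookups, could be added to the same dict in the same way.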