Classifying biological full-text articles for multi-database curation

  • Authors:
  • Wen-Juan Hou;Chih Lee;Hsin-Hsi Chen

  • Affiliations:
  • National Taiwan University, Taipei, Taiwan;National Taiwan University, Taipei, Taiwan;National Taiwan University, Taipei, Taiwan

  • Venue:
  • EACL '06 Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations
  • Year:
  • 2006

Quantified Score

Hi-index 0.01

Visualization

Abstract

In this paper, we propose an approach for identifying curatable articles from a large document set. This system considers three parts of an article (title and abstract, MeSH terms, and captions) as its three individual representations and utilizes two domain-specific resources (UMLS and a tumor name list) to reveal the deep knowledge contained in the article. An SVM classifier is trained and cross-validation is employed to find the best combination of representations. The experimental results show overall high performance.