A machine learning approach for the curation of biomedical literature: KDD Cup 2002 (task 1)

  • Authors:
  • S. Sathiya Keerthi;Chong Jin Ong;Keng Boon Siah;David B. L. Lim;Wei Chu;Min Shi;David S. Edwin;Rakesh Menon;Lixiang Shen;Jonathan Y. K. Lim;Han Tong Loh

  • Affiliations:
  • National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore

  • Venue:
  • ACM SIGKDD Explorations Newsletter
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we present an automated text classification system for the classification of biomedical papers. This classification is based on whether there is experimental evidence for the expression of molecular gene products for specified genes within a given paper. The system performs pre-processing and data cleaning, followed by feature extraction from the raw text. It subsequently classifies the paper using the extracted features with a Naïve Bayes Classifier. Our approach has made it possible to classify (and curate) biomedical papers automatically, thus potentially saving considerable time and resources.