Efficient classification method for complex biological literature using text and data mining combination

  • Authors:
  • Yun Jeong Choi; Seung Soo Park

  • Affiliations:
  • Department of Computer Science & Engineering, Ewha Womans University, Seoul, Korea; Department of Computer Science & Engineering, Ewha Womans University, Seoul, Korea

  • Venue:
  • IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2006

Abstract

As genetic knowledge grows ever faster, its automated analysis and systematization into high-throughput databases has become an active research topic. One of the essential tasks in bioinformatics is to recognize and identify genomic entities and to discover their relations from various sources. Biological documents that contain ambiguous entities generally lie near decision boundaries. The purpose of this paper is to design and implement a classification system that improves performance on such entity identification problems. The system is based on reinforcement training and a post-processing method, and is supplemented by data mining algorithms to enhance its performance. In the experiments, we intentionally add noise to the training data to test robustness and stability. The results show significantly improved stability with respect to training errors.
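
The abstract does not say how the noise was added to the training data. As an illustration only, one common way to run this kind of robustness test is to reassign a fixed fraction of training labels to a wrong class before training. The sketch below assumes that form of label noise; the function name `inject_label_noise`, the class names, and the noise rate are hypothetical and not taken from the paper.

```python
import random


def inject_label_noise(labels, noise_rate, classes, seed=0):
    """Return a copy of `labels` with a fixed fraction set to a different class.

    Hypothetical helper: assumes noise means label flipping, which the
    abstract does not confirm.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    noisy = list(labels)
    n_flip = round(noise_rate * len(noisy))
    # Pick distinct positions, then replace each label with another class.
    for i in rng.sample(range(len(noisy)), n_flip):
        alternatives = [c for c in classes if c != noisy[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


# Illustrative labels only; these are not data from the paper.
clean = ["gene", "protein", "gene", "other", "protein",
         "gene", "other", "gene", "protein", "gene"]
noisy = inject_label_noise(clean, noise_rate=0.2,
                           classes=["gene", "protein", "other"])
changed = sum(a != b for a, b in zip(clean, noisy))
print(f"{changed} of {len(clean)} training labels were corrupted")
```

Training the same classifier on `clean` and on `noisy` at several noise rates, then comparing accuracy on an untouched test set, gives a simple measure of stability against training errors.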