A KNN-Based learning method for biology species categorization

  • Authors:
  • Yan Dang; Yulei Zhang; Dongmo Zhang; Liping Zhao

  • Affiliations:
  • Computer Science and Engineering Department, Shanghai Jiao Tong University, Shanghai, P. R. China;School of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, P. R. China;Computer Science and Engineering Department, Shanghai Jiao Tong University, Shanghai, P. R. China;School of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, P. R. China

  • Venue:
  • ICNC'05 Proceedings of the First international conference on Advances in Natural Computation - Volume Part I
  • Year:
  • 2005

Abstract

This paper presents a novel approach to high-precision biological species categorization based mainly on the k-nearest neighbor (KNN) algorithm. KNN has been used successfully in natural language processing (NLP), and our work extends the method to biological data. We treat the DNA or RNA sequences of a species as special natural language texts. We describe an approach for constructing composition vectors of DNA and RNA sequences, propose a learning method based on the KNN algorithm, and implement an experimental system for biological species categorization. Forty-three bacterial organisms selected randomly from EMBL are used for evaluation, and preliminary experiments show promising results on precision.
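
The abstract does not specify how the composition vectors are built or how neighbors are compared, so the following is only a minimal sketch of the general idea under stated assumptions: a composition vector is taken to be the normalized frequency of every length-k substring (k-mer) over the alphabet A, C, G, T, similarity is cosine similarity, and the label is chosen by majority vote among the nearest neighbors. The function names, the parameters `k` and `k_neighbors`, and the toy sequences are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"  # assumption: RNA is mapped onto the DNA alphabet (U -> T)


def composition_vector(sequence, k=3):
    """Return normalized k-mer frequencies in a fixed lexicographic order."""
    seq = sequence.upper().replace("U", "T")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made of A, C, G, T are counted; others (e.g. with N) are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_classify(query, training, k_neighbors=3, k=3):
    """Label `query` by majority vote among its most similar training sequences.

    `training` is a list of (sequence, label) pairs.
    """
    query_vec = composition_vector(query, k)
    scored = sorted(
        ((cosine_similarity(query_vec, composition_vector(seq, k)), label)
         for seq, label in training),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k_neighbors])
    return votes.most_common(1)[0][0]


# Toy data for illustration only; these are not sequences from EMBL.
toy_training = [
    ("ATATATATATAT", "AT-rich"),
    ("TATATATTATAT", "AT-rich"),
    ("GCGCGCGCGCGC", "GC-rich"),
    ("CGCGCGGCGCGC", "GC-rich"),
]
print(knn_classify("ATATTATATA", toy_training, k_neighbors=3, k=2))
```

In this sketch the sequence plays the role of a text and each k-mer plays the role of a word, which mirrors the natural-language view of DNA and RNA described in the abstract. The choice of k, the similarity measure, and the voting rule are free parameters here and may differ from what the authors used.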