Feature selection for semi-supervised multi-label learning with application to gene function analysis

  • Authors:
  • Guo-Zheng Li;Mingyu You;Lei Ge;Jack Y. Yang;Mary Qu Yang

  • Affiliations:
  • Tongji University, Shanghai, China;Tongji University, Shanghai, China;Shanghai University, Shanghai, China;Indiana Univ Bioinformatics Center, Indianapolis, IN;Indiana Univ Bioinformatics Center, Indianapolis, IN

  • Venue:
  • Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
  • Year:
  • 2010

Abstract

This paper investigates gene function annotation in yeast using semi-supervised multi-label learning. Multi-label learning is an active topic in bioinformatics, but many samples remain unlabeled, and semi-supervised learning can make use of this unlabeled data. The paper proposes a novel semi-supervised multi-label learning algorithm, COMN, which combines Co-Training with ML-kNN to exploit unlabeled yeast gene data and improve the accuracy of function annotation models. It also proposes an embedded feature selection algorithm, PRECOMN, which performs feature selection for COMN to remove irrelevant and redundant features. Experimental results on a benchmark Yeast data set show that COMN and PRECOMN both outperform the original multi-label learning algorithm ML-kNN, and that PRECOMN improves the generalization performance of COMN.
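
The abstract does not give COMN's details, so the following is only a rough sketch of the general idea of pairing co-training with a kNN-style multi-label predictor. It is not the authors' algorithm. Several things here are assumptions made for illustration: the two feature views are a plain split of the feature indices, the predictor is a simple neighbour vote rather than the Bayesian estimate used by ML-kNN, pseudo-labeled samples go into one shared pool, and the names `knn_label_scores`, `co_train`, and `predict`, along with the toy data, are hypothetical.

```python
import math


def knn_label_scores(train_X, train_Y, x, view, k=3):
    """Per-label fraction of the k nearest neighbours (using only the
    features in `view`) that carry the label. A simplified stand-in for
    ML-kNN, which would turn neighbour counts into posterior estimates."""
    dists = sorted(
        (math.dist([row[i] for i in view], [x[i] for i in view]), idx)
        for idx, row in enumerate(train_X)
    )
    nearest = [idx for _, idx in dists[:k]]
    n_labels = len(train_Y[0])
    return [sum(train_Y[i][j] for i in nearest) / len(nearest)
            for j in range(n_labels)]


def co_train(X_lab, Y_lab, X_unlab, views, k=3, rounds=5, per_round=1):
    """Assumed co-training loop: in turn, each view pseudo-labels the
    unlabeled samples it is most confident about and adds them to a
    shared labeled pool that the other view uses on its next turn."""
    X_lab, Y_lab, pool = list(X_lab), list(Y_lab), list(X_unlab)
    for _ in range(rounds):
        for view in views:
            if not pool:
                return X_lab, Y_lab
            scored = []
            for x in pool:
                scores = knn_label_scores(X_lab, Y_lab, x, view, k)
                # Confidence of a sample = margin of its least certain label.
                confidence = min(abs(p - 0.5) for p in scores)
                scored.append((confidence, x, scores))
            scored.sort(key=lambda item: item[0], reverse=True)
            for _, x, scores in scored[:per_round]:
                X_lab.append(x)
                Y_lab.append([1 if p > 0.5 else 0 for p in scores])
                pool.remove(x)
    return X_lab, Y_lab


def predict(X_lab, Y_lab, x, views, k=3):
    """Average the per-view label scores and threshold at 0.5."""
    per_view = [knn_label_scores(X_lab, Y_lab, x, v, k) for v in views]
    averaged = [sum(col) / len(col) for col in zip(*per_view)]
    return [1 if p > 0.5 else 0 for p in averaged]


# Toy data (hypothetical): 4 features, 2 labels, two well-separated groups.
X_labeled = [[0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1], [0.0, 0.1, 0.0, 0.1],
             [1.0, 1.0, 1.0, 1.0], [0.9, 0.9, 0.9, 0.9], [1.0, 0.9, 1.0, 0.9]]
Y_labeled = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
X_unlabeled = [[0.05, 0.05, 0.05, 0.05], [0.95, 0.95, 0.95, 0.95]]
views = [[0, 1], [2, 3]]  # assumed split of the features into two views

X_aug, Y_aug = co_train(X_labeled, Y_labeled, X_unlabeled, views)
prediction = predict(X_aug, Y_aug, [0.02, 0.02, 0.02, 0.02], views)
```

An embedded feature selection step in the spirit of PRECOMN would sit around this loop, repeatedly dropping features and retraining, but the abstract does not say how features are ranked, so that part is left out.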