Feature space transformation and decision results interpretation

  • Authors:
  • Jinyan Li; Hwee-Leng Ong

  • Affiliations:
  • Laboratories for Information Technology, 21 Heng Mui Keng Terrace, Singapore; Laboratories for Information Technology, 21 Heng Mui Keng Terrace, Singapore

  • Venue:
  • APBC '03 Proceedings of the First Asia-Pacific bioinformatics conference on Bioinformatics 2003 - Volume 19
  • Year:
  • 2003

Abstract

Gene expression profiles and proteomic data are extremely high-dimensional. Although support vector machines can learn the underlying relationships in such data well enough for classification, their non-linear kernel functions make it difficult to explain the reasons for a prediction to non-specialists. We prefer rule-based methods because they are easy to interpret. In this paper, we first discuss feature space transformation. Each new feature (a rule) is a combination of multiple original features, chosen so that it covers a large percentage of one class of data and never occurs in the other class. Described in terms of these new features, training and test data are clearly class-separable. We then discuss a more sophisticated rule-based classification method, called PCL. PCL provides easily explainable classification scores that help us better understand both the predictions and the test data themselves. Visualization is also used to make the classifier output easier to understand. We use a rich set of examples to demonstrate our main points.
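
The feature space transformation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`support`, `find_rules`, `transform`), the toy discretized items such as `g1=high`, the 0.6 support threshold, and the brute-force enumeration of combinations are hypothetical and are not taken from the paper. It is not an implementation of PCL, whose scoring is not specified in the abstract. The sketch only shows the stated criterion: a rule is kept when it covers a large share of one class and never occurs in the other, and each sample is then re-described by which rules it satisfies.

```python
from itertools import combinations


def support(pattern, samples):
    """Fraction of samples (sets of items) that contain every item in pattern."""
    if not samples:
        return 0.0
    return sum(pattern <= s for s in samples) / len(samples)


def find_rules(home, other, max_size=2, min_support=0.6):
    """Brute-force search for item combinations that are frequent in `home`
    and completely absent from `other` (illustrative thresholds only)."""
    items = sorted(set().union(*home))
    rules = []
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            pattern = frozenset(combo)
            if support(pattern, home) >= min_support and support(pattern, other) == 0.0:
                rules.append(pattern)
    return rules


def transform(sample, rules):
    """Re-describe a sample as a binary vector: 1 where the rule is satisfied."""
    return [int(rule <= sample) for rule in rules]


# Toy, hand-made data: each sample is a set of discretized feature values.
class_a = [
    {"g1=high", "g2=low"},
    {"g1=high", "g2=low", "g3=high"},
    {"g1=high", "g2=high"},
]
class_b = [
    {"g1=low", "g2=low"},
    {"g1=high", "g2=high", "g3=low"},
    {"g1=low", "g2=high"},
]

rules_a = find_rules(class_a, class_b)  # rules that occur only in class A
rules_b = find_rules(class_b, class_a)  # rules that occur only in class B
rules = rules_a + rules_b

new_a = [transform(s, rules) for s in class_a]
new_b = [transform(s, rules) for s in class_b]
```

In this toy data neither `g1=high` nor `g2=low` alone qualifies as a class A rule, because each also appears in class B; only their combination does, which is the sense in which a new feature combines multiple original features. Exhaustive enumeration is feasible only for a handful of items, so real high-dimensional data would need a dedicated pattern-mining procedure.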