Iterative Weighting of Phylogenetic Profiles Increases Classification Accuracy

  • Authors:
  • Roger Craig; Li Liao

  • Affiliations:
  • University of Delaware; University of Delaware

  • Venue:
  • ICMLA '05 Proceedings of the Fourth International Conference on Machine Learning and Applications
  • Year:
  • 2005

Abstract

Phylogenetic profiles of proteins (strings of ones and zeros encoding the presence and absence of proteins in a group of genomes) have been used to predict functionally linked proteins. In this work, we developed a method that incorporates into profile similarity the evolutionary relations represented in the phylogenetic tree of the genomes. The method extends the profile to encode the phylogenetic tree as extra bits, with scores reflecting the chances that interior nodes (hypothetical ancestral genomes) develop divergence in their descendants. The scoring scheme is refined with weighting factors that are collected from the training data and iteratively updated from the predicted results. We tested the method on the proteome of Saccharomyces cerevisiae (budding yeast), using the MIPS classification as the benchmark. With such weighted phylogenetic profiles, the accuracy of our classifier, a support vector machine, was greatly increased.
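The idea of extending a presence/absence profile with one extra entry per interior tree node can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy tree, the names `TREE`, `LEAVES`, `INTERIOR`, `node_score`, and `extend_profile`, and in particular the scoring rule (the mean of the children's scores), which is a placeholder and not the scoring scheme of the paper. The abstract does not specify the scheme, nor how the weighting factors are derived or updated.

```python
from typing import Dict, List, Optional

# Hypothetical toy tree: interior node -> children (leaves are genome names).
TREE: Dict[str, List[str]] = {
    "root": ["anc1", "g3"],
    "anc1": ["g1", "g2"],
}
LEAVES = ["g1", "g2", "g3"]    # order of the bits in the plain profile
INTERIOR = ["anc1", "root"]    # order of the extra entries


def node_score(node: str, leaf_bits: Dict[str, int],
               tree: Dict[str, List[str]], cache: Dict[str, float]) -> float:
    """Placeholder score: a leaf returns its bit, an interior node returns
    the mean of its children's scores (an assumption, not the paper's rule)."""
    if node in leaf_bits:
        return float(leaf_bits[node])
    if node not in cache:
        children = tree[node]
        total = sum(node_score(c, leaf_bits, tree, cache) for c in children)
        cache[node] = total / len(children)
    return cache[node]


def extend_profile(profile: List[int],
                   weights: Optional[Dict[str, float]] = None) -> List[float]:
    """Return the leaf bits followed by one weighted score per interior node.
    Missing weights default to 1.0 (hypothetical convention)."""
    leaf_bits = dict(zip(LEAVES, profile))
    weights = weights or {}
    cache: Dict[str, float] = {}
    extra = [weights.get(n, 1.0) * node_score(n, leaf_bits, TREE, cache)
             for n in INTERIOR]
    return [float(b) for b in profile] + extra


if __name__ == "__main__":
    print(extend_profile([1, 0, 1]))
    print(extend_profile([1, 0, 1], {"anc1": 2.0}))
```

In a setup like the one the abstract describes, such extended vectors would be the input to the classifier, and the per-node weights would be re-estimated from the predictions on each iteration. That loop is not shown here because the abstract gives no detail on the update rule.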