Spectral clustering of biological sequence data

  • Authors:
  • William Pentney; Marina Meila

  • Affiliations:
  • Department of Computer Science and Engineering, University of Washington; Department of Statistics, University of Washington

  • Venue:
  • AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
  • Year:
  • 2005

Abstract

In this paper, we apply spectral techniques to the clustering of biological sequence data that has proved difficult to cluster effectively. To do so, we (1) extend spectral clustering algorithms to handle asymmetric affinities, such as the alignment scores used to compare biological sequences, and (2) devise a hierarchical algorithm that robustly handles many clusters of imbalanced sizes. We present an algorithm for clustering asymmetric affinity data and demonstrate its performance at recovering the higher levels of the Structural Classification of Proteins (SCOP) on a database of highly conserved subsequences.
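
To make the notion of an asymmetric affinity concrete, the following is a minimal illustrative sketch, not the algorithm presented in the paper. It assumes the simplest baseline, symmetrizing the affinity matrix by averaging it with its transpose, and then performs one standard two-way spectral split using the second eigenvector of the normalized affinity matrix. The function name `spectral_bipartition` and the toy matrix are hypothetical; a hierarchical method could in principle apply such a split recursively, but how the authors treat asymmetry and hierarchy is not specified in this abstract.

```python
import numpy as np


def spectral_bipartition(affinity):
    """Split items into two groups from a possibly asymmetric affinity matrix.

    Assumption: asymmetry is removed by averaging A with its transpose.
    This is a common baseline, not the method described in the paper.
    """
    a = np.asarray(affinity, dtype=float)
    s = 0.5 * (a + a.T)                      # symmetrized affinities
    degree = s.sum(axis=1)                   # must be strictly positive
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Normalized affinity matrix D^{-1/2} S D^{-1/2}
    m = s * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric input
    _, eigvecs = np.linalg.eigh(m)
    # Second-largest eigenvector, rescaled to the random-walk eigenvector
    v = eigvecs[:, -2] * d_inv_sqrt
    # The sign pattern of this vector defines the two-way split
    return (v > 0).astype(int)


# Toy asymmetric "alignment score" matrix with two obvious groups
# (items 0-2 and items 3-5); the values are made up for illustration.
toy_scores = np.array([
    [0.0, 5.0, 4.0, 0.1, 0.2, 0.1],
    [3.0, 0.0, 5.0, 0.2, 0.1, 0.1],
    [4.0, 2.0, 0.0, 0.1, 0.1, 0.3],
    [0.1, 0.2, 0.1, 0.0, 6.0, 3.0],
    [0.2, 0.1, 0.1, 4.0, 0.0, 5.0],
    [0.1, 0.1, 0.2, 5.0, 2.0, 0.0],
])
labels = spectral_bipartition(toy_scores)
```

Which group receives label 0 or 1 is arbitrary, because an eigenvector is only defined up to sign; only the grouping itself is meaningful.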