A new protein motif extraction framework based on constrained co-clustering

  • Authors:
  • Francesca Cordero; Alessia Visconti; Marco Botta

  • Affiliations:
  • University of Torino, Torino, Italy; University of Torino, Torino, Italy; University of Torino, Torino, Italy

  • Venue:
  • Proceedings of the 2009 ACM Symposium on Applied Computing
  • Year:
  • 2009

Abstract

Signal finding (pattern discovery) in biological sequences is a fundamental problem in both computer science and molecular biology. Many approaches have been proposed for extracting interesting patterns (or motifs) from DNA/RNA and protein sequences. Some are based on simple and multiple alignment techniques; some use biological knowledge and others do not. In this paper, we propose a de novo framework that performs motif identification and exploits a constrained co-clustering technique to simultaneously find associations between groups of protein sequences and groups of motifs. We show that the presented approach is able to group together protein sequences belonging to the same families and, at the same time, to provide a set of characterizing motifs.
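
To make the idea of jointly grouping sequences and motifs concrete, the following is a minimal sketch, not the authors' method. It assumes that candidate motifs are simply k-mers shared by at least two sequences, and it uses a plain, unconstrained alternating block co-clustering with farthest-first seeding; the constraints used in the paper are not modeled. All function names, parameters, and the toy sequences are hypothetical, and the input is assumed to be non-empty with at least one shared k-mer.

```python
def kmers(seq, k):
    """Return the set of length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def occurrence_matrix(seqs, k=3, min_support=2):
    """Binary sequence-by-motif matrix (assumption: motifs are shared k-mers)."""
    per_seq = [kmers(s, k) for s in seqs]
    support = {}
    for found in per_seq:
        for m in found:
            support[m] = support.get(m, 0) + 1
    # Keep only k-mers that occur in at least min_support sequences.
    motifs = sorted(m for m, n in support.items() if n >= min_support)
    matrix = [[1 if m in found else 0 for m in motifs] for found in per_seq]
    return motifs, matrix


def farthest_first_labels(vectors, n_clusters):
    """Initial labels: pick seeds by farthest-first traversal (Hamming distance)."""
    def dist(u, v):
        return sum(a != b for a, b in zip(u, v))

    seeds = [0]
    while len(seeds) < n_clusters:
        # Next seed is the vector whose nearest existing seed is farthest away.
        seeds.append(max(
            range(len(vectors)),
            key=lambda i: min(dist(vectors[i], vectors[s]) for s in seeds),
        ))
    return [min(range(n_clusters), key=lambda c: dist(v, vectors[seeds[c]]))
            for v in vectors]


def block_means(matrix, rows, cols, n_row, n_col):
    """Mean matrix value inside each (row cluster, column cluster) block."""
    total = [[0.0] * n_col for _ in range(n_row)]
    count = [[0] * n_col for _ in range(n_row)]
    for i, row in enumerate(matrix):
        for j, x in enumerate(row):
            total[rows[i]][cols[j]] += x
            count[rows[i]][cols[j]] += 1
    return [[total[a][b] / count[a][b] if count[a][b] else 0.0
             for b in range(n_col)] for a in range(n_row)]


def cocluster(matrix, n_row, n_col, n_iter=10):
    """Alternately reassign rows and columns to the best-fitting block means."""
    columns = [list(c) for c in zip(*matrix)]
    rows = farthest_first_labels(matrix, n_row)
    cols = farthest_first_labels(columns, n_col)
    for _ in range(n_iter):
        means = block_means(matrix, rows, cols, n_row, n_col)
        rows = [min(range(n_row),
                    key=lambda a: sum((x - means[a][cols[j]]) ** 2
                                      for j, x in enumerate(r)))
                for r in matrix]
        means = block_means(matrix, rows, cols, n_row, n_col)
        cols = [min(range(n_col),
                    key=lambda b: sum((x - means[rows[i]][b]) ** 2
                                      for i, x in enumerate(c)))
                for c in columns]
    return rows, cols


# Toy input (hypothetical): two pairs of sequences sharing different k-mers.
seqs = ["ACDEF", "ACDEG", "WYKLM", "WYKLN"]
motifs, matrix = occurrence_matrix(seqs)
seq_groups, motif_groups = cocluster(matrix, n_row=2, n_col=2)
for a in range(2):
    members = [s for s, g in zip(seqs, seq_groups) if g == a]
    print("sequence group", a, members)
for b in range(2):
    members = [m for m, g in zip(motifs, motif_groups) if g == b]
    print("motif group", b, members)
```

In this sketch each sequence group is paired with the motif group whose block mean is highest, which mirrors the abstract's goal of returning sequence families together with their characterizing motifs. A constrained variant would additionally restrict which reassignments are allowed.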