Integrating protein family sequence similarities with gene expression to find signature gene networks in breast cancer metastasis

  • Authors:
  • Sepideh Babaei; Erik Van Den Akker; Jeroen De Ridder; Marcel Reinders

  • Affiliations:
  • Delft Bioinformatics Lab, Delft University of Technology, The Netherlands and Netherlands Bioinformatics Centre; Delft Bioinformatics Lab, Delft University of Technology, The Netherlands and Molecular Epidemiology, Leiden University Medical Centre, Leiden, The Netherlands; Delft Bioinformatics Lab, Delft University of Technology, The Netherlands and Netherlands Bioinformatics Centre; Delft Bioinformatics Lab, Delft University of Technology, The Netherlands and Netherlands Bioinformatics Centre

  • Venue:
  • PRIB'11 Proceedings of the 6th IAPR International Conference on Pattern Recognition in Bioinformatics
  • Year:
  • 2011

Abstract

Finding robust marker genes is one of the key challenges in breast cancer research. Significant signatures identified in independent datasets often show little to no overlap, possibly due to small sample sizes, noise in gene expression measurements, and heterogeneity across patients. To find more robust markers, several studies have analyzed gene expression data by grouping functionally related genes using pathway or protein interaction data. Here we pursue a protein similarity measure based on Pfam protein family information to aid the identification of robust subnetworks for the prediction of metastasis. The proposed protein-to-protein similarities are derived from a protein-to-family network using family HMM profiles. The gene expression data are overlaid on the resulting protein-protein sequence similarity network for six breast cancer datasets. The results indicate that the captured protein similarities carry predictive capacity that aids interpretation of the resulting signatures and improves robustness.
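
As a rough illustration of how protein-to-protein similarities could be derived from a protein-to-family network, the sketch below represents each protein as a sparse vector of family match scores and links two proteins when the cosine similarity of their vectors passes a threshold. This is not the authors' method: the abstract does not specify the similarity measure, so cosine similarity, the `threshold` value, the function names, and the toy scores and identifiers are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[f] * v[f] for f in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def protein_similarity_network(family_scores, threshold=0.5):
    """Build protein-protein edges from a protein-to-family score map.

    family_scores maps protein id -> {family id: match score}.
    The threshold is an arbitrary illustrative cutoff (assumption).
    """
    edges = {}
    for a, b in combinations(sorted(family_scores), 2):
        s = cosine(family_scores[a], family_scores[b])
        if s >= threshold:
            edges[(a, b)] = s
    return edges


# Toy protein-to-family scores (hypothetical ids and values, not real data).
family_scores = {
    "P1": {"FAM_A": 3.0, "FAM_B": 4.0},
    "P2": {"FAM_A": 3.0, "FAM_B": 4.0},
    "P3": {"FAM_C": 5.0},
}

network = protein_similarity_network(family_scores)
for (a, b), s in network.items():
    print(a, b, round(s, 3))
```

In this toy input, only proteins that share family matches end up connected; expression values could then be aggregated over connected proteins to score candidate subnetworks, though how that is done in the paper is not stated in the abstract.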