An Algorithm to Assess the Reliability of Hierarchical Clusters in Gene Expression Data

  • Authors:
  • Roberto Avogadri;Matteo Brioschi;Francesca Ruffino;Fulvia Ferrazzi;Alessandro Beghini;Giorgio Valentini

  • Affiliations:
  • DSI - Dip. Scienze dell'Informazione, Università degli Studi di Milano, Italy;DBioGen - Dip. Biologia e Genetica per le Scienze Mediche, Università degli Studi di Milano, Italy;DSI - Dip. Scienze dell'Informazione, Università degli Studi di Milano, Italy;Dip. Informatica e Sistemistica, Università degli Studi di Pavia, Italy;DBioGen - Dip. Biologia e Genetica per le Scienze Mediche, Università degli Studi di Milano, Italy;DSI - Dip. Scienze dell'Informazione, Università degli Studi di Milano, Italy

  • Venue:
  • KES '08 Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems, Part III
  • Year:
  • 2008

Abstract

The validation of clusters discovered in bio-molecular data is a central issue in bioinformatics. Recently, stability-based methods have been successfully applied to assess the reliability of clusterings with a relatively small number of examples and clusters. Several problems in functional genomics, however, involve a very large number of examples and clusters. We present a stability-based algorithm to discover significant clusters in hierarchical clusterings with a large number of examples and clusters. Preliminary results on gene expression data from patients affected by human myeloid leukemia show how to apply the proposed method when thousands of gene clusters are involved.
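
The abstract does not spell out the algorithm, so the following is only a generic sketch of what a stability-based reliability score for hierarchical clusters can look like, not the authors' method. It assumes rows are genes and columns are samples, perturbs the data by random column subsampling (an assumption; other perturbation schemes are possible), and scores each reference cluster by its best Jaccard overlap with the clusters found on the perturbed data. The function names `cluster_sets`, `jaccard`, and `cluster_stability`, and the parameters `n_rounds` and `keep`, are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_sets(data, n_clusters):
    """Cut an average-linkage tree into at most n_clusters; return sets of row indices."""
    tree = linkage(data, method="average", metric="euclidean")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return [set(np.flatnonzero(labels == c)) for c in np.unique(labels)]


def jaccard(a, b):
    """Jaccard similarity between two non-empty index sets."""
    return len(a & b) / len(a | b)


def cluster_stability(data, n_clusters, n_rounds=20, keep=0.8, seed=0):
    """Score each reference cluster by how well it is recovered under perturbation.

    Assumption: the perturbation is random subsampling of columns (samples);
    this is an illustrative choice, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    reference = cluster_sets(data, n_clusters)
    n_features = data.shape[1]
    n_keep = max(1, int(keep * n_features))
    scores = np.zeros(len(reference))
    for _ in range(n_rounds):
        cols = rng.choice(n_features, size=n_keep, replace=False)
        perturbed = cluster_sets(data[:, cols], n_clusters)
        # Best match of each reference cluster among the perturbed clusters.
        scores += [max(jaccard(ref, p) for p in perturbed) for ref in reference]
    return reference, scores / n_rounds
```

A score near 1 suggests a cluster that reappears almost unchanged across perturbations, while a low score flags a cluster that may be an artifact of the particular data sample. With thousands of clusters, the all-pairs matching in this sketch becomes expensive, which is the scaling problem the abstract refers to.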